\documentclass[10pt]{article}
\usepackage{amsmath,amsthm,amssymb,hyperref}
\newcommand{\bool}{\ensuremath{\{0,1\}}}
\newcommand{\Hcal}{\ensuremath{\mathcal{H}}}
\newcommand{\Rcal}{\ensuremath{\mathcal{R}}}
\newcommand{\Pfrak}{\ensuremath{\mathfrak{P}}}
\newcommand{\Nfrak}{\ensuremath{\mathfrak{N}}}
% order: \order[st]{1}, \order{k}
\newcommand{\order}[2][th]{\ensuremath{{#2}^{\mathrm{#1}}}}
\input{preamble}
\setcounter{lecturecounter}{20}
\usepackage{fullpage}
\usepackage{subfig,paralist,graphicx,boxedminipage}
\usepackage{tikz}
\usetikzlibrary{matrix}
\begin{document}
%{lecture number}{lecture date}{Lecturer's name}{Your name}
\lecture{20}{March 27, 2012}{Sofya Raskhodnikova}{Youngtae Youn}
We continue the topic of testing monotonicity on general poset domains
\cite{FLN}. We first define what an induced matching is and explain
what it looks like in a hypercube.
Then we show that monotonicity is hard to test on a graph with many
\emph{large} induced matchings. Finally, we explain how to construct
such a graph to prove the lower bound on query complexity.
%%%
\section{Induced Matching}
A \emph{matching} in a graph is a set of edges, no two of which share a
vertex. An \emph{induced matching} $M$ of a graph $G=(V,E)$ is a
matching $M\subseteq E$ such that no two edges of $M$ are joined by an
edge of $G$. Equivalently, let $V(M)$ denote the set of endpoints of
the edges in $M$; then $M$ is induced if the edge set of the subgraph
of $G$ induced by $V(M)$ is exactly $M$. For example,
consider the graph in Figure \ref{fig:orig} with eight vertices and four
edges.
\begin{figure}[h]
\centering
\subfloat[original graph]{
\label{fig:orig}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes, row sep=2.5em,
ampersand replacement=\&, column sep=2em]
{ a \& b \& c \& d \\
e \& f \& g \& h \\ };
\path
(m-1-2) edge [dotted] (m-2-1)
(m-1-2) edge [dotted] (m-2-4)
(m-1-4) edge [dotted] (m-2-3)
(m-1-4) edge [dotted] (m-2-4);
\end{tikzpicture}
}
\qquad
\subfloat[an induced matching]{
\label{fig:induced}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes, row sep=2.5em,
ampersand replacement=\&, column sep=2em]
{ a \& \textbf{b} \& c \& \textbf{d} \\
\textbf{e} \& f \& \textbf{g} \& h \\ };
\path
(m-1-2) edge (m-2-1)
(m-1-2) edge [dotted] (m-2-4)
(m-1-4) edge (m-2-3)
(m-1-4) edge [dotted] (m-2-4);
\end{tikzpicture}
}
\qquad
\subfloat[a non-induced matching]{
\label{fig:non-induced}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes, row sep=2.5em,
ampersand replacement=\&, column sep=2em]
{ a \& \textbf{b} \& c \& \textbf{d} \\
e \& f \& \textbf{g} \& \textbf{h} \\ };
\path
(m-1-2) edge [dotted] (m-2-1)
(m-1-2) edge (m-2-4)
(m-1-4) edge (m-2-3)
(m-1-4) edge (m-2-4);
\end{tikzpicture}
}
\caption{Example of an induced matching and a non-induced matching}
\label{fig:example}
\end{figure}
Picking $M=\{(b,e),(d,g)\}$ gives the induced matching in Figure
\ref{fig:induced}, since the subgraph induced by the set of vertices
$\{b,d,e,g\}$ has only the two edges $(b,e)$ and $(d,g)$. In contrast,
picking $M=\{(b,h),(d,g)\}$ gives the non-induced matching in
Figure \ref{fig:non-induced}, since the subgraph induced by $\{b,d,g,h\}$
has the extra edge $(d,h)$ in addition to $(b,h)$ and $(d,g)$.
\paragraph{An Induced Matching on $\Hcal_d$.}
Consider an induced matching on a $d$-dimensional hypercube $\Hcal_d$.
A node of $\Hcal_d$ is a $d$-bit string, and its \emph{weight} is the
number of 1's in the string. One can treat the hypercube as a bipartite
graph by putting the even-weight nodes on the bottom and the odd-weight
nodes on the top. Observe that there are no edges between nodes on the
same side, since adjacent nodes differ in exactly one bit and hence
have weights of different parity.
An induced matching in $\Hcal_d$ can be formed for every dimension
$i\in[d]$ as follows:
\begin{enumerate}
\item Pick every bottom node whose \order{i} bit is 0.
\item For each such bottom node $\alpha 0 \beta$, where the displayed 0
is the \order{i} bit, choose the edge
$(\alpha 0 \beta, \alpha 1 \beta)$.
\end{enumerate}
Figure \ref{fig:cube-matching} shows how to construct an induced
matching in $\Hcal_3$ by setting $i=2$.
\begin{figure}[h]
\centering
\subfloat[$\Hcal_3$ as a bipartite graph]{
\label{fig:cube}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes, row sep=2.5em,
ampersand replacement=\&, column sep=2em]
{ 001 \& 010 \& 100 \& 111 \\
000 \& 011 \& 101 \& 110 \\ };
\path
(m-1-1) edge [dotted] (m-2-1)
(m-1-1) edge [dotted] (m-2-2)
(m-1-1) edge [dotted] (m-2-3)
(m-1-2) edge [dotted] (m-2-1)
(m-1-2) edge [dotted] (m-2-2)
(m-1-2) edge [dotted] (m-2-4)
(m-1-3) edge [dotted] (m-2-1)
(m-1-3) edge [dotted] (m-2-3)
(m-1-3) edge [dotted] (m-2-4)
(m-1-4) edge [dotted] (m-2-2)
(m-1-4) edge [dotted] (m-2-3)
(m-1-4) edge [dotted] (m-2-4);
\end{tikzpicture}
}
\qquad
\subfloat[an induced matching in $\Hcal_3$]{
\label{fig:cube-induced}
\begin{tikzpicture}
\matrix (m) [matrix of math nodes, row sep=2.5em,
ampersand replacement=\&, column sep=2em]
{ 001 \& \textbf{010} \& 100 \& \textbf{111} \\
\textbf{000} \& 011 \& \textbf{101} \& 110 \\ };
\path
(m-1-1) edge [dotted] (m-2-1)
(m-1-1) edge [dotted] (m-2-2)
(m-1-1) edge [dotted] (m-2-3)
(m-1-2) edge (m-2-1)
(m-1-2) edge [dotted] (m-2-2)
(m-1-2) edge [dotted] (m-2-4)
(m-1-3) edge [dotted] (m-2-1)
(m-1-3) edge [dotted] (m-2-3)
(m-1-3) edge [dotted] (m-2-4)
(m-1-4) edge [dotted] (m-2-2)
(m-1-4) edge (m-2-3)
(m-1-4) edge [dotted] (m-2-4);
\end{tikzpicture}
}
\caption{An induced matching in $\Hcal_3$ obtained by setting $i=2$.}
\label{fig:cube-matching}
\end{figure}
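Note that this construction always outputs an induced
matching. It is a matching because distinct bottom nodes
$\alpha 0\beta \neq \alpha' 0\beta'$ have distinct partners.
Now consider any two distinct edges $(\alpha 0\beta, \alpha 1\beta)$ and
$(\alpha' 0\beta', \alpha' 1\beta')$ returned by this construction.
Both bottom nodes have even weight and agree on the \order{i} bit, so
they differ in at least two of the remaining bits.
Hence $\alpha 1\beta$ and $\alpha' 0\beta'$ differ in at least three
bits, and these two nodes are not connected by an edge. The same
reasoning applies to $\alpha' 1\beta'$ and $\alpha 0\beta$, so these
two are not connected either. Finally, two bottom nodes (or two top
nodes) are never adjacent.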
Let $n=2^d$ be the number of nodes of $\Hcal_d$. The bottom side and
the top side of $\Hcal_d$ each have $2^{d-1}$ nodes, and (for $d\geq 2$)
half of the bottom nodes have \order{i} bit 0. Hence each such matching
has $2^{d-2}=n/4$ edges.
As our induced matching construction is determined by the choice of
$i\in[d]$, there are $d=\log n$ such matchings.
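
For a concrete instance of these counts, take $d=4$ and $i=1$, so that
$n=16$ (the name $M_1$ below is introduced only for this illustration).
The bottom nodes whose first bit is 0 are $0000$, $0011$, $0101$ and
$0110$, and the construction selects
\begin{equation*}
M_1=\{(0000,1000),\ (0011,1011),\ (0101,1101),\ (0110,1110)\},
\end{equation*}
which has $2^{d-2}=4=n/4$ edges. A bottom endpoint and a top endpoint
taken from two different edges of $M_1$, for example $0011$ and $1000$,
differ in three bits, so no edge of $\Hcal_4$ joins two edges of $M_1$.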
%%%
\section{Query Lower Bounds on General Bipartite Graphs}
We just explained how to construct induced matchings in a hypercube.
From now on, we switch our attention to general bipartite graphs with
$n$ vertices. Assume that all edges in these bipartite graphs go from
a bottom node to a top node.
Let $m$ denote the number of edge-disjoint induced matchings that we
use in such a graph.
An induced matching is \emph{large} if its size is at least $\eps n$
for some constant $\eps\in(0,1)$.
Our goal is to prove the following.
\begin{theorem}
Let $G$ be a bipartite graph with $m$ large induced matchings. Then
every non-adaptive monotonicity tester on $G$ requires
$\Omega(\sqrt{m})$ queries.
\label{thm:lower}
\end{theorem}
We will use the variant of Yao's principle introduced in the last
lecture to prove the query complexity lower bound. Namely, we
construct distributions $\Pfrak$ on positive (monotone) inputs and
$\Nfrak$ on negative ($\eps$-far from monotone) inputs such that it is
hard to distinguish them with $q=o(\sqrt{m})$ queries.
%Our input distribution first chooses $\Pfrak$ or $\Nfrak$ with equal
%probability and then draws an input according to the chosen
%distribution. We show that every deterministic non-adaptive test with
%$o(\sqrt{m})$ queries has error probability larger than $1/3$ with
%respect to the induced probability on inputs.
The construction of a node labeling $f$ is as follows.
For both $\Pfrak$ and $\Nfrak$,
\begin{enumerate}
\item Pick one of the $m$ induced matchings uniformly at random.
\item For every node that is not an endpoint of an edge in the selected
matching, set it to 1 if it is a top node and to 0 if it is a bottom
node.
\item For each top node $v$ of a matching edge $(u,v)$,
set $f(v)$ to 0 or 1, each with probability $1/2$, independently of
the other nodes.
\item For each bottom node $u$ of a matching edge $(u,v)$,
\footnote{One might be tempted to set $f(u)=0$ for
$\Pfrak$ and $f(u)=1$ for $\Nfrak$. Unfortunately, these
distributions are easy to distinguish for any tester.}
\begin{enumerate}
\item set $f(u) = f(v)$ for $\Pfrak$.
\item set $f(u) = 1 - f(v)$ for $\Nfrak$.
\end{enumerate}
\end{enumerate}
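
To illustrate the construction, here is a small hypothetical instance
(the graph and the coin flips are assumptions made for this sketch, not
part of the lecture). Let the bottom nodes be $u_1,u_2,u_3$, the top
nodes $v_1,v_2,v_3$, and the edges $(u_1,v_1)$, $(u_2,v_2)$,
$(u_3,v_1)$, $(u_2,v_3)$. Suppose the selected induced matching is
$\{(u_1,v_1),(u_2,v_2)\}$, so that $u_3$ and $v_3$ are the nodes outside
the matching and receive the fixed values 0 and 1, and suppose the coin
flips give $f(v_1)=0$ and $f(v_2)=1$. The resulting labelings are
\begin{equation*}
\begin{array}{c|cccccc}
 & u_1 & v_1 & u_2 & v_2 & u_3 & v_3 \\ \hline
\Pfrak & 0 & 0 & 1 & 1 & 0 & 1 \\
\Nfrak & 1 & 0 & 0 & 1 & 0 & 1
\end{array}
\end{equation*}
In the $\Pfrak$ row every edge $(u,v)$ satisfies $f(u)\leq f(v)$. In
the $\Nfrak$ row exactly one edge, $(u_1,v_1)$, is violated, namely the
matching edge whose top node received the value 0.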
This construction is correct in the following sense: every node
labeling drawn from $\Pfrak$ is monotone on $G$, and a node labeling
drawn from $\Nfrak$ is $\eps$-far from monotone with high probability.
\begin{theorem}
Every node labeling drawn from $\Pfrak$ is a monotone labeling of $G$.
\end{theorem}
\begin{proof}
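Let $f$ be a labeling drawn from $\Pfrak$, and consider the two kinds
of edges $(u,v)$ in $G$, where $u$ is the bottom node and $v$ is the
top node.
\begin{enumerate}
\item Edges in the selected matching: The two endpoints of such an edge
are set to be equal by the last step of the construction.
\item Edges not in the selected matching: As the selected matching is
induced, at least one endpoint of such an edge is not an endpoint of a
matching edge. That endpoint is set to 0 if it is the bottom node $u$
and to 1 if it is the top node $v$, so $f(u)\leq f(v)$ in either case.
\end{enumerate}
None of the edges is violated, so the node labeling is monotone.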
\end{proof}
\begin{theorem}
If every matching has size at least $4\eps n$, then a labeling drawn
from $\Nfrak$ is $\eps$-far from monotone with overwhelming probability.
\end{theorem}
\begin{proof}
Consider a labeling $f$ drawn from $\Nfrak$. Each edge $(u,v)$ of the
selected matching is violated with probability $1/2$, independently of
the other matching edges: it is violated exactly when the top node gets
$f(v)=0$, so that the bottom node gets $f(u)=1-f(v)=1$.
Hence the expected number of violated edges in this matching is at
least $\frac{1}{2}\cdot 4\eps n = 2\eps n$.
The violated matching edges are vertex-disjoint, and making $f$
monotone requires changing the value of at least one endpoint of each
of them. Therefore the probability that the node labeling is not
$\eps$-far from monotone is at most the probability that the number of
violated edges in the matching is less than $\eps n$, which is
$\exp\{-\Omega(\eps n)\}$ by the Chernoff bound.
Hence, the node labeling is $\eps$-far from
monotone with probability at least $1-\exp\{-\Omega(\eps n)\}$.
\end{proof}
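
The Chernoff step in the proof above can be made explicit as follows.
This is a sketch that assumes the standard multiplicative lower-tail
form $\Pr\{X\leq(1-\delta)\mu\}\leq\exp\{-\delta^2\mu/2\}$ for a sum $X$
of independent 0/1 random variables with mean $\mu$ and $0<\delta<1$;
the symbols $X$, $\mu$ and $\delta$ are introduced only for this
calculation. Let $X$ be the number of violated edges of the selected
matching, so that $\mu\geq 2\eps n$. Taking $\delta=1/2$ gives
\begin{equation*}
\Pr\{X<\eps n\}\leq \Pr\{X\leq \mu/2\}\leq \exp\{-\mu/8\}
\leq \exp\{-\eps n/4\}.
\end{equation*}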
\begin{proof}[Proof of Theorem \ref{thm:lower}]
Fix any non-adaptive deterministic tester with $q=o(\sqrt{m})$
queries.
Let $x_1,\dots,x_q$ denote the nodes that are queried by the tester.
Let $E$ be the event that the tester queries both endpoints of some
edge in the selected matching. Note that $\Pfrak$ and $\Nfrak$ select
the matching in the same way, so $E$ has the same probability under
both, and the two distributions assign the same values to all nodes
outside the selected matching. As the top endpoint of each matching
edge is labeled with a uniformly random bit, its bottom endpoint alone
is also a uniformly random bit under both $\Pfrak$ and $\Nfrak$; thus
an algorithm cannot tell the difference unless it queries both
endpoints of a matching edge.
Hence, $(\text{$P$-view}\mid\overline{E}) = (\text{$N$-view}\mid\overline{E})$, where
$(\text{$P$-view}\mid\overline{E})$ is the $P$-view conditioned on $\overline{E}$ and
$(\text{$N$-view}\mid\overline{E})$ is the $N$-view conditioned on $\overline{E}$.
Meanwhile, the probability of $E$ is bounded by
\begin{equation*}
\Pr\{E\} \leq \sum_{i,j\in \{x_1,\dots,x_q\}}
\Pr\{(i,j)\in \text{selected matching}\} \leq
q^2 / m = o(1),
\end{equation*}
where the first inequality is a union bound over pairs of queried
nodes, and the second inequality follows from the fact that the $m$
matchings are edge-disjoint, so each pair $(i,j)$ is an edge of at most
one matching, and that matching is selected with probability $1/m$.
It leads to the conclusion that
$SD(\text{$P$-view}, \text{$N$-view})\leq\Pr\{E\}=o(1)$, where $SD$
denotes statistical distance.
Combined with Yao's principle, the query complexity of non-adaptive
testers, including 1-sided error ones, must be $\Omega(\sqrt{m})$.
\end{proof}
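
The step from a small $\Pr\{E\}$ to a small statistical distance can be
spelled out as follows. The notation is an assumption made for this
sketch: $S$ ranges over sets of possible views, and $\Pr_{\Pfrak}$ and
$\Pr_{\Nfrak}$ denote probabilities when the input is drawn from
$\Pfrak$ and $\Nfrak$, respectively. Since $E$ depends only on the
selected matching, it has the same probability $\Pr\{E\}$ under both
distributions, and the views agree conditioned on $\overline{E}$, so
the $\overline{E}$ terms cancel:
\begin{align*}
\bigl|\Pr_{\Pfrak}\{\text{view}\in S\}-\Pr_{\Nfrak}\{\text{view}\in S\}\bigr|
&= \Pr\{E\}\cdot\bigl|\Pr_{\Pfrak}\{\text{view}\in S\mid E\}
-\Pr_{\Nfrak}\{\text{view}\in S\mid E\}\bigr| \\
&\leq \Pr\{E\} \leq q^2/m .
\end{align*}
Taking the maximum over $S$ bounds the statistical distance by
$q^2/m$, which is $o(1)$ when $q=o(\sqrt{m})$.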
%%%
\section{Constructing Hard-to-Test Graphs}
To achieve our lower bound, we show that there is a graph
with $n$ vertices whose edges can be partitioned into $m$ induced
matchings of size $s$, where $s$ is linear in $n$.
Ruzsa and Szemer{\'e}di \cite{RS} constructed such a graph with $m=n/3$,
but we cannot use their construction since its matching size $s$ is
only \emph{nearly} linear in $n$.
In the next lecture, we construct such a graph with the matching size
$s=\frac{n}{8}-o(n)$ and the number of matchings $m=n^{\Omega(1/\log\log n)}$.
\begin{thebibliography}{99}
\bibitem{FLN}
Eldar Fischer, Eric Lehman, Ilan Newman, Sofya Raskhodnikova, Ronitt
Rubinfeld, and Alex Samorodnitsky.
\emph{Monotonicity testing over general poset domains},
STOC 2002.
\bibitem{RS}
Imre Ruzsa and Endre Szemer{\'e}di.
\emph{Triple systems with no six points carrying three triangles},
Combinatorics (Keszthely, 1976), Coll. Math. Soc. J. Bolyai, 1978.
\end{thebibliography}
\end{document}