\documentclass[10pt]{article}
\usepackage{amsfonts,amsthm,amsmath,amssymb}
\usepackage{array}
\usepackage{epsfig}
\usepackage{fullpage}
\usepackage[colorlinks = false]{hyperref}
\newcommand{\1}{\mathbbm{1}}
\DeclareMathOperator*{\argmin}{argmin}
\DeclareMathOperator*{\argmax}{argmax}
\newcommand{\x}{\times}
\newcommand{\Z}{\mathbb{Z}}
\newcommand{\Q}{\mathbb{Q}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\N}{\mathbb{N}}
\newcommand{\F}{\mathbb{F}}
\newcommand{\E}{\mathop{\mathbb{E}}}
\renewcommand{\bar}{\overline}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\eps}{\varepsilon}
\newcommand{\DTIME}{\textbf{DTIME}}
\renewcommand{\P}{\textbf{P}}
\newcommand{\SPACE}{\textbf{SPACE}}
\begin{document}
\input{preamble.tex}
\newtheorem{example}[theorem]{Example}
\theoremstyle{definition}
\newtheorem{defn}[theorem]{Definition}
\handout{CS 229r Information Theory in Computer Science}{Mar 14, 2019}{Instructor:
Madhu Sudan}{Scribe: Amir Shanehsazzadeh}{Lecture 14}
\section{Admin}
\begin{itemize}
\item PSET 3 Due Tomorrow! (Last PSET)
\item Project + *Team* Selection (Soft Deadline Tomorrow)
\item (Likely) Will Post Practice Problem Tomorrow
\end{itemize}
\section{Today}
\begin{itemize}
\item Future/Project Topics
\end{itemize}
\section{Information Complexity}
We are currently in the middle of our treatment of information complexity and will now explore it more deeply. So far we have defined the external information complexity $$\text{IC}_{ext}(\pi) = I(XY;\pi),$$ the information that the protocol $\pi$ (i.e., its transcript) reveals about $(X, Y)$ to an external observer. We have also defined the internal information complexity $$\text{IC}_{int}(\pi) = I(Y; \pi | X) + I(X; \pi | Y),$$ the information that Alice (who already knows $X$) learns about $Y$ plus the information that Bob (who already knows $Y$) learns about $X$ from the protocol $\pi$. \newline
The following chain of inequalities is not obvious for general random variables, but it does hold for protocols $\pi$:
$$\text{IC}_{int}(\pi) \leq \text{IC}_{ext}(\pi) \leq \text{CC}(\pi).$$
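To make the definitions concrete, here is a small numeric sanity check (my own toy example, not from lecture): for the one-round protocol in which Alice simply sends $X$, both quantities can be computed exactly from the joint distribution, and the inequality $\text{IC}_{int}(\pi) \leq \text{IC}_{ext}(\pi)$ can be verified directly.

```python
from itertools import product
from math import log2

# Toy one-round protocol (illustrative): Alice just sends her input bit,
# so the transcript is pi = X.  (X, Y) are uniform correlated bits with
# P[X = Y] = 0.9.
joint = {(x, y): (0.45 if x == y else 0.05) for x, y in product((0, 1), repeat=2)}

def H(pmf):
    """Shannon entropy (bits) of a pmf given as {outcome: probability}."""
    return -sum(p * log2(p) for p in pmf.values() if p > 0)

def marginal(coords):
    """Marginal pmf of the listed coordinates of the triple (x, y, pi)."""
    out = {}
    for (x, y), p in joint.items():
        vals = {'x': x, 'y': y, 'pi': x}          # transcript pi equals x here
        key = tuple(vals[c] for c in coords)
        out[key] = out.get(key, 0.0) + p
    return out

def I(a, b, cond=()):
    """Conditional mutual information I(A ; B | C) from the joint pmf."""
    c = list(cond)
    return (H(marginal(list(a) + c)) + H(marginal(list(b) + c))
            - H(marginal(list(a) + list(b) + c))
            - (H(marginal(c)) if c else 0.0))

ic_ext = I(('x', 'y'), ('pi',))                                   # I(XY ; pi)
ic_int = I(('y',), ('pi',), ('x',)) + I(('x',), ('pi',), ('y',))  # internal IC
print(ic_ext, ic_int)
```

Here the external observer learns all of $H(X) = 1$ bit, while internally Alice learns nothing ($I(Y;\pi|X) = 0$) and Bob learns only $H(X \mid Y) = h(0.1) \approx 0.469$ bits.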
\subsection{IC and Communication}
It turns out that information complexity limits how far protocols can be compressed. A paper by [Barak et al.] proved the following main theorem:
\begin{theorem}[Barak et al.]
If $\pi$ communicates $C$ bits and conveys $I$ bits of information, then it can be compressed to an $\tilde{O}(\sqrt{I \cdot C})$-bit protocol.
\end{theorem}
\noindent Ideally we would be able to compress to an $\tilde{O}(I)$-bit protocol; one might hope to get there by simply applying the theorem repeatedly, though this does not work directly. Even so, the result is strong: it gives a protocol whose communication is roughly the geometric mean of $C$ and $I$. \newline
\subsection{IC and Amortized Communication}
\noindent A later work by [Braverman, Rao] gives another theorem:
\begin{theorem}[Braverman, Rao]
Amortized communication complexity $\approx$ information complexity. More precisely, take $t$ independent copies of the inputs, $\{X_1, X_2, \ldots, X_t\}$ for Alice and $\{Y_1, Y_2, \ldots, Y_t\}$ for Bob. The communication needed to compute $\pi(X_i, Y_i)$ for all $i \in [t]$ is clearly in the interval $[tI, tC]$, and by this theorem the interval can be sharpened to $[tI, \tilde{O}(tI)]$.
\end{theorem}
\noindent This tells us that, after taking many, many copies, the communication can be squeezed down to the information cost. The result led to hopes that IC would determine CC even without amortization. This question was open until work by [Ganor-Kol-Raz] and [Braverman]:
\begin{theorem}[GKR]
$\exists f$ such that $\text{IC}_{int}(f) = k$ and $\text{CC}(f) \geq 2^k$.
\end{theorem}
\begin{theorem}[Braverman]
$\forall f$, $\text{CC}(f) \leq 2^{O(\text{IC}(f))}$.
\end{theorem}
\noindent Interestingly, in the past many American and Soviet information theorists discovered identical results but could not communicate because of language barriers. Today, computer scientists in different fields similarly arrive at identical or very similar results. \newline
\noindent This amortized result was also shown via distributed source coding [Ishwar-Ma].
\begin{exercise}
Potential project idea: Read this paper and compare its results to what we showed in class.
\end{exercise}
\subsection{Limits of Amortization}
What should $t$ be with respect to $n$? Since $I$ is independent of $t$, the amortized problem is more tractable; however, we have
$$
\text{IC}(f) = \inf_{\pi \ \text{computing} \ f}\{\text{IC}(\pi)\}.
$$
\noindent To evaluate this infimum we must search over a countably infinite set of protocols, so computability is not obvious.
\subsubsection{Entropy}
Both single-shot and amortized entropy are computable in polynomial time: the single-shot cost via Huffman coding, and the amortized cost by directly computing $H$.
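As a quick illustration of the single-shot versus amortized distinction (a sketch with made-up example distributions): Huffman coding gives the optimal one-shot expected length $L$, which satisfies $H \leq L < H + 1$, while the amortized cost per symbol is exactly the entropy $H$.

```python
import heapq
from math import log2

def entropy(probs):
    """Shannon entropy in bits of a probability vector."""
    return -sum(p * log2(p) for p in probs if p > 0)

def huffman_expected_length(probs):
    """Expected codeword length of an optimal (Huffman) prefix code.

    Uses the fact that the expected length equals the sum of the
    probabilities of the internal nodes of the Huffman tree.
    """
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a, b = heapq.heappop(heap), heapq.heappop(heap)
        total += a + b            # each merge adds one bit to its whole subtree
        heapq.heappush(heap, a + b)
    return total

dyadic = [0.5, 0.25, 0.125, 0.125]   # dyadic source: Huffman meets entropy exactly
H1, L1 = entropy(dyadic), huffman_expected_length(dyadic)

skewed = [0.4, 0.3, 0.2, 0.1]        # non-dyadic: H <= L < H + 1
H2, L2 = entropy(skewed), huffman_expected_length(skewed)
print(H1, L1, H2, L2)
```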
\subsubsection{Channel Capacity}
The single-shot problem could be NP-complete, while the amortized problem (Shannon's channel capacity) can be solved in polynomial time via convex programming.
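The amortized computation can be sketched with the classical Blahut--Arimoto alternating-maximization algorithm (the specific algorithm and the example channels below are my additions, not from lecture):

```python
from math import log2

def blahut_arimoto(W, iters=5000):
    """Capacity (bits per use) of the discrete memoryless channel
    W[x][y] = P(y | x), via Blahut-Arimoto alternating maximization."""
    nx, ny = len(W), len(W[0])
    p = [1.0 / nx] * nx                                  # input distribution
    for _ in range(iters):
        # Output law induced by the current input distribution.
        qy = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
        # D[x] = KL divergence between channel row x and the output law.
        D = [sum(W[x][y] * log2(W[x][y] / qy[y]) for y in range(ny) if W[x][y] > 0)
             for x in range(nx)]
        # Multiplicative update: p(x) proportional to p(x) * 2^D[x].
        Z = sum(p[x] * 2 ** D[x] for x in range(nx))
        p = [p[x] * 2 ** D[x] / Z for x in range(nx)]
    qy = [sum(p[x] * W[x][y] for x in range(nx)) for y in range(ny)]
    return sum(p[x] * W[x][y] * log2(W[x][y] / qy[y])
               for x in range(nx) for y in range(ny) if W[x][y] > 0)

bsc = [[0.9, 0.1], [0.1, 0.9]]   # binary symmetric channel, crossover 0.1
zch = [[1.0, 0.0], [0.5, 0.5]]   # Z-channel, crossover 0.5
c_bsc, c_z = blahut_arimoto(bsc), blahut_arimoto(zch)
print(c_bsc, c_z)  # approx 1 - h(0.1) = 0.531 and log2(5/4) = 0.322
```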
\subsubsection{Zero-Error Channel Capacity}
One of our papers considers the case where the error is exactly $0$ rather than tending to $0$. Here the single-shot problem is believed to be NP-complete, and the amortized problem (the Shannon capacity) may not be in $P$ and may not even be computable. Work by [Lovasz] gives upper bounds on the Shannon capacity (the Lov\'asz theta function). Note that this problem is much more combinatorial, whereas the case where the error tends to $0$ is information-theoretic. The problem can be defined in terms of common randomness generation.
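A minimal illustration of the combinatorial flavor of zero-error capacity (my own example, using the standard pentagon channel): the confusability graph is the $5$-cycle $C_5$; a single channel use supports only $\alpha(C_5) = 2$ zero-error messages, but two uses support $\alpha(C_5 \boxtimes C_5) = 5$, so the Shannon capacity is at least $\sqrt{5}$ symbols per use (Lov\'asz's theta function shows this is tight). The brute-force check below verifies both independence numbers.

```python
from itertools import combinations, product

# Pentagon channel: five symbols, each confusable with its two cyclic
# neighbors, so the confusability graph is the 5-cycle C5.
n = 5
def confusable(u, v):                 # single symbols
    return u != v and (u - v) % n in (1, n - 1)

def confusable2(a, b):                # pairs of symbols: strong product of C5 with C5
    return a != b and all(x == y or confusable(x, y) for x, y in zip(a, b))

def independent(S, adj):
    return all(not adj(u, v) for u, v in combinations(S, 2))

# One shot: alpha(C5) = 2 zero-error messages.
alpha1 = max(k for k in range(1, n + 1)
             if any(independent(S, confusable) for S in combinations(range(n), k)))

# Two shots: the classical size-5 independent set {(i, 2i mod 5)} ...
code = [(i, 2 * i % n) for i in range(n)]
assert independent(code, confusable2)
# ... and (by brute force) no independent set of size 6 exists.
vertices = list(product(range(n), repeat=2))
assert not any(independent(S, confusable2) for S in combinations(vertices, 6))
print(alpha1, len(code))              # 2 and 5: capacity at least 5 ** 0.5 per use
```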
\begin{defn}[Common Randomness Generation]
Given $(X, Y) \sim \mu$ (potentially correlated), give Alice $\{X_1, \ldots, X_t\}$ and Bob $\{Y_1, \ldots, Y_t\}$. Also give each of them private randomness, $R_A$ and $R_B$ respectively, and allow them to communicate $\gamma \cdot t$ bits. We want Alice and Bob to output a common string $\{R_1, R_2, \ldots, R_{\rho t}\}$ with each $R_i \sim \text{Unif}$, where $\gamma, \rho \in (0, 1]$.
\end{defn}
\noindent This raises the question of whether or not we can find a protocol for $\mu$ with parameters $(\gamma, \rho)$. In particular, can we get $\gamma < \rho$? The case $X = Y \sim \text{Unif}$ requires no work. The hard case is when $X$ and $Y$ are only slightly correlated.\newline
\noindent Work on this problem goes back to [??]. The first paper mentioned in our list is by [Witsenhausen]; it considers the distribution with $\mathbb{P}[00] = \frac{1}{2}$ and $\mathbb{P}[01] = \mathbb{P}[10] = \frac{1}{4}$. Another paper, by [Ahlswede-Csisz\'ar], considers the case where the two outputs $(R_1, R_1')$ are usually equal but occasionally not. Other papers, including one by [Cuff-Liu-Verd\'u], look at the ratio
$$
\max_{\pi}\left\{\frac{\text{IC}_{ext}(\pi)}{\text{IC}_{int}(\pi)}\right\},
$$
which is fundamental in this line of research.
\subsection{Computability of IC}
\noindent $\text{IC}(f)$ is not necessarily computable, but it is computably approximable; [Braverman] shows this result. In a similar sense, some papers (one by [Ghazi, Kamath, Sudan]) prove theorems that tell us whether parameters $(\gamma, \rho)$ are achievable for the common randomness generation problem. \newline
\noindent Another quantity that is not known to be computable but is computably approximable is the entropy rate of a hidden Markov chain we looked at before, whose two states emit $X_1 \sim \text{Bern}(\delta)$ and $X_2 \sim \text{Bern}(\frac{1}{2}-\delta)$ and whose transition matrix is $$\begin{pmatrix} 1-p & p \\ p & 1-p \end{pmatrix}.$$
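The computable approximation can be illustrated directly (a sketch with illustrative parameter values $\delta = 0.1$, $p = 0.2$): for each fixed $n$ the conditional entropy $H(X_n \mid X_1, \ldots, X_{n-1})$ is exactly computable by enumerating all length-$n$ outputs with the forward algorithm, and these values decrease monotonically to the entropy rate, giving a convergent sequence of upper bounds.

```python
from itertools import product
from math import log2

# Hidden Markov process: two hidden states with symmetric transition
# probability p; state 0 emits a 1 with prob d, state 1 with prob 1/2 - d.
# Parameter values are illustrative.
d, p = 0.1, 0.2
emit = [d, 0.5 - d]              # P(output = 1 | hidden state)
T = [[1 - p, p], [p, 1 - p]]     # transition matrix
init = [0.5, 0.5]                # stationary distribution of the symmetric chain

def prob(xs):
    """P(X_1..X_n = xs) via the forward algorithm."""
    f = [init[s] * (emit[s] if xs[0] else 1 - emit[s]) for s in (0, 1)]
    for x in xs[1:]:
        f = [sum(f[s] * T[s][t] for s in (0, 1)) * (emit[t] if x else 1 - emit[t])
             for t in (0, 1)]
    return sum(f)

def block_entropy(n):
    """H(X_1..X_n), by exact enumeration of all 2^n output strings."""
    return -sum(q * log2(q)
                for xs in product((0, 1), repeat=n)
                for q in [prob(xs)] if q > 0)

# H(X_n | X_1..X_{n-1}) = H_n - H_{n-1}: nonincreasing upper bounds on the rate.
bounds = [block_entropy(n) - block_entropy(n - 1) for n in range(2, 9)]
print(bounds)
```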
\begin{exercise}
Come up with an idea for the final project (maybe relating to IC)!
\end{exercise}
\section{Next Topics}
The next topics we will look at are broad (some even form their own field).
\subsection{\underline{Streaming Algorithms}}
\begin{itemize}
\item Communication complexity lower bounds lead to streaming lower bounds.
\item How can you capture the limits/bounds? IT!
\end{itemize}
\subsection{\underline{Data Structures}}
\begin{itemize}
\item I have large amounts of data and want to store it efficiently, in a way that is efficiently and reliably accessible.
\item Rich literature and an active area of research.
\end{itemize}
\subsection{Differential Privacy}
\begin{itemize}
\item How private is a mechanism? (Defined probabilistically)
\item IT-tools are used to measure this via bounds.
\item IT-tools are used to design mechanisms and to analyze limits.
\end{itemize}
\subsection{Learning/Statistics/Finance (Portfolio Optimization)}
\subsection{Complexity of Optimization}
\subsection{Extension Complexity}
\begin{itemize}
\item A paper by [Yannakakis] proved that any symmetric linear program for TSP requires exponentially many constraints and thus has exponential size. Communication complexity lower bounds were used to obtain this restricted impossibility result (for symmetric linear programs).
\item A paper by [Fiorini et al.] used DISJ lower bounds to show impossibility for general linear programs.
\item A paper by [Braverman, Moitra] used information complexity results for lower bounds.
\end{itemize}
\subsection{Hardness of Approximation}
\begin{itemize}
\item ``2-prover proof systems''
\item Parallel Repetition
\item This line of work started with [Raz '94], which opened the eyes of computer scientists to information theory. The result was improved in [Holenstein].
\end{itemize}
\subsection{Shearer's Lemma}
Shearer's lemma bounds the entropy of a joint distribution by the entropies of its projections. Equality in the inequality implies that the underlying object is a box. What about approximate equality? [Ellis et al.] showed that approximate equality implies the object is approximately a box.
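For reference, one standard form of the lemma (stated here for completeness; this notation is not from earlier in these notes): let $X = (X_1, \ldots, X_n)$ be jointly distributed random variables and let $\mathcal{F}$ be a family of subsets of $[n]$ such that every $i \in [n]$ lies in at least $k$ members of $\mathcal{F}$. Then
$$k \cdot H(X_1, \ldots, X_n) \leq \sum_{S \in \mathcal{F}} H(X_S).$$
Applying this to a uniform point of a finite set $A \subseteq \Z^3$ with $\mathcal{F} = \{\{1,2\}, \{2,3\}, \{1,3\}\}$ (so $k = 2$) gives $|A|^2 \leq |A_{12}| \cdot |A_{23}| \cdot |A_{13}|$, where $A_{ij}$ is the projection of $A$ onto coordinates $i, j$; equality forces $A$ to be a box, which is the case referred to above.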
\end{document}