Edit Distance Codes via Synchronization Strings
Madhu Sudan
Harvard
I will describe a recent approach to designing codes that correct for "editing" errors - i.e., where an adversary is allowed to delete some of the symbols in a string being transmitted and insert new symbols. The classical Hamming model of errors (where adversary is allowed to change some symbols) may be viewed a special case where the number of insertions equals the number of deletions and insertions and deletions happen at the same location.
In STOC 2017 Haeupler and Shahrasbi introduced an elegant notion that they call a "synchronization string" that modularly converts standard error-correcting codes into edit distance codes with almost no loss in parameters and just a slight loss in alphabet size, while retaining algorithmic tractability of error-correction. A concrete example of such a result (from upcoming work with Haeupler and Shahrasbi, ICALP 2018) we show that for every epsilon, delta and gamma, there exists an alphabet Sigma and integer L and (efficient) coding and decoding schemes over the alphabet Sigma of rate to 1 - delta - epsilon that allow delta fraction deletions and gamma fraction insertions where the decoder outputs a list of at most L codewords that is guaranteed to include the transmitted string. In particular the rate does not depend on gamma and indeed gamma can be much larger than 1!
In this talk I will describe the notion of synchronization strings and this modular reduction from edit distance coding to Hamming distance coding.