# Mike66: Abstracts

- Graham Cormode (AT&T, USA)

**"On 'Selection and sorting with limited storage'"**

In 'Selection and Sorting with Limited Storage' [MP78], Munro and Paterson presented results on the selection (median finding) problem. Their model allowed a small number of one-way passes over input stored on a read-ony tape are possible. This paper is now regarded as one of the first works in the streaming model. I will recap the results in [MP78], and discuss two more recent results on this theme: approximate selection on a stream with both 'insertion' and 'deletion' records; and tight lower bounds for multi-pass median finding when the input data is randomly ordered.

- Maxime Crochemore (King's College London, UK)

**"Faster than Mike? Not yet done!"**

The problem of comparing two sequences to determine their similarity is one of the fundamental problems in pattern matching. Given two sequences over a fixed size alphabet, the classical algorithms for efficiently doing it use dynamic programming, and compare two strings in quadratic time.

We address the challenge of computing the similarity of two strings in sub-quadratic time following the fastest known method of Masek and Paterson (1980). The algorithm applies to both local and global similarity computations and answers one of their question because it works for metrics that use a scoring matrix of unrestricted weights.

The speed-up is achieved by dividing the dynamic programming matrix into variable-size blocks induced by the Ziv-Lempel parsing of both strings, and utilising the inherent periodic nature of the strings. For most texts, the time complexity is O(h n^2/log n) where h ≤ 1 is the entropy of the sequences.

- Josep Diaz (Universitat Politecnica de Catalunya, Spain)

**"New bounds for the phase transition of 3SAT"**

I will survey recent developments of upper bounds for 3SAT threshold (phase transition).

- Paul E. Dunne (University of Liverpool, UK)

**"Computational complexity and algorithms in semantics of abstract argumentation frameworks"**

Abstract argumentation frameworks (AFs) model argumentation processes via (finite) directed graph structures ⟨ X, A ⟩ in which X represents "atomic arguments" (so that undisputed facts, propositions, deductive templates and arbitrary supporting cases are all viewed as single indivisible "arguments") and A describes an "attack" relation between arguments: ⟨ x, y ⟩ ∈ A is read as "the argument x

*attacks*the argument y," e.g. if x is an argument asserting a proposition ¬ p and y asserts the proposition p then both ⟨ x, y ⟩ and ⟨ y, x ⟩ are attacks; similarly if y is an argument "q since (p and r)" again ⟨ x, y ⟩ could form an attack. An important feature of*abstract*AFs is that the structure of arguments and reasons occasioning attacks are not explicitly modelled.Since their initial formulation in the mid-1990s various proposals for the notion of "collection of justified arguments" have been advanced in terms of desirable properties of subsets of X. By analogy with concepts originating in the study of non-classical logics, the term "extension-based semantics" has become associated with such proposals. In algorithmic and complexity-theoretic terms these divers forms give rise to a number of generic decision problems involving AFs as part of their instance, e.g. verifying that S ⊆ X satisfies the criteria imposed by semantics s; deciding if the argument x ∈ X is in at least one (alternatively

*every*) subset S sanctioned by semantics s; deciding if the subsets sanctioned by semantics s are exactly those sanctioned by semantics s', etc.In this talk a brief survey of recent (2002 to present) algorithmic and complexity related work on argumentation semantics dealing with topics such as proof complexity, classification of novel semantics (e.g. value-based, ideal, semi-stable); and the effects of various graph-theoretic restrictions, will be given.

- Martin E. Dyer (University of Leeds, UK)

**"A complexity dichotomy for hypergraph partition functions"**

Partition functions are studied in statistical physics. In the case of graphs, the complexity of exactly evaluating these functions was considered by Bulatov and Grohe. They gave a dichotomy between problems which are in P and problems which are #P-hard. We consider the case of uniform hypergraphs with edge size at least 3, and prove a similar dichotomy theorem. However, this gives a surprisingly different characterisation for the problems in P from that of Bulatov and Grohe.

Joint work with Leslie Goldberg and Mark Jerrum**.**

- Faith Ellen (University of Toronto, Canada)

**"Valency arguments"**

The valency argument is one of the most important proof techniques for proving impossibility results in distributed computing. It was introduced by Fischer, Lynch, and Paterson in the mid 1980's to prove that consensus is unsolvable in an asynchronous message passing system.

This talk will present the valency argument and some variants of it, describe a number of different lower bounds that can be proved using this technique, and discuss some broader implications of these results.

- Michael J. Fischer (Yale, USA)

**"Analysis of Think-a-Dot"**

Think-a-Dot is a mathematical toy introduced by E.S.R. Inc. in the 1960's. Using plastic and marbles, it implements a state machine with 8 binary flip-flops. Marbles dropped through one of three holes in the top take a path through the toy that depends on the states of the flip-flops. Each flip-flop on the path flips state as the marble passes by. The states of the flip-flops show as patterns of blue and yellow dots through holes in the plastic case. One problem is to drop a sequence of marbles so as to achieve a particular target pattern.

Mike Paterson began using Think-a-Dot to illustrate finite state machines in his teaching as far back as 1969, where it is mentioned in the outline for M.I.T. course 18.161. Albert Meyer, Mike and I further developed this material and wrote it up as lecture notes for subsequent courses we all taught. It eventually found its way into a 2004 Yale technical report, but it has never been published.

These results are as beautiful today as they were when Mike first sketched them out. They also illustrate Mike's love of games and puzzles, his fascination with algorithms and combinatorial problems, and his ability to apply his broad mathematical knowledge to problems of interest. In this talk, I present a complete characterization of the Think-a-Dot state-transition group along with an algorithm and correctness proof for finding the minimal input sequence to move from one state to another.

- Kazuo Iwama (Kyoto University, Japan)

**"Big circuit reduction and network coding"**

Constructing k independent sessions between k source-sink pairs with the help of a linear operation at each vertex is one of the most standard problems in network coding. For an unbounded k, this is known to be NP-hard. Very recently, a polynomial-time algorithm was given for k=2 [Wang and Shroff, ISIT 2007], but was open for a general (constant) k. This talk gives a polynomial-time algorithm for this problem under the assumption that the size of the finite field for the linear operations is bounded by a fixed constant. Our main technique is what we call ''Big Circuit Reduction,'' which reduces a given (large) linear circuit only by cutting lines.

This is a joint work with Harumichi Nishimura,

*Mike Paterson*, Rudy Raymond, and Shigeru Yamashita.

- Mark Jerrum (Queen Mary London, UK)

**"An approximation trichotomy for Boolean #CSP"**

This talk examines the computational complexity of approximating the number of solutions to a Boolean constraint satisfaction problem (CSP). It extends a line of investigation started in a classical paper of Schaefer on the complexity of the decision problem for CSPs, and continued by Creignou and Hermann, who addressed exact counting. We find that the class of Boolean CSPs exhibits a trichotomy. Depending on the set of allowed relations, the CSP may be polynomial-time solvable (even exactly); or the number of solutions may be as hard to approximate as the number of accepting computations of a non-deterministic Turing machine. But there is a third possibility: approximating the number of solutions may be complete for a certain logically defined complexity class, and hence equivalent in complexity to a number of natural approximate counting problems, of which independent sets in a bipartite graph is an example.

I'll concentrate on definitions and context at the expense of proofs.

This is joint work with Martin Dyer (Leeds) and Leslie Ann Goldberg (Liverpool).

- Richard M. Karp (University of California at Berkeley, USA)

**"Implicit set cover problems"**

Let U be a finite set and S a family of subsets of U. Define a

*hitting*set as a subset of U that intersects every element of S. The*optimal hitting set problem*is: given a positive weight for each element of U, find a hitting set of minimum total weight. This problem is equivalent to the classic weighted set cover problem. We consider the*implicit hitting set problem*, in which the family S is not explicitly given, but there is an oracle that will supply a member of S satisfying certain conditions; for example, we might ask for a minimum-cardinality set in S that is disjoint from a given set Q. The problems of finding a minimum feedback arc set or minimum feedback vertex set in a digraph are examples of implicit hitting set problems. Our interest is in the number of oracle queries and the number of computation steps required to find an optimal hitting set. After presenting some generic algorithms for this problem we focus on our computational experience with an implicit hitting set problem related to the alignment of multiple genomes.

- Donald E. Knuth (Stanford University, USA)

**"The amazing Y functions"**

In the early 1950s, John Milnor invented a game based on a family of self-dual monotone Boolean functions of n choose 2 variables. Twenty years later, Craige Schensted discovered many remarkable properties of those functions. This talk summarizes some of the highlights, and presents partial results on a problem that has stumped the speaker (so far).

- Kurt Mehlhorn (Max-Planck-Institut für Informatik, Saarbrücken, Germany)

**"Assigning papers to reviewers"**

CS conferences typically have a program committee (PC) that selects the papers for the conference from among the submitted papers. The work of the PC is supported by a conference support system, e.g., EasyChair. It is the task of the program chair to assign the papers to the members of the PC. In order to achieve an effective assignment, the PC members classify the papers according to interest. The EasyChair conference system knows four levels: strongly interested, weakly interested, not interested, conflict of interest. Given the classification of the papers by the PC members, the chair seeks a good assignment. What are the right objectives? Here are some: balancing the load of the PC members, making the task of the PC members worthwhile by assigning their high interest papers, guaranteeing each paper a sufficient number of reviews, guaranteeing each paper a sufficient total level of interest, and so on. I will discuss several versions of the problem; I will pose more questions than I give answers.

- J. Ian Munro (University of Waterloo, Canada)

**"Succinct data structures: Techniques and lower bounds"**

Indexing text files with methods such as suffix trees and suffix arrays permits extremely fast search for substrings. Unfortunately the space cost of these can dominate that of the raw data. For example, the naive implementation of a suffix tree on genetic information could take 80 times as much space as the raw data. Succinct data structures offer a technique by which the extra space of the indexing can be kept, at least in principle, to a ''little oh'' with respect to the raw data. This begs the question of how much extra space is necessary to support fast substring searches of other queries such as the rank/select problem or representing a permutation so that both the forward permutation and its inverse can be determined quickly. After a brief introduction to some of the techniques of succinct data structures we survey some lower bounds on this type of problem.

- S. Muthukrishnan (Google Research, USA)

**"String convolutions"**

Fischer and Paterson showed a connection between string matching problems and vector convolutions in early 70's that remains central to understanding the difficulty of problems in stringology, convolutions or even finite state machines and beyond. I will present an overview of the story of string convolutions.

- Maurice Nivat (LIAFA, Paris, France)

**"About the birth of Theoretical Computer Science"**

A new chapter of science was born between 1968 and 1975 and is called

*Theoretical Computer Science*. Mike Paterson, whom I met for the first time in Colchester in September 1969, played a major role in the creation of both the European Association of Theoretical Computer Science and the journal TCS, Theoretical Computer Science. He played an even greater role in the affirmation that computer science is indeed a science (this was far from obvious in the late sixties) and that theoretical computer science, which is very close to mathematics but distinct in its motivation and inspiration, is indeed a challenging and fruitful field of research.

- Vaughan Pratt (Stanford University, USA)

**"Affine algebra: numbers from geometry"**

Dispensing with notions of zero and metric, we develop the affine integers, rationals, reals, and their complex counterparts purely algebraically in terms of purely geometric constructs. We extend the variety Ab of abelian groups to a larger variety Af of

*groves*(affine abelian groups), with the group multiplication of Ab replaced by a binary reflection operation. A*complex grove*is a grove with a binary quarter-turn operation, while independently a*den*(dense grove) is a grove with a family of finitary centroid operations. Every operation is axiomatized by finitely many equations.We extend this countable framework to the continuum via a notion of

*quorum*for a subset X of a den as a polyhedron (finitely generated centroid subalgebra) bijective with both X and its intersection with X. Sets X and Y*converge*when all pairs of quorums one for each intersect; convergence is a PER (partial equivalence relation). A*Cauchy set*is a set that converges with itself. The*completion*of a den or complex den D is comprised of its Cauchy sets modulo convergence, itself a den embedding D as its singleton Cauchy sets. A den homomorphism is*continuous*when it is compatible with convergence, such as that embedding.The integers, Gaussian integers, rationals, complex rationals, reals, and complex numbers, along with their usual metrics, modules, and vector spaces, then emerge by adjoining the constant 0 to the language, with no further equations.

- S. Cenk Sahinalp (Simon Fraser University, Canada)

**"How would you like to have your edit distance computed? Exactly or approximately?"**

The edit distance between two strings is defined as the minimum number of single character insertions, deletions and replacements to transform one string into the other. The fastest algorithm for computing the edit distance between two strings is by Masek and Paterson (1980) and runs in O(n

^{2}/log n) time. More recently (2006) Batu, Ergun and Sahinalp showed that if "some" level of approximation (only O(n^{1/3+o(1)}) can be tolerated, the edit distance between two strings can be computed in (almost) linear time.

- Leslie G. Valiant (Harvard University, USA)

**"Evolvability"**

Living organisms function according to complex mechanisms that operate in different ways depending on conditions. Darwin's theory of evolution suggests that such mechanisms evolved through random variation guided by natural selection. However, there has existed no theory that would explain quantitatively which mechanisms can so evolve in realistic population sizes within realistic time periods, and which are too complex. In this paper we suggest such a theory. Evolution is treated as a form of computational learning from examples in which the course of learning is influenced only by the aggregate fitness of the current hypothesis on the examples, and not otherwise by specific examples. We formulate a notion of evolvability that distinguishes function classes that are evolvable with polynomially bounded resources from those that are not. It is shown that in any one phase of evolution where selection is for one beneficial behavior, monotone Boolean conjunctions and disjunctions are demonstrably evolvable over the uniform distribution, while Boolean parity functions are demonstrably not. The framework also allows a wider range of issues in evolution to be quantified. Less formally, we suggest that the overall mechanism that underlies biological evolution is evolvable target pursuit, which consists of a series of evolutionary stages, each one pursuing an evolvable target in the technical sense suggested above, each target being rendered evolvable by the serendipitous combination of the environment and the outcomes of previous evolutionary stages.