Wednesday, October 26, 2011

Lower bound for element distinctness problem


Summary: We discuss the method of proving lower bounds on a linear decision tree model using connected components argument. A lower bound of Ω(n logn) for element distinctness problem(EDP) follows from this argument. 
 

We are given an input X=(x1,x2,..,x_n) and a property P; the problem is to decide if X has property P. In EDP, we are interested in knowing if all x_is are distinct.


What is a connected component?


Every input point belongs to R^n. Hence, the entire input space R^n can be partioned into `Yes' and `No' regions; a point from `Yes' region has property P while a point from `No' region does not.

In EDP, problem suppose n=2, so the input space is R^2. The input space has two regions: Yes and No. Points that lie in x1<x2 and x1>x2 form the `Yes' region and points on the line x1=x2 form the `No' region.

Two points a and b are said to be connected if they both belong to `Yes'(No) region and there exists a continuous path from a to b that passes only through `Yes'(No) region. For example, all points in x1<x2 region are connected. But a point a in x1<x2 and b in x1>x2 are not connected since any path from a to b must pass through the line x1=x2 which is a `No' region.


Computational Model

The model of computation here is a linear decision tree, which takes as input X and outputs `Yes' or `No' depending on whether X satisfies the given property P or not. For EDP, the decision tree outputs `Yes` if all the elements are distinct and `No' otherwise. Each internal node v of the tree contains a linear function in input variables d_v(X) and three edges: >0, <0 and =0. Given an input vector X, we decide to take the branch corresponding to the value of d_v(X). Computation starts at the root node and control branches recursively to its children. The decision is made when control reaches a leaf. Each leaf node is labelled `Yes' or `No'. We can think of the decision tree as a function F : Rn → {Yes,No} .

Linear decision trees have the following geometric interpretation: Each internal node defines a hyperplane with equation d(X). We branch at a node depending on whether the input is above, below or on the hyperplane. Let l be a leaf node. Consider the set of all input points R(l) that reach the leaf node l. The set R(l) contains all the points which satisfy the inequalities or equalities along its root-to-leaf path. Since such a set is formed by intersection of half-planes, the resulting set must be a convex region. A convex region is always connected. Hence, a single leaf node can accommodate only one connected component.

Claim 1. If two points a and b reach a leaf  then any point that is a convex combination of a and b must also reach the same leaf.

Bound on the height of the tree

The number of leaves is at least the number of connected components in $F^{-1}(Yes)$ and $F^{-1}(No)$. It follows that the height of the linear decision tree must be at least $\log_{3}( F^{-1}(Yes) + F^{-1}(No) )$, implying a lower bound on the number of comparisons the linear decision tree must make in order to give the correct answer.



Number of connected components in EDP problem

All that we have to do is get a lower bound of the number of connected components.

Claim 2. There are n! 'Yes' connected components in R^n.

Take a point a in R^n with all distinct coordinates and let b be a non-identical permutation of coordinates of a. For example, if a=(1,4,5) then b can be (4,5,1) in R^3. Since b is a non-identical permutation of a, there must be an inversion pair (i,j). i.e there must be two coordinates i and j such that a_i > a_j and b_i < b_j or vice versa. Can inputs a and b reach the same leaf? This is not possible since the straight line connecting a and b must pass through some point with x_i=x_j. So there exists a convex combination of a and b which belongs to 'No' region(which contradicts Claim 1). Every non-identical permutation of a is in a connected component of its own, hence there are n! connected components.

Claim 3. The height of the decision tree must be at least log(n!) = Ω(n log n).



Takeaway Puzzle: Prove an Ω(n log n) lower bound for the following problem by reducing from EDP: Given an array of numbers is there a contiguous subsequence whose sum is zero?

Friday, October 14, 2011

Theory Chat: k-server, Metrical Task Systems & Metrical Service...

Theory Chat: k-server, Metrical Task Systems & Metrical Service...: [@ 2:30pm. 15 Sept 2011. SIC-305 KReSIT]

In this talk I introduced the three problems mentioned in the title and the connection between them. I mentioned a few lower / upper bounds. I later introduced the problem I am working on - metrical service systems with k servers. I presented a proof of a lower bound on the competitive ratio, which holds for any metric space with sufficiently many points. For the case when the underlying metric space is uniform (and the problem is in some sense a variant of the caching problem), I presented a deterministic algorithm (proposed by Prof. Sundar) with competitive ratio k times the lower bound. Improving on this upper bound is one of the problems I am stuck with.

The proof of the competitive ratio used a result called the skew Bollobás theorem, which I would like to talk on some time. The proof uses what is called the "wedge product" of vectors, in addition to elementary linear algebra. In another talk, I would like to present a randomized algorithm for MSS with k servers on uniform metric space, as well as a few online traversal problem, and their "equivalence" with one another and with MSS.

Ashish Chiplunkar

Tuesday, October 4, 2011

Wish List


Topics I would like to learn about:
  1. Multiplicative weights update method.[link]
  2. Semi-definite programming.
  3. Natural Proofs--I know nothing about this, but sounds interesting.
  4. What are the common ways of exploiting planarity in design of algorithms?
  5. Derandomization/Random walks/Expanders.

Topics I would like to give talks on:

  1. Lower bounds for Element distinctness Problem.
    Consider the following problem: Given an array of n elements x1,..,xn decide if xi=xj for some i≠j. Prove that any comparison-based algorithm takes at least Ω(nlogn) operations to decide if all the elements are distinct.
  2. Trade-off revealing LPs
  3. Lower bounds for Linear Satisfiability. An Ω(nr/2) lower bound for the following problem: For some fixed linear equation in r variables, given a set of n real numbers, do any r of them satisfy the equation? [ pdf ]
  4. Broadcast scheduling on paths. A problem I was initially working on..there is a small gap in bounds and I would like to solicit ideas for improvement.
  5. Farkas Lemma and its applications.
  6. Introduction to Matroids.



--Jagadish



Topics I would like to see presented:
  1. "A deterministic single exponential time algorithm for most lattice problems based on voronoi cell computations" by Daniele Micciancio and Panagiotis Voulgaris, a breakthrough result on the shortest vector problem. [ pdf ]
  2. Proof(s) of the Borsuk Ulam theorem, a generalization of Brouwer's fixed point theorem to arbitrary dimensions.[ pdf ] Another excellent source is the book by Matousek, titled "Using the Borsuk-Ulam theorem".
  3. An intro to entropy and information theory.
  4. Szemeredi's regularity lemma and applications.
Note: I am willing to discuss on the Micciancio-Voulgaris paper.

Topics I would like to present:

  1. Expanders.
  2. Linear algebraic methods in combinatorics.
  3. The Guth-Katz bound on the Erdős distance problem: Terrence Tao has a nice expository post. This is a complex proof and I could use some help.
  4. Extractors and derandomization.
--Aravind



Topics I would like to see presented:
  1. Interior point methods to solve convex optimization problems
  2. UGC, hardness of approximation
  3. SDPs
  4. Expanders
  5. PCP theorem :D
Topics I would like to present:
  1. Two player games and Lemke-Howson algorithm
  2. PPAD problems and properties
  3. Lemke's algorithm

-Ruta.

Monday, October 3, 2011

Short & Clever Results

  1. Katona's proof of Erdos-Ka-Rado theorem. [link]
  2. Karger's randomized min-cut algorithm.[Notes]
  3. Interactive Proof System.
  4. Deffie-Hellman protocol.[link]
  5. Dobkin and Lipton's lower bound proof of element distinctness problem.
  6. Reflections on trust by Ken Thompson.[pdf]
  7. Schöning's Algorithm for 3SAT.[pdf]
  8. Perceptron algorithm for finding a separating plane.
  9. Deradomization using an expander.(Linial's ppt)
  10. Least Common Ancestor in O(1) time. (Erik Demaine notes).