Difference between revisions of "TADM2E 3.11"

From Algorithm Wiki
Jump to: navigation, search
(2. Design a data structure that uses O(n) space and answers queries in O(\log n) time.)
Line 45: Line 45:
  
 
--[[User:Ajitq|Ajitq]] ([[User talk:Ajitq|talk]]) 12:20, 3 May 2018 (UTC)
 
--[[User:Ajitq|Ajitq]] ([[User talk:Ajitq|talk]]) 12:20, 3 May 2018 (UTC)
 +
 +
'''Question:''' Why not use a sorted array? The queries can be executed with O(1) processing on O(n) space. The question does not require anything other than O(logn) query time and O(n) space. i.e., no requirements on insert and delete or any such other operations. <math>x_i</math> is always going to be the answer for any (i,j) because we have them sorted :-) And O(1) is definitely O(logn)

Revision as of 09:46, 27 September 2019

1. Design a data structure that uses $ O(n^2) $ space and answers queries in $ O(1) $ time.

Build an $ n × n $ lookup table with the answer to every possible query, indexed by $ i $ and $ j $.

2. Design a data structure that uses $ O(n) $ space and answers queries in $ O(\log n) $ time.

The basic idea in this solution is to build a multi-resolution representation of the input, with each node able to answer the query for part of the input sequence. A tree of such nodes makes it possible to answer arbitrary queries quickly by covering the query range with ranges covered by nodes.

Build a binary tree in which each node holds the lowest value for a range of indexes. The root node spans the whole input sequence. The root node's children span the left and right halves of the input sequence, and so on. Each leaf node spans a single-element "range" of the input, with the "lowest" value in that range being the value at that position in the input sequence. The total number of nodes in the tree is $ O(n) $.

The query function is recursive, starting from the root:

  • If the query range exactly matches the range spanned by the current node, return the current node's value.
  • If the query range falls entirely within the left half, return the result of calling the query function on the left child.
  • If the query range falls entirely within the right half, return the result of calling the query function on the right child.
  • Otherwise return the lowest of the results from calling the query function on the left child and the right child.

The query function will visit at most two leaf nodes, and runs in $ O(\log n) $ time.


While it is true that at most 2 leaf nodes are visited, it is clear to this reader that the above algorithm has worst case $ O( n) $ running time. For example, say we have elements indexed 0 to 15, and we are interested in querying the minimum of elements 1 to 14, inclusive. Then following the above procedure will require visiting nodes [1], [2,3], [4,7], [8,11], [12,13] and [14], a total of $ O(\log n) $ nodes (since there are at most 2 queried nodes per level of the tree), but requiring a distance of 8 to traverse. Indeed, for any $ n=2^k - 1 $ and query $ [1,2^k-2] $, $ O(\log n) $ nodes will be visited, with $ O(n) $ traversal time. See below for a true $ O(\log n) $ algorithm (basically the above algorithm branches down the tree from the root in multiple directions, while the below algorithm begins at a leaf and takes a single path up and then down the tree).


--Benw (talk) 18:08, 7 January 2015 (EST)

==== 2. (Alternate explanation)

We can assume, w.l.o.g., that n is a power of 2. For instance, if n is 16. We can then calculate minima for the following ranges:

  • 0-15
  • 0-7 and 8-15
  • 0-3, 4-7, 8-11 and 12-15
  • 0-1, 2-3, 4-5, 6-7, 8-9, 10-11, 12-13, 14-15

Note that this is O(n) space, or more precisely, (n-1) space. Now, to calculate the minimum for a particular range, build up non-overlapping ranges from the above set of ranges that cumulatively cover the desired range. If the range is 3-14, for example, the build-up would be 3; 4-7; 8-11; 12-13, and 14. This is O(log n). The minimum of the values of these ranges gives us the desired answer. To calculate the minimum of m unsorted values is O(m), and since m is O(log n), the minimum calculation is also O(log n).

An algorithm to obtain the covering 3; 4-7; 8-11; 12-13; 14 is as follows: Call $ i=3, j=15 $. We are trying to cover the half open interval $ [3,15) $. Initialize $ k=i $: at each step of the algorithm, we add the largest power of 2 not greater than the least significant bit of $ k $, which keeps $ k $ below $ j $. So starting with $ k=11 $ (in binary) we have $ k=k+01=100 $. Then we have $ k=k+100=1000 $. Next we attempt $ k=k+1000 $, but this leads to $ k>j $; so we try $ k=k+0100=1100 $. Next we do $ k=k+0010=1110 $. Lastly $ k=k+0001 $. The sequence of $ k $-values gives us the minimal set of ranges needed to cover $ [3,15) $; namely, $ [3,4) \; [4,8) \; [8,12) \; [12,14) \; [14,15) $, as desired.

--Ajitq (talk) 12:20, 3 May 2018 (UTC)

Question: Why not use a sorted array? The queries can be executed with O(1) processing on O(n) space. The question does not require anything other than O(logn) query time and O(n) space. i.e., no requirements on insert and delete or any such other operations. $ x_i $ is always going to be the answer for any (i,j) because we have them sorted :-) And O(1) is definitely O(logn)