Difference between pages "Chapter 11" and "Chapter 3"

From The Algorithm Design Manual Solution Wiki
(Difference between pages)
Jump to navigation Jump to search
 
 
Line 1: Line 1:
=NP-Completeness=
+
=Data Structure=
  
===Transformations and Satisfiability===
+
===Stacks, Queues, and Lists===
  
:[[11.1]]. Give the 3-SAT formula that results from applying the reduction of SAT to 3-SAT for the formula:
+
:[[3.1]]. A common problem for compilers and text editors is determining whether the parentheses in a string are balanced and properly nested. For example, the string <math>((())())()</math> contains properly nested pairs of parentheses, which the strings <math>)()(</math> and <math>())</math> do not. Give an algorithm that returns true if a string contains properly nested and balanced parentheses, and false if otherwise. For full credit, identify the position of the first offending parenthesis if the string is not properly nested and balanced.
:::<math> (x\or y \or \overline z \or w \or u \or \overline v) \and (\overline x \or \overline
+
[[3.1|Solution]]
y \or z \or \overline w \or u \or v) \and (x \or \overline y \or \overline z \or w \or u \or \overline v)\and (x \or \overline y) </math>
 
[[11.1|Solution]]
 
  
  
:11.2. Draw the graph that results from the reduction of 3-SAT to vertex cover for the expression
+
:3.2. Give an algorithm that takes a string <math>S</math> consisting of opening and closing parentheses, say )()(())()()))())))(, and finds the length of the longest balanced parentheses in <math>S</math>, which is 12 in the example above. (Hint: The solution is not necessarily a contiguous run of parenthesis from <math>S</math>.)
:::<math>(x \or \overline y \or z) \and (\overline x \or y \or \overline z) \and(\overline x \or y \or z) \and (x \or \overline y \or \overline x) </math>
+
  
 +
:[[3.3]]. Give an algorithm to reverse the direction of a given singly linked list. In other words, after the reversal all pointers should now point backwards. Your algorithm should take linear time.
 +
[[3.3|Solution]]
  
:[[11.3]]. Prove that 4-SAT is NP-hard.
 
[[11.3|Solution]]
 
  
 +
:3.4. Design a stack <math>S</math> that supports ''S.push(x)'', ''S.pop()'', and ''S.findmin()'', which returns the minimum element of <math>S</math>. All operations should run in constant time.
  
:11.4. ''Stingy'' SAT is the following problem: given a set of clauses (each a disjunction of literals) and an integer <math>k</math>, find a satisfying assignment in which at most <math>k</math> variables are true, if such an assignment exists. Prove that stingy SAT is NP-hard.
 
  
 +
:[[3.5]]. We have seen how dynamic arrays enable arrays to grow while still achieving constant-time amortized performance. This problem concerns extending dynamic arrays to let them both grow and shrink on demand.
 +
:(a) Consider an underflow strategy that cuts the array size in half whenever the array falls below half full. Give an example sequence of insertions and deletions where this strategy gives a bad amortized cost.
 +
:(b) Then, give a better underflow strategy than that suggested above, one that achieves constant amortized cost per deletion.
 +
[[3.5|Solution]]
  
:[[11.5]]. The ''Double SAT'' problem asks whether a given satisfiability problem has '''at least two different satisfying assignments'''. For example, the problem <math>{{v1, v2}, {v_1, v_2}, {v_1, v_2}}</math> is satisfiable, but has only one solution <math>(v_1 =F, v_2 = T)</math>. In contrast, <math>{{v_1, v_2}, {v_1, v_2}}</math> has exactly two solutions. Show that Double-SAT is NP-hard.
 
[[11.5|Solution]]
 
  
 +
:3.6. Suppose you seek to maintain the contents of a refrigerator so as to minimize food spoilage. What data structure should you use, and how should you use it?
  
:11.6. Suppose we are given a subroutine that can solve the traveling salesman decision problem on page 357 in (say) linear time. Give an efficient algorithm to find the actual TSP tour by making a polynomial number of calls to this subroutine.
 
  
 +
:[[3.7]]. Work out the details of supporting constant-time deletion from a singly linked list as per the footnote from page 79, ideally to an actual implementation. Support the other operations as efficiently as possible.
 +
[[3.7|Solution]]
  
:[[11.7]]. Implement a SAT to 3-SAT reduction that translates satisfiability instances into equivalent 3-SAT instances.
+
===Elementary Data Structures===
[[11.7|Solution]]
 
  
 +
:3.8. Tic-tac-toe is a game played on an <math>n * n</math> board (typically <math>n = 3</math>) where two players take consecutive turns placing “O” and “X” marks onto the board cells. The game is won if n consecutive “O” or ‘X” marks are placed in a row, column, or diagonal. Create a data structure with <math>O(n)</math> space that accepts a sequence of moves, and reports in constant time whether the last move won the game.
  
:11.8. Design and implement a backtracking algorithm to test whether a set of clause sets is satisfiable. What criteria can you use to prune this search?
 
  
 +
:[[3.9]]. Write a function which, given a sequence of digits 2–9 and a dictionary of <math>n</math> words, reports all words described by this sequence when typed in on a standard telephone keypad. For the sequence 269 you should return any, box, boy, and cow, among other words.
 +
[[3.9|Solution]]
  
:[[11.9]]. Implement the vertex cover to satisfiability reduction, and run the resulting clauses through a satisfiability solver code. Does this seem like a practical way to compute things?
 
[[11.9|Solution]]
 
  
===Basic Reductions===
+
:3.10. Two strings <math>X</math> and <math>Y</math> are anagrams if the letters of <math>X</math> can be rearranged to form <math>Y</math> . For example, silent/listen, and incest/insect are anagrams. Give an efficient algorithm to determine whether strings <math>X</math> and <math>Y</math> are anagrams.
  
:11.10. An instance of the ''set cover'' problem consists of a set <math>X</math> of <math>n</math> elements, a family <math>F</math> of subsets of <math>X</math>, and an integer <math>k</math>. The question is, does there exist <math>k</math> subsets from <math>F</math> whose union is <math>X</math>? For example, if <math>X = \{1,2,3,4\}</math> and <math>F = \{ \{1,2\}, \{2,3\}, \{4\}, \{2,4\} \}</math>, there does not exist a solution for <math>k=2</math>, but there does for <math>k=3</math> (for example, <math> \{1,2\}, \{2,3\}, \{4\}</math>). Prove that set cover is NP-complete with a reduction from vertex cover.
+
===Trees and Other Dictionary Structures===
  
 +
:[[3.11]]. Design a dictionary data structure in which search, insertion, and deletion can all be processed in <math>O(1)</math> time in the worst case. You may assume the set elements are integers drawn from a finite set <math>1, 2, .., n</math>, and initialization can take <math>O(n)</math> time.
 +
[[3.11|Solution]]
  
:[[11.11]]. The ''baseball card collector problem'' is as follows. Given packets <math>P_1, \ldots, P_m</math>, each of which contains a subset of this year's baseball cards, is it possible to collect all the year's cards by buying <math>\leq k</math> packets? For example, if the players are <math> \{Aaron, Mays, Ruth, Steven \} </math> and the packets are
 
:::<math> \{ \{Aaron,Mays\}, \{Mays,Ruth\}, \{Steven\}, \{Mays,Steven\} \}, </math>
 
:there does not exist a solution for <math>k=2</math>, but there does for <math>k=3</math>, such as
 
:::<math> \{Aaron,Mays\}, \{Mays,Ruth\}, \{Steven\} </math>
 
:Prove that the baseball card collector problem is NP hard using a reduction from vertex cover.
 
  
 +
:3.12. The maximum depth of a binary tree is the number of nodes on the path from the root down to the most distant leaf node. Give an <math>O(n)</math> algorithm to find the maximum depth of a binary tree with <math>n</math> nodes.
  
:11.12. The ''low-degree spanning tree problem'' is as follows. Given a graph <math>G</math> and an integer <math>k</math>, does <math>G</math> contain a spanning tree such that all vertices in the tree have degree ''at most'' <math>k</math> (obviously, only tree edges count towards the degree)? For example, in the following graph, there is no spanning tree such that all vertices have a degree less than three.
 
\fixedfigsize{pictures/lowdegree.png}{1.0in}
 
#Prove that the low-degree spanning tree problem is NP-hard with a reduction from Hamiltonian ''path''.
 
#Now consider the ''high-degree spanning tree problem'', which is as follows. Given a graph <math>G</math> and an integer <math>k</math>, does <math>G</math> contain a spanning tree whose highest degree vertex is ''at least'' <math>k</math>? In the previous example, there exists a spanning tree with a highest degree of 8. Give an efficient algorithm to solve the high-degree spanning tree problem, and an analysis of its time complexity.
 
  
 +
:[[3.13]]. Two elements of a binary search tree have been swapped by mistake. Give an <math>O(n)</math> algorithm to identify these two elements so they can be swapped back.
 +
[[3.13|Solution]]
  
:[[11.13]].In the minimum element set cover problem, we seek a set cover <math>S \subseteq C</math> of a universal set <math>U = {1, . . . , n}</math> such that sum of the sizes of the subsets in <math>S</math> is at most <math>k</math>.
 
::(a) Show that <math>C = {{1, 2, 3}, {1, 3, 4}, {2, 3, 4}, {3, 4, 5}}</math> has a cover of size 6, but none of size 5 because of a repeated element.
 
::(b) Prove that this problem is NP-hard. (Hint: set cover remains hard if all subsets are of the same size.)
 
[[11.13|Solution]]
 
  
 +
:3.14. Given two binary search trees, merge them into a doubly linked list in sorted order.
  
:11.14. The ''half-Hamiltonian cycle problem'' is, given a graph <math>G</math> with <math>n</math> vertices, determine whether <math>G</math> has a simple cycle of length exactly <math>[n/2]</math>, where the floor function rounds its input down to the nearest integer. Prove that this problem is NP-hard.
 
  
 +
:[[3.15]]. Describe an <math>O(n)</math>-time algorithm that takes an <math>n</math>-node binary search tree and constructs an equivalent height-balanced binary search tree. In a height-balanced binary search tree, the difference between the height of the left and right subtrees of every node is never more than 1.
 +
[[3.15|Solution]]
  
:[[11.15]]. The 3-phase power balance problem asks for a way to partition a set of n positive integers into three sets <math>A</math>, <math>B</math>, or <math>C</math> such that <math>\sum_{i} a_i = \sum_{i} b_i = \sum_{i} c_i</math>. Prove that this problem is NP-hard using a reduction from integer partition or subset sum (see Section 10.5 (page 329)).
 
[[11.15|Solution]]
 
  
 +
:3.16. Find the storage efficiency ratio (the ratio of data space over total space) for each of the following binary tree implementations on n nodes:
 +
:(a) All nodes store data, two child pointers, and a parent pointer. The data field requires 4 bytes and each pointer requires 4 bytes.
 +
:(b) Only leaf nodes store data; internal nodes store two child pointers. The data field requires four bytes and each pointer requires two bytes.
  
:11.16. Show that the following problem is NP-complete:
 
:* ''Problem:'' Dense subgraph
 
:* ''Input:'' A graph <math>G</math>, and integers <math>k</math> and <math>y</math>.
 
:* ''Output:'' Does <math>G</math> contain a subgraph with exactly <math>k</math> vertices and at least <math>y</math> edges?
 
  
 +
:[[3.17]]. Give an <math>O(n)</math> algorithm that determines whether a given <math>n</math>-node binary tree is height-balanced (see Problem 3-15).
 +
[[3.17|Solution]]
  
:[[11.17]]. Show that the following problem is NP-complete:
 
:* ''Problem:'' Clique, no-clique
 
:* ''Input:'' An undirected graph <math>G=(V,E)</math> and an integer <math>k</math>.
 
:* ''Output:'' Does <math>G</math> contain both a clique of size <math>k</math> and an independent set of size <math>k</math>.
 
[[11.17|Solution]]
 
  
 +
:3.18. Describe how to modify any balanced tree data structure such that search, insert, delete, minimum, and maximum still take <math>O(log n)</math> time each, but successor and predecessor now take <math>O(1)</math> time each. Which operations have to be modified to support this?
  
:11.18. n ''Eulerian cycle'' is a tour that visits every edge in a graph exactly once. An ''Eulerian subgraph'' is a subset of the edges and vertices of a graph that has an Eulerian cycle. Prove that the problem of finding the number of edges in the largest Eulerian subgraph of a graph is NP-hard. (Hint: the Hamiltonian circuit problem is NP-hard even if each vertex in the graph is incident upon exactly three edges.)
 
  
 +
:[[3.19]]. Suppose you have access to a balanced dictionary data structure that supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in <math>O(log n)</math> time. Explain how to modify the insert and delete operations so they still take <math>O(log n)</math> but now minimum and maximum take <math>O(1)</math> time. (Hint: think in terms of using the abstract dictionary operations, instead of mucking about with pointers and the like.)
 +
[[3.19|Solution]]
  
:[[11.19]]. Show that the following problem is NP-hard:
 
:*''Problem:'' Maximum Common Subgraph
 
:*''Input:'' Two graphs <math>G_1 = (V_1, E_1)</math> and <math>G_2 = (V_2, E_2)</math>, and a budget <math>b</math>.
 
:*Output: Two sets of nodes <math>S_1 \subseteq V_1</math> and <math>S_2 \subseteq V_2</math> whose deletion leaves at least <math>b</math> nodes in each graph, and makes the two graphs identical.
 
[[11.19|Solution]]
 
  
 +
:3.20. Design a data structure to support the following operations:
 +
:• ''insert''(<math>x,T</math>) – Insert item <math>x</math> into the set <math>T</math>.
 +
:• ''delete''(<math>k,T</math>) – Delete the <math>k</math>th smallest element from <math>T</math>.
 +
:• ''member''(<math>x,T</math>) – Return true iff <math>x \in T</math>.
 +
:All operations must take <math>O(log n)</math> time on an <math>n</math>-element set.
  
:11.20. A ''strongly independent'' set is a subset of vertices <math>S</math> in a graph <math>G</math> such that for any two vertices in <math>S</math>, there is no path of length two in <math>G</math>. Prove that strongly independent set is NP-hard.
 
  
 +
:[[3.21]]. A ''concatenate operation'' takes two sets <math>S_1</math> and <math>S_2</math>, where every key in <math>S_1</math> is smaller than any key in <math>S_2</math>, and merges them. Give an algorithm to concatenate two binary search trees into one binary search tree. The worst-case running time should be <math>O(h)</math>, where <math>h</math> is the maximal height of the two trees.
 +
[[3.21|Solution]]
  
:[[11.21]]. A ''kite'' is a graph on an even number of vertices, say <math>2n</math>, in which <math>n</math> of the vertices form a clique and the remaining <math>n</math> vertices are connected in a tail that consists of a path joined to one of the vertices of the clique. Given a graph and a goal g, the ''max kite'' problem asks for a subgraph that is a kite and contains <math>2g</math> nodes. Prove that ''max kite'' is NP-hard.
+
===Applications of Tree Structures===
[[11.21|Solution]]
 
  
===Creative Reductions===
+
:3.22. Design a data structure that supports the following two operations:
 +
:• ''insert''(<math>x</math>) – Insert item <math>x</math> from the data stream to the data structure.
 +
:• ''median''() – Return the median of all elements so far.
 +
:All operations must take <math>O(log n)</math> time on an <math>n</math>-element set.
  
:11.22. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Hitting Set
 
:* ''Input:'' A collection <math>C</math> of subsets of a set <math>S</math>, positive integer <math>k</math>.
 
:* ''Output:'' Does <math>S</math> contain a subset <math>S'</math> such that <math>|S'| \leq k</math> and each subset in <math>C</math> contains at least one element from <math>S'</math>?
 
  
 +
:[[3.23]]. Assume we are given a standard dictionary (balanced binary search tree) defined on a set of <math>n</math> strings, each of length at most <math>l</math>. We seek to print out all strings beginning with a particular prefix <math>p</math>. Show how to do this in <math>O(ml log n)</math> time, where <math>m</math> is the number of strings.
 +
[[3.23|Solution]]
  
:[[11.23]]. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Knapsack
 
:* ''Input:'' A set <math>S</math> of <math>n</math> items, such that the <math>i</math>th item has value <math>v_i</math> and weight <math>w_i</math>.  Two positive integers: weight limit <math>W</math> and value requirement <math>V</math>.
 
:* ''Output:'' Does there exist a subset <math>S' \in S</math> such that <math>\sum_{i \in S'} w_i \leq W</math> and <math>\sum_{i \in S'} v_i \geq V</math>?
 
:(Hint: start from integer partition.)
 
[[11.23|Solution]]
 
  
 +
:3.24. An array <math>A</math> is called <math>k</math>-unique if it does not contain a pair of duplicate elements within <math>k</math> positions of each other, that is, there is no <math>i</math> and <math>j</math> such that <math>A[i] = A[j]</math> and <math>|j - i| \leq k</math>. Design a worst-case <math>O(n log k)</math> algorithm to test if <math>A</math> is <math>k</math>-unique.
  
:11.24. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Hamiltonian Path
 
:* ''Input:'' A graph <math>G</math>, and vertices <math>s</math> and <math>t</math>.
 
:* ''Output:'' Does <math>G</math> contain a path which starts from <math>s</math>, ends at <math>t</math>, and visits all vertices without visiting any vertex more than once? (Hint: start from Hamiltonian cycle.)
 
  
 +
:[[3.25]]. In the ''bin-packing problem'', we are given <math>n</math> objects, each weighing at most 1 kilogram. Our goal is to find the smallest number of bins that will hold the <math>n</math> objects, with each bin holding 1 kilogram at most.
 +
:• The ''best-fit heuristic'' for bin packing is as follows. Consider the objects in the order in which they are given. For each object, place it into the partially filled bin with the smallest amount of extra room after the object is inserted. If no such bin exists, start a new bin. Design an algorithm that implements the best-fit heuristic (taking as input the <math>n</math> weights <math>w_1, w_2, ..., w_n</math> and outputting the number of bins used) in <math>O(n log n)</math> time.
 +
:• Repeat the above using the ''worst-fit heuristic'', where we put the next object into the partially filled bin with the largest amount of extra room after the object is inserted.
 +
[[3.25|Solution]]
  
:[[11.25]]. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Longest Path
 
:* ''Input:'' A graph <math>G</math> and positive integer <math>k</math>.
 
:* ''Output:'' Does <math>G</math> contain a path that visits at least <math>k</math> different vertices without visiting any vertex more than once?
 
[[11.25|Solution]]
 
  
 +
:3.26. Suppose that we are given a sequence of <math>n</math> values <math>x_1, x_2, ..., x_n</math> and seek to quickly answer repeated queries of the form: given <math>i</math> and <math>j</math>, find the smallest value in <math>x_i, . . . , x_j</math>.
 +
:(a) Design a data structure that uses <math>O(n^2)</math> space and answers queries in <math>O(1)</math> time.
 +
:(b) Design a data structure that uses <math>O(n)</math> space and answers queries in <math>O(log n)</math> time. For partial credit, your data structure can use <math>O(n log n)</math> space and have <math>O(log n)</math> query time.
  
:11.26. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Dominating Set
 
:* ''Input:'' A graph <math>G=(V,E)</math> and positive integer <math>k</math>.
 
:* ''Output:'' Is there a subset <math>V' \in V</math> such that <math>|V'|\leq k</math> where for each vertex <math>x \in V</math> either <math>x \in V'</math> or there exists an edge <math>(x,y)</math>, where <math>y \in V'</math>.
 
  
 +
:[[3.27]]. Suppose you are given an input set <math>S</math> of <math>n</math> integers, and a black box that if given any sequence of integers and an integer <math>k</math> instantly and correctly answers whether there is a subset of the input sequence whose sum is exactly <math>k</math>. Show how to use the black box <math>O(n)</math> times to find a subset of <math>S</math> that adds up to <math>k</math>.
 +
[[3.27|Solution]]
  
:[[11.27]]. Prove that the vertex cover problem (does there exist a subset <math>S</math> of <math>k</math> vertices in a graph <math>G</math> such that every edge in <math>G</math> is incident upon at least one vertex in <math>S</math>?) remains NP-complete even when all the vertices in the graph are restricted to have even degrees.
 
[[11.27|Solution]]
 
  
 +
:3.28. Let <math>A[1..n]</math> be an array of real numbers. Design an algorithm to perform any sequence of the following operations:
 +
:• ''Add''(<math>i,y</math>) – Add the value <math>y</math> to the <math>i</math>th number.
 +
:• ''Partial-sum''(<math>i</math>) – Return the sum of the first <math>i</math> numbers, that is, <math>\sum_{j=1}^i A[j]</math>.
 +
:There are no insertions or deletions; the only change is to the values of the numbers. Each operation should take <math>O(log n)</math> steps. You may use one additional array of size <math>n</math> as a work space.
  
:11.28. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Set Packing
 
:* ''Input:'' A collection <math>C</math> of subsets of a set <math>S</math>, positive integer <math>k</math>.
 
:* ''Output:'' Does <math>S</math> contain at least <math>k</math> disjoint subsets (i.e., such that none of these subsets have any elements in common?)
 
  
 +
:[[3.29]]. Extend the data structure of the previous problem to support insertions and deletions. Each element now has both a ''key'' and a ''value''. An element is accessed by its key, but the addition operation is applied to the values. The ''Partial_sum'' operation is different.
 +
:• ''Add''(<math>k,y</math>) – Add the value <math>y</math> to the item with key <math>k</math>.
 +
:• ''Insert''(<math>k,y</math>) – Insert a new item with key <math>k</math> and value <math>y</math>.
 +
:• ''Delete''(<math>k</math>) – Delete the item with key <math>k</math>.
 +
:• ''Partial-sum''(<math>k</math>) – Return the sum of all the elements currently in the set whose key is less than <math>k</math>, that is, <math>\sum_{i < k} x_i</math>.
 +
:The worst-case running time should still be <math>O(n log n)</math> for any sequence of <math>O(n)</math> operations.
 +
[[3.29|Solution]]
  
:[[11.29]]. Prove that the following problem is NP-complete:
 
:* ''Problem:'' Feedback Vertex Set
 
:* ''Input:'' A directed graph <math>G=(V,A)</math> and positive integer <math>k</math>.
 
:* ''Output:'' Is there a subset <math>V' \in V</math> such that <math>|V'|\leq k</math>, such that deleting the vertices of <math>V'</math> from <math>G</math> leaves a DAG?
 
[[11.29|Solution]]
 
  
 +
:3.30. You are consulting for a hotel that has <math>n</math> one-bed rooms. When a guest checks in, they ask for a room whose number is in the range <math>[l, h]</math>. Propose a data structure that supports the following data operations in the allotted time:
 +
:(a) ''Initialize''(<math>n</math>): Initialize the data structure for empty rooms numbered <math>1, 2, . . . , n</math>, in polynomial time.
 +
:(b) ''Count''(<math>l, h</math>): Return the number of available rooms in <math>[l, h]</math>, in <math>O(log n)</math> time.
 +
:(c) ''Checkin''(<math>l, h</math>): In <math>O(log n)</math> time, return the first empty room in <math>[l, h]</math> and mark it occupied, or return NIL if all the rooms in <math>[l, h]</math> are occupied.
 +
:(d) ''Checkout''(<math>x</math>): Mark room <math>x</math> as unoccupied, in <math>O(log n)</math> time.
  
:11.30. Give a reduction from Sudoku to the vertex coloring problem in graphs. Specifically, describe how to take any partially filled Sudoku board and construct a graph that can be colored with nine colors iff the Sudoku board is solvable.
 
  
===Algorithms for Special Cases===
+
:[[3.31]]. Design a data structure that allows one to search, insert, and delete an integer <math>X</math> in <math>O(1)</math> time (i.e., constant time, independent of the total number of integers stored). Assume that <math>1 \leq X \leq n</math> and that there are <math>m + n</math> units of space available, where <math>m</math> is the maximum number of integers that can be in the table at any one time. (Hint: use two arrays <math>A[1..n]</math> and <math>B[1..m]</math>.) You are not allowed to initialize either <math>A</math> or <math>B</math>, as that would take <math>O(m)</math> or <math>O(n)</math> operations. This means the arrays are full of random garbage to begin with, so you must be very careful.
 +
[[3.31|Solution]]
  
:[[11.31]]. A Hamiltonian path <math>P</math> is a path that visits each vertex exactly once. The problem of testing whether a graph <math>G</math> contains a Hamiltonian path is NP-complete.
+
===Implementation Projects===
There does not have to be an edge in <math>G</math> from the ending vertex to the starting vertex of <math>P</math>, unlike in the Hamiltonian cycle problem. Give an <math>O(n+m)</math>-time algorithm to test whether a directed acyclic graph <math>G</math> (a DAG) contains a Hamiltonian path. (Hint: think about topological sorting and DFS.)
 
[[11.31|Solution]]
 
  
 +
:3.32. Implement versions of several different dictionary data structures, such as linked lists, binary trees, balanced binary search trees, and hash tables. Conduct experiments to assess the relative performance of these data structures in a simple application that reads a large text file and reports exactly one instance of each word that appears within it. This application can be efficiently implemented by maintaining a dictionary of all distinct words that have appeared thus far in the text and inserting/reporting each new word that appears in the stream. Write a brief report with your conclusions.
  
:11.32. Consider the ''k''-clique problem, which is the general clique problem restricted to graphs in which every vertex has degree at most <math>k</math>. Prove that ''k''-clique has an efficient algorithm for any given <math>k</math>, meaning that <math>k</math> is a constant.
 
  
 +
:[[3.33]]. A Caesar shift (see Section 21.6 (page 697)) is a very simple class of ciphers for secret messages. Unfortunately, they can be broken using statistical properties of English. Develop a program capable of decrypting Caesar shifts of sufficiently long texts.
 +
[[3.33|Solution]]
  
:[[11.33]]. The <math>2</math>-SAT problem is, given a Boolean formula in 2-conjunctive normal form (CNF), to decide whether the formula is satisfiable. <math>2</math>-SAT is like <math>3</math>-SAT, except that each clause can have only two literals. For example, the following formula is in <math>2</math>-CNF: <math> (x_1 \or x_2) \and (\bar{x}_2 \or x_3) \and (x_1 \or \bar{x}_3) </math>
+
===Interview Problems===
:Give a polynomial-time algorithm to solve <math>2</math>-SAT.
 
[[11.33|Solution]]
 
  
 +
:3.34. What method would you use to look up a word in a dictionary?
  
:11.34, Show that the following problems are in NP:
 
#Does graph <math>G</math> have a simple path (i.e., with no vertex repeated) of length <math>k</math>? \#Is integer <math>n</math> composite (i.e., not prime)?
 
#Does graph <math>G</math> have a vertex cover of size <math>k</math>?
 
  
 +
:[[3.35]]. Imagine you have a closet full of shirts. What can you do to organize your shirts for easy retrieval?
 +
[[3.35|Solution]]
  
:[[11.35]]. Until 2002, it was an open question whether the decision problem ''Is integer <math>n</math> a composite number, in other words, not prime?'' can be computed in time polynomial in the size of its input. Why doesn't the following algorithm suffice to prove it is in P, since it runs in <math>O(n)</math> time?
+
 
  PrimalityTesting(<math>n</math>)
+
:3.36. Write a function to find the middle node of a singly linked list.
      composite := <math>false</math>
+
 
      for i := 2 to <math>n-1</math> do
+
 
        if <math>(n\,\bmod\,i) = 0</math> then
+
:[[3.37]]. [4] Write a function to determine whether two binary trees are identical. Identical trees have the same key value at each position and the same structure.
            composite := <math>true</math>
+
[[3.37|Solution]]
[[11.35|Solution]]
+
 
 +
 
 +
:3.38. Write a program to convert a binary search tree into a linked list.
 +
 
 +
 
 +
:[[3.39]]. Implement an algorithm to reverse a linked list. Now do it without recursion.
 +
[[3.39|Solution]]
 +
 
 +
 
 +
:3.40. What is the best data structure for maintaining URLs that have been visited by a web crawler? Give an algorithm to test whether a given URL has already been visited, optimizing both space and time.
 +
 
 +
 
 +
:[[3.41]]. You are given a search string and a magazine. You seek to generate all the characters in the search string by cutting them out from the magazine. Give an algorithm to efficiently determine whether the magazine contains all the letters in the search string.
 +
[[3.41|Solution]]
 +
 
 +
 
 +
:3.42. Reverse the words in a sentence—that is, “My name is Chris” becomes “Chris is name My.” Optimize for time and space.
 +
 
 +
 
 +
:[[3.43]]. Determine whether a linked list contains a loop as quickly as possible without using any extra storage. Also, identify the location of the loop.
 +
[[3.43|Solution]]
 +
 
 +
 
 +
:3.44. You have an unordered array <math>X</math> of <math>n</math> integers. Find the array <math>M</math> containing <math>n</math> elements where <math>M_i</math> is the product of all integers in <math>X</math> except for <math>X_i</math>. You may not use division. You can use extra memory. (Hint: there are solutions faster than <math>O(n^2)</math>.)
 +
 
 +
 
 +
:[[3.45]]. Give an algorithm for finding an ordered word pair (e.g. “New York”) occurring with the greatest frequency in a given webpage. Which data structures would you use? Optimize both time and space.
 +
[[3.45|Solution]]
  
  
 
Back to [[Chapter List]]
 
Back to [[Chapter List]]

Revision as of 15:10, 18 September 2020

Data Structure

Stacks, Queues, and Lists

3.1. A common problem for compilers and text editors is determining whether the parentheses in a string are balanced and properly nested. For example, the string [math]\displaystyle{ ((())())() }[/math] contains properly nested pairs of parentheses, which the strings [math]\displaystyle{ )()( }[/math] and [math]\displaystyle{ ()) }[/math] do not. Give an algorithm that returns true if a string contains properly nested and balanced parentheses, and false if otherwise. For full credit, identify the position of the first offending parenthesis if the string is not properly nested and balanced.

Solution


3.2. Give an algorithm that takes a string [math]\displaystyle{ S }[/math] consisting of opening and closing parentheses, say )()(())()()))())))(, and finds the length of the longest balanced parentheses in [math]\displaystyle{ S }[/math], which is 12 in the example above. (Hint: The solution is not necessarily a contiguous run of parenthesis from [math]\displaystyle{ S }[/math].)


3.3. Give an algorithm to reverse the direction of a given singly linked list. In other words, after the reversal all pointers should now point backwards. Your algorithm should take linear time.

Solution


3.4. Design a stack [math]\displaystyle{ S }[/math] that supports S.push(x), S.pop(), and S.findmin(), which returns the minimum element of [math]\displaystyle{ S }[/math]. All operations should run in constant time.


3.5. We have seen how dynamic arrays enable arrays to grow while still achieving constant-time amortized performance. This problem concerns extending dynamic arrays to let them both grow and shrink on demand.
(a) Consider an underflow strategy that cuts the array size in half whenever the array falls below half full. Give an example sequence of insertions and deletions where this strategy gives a bad amortized cost.
(b) Then, give a better underflow strategy than that suggested above, one that achieves constant amortized cost per deletion.

Solution


3.6. Suppose you seek to maintain the contents of a refrigerator so as to minimize food spoilage. What data structure should you use, and how should you use it?


3.7. Work out the details of supporting constant-time deletion from a singly linked list as per the footnote from page 79, ideally to an actual implementation. Support the other operations as efficiently as possible.

Solution

Elementary Data Structures

3.8. Tic-tac-toe is a game played on an [math]\displaystyle{ n * n }[/math] board (typically [math]\displaystyle{ n = 3 }[/math]) where two players take consecutive turns placing “O” and “X” marks onto the board cells. The game is won if n consecutive “O” or ‘X” marks are placed in a row, column, or diagonal. Create a data structure with [math]\displaystyle{ O(n) }[/math] space that accepts a sequence of moves, and reports in constant time whether the last move won the game.


3.9. Write a function which, given a sequence of digits 2–9 and a dictionary of [math]\displaystyle{ n }[/math] words, reports all words described by this sequence when typed in on a standard telephone keypad. For the sequence 269 you should return any, box, boy, and cow, among other words.

Solution


3.10. Two strings [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are anagrams if the letters of [math]\displaystyle{ X }[/math] can be rearranged to form [math]\displaystyle{ Y }[/math] . For example, silent/listen, and incest/insect are anagrams. Give an efficient algorithm to determine whether strings [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math] are anagrams.

Trees and Other Dictionary Structures

3.11. Design a dictionary data structure in which search, insertion, and deletion can all be processed in [math]\displaystyle{ O(1) }[/math] time in the worst case. You may assume the set elements are integers drawn from a finite set [math]\displaystyle{ 1, 2, .., n }[/math], and initialization can take [math]\displaystyle{ O(n) }[/math] time.

Solution


3.12. The maximum depth of a binary tree is the number of nodes on the path from the root down to the most distant leaf node. Give an [math]\displaystyle{ O(n) }[/math] algorithm to find the maximum depth of a binary tree with [math]\displaystyle{ n }[/math] nodes.


3.13. Two elements of a binary search tree have been swapped by mistake. Give an [math]\displaystyle{ O(n) }[/math] algorithm to identify these two elements so they can be swapped back.

Solution


3.14. Given two binary search trees, merge them into a doubly linked list in sorted order.


3.15. Describe an [math]\displaystyle{ O(n) }[/math]-time algorithm that takes an [math]\displaystyle{ n }[/math]-node binary search tree and constructs an equivalent height-balanced binary search tree. In a height-balanced binary search tree, the difference between the height of the left and right subtrees of every node is never more than 1.

Solution


3.16. Find the storage efficiency ratio (the ratio of data space over total space) for each of the following binary tree implementations on n nodes:
(a) All nodes store data, two child pointers, and a parent pointer. The data field requires 4 bytes and each pointer requires 4 bytes.
(b) Only leaf nodes store data; internal nodes store two child pointers. The data field requires four bytes and each pointer requires two bytes.


3.17. Give an [math]\displaystyle{ O(n) }[/math] algorithm that determines whether a given [math]\displaystyle{ n }[/math]-node binary tree is height-balanced (see Problem 3-15).

Solution


3.18. Describe how to modify any balanced tree data structure such that search, insert, delete, minimum, and maximum still take [math]\displaystyle{ O(log n) }[/math] time each, but successor and predecessor now take [math]\displaystyle{ O(1) }[/math] time each. Which operations have to be modified to support this?


3.19. Suppose you have access to a balanced dictionary data structure that supports each of the operations search, insert, delete, minimum, maximum, successor, and predecessor in [math]\displaystyle{ O(log n) }[/math] time. Explain how to modify the insert and delete operations so they still take [math]\displaystyle{ O(log n) }[/math] but now minimum and maximum take [math]\displaystyle{ O(1) }[/math] time. (Hint: think in terms of using the abstract dictionary operations, instead of mucking about with pointers and the like.)

Solution


3.20. Design a data structure to support the following operations:
insert([math]\displaystyle{ x,T }[/math]) – Insert item [math]\displaystyle{ x }[/math] into the set [math]\displaystyle{ T }[/math].
delete([math]\displaystyle{ k,T }[/math]) – Delete the [math]\displaystyle{ k }[/math]th smallest element from [math]\displaystyle{ T }[/math].
member([math]\displaystyle{ x,T }[/math]) – Return true iff [math]\displaystyle{ x \in T }[/math].
All operations must take [math]\displaystyle{ O(log n) }[/math] time on an [math]\displaystyle{ n }[/math]-element set.


3.21. A concatenate operation takes two sets [math]\displaystyle{ S_1 }[/math] and [math]\displaystyle{ S_2 }[/math], where every key in [math]\displaystyle{ S_1 }[/math] is smaller than any key in [math]\displaystyle{ S_2 }[/math], and merges them. Give an algorithm to concatenate two binary search trees into one binary search tree. The worst-case running time should be [math]\displaystyle{ O(h) }[/math], where [math]\displaystyle{ h }[/math] is the maximal height of the two trees.

Solution

Applications of Tree Structures

3.22. Design a data structure that supports the following two operations:
insert([math]\displaystyle{ x }[/math]) – Insert item [math]\displaystyle{ x }[/math] from the data stream to the data structure.
median() – Return the median of all elements so far.
All operations must take [math]\displaystyle{ O(log n) }[/math] time on an [math]\displaystyle{ n }[/math]-element set.


3.23. Assume we are given a standard dictionary (balanced binary search tree) defined on a set of [math]\displaystyle{ n }[/math] strings, each of length at most [math]\displaystyle{ l }[/math]. We seek to print out all strings beginning with a particular prefix [math]\displaystyle{ p }[/math]. Show how to do this in [math]\displaystyle{ O(ml log n) }[/math] time, where [math]\displaystyle{ m }[/math] is the number of strings.

Solution


3.24. An array [math]\displaystyle{ A }[/math] is called [math]\displaystyle{ k }[/math]-unique if it does not contain a pair of duplicate elements within [math]\displaystyle{ k }[/math] positions of each other, that is, there is no [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math] such that [math]\displaystyle{ A[i] = A[j] }[/math] and [math]\displaystyle{ |j - i| \leq k }[/math]. Design a worst-case [math]\displaystyle{ O(n log k) }[/math] algorithm to test if [math]\displaystyle{ A }[/math] is [math]\displaystyle{ k }[/math]-unique.


3.25. In the bin-packing problem, we are given [math]\displaystyle{ n }[/math] objects, each weighing at most 1 kilogram. Our goal is to find the smallest number of bins that will hold the [math]\displaystyle{ n }[/math] objects, with each bin holding 1 kilogram at most.
• The best-fit heuristic for bin packing is as follows. Consider the objects in the order in which they are given. For each object, place it into the partially filled bin with the smallest amount of extra room after the object is inserted. If no such bin exists, start a new bin. Design an algorithm that implements the best-fit heuristic (taking as input the [math]\displaystyle{ n }[/math] weights [math]\displaystyle{ w_1, w_2, ..., w_n }[/math] and outputting the number of bins used) in [math]\displaystyle{ O(n log n) }[/math] time.
• Repeat the above using the worst-fit heuristic, where we put the next object into the partially filled bin with the largest amount of extra room after the object is inserted.

Solution


3.26. Suppose that we are given a sequence of [math]\displaystyle{ n }[/math] values [math]\displaystyle{ x_1, x_2, ..., x_n }[/math] and seek to quickly answer repeated queries of the form: given [math]\displaystyle{ i }[/math] and [math]\displaystyle{ j }[/math], find the smallest value in [math]\displaystyle{ x_i, . . . , x_j }[/math].
(a) Design a data structure that uses [math]\displaystyle{ O(n^2) }[/math] space and answers queries in [math]\displaystyle{ O(1) }[/math] time.
(b) Design a data structure that uses [math]\displaystyle{ O(n) }[/math] space and answers queries in [math]\displaystyle{ O(log n) }[/math] time. For partial credit, your data structure can use [math]\displaystyle{ O(n log n) }[/math] space and have [math]\displaystyle{ O(log n) }[/math] query time.


3.27. Suppose you are given an input set [math]\displaystyle{ S }[/math] of [math]\displaystyle{ n }[/math] integers, and a black box that if given any sequence of integers and an integer [math]\displaystyle{ k }[/math] instantly and correctly answers whether there is a subset of the input sequence whose sum is exactly [math]\displaystyle{ k }[/math]. Show how to use the black box [math]\displaystyle{ O(n) }[/math] times to find a subset of [math]\displaystyle{ S }[/math] that adds up to [math]\displaystyle{ k }[/math].

Solution


3.28. Let [math]\displaystyle{ A[1..n] }[/math] be an array of real numbers. Design an algorithm to perform any sequence of the following operations:
Add([math]\displaystyle{ i,y }[/math]) – Add the value [math]\displaystyle{ y }[/math] to the [math]\displaystyle{ i }[/math]th number.
Partial-sum([math]\displaystyle{ i }[/math]) – Return the sum of the first [math]\displaystyle{ i }[/math] numbers, that is, [math]\displaystyle{ \sum_{j=1}^i A[j] }[/math].
There are no insertions or deletions; the only change is to the values of the numbers. Each operation should take [math]\displaystyle{ O(log n) }[/math] steps. You may use one additional array of size [math]\displaystyle{ n }[/math] as a work space.


3.29. Extend the data structure of the previous problem to support insertions and deletions. Each element now has both a key and a value. An element is accessed by its key, but the addition operation is applied to the values. The Partial_sum operation is different.
Add([math]\displaystyle{ k,y }[/math]) – Add the value [math]\displaystyle{ y }[/math] to the item with key [math]\displaystyle{ k }[/math].
Insert([math]\displaystyle{ k,y }[/math]) – Insert a new item with key [math]\displaystyle{ k }[/math] and value [math]\displaystyle{ y }[/math].
Delete([math]\displaystyle{ k }[/math]) – Delete the item with key [math]\displaystyle{ k }[/math].
Partial-sum([math]\displaystyle{ k }[/math]) – Return the sum of all the elements currently in the set whose key is less than [math]\displaystyle{ k }[/math], that is, [math]\displaystyle{ \sum_{i \lt k} x_i }[/math].
The worst-case running time should still be [math]\displaystyle{ O(n log n) }[/math] for any sequence of [math]\displaystyle{ O(n) }[/math] operations.

Solution


3.30. You are consulting for a hotel that has [math]\displaystyle{ n }[/math] one-bed rooms. When a guest checks in, they ask for a room whose number is in the range [math]\displaystyle{ [l, h] }[/math]. Propose a data structure that supports the following data operations in the allotted time:
(a) Initialize([math]\displaystyle{ n }[/math]): Initialize the data structure for empty rooms numbered [math]\displaystyle{ 1, 2, . . . , n }[/math], in polynomial time.
(b) Count([math]\displaystyle{ l, h }[/math]): Return the number of available rooms in [math]\displaystyle{ [l, h] }[/math], in [math]\displaystyle{ O(log n) }[/math] time.
(c) Checkin([math]\displaystyle{ l, h }[/math]): In [math]\displaystyle{ O(log n) }[/math] time, return the first empty room in [math]\displaystyle{ [l, h] }[/math] and mark it occupied, or return NIL if all the rooms in [math]\displaystyle{ [l, h] }[/math] are occupied.
(d) Checkout([math]\displaystyle{ x }[/math]): Mark room [math]\displaystyle{ x }[/math] as unoccupied, in [math]\displaystyle{ O(log n) }[/math] time.


3.31. Design a data structure that allows one to search, insert, and delete an integer [math]\displaystyle{ X }[/math] in [math]\displaystyle{ O(1) }[/math] time (i.e., constant time, independent of the total number of integers stored). Assume that [math]\displaystyle{ 1 \leq X \leq n }[/math] and that there are [math]\displaystyle{ m + n }[/math] units of space available, where [math]\displaystyle{ m }[/math] is the maximum number of integers that can be in the table at any one time. (Hint: use two arrays [math]\displaystyle{ A[1..n] }[/math] and [math]\displaystyle{ B[1..m] }[/math].) You are not allowed to initialize either [math]\displaystyle{ A }[/math] or [math]\displaystyle{ B }[/math], as that would take [math]\displaystyle{ O(m) }[/math] or [math]\displaystyle{ O(n) }[/math] operations. This means the arrays are full of random garbage to begin with, so you must be very careful.

Solution

Implementation Projects

3.32. Implement versions of several different dictionary data structures, such as linked lists, binary trees, balanced binary search trees, and hash tables. Conduct experiments to assess the relative performance of these data structures in a simple application that reads a large text file and reports exactly one instance of each word that appears within it. This application can be efficiently implemented by maintaining a dictionary of all distinct words that have appeared thus far in the text and inserting/reporting each new word that appears in the stream. Write a brief report with your conclusions.


3.33. A Caesar shift (see Section 21.6 (page 697)) is a very simple class of ciphers for secret messages. Unfortunately, they can be broken using statistical properties of English. Develop a program capable of decrypting Caesar shifts of sufficiently long texts.

Solution

Interview Problems

3.34. What method would you use to look up a word in a dictionary?


3.35. Imagine you have a closet full of shirts. What can you do to organize your shirts for easy retrieval?

Solution


3.36. Write a function to find the middle node of a singly linked list.


3.37. [4] Write a function to determine whether two binary trees are identical. Identical trees have the same key value at each position and the same structure.

Solution


3.38. Write a program to convert a binary search tree into a linked list.


3.39. Implement an algorithm to reverse a linked list. Now do it without recursion.

Solution


3.40. What is the best data structure for maintaining URLs that have been visited by a web crawler? Give an algorithm to test whether a given URL has already been visited, optimizing both space and time.


3.41. You are given a search string and a magazine. You seek to generate all the characters in the search string by cutting them out from the magazine. Give an algorithm to efficiently determine whether the magazine contains all the letters in the search string.

Solution


3.42. Reverse the words in a sentence—that is, “My name is Chris” becomes “Chris is name My.” Optimize for time and space.


3.43. Determine whether a linked list contains a loop as quickly as possible without using any extra storage. Also, identify the location of the loop.

Solution


3.44. You have an unordered array [math]\displaystyle{ X }[/math] of [math]\displaystyle{ n }[/math] integers. Find the array [math]\displaystyle{ M }[/math] containing [math]\displaystyle{ n }[/math] elements where [math]\displaystyle{ M_i }[/math] is the product of all integers in [math]\displaystyle{ X }[/math] except for [math]\displaystyle{ X_i }[/math]. You may not use division. You can use extra memory. (Hint: there are solutions faster than [math]\displaystyle{ O(n^2) }[/math].)


3.45. Give an algorithm for finding an ordered word pair (e.g. “New York”) occurring with the greatest frequency in a given webpage. Which data structures would you use? Optimize both time and space.

Solution


Back to Chapter List