To understand what makes NP-complete problems so hard, my research takes two approaches: constructing hard problem instances and analyzing how badly solvers perform on them. A simple way to construct satisfiable SAT instances is to choose a random truth assignment A and then draw clauses at random from among those satisfied by A. However, this method tends to produce easy problems, since the majority of literals point toward the "hidden" assignment A. Previously, we proposed a problem generator that cancels this effect by hiding both A and its complement Ā. While the resulting formulas appear to be just as hard for Davis-Putnam-Logemann-Loveland (DPLL) algorithms as random 3-SAT formulas with no hidden assignment, they can be solved by
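The hidden-pair idea above can be sketched in a few lines. The version below is a simplified illustration, not the generator from the text: it draws random 3-clauses and keeps only those satisfied by both A and Ā, which for a 3-clause means exactly one or two of its literals agree with A. (The function names and the rejection-sampling scheme are illustrative assumptions; the actual generator may weight clause types differently.)

```python
import random

def hidden_pair_3sat(n, m, seed=0):
    """Sample an m-clause 3-SAT formula over n variables that is
    satisfied by a hidden assignment A AND by its complement Ā.
    A clause is kept iff 1 or 2 of its literals agree with A, which
    guarantees at least one true literal under both A and Ā."""
    rng = random.Random(seed)
    A = [rng.choice([False, True]) for _ in range(n)]
    clauses = []
    while len(clauses) < m:
        vars_ = rng.sample(range(n), 3)
        clause = [(v, rng.choice([False, True])) for v in vars_]  # (variable, sign)
        # literal (v, s) is true under A iff A[v] == s
        sat_by_A = sum(A[v] == s for v, s in clause)
        if 1 <= sat_by_A <= 2:
            clauses.append(clause)
    return A, clauses

def satisfies(assignment, clauses):
    """Check that every clause has at least one true literal."""
    return all(any(assignment[v] == s for v, s in c) for c in clauses)
```

Because each kept clause has between one and two literals agreeing with A, its remaining literals agree with Ā, so neither assignment is statistically favored by a simple literal count.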
Next, we introduce a new application for constructing "hard to solve" SAT instances: Negative Databases (NDBs), an unusual type of database that stores only the records "not" in the original database. NDBs offer the promise of preserving the confidentiality of the individual records in a database while permitting certain restricted queries, and they have many applications in data security. Constructing an NDB corresponding to a DB is equivalent to constructing a SAT formula with a specified set of truth assignments as its only satisfying truth assignments. We present a new approach to constructing such formulas, and we show that the resulting formula can be satisfied only by truth assignments very close to one of the hidden assignments. We then demonstrate experimentally that the formulas generated by our scheme are much harder to solve than those generated by the previously suggested prefix and RNDB algorithms [25, 27], and that they are indeed hard enough to ensure security. We hope this new approach will create interest in constructing hard SAT formulas as a way to protect the confidentiality of databases.
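The DB-to-SAT equivalence can be made concrete with a deliberately naive construction (not the prefix or RNDB algorithm, and exponential in the record length): emit one clause per record *not* in the DB, forbidding exactly that bit string, so the formula's satisfying assignments are precisely the DB. All names here are illustrative; real NDB constructions are far more compact.

```python
from itertools import product

def naive_ndb_formula(db, l):
    """Naive illustration: one clause per length-l record NOT in db.
    An excluded record r yields the clause 'some bit differs from r',
    so an assignment satisfies the formula iff it lies in db.
    (Exponential in l; shown only to make the equivalence concrete.)"""
    clauses = []
    for r in product([0, 1], repeat=l):
        if r not in db:
            # clause: at least one variable takes the value opposite to r
            clauses.append([(i, 1 - b) for i, b in enumerate(r)])  # (var, required value)
    return clauses

def sat_set(clauses, l):
    """Brute-force the set of satisfying assignments (small l only)."""
    return {a for a in product([0, 1], repeat=l)
            if all(any(a[i] == v for i, v in c) for c in clauses)}

db = {(0, 1, 1), (1, 0, 0)}
assert sat_set(naive_ndb_formula(db, 3), 3) == db
```

The security question in the text is whether a compact NDB of this kind can be reversed; that is exactly the question of how hard the corresponding SAT formula is to solve.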
Finally, to understand how badly solvers perform, we analyze the behavior of two natural variants of the DPLL algorithm for Graph 3-Coloring on sparse random graphs G(n, p = c/n). First, we calculate analytically the probability Pc(0) that these algorithms find a 3-coloring with no backtracking at all, and show that it goes to zero faster than any analytic function as c → c* = 3.847... Then we show that even in the "easy" phase 1 < c < c* where Pc(0) > 0, including just above the emergence of the giant component, the expected number of backtracks is exponentially large with positive probability. To our knowledge this is the first rigorous proof that the running time of a natural backtracking algorithm has a heavy tail for graph coloring. In addition, we give experimental evidence and heuristic arguments that this tail takes the form Pc(b) ∼ b⁻¹ up to an exponential cutoff.
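A backtrack-free run of this kind is easy to simulate. The snippet below is a toy stand-in for the DPLL variants analyzed in the text, not the algorithms themselves: it colors G(n, p = c/n) greedily in random order and reports failure the moment a vertex has no free color, i.e. the moment a real solver would have to backtrack. Function names and parameters are assumptions for illustration.

```python
import random

def greedy_3color_no_backtrack(n, c, seed=0):
    """One trial on G(n, p=c/n): color vertices in random order with the
    lowest color unused by already-colored neighbors; return False as
    soon as a vertex has no free color (where backtracking would start)."""
    rng = random.Random(seed)
    p = c / n
    adj = [set() for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < p:
                adj[i].add(j)
                adj[j].add(i)
    color = [None] * n
    order = list(range(n))
    rng.shuffle(order)
    for v in order:
        used = {color[u] for u in adj[v] if color[u] is not None}
        free = [k for k in range(3) if k not in used]
        if not free:
            return False  # a backtracking solver would branch here
        color[v] = free[0]
    return True

def estimate_P0(n, c, trials=200, seed0=0):
    """Monte Carlo estimate of the backtrack-free success probability."""
    hits = sum(greedy_3color_no_backtrack(n, c, seed=seed0 + t) for t in range(trials))
    return hits / trials
```

Sweeping c in such an experiment shows the estimated success probability collapsing as the density grows toward the threshold, the qualitative behavior that the analytic Pc(0) result makes precise.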
School: The University of New Mexico
School Location: United States -- New Mexico
Source: DAI-B 69/01, Dissertation Abstracts International
Subjects: Statistics, Computer science
Keywords: Data security, Hard instances, Hidden solutions, Negative databases