Dissertation/Thesis Abstract

Hard instances with hidden solutions
by Jia, Haixia, Ph.D., The University of New Mexico, 2007, 73; 3298151
Abstract (Summary)

To understand what makes NP-complete problems so hard, I conduct my research through two approaches: construct hard problem instances and study how badly the solvers work. A simple way to construct SAT instances is to choose a random truth assignment A, and then choose clauses randomly from among those satisfied by A. However, this method tends to produce easy problems, since the majority of literals point toward the "hidden" assignment A. Previously, we proposed a problem generator that cancels this effect by hiding both A and its complement Ā. While the resulting formulas appear to be just as hard for the Davis-Putnam-Logemann-Loveland (DPLL) algorithms as random 3-SAT formulas with no hidden assignment, they can be solved by WalkSAT in only polynomial time. Here we propose a new method to cancel the attraction to A, by choosing a clause with t > 0 literals satisfied by A with probability proportional to qt for some q < 1. By varying q, we can generate formulas whose variables have no bias, i.e., which are equally likely to be true or false; we can even cause the formula to "deceptively" point away from A. We present theoretical and experimental results suggesting that these formulas are exponentially hard both for DPLL algorithms and for incomplete algorithms such as WalkSAT.

Next, we introduce a new application for constructing "hard to solve" SAT instances: Negative Databases (NDBs), a strange type of database that only stores records "not" in the original database. NDBs offer the promise of preserving the confidentiality of the individual records in a database, while permitting certain restricted queries. NDBs have many applications in the field of data security. Constructing an NDB corresponding to a DB is equivalent to constructing a SAT formula with a set of satisfiable truth assignments as its only satisfiable truth assignments. We present a new approach to construct such formulas, and we are able to show that the formula can only be satisfied by the truth assignments that are very close to one of the hidden assignments. We then demonstrate experimentally that the formulas generated by our scheme are much harder to solve than the formulas generated by the previously suggested prefix algorithm and RNDB algorithm[27, 25], and they are indeed hard enough to ensure security. We hope that this new approach would create interest in constructing hard SAT formulas as a way to protect the confidentiality of databases.

Finally, to understand how badly the solvers work, we analyze the behavior of two natural variants of DPLL algorithm for Graph 3-Coloring on sparse random graphs G(n, p = c/n). First, we calculate analytically the probability Pc(0) that these algorithms find a 3-coloring with no backtracking at all, and show that it goes to zero faster than any analytic function as c c* = 3.847... Then we show that even in the "easy" phase 1 < c < c* where Pc(0) > 0, including just above the emergence of the giant component, the expected number of backtracks is exponentially large with positive probability. To our knowledge this is the first rigorous proof that the running time of a natural backtracking algorithm has a heavy tail for graph coloring. In addition, we give experimental evidence and heuristic arguments that this tail takes the form Pc(b) ∼ b-1 up to an exponential cutoff.

Indexing (document details)
Advisor: Moore, Cristopher
Commitee:
School: The University of New Mexico
School Location: United States -- New Mexico
Source: DAI-B 69/01, Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Statistics, Computer science
Keywords: Data security, Hard instances, Hidden solutions, Negative databases
Publication Number: 3298151
ISBN: 978-0-549-42256-3
Copyright © 2019 ProQuest LLC. All rights reserved. Terms and Conditions Privacy Policy Cookie Policy
ProQuest