Human disease is complex. However, the explosion of biomedical data is providing new opportunities to improve our understanding. My dissertation focused on how to harness the biodata revolution. Broadly, I addressed three questions: how to integrate data, how to extract insights from data, and how to make science more open.
To integrate data, we pioneered the hetnet—a network with multiple node and relationship types. After several preludes, we released Hetionet v1.0, which contains 2,250,197 relationships of 24 types. Hetionet encodes the collective knowledge produced by millions of studies over the last half century.
To extract insights from data, we developed a machine learning approach for hetnets. In order to predict the probability that an unknown relationship exists, our algorithm identifies influential network patterns. We used the approach to prioritize disease—gene associations and drug repurposing opportunities. By evaluating our predictions on withheld knowledge, we demonstrated the systematic success of our method.
After encountering friction that interfered with data integration and rapid communication, I began looking at how to make science more open. The quest led me to explore realtime open notebook science and expose publishing delays at journals as well as the problematic licensing of publicly-funded research data.
|Advisor:||Baranzini, Sergio E.|
|Commitee:||Sali, Andrej, Witte, John S.|
|School:||University of California, San Francisco|
|Department:||Biological and Medical Informatics|
|School Location:||United States -- California|
|Source:||DAI-B 77/11(E), Dissertation Abstracts International|
|Subjects:||Biostatistics, Bioinformatics, Computer science|
|Keywords:||Disease, GWAS, Hetnet, Machine learning, Open science, Pharmacology|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be