Dissertation/Thesis Abstract

The hetnet awakens: understanding complex diseases through data integration and open science
by Himmelstein, Daniel S., Ph.D., University of California, San Francisco, 2016, 120; 10133408
Abstract (Summary)

Human disease is complex. However, the explosion of biomedical data is providing new opportunities to improve our understanding. My dissertation focused on how to harness the biodata revolution. Broadly, I addressed three questions: how to integrate data, how to extract insights from data, and how to make science more open.

To integrate data, we pioneered the hetnet—a network with multiple node and relationship types. After several preludes, we released Hetionet v1.0, which contains 2,250,197 relationships of 24 types. Hetionet encodes the collective knowledge produced by millions of studies over the last half century.
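
As a rough illustration of the definition above, a hetnet can be stored as nodes that each carry a type and edges that each carry a relationship type. The sketch below is a minimal toy, not the Hetionet data model: every identifier, node type, and relationship name in it is invented for illustration.

```python
from collections import Counter

# Toy hetnet. All names below are hypothetical placeholders,
# not entries from Hetionet.
# Each node maps to its node type.
nodes = {
    "compound_A": "Compound",
    "gene_1": "Gene",
    "gene_2": "Gene",
    "disease_X": "Disease",
}

# Each edge is (source, relationship_type, target).
edges = [
    ("compound_A", "binds", "gene_1"),
    ("compound_A", "binds", "gene_2"),
    ("gene_1", "associates", "disease_X"),
    ("gene_2", "associates", "disease_X"),
    ("compound_A", "treats", "disease_X"),
]

# Summaries analogous to "N relationships of K types".
rel_counts = Counter(rel for _, rel, _ in edges)
node_type_counts = Counter(nodes.values())

print(f"{len(edges)} relationships of {len(rel_counts)} types")
print(dict(node_type_counts))
```

What separates this from an ordinary graph is that both the nodes and the edges are typed, so one structure can hold several kinds of biomedical knowledge at once.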

To extract insights from data, we developed a machine learning approach for hetnets. To predict the probability that an unknown relationship exists, our algorithm identifies influential network patterns. We used the approach to prioritize disease-gene associations and drug repurposing opportunities. By evaluating our predictions on withheld knowledge, we demonstrated the systematic success of our method.
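
One simple way to turn a network pattern into a number is to count the paths between two nodes that follow a fixed sequence of relationship types. The sketch below does only that. It is a stand-in under stated assumptions, not the dissertation's algorithm: the function names, the toy edges, and the use of a raw path count as a feature are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy edges: (source, relationship_type, target).
edges = [
    ("compound_A", "binds", "gene_1"),
    ("compound_A", "binds", "gene_2"),
    ("compound_B", "binds", "gene_2"),
    ("gene_1", "associates", "disease_X"),
    ("gene_2", "associates", "disease_X"),
    ("compound_A", "treats", "disease_X"),
]

def build_adjacency(edges):
    """Index edges as adj[relationship_type][source] -> set of targets."""
    adj = defaultdict(lambda: defaultdict(set))
    for source, rel, target in edges:
        adj[rel][source].add(target)
    return adj

def count_paths(adj, source, target, pattern):
    """Count directed paths from source to target whose edges
    follow the relationship types in `pattern`, in order."""
    frontier = {source: 1}  # node -> number of partial paths ending there
    for rel in pattern:
        next_frontier = defaultdict(int)
        for node, n_paths in frontier.items():
            for neighbor in adj[rel].get(node, ()):
                next_frontier[neighbor] += n_paths
        frontier = next_frontier
    return frontier.get(target, 0)

adj = build_adjacency(edges)
pattern = ["binds", "associates"]  # Compound -> Gene -> Disease
count_a = count_paths(adj, "compound_A", "disease_X", pattern)
count_b = count_paths(adj, "compound_B", "disease_X", pattern)
print(count_a, count_b)
```

Counts like these, computed for many patterns, could serve as input features for a classifier that scores candidate compound-disease pairs. Evaluating on withheld knowledge would then mean hiding some known relationships before computing the features and checking whether the hidden pairs receive high scores.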

After encountering friction that interfered with data integration and rapid communication, I began looking at how to make science more open. The quest led me to explore real-time open notebook science and to expose publishing delays at journals as well as the problematic licensing of publicly funded research data.

Indexing (document details)
Advisor: Baranzini, Sergio E.
Committee: Sali, Andrej; Witte, John S.
School: University of California, San Francisco
Department: Biological and Medical Informatics
School Location: United States -- California
Source: DAI-B 77/11(E), Dissertation Abstracts International
Source Type: DISSERTATION
Subjects: Biostatistics, Bioinformatics, Computer science
Keywords: Disease, GWAS, Hetnet, Machine learning, Open science, Pharmacology
Publication Number: 10133408
ISBN: 9781339919881