There has been a vast and growing amount of healthcare data, especially with the rapid adoption of electronic health records (EHRs) as a result of the HITECH Act of 2009. It is estimated that around 80% of clinical information resides in the unstructured narrative of an EHR. Recently, natural language processing (NLP) techniques have offered opportunities to extract the information needed for various clinical applications from unstructured clinical texts. A popular method for enabling secondary uses of EHRs is information or concept extraction, a subtask of NLP that seeks to locate and classify elements within text based on the context. Extracting clinical concepts without considering their context causes many complications, including inaccurate diagnosis of patients and contamination of study cohorts. Two of the challenges in context detection are identifying the negation status of a clinical concept and determining whether the concept belongs to the patient or to a family member. In this research study, a negation algorithm called Dependency Parser Negation (DEEPEN) was developed; it takes into account the dependency relationship between negation words and concepts within a sentence, using the Stanford Dependency Parser. The study results demonstrate that DEEPEN can reduce the number of incorrect negation assignments for patients with positive findings, and therefore improve the identification of patients with the target clinical findings in EHRs. Additionally, an NLP system consisting of section segmentation and relation discovery was developed to identify patients’ family history. To assess the generalizability of the negation and family history algorithms, data from a different clinical institution was used in both algorithm evaluations.
The temporal dimension of the information extracted from clinical records, which represents the trajectory of disease progression in patients, was also studied in this project. Clinical data of patients who lived in Olmsted County (Rochester, MN) between 1966 and 2010 was analyzed. The patient records were modeled as diagnosis matrices, with clinical events as rows and their temporal information as columns. A deep learning algorithm was used to find common temporal patterns within these diagnosis matrices.
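The diagnosis-matrix representation described above can be illustrated with a minimal sketch. The abstract does not state the time unit, the binning scheme, or whether cells hold flags or counts, so the fixed-width bins, the binary cells, the function name `build_diagnosis_matrix`, and the event labels below are all assumptions made for illustration and are not taken from the dissertation.

```python
from datetime import date


def build_diagnosis_matrix(events, event_codes, start, end, bin_days=365):
    """Build a binary matrix: one row per clinical event type, one column per time bin.

    events      -- list of (code, date) pairs for a single patient (hypothetical format)
    event_codes -- ordered list of event types that define the rows
    start, end  -- observation window; events outside it are ignored
    bin_days    -- width of each time bin in days (an assumption, not from the source)
    """
    n_bins = (end - start).days // bin_days + 1
    row_of = {code: i for i, code in enumerate(event_codes)}
    matrix = [[0] * n_bins for _ in event_codes]

    for code, when in events:
        # Skip event types we are not tracking and dates outside the window.
        if code not in row_of or not (start <= when <= end):
            continue
        col = (when - start).days // bin_days
        matrix[row_of[code]][col] = 1  # mark that the event occurred in this bin
    return matrix


# Hypothetical patient with two recorded events over a three-year window.
codes = ["diabetes", "pancreatitis", "pancreatic_cancer"]
patient_events = [
    ("diabetes", date(2001, 3, 1)),
    ("pancreatitis", date(2003, 6, 15)),
]
example = build_diagnosis_matrix(
    patient_events, codes, start=date(2001, 1, 1), end=date(2003, 12, 31)
)
```

A matrix in this form, with one row per event type and one column per time bin, is the kind of fixed-shape input a pattern-discovery model could consume. The model used in the study is not specified in the abstract and is not sketched here.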
|Advisor:||Jones, Josette F., Palakal, Mathew J.|
|Committee:||Chien, Stanley Y-P, Liu, Xiaowen, Schmidt, Christian Max|
|School:||Indiana University - Purdue University Indianapolis|
|School Location:||United States -- Indiana|
|Source:||DAI-B 77/07(E), Dissertation Abstracts International|
|Subjects:||Health sciences, Information science|
|Keywords:||Deep learning, Family history, Natural language processing, Negation, Pancreatic cancer, Temporal pattern discovery|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved