In the past decade, the role of data has increased exponentially from being the output of a process, to becoming a true corporate asset. As the business landscape becomes increasingly complex and the pace of change increasingly faster, companies need a clear awareness of their data assets, their movement, and how they relate to the organization in order to make informed decisions, reduce cost, and identify opportunity. The increased complexity of corporate technology has also created a high level of risk, as the data moving across a multitude of systems lends itself to a higher likelihood of impacting dependent processes and systems, should something go wrong or be changed. The result of this increased difficulty in managing corporate data assets is poor enterprise data quality, the impacts of which, range in the billions of dollars of waste and lost opportunity to businesses.
Tools and processes exist to help companies manage this phenomena, however often times, data projects are subject to high amounts of scrutiny as senior leadership struggles to identify return on investment. While there are many tools and methods to increase a companies’ ability to govern data, this research stands by the fact that you can’t govern that which you don’t know. This lack of awareness of the corporate data landscape impacts the ability to govern data, which in turn impacts overall data quality within organizations.
This research seeks to propose a means for companies to better model the landscape of their data, processes, and organizational attributes through the use of linked data, via the Resource Description Framework (RDF) and ontology. The outcome of adopting such techniques is an increased level of data awareness within the organization, resulting in improved ability to govern corporate data assets. It does this by primarily addressing corporate leadership’s low tolerance for taking on large scale data centric projects. The nature of linked data, with it’s incremental and de-centralized approach to storing information, combined with a rich ecosystem of open source or low cost tools reduces the financial barriers to entry regarding these initiatives. Additionally, linked data’s distributed nature and flexible structure help foster maximum participation throughout the enterprise to assist in capturing information regarding data assets. This increased participation aids in increasing the quality of the information captured by empowering more of the individuals who handle the data to contribute.
Ontology, in conjunction with linked data, provides an incredibly powerful means to model the complex relationships between an organization, its people, processes, and technology assets. When combined with the graph based nature of RDF the model lends itself to presenting concepts such as data lineage to allow an organization to see the true reach of it’s data. This research further proposes an ontology that is based on data governance standards, visualization examples and queries against data to simulate common data governance situations, as well as guidelines to assist in its implementation in a enterprise setting.
The result of adopting such techniques will allow for an enterprise to accurately reflect the data assets, stewardship information and integration points that are so necessary to institute effective data governance.
|Commitee:||Qiu, Meikang, Tao, Lixin, Tappert, Charles|
|Department:||Computer Science and Information Technology|
|School Location:||United States -- New York|
|Source:||DAI-B 77/08(E), Dissertation Abstracts International|
|Subjects:||Information Technology, Information science|
|Keywords:||Data governance, Data quality, Information management, Linked data, Ontology|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be