During the last decade, environmental scientists and managers have found in the Internet a new communication venue that has improved their productivity by allowing them to share data and knowledge more fluently. Since its invention, the eXtensible Markup Language (XML) has drawn the attention of many researchers seeking to improve communication among people who work with data-centric documents, because XML was expected to be the correct approach to standardizing data. During this decade, many papers have been published on the premise that XML is the new, paradigmatic standard for sharing data, even though no study has shown that XML languages have actually been used by researchers or managers working with data-centric documents during this period. By researching all possible spaces, this thesis proves that, on the contrary, more than a decade after its invention XML is still not used by the vast scientific community that works with data-centric documents, which still relies on data archives in legacy formats. Therefore, if data standardization is difficult to attain and easier data sharing remains the goal, the other clear avenue is to use metadata. However, metadata languages such as the Ecological Metadata Language (EML) and others have no intrinsic features to completely and directly describe the information conveyed in many important types of data-centric documents used by environmentalists. By carefully studying the nature of data-centric archives and the process of metadata creation, this thesis shows that any data archive can be easily described using an "a posteriori" approach, in which the lexical descriptors of the physical data in a data-centric file are developed by inspecting the file rather than by following the specification of the file's format.
In addition, following the principles of the Linked Open Data project, the lexical tree is mapped into a simple logic model with semantic annotations from controlled vocabularies, which can be easily serialized for data exchange or data syndication. With the metadata extensions researched in this thesis, metadata languages such as EML can be improved by increasing their expressive power. Environmental scientists and researchers can use these extensions to exchange data-centric documents, and multidisciplinary projects can easily syndicate data from different authors in different formats in a data-centric cloud.
|Committee:||Baldocchi, Dennis; Larson, Ray|
|School:||University of California, Berkeley|
|Department:||Environmental Science, Policy, & Management|
|School Location:||United States -- California|
|Source:||DAI-B 73/07(E), Dissertation Abstracts International|
|Subjects:||Information Technology, Environmental science|
|Keywords:||Environmental primary data, Extensions to metadata languages, Forest cloud, Metadata, Primary data, XML|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved