Initial soil surveys are incomplete for large tracts of public land in the western USA. Digital soil mapping offers a quantitative alternative to traditional soil mapping. I sought to predict soil classes across an arid to semiarid watershed of western Utah by applying random forests (RF) and using environmental covariates derived from Landsat 7 Enhanced Thematic Mapper Plus (ETM+) and digital elevation models (DEM). Random forests are similar to classification and regression trees (CART). However, RF is doubly random. Many (e.g., 500) weak trees are grown (trained) independently because each tree is trained with a new randomly selected bootstrap sample, and a random subset of variables is used to split each node. To train and validate the RF trees, 561 soil descriptions were made in the field. An additional 111 points were added by case-based reasoning using aerial photo interpretation. As RF makes classification decisions from the mode of many independently grown trees, model uncertainty can be derived. The overall out-of-bag (OOB) error was lower without weighting of classes; weighting increased the overall OOB error, and the resulting output did not reflect soil-landscape relationships observed in the field. The final RF model had an OOB error of 55.2% and predicted soils on landforms consistent with soil-landscape relationships. The OOB error for individual classes typically decreased with increasing class size. In addition to the final classification, I determined the second and third most likely classifications, model confidence, and the hypothetical extent of individual classes. Pixels that had a high possibility of belonging to multiple soil classes were aggregated using a minimum confidence value based on limiting soil features, which is an effective and objective method of determining membership in soil map unit associations and complexes mapped at the 1:24,000 scale.
Variables derived from both DEM and Landsat 7 ETM+ sources were important for predicting soil classes based on Gini and standard measures of variable importance and OOB errors from groves grown with exclusively DEM- or Landsat-derived data. Random forests was a powerful predictor of soil classes and produced outputs that facilitated further understanding of soil-landscape relationships.
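The abstract notes that RF classifies by the mode of many tree votes and that the second and third most likely classes and a model confidence can be derived from those votes. A minimal sketch of that idea for a single pixel follows. It assumes confidence is simply the share of trees voting for a class, which may differ from the definition used in the thesis; the function name `summarize_votes` and the class labels are hypothetical.

```python
from collections import Counter


def summarize_votes(tree_votes, top_n=3):
    """Rank classes for one pixel by their share of tree votes.

    Assumption: "confidence" is the fraction of trees voting for a class.
    """
    if not tree_votes:
        raise ValueError("tree_votes must not be empty")
    counts = Counter(tree_votes)
    total = len(tree_votes)
    # Sort by descending vote count, then by label so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(label, n / total) for label, n in ranked[:top_n]]


# Hypothetical votes from 10 trees for a single pixel.
votes = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
ranking = summarize_votes(votes)

# The first entry is the mode (the predicted class); the next two entries
# are the second and third most likely classes with their vote shares.
for label, share in ranking:
    print(f"class {label}: {share:.0%} of trees")
```

Applying such a ranking to every pixel would yield per-class vote-share layers, to which a minimum confidence threshold could then be applied.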
|Advisor:||Boettinger, Janis L.|
|Committee:||Douglas, R. D., White, Michael|
|School:||Utah State University|
|School Location:||United States -- Utah|
|Source:||MAI 49/01M, Masters Abstracts International|
|Subjects:||Geographic information science, Soil sciences, Remote sensing|
|Keywords:||Beaver county, Landsat 7, Model confidence, Random forests, Soil survey|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved