Large, balanced datasets are crucial for many machine learning models, especially when the problem is complex and therefore defined in a high-dimensional space. In real-world applications, it is usually difficult or expensive to obtain adequate amounts of labeled data, even with the help of crowdsourcing. One possible approach to this problem is to create synthetic data and use it for training. This approach has been applied in many areas of computer vision, including document recognition, object retrieval, and object classification. Although synthetic data has been shown to boost performance, the gain is limited by two main factors in existing approaches. First, most existing approaches for creating and using synthetic data are application-specific, and often heuristic in nature, so they do not transfer to other application areas. Second, existing approaches do not recognize an inherent difference between synthetic data and actual data, which this proposal terms the synthetic gap. The synthetic gap arises because not all possible patterns and structures of actual data are present in the synthetic data. To address these problems and make better use of synthetic data in improving the performance of learning algorithms, this proposal considers general ways of creating and using synthetic data. The problem caused by the synthetic gap is studied, and approaches to overcome the gap are proposed. Experimental results demonstrate that the proposed approach is efficient and can boost the performance of many computer vision applications, including building roof classification, character classification, and point cloud object classification.
|Committee:||Bilgic, Mustafa; Chen, Xin; Glavic, Boris; Wang, Jia|
|School:||Illinois Institute of Technology|
|School Location:||United States -- Illinois|
|Source:||DAI-B 79/08(E), Dissertation Abstracts International|
|Keywords:||Computer vision, Data mining, Deep learning, Machine learning, Object recognition, Pattern recognition|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved