Many human activities have now shifted into cyberspace. Unfortunately, cybercrimes targeting cyberspace are also growing fast in scale, sophistication, and cost, putting the interests of billions of online users at risk. To better understand and mitigate the threats from these emerging cybercrimes, this dissertation takes first steps toward applying a set of intelligent data analysis techniques to them, shedding light on research in this field.
In particular, this dissertation contributes the first technique, called Sacabuche, for understanding the cybercrime of autocomplete manipulation. Sacabuche combines intelligent data analysis technologies, such as natural language processing (NLP) and machine learning (ML), to analyze autocomplete pairs and identify manipulated suggestion terms. On a collection of 114 million autosuggestions, our measurement study reveals the prevalence of the threat (e.g., 1 in 200 Google autosuggestions was manipulated). We also report the significant security implications of this underestimated cybercrime and, for the first time, shed light on its underground ecosystem, which makes our study valuable for the mitigation and eventual elimination of this threat.
We further introduce the first chatbot, Aubrey, for active threat intelligence collection from real-world e-commerce miscreants. Based on observed conversations between small-time workers and miscreants, Aubrey exploits their question-driven conversation pattern and models the interaction as a finite state machine, which enables autonomous conversations. By chatting with 470 e-commerce miscreants, Aubrey collected a large number of fraud-related artifacts, such as previously unknown SIM gateways, fake-account storefronts, and e-commerce fraud attack toolkits. We further uncover the underground e-commerce fraud supply chain and the miscreants' complicated business models. These findings have had great practical impact: they helped our partner company build a more secure e-commerce platform serving hundreds of millions of users.
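To make the finite-state-machine idea concrete, the following is a minimal sketch of a question-driven dialogue loop. It is not Aubrey's actual design, which the abstract does not specify: the state names, questions, and keyword triggers are all hypothetical, and simple keyword matching stands in for whatever reply analysis the real system uses.

```python
from typing import Dict, List, Tuple

# Hypothetical FSM: state -> (question the bot asks, {keyword in reply -> next state}).
# None of these states or keywords come from the dissertation.
FSM: Dict[str, Tuple[str, Dict[str, str]]] = {
    "ask_product": ("What do you have for sale?", {"account": "ask_price", "sim": "ask_price"}),
    "ask_price": ("How much for 100?", {"$": "ask_source", "yuan": "ask_source"}),
    "ask_source": ("Where can I see the full list?", {"http": "done"}),
    "done": ("Thanks, I will get back to you.", {}),
}


def next_state(state: str, reply: str) -> str:
    """Return the next state, staying in place when no keyword matches."""
    _, transitions = FSM[state]
    lowered = reply.lower()
    for keyword, target in transitions.items():
        if keyword in lowered:
            return target
    return state


def run_dialogue(replies: List[str], start: str = "ask_product") -> List[str]:
    """Feed scripted replies through the FSM and return the questions asked."""
    state = start
    asked: List[str] = []
    for reply in replies:
        asked.append(FSM[state][0])
        state = next_state(state, reply)
    asked.append(FSM[state][0])  # message for the state reached at the end
    return asked
```

Calling `run_dialogue` with a list of scripted replies returns the sequence of questions the bot would ask; an unmatched reply leaves the state unchanged, so the same question is asked again.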
Finally, we present the first security analysis of the local business listing (LBL) ecosystem, aimed at detecting local business search poisoning for illicit drug promotion. Our semi-supervised graph mining algorithm, IDLLSpread, identifies suspicious LBLs starting from known illicit drug local listings (IDLLs). We collect 94,856 drug-related LBLs from different parties in the ecosystem and confirm 3,571 IDLLs with IDLLSpread. We further analyze how local search results are polluted by these IDLLs. Our analysis of both the confirmed IDLLs and the polluted local business searches demonstrates the impact of such illicit promotions on various local search services (e.g., web search, map search, and voice search). Our infiltration studies also show evidence that today's LBL ecosystem is loosely regulated and vulnerable, which motivates our discussion of strategies to mitigate such IDLL threats.
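As an illustration of the general idea of spreading suspicion from known seeds over a graph, here is a minimal sketch of generic damped label propagation. It is not IDLLSpread, whose details the abstract does not give: the graph construction, the `damping` and `rounds` parameters, and the node names are all assumptions made for the example.

```python
from typing import Dict, List, Set


def propagate(
    graph: Dict[str, List[str]],
    seeds: Set[str],
    rounds: int = 10,
    damping: float = 0.5,
) -> Dict[str, float]:
    """Spread suspicion scores from seed nodes to their neighbours.

    `graph` maps every node to its neighbour list (each neighbour must
    also be a key). Seeds are clamped to 1.0; every other node takes the
    damped mean of its neighbours' scores from the previous round.
    """
    scores = {node: (1.0 if node in seeds else 0.0) for node in graph}
    for _ in range(rounds):
        updated: Dict[str, float] = {}
        for node, neighbours in graph.items():
            if node in seeds:
                updated[node] = 1.0
            elif neighbours:
                mean = sum(scores[m] for m in neighbours) / len(neighbours)
                updated[node] = damping * mean
            else:
                updated[node] = 0.0  # isolated node: nothing to inherit
        scores = updated
    return scores
```

In a setting like the one described, nodes could be listings linked by shared attributes such as a phone number or address (an assumption), and listings whose score exceeds a chosen threshold would be passed on for manual confirmation.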
|Committee:||Liao, Xiaojing; Liu, Xiaozhong; Huang, Yan|
|School Location:||United States -- Indiana|
|Source:||DAI-A 82/5(E), Dissertation Abstracts International|
|Subjects:||Computer science, Information science|
|Keywords:||Emerging cybercrimes, Intelligent data analysis|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved