Large physical systems are increasingly prevalent, and designing estimation strategies for them has become both a practical necessity and a complicated problem. Their sensing infrastructure is usually ad-hoc, and the estimate of interest is often a complex function of the data. At the same time, computing power is rapidly becoming a commodity. We show with the study of two estimation tasks in urban transportation how the proper design of algorithms can lead to significant gains in scalability compared to existing solutions.
A common problem in trip planning is to make a given deadline such as arriving at the airport within an hour. Existing routing services optimize for the expected time of arrival, but do not provide the most reliable route, which accounts for the variability in travel times. Providing statistical information is even harder for trips in cities which undergo a lot of variability. This thesis aims at building scalable algorithms for inferring statistical distributions of travel time over very large road networks, using GPS points from vehicles in real-time. We consider two complementary algorithms that differ in the characteristics of the GPS data input, and in the complexity of the model: a simpler streaming Expectation-Maximization algorithm that leverages very large volumes of extremely noisy data, and a novel Markov Model-Gaussian Markov Random Field that extracts global statistical correlations from high-frequency, privacy-preserving trajectories.
These two algorithms have been implemented and deployed in a pipeline that takes streams of GPS data as input, and produces distributions of travel times accessible as output. This pipeline is shown to scale on a large cluster of machines and can process tens of millions of GPS observations from an area that comprises hundreds of thousands of road segments. This is to our knowledge the first research framework that considers in an integrated fashion the problem of statistical estimation of traffic at a very large scale from streams of GPS data.
|Advisor:||Bayen, Alexandre M.|
|Commitee:||Abbeel, Pieter, Franklin, Michael, Pozdnukhov, Alexei|
|School:||University of California, Berkeley|
|Department:||Electrical Engineering & Computer Sciences|
|School Location:||United States -- California|
|Source:||DAI-B 76/08(E), Dissertation Abstracts International|
|Subjects:||Transportation planning, Computer science|
|Keywords:||Bayesian modeling, Cyberphysical systems, Estimation, Gaussian markov random field, Large-scale systems, Machine learning|
Copyright in each Dissertation and Thesis is retained by the author. All Rights Reserved
The supplemental file or files you are about to download were provided to ProQuest by the author as part of a
dissertation or thesis. The supplemental files are provided "AS IS" without warranty. ProQuest is not responsible for the
content, format or impact on the supplemental file(s) on our system. in some cases, the file type may be unknown or
may be a .exe file. We recommend caution as you open such files.
Copyright of the original materials contained in the supplemental file is retained by the author and your access to the
supplemental files is subject to the ProQuest Terms and Conditions of use.
Depending on the size of the file(s) you are downloading, the system may take some time to download them. Please be