Workshop on Big Data Analytics for Enhancing Public Transport (BigTransport17)

In conjunction with the ACM International Conference on Information and Knowledge Management (CIKM17), November 6-10, 2017

The BigTransport workshop aims to bring together researchers and practitioners from the Big Data and Public Transport research communities in a unique forum to share state-of-the-art technologies.

It welcomes researchers and practitioners to share the latest breakthroughs in analysing transport-related data to improve the commuting experience in public transport systems. These could include data science studies on commuter behaviour, public transport data analytics applications and systems, large-scale behavioural experiments on public transport users, and simulation and visualization using massive public transport data. BigTransport will focus on application-inspired novel findings, methods, systems and solutions that demonstrate the impact of big data analytics on the public transport experience.

Time | Programme | Venue
08.00am - 08.30am | Registration | Pacific Ballroom, Level 1, Pan Pacific Singapore, 7 Raffles Boulevard, Marina Square
08.30am - 09.00am | CIKM 2017 Conference Opening Speech |
09.00am - 10.00am | CIKM 2017 Keynote Speaker Talk |
10.00am - 10.45am | Morning Refreshments | Room 309-311, Level 3, Suntec Singapore Convention & Exhibition Centre, 1 Raffles Boulevard, Suntec City
10.45am - 12.00pm | Keynote Speaker: Dr Xing Xie, Microsoft Research Asia. Title: Understanding Users Using Large Scale Heterogenous Mobility Data |
12.00pm - 01.30pm | Lunch |
01.30pm - 02.00pm | Invited Speaker: Prof. Baihua Zheng. Title: Data Analytics for Public Transport Services |
02.00pm - 02.20pm | Invited Speaker: Dr Jagannadan Varadarajan, Data Science Lead (Machine Learning) of Grab. Title: On the use of Bigdata and Machine Learning for a better ride-hailing experience in Southeast Asia |
02.20pm - 02.40pm | Invited Speaker: Dr Meng-Fen Chiang. Title: Identifying Congestion Cascades Using Bus Trajectory Data |
02.50pm - 04.00pm | Afternoon Refreshments, Poster Session & Demo |
 
Keynote Speaker
Understanding Users Using Large Scale Heterogenous Mobility Data
Nowadays, human mobility data including public transit data, location check-ins, and GPS trajectories are widely available. They reflect various aspects of human activities in the physical world. However, it’s still challenging to leverage all these data to gain an in-depth understanding of users, since they are sparse, noisy and not connected to each other. In this talk, I will introduce our recent research efforts in this direction, including a space alignment approach for reconstructing individual mobility from public transit transactions, a joint geographical modeling and matrix factorization for Point-of-Interest (POI) recommendation, and a hybrid predictive model integrating both the regularity and conformity of human mobility as well as their mutual reinforcement.
Dr. Xing Xie is currently a senior research manager at Microsoft Research Asia, and a guest Ph.D. advisor for the University of Science and Technology of China. He received his B.S. and Ph.D. degrees in Computer Science from the University of Science and Technology of China in 1996 and 2001, respectively. He joined Microsoft Research Asia in July 2001, working on data mining, social computing and ubiquitous computing. During the past years, he has published over 200 refereed journal and conference papers, in venues such as ACM Transactions on Intelligent Systems and Technology, ACM Transactions on the Web, ACM/Springer Multimedia Systems Journal, IEEE Transactions on Knowledge and Data Engineering, IEEE Transactions on Mobile Computing, IEEE Transactions on Multimedia, etc. He has more than 50 patents filed or granted. He has been invited to give keynote speeches at ASONAM 2017, MobiQuitous 2016, SocInfo 2015, Socialinformatics 2015, GbR 2015, W2GIS 2011, HotDB 2012, SRSM 2012, etc.

He currently serves on the editorial boards of ACM Transactions on Intelligent Systems and Technology (TIST), Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT), Springer GeoInformatica, and Elsevier Pervasive and Mobile Computing. In recent years, he has been involved in the program or organizing committees of over 70 conferences and workshops. In particular, he served as program co-chair of ACM Ubicomp 2011, the 8th Chinese Pervasive Computing Conference (PCC 2012), the 12th IEEE International Conference on Ubiquitous Intelligence and Computing (UIC 2015), and the 6th National Conference on Social Media Processing (SMP 2017). In Oct. 2009, he founded the SIGSPATIAL China chapter, which was the first regional chapter of ACM SIGSPATIAL. He is a senior member of ACM and the IEEE, and a distinguished member of the China Computer Federation (CCF).

Dr Xing Xie

Senior Research Manager
 
Invited Speakers

Prof Baihua ZHENG

Baihua ZHENG received her Bachelor of Engineering in Computer Science & Engineering from Zhejiang University, and her PhD in Computer Science from Hong Kong University of Science and Technology. Currently, she is an Associate Professor in the School of Information Systems at Singapore Management University (SMU). She is also the Associate Dean of Postgraduate Research Programmes at the School. Her research interests include mobile and pervasive computing, and spatial databases.

Dr Jagannadan Varadarajan

Jagan received his PhD from Ecole Polytechnique Federale de Lausanne (EPFL), Switzerland in 2012. His PhD research focused on Bayesian graphical models, inference techniques and their applications to large-scale video analysis and visual surveillance. The prototypes he developed for video surveillance were successfully tested in several busy metro stations in Paris, Rome and Torino, and later commercialized. After his PhD, Jagan joined the Advanced Digital Sciences Center (ADSC, a research unit of the University of Illinois at Urbana-Champaign in Singapore), working first as a post-doctoral researcher and then as a Research Scientist in the areas of "semantic video analysis" and "multi-modal analysis". He has co-authored more than 30 articles published in top-tier computer vision and pattern recognition conferences and journals. Since May 2017, he has been with Grab (the leading ride-hailing platform in Southeast Asia) as the Data Science Lead for the Machine Learning team.

Dr Meng-Fen Chiang

Meng-Fen Chiang received her BS and MS in Computer Science from National Chengchi University in Taiwan, and her PhD in Computer Science from National Chiao Tung University in Taiwan. Currently, she is a Research Scientist at the Living Analytics Research Centre (LARC) at Singapore Management University (SMU). Her research interests include urban computing and machine learning.
 
Demos
Bus Sense
In an urban transportation system, efficiency and customer satisfaction are of utmost concern to the transportation provider. With the rise of social media, commuters turn to social platforms such as Twitter to comment on their transportation experience. These comments often contain real-time transportation information that can greatly benefit commuters. We propose Bus Sense, a crowdsensing and analysis framework to collect and analyze real-time bus commuter feedback from Twitter. The framework consists of a series of text mining tasks that include bus-related entity extraction, sentiment analysis, and micro-event detection. We further build a front-end web dashboard to visualize the results.
Authors: (Thong Hoang, Pei Hua Cher, Philips Kokoh Prasetyo, Ee-Peng Lim) LARC
MRT Sense
Public transportation systems, especially Mass Rapid Transit (MRT), are an important part of cities. Given that more and more public MRT announcements, comments, and feedback are shared through social media platforms such as Twitter, analyzing social media data for MRT-related content sheds light on MRT events and the commuting experience. We build a web dashboard, MRT Sense, to visualize analysis of nearly 100K labelled MRT-related tweets from January 2015 to July 2017. The analysis includes time series analysis, event correlation and content analysis.
Authors: (Philips Kokoh Prasetyo, Fuxiang Chen, Ee-Peng Lim) LARC
Understanding the Benefit of Taxi Ride-Sharing: A Case Study of Singapore
This work studies the potential benefit of ride-sharing for serving more taxi requests and reducing city traffic flow. It proposes a simple yet practical framework for taxi ride-sharing and scheduling. The proposed framework recommends ride-sharing plans with limited waiting time and limited extra travel time to minimize the discomfort of travelers; it also helps travelers who share a ride with other travelers achieve the desired taxi fare savings, and helps taxi drivers who serve multiple taxi requests in a single trip to earn more. Therefore, both travelers and drivers are economically motivated to participate in ride-sharing. A comprehensive simulation study is conducted to evaluate the outcome of taxi ride-sharing with real taxi booking data. Simulation results indicate a noticeable increase in the taxi booking success rate and a reduction in waiting time during peak hours.
Authors: (Wang Yazhe, Zheng Baihua, Lim Ee-Peng) LARC
Singapore MRT 2012 vs 2016
With the opening of the first two stages of the Downtown Line in 2015, the Singapore MRT system has become more comprehensive and provides more travel options to passengers. This work conducts a comparative study between the 2012 MRT system and the 2016 MRT system based on real EZ-link transaction data to understand the impact of new MRT lines on passengers’ MRT travelling experience. The study reveals the passenger flow changes at each MRT station, the travel time/volume changes between MRT stations, as well as the improvement in travel conditions in some regions of Singapore.
Authors: (Wang Yazhe, Zheng Baihua, Agus Trisnajaya KWEE) LARC
Real-time Singapore Traffic Watch
Real-time Singapore Traffic Watch, a traffic data visualization system, demonstrates how data from different sources can be correlated to explain congestion and other traffic events. Using data from LTA DataMall and NEA weather services, one can monitor and study Singapore's traffic conditions and trends.
Authors: (Agus Trisnajaya Kwee, Philips Kokoh Prasetyo, Lim Ee Peng, Baihua Zheng) LARC
Identifying Congestion Cascades Using Bus Trajectory Data
The knowledge of traffic health status is essential to the general public and urban traffic management. To identify congestion cascades, an important phenomenon of traffic health, we propose a Bus Trajectory based Congestion Identification (BTCI) framework that explores the anomalous traffic health status and structure properties of congestion cascades using bus trajectory data. First, BTCI models path speed from historical vehicle transitions for the purpose of quantifying a measure of congestion score. Afterwards, congested segments with high congestion scores are aggregated into traffic congestion cascades by unifying both attribute coherence and spatio-temporal closeness of congested segments within a cascade. Extensive evaluations on 11.8 million bus trajectory data show that BTCI is effective in highlighting congested segments and identifying congestion cascades.
Authors: (Agus Trisnajaya Kwee, Meng-Fen Chiang, Lim Ee Peng) LARC, Wang-Chien Lee (The Pennsylvania State University)
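As a rough illustration of the two steps described in the BTCI abstract above (scoring segments, then grouping congested segments into cascades), here is a minimal Python sketch. It is not the authors' method: the slowdown-based score, the 0.5 threshold, the adjacency map and all function names are assumptions made for illustration, and the sketch considers only spatial adjacency, ignoring temporal closeness and attribute coherence.

```python
from collections import deque


def congestion_score(current_speed, historical_speeds):
    """Hypothetical score: fractional slowdown relative to the historical mean speed."""
    baseline = sum(historical_speeds) / len(historical_speeds)
    if baseline <= 0:
        return 0.0
    return max(0.0, 1.0 - current_speed / baseline)


def find_cascades(scores, adjacency, threshold=0.5):
    """Group spatially adjacent congested segments into cascades.

    A cascade here is simply a connected component of segments whose
    score is at or above the (assumed) threshold.
    """
    congested = {seg for seg, score in scores.items() if score >= threshold}
    seen, cascades = set(), []
    for start in sorted(congested):
        if start in seen:
            continue
        seen.add(start)
        queue, cascade = deque([start]), []
        while queue:
            seg = queue.popleft()
            cascade.append(seg)
            for neighbour in adjacency.get(seg, ()):
                if neighbour in congested and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        cascades.append(sorted(cascade))
    return cascades


# Toy data (made up): four road segments in a line, A - B - C - D.
history = {"A": [40, 42, 38], "B": [30, 30, 30], "C": [50, 50], "D": [20, 20]}
current = {"A": 10, "B": 12, "C": 48, "D": 5}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}

scores = {seg: congestion_score(current[seg], history[seg]) for seg in current}
cascades = find_cascades(scores, adjacency)
```

With this toy data, segments A and B form one cascade and D forms another, because the uncongested segment C separates them.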
A Real-time Big Data Framework for Taxi Drivers
Traditional taxi fleets need to embrace technology in order to maintain their market share against ride-hailing services. However, very few past studies focus on addressing the practical difficulties faced by traditional taxi fleet operators. By designing and implementing a big-data platform that is capable of processing real-time location updates from tens of thousands of taxis, we demonstrate how we can use such processing power to infer demand accurately and to generate driving recommendations at different levels for individual drivers. Using a real-world dataset fed into our platform, we demonstrate that we can significantly increase average daily trips per driver, contributing to an increase in taxi drivers’ revenues.
Authors: (Shashi Shekhar Jha, Shih-Fen Cheng, Meghna Lowalekar, Nicholas Wong Wai Hin, Rishikeshan Rajendram, Tran Trong Khiem, Pradeep Varakantham, Troung Troung Nghia, and Firmansyah Bin Abd Rahman) Fujitsu-SMU Urban Computing and Engineering (UNiCEN) Corp. Lab, Singapore Management University
Posters
Inferring Trip Occupancies in the Rise of Sharing Economy
The knowledge of occupied and unoccupied trips made by self-employed drivers is critical to reveal vehicle supplies and passenger demands. Nevertheless, many ride-hailing apps (e.g., Didi Dache, Uber, Grab, etc.), alternatives to conventional taxi services, can only observe a subset of all occupied trips made by self-employed drivers, due to drivers’ partial use of booking apps or use of multiple booking apps to arrange bookings from passengers. To tackle this problem, we propose a framework, Learning to INfer Trips (LINT), to infer the occupancy of unknown trips. LINT includes two main research steps, stop point classification and structural segmentation. First, LINT learns an effective stop point classifier to assign stop points pick-up, drop-off, and intermediate labels. We further propose segmentation algorithms to infer occupied trip segments from stop point label sequences. Our comprehensive experiments on real vehicle trajectories showcase (1) an application of land use diagnosis empowered by uncovered stop point analysis, and (2) the inference accuracy of the proposed segmentation algorithms.
Authors: (Meng-Fen Chiang, Lim Ee Peng) LARC, Wang-Chien Lee (The Pennsylvania State University), Tuan-Anh Hoang (L3S Research Center)
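To make the segmentation step in the LINT abstract above concrete, the following Python sketch turns a sequence of stop point labels into occupied trip segments. It uses a naive pairing rule (each pick-up is matched with the next drop-off), which is an assumption for illustration and not the segmentation algorithms proposed by the authors. The label strings and the function name are likewise hypothetical.

```python
def occupied_segments(labels):
    """Pair each pick-up with the next drop-off.

    Returns a list of (start_index, end_index) tuples into `labels`.
    Intermediate stops between a pick-up and its drop-off stay inside
    the segment; a pick-up with no later drop-off is ignored.
    """
    segments = []
    start = None
    for i, label in enumerate(labels):
        if label == "pickup" and start is None:
            start = i  # a new occupied segment begins here
        elif label == "dropoff" and start is not None:
            segments.append((start, i))  # close the open segment
            start = None
    return segments


# Toy label sequence (made up), as a stop point classifier might emit it.
labels = ["intermediate", "pickup", "intermediate", "dropoff",
          "pickup", "dropoff", "intermediate"]
segments = occupied_segments(labels)
```

Here the toy sequence yields two occupied segments, covering stops 1 to 3 and stops 4 to 5.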
Topic Propagation Analysis of Geo-tagged Tweets for Delay Detection based on Railway Network Topology
Recently, event detection on social media has become a very popular research area with many applications, such as detecting traffic congestion and delays on public transport networks. In this paper, we propose a novel topic propagation analysis of geo-tagged tweets between railway stations for detecting train delays and periods of delays. In particular, we aim to predict train delays due to traffic accidents based on the railway topology of real space and cyberspace. To this end, we use the railway network of stations as the topology of real space, and extract the topology of the social network that is mapped onto the railway network, based on topic propagation analysis of accident delays between stations using geo-tagged tweets of each station with neural networks. This permits observing the influence on railway stations with few tweets when a topic such as a delay occurs at a station, and predicting related tweets at stations affected by delays even if the tweets contain only indirect topics about delays. The proposed method thus analyzes geo-tagged tweets to predict accident delays by considering the railway topology of both real space and cyberspace. We evaluate the proposed method on datasets derived from Twitter together with the actual delay information.
Authors: Yuanyuan Wang (Yamaguchi University), Muhammad Syafiq Mohd Pozi, Yukiko Kawai, Toyokazu Akiyama (Kyoto Sangyo University)
Spatio-Temporal Clustering based Graph Signal Processing for Road Network Traffic Prediction
Accurate traffic prediction is the foundation of improved traffic management and intelligent transportation systems. It has recently become possible because massive spatio-temporal data is now generated by sensors in public transport. However, mining those large-scale datasets is challenging because (1) the spatio-temporal dependencies of different roads are complex, and (2) the graph constructed from road networks is large, so running a prediction algorithm on the whole graph is computationally expensive. To address these challenges, we propose to use the Graph Signal Processing (GSP) technique to decouple dependencies in the time domain and obtain independent graph data in the frequency domain. In that way, we can make predictions at independent graph frequencies and later recover the traffic in the time domain. In addition, we use spatio-temporal clustering to capture the spatio-temporal pattern, split the large graph into multiple connected disjoint subgraphs, and then make predictions within the subgraphs. In that way, we greatly reduce computational complexity and improve prediction accuracy, since the prediction is based on more "informative" and "relevant" nodes. We conduct experiments with our model on real datasets collected from Dallas highways, on which our model shows substantial improvement with respect to temporal prediction.
Authors: (Arman Hasanzadeh Moghimi, Xi Liu, Nick Duffield and Krishna Narayanan) Texas A&M University
Bus Trip Detection from Cellular Data for Urban Planning
An efficient public transport system is crucial in the smart city, in which the bus plays a fundamental role because of its flexibility, low cost and extensive service range. The traditional way to understand the crowd flow and density of public transport is by using smart card data. However, the smart card system only records the origin station of users, which makes it difficult to detect bus trips. Nowadays, each user has her own mobile phone, so it is possible to infer a user's transportation mode from her cellular data. In this paper, we focus on detecting bus trips from users' cellular data. To achieve this goal, we first propose a sequential cluster method and an oscillation detection method for data pre-processing. Then a tower-route matching method, which considers both spatial and temporal factors, is proposed to detect bus passengers and bus trips. Extensive experiments conducted on a real dataset provided by Chunghwa Telecom show that our proposed methods are efficient and scalable.
Authors: (Chun-Jie Chen, Guanyao Li, Ai-Jor Chou, Xiaochuan Gou, Wen-Chih Peng, Chih-Wei Yi) National Chiao Tung University, Taiwan
Understanding the Effects of Traffic and Weather Conditions on Public Transport Use in Jakarta
This paper investigates how traffic and weather dynamics affect the behavior of regular commuters in Jakarta. For this, we analyze more than one million traffic reports from Waze and the weather data from the Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), as well as 72 million transaction records from approximately 3.12 million smart transport cards, which have been little analyzed since Transjakarta, the Jakarta Bus Rapid Transit (BRT), deployed automatic fare collection (AFC) systems and AFC card infrastructures on all main transport corridors in Jakarta. Our preliminary results show that (a) weekdays have a higher proportion of regular trips, mostly during AM and PM peak times, with the highest proportion of regular trips occurring during AM peak time, (b) bad traffic and heavy rainfall have a significant impact on the behavior of regular passengers in Jakarta, with bad traffic leading to inconsistent behavior of regular passengers across a larger area, and (c) in the most crowded sub-corridor on a selected high-volume-traffic day used as a case study, regular passengers are likely to postpone their trips by 105 minutes on average. These results can be used as valuable inputs for the optimization of public transport services. To the best of our knowledge, this work is the largest quantitative study in Jakarta that fuses multiple sources of transportation-related data, both passively collected (machine-generated) and actively contributed, i.e., AFC transaction records and Waze citizen reports.
Authors: Imaduddin Amin (United Nations Global Pulse, Indonesia), Zakiya Aryana Pramestri (United Nations Global Pulse, Indonesia), Muhammad Rizal Khaefi (Jakarta Smart City, Government of Jakarta), Jong Gun Lee (United Nations Global Pulse, Indonesia)