CIKM AnalytiCup

DataSpark Mobility Open-Task Challenge Dataset

This section describes the DataSpark data sources (Footfall, Origin-Destination, and Dwell Time) and lists some public data sources to explore. Your solution must use at least two data sources: one from DataSpark and one public data source.

For DataSpark data sources, you will receive a set of sample data in Phase 1 (formatted in CSV for convenience). Only shortlisted teams in Phase 2 and Finals will have access to the full APIs to build their applications. The APIs will serve the data in JSON format.

The API request and response formats described below are illustrative only, and subject to change. Documentation and access keys for the APIs will be provided to shortlisted teams.
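Because the Phase 1 samples are CSV and the APIs serve JSON, it may help to normalise both into one record shape early on. The following is a minimal sketch only: the field names `timestamp`, `roi_id` and `count`, and both sample payloads, are assumptions made for illustration and are not the published format.

```python
import csv
import io
import json


def normalise(record):
    """Coerce one raw record (CSV strings or JSON values) to a common shape.

    The keys used here are hypothetical; the real headers may differ.
    """
    return {
        "timestamp": str(record["timestamp"]),
        "roi_id": int(record["roi_id"]),
        "count": int(record["count"]),
    }


def records_from_csv(text):
    # DictReader takes the column names from the first line of the file.
    return [normalise(row) for row in csv.DictReader(io.StringIO(text))]


def records_from_json(text):
    # Assumes the response body is a JSON array of objects.
    return [normalise(item) for item in json.loads(text)]


# Made-up payloads that reuse the example values from this page.
CSV_SAMPLE = "timestamp,roi_id,count\n2017-03-01T12:00:00Z,123,12345\n"
JSON_SAMPLE = (
    '[{"timestamp": "2017-03-01T12:00:00Z", "roi_id": 123, "count": 12345}]'
)
```

With this shape, analysis code written against the Phase 1 samples should need only a change of loader if the team is shortlisted for API access.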

In the following descriptions, a Region of Interest (ROI) is an area defined by one of Singapore's national boundary layers: Planning Region, Planning Area or Sub Zone. The boundaries can be downloaded from Data.gov.sg:
Planning Regions
Planning Areas
Sub Zones

Note that these boundary data do not qualify as one of the public data sources required for this challenge.

DataSpark API: Footfall

Footfall is the unique count of people appearing at a given place and time. The API allows access to 60 days of daily or hourly footfall data for Singapore planning regions, planning areas and sub zones.

API query parameters:
● Start Time (e.g., 2017-03-01T12:00:00Z)
● End Time (e.g., 2017-03-02T12:00:00Z)
● ROI Layer (planning-region, planning-area or sub-zone)
● Interval (hour or day)

API response data schema:
● UTC Timestamp (e.g., 2017-03-01T12:00:00Z)
● ROI Layer (planning-region, planning-area or sub-zone)
● ROI ID (e.g., 123)
● Count (e.g., 12345)
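The query parameters and response fields above could be handled along the following lines. This is a hedged sketch, not the actual API: the parameter names (`start_time`, `end_time`, `roi_layer`, `interval`), the response keys, and the sample records are all assumptions. It finds the busiest interval for each ROI rather than summing hourly counts, since adding up unique counts across hours could count the same person more than once.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

ROI_LAYERS = {"planning-region", "planning-area", "sub-zone"}
INTERVALS = {"hour", "day"}


def build_footfall_query(start, end, roi_layer, interval):
    """Return a URL query string; the parameter names are hypothetical."""
    if roi_layer not in ROI_LAYERS:
        raise ValueError(f"unknown ROI layer: {roi_layer}")
    if interval not in INTERVALS:
        raise ValueError(f"unknown interval: {interval}")
    return urlencode({
        "start_time": start,
        "end_time": end,
        "roi_layer": roi_layer,
        "interval": interval,
    })


def parse_utc(text):
    # Timestamps on this page use the form 2017-03-01T12:00:00Z (UTC).
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def peak_by_roi(records):
    """Map each ROI ID to (timestamp, count) of its busiest interval."""
    peaks = {}
    for rec in records:
        best = peaks.get(rec["roi_id"])
        if best is None or rec["count"] > best[1]:
            peaks[rec["roi_id"]] = (parse_utc(rec["timestamp"]), rec["count"])
    return peaks


# Made-up records using assumed key names.
SAMPLE_FOOTFALL = [
    {"timestamp": "2017-03-01T12:00:00Z", "roi_layer": "sub-zone",
     "roi_id": 123, "count": 12345},
    {"timestamp": "2017-03-01T13:00:00Z", "roi_layer": "sub-zone",
     "roi_id": 123, "count": 15000},
    {"timestamp": "2017-03-01T12:00:00Z", "roi_layer": "sub-zone",
     "roi_id": 124, "count": 800},
]
```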

Download Sample

DataSpark API: Origin-Destination

Origin-destination is the unique count of people travelling from a given origin to a given destination area. The API allows access to 60 days of daily or hourly origin-destination data for Singapore planning regions, planning areas and sub zones.

API query parameters:
● Start Time (e.g., 2017-03-01T12:00:00Z)
● End Time (e.g., 2017-03-02T12:00:00Z)
● ROI Layer (planning-region, planning-area or sub-zone)
● ROI ID (e.g., 123)
● ROI Direction (origin or destination)
● Interval (hour or day)

API response data schema when origin ROI is provided:
Output represents a list of destinations of people departing the origin at the indicated time.
● ROI Layer (planning-region, planning-area or sub-zone)
● Origin UTC Timestamp (e.g., 2017-03-01T12:00:00Z)
● Origin ROI ID (e.g., 123)
● Destination ROI ID (e.g., 234)
● Count (e.g., 12345)

API response data schema when destination ROI is provided:
Output represents a list of origins of people arriving at the destination at the indicated time.
● ROI Layer (planning-region, planning-area or sub-zone)
● Destination UTC Timestamp (e.g., 2017-03-01T12:00:00Z)
● Destination ROI ID (e.g., 123)
● Origin ROI ID (e.g., 234)
● Count (e.g., 12345)
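As an illustration of working with the origin-side response, the sketch below ranks the destinations of people leaving one origin at one timestamp. The key names (`origin_timestamp`, `origin_roi_id`, `destination_roi_id`, `count`) and the sample records are assumptions for illustration only; the real schema may differ.

```python
def top_destinations(records, origin_id, timestamp, limit=3):
    """Return up to `limit` (destination ROI ID, count) pairs, largest first.

    Only records for the given origin and origin timestamp are considered,
    so counts from different intervals are never added together.
    """
    flows = [
        (rec["destination_roi_id"], rec["count"])
        for rec in records
        if rec["origin_roi_id"] == origin_id
        and rec["origin_timestamp"] == timestamp
    ]
    return sorted(flows, key=lambda pair: pair[1], reverse=True)[:limit]


# Made-up records using assumed key names.
SAMPLE_OD = [
    {"roi_layer": "planning-area", "origin_timestamp": "2017-03-01T12:00:00Z",
     "origin_roi_id": 123, "destination_roi_id": 234, "count": 500},
    {"roi_layer": "planning-area", "origin_timestamp": "2017-03-01T12:00:00Z",
     "origin_roi_id": 123, "destination_roi_id": 235, "count": 900},
    {"roi_layer": "planning-area", "origin_timestamp": "2017-03-01T13:00:00Z",
     "origin_roi_id": 123, "destination_roi_id": 236, "count": 700},
]
```

The destination-side response could be ranked the same way by swapping the roles of the origin and destination keys.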

Download Sample

DataSpark API: Dwell Time

Dwell time is the average time spent by visitors who entered a given location at a given hour or day. The API allows access to 60 days of daily or hourly dwell time data for Singapore planning regions, planning areas and sub zones.

API query parameters:
● Start Time (e.g., 2017-03-01T12:00:00Z)
● ROI Layer (planning-region, planning-area or sub-zone)
● ROI ID (e.g., 123)
● Interval (hour or day)

API response data schema:
● UTC Timestamp (e.g., 2017-03-01T12:00:00Z)
● ROI Layer (planning-region, planning-area or sub-zone)
● ROI ID (e.g., 123)
● Dwell Time in Minutes (e.g., 23)
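Since the timestamps are in UTC, hourly dwell times usually need converting to Singapore local time (UTC+8) before they are read as "time of day". The sketch below groups dwell times by local hour. The key names (`timestamp`, `dwell_minutes`) and the sample records are assumptions, and the result is an unweighted mean of per-interval averages, not a visitor-weighted figure, because the schema above carries no visitor count.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import mean

# Singapore observes UTC+8 all year, so a fixed offset is sufficient here.
SGT = timezone(timedelta(hours=8))


def to_local(text):
    """Parse a UTC timestamp such as 2017-03-01T12:00:00Z into SGT."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc).astimezone(SGT)


def mean_dwell_by_local_hour(records):
    """Map local hour of day (0-23) to the mean of the reported dwell times."""
    by_hour = defaultdict(list)
    for rec in records:
        by_hour[to_local(rec["timestamp"]).hour].append(rec["dwell_minutes"])
    return {hour: mean(values) for hour, values in by_hour.items()}


# Made-up records using assumed key names.
SAMPLE_DWELL = [
    {"timestamp": "2017-03-01T12:00:00Z", "roi_id": 123, "dwell_minutes": 23},
    {"timestamp": "2017-03-02T12:00:00Z", "roi_id": 123, "dwell_minutes": 27},
    {"timestamp": "2017-03-01T18:00:00Z", "roi_id": 123, "dwell_minutes": 40},
]
```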

Download Sample

Public Data Sources

Examples of public data sets:
Data.gov.sg
Data.gov.sg Developer Portal
LTA DataMall

You may also use any other data sets that you find, provided they are publicly available and licensed for free public use. You must cite all your data sources.