2. • Lecturer, DTETI UGM
• Director, DSSDI UGM
• Co-founder, datains.id
• Member, Forum Masyarakat Statistik
• Bachelor's degree (S1) in Electrical Engineering, UGM
• Master's degree (S2), Erasmus University, Netherlands
• Doctorate (S3) in Electronics, Cork Institute of Technology, Ireland
• https://acadstaff.ugm.ac.id/widyawan
3. “In God we trust, all others must bring data.”
W. Edwards Deming
4. Agenda
• Big Data and Official Statistics
• Methodology for Inferring Location
• Human Mobility Model
5. Evolution of Statistical Data Source
• Censuses, Survey: targeted, pre-determined questions and indicators
• Admin Source: structured data collected by gov, not specifically for statistical purposes
• Big Data: 229 papers with ‘big data official statistic’ keywords in http://scopus.com; 7603 papers with ‘big data statistic’
6. Beginning
• NASA researchers Michael Cox and David Ellsworth used the term “big data” for the first time in 1997 to describe a familiar challenge of the 1990s: supercomputers generating massive amounts of information (in Cox and Ellsworth’s case, simulations of airflow around aircraft) that could not be processed and visualized. “Data sets are generally quite large, taxing the capacities of main memory, local disk, and even remote disk,” they wrote. “We call this the problem of big data.” 1
• Doug Laney popularized the term in 2001 through the 3Vs (Volume, Velocity, Variety) of Big Data 2
1 M. Cox and D. Ellsworth, Application-controlled demand paging for out-of-core visualization, Proceedings. Visualization '97 (Cat. No. 97CB36155)
2 Controlling Data Volume, Velocity, and Variety, https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf
8. Big Data Source
Data Source | Availability | Veracity
Sensor | Closed | Medium to High
Admin/Transactional Database | Closed | High
Social Media | Open | Low
Online News | Open | Medium
• Sensor: GPS, Cellular Network, CCTV, WiFi, IoT
• Admin/Transactional Database: Gov. Enterprise System,
Marketplace, Banking, Tax, etc
• Social Media: Twitter, Facebook, IG, etc
• Online news: detik.com, kompas.com, cnn.com, etc
9. First Sensor Usage for Big Data Official Statistics
• 60,000 sensors
• Induction loop
• Camera
• Bluetooth
• Passing vehicle count each minute
• Large volume: 230 million records/day
Puts, M., Tennekes, M., Daas, P.J.H., de Blois, C. (2016) Using huge amounts of road sensor data for official statistics. European Conference on Quality in Official Statistics 2016, Madrid, Spain
10. Mobile Phone Data for Web Service: population movement between regions
Mobile Data for Tourism, Migration, Population and Transport in Korea, 6th International Conference on Big Data for Official Statistics, 31 August-2 Sept, 2020
11. • Big Data and Official Statistics
• Methodology for Inferring Location
• Human Mobility Model
12. Hidden Markov Model: the Basic Philosophy
• state x (location) is unknown
• one can only obtain the measurement z (e.g., time, signal strength) to estimate x, denoted p(x|z) in Bayesian terms
• each estimate has an error, often measured by accuracy
• movement of the location state = mobility
[Diagram: chain of hidden states x_{t-1}, x_t, x_{t+1}, each observed through a measurement z_{t-1}, z_t, z_{t+1}]
S. Thrun, W. Burgard, and D. Fox, Probabilistic Robotics (Intelligent Robotics and Autonomous Agents). The MIT Press, September 2005.
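To make p(x|z) concrete, here is a minimal sketch of one predict/update step of a discrete Bayes filter, the recursion behind the hidden Markov model above. The three cells, the transition probabilities, and the measurement likelihoods are hypothetical values chosen for illustration; they are not taken from the slides.

```python
def bayes_filter_step(belief, transition, likelihood):
    """One predict/update step of a discrete Bayes filter.

    belief[i]        : p(x_{t-1} = i | measurements so far)
    transition[i][j] : p(x_t = j | x_{t-1} = i)
    likelihood[j]    : p(z_t | x_t = j) for the measurement just received
    """
    n = len(belief)
    # Predict: push the previous belief through the motion (mobility) model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Update: weight each state by how well it explains the measurement.
    unnormalized = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical example: three cells, the device mostly stays where it is.
transition = [
    [0.8, 0.2, 0.0],
    [0.1, 0.8, 0.1],
    [0.0, 0.2, 0.8],
]
belief = [1 / 3, 1 / 3, 1 / 3]   # no prior knowledge of the location
likelihood = [0.1, 0.7, 0.2]     # assumed p(z_t | x_t) for the observed signal

belief = bayes_filter_step(belief, transition, likelihood)
most_likely_cell = belief.index(max(belief))
print(most_likely_cell, [round(b, 3) for b in belief])
```

Repeating the step for each new measurement gives a sequence of location estimates over time, which is the mobility trace.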
13. Method for Location Estimation
• GPS: lateration using TDOA (time difference of arrival)
• Cellular network: proximity (cell ID), angulation using AOA (angle of arrival), and lateration using RSS (received signal strength)
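As an illustration of lateration, the sketch below estimates a 2D position from three reference points with known coordinates and measured distances. It assumes the distances have already been derived from timing or signal-strength measurements, and that the coordinates are planar; the anchor positions and the function name `laterate_2d` are hypothetical.

```python
import math


def laterate_2d(anchors, distances):
    """Estimate (x, y) from three anchors and their measured distances.

    Subtracting the first circle equation from the other two removes the
    quadratic terms and leaves a 2x2 linear system, solved by Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = distances
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1**2 - d2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = d1**2 - d3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear; position is not unique")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - b1 * a21) / det
    return x, y


# Hypothetical anchors (e.g., base stations) in a local planar frame.
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_position = (3.0, 4.0)
# Noise-free distances, simulated here only to check the solver.
distances = [math.dist(true_position, a) for a in anchors]

estimate = laterate_2d(anchors, distances)
print(estimate)
```

With noisy distances or more than three anchors, a least-squares solution of the same linear system would be the usual choice.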
15. Methodology for Location and Mobility Estimation
• Individual: measurement → location estimation (lateration, angulation, proximity) → (location, id, t) → mobility estimation
• Group: batch of (location, id, t) → geometric location estimation via geometric query (point location, range searching, nearest neighbor) → group mobility estimation
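The range-searching query on a batch of (location, id, t) records can be sketched as follows. The record layout, the field names, and the sample coordinates are assumptions made for illustration. The linear scan keeps the example short; a spatial index would be the usual choice for large batches.

```python
from typing import NamedTuple


class Record(NamedTuple):
    device_id: str
    lon: float
    lat: float
    t: int  # seconds since the start of the observation window (assumed unit)


def range_search(records, lon_min, lat_min, lon_max, lat_max, t_start, t_end):
    """Return the records inside a bounding box and time window (inclusive)."""
    return [
        r for r in records
        if lon_min <= r.lon <= lon_max
        and lat_min <= r.lat <= lat_max
        and t_start <= r.t <= t_end
    ]


def distinct_devices(records):
    """Collapse records to the set of device ids they belong to."""
    return {r.device_id for r in records}


# Hypothetical batch of location records.
records = [
    Record("a", 110.37, -7.77, 0),
    Record("a", 110.38, -7.78, 600),
    Record("b", 110.40, -7.80, 300),
    Record("c", 110.90, -7.50, 300),
]

hits = range_search(records, 110.30, -7.85, 110.45, -7.70, 0, 900)
group = distinct_devices(hits)
print(len(hits), sorted(group))
```

Counting distinct ids per region and time window, then comparing consecutive windows, is one simple way to move from group location to group mobility.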
17. Most Common Technology for Inferring Group Location & Mobility
Technology | Plus | Minus | Example or Product/Provider
Global Positioning System | ID; global; accurate | client-based, special effort to gather group location/mobility; poor performance for indoor and urban canyon scenarios | Google Maps; Google Mobility; Waze; Lotadata
Cellular Network (MPD) | ID; nationwide; network-based | accuracy can vary; relatively pricey | Tsel; Isat
18. • Big Data and Official Statistics
• Methodology for Inferring Location
• Human Mobility Model
19. Group of Location Data from Cellular Network
group of sample data sent by operator
Individual record
courtesy of:
25. Why Human Mobility?
• Despite the diversity of their travel history, humans follow simple, reproducible temporal and spatial patterns
• Individuals have a significant probability of returning to a few highly frequented locations
• This regularity could impact all phenomena driven by human mobility, from epidemic prevention to emergency response and urban planning
González, M., Hidalgo, C. & Barabási, A. Understanding individual human mobility patterns. Nature 453, 779–782 (2008).
https://doi.org/10.1038/nature06958