Millions of people move through large cities every day. What if we could make that flow of people measurable and analyzable? Doing so would be of great value for city traffic planning, real-time monitoring of hot areas, and targeted advertising. This capability can be built by combining Apache Spark Streaming, Spark SQL, and Spark batch processing with DB2 with BLU Acceleration. Spark provides powerful stream and batch processing on big data, and BLU Acceleration strengthens complex analytics across multiple dimensions. Learn how BLU Acceleration and Spark are integrated into one solution. This session also includes a demo based on a large city in China.
2. Please Note:
• IBM’s statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM’s sole discretion.
• Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision.
• The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract.
• The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.
Performance is based on measurements and projections using standard IBM benchmarks in a controlled environment. The actual throughput or performance that any user will experience will vary depending upon many factors, including considerations such as the amount of multiprogramming in the user’s job stream, the I/O configuration, the storage configuration, and the workload processed. Therefore, no assurance can be given that an individual user will achieve results similar to those stated here.
3. Agenda
• Challenges and Motivations
• Approaches Discussion
• Technical Considerations
• Architecture
• Work Flow
• What’s Next?
• Q & A
4. Challenges in the City of Chongqing
Population: 32 million
Population density: 12,000 per square kilometer
4.4 million vehicles, up 41% last year (national average: 9%)
6. Various Types of Data
GPS
User Logs from Map Apps
Toll Station Sensors
Traffic Center
Mobile Signal Data
7. Mobile Signal Data – Cont’d
• A signal tower produces
− 3K to 5K records per minute
− 10K to 20K files per day, 10 GB to 15 GB in size
• Signal towers in all districts produce 150 GB to 200 GB of data per day
Record fields: MSID, TIMESTAMP, LAC, CELLID, EVENTID, FLAG, MSCID, BSCID ...
...
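A record with the fields listed on this slide could be parsed along the following lines. This is a minimal sketch, not the presenters' code: the comma delimiter, the choice to keep every value as a string, and the names `SignalRecord` and `parse_record` are assumptions. Fields the slide elides with "..." are kept untouched in `extra`.

```python
from typing import NamedTuple, Tuple

# Field names come from the slide. The comma delimiter and the treatment of
# every value as a plain string are assumptions made for illustration.
FIELDS = ("MSID", "TIMESTAMP", "LAC", "CELLID", "EVENTID", "FLAG", "MSCID", "BSCID")


class SignalRecord(NamedTuple):
    msid: str
    timestamp: str
    lac: str
    cellid: str
    eventid: str
    flag: str
    mscid: str
    bscid: str
    extra: Tuple[str, ...]  # any further fields, which the slide does not name


def parse_record(line: str) -> SignalRecord:
    """Split one delimited line into the eight named fields plus leftovers."""
    parts = [p.strip() for p in line.split(",")]
    if len(parts) < len(FIELDS):
        raise ValueError(f"expected at least {len(FIELDS)} fields, got {len(parts)}")
    named = parts[:len(FIELDS)]
    return SignalRecord(*named, extra=tuple(parts[len(FIELDS):]))


# Usage with a made-up line (the values are invented, not real data):
rec = parse_record("u001, 2015-10-26 08:00:05, 13001, 20457, 3, 0, m01, b07")
print(rec.msid, rec.timestamp, rec.cellid)
```

MSID, TIMESTAMP, and CELLID are the fields that carry the who, when, and where used in the later analyses.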
8. Mobile Signal Data
Key Features
• Coverage: covers the majority of the population and is very easy to collect
• Massive: millions of endpoints continually generate new records
• Real time: each record contains the information of who, when, and where
• Accurate: small deviation in time and location
9. Technical Considerations
Considerations
• A high-speed system to collect real-time data
• A scalable, reliable system to store and manage massive data
• A computing engine that supports various kinds of computation
• A flexible architecture that is capable of evolving
Key evaluation criteria
• Real-time capacity
• Scalability
• Reliability
• Accessibility
• Enterprise functionality
10. Technical Stack
• Apache Spark
− Lightning-fast in-memory computing engine
− Scalable, fault-tolerant, and distributed
− Rich set of libraries
• DB2 with BLU Acceleration
− Distributed columnar database
− Dynamic in-memory technology
11. Overall View
• Views: Fetch online or offline results from the data pool and present multi-dimensional views to end users.
• Applications: Customized applications that target various business requirements. Leverage a rich set of analysis tools in the computing pool to provide streaming and batch analysis capabilities.
• Data: A reliable and scalable data storage system with comprehensive data management capabilities.
12. Big Data Platform Architecture
[Architecture diagram: The Big Data Platform. Labeled components: Data sources, Data Ingestion, Kafka, Distributed File System, HDFS, Resource Management (YARN), Computation Engine, Streaming, Batch, Machine Learning, Column Database, API Services, Orchestration, Visualization, LDAP Service, Cluster Management, Security Service.]
14. Discover Value from the Data
Real-Time Interaction
Analyze hot areas in real time, query the population density of a particular area or region, and query traffic online.
Demographic Analytics
Analyze where each individual works and lives, and compute demographic statistics for millions of people.
Population Mobility
Analyze the incoming and outgoing flow of people, and predict rush hours in a particular area for a given direction of travel.
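The demographic analysis on this slide needs, for each subscriber, an estimate of where they live and work. The slides do not say how this is done. One common heuristic, sketched here purely as an assumption, takes the cell seen most often at night as "home" and the cell seen most often during weekday office hours as "work". The hour ranges and the name `infer_home_and_work` are illustrative choices.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Assumed hour ranges; a real deployment would tune these.
NIGHT_HOURS = set(range(0, 6)) | {22, 23}  # treated as "at home"
WORK_HOURS = set(range(9, 17))             # treated as "at work" on weekdays


def infer_home_and_work(observations):
    """observations: iterable of (msid, datetime, cellid).

    Returns {msid: (home_cell, work_cell)}; either entry may be None when
    no observation falls in the corresponding hours.
    """
    home_counts = defaultdict(Counter)
    work_counts = defaultdict(Counter)
    for msid, ts, cell in observations:
        if ts.hour in NIGHT_HOURS:
            home_counts[msid][cell] += 1
        elif ts.hour in WORK_HOURS and ts.weekday() < 5:  # Monday to Friday
            work_counts[msid][cell] += 1

    result = {}
    for msid in set(home_counts) | set(work_counts):
        home = home_counts[msid].most_common(1)
        work = work_counts[msid].most_common(1)
        result[msid] = (home[0][0] if home else None,
                        work[0][0] if work else None)
    return result


# Usage with invented observations:
sample = [
    ("u1", datetime(2015, 10, 26, 23, 0), "A"),
    ("u1", datetime(2015, 10, 27, 2, 0), "A"),
    ("u1", datetime(2015, 10, 27, 10, 0), "C"),
]
print(infer_home_and_work(sample))
```

In a Spark job the same counting would typically be expressed as a grouped aggregation over the batch data; the plain-Python version only shows the logic.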
15. Visualization
Real Time and Statistics
Leverage Cognos to generate various charts. This graph shows statistics of people mobility in a residential quarter over time.
Leverage ArcGIS to visualize population density in real time.
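A real-time density map needs, at minimum, a headcount per cell per time window. The sketch below counts distinct subscribers per cell in fixed (tumbling) windows. It is an illustration under stated assumptions, not the presenters' implementation: the five-minute window and the name `distinct_subscribers_per_cell` are hypothetical, and turning a headcount into a true density would also require each cell's coverage area, which the slides do not give.

```python
from collections import defaultdict
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)


def distinct_subscribers_per_cell(observations, window=timedelta(minutes=5)):
    """observations: iterable of (msid, datetime, cellid).

    Returns {(window_start, cellid): number of distinct MSIDs seen}.
    """
    seen = defaultdict(set)
    for msid, ts, cell in observations:
        # Floor the timestamp to the start of its tumbling window.
        index = (ts - EPOCH) // window  # timedelta // timedelta gives an int
        window_start = EPOCH + index * window
        seen[(window_start, cell)].add(msid)
    return {key: len(msids) for key, msids in seen.items()}


# Usage with invented observations:
sample = [
    ("u1", datetime(2015, 10, 26, 8, 1), "A"),
    ("u2", datetime(2015, 10, 26, 8, 3), "A"),
    ("u1", datetime(2015, 10, 26, 8, 6), "B"),
]
for (start, cell), count in sorted(distinct_subscribers_per_cell(sample).items()):
    print(start, cell, count)
```

A set per window keeps the count exact for small inputs; at the data volumes described earlier, a streaming engine would more likely use windowed aggregation with an approximate distinct count.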
16. Next Step
Discover more value from mobile signal data to help make smarter decisions
• Hot Area Alert
• Historic Traffic
• Metrics Query
• Residential Statistics
• Traffic Prediction
• Orientation Destination Statistics
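Several of these items, like the population mobility analysis earlier in the deck, come down to counting movements between cells. Reading "Orientation Destination Statistics" as origin-destination statistics is an assumption, as are the function names below. The sketch orders each subscriber's observations by time and counts every change of cell as one trip from the old cell to the new one.

```python
from collections import Counter, defaultdict


def origin_destination_counts(observations):
    """observations: iterable of (msid, timestamp, cellid), timestamps sortable.

    Returns a Counter mapping (origin_cell, destination_cell) to trip count.
    """
    by_subscriber = defaultdict(list)
    for msid, ts, cell in observations:
        by_subscriber[msid].append((ts, cell))

    od = Counter()
    for track in by_subscriber.values():
        track.sort(key=lambda point: point[0])  # order by time
        for (_, src), (_, dst) in zip(track, track[1:]):
            if src != dst:  # staying in the same cell is not a trip
                od[(src, dst)] += 1
    return od


def inflow_outflow(od, cell):
    """Total trips arriving at and leaving the given cell."""
    inflow = sum(n for (src, dst), n in od.items() if dst == cell)
    outflow = sum(n for (src, dst), n in od.items() if src == cell)
    return inflow, outflow


# Usage with invented observations (integer timestamps for brevity):
sample = [("u1", 1, "A"), ("u1", 2, "B"), ("u1", 4, "C"), ("u2", 1, "A"), ("u2", 5, "B")]
od = origin_destination_counts(sample)
print(dict(od))
print(inflow_outflow(od, "B"))
```

Restricting the input to one time window before counting gives per-period flows, which is the kind of input a rush-hour or traffic prediction step would need.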
18. We Value Your Feedback!
Don’t forget to submit your Insight session and speaker
feedback! Your feedback is very important to us – we use it
to continually improve the conference.
Access your surveys at insight2015survey.com to quickly
submit your surveys from your smartphone, laptop or
conference kiosk.