- 1. © 2015 IBM Corporation
Analyzing GeoSpatial data with IBM Cloud Data Services & Esri ArcGIS
Torsten Steinbach, IBM
torsten@de.ibm.com
@torsstei
Raj Singh, IBM
rrsingh@us.ibm.com
@rajrsingh
See also a demo at: http://ibm.biz/dashDB-geospatial-analysis-tutorial
Visit us at booth #1808!
- 2.
Notices and Disclaimers
Copyright © 2015 by International Business Machines Corporation (IBM). No part of this document may be reproduced or
transmitted in any form without written permission from IBM.
U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract
with IBM.
Information in these presentations (including information relating to products that have not yet been announced by IBM) has
been reviewed for accuracy as of the date of initial publication and could include unintentional technical or typographical
errors. IBM shall have no responsibility to update this information. THIS DOCUMENT IS DISTRIBUTED "AS IS" WITHOUT
ANY WARRANTY, EITHER EXPRESS OR IMPLIED. IN NO EVENT SHALL IBM BE LIABLE FOR ANY DAMAGE ARISING
FROM THE USE OF THIS INFORMATION, INCLUDING BUT NOT LIMITED TO, LOSS OF DATA, BUSINESS
INTERRUPTION, LOSS OF PROFIT OR LOSS OF OPPORTUNITY. IBM products and services are warranted according to
the terms and conditions of the agreements under which they are provided.
Any statements regarding IBM's future direction, intent or product plans are subject to change or withdrawal without
notice.
Performance data contained herein was generally obtained in controlled, isolated environments. Customer examples are
presented as illustrations of how those customers have used IBM products and the results they may have achieved. Actual
performance, cost, savings or other results in other operating environments may vary.
References in this document to IBM products, programs, or services do not imply that IBM intends to make such products,
programs or services available in all countries in which IBM operates or does business.
Workshops, sessions and associated materials may have been prepared by independent session speakers, and do not
necessarily reflect the views of IBM. All materials and discussions are provided for informational purposes only, and are
neither intended to, nor shall constitute legal or other guidance or advice to any individual participant or their specific situation.
It is the customer’s responsibility to ensure its own compliance with legal requirements and to obtain advice of competent legal
counsel as to the identification and interpretation of any relevant laws and regulatory requirements that may affect the
customer’s business and any actions the customer may need to take to comply with such laws. IBM does not provide legal
advice or represent or warrant that its services or products will ensure that the customer is in compliance with any law.
- 3.
Notices and Disclaimers (cont'd)
Information concerning non-IBM products was obtained from the suppliers of those products, their published
announcements or other publicly available sources. IBM has not tested those products in connection with this
publication and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM
products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products.
IBM does not warrant the quality of any third-party products, or the ability of any such third-party products to interoperate
with IBM’s products. IBM EXPRESSLY DISCLAIMS ALL WARRANTIES, EXPRESSED OR IMPLIED, INCLUDING BUT
NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.
The provision of the information contained herein is not intended to, and does not, grant any right or license under any
IBM patents, copyrights, trademarks or other intellectual property right.
IBM, the IBM logo, ibm.com, Bluemix, Blueworks Live, CICS, Clearcase, DOORS®, Enterprise Document
Management System™, Global Business Services ®, Global Technology Services ®, Information on Demand, ILOG,
Maximo®, MQIntegrator®, MQSeries®, Netcool®, OMEGAMON, OpenPower, PureAnalytics™, PureApplication®,
pureCluster™, PureCoverage®, PureData®, PureExperience®, PureFlex®, pureQuery®, pureScale®,
PureSystems®, QRadar®, Rational®, Rhapsody®, SoDA, SPSS, StoredIQ, Tivoli®, Trusteer®, urban{code}®,
Watson, WebSphere®, Worklight®, X-Force® and System z® Z/OS, are trademarks of International Business
Machines Corporation, registered in many jurisdictions worldwide. Other product and service names might be
trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at "Copyright and
trademark information" at: www.ibm.com/legal/copytrade.shtml.
- 4.
The Structure of Bluemix
- 5.
dashDB and the IBM Cloud
[Architecture diagram; recoverable labels:]
• dashDB (www.dashDB.com)
• SoftLayer Infrastructure as a Service
• Cloud-Based Systems of Engagement (NoSQL, Mobile Apps, Internet of Things, Social Media): www.bluemix.net, www.cloudant.com
• SDP: Schema Discovery Process
• DataWorks: Data Refinery Services
• IBM & Third Party Integrations (Cognos, SPSS, SAS, Tableau, ESRI ArcGIS)
• Systems of Record & Insight (Watson Analytics, DB2, HDP, flat files)
• Connector labels: Read/Write (HTTP), Write, Read/Write, Read/Write (On/Off Prem)
- 6.
There is Valuable and Free Data Online in the Cloud Everywhere
- 7.
Pulling Together Data in a Central Place in the Cloud
Data + Data > 2 x Data
Public Data
• Weather
• News
• Stocks
• Social Media
• ...
Enterprise Data
• Orders
• CRM
• Master Data
• Operations
• ...
Systems of Engagement
• IoT
• Mobile Apps
• Cloud Apps
Correlation of Structured Data
Combining various data in a data warehouse (DW) can be a fusion reactor for analytics
Benefits
• Speed to market
• Improved accuracy
• Lower cost
- 8.
Cloudant Overview
• Operational JSON data store
• RESTful CouchDB API
• Advanced APIs
- Replication & Sync
- Incremental MapReduce
- Geospatial
- Lucene Full-text Search
• Scalable, Highly Available Performance
- Cross-data center data distribution & fail over
- Geo load balancing
• Multi-tenant and dedicated-tenant clusters
• Monitoring, administration, & development dashboards
• Managed 24x7 by big data experts
“We want NoSQL for our GIS platform — we have internal and external customers who
want to ingest large streams of data from a range of sources like devices, sensors,
satellites, store that data, process it, and syndicate it across web apps.”
— Sr. Architect of Cloud Platforms
- 9.
Geospatial Edge: Moving data closer to users
Key Challenges
• Reduce time to delivery
• Local, read/write access
• Replication/sync in austere environments
• Making geodata transparent to the user
Cloudant Benefits
• High Availability and Partition Tolerance
• Offline sync for iOS, Android, and HTML5
• Sharded: geospatial data can be huge and must span multiple nodes
• GeoHash (Consistent Hash)
• Spatial search functions
• Configurable index types
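The GeoHash bullet can be illustrated with the standard public geohash encoding, which turns a latitude/longitude pair into a short string so that nearby points tend to share a prefix. This is a minimal sketch of that generic algorithm; the slides do not say whether Cloudant uses exactly this encoding internally, and the function name `geohash_encode` is an assumption made for illustration.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a standard geohash string."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0        # number of bits collected for the current character
    value = 0       # value of the current 5-bit group
    use_lon = True  # bits are interleaved, starting with longitude
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            value = value << 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bits += 1
        if bits == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[value])
            bits = 0
            value = 0
    return "".join(chars)


# Coordinates taken from the GeoJSON examples in this deck (GeoJSON stores lon, lat).
notre_dame = geohash_encode(48.8528, 2.3497, 6)
louvre = geohash_encode(48.8605, 2.3382, 6)
```

Because the two Paris sights are close together, their hashes start with the same characters, which is what makes a geohash usable as a sharding or index key.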
- 10.
Cloudant Warehousing
{JSON}
Other data sources
- 12.
Cloudant Warehousing
{JSON}
Schema Discovery Process (SDP)
• Targets homogeneous databases
• Discovers the schema
• dashDB tables are created from the schema
Data Transformation and Movement Process
• Validation of data against the schema
• Creation of dashDB inserts
• Multiple reader, transformer, and writer threads
• Continuous replication with Cloudant change feeds
• Issues are reported in the _overflow table
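The two processes above can be pictured with a small in-memory sketch: infer a flat schema from sample documents, then validate each document and route mismatches to an overflow list. The slides do not describe how the real SDP infers types or what exactly lands in `_overflow`, so the functions `discover_schema` and `split_by_schema`, the type rule, and the `visitors` field are all hypothetical.

```python
def discover_schema(docs):
    """Infer a flat column -> type-name mapping from sample JSON documents."""
    schema = {}
    for doc in docs:
        for key, value in doc.items():
            # The first type seen for a key wins (a simplifying assumption).
            schema.setdefault(key, type(value).__name__)
    return schema


def split_by_schema(docs, schema):
    """Return (rows, overflow): rows match the schema, overflow holds the rest."""
    rows, overflow = [], []
    for doc in docs:
        matches = set(doc) == set(schema) and all(
            type(doc[key]).__name__ == schema[key] for key in schema
        )
        if matches:
            rows.append(tuple(doc[key] for key in schema))
        else:
            overflow.append(doc)  # stands in for the _overflow table
    return rows, overflow


# Hypothetical sample documents from a homogeneous database.
sample = [
    {"_id": "a1", "name": "Le Louvre", "visitors": 9},
    {"_id": "a2", "name": "Notre-Dame", "visitors": 12},
]
schema = discover_schema(sample)
bad_doc = {"_id": "a3", "name": "Sacre-Coeur", "visitors": "many"}
rows, overflow = split_by_schema(sample + [bad_doc], schema)
```

The third document has a string where the discovered schema expects an integer, so it ends up in `overflow` instead of `rows`.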
- 13.
Cloudant Warehousing with GeoJSON
{GeoJSON}
Other data sources
- 14.
{GeoJSON}
GeoJSON data comes in 3 flavours:
...as "simple" geometry:
{
  "type": "LineString",
  "coordinates": [[2.3200, 48.8657], [2.2951, 48.8738]]
}
...as Feature Collection:
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Champs Elysées"},
      "geometry": {
        "type": "LineString",
        "coordinates": [[2.3200, 48.8657], [2.2951, 48.8738]]
      }
    },
    {
      "type": "Feature",
      "properties": {"name": "Notre-Dame"},
      "geometry": {
        "type": "Point",
        "coordinates": [2.3497, 48.8528]
      }
    }
  ]
}
...as Feature:
{
  "type": "Feature",
  "properties": {"name": "Champs Elysées"},
  "geometry": {
    "type": "LineString",
    "coordinates": [[2.3200, 48.8657], [2.2951, 48.8738]]
  }
}
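Code that consumes GeoJSON usually has to accept all three flavours. One possible approach, sketched here under the assumption that everything is normalized to a list of Feature objects, is shown below; the helper name `to_features` is hypothetical and not part of any IBM API.

```python
GEOMETRY_TYPES = {
    "Point", "MultiPoint", "LineString", "MultiLineString",
    "Polygon", "MultiPolygon", "GeometryCollection",
}


def to_features(obj):
    """Return a list of GeoJSON Feature dicts for any of the three flavours."""
    kind = obj.get("type")
    if kind == "FeatureCollection":
        return list(obj.get("features", []))
    if kind == "Feature":
        return [obj]
    if kind in GEOMETRY_TYPES:
        # Wrap a bare geometry in a Feature with empty properties.
        return [{"type": "Feature", "properties": {}, "geometry": obj}]
    raise ValueError(f"not a GeoJSON object: {kind!r}")


# The three flavours from the slide, built from the same LineString.
line = {"type": "LineString",
        "coordinates": [[2.3200, 48.8657], [2.2951, 48.8738]]}
feature = {"type": "Feature",
           "properties": {"name": "Champs Elysées"},
           "geometry": line}
collection = {"type": "FeatureCollection", "features": [feature]}
```

After normalization, downstream code only needs to handle the Feature case.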
- 15.
Cloudant Database: SightsInParis
{
  "_id": "75000",
  "_rev": "08066f8ecd5f780646aa1573460852c",
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "Le Louvre"},
      "geometry": {"type": "Point", "coordinates": [2.3382, 48.8605]}
    },
    {
      "type": "Feature",
      "properties": {"name": "Notre-Dame"},
      "geometry": {"type": "Point", "coordinates": [2.3500, 48.8530]}
    }
  ]
}
dashDB Warehouse
Table: SIGHTSINPARIS
_id   | _rev       | type
75000 | 08066f8... | FeatureCollection
Table: SIGHTSINPARIS_FEATURES
_id   | array_index | type    | properties_name | geometry
75000 | 0           | Feature | Le Louvre       | POINT (2.34 48.86)
75000 | 1           | Feature | Notre-Dame      | POINT (2.35 48.85)
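The mapping on this slide (one parent row per document, one child row per array element, geometry stored as well-known text) can be sketched in a few lines. This is an illustration of the table layout only, not the warehousing service's actual code; `flatten` and `geometry_to_wkt` are hypothetical helpers, and only Point and LineString are handled.

```python
def geometry_to_wkt(geom):
    """Convert a GeoJSON Point or LineString to well-known text (WKT)."""
    if geom["type"] == "Point":
        x, y = geom["coordinates"][:2]
        return f"POINT ({x} {y})"
    if geom["type"] == "LineString":
        points = ", ".join(f"{x} {y}" for x, y, *_ in geom["coordinates"])
        return f"LINESTRING ({points})"
    raise NotImplementedError(geom["type"])


def flatten(doc):
    """Split a FeatureCollection document into a parent row and child rows."""
    parent = {"_id": doc["_id"], "_rev": doc["_rev"], "type": doc["type"]}
    children = [
        {
            "_id": doc["_id"],          # links the child row to its parent
            "array_index": index,       # position in the features array
            "type": feature["type"],
            "properties_name": feature["properties"].get("name"),
            "geometry": geometry_to_wkt(feature["geometry"]),
        }
        for index, feature in enumerate(doc["features"])
    ]
    return parent, children


doc = {
    "_id": "75000",
    "_rev": "08066f8ecd5f780646aa1573460852c",
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"name": "Le Louvre"},
         "geometry": {"type": "Point", "coordinates": [2.3382, 48.8605]}},
        {"type": "Feature", "properties": {"name": "Notre-Dame"},
         "geometry": {"type": "Point", "coordinates": [2.3500, 48.8530]}},
    ],
}
parent, children = flatten(doc)
```

The sketch keeps the full coordinate precision, whereas the table on the slide shows rounded values.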
- 16.
More to read at
https://cloudant.com/blog/warehousing-geojson-documents
- 17.
GeoSpatial Analytics In dashDB
• Implements OGC SFS & ISO SQL/MM part 3 standards for spatial
- See http://www.iso.org/iso/catalogue_detail.htm?csnumber=38651
• Spatial data type ST_GEOMETRY (hierarchy)
• Enables spatial operations (e.g. joins) in database through spatial operators available as user defined functions
• Dedicated support in ESRI tools starting V 10.3
• GeoSpatial Applications Examples
- Telco Location Data
- Utilities Smart Grid
- GPS Tracking in Transportation
- Insurance Demographics
- Cable Marketing Campaigns
- Retail Store Placement
- 18.
GeoData & dashDB
[Diagram: geodata formats shown around dashDB: GeoJSON, WKT, Shapefiles, WKB, GML]
- 19.
Spatial Functions and Predicates in dashDB
ST_Intersects(g1,g2): highways in flood zones
SELECT a.name, a.type
FROM highways a, floodzones b
WHERE ST_Intersects(a.location, b.location) = 1
AND b.last_flood > 1950
ST_Distance(g1,g2): accidents near intersections
SELECT a.road_id, a.time, i.id,
ST_Distance(a.loc, i.loc, 'METER') AS distance
FROM accidents a, intersections i
WHERE ST_Distance(a.loc, i.loc, 'METER') < 10000
AND a.weather = 'RAIN'
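To get a feel for what a distance in meters between two lon/lat points means, the haversine great-circle formula is a common approximation. The sketch below is not how dashDB computes ST_Distance (the slides do not say, and a spatial database works with a spatial reference system); the function name and the spherical Earth radius are assumptions.

```python
import math

EARTH_RADIUS_M = 6371008.8  # mean Earth radius in meters (spherical approximation)


def haversine_m(lon1, lat1, lon2, lat2):
    """Approximate great-circle distance in meters between two lon/lat points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


# Le Louvre and Notre-Dame, using the coordinates from the warehousing example.
d = haversine_m(2.3382, 48.8605, 2.3500, 48.8530)

# The same kind of filter as the accidents query: keep pairs closer than a limit.
is_near = d < 10000
```

The two sights are a little over a kilometer apart, so they pass the 10000 meter filter used in the accidents query.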
- 20.
And Many More …
ST_Area
ST_AsBinary
ST_AsText
ST_Boundary
ST_Buffer
ST_Centroid
ST_Contains
ST_ConvexHull
ST_CoordDim
ST_Crosses
ST_Difference
ST_Dimension
ST_Disjoint
ST_Distance
ST_Endpoint
ST_Envelope
ST_Equals
ST_ExteriorRing
ST_GeomFromWKB
ST_GeometryFromText
ST_GeometryN
ST_GeometryType
ST_InteriorRingN
ST_Intersection
ST_Intersects
ST_IsClosed
ST_IsEmpty
ST_IsRing
ST_IsSimple
ST_IsValid
ST_Length
ST_LineFromText
ST_LineFromWKB
ST_MLineFromText
ST_MLineFromWKB
ST_MPointFromText
ST_MPointFromWKB
ST_MPolyFromText
ST_MPolyFromWKB
ST_NumGeometries
ST_NumInteriorRing
ST_NumPoints
ST_OrderingEquals
ST_Overlaps
ST_Perimeter
ST_Point
ST_PointFromText
ST_PointFromWKB
ST_PointN
ST_PointOnSurface
ST_PolyFromText
ST_PolyFromWKB
ST_Polygon
ST_Relate
ST_SRID
ST_StartPoint
ST_SymmetricDiff
ST_Touches
ST_Transform
ST_Union
ST_WKBToSQL
ST_WKTToSQL
ST_Within
ST_X
ST_Y
And more:
• Simplified constructors from x,y, WKT, WKB, GML, shape
• Linear referencing
• Spatial aggregation
• ST_AsGML
• ST_AsShape
- 21.
Spatial Constructor Functions
• ST_Point(x, y, srs_id): creates a point at this location
• ST_Point('POINT (-121.5 37.2)', 1)
• ST_LineString('LINESTRING (-121.5 37.2, -121.7 37.1)', 1)
• ST_Polygon(CAST (? AS CLOB(1M)), 1)
- For a host variable containing well-known text, well-known binary, or shape representation
- 22.
Spatial Predicates: WHERE Clause
• ST_Distance(geom1, geom2) < distance_constant or var
• ST_Contains(geom1, geom2) = 1
• ST_Within(geom1, geom2) = 1
• EnvelopesIntersect(geom1, geom2) = 1
• EnvelopesIntersect(geom1, x1, y1, x2, y2, srs_id) = 1
• ST_Area(geom) < some_value
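The idea behind an envelope test is that comparing bounding rectangles is much cheaper than comparing exact geometries, so it works well as a first filter. The sketch below shows the concept only; it is not dashDB's implementation of EnvelopesIntersect, and the helper names are hypothetical.

```python
def envelope(coords):
    """Return the bounding box (min_x, min_y, max_x, max_y) of a coordinate list."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return (min(xs), min(ys), max(xs), max(ys))


def envelopes_intersect(a, b):
    """True if two bounding boxes overlap or touch."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 <= bx2 and bx1 <= ax2 and ay1 <= by2 and by1 <= ay2


# Envelope of the Champs Elysées LineString from the GeoJSON slide.
champs = envelope([(2.3200, 48.8657), (2.2951, 48.8738)])

# Two hypothetical query windows given as x1, y1, x2, y2.
overlapping = envelopes_intersect(champs, (2.30, 48.86, 2.31, 48.87))
disjoint = envelopes_intersect(champs, (2.34, 48.85, 2.36, 48.86))
```

A geometry whose envelope does not intersect the query window cannot intersect it exactly either, which is why this test is a safe pre-filter.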
- 23.
Spatial Functions that Create New Spatial Values
• ST_Buffer(geom, distance)
• ST_Centroid(geom)
• ST_Intersection(geom1, geom2)
• ST_Union(geom1, geom2)
- 24.
Functions that Return Information About a Spatial Value
• ST_Area(geom), ST_Length(geom)
• ST_MinX(geom), ST_MinY(geom), ST_MaxX(geom), ST_MaxY(geom)
• ST_IsMeasured(geom)
• ST_X(geom), ST_Y(geom)
• ST_AsText(geom)
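For planar coordinates, the kind of information these functions return can be computed by hand. The sketch below uses the shoelace formula for area and summed segment lengths for length; it assumes flat x/y coordinates and says nothing about how dashDB handles units or spatial reference systems. The function names and the rectangle are made up for illustration.

```python
import math


def ring_area(ring):
    """Planar area of a closed ring (first point repeated last), shoelace formula."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def path_length(points):
    """Sum of straight segment lengths along a sequence of points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def min_max(points):
    """Return (min_x, min_y, max_x, max_y), analogous to ST_MinX ... ST_MaxY."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


# Hypothetical 4 x 3 rectangle.
rectangle = [(0, 0), (4, 0), (4, 3), (0, 3), (0, 0)]
area = ring_area(rectangle)
perimeter = path_length(rectangle)
bounds = min_max(rectangle)
```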
- 25.
Harness the Full Power of SQL
• Outer join
• Common table expressions
• Recursive queries, sub-queries
• Aggregate functions
• Order by, group by, having clauses
• OLAP, XML, and more ...
Example problem: Determine the average household income for the sales zone of each store in the San Diego area.
WITH sdStores AS (
  SELECT * FROM stores
  WHERE st_within(location, :sandiego) = 1)
SELECT s.id, s.name, AVG(h.income)
FROM houseHolds h, sdStores s
WHERE st_intersects(s.zone, h.location) = 1
GROUP BY s.id, s.name
ORDER BY s.name
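The join and aggregation in this query can be mimicked in memory to show what the database does in one statement: test each household location against each store zone, then average income per store. This is a rough analogue under simplifying assumptions (polygon zones as closed rings, a ray-casting point-in-polygon test, no San Diego pre-filter); all names and data are hypothetical.

```python
from collections import defaultdict


def point_in_polygon(point, ring):
    """Ray-casting test: is the point inside the closed ring?"""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def average_income_by_store(stores, households):
    """Average household income per store zone, ordered by store name."""
    sums = defaultdict(lambda: [0.0, 0])  # (id, name) -> [income total, count]
    for store in stores:
        for household in households:
            if point_in_polygon(household["location"], store["zone"]):
                acc = sums[(store["id"], store["name"])]
                acc[0] += household["income"]
                acc[1] += 1
    ordered = sorted(sums.items(), key=lambda item: item[0][1])
    return {key: total / count for key, (total, count) in ordered}


# Hypothetical data: one square sales zone and three households.
stores = [{"id": 1, "name": "North",
           "zone": [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]}]
households = [
    {"location": (2, 2), "income": 50000},
    {"location": (5, 5), "income": 70000},
    {"location": (20, 20), "income": 90000},  # outside the zone
]
result = average_income_by_store(stores, households)
```

In SQL the nested loop, the spatial test, and the grouping are expressed declaratively and can use spatial indexes, which is the point of the slide.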
- 26.
Predictive Analytics With R In dashDB
• Built-in R runtime & R Studio
• ibmdbR package
- Data frames logically representing data physically residing in dashDB tables
> library(ibmdbR)
> con <- idaConnect("BLUDB", "", "")
> idaInit(con)
> sysusage <- ida.data.frame('DB2INST1.SHOWCASE_SYSUSAGE')
> systems <- ida.data.frame('DB2INST1.SHOWCASE_SYSTEMS')
> systypes <- ida.data.frame('DB2INST1.SHOWCASE_SYSTYPES')
- Push down of R data preparation to dashDB
> sysusage2 <- sysusage[sysusage$MEMUSED>50000, c("MEMUSED","USERS")]
> mergedSys <- idaMerge(systems, systypes, by='TYPEID')
> mergedUsage <- idaMerge(sysusage2, mergedSys, by='SID')
- Push down of analytic algorithms to in-db execution
> lm1 <- idaLm(MEMUSED~USERS, mergedUsage)
[Diagram labels: Browser, R Studio, Any R Runtime, ibmdbR, dashDB]
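The model formula `MEMUSED~USERS` describes a simple linear regression of memory used on number of users. As a rough illustration of what such a fit computes, here is an ordinary least squares sketch on made-up numbers; it runs locally rather than in the database, it is not the ibmdbR implementation, and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical sample: memory used grows by 2000 units per additional user.
users = [1, 2, 3, 4]
memused = [52000.0, 54000.0, 56000.0, 58000.0]
intercept, slope = fit_line(users, memused)
```

The appeal of the push-down approach on the slide is that the sums behind such a fit are computed where the data lives, so only the small result travels to the R session.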
- 27.
Demo:
- Cloudant
- Bluemix
- dashDB
- Insurance Show Case
- Spatial analytics with R
- 28.
Insurance Risk Analysis, Fraud Detection, Damage Prevention
See Video at: http://ibm.biz/dashDB-geospatial-analysis-tutorial
Public spatial data sets available online
- Historical tornadoes from 1950s to today: http://www.spc.noaa.gov/gis/svrgis/
- Current tornado weather warnings: http://www.nws.noaa.gov/regsci/gis/shapefiles/
- US counties: https://www.census.gov/geo/maps-data/data/tiger-line.html
[Diagram; recoverable labels:]
• Mobile application generating spatial data for insurance claims for tornado damage (www.bluemix.net)
• Cloud service for persistency of system of engagement (www.cloudant.com)
• Cloud warehouse service for analytics and correlation between customer data and public or third party data (dashDB)
• Visualization and spatial analysis capabilities by Esri ArcGIS
• Insurance Master Data (customers)