Location brings an opportunity to gain a competitive edge by unlocking new insights that others have not considered. Why have they not considered location in the past? Because location is messy! Lat/longs, shapes on a map, lines on a map, routes, distance calculations… all of this data is often messy and difficult to bring into an analytics application. Learn how Pitney Bowes and Ironside applied sophisticated methodologies to Organize, Enrich and Analyze data from a location perspective.
Data Con LA 2019 - Pitney Bowes methodologies to Organize, Enrich and Analyze data from a location perspective by Dan Kernaghan & Tim Kreytak
2. Providing context, accuracy, and relevance to location through
Software – Spatial, Geocoding, Address Validation, Routing, and more
Data – Global Streets, Property Attributes, Boundaries, Demographics
Services – ETL, Integration, Analytics, Machine Learning, AI, and more
Every Address has a Location but not every Location has an Address
We leverage our experience with addresses to enrich them with 9100+ attributes that describe a location
3. Typical Database Options
• Traditional – RDBMS
• Generally use GIS retrieval
• Partitioned and indexed
• Updates are direct replacement
Big Data equalizes data access, and the balance of fast ingestion versus fast retrieval
Traditional processing: GIS Database
Big Data processing: File/Data Frame
Typical Big Data Options
• Traditional – HDFS
• Sharded – HBase, Phoenix, Redshift, Bigtable, etc.
• Partitioned by location, time
• Updates – Defined as SCD Type II
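The contrast between "direct replacement" updates and SCD Type II updates can be made concrete with a small sketch. This is only an illustration of the general Slowly Changing Dimension Type II pattern, not code from the presentation: the function name `apply_scd2`, the row fields (`valid_from`, `valid_to`, `is_current`), and the key `loc-001` are all hypothetical.

```python
from datetime import date


def apply_scd2(history, key, new_attrs, as_of):
    """Apply an SCD Type II update: keep old versions, append a new current row.

    `history` is a list of dict rows (hypothetical schema). Instead of
    overwriting the existing row (direct replacement), the current row is
    closed out and a new row is appended.
    """
    current = next(
        (r for r in history if r["key"] == key and r["is_current"]), None
    )
    if current is not None:
        if current["attrs"] == new_attrs:
            return history  # nothing changed, so no new version is written
        # Close the old version rather than replacing it.
        current["valid_to"] = as_of
        current["is_current"] = False
    history.append(
        {
            "key": key,
            "attrs": dict(new_attrs),
            "valid_from": as_of,
            "valid_to": None,
            "is_current": True,
        }
    )
    return history


# Illustrative usage with made-up values.
rows = []
apply_scd2(rows, "loc-001", {"flood_zone": "A"}, date(2019, 1, 1))
apply_scd2(rows, "loc-001", {"flood_zone": "B"}, date(2019, 6, 1))
for r in rows:
    print(r["attrs"], r["valid_from"], r["valid_to"], r["is_current"])
```

Because every change only appends a row and flips a flag on one older row, this style tends to suit append-friendly stores, at the cost of filtering on `is_current` (or a date range) at read time.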
4. Latitude-Longitude
Advantages
• Hierarchical
• Easy to generate
Disadvantages
• Not tied to Address
• 2-dimensional
• Not always accurate
Geohash
Advantages
• Very Accurate
• Hierarchical
• Easy to generate
• Easy to partition
Disadvantages
• Not tied to Address
Pitney Bowes
pbKey
Advantages
• Persistent
• Hierarchical
• Tied to Address
• Unique
Disadvantages
• Limited Coverage
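To show why a geohash is described above as hierarchical, easy to generate, and easy to partition, here is a minimal sketch of the publicly documented geohash encoding (interleaved longitude/latitude bits, base-32 alphabet). The function name `geohash_encode` is illustrative, and this is not Pitney Bowes code; the pbKey is a proprietary identifier and is not reproduced here.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


full = geohash_encode(57.64911, 10.40744, precision=9)
coarse = geohash_encode(57.64911, 10.40744, precision=4)
print(full, coarse, full.startswith(coarse))
```

A shorter geohash is always a prefix of a longer one for the same point, so truncating the string gives a coarser cell that contains the finer one. That prefix property is what makes it convenient as a partition key.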
18. DataRobot leaderboard:
● Ranks 50+ models by performance
● Balances performance and accuracy
● Performs intensive cross validation to protect against overfitting
● Allows manual retraining of specific models/blueprints on various feature sets
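The idea of ranking models by cross-validated performance can be sketched in plain Python. This is a generic k-fold illustration under simple assumptions (two toy models, synthetic data, mean squared error as the metric); it does not use or represent the DataRobot API, and all names here are hypothetical.

```python
def k_fold_indices(n, k):
    """Split indices 0..n-1 into k interleaved folds."""
    return [list(range(i, n, k)) for i in range(k)]


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_linear(xs, ys):
    """Ordinary least squares for a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def cv_mse(fit, xs, ys, k=5):
    """Mean squared error measured only on held-out folds."""
    n = len(xs)
    total = 0.0
    for fold in k_fold_indices(n, k):
        held_out = set(fold)
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        model = fit(train_x, train_y)
        total += sum((model(xs[i]) - ys[i]) ** 2 for i in fold)
    return total / n


def leaderboard(models, xs, ys, k=5):
    """Rank models from lowest to highest cross-validated error."""
    scores = [(name, cv_mse(fit, xs, ys, k)) for name, fit in models.items()]
    return sorted(scores, key=lambda s: s[1])


# Synthetic data: y depends linearly on x.
xs = [float(x) for x in range(20)]
ys = [2 * x + 1 for x in xs]
board = leaderboard({"mean": fit_mean, "linear": fit_linear}, xs, ys)
for name, score in board:
    print(name, score)
```

Scoring each model only on rows it did not see during fitting is what guards against rewarding a model that merely memorizes the training data.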