Big MDM Part 2: Using a Graph Database for MDM and Relationship Management

During this Big Data Warehousing Meetup, we discussed how graph databases work, shared some real world use cases, and showed a live demo of the world’s leading graph database, Neo4J. Pitney Bowes demonstrated their new MDM product developed on a graph database.

For more information, check out the other slides from this meetup or visit our website at www.casertaconcepts.com


  1. BDW Meetup: Big MDM Part 2. Using a Graph Database for MDM & Relationship Management. Sponsored by: Hosted by:
  2. Agenda • 6:30 Networking: Grab some food and drink... Make some friends. • 6:45 Joe Caserta, President, Caserta Concepts. Welcome + Intro to Big MDM: About the Meetup. Why MDM needs Graph now. • 7:00 Elliott Cordo, Chief Architect, Caserta Concepts. Intro to Graph Databases and Cypher: Deep dive into graph technology and how to work with it. • 7:20 David Fauth, Senior Engineering Consultant, Neo Technology. Neo4j and Use Cases: Real-world solutions and a demo of Neo4j for relationship management. • 7:50 Aaron Wallace, Principal Product Manager, Pitney Bowes. Spectrum MDM Hub: Model, manage and govern data with a graph database. • 8:20 Q&A: Ask questions, share your experience.
  3. About the BDW Meetup • Big Data is a complex, rapidly changing landscape • We want to share our stories and hear about yours • Great networking opportunity for like-minded data nerds • Founded by Caserta Concepts, November 10, 2012 • Next BDW Meetup: April 7 • Topic: Predictive Analytics on Hadoop (with Zementis) • Location: NWC • #BDWmeetup @CasertaConcepts @neo4j @PitneyBowes
  4. Caserta Timeline: Top 20 Big Data Consulting - CIO Review Launched Big Data practice Co-author, with Ralph Kimball, The Data Warehouse ETL Toolkit (Wiley) Dedicated to Data Warehousing, Business Intelligence since 1996 Began consulting database programming and data modeling 25+ years hands-on experience building database solutions Founded Caserta Concepts in NYC Web log analytics solution published in Intelligent Enterprise Formalized Alliances / Partnerships – System Integrators Partnered with Big Data vendors Cloudera, Hortonworks, IBM, Cisco, Datameer, Basho, more… Launched Training practice, teaching data concepts world-wide Laser focus on extending Data Warehouses with Big Data solutions 1986 2004 1996 2009 2001 2010 2013 Launched Big Data Warehousing (BDW) Meetup-NYC ~1500 Members 2012 2014 Established best practices for big data ecosystem implementation – Healthcare, Finance, Insurance Top 20 Most Powerful Big Data consulting firms Dedicated to Data Governance Techniques on Big Data (Innovation)
  5. About Caserta Concepts • Award-winning technology innovation consulting with expertise in: • Big Data Solutions • Data Warehousing • Business Intelligence • Core focus in the following industries: • eCommerce / Retail / Marketing • Financial Services / Insurance • Healthcare / Ad Tech / Higher Ed • Established in 2001: • Increased growth year-over-year • Industry-recognized workforce • Strategy, Implementation • Writing, Education, Mentoring • Data Science & Analytics • Cloud Computing • Data Interaction & Visualization
  6. Help Wanted: Does this word cloud excite you? Speak with us about our open positions: leslie@casertaconcepts.com. Word cloud terms: Spark, Big Data Architect, NoSQL, EC2, EMR, Redshift
  7. Master Data Management Components. Diagram labels: Data, User Interface, Services, Workflow, Rules, Security, Members, Providers, Agents, Plans, Policies. • Consistent Policy Enforcement and Security • Integration with existing ecosystem • Data Governance through Workflow Management • Data Quality enforcement through metadata-driven rules • Time-Variant Hierarchies and attributes • High Performance, Flexible, Scalable Database – Think Graph!
  8. How MDM Works: Standardization, Matching, Survivorship, Validation, Publication
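  9. Mastering Data (diagram labels: Staging Library, Standardization, Matching, Consolidated Library, Survivorship, Integrated Library, Validation)

     Source records as staged:

     | Source | ID  | Name          | Home Address    | Birth Date | SSN         |
     |--------|-----|---------------|-----------------|------------|-------------|
     | SYS A  | 123 | Jim Stagnitto | 123 Main St     | 8/20/1959  | 123-45-6789 |
     | SYS B  | ABC | J. Stagnitto  | 132 Main Street | 8/20/1959  | 123-45-6789 |
     | SYS C  | XYZ | James Stag    | NULL            | 8/20/1959  | NULL        |

     After Standardization and Matching:

     | Source | ID  | Name          | Home Address    | Birth Date | SSN         | Std Name        | Std Addr        | MDM ID |
     |--------|-----|---------------|-----------------|------------|-------------|-----------------|-----------------|--------|
     | SYS A  | 123 | Jim Stagnitto | 123 Main St     | 8/20/1959  | 123-45-6789 | James Stagnitto | 123 Main Street | 1      |
     | SYS B  | ABC | J. Stagnitto  | 132 Main Street | 8/20/1959  | 123-45-6789 | James Stagnitto | 132 Main Street | 1      |
     | SYS C  | XYZ | James Stag    | NULL            | 8/20/1959  | NULL        | James Stag      | NULL            | 1      |

     After Survivorship:

     | MDM ID | Name            | Home Address    | Birth Date | SSN         |
     |--------|-----------------|-----------------|------------|-------------|
     | 1      | James Stagnitto | 123 Main Street | 8/20/1959  | 123-45-6789 |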
  10. MDM Information Ecosystem: Informational Master Data, Operational Master Data, Holistic Master Data. Diagram labels: Service, Leads, Policies, Claims, Enrolls, Sales, Finance, DW Dimensions & Cross-References, Marketing Insights
  11. The Reality of Mastering Data
  12. Graph Databases (NoSQL) to the Rescue • Hierarchical relationships are never rigid • Relational models with tables and columns are not flexible enough • Neo4j is the leading graph database • Many MDM systems are going graph: • Pitney Bowes - Spectrum MDM • Reltio - Worry-Free Data for Life Sciences
  13. Graph Databases - Who are the players? Based on popularity: http://db-engines.com/en/ranking/graph+dbms
  14. Our favorite – Neo4J • Open source graph database, implemented in Java • 1.0 released in 2010 → mature • Popular - large community • Commercially supported • Easy to set up and use
  15. GRAPH DATABASES FOR MDM. Elliott Cordo, Chief Architect, Caserta Concepts
  16. Graph DB, special kind of NoSQL • A NoSQL database that is all about relationships • Relationships are first-class citizens, not just a “constraint”
  17. Graph Use Cases • Social Networks • Network Asset Management • Portfolio Management • Risk Analysis • Master Data Management …and many more
  18. Caserta Projects • Relationship Science Workspace → Financial • Alumni/Corporate Donor Network → Higher Education
  19. What is wrong with the traditional approach to MDM • Conceptual problems with the “enterprise” approach • Long, complex implementations → low ROI • Complex data model • Too much human interaction • Deliverable??? • Challenges with big data • Data volumes • Evolving data sources • Need to further remove humans from the process
  20. MDM data persistence • Fundamental challenges with data storage • Sparse data • Evolving schema • Relationships • How do we handle this in an RDBMS? • Custom relations • Extreme normalization
  21. How does a Graph DB help MDM • Data is stored in its natural form → no mismatch between requirements and data model • Both Nodes and Relationships can have properties → supports sparse and evolving data • MDM for analytics → your MDM solution now delivers new enablement, not just a back office system • Relationship science
  22. So how do you work with a graph • Gremlin – traditional, supported by most Graph databases • Cypher – high level, user friendly
  23. Cypher – ASCII art
  24. Cypher – “select *”: Match a pattern, return results
  25. Relationship directions
  26. Easy queries where graphs shine • 2nd or nth level connections • Shortest path
  27. Getting data in • Cypher shell • APIs: REST, modules/libraries for most languages • CSV Loader
  28. Tools and Data Viz. Cypher is cool and all but are there BI tools? • Little support by mainstream tools • Healthy ecosystem of graph-specific exploration and data visualization tools
  29. Open source and commercial: Gephi, Tom Sawyer, linkurio.us
  30. elliott@casertaconcepts.com
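
The standardization, matching and survivorship steps from slides 8 and 9 can be sketched in a few lines of Python. This is only an illustration using the three sample records from slide 9. The lookup tables, the matching rule (equal SSNs, otherwise same birth date and compatible names) and the survivorship rule (first non-null value in source order) are assumptions made for this sketch, not rules stated in the talk or used by any particular MDM product.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Record:
    source: str
    source_id: str
    name: str
    address: Optional[str]
    birth_date: str
    ssn: Optional[str]


# Hypothetical lookup tables; a real hub would use much richer reference data.
NICKNAMES = {"jim": "James", "j.": "James"}
STREET_WORDS = {"St": "Street"}


def standardize(record):
    """Expand known nicknames and street abbreviations."""
    first, _, rest = record.name.partition(" ")
    name = f"{NICKNAMES.get(first.lower(), first)} {rest}".strip()
    address = record.address
    if address is not None:
        address = " ".join(STREET_WORDS.get(w, w) for w in address.split())
    return replace(record, name=name, address=address)


def is_match(a, b):
    """Assumed rule: equal SSNs, else same birth date plus compatible names."""
    if a.ssn and b.ssn:
        return a.ssn == b.ssn
    a_first, _, a_last = a.name.partition(" ")
    b_first, _, b_last = b.name.partition(" ")
    names_compatible = (
        a_first == b_first
        and bool(a_last and b_last)
        and (a_last.startswith(b_last) or b_last.startswith(a_last))
    )
    return a.birth_date == b.birth_date and names_compatible


def cluster_records(records):
    """Greedy matching: join the first cluster that holds a matching record."""
    clusters = []
    for record in records:
        for cluster in clusters:
            if any(is_match(record, member) for member in cluster):
                cluster.append(record)
                break
        else:
            clusters.append([record])
    return clusters


def survive(cluster):
    """Per field, keep the first non-null value in source order."""
    def first_value(field):
        return next((getattr(r, field) for r in cluster if getattr(r, field)), None)
    return {f: first_value(f) for f in ("name", "address", "birth_date", "ssn")}


staging = [
    Record("SYS A", "123", "Jim Stagnitto", "123 Main St", "8/20/1959", "123-45-6789"),
    Record("SYS B", "ABC", "J. Stagnitto", "132 Main Street", "8/20/1959", "123-45-6789"),
    Record("SYS C", "XYZ", "James Stag", None, "8/20/1959", None),
]
consolidated = [standardize(r) for r in staging]
clusters = cluster_records(consolidated)
master = survive(clusters[0])
print(master)
```

With these assumed rules all three source records fall into one cluster, and the surviving values agree with the single master row shown on slide 9.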

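Slides 21 and 26 describe nodes and relationships that both carry properties, and queries such as nth-level connections and shortest path. The sketch below imitates those ideas with a tiny in-memory structure in plain Python. It is not Neo4j or Cypher, the `PropertyGraph` class is hypothetical, the people and companies are made-up sample data, and relationships are traversed in both directions as a simplification.

```python
from collections import deque


class PropertyGraph:
    """Minimal in-memory property graph (illustrative only)."""

    def __init__(self):
        self.nodes = {}  # node id -> dict of properties (may be sparse)
        self.edges = {}  # node id -> list of (rel_type, neighbour, properties)

    def add_node(self, node_id, **props):
        self.nodes[node_id] = props
        self.edges.setdefault(node_id, [])

    def add_relationship(self, start, rel_type, end, **props):
        # Stored on both ends so traversal can ignore direction.
        self.edges.setdefault(start, []).append((rel_type, end, props))
        self.edges.setdefault(end, []).append((rel_type, start, props))

    def connections_at_depth(self, start, depth):
        """Nodes whose shortest distance from start is exactly depth hops."""
        distance = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if distance[node] == depth:
                continue  # no need to expand beyond the requested depth
            for _rel_type, neighbour, _props in self.edges[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        return {n for n, d in distance.items() if d == depth}

    def shortest_path(self, start, goal):
        """Breadth-first search; returns a list of node ids or None."""
        parents = {start: None}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if node == goal:
                path = []
                while node is not None:
                    path.append(node)
                    node = parents[node]
                return path[::-1]
            for _rel_type, neighbour, _props in self.edges[node]:
                if neighbour not in parents:
                    parents[neighbour] = node
                    queue.append(neighbour)
        return None


g = PropertyGraph()
g.add_node("alice", kind="Person", title="Analyst")
g.add_node("bob", kind="Person")  # sparse: no title recorded
g.add_node("carol", kind="Person")
g.add_node("acme", kind="Company")
g.add_node("globex", kind="Company")
g.add_relationship("alice", "WORKS_AT", "acme", since=2012)
g.add_relationship("bob", "WORKS_AT", "acme")
g.add_relationship("bob", "BOARD_MEMBER", "globex")
g.add_relationship("carol", "WORKS_AT", "globex")

second_level = g.connections_at_depth("alice", 2)
path = g.shortest_path("alice", "carol")
print(second_level, path)
```

Because properties are plain dictionaries, a node or relationship can omit any attribute without a schema change, which is the sparse and evolving data point made on slide 21. A graph database performs these traversals natively rather than through application code like this.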