Organize & manage master
meta data centrally, built
upon kong, cassandra, neo4j
& elasticsearch.
Hello!
I am Akhil Agrawal
Managing master & meta data is
a very common problem and, as
far as I know, there is no good
open source alternative, so I am
initiating this project – MasterMetaData
Started BIZense in 2008 &
Digikrit in 2015
1.
Problem
Let’s start with the problem we are
addressing – why mastermetadata?
Less Frequently Changing
◎ Master data and meta data share one common
behavior: both change infrequently, although their
purposes differ.
◎ Less frequently changing data, whether it is data
about real world entities (master data) or data
about other data (meta data), can be stored,
accessed and managed in very similar ways.
Why MasterMetaData ?
No Open Source Option
◎ There are MDM solutions (mostly from ERP
vendors like SAP, Oracle etc. & analytics
companies like Informatica, SAS) but the
master meta data intersection is being
explored only recently.
◎ There are no open source alternatives for smaller
companies, or anything that can be
embedded in SaaS products.
Why MasterMetaData ?
2.
Definitions
Let’s start with some definitions
around data categories
Definition of Data Categories
Meta Data
meta information
about other forms of
data (can describe
master, transaction
or lower level meta
data)
Master Data
real world entities
like customer,
partner etc. (only the
stable attributes are
considered part of
master data)
Transaction Data
real world
interactions which
have very short
lifespan and
occurrence is linked
with time/space
(unstable/changing
attribute values,
although
definition/description
is stable but each new
data point is unique)
Master Meta Data
combination of master and meta data
defined at application, enterprise or global
level (although the volume and variety
of master & meta data are very different, they
have a lot of common access patterns)
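The categories above can be sketched as plain data structures. This is only an illustration of the definitions, not the project's actual model: the class names, field names and the `conforms` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MetaData:
    """Describes another kind of data (master, transaction or other meta data)."""
    describes: str      # hypothetical: name of the entity type being described
    attributes: list    # hypothetical: attribute names that entity type may carry


@dataclass
class MasterData:
    """A real world entity, reduced to its stable attributes."""
    entity_type: str
    key: str
    stable_attributes: dict = field(default_factory=dict)


@dataclass
class TransactionData:
    """A short-lived interaction tied to a point in time; every record is unique."""
    entity_key: str
    occurred_at: datetime
    values: dict


def conforms(master: MasterData, meta: MetaData) -> bool:
    """True if the master record only uses attributes declared by the meta data."""
    return (master.entity_type == meta.describes
            and set(master.stable_attributes) <= set(meta.attributes))


customer_meta = MetaData(describes="customer", attributes=["name", "country"])
customer = MasterData("customer", "c-1", {"name": "Acme", "country": "IN"})
order = TransactionData("c-1", datetime.now(timezone.utc), {"total": 120})

print(conforms(customer, customer_meta))
```

Here the meta data record plays the role of "master meta data": it is the description against which the master record is checked, while the transaction only refers to the master record by key.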
3.
Implementation
Let’s discuss the implementation –
technologies & concepts involved
Background
◎ Faced difficulty with managing master
and meta data in previous projects
◎ Implemented custom solution while
building mobile ad platform
◎ Currently implementing same features
required for the communication platform
◎ Have worked with elasticsearch + kibana,
while kong + cassandra seem useful
Built With the Following Technologies
neo4j
highly scalable native graph
database that leverages data
relationships as first-class entities,
handles evolving data challenges
elasticsearch
search and analyze data in real
time, de facto standard for making
data accessible through search
and aggregations
cassandra
right choice when you need linear
scalability and high availability
without compromising
performance & durability
kong
the open-source management
layer for APIs and microservices,
delivering security, high
performance and reliability
lua
lua is a powerful, fast, lightweight,
embeddable scripting language.
Used for writing kong plugins that provide
access to various master meta data
kibana
explore and visualize data in
elasticsearch, opensource project
from elasticsearch team, intuitive
interface, visualization & dashboards
Opensource,
Scalable,
Searchable,
Ready to Use
Project mastermetadata
needs to be ready to use
for at least a few of the use
cases like location,
device, movie, tour etc.
Challenges
◎ Complex & hierarchical
data sets
◎ Real-time query
performance
◎ Dynamic structure
◎ Evolving relationships
Why neo4j for mastermetadata ?
Why neo4j ?
◎ Native graph store
◎ Flexible schema
◎ Performance and
scalability
◎ High availability
Referenced from
http://neo4j.com/use-cases/master-data-management
Why elasticsearch for mastermetadata ?
Scale
◎ Real-Time Data
◎ Massively
Distributed
◎ High Availability
◎ Multitenancy
◎ Per-Operation
Persistence
Search
◎ Full-Text Search
◎ Document-
Oriented
◎ Schema-Free
◎ Developer-
Friendly, RESTful
API
◎ Built on top of
Apache Lucene™
Analytics
◎ Real-Time Advanced
Analytics
◎ Very flexible Query
DSL
◎ Flexible analytics &
visualization
platform - Kibana
◎ Real-time summary
and charting of
streaming data
Referenced from https://www.elastic.co/products/elasticsearch
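As a rough illustration of the Query DSL and aggregations mentioned above, the sketch below builds a search request body that combines a full-text `match` query with a `terms` aggregation. The field names (`name`, `country`) and the helper function are assumptions made for this example; they are not taken from the project.

```python
import json


def build_search_body(text: str, group_by: str, size: int = 10) -> dict:
    """Build an Elasticsearch search body: full-text match plus a terms aggregation.

    `name` is a hypothetical text field; `group_by` is assumed to be a
    field that supports terms aggregations (for example a keyword field).
    """
    return {
        "size": size,
        "query": {"match": {"name": text}},
        "aggs": {"by_" + group_by: {"terms": {"field": group_by}}},
    }


body = build_search_body("san", "country")
print(json.dumps(body, indent=2))
```

A body like this would be sent to the `_search` endpoint of whichever index holds the master data; the same request returns both the matching documents and the per-country counts.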
Why kong for mastermetadata ?
Secure, Manage &
Extend your APIs and
Microservices
RESTful Interface
Plugin Oriented
Platform Agnostic
Referenced from
https://getkong.org/
[Diagram: Without Kong vs. With Kong]
4.
Interesting
What are interesting things happening
around this ?
Master & Metadata Management Intersection
Maximized Metadata
Model
◎ data model describing the metadata
needs to be “maximized” to cover as
many use cases as possible
◎ meta data model needs to be inclusive
of all metadata in the organization as
well as cover the master data
◎ governance of metadata model
requires the ability to describe
maximum metadata in the system to
provide ability to govern data
describing other data
Minimalistic Master
Data Model
◎ master data model describing master
data needs to be “minimalist”
◎ master data model is neither inclusive
of all data in the organization, nor
specific to applications using it for
specific purpose
◎ central governance of master data
requires that data model backing it is
minimalistic to be able to govern
without application specific details
◎ master data model is basically
metadata describing the master data
Referenced from http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
From Big Data To Smart Data
Zero Latency Organization
Types Of Latency
data
◎ latency linked to the data
(capturing)
◎ latency linked to analytical
processes (processing)
structural
◎ latency linked to decision
making processes
◎ time needed to implement
actions linked with decisions
action
◎ data latency added to
structural latency
◎ time needed from capturing of
data till the action takes place
Smart Data
value
data is considered smart based on
the value it brings in decision
making and action taking (rather than
anything else like size, source, etc.)
master
data which represents real world
entities and also remains stable
over time is smart data, as it
helps with common data reference
meta
data which describes other data,
whether master, transactional or
lower level meta data, is also smart
data, as it helps in understanding
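The latency types on this slide add up in a simple way: action latency is data latency plus structural latency. A minimal sketch of that arithmetic follows; the class, the field names and the sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Latency:
    """All values in seconds; the sample figures below are invented."""
    capture: float          # time to capture the data
    processing: float       # time spent in analytical processes
    decision: float         # time spent in decision making
    implementation: float   # time to implement the decided action

    @property
    def data_latency(self) -> float:
        return self.capture + self.processing

    @property
    def structural_latency(self) -> float:
        return self.decision + self.implementation

    @property
    def action_latency(self) -> float:
        # Time from capturing the data until the action takes place.
        return self.data_latency + self.structural_latency


sample = Latency(capture=2.0, processing=8.0, decision=30.0, implementation=20.0)
print(sample.data_latency, sample.structural_latency, sample.action_latency)
```

In this invented example most of the action latency is structural, which is the part a "zero latency organization" cannot fix with faster data pipelines alone.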
5.
Get Involved
Let’s discuss ways to get involved in
this project
Areas where you can get involved ?
DEMO
Functional Tests,
Integration Tests,
Run Demo
CODE
Implement Ideas,
Fix Bugs,
Enhance Features
DOCUMENT
User
Documentation,
Developer
Documentation
Current Focus
Devices
Storage: Device,
Browser, OS
Access: User
Agent
Locations
Storage: Country,
State, City
Access: IP Address
Tours
Storage: People,
Interest, Culture,
Destination, City,
Activity, Duration
Access: What, Where,
For
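For the Locations use case, "storage by country, state and city, access by IP address" could look roughly like the sketch below. It uses only the Python standard library; the network ranges are reserved documentation ranges and the locations attached to them are invented, so this shows the access pattern rather than the project's implementation.

```python
import ipaddress
from typing import Optional

# Hypothetical master data: network range -> stable location attributes.
LOCATIONS = {
    "192.0.2.0/24": {"country": "IN", "state": "Karnataka", "city": "Bengaluru"},
    "198.51.100.0/24": {"country": "US", "state": "California", "city": "San Francisco"},
}

NETWORKS = [(ipaddress.ip_network(cidr), loc) for cidr, loc in LOCATIONS.items()]


def locate(ip: str) -> Optional[dict]:
    """Return the location of the most specific network containing `ip`, if any."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, loc) for net, loc in NETWORKS if addr in net]
    if not matches:
        return None
    # Prefer the longest prefix when several ranges overlap.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


print(locate("192.0.2.17"))
print(locate("203.0.113.5"))
```

The Devices use case follows the same shape, with the User Agent string as the access key instead of the IP address.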
Storage & Access
Master Data Storage
Storage which is highly efficient
for read but at the same time
efficient for writes. Additional
requirement to be able to search
the stored data as well as flexible
efficient query interface to
enable faster access
Meta Data Storage
Storage which is highly flexible
in defining relationships like
inheritance, composition or
other relationships. Graph
modeled relationships are most
flexible to change as and when
the model evolves
Diagram featured by poweredtemplate.com
Meta Data Access
CRUD, Fill in the blanks,
Semantic Query, Search
Master Data Access
CRUD, Query (Structured /
Unstructured) & Search
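The idea of graph-modeled meta data with "fill in the blanks" access can be sketched with a tiny in-memory triple store. Treating "fill in the blanks" as pattern matching with wildcards is an interpretation made for this example, and the class, relation names and sample model are hypothetical; a real deployment would use a graph database such as neo4j.

```python
class MetaGraph:
    """Meta data model stored as (subject, relation, object) triples."""

    def __init__(self):
        self.edges = set()

    def add(self, subject, relation, obj):
        self.edges.add((subject, relation, obj))

    def match(self, subject=None, relation=None, obj=None):
        """Fill in the blanks: None acts as a wildcard for that position."""
        return sorted(
            e for e in self.edges
            if (subject is None or e[0] == subject)
            and (relation is None or e[1] == relation)
            and (obj is None or e[2] == obj)
        )

    def ancestors(self, node):
        """Follow 'inherits' edges transitively from `node`."""
        seen, frontier = [], [node]
        while frontier:
            current = frontier.pop()
            for _, _, parent in self.match(subject=current, relation="inherits"):
                if parent not in seen:
                    seen.append(parent)
                    frontier.append(parent)
        return seen


g = MetaGraph()
g.add("CityTour", "inherits", "Tour")        # inheritance
g.add("Tour", "has", "Destination")          # composition
g.add("Tour", "has", "Activity")
g.add("Destination", "has", "City")

print(g.match(subject="Tour", relation="has"))   # what does a Tour consist of?
print(g.match(relation="has", obj="City"))       # what contains a City?
print(g.ancestors("CityTour"))
```

Adding a new relationship type is just another triple, which is the flexibility the slide attributes to graph-modeled relationships.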
References
◎ https://getkong.org/
◎ http://neo4j.com/
◎ http://cassandra.apache.org/
◎ https://www.elastic.co/
◎ http://booksite.elsevier.com/9780123743695/10steps_DataCategories.pdf
◎ http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
◎ http://neo4j.com/use-cases/master-data-management/
Thanks!
Any questions?
You can find me at:
@digikrit
akhil@digikrit.com
Special thanks to all the people who made and released these awesome
resources for free:
◎ Presentation template by SlidesCarnival
◎ Presentation models by SlideModel & PoweredTemplate
◎ To companies behind kong, cassandra, neo4j & elasticsearch

Master Meta Data

  • 1. Organize & manage master meta data centrally, built upon kong, cassandra, neo4j & elasticsearch.
  • 2. Hello! I am Akhil Agrawal. Managing master & meta data is a very common problem with no good open source alternative as far as I know, so I am initiating this project: MasterMetaData. Started BIZense in 2008 & Digikrit in 2015.
  • 3. 1. Problem: Let’s start with the problem we are addressing. Why MasterMetaData?
  • 4. Why MasterMetaData? Less Frequently Changing: ◎ Master data and meta data share one behavior, infrequent change, although their purposes differ. ◎ Infrequently changing data, whether it describes real-world entities (master data) or other data (meta data), can be stored, accessed and managed in very similar ways.
  • 5. Why MasterMetaData? No Open Source Option: ◎ There are MDM solutions (mostly from ERP vendors like SAP and Oracle, and analytics companies like Informatica and SAS), but the intersection of master data and meta data has been explored only recently. ◎ There are no open source alternatives for smaller companies, and nothing that can be embedded in SaaS products.
  • 6. 2. Definitions: Let’s start with some definitions of data categories.
  • 7. Definition of Data Categories. Meta Data: information about other forms of data (can describe master, transaction or lower-level meta data). Master Data: real-world entities like customer, partner etc. (only the stable attributes are considered part of master data). Transaction Data: real-world interactions that have a very short lifespan and whose occurrence is linked to time/space (attribute values are unstable; the definition/description is stable, but each new data point is unique). Master Meta Data: combination of master and meta data defined at application, enterprise or global level (although the volume and variety of master & meta data are very different, they share many access patterns).
  • 9. 3. Implementation: Let’s discuss the implementation, covering the technologies & concepts involved.
  • 10. Background: ◎ Faced difficulty managing master and meta data in previous projects ◎ Implemented a custom solution while building a mobile ad platform ◎ Currently implementing the same features for a communication platform ◎ Have worked with elasticsearch + kibana, while kong + cassandra seem useful
  • 11. Built With the Following Technologies. neo4j: highly scalable native graph database that leverages data relationships as first-class entities and handles evolving data challenges. elasticsearch: search and analyze data in real time; de facto standard for making data accessible through search and aggregations. cassandra: the right choice when you need linear scalability and high availability without compromising performance & durability. kong: the open source management layer for APIs and microservices, delivering security, high performance and reliability. lua: a powerful, fast, lightweight, embeddable scripting language, used for writing kong plugins that provide access to the various master and meta data. kibana: explore and visualize data in elasticsearch; open source project from the elasticsearch team with an intuitive interface, visualizations & dashboards.
  • 12. Open Source, Scalable, Searchable, Ready to Use: the mastermetadata project needs to be ready to use for at least a few use cases such as location, device, movie, tour etc.
  • 13. Why neo4j for mastermetadata? Challenges: ◎ Complex & hierarchical data sets ◎ Real-time query performance ◎ Dynamic structure ◎ Evolving relationships. Why neo4j: ◎ Native graph store ◎ Flexible schema ◎ Performance and scalability ◎ High availability. Referenced from http://neo4j.com/use-cases/master-data-management
  • 14. Why elasticsearch for mastermetadata? Scale: ◎ Real-Time Data ◎ Massively Distributed ◎ High Availability ◎ Multitenancy ◎ Per-Operation Persistence. Search: ◎ Full-Text Search ◎ Document-Oriented ◎ Schema-Free ◎ Developer-Friendly, RESTful API ◎ Built on top of Apache Lucene™. Analytics: ◎ Real-Time Advanced Analytics ◎ Very flexible Query DSL ◎ Flexible analytics & visualization platform (Kibana) ◎ Real-time summary and charting of streaming data. Referenced from https://www.elastic.co/products/elasticsearch
  • 15. Why kong for mastermetadata? Secure, Manage & Extend your APIs and Microservices: ◎ RESTful Interface ◎ Plugin Oriented ◎ Platform Agnostic. (Diagram: Without Kong vs. With Kong.) Referenced from https://getkong.org/
  • 16. 4. Interesting: What interesting things are happening around this?
  • 17. Master & Metadata Management Intersection. Maximized Metadata Model: ◎ the data model describing the metadata needs to be “maximized” to cover as many use cases as possible ◎ the metadata model needs to include all metadata in the organization as well as cover the master data ◎ governing the metadata model requires the ability to describe as much of the system’s metadata as possible, so that data describing other data can itself be governed. Minimalistic Master Data Model: ◎ the data model describing master data needs to be “minimalist” ◎ the master data model is neither inclusive of all data in the organization nor specific to the applications using it for a particular purpose ◎ central governance of master data requires a minimalistic backing data model, so that it can be governed without application-specific details ◎ the master data model is essentially metadata describing the master data. Referenced from http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/
  • 18. From Big Data To Smart Data. Zero Latency Organization, Types Of Latency: data latency: ◎ latency linked to the data (capturing) ◎ latency linked to analytical processes (processing); structural latency: ◎ latency linked to decision-making processes ◎ time needed to implement actions linked with decisions; action latency: ◎ data latency added to structural latency ◎ time needed from the capture of data until the action takes place. Smart Data: value: data is considered smart based on the value it brings to decision making and action taking (rather than anything else like size, source, etc.); master data: represents real-world entities and remains stable over time, so it is smart data because it provides a common data reference; meta data: describes other data, whether master, transactional or lower-level meta data, and is also smart data because it helps in understanding.
  • 20. 5. Get Involved: Let’s discuss ways to get involved in this project.
  • 21. Areas where you can get involved. DEMO: Functional Tests, Integration Tests, Run Demo. CODE: Implement Ideas, Fix Bugs, Enhance Features. DOCUMENT: User Documentation, Developer Documentation.
  • 22. Current Focus. Devices (Storage: Device, Browser, OS; Access: User Agent). Locations (Storage: Country, State, City; Access: IP Address). Tours (Storage: People, Interest, Culture, Destination, City, Activity, Duration; Access: What, Where, For).
  • 23. Storage & Access. Master Data Storage: storage that is highly efficient for reads while remaining efficient for writes, with the additional requirement that the stored data be searchable and exposed through a flexible, efficient query interface for faster access. Master Data Access: CRUD, Query (Structured / Unstructured) & Search. Meta Data Storage: storage that is highly flexible in defining relationships such as inheritance, composition or others; graph-modeled relationships are the most flexible to change as the model evolves. Meta Data Access: CRUD, Fill in the blanks, Semantic Query, Search. Diagram featured by poweredtemplate.com
  • 24. References: ◎ https://getkong.org/ ◎ http://neo4j.com/ ◎ http://cassandra.apache.org/ ◎ https://www.elastic.co/ ◎ http://booksite.elsevier.com/9780123743695/10steps_DataCategories.pdf ◎ http://blogs.gartner.com/andrew_white/2011/04/26/more-on-metadata-and-master-data-management-intersection/ ◎ http://neo4j.com/use-cases/master-data-management/
  • 25. Thanks! Any questions? You can find me at: @digikrit, akhil@digikrit.com. Special thanks to all the people who made and released these awesome resources for free: ◎ Presentation template by SlidesCarnival ◎ Presentation models by SlideModel & PoweredTemplate ◎ The companies behind kong, cassandra, neo4j & elasticsearch
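
As a rough illustration of the Locations use case (slide 22) and the storage and access patterns (slide 23), the following Python sketch models Country, State and City records with parent links and offers keyed reads, a naive search and a hierarchy walk. It is a minimal in-memory stand-in under stated assumptions, not the project's implementation: the names `Entity`, `LocationStore`, `add`, `get`, `search` and `path` are hypothetical, the sample records are only illustrative, and a real deployment would presumably hand these operations to the graph, storage and search backends named in the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A master data record: a real-world entity with stable attributes."""
    kind: str                      # e.g. "Country", "State", "City"
    name: str
    parent: "Entity | None" = None
    children: list = field(default_factory=list)


class LocationStore:
    """Hypothetical in-memory stand-in for the storage and access layer."""

    def __init__(self):
        self._by_key = {}          # (kind, lowercased name) -> Entity

    def add(self, kind, name, parent=None):
        """Create a record and link it under its parent, if any."""
        entity = Entity(kind, name, parent)
        if parent is not None:
            parent.children.append(entity)
        self._by_key[(kind, name.lower())] = entity
        return entity

    def get(self, kind, name):
        """Keyed read (the R in CRUD); returns None when absent."""
        return self._by_key.get((kind, name.lower()))

    def search(self, text):
        """Naive case-insensitive substring search over entity names."""
        needle = text.lower()
        return sorted(
            e.name for e in self._by_key.values() if needle in e.name.lower()
        )

    def path(self, entity):
        """Walk parent links to recover the hierarchy, root first."""
        names = []
        while entity is not None:
            names.append(entity.name)
            entity = entity.parent
        return list(reversed(names))


# Illustrative sample data (assumption, not taken from the slides).
store = LocationStore()
india = store.add("Country", "India")
karnataka = store.add("State", "Karnataka", parent=india)
bangalore = store.add("City", "Bangalore", parent=karnataka)
```

Calling `store.path(bangalore)` walks City to State to Country, which is the kind of relationship traversal the slides assign to a graph model, while `store.get` and `store.search` correspond to the CRUD and search access paths.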