Karen Lopez's presentation for data modelers and data architects on why data modeling is still relevant for big data and NoSQL projects.
Plus, 10 tips for data modelers working on NoSQL projects.
Watch the companion webinar for this presentation at http://embt.co/KLopez826. In this webinar, Karen Lopez of InfoAdvisors will cover 10 tips for the modern data architect and resources for coming up to speed on these new approaches. She will share how modern data modeling approaches address both SQL (relational) and NoSQL technologies. We'll look at the role of a data modeler, and how models, processes, and data governance can add value to enterprise big data and NoSQL development projects.
10 Physical Data Modeling Blunders – Karen Lopez
Karen Lopez's presentation about 10 Physical Data Modeling/Database Design blunders, based on her work in helping organizations get the most value out of their models and data.
Notice an error? Let me know. I welcome this sort of feedback.
During this Big Data Warehousing Meetup, Caserta Concepts and Databricks addressed the number one operational and analytic goal of nearly every organization today – to have a complete view of every customer. Customer Data Integration (CDI) must be implemented to cleanse and match customer identities within and across various data systems. CDI has been a long-standing data engineering challenge, not just one of logic and complexity but also of performance and scalability.
The speakers brought together best practice techniques with Apache Spark to achieve complete CDI.
Speakers:
Joe Caserta, President, Caserta Concepts
Kevin Rasmussen, Big Data Engineer, Caserta Concepts
Vida Ha, Lead Solutions Engineer, Databricks
The sessions covered a series of problems that are adequately solved with Apache Spark, as well as those that require additional technologies to implement correctly. Topics included:
· Building an end-to-end CDI pipeline in Apache Spark
· What works, what doesn’t, and how our use of Spark evolves
· Innovation with Spark including methods for customer matching from statistical patterns, geolocation, and behavior
· Using PySpark and Python’s rich module ecosystem for data cleansing, standardization, and matching (a minimal sketch follows this list)
· Using GraphX for matching and scalable clustering
· Analyzing large data files with Spark
· Using Spark for ETL on large datasets
· Applying Machine Learning & Data Science to large datasets
· Connecting BI/Visualization tools to Apache Spark to analyze large datasets internally
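For concreteness, here is a minimal PySpark sketch of the kind of cleansing-and-matching step listed above. It is an illustration only, not the presenters' pipeline: the column names, sample records, and the email-based blocking rule are all assumptions.

```python
# Illustrative CDI sketch (not the presenters' code): standardize customer
# records, then block candidate duplicates on a normalized email key.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("cdi-sketch").getOrCreate()

customers = spark.createDataFrame(
    [(1, " Jane  Doe ", "jane.doe@EXAMPLE.com"),
     (2, "Jane Doe", "jane.doe@example.com"),
     (3, "John Smith", "jsmith@example.org")],
    ["id", "name", "email"])

# Cleansing: trim names, collapse repeated whitespace, lowercase emails.
clean = customers.select(
    "id",
    F.regexp_replace(F.trim("name"), r"\s+", " ").alias("name"),
    F.lower(F.trim("email")).alias("email"))

# Matching (naive blocking): records sharing an email are candidate duplicates.
candidates = (clean.groupBy("email")
    .agg(F.collect_list("id").alias("candidate_ids"))
    .filter(F.size("candidate_ids") > 1))
candidates.show(truncate=False)
```

A real pipeline would follow this blocking step with fuzzy comparison of names and addresses and a survivorship rule for merging matched records.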
The speakers also touched on data governance, on-boarding new data rapidly, and how to balance agility and time to market with critical decision support and customer interaction. They also shared examples of problems that Apache Spark is not optimized for.
For more information on the services offered by Caserta Concepts, visit our website: http://casertaconcepts.com/
In this Strata+Hadoop World 2015 presentation, Ron Bodkin, President of Think Big, a Teradata company, explains changes for data modeling on big data systems and five important new analytic patterns becoming more commonplace as companies grow their data-driven capabilities.
Caserta Concepts, Datameer and Microsoft shared their combined knowledge and a use case on big data, the cloud and deep analytics. Attendees learned how a global leader in the test, measurement, and control systems market reduced their big data implementations from 18 months to just a few months.
Speakers shared how to provide a business user-friendly, self-service environment for data discovery and analytics, and focused on how to extend and optimize Hadoop-based analytics, highlighting the advantages and practical applications of deploying on the cloud for enhanced performance, scalability, and lower TCO.
Agenda included:
- Pizza and Networking
- Joe Caserta, President, Caserta Concepts - Why are we here?
- Nikhil Kumar, Sr. Solutions Engineer, Datameer - Solution use cases and technical demonstration
- Stefan Groschupf, CEO & Chairman, Datameer - The evolving Hadoop-based analytics trends and the role of cloud computing
- James Serra, Data Platform Solution Architect, Microsoft - Benefits of the Azure Cloud Service
- Q&A, Networking
For more information on Caserta Concepts, visit our website: http://casertaconcepts.com/
When it comes to creating an enterprise AI strategy: if your company isn’t good at analytics, it’s not ready for AI. Succeeding in AI requires being good at data engineering AND analytics. Unfortunately, management teams often assume they can leapfrog best practices for basic data analytics by directly adopting advanced technologies such as ML/AI – setting themselves up for failure from the get-go. This presentation explains how to get basic data engineering and the right technology in place to create and maintain data pipelines so that you can solve problems with AI successfully.
Meaning making – separating signal from noise. How do we transform the customer's next input into an action that creates a positive customer experience? We make the data more intelligent, so that it is able to guide our actions. The Data Lake builds on Big Data strengths by automating many of the manual development tasks, providing several self-service features to end-users, and an intelligent management layer to organize it all. This results in lower cost to create solutions, "smart" analytics, and faster time to business value.
Joe Caserta was a featured speaker, along with MIT Sloan School faculty and other industry thought-leaders. His session 'You're the New CDO, Now What?' discussed how new CDOs can accomplish their strategic objectives and overcome tactical challenges in this emerging executive leadership role.
In its tenth year, the MIT CDOIQ Symposium 2016 continues to explore the developing role of the Chief Data Officer.
For more information, visit http://casertaconcepts.com/
To succeed in the world’s rapidly evolving ecosystem, companies (no matter what their industry or size) must use data to continuously develop more innovative operations, processes, and products. This means embracing the shift to Enterprise AI, using the power of machine learning to enhance - not replace - humans.
Dataiku is the centralized data platform that moves businesses along their data journey from analytics at scale to Enterprise AI, powering self-service analytics while also ensuring the operationalization of machine learning models in production.
The Data Lake and Getting Businesses the Big Data Insights They Need – Dunn Solutions Group
Do terms like "Data Lake" confuse you? You’re not alone. With all of the technology buzzwords flying around today, it can become a task to keep up with and clearly understand each of them. However, a data lake is definitely something worth taking the time to understand. Leveraging data lake technology, companies are finally able to keep all of their disparate information and streams of data in one secure location, ready for consumption at any time – this includes structured, unstructured, and semi-structured data. For more information on our Big Data Consulting Services, don’t hesitate to visit us online at: http://bit.ly/2fvV5rR
Moving Past Infrastructure Limitations Presented by MediaMath
This presentation was given at a Big Data Warehousing Meetup with Caserta Concepts, MediaMath and Qubole. You can learn more about the event here: http://www.meetup.com/Big-Data-Warehousing/events/228372516/
Event description:
At Caserta Concepts, we are firm believers in big data thriving on the cloud. The instant-on, nearly unlimited storage and computing capabilities of AWS have made it the de facto solution for a full spectrum of organizations needing to process large amounts of data.
What's more, an ecosystem of value-added platforms has emerged to further ease and democratize the implementation of cloud based solutions. Qubole has developed a great platform for easily deploying and managing ephemeral and long-lived Hadoop and Spark clusters on AWS.
Moving Past Infrastructure Limitations: Data Warehousing at MediaMath
Over the past year and a half, MediaMath has undertaken a “data liberation” effort in an attempt to leave their big-box, monolithic data warehouse behind. In this talk, Rory Sawyer, Software Engineer at MediaMath, will describe how this effort transformed MediaMath’s legacy architecture and legacy mindset, which imposed harsh inefficiencies on data sharing and utilization. The current mindset removes these inefficiencies and allows them to say “yes” to more projects and ideas.
Rory will also demo how MediaMath uses Amazon Web Services and Qubole so that infrastructure is no longer a limiting factor on what and how users query. This combination allows them to scale their resources up and down as needed while bridging different data sources and execution engines. Using and extending MediaMath’s data warehousing is no longer a privileged activity but an ability that every employee and client has.
NoSQL Simplified: Schema vs. Schema-less – InfiniteGraph
A look at the many facets of schema-less approaches vs a rich schema approach, ranging from performance and query support to heterogeneity and code/data migration issues. Presented by Leon Guzenda, Founder, Objectivity
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An... – Benjamin Nussbaum
We live in an era where the world is more connected than ever before and the trajectory is such that data relationships will only continue to increase with no signs of slowing down.
Connected data is the key to your business succeeding and growing in today’s connected world.
Leading enterprises will be the ones that utilize relationship-centric technologies to leverage connections from their internal operations and supply chain to their customer and user interactions. This ability to utilize connected data to understand all the nuanced relationships within their organization will propel them forward as they act on more holistic insights.
Every organization needs a knowledge graph because connected data is an essential foundation to advancing business. Knowledge graphs provide:
- Increased visibility between internal groups
- Efficiency gains
- Cross-functional data collaboration
- More complete and reliable business insights
- Better customer engagement
The live presentation and discussion can be found here: https://youtu.be/7vBdlXzhs_4
Additional reading on why connected data is beneficial: https://www.graphgrid.com/why-connected-data-is-more-useful/
Connected data solutions are available from Benjamin and his team via GraphGrid and AtomRain: https://www.graphgrid.com and https://www.atomrain.com
Joe Caserta, President at Caserta Concepts, presented "Setting Up the Data Lake" at a DAMA Philadelphia Chapter Meeting.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Building New Data Ecosystem for Customer Analytics, Strata + Hadoop World, 2016 – Caserta
Caserta Concepts Founder and President, Joe Caserta, gave this presentation at Strata + Hadoop World 2016 in New York, NY. His session covers path-to-purchase analytics using a data lake and Spark.
For more information, visit http://casertaconcepts.com/
The 20th annual Enterprise Data World (EDW) Conference took place in San Diego, April 17-21. It is recognized as the most comprehensive educational conference on data management in the world.
Joe Caserta was a featured presenter. His session “Evolving from the Data Warehouse to Big Data Analytics - the Emerging Role of the Data Lake” highlighted the challenges and steps needed to become a data-driven organization.
Joe also participated in two panel discussions during the show:
• "Data Lake or Data Warehouse?"
• "Big Data Investments Have Been Made, But What's Next
For more information on Caserta Concepts, visit our website at http://casertaconcepts.com/.
Joe Caserta's 2016 Data Summit Workshop "Introduction to Data Science with Hadoop" on May 9 expanded on his Intro to Data Science Workshop held at last year's Summit. Again, Joe presented to a standing-room-only audience with a focus on the data lake, governance, and the role of the data scientist.
For more information on Caserta Concepts, visit our website: http://casertaconcepts.com/
Big Data Expo 2015 - Barnsten: Why Data Modelling is Essential – BigDataExpo
Learn tips and tricks for handling Data Modeling in your Big Data environment. Mark will show how modeling will add value to the business and how to make your Big Data landscape transparent across the organization.
You will see the latest modeling techniques for Big Data and different types of modeling notations. You will also learn how to integrate Data Modeling into your BI environment.
Graph Databases - Where Do We Do the Modeling Part? – DATAVERSITY
Graph processing and graph databases have been with us for a while. However, since their physical implementations are the same for every database in production (node connected to node, or triples), there's a perception that data modeling (and data modelers) have no role on projects where graph databases are used (a small sketch of this physical shape follows below).
This month we'll talk about where graph databases are a best fit in a modern data architecture and where data models add value.
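To make the "node connected to node, or triples" point concrete, here is a small Python sketch (our illustration, not from the webinar) of the same fact stored in the two common physical shapes:

```python
# Illustrative only: one relationship in two common graph shapes.

# 1. Triple form (subject, predicate, object), as in RDF-style stores.
triples = [
    ("alice", "WORKS_FOR", "acme"),
    ("acme", "LOCATED_IN", "nyc"),
]

# 2. Property-graph form: nodes carry properties; edges carry a type.
nodes = {
    "alice": {"label": "Person", "name": "Alice"},
    "acme": {"label": "Company", "name": "Acme Corp"},
}
edges = [("alice", "WORKS_FOR", "acme")]

# Either shape answers "who works for acme?"; a graph database indexes
# these adjacencies for fast traversal instead of scanning.
print([s for (s, p, o) in triples if p == "WORKS_FOR" and o == "acme"])
```

The physical shape is fixed either way, which is exactly why the modeling work shifts to choosing meaningful node labels, relationship types, and properties.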
Architecting Data For The Modern Enterprise - Data Summit 2017, Closing Keynote – Caserta
The “Big Data era” has ushered in an avalanche of new technologies and approaches for delivering information and insights to business users. What is the role of the cloud in your analytical environment? How can you make your migration as seamless as possible? This closing keynote, delivered by Joe Caserta, a prominent consultant who has helped many global enterprises adopt Big Data, provided the audience with the inside scoop needed to supplement data warehousing environments with data intelligence—the amalgamation of Big Data and business intelligence.
This presentation was given as the closing keynote at DBTA's annual Data Summit in NYC.
Overview of the SlamData open source project for modern data analytics. SlamData allows users to run ordinary SQL queries on modern NoSQL Data like JSON. Currently we support MongoDB, but plan to support other NoSQL datastores including Cassandra, Hadoop and others. Our project opens up modern NoSQL data to anyone with basic SQL skills.
How to Optimize Sales Analytics Using 10x the Data at 1/10th the Cost – AtScale
Being able to analyze sales at the most granular level with up-to-date data provides a competitive advantage for unlocking additional revenue -- especially for e-commerce and retail companies heading into the holiday season.
Defining and Applying Data Governance in Today’s Business Environment – Caserta
Caserta Concepts President Joe Caserta was featured at the Data Governance Winter 2014 Conference with a session on the basic and necessary steps needed for data quality and data governance success.
For more information on the event and presentation: http://ow.ly/G3N9N
For more information on the services and solutions offered by Caserta Concepts, visit http://casertaconcepts.com/.
In the spirit of the book 7 Databases in 7 Weeks, Lara Rubbelke and Karen Lopez cover ~seven databases and datastores in the SQL and NoSQL world, when to use them, and how they are SQL-like.
From SQLBitsXV
Notice an error? Let me know. I welcome this sort of feedback.
Information technology has led us into an era where the production, sharing, and use of information are part of everyday life, often without our even being aware of it: it is now almost impossible not to leave a digital trail of many of the actions we perform every day, for example through digital content such as photos, videos, blog posts, and everything that revolves around social networks (Facebook and Twitter in particular). Added to this is the "Internet of Things": watches, bracelets, thermostats, and many other devices that connect to the network and therefore generate large data streams. This explosion of data gave rise to the term Big Data: data produced in large volumes, at remarkable speed, and in varied formats, which requires processing technologies and resources that go far beyond conventional data management and storage systems. It is immediately clear that 1) data storage models based on the relational model, and 2) processing systems based on stored procedures and grid computing, are not applicable in these contexts.

Regarding point 1, RDBMSs, widely used for a great variety of applications, run into problems when the amount of data grows beyond certain limits. Scalability and implementation cost are only part of the disadvantages: very often, when facing big data, the variability of the data – the lack of a fixed structure – is also a significant problem. This has given a boost to the development of NoSQL databases. The NoSQL Databases website defines them as "Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open source and horizontally scalable." These databases are distributed, open source, horizontally scalable, without a predetermined schema (key-value, column-oriented, document-based, and graph-based), easily replicable, free of ACID guarantees, and able to handle large amounts of data.

These databases are typically integrated with processing tools based on the MapReduce paradigm proposed by Google in 2004. MapReduce, together with the open source Hadoop framework, represents the new model for distributed processing of large amounts of data, supplanting techniques based on stored procedures and computational grids (point 2). The relational model taught in introductory database design courses has many limitations compared to the demands posed by new applications that use Big Data, NoSQL databases to store data, and MapReduce to process large amounts of data.
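Since the discussion above hinges on the MapReduce paradigm, a minimal word-count sketch in plain Python may help. It illustrates only the programming model (map, shuffle, reduce), not Hadoop's distributed implementation:

```python
# Minimal illustration of the MapReduce programming model (not Hadoop):
# map emits (key, value) pairs, the framework groups them by key, and
# reduce folds each group. Hadoop runs these same phases on a cluster.
from collections import defaultdict

def map_phase(document):
    for word in document.split():
        yield (word.lower(), 1)

def reduce_phase(word, counts):
    return (word, sum(counts))

documents = ["Big Data is big", "NoSQL stores big data"]

# Shuffle: group intermediate pairs by key (done by the framework in Hadoop).
groups = defaultdict(list)
for doc in documents:
    for word, count in map_phase(doc):
        groups[word].append(count)

result = [reduce_phase(word, counts) for word, counts in groups.items()]
print(sorted(result))  # [('big', 3), ('data', 2), ('is', 1), ...]
```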
Course Website http://pbdmng.datatoknowledge.it/
Contact me for more information and to download the slides.
The recent focus on Big Data in the data management community brings with it a paradigm shift—from the more traditional top-down, “design then build” approach to data warehousing and business intelligence, to the more bottom up, “discover and analyze” approach to analytics with Big Data. Where does data modeling fit in this new world of Big Data? Does it go away, or can it evolve to meet the emerging needs of these exciting new technologies? Join this webinar to discuss:
Big Data –A Technical & Cultural Paradigm Shift
Big Data in the Larger Information Management Landscape
Modeling & Technology Considerations
Organizational Considerations
The Role of the Data Architect in the World of Big Data
Automated Schema Design for NoSQL Databases – Michael Mior
Selecting appropriate indices and materialized views is critical for high performance in relational databases. By example, we show that the problem of schema optimization is also highly relevant for NoSQL databases. We explore the problem of schema design in NoSQL databases with a goal of optimizing query performance while minimizing storage overhead. Our suggested approach uses the cost of executing a given workload for a given schema to guide the mapping from the application data model to a physical schema. We propose a cost-driven approach for optimization and discuss its usefulness as part of an automated schema design tool.
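As a toy illustration of that cost-driven idea (our sketch, not the authors' tool): enumerate candidate physical schemas, score each against the workload plus a storage penalty, and keep the cheapest. All candidates and numbers below are invented.

```python
# Toy cost-driven schema selection (all numbers are invented).
# Each candidate schema has an estimated cost per query type and a
# storage overhead; we score candidates against the workload mix.
candidates = {
    "denormalized_by_user": {"query_costs": {"by_user": 1, "by_item": 50}, "storage": 3.0},
    "denormalized_by_item": {"query_costs": {"by_user": 50, "by_item": 1}, "storage": 3.0},
    "materialize_both":     {"query_costs": {"by_user": 1,  "by_item": 1}, "storage": 6.0},
}

workload = {"by_user": 0.8, "by_item": 0.2}  # relative query frequencies
storage_weight = 0.5                         # penalty for duplicated data

def score(schema):
    query_cost = sum(freq * schema["query_costs"][q] for q, freq in workload.items())
    return query_cost + storage_weight * schema["storage"]

best = min(candidates, key=lambda name: score(candidates[name]))
print(best, round(score(candidates[best]), 2))  # materialize_both 4.0
```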
A couple of major players in the internet space, in particular Amazon, LinkedIn and Google, opened the eyes of the corporate world to the coming onslaught of a NoSQL workload. As with every new market opportunity, some young guns quickly jumped in to capitalize on the need and confusion, but things are starting to settle and NoSQL is maturing as Enterprise-ready solutions break away with long-sought-after features. In this webcast, learn about NoSQL convergence from Oracle, the leader in data management, and hear why some flavors of NoSQL are here to stay.
Just a few years ago all software systems were designed to be monoliths running on a single big and powerful machine. But nowadays most companies prefer to scale out instead of scaling up, because it is much easier to buy or rent a large cluster of commodity hardware than to get a single machine that is powerful enough. In the database area, scaling out is realized by combining polyglot persistence with sharding of data. On the application level, scaling out is realized by microservices. In this talk I will briefly introduce the concepts and ideas of microservices and discuss their benefits and drawbacks. Afterwards I will focus on the point of intersection of a microservice-based application talking to one or many NoSQL databases. We will try to find answers to these questions: What are the differences from a monolithic application? How do we scale the whole system properly? What about polyglot persistence? Is there a data-centric way to split microservices?
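To ground the sharding idea mentioned above, here is a minimal hash-based shard router in Python (our own illustration, not from the talk):

```python
# Minimal hash-based sharding: route each record key to one of N shards.
# Note that adding a shard changes len(SHARDS) and so remaps most keys,
# which is the weakness consistent hashing addresses.
import hashlib

SHARDS = ["db-0", "db-1", "db-2"]

def shard_for(key: str) -> str:
    digest = int(hashlib.md5(key.encode()).hexdigest(), 16)
    return SHARDS[digest % len(SHARDS)]

for customer_id in ["alice", "bob", "carol"]:
    print(customer_id, "->", shard_for(customer_id))
```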
NoSE: Schema Design for NoSQL Applications – Michael Mior
Database design is critical for high performance in relational databases and many tools exist to aid application designers in selecting an appropriate schema. While the problem of schema optimization is also highly relevant for NoSQL databases, existing tools for relational databases are inadequate for this setting. Application designers wishing to use a NoSQL database instead rely on rules of thumb to select an appropriate schema. We present a system for recommending database schemas for NoSQL applications. Our cost-based approach uses a novel binary integer programming formulation to guide the mapping from the application's conceptual data model to a database schema.
We implemented a prototype of this approach for the Cassandra extensible record store. Our prototype, the NoSQL Schema Evaluator (NoSE), is able to capture rules of thumb used by expert designers without explicitly encoding the rules. Automating the design process allows NoSE to produce efficient schemas and to examine more alternatives than would be possible with a manual rule-based approach.
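For intuition only, here is a toy sketch of the flavor of binary integer program described above; it is not NoSE's actual formulation. The candidate structures, costs, and query coverage sets are invented, and it assumes the open source PuLP library is installed (pip install pulp):

```python
# Toy binary integer program for schema selection (invented numbers).
from pulp import LpProblem, LpMinimize, LpVariable, lpSum, LpBinary, LpStatus

structures = ["by_user", "by_item", "by_date"]     # candidate column families
cost = {"by_user": 4, "by_item": 3, "by_date": 5}  # workload cost if chosen
size = {"by_user": 2, "by_item": 2, "by_date": 1}  # storage units
queries = {"q1": ["by_user", "by_date"],           # structures answering q1
           "q2": ["by_item"]}                      # structures answering q2

prob = LpProblem("schema_selection", LpMinimize)
x = {s: LpVariable(f"use_{s}", cat=LpBinary) for s in structures}

# Objective: minimize workload cost plus storage cost of chosen structures.
prob += lpSum((cost[s] + size[s]) * x[s] for s in structures)

# Coverage: every query must be answerable by at least one chosen structure.
for q, options in queries.items():
    prob += lpSum(x[s] for s in options) >= 1

prob.solve()
print(LpStatus[prob.status], [s for s in structures if x[s].value() == 1])
```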
Oracle Database Administration Training - Part 3 – Faradars
Oracle Database is undoubtedly one of the most powerful software products for managing very large volumes of data. The goal of this training is to teach the complex concepts of database architecture and the challenges of database administration, helping you learn the material quickly and get closer to your goals.
Topics covered in this training include:
Oracle database architecture
Preparing the database environment
Creating an Oracle database
Managing Oracle memory structures
Configuring the Oracle network environment
...
For more details and to obtain this training, please visit the link below:
http://faradars.org/courses/fvorc9408
Operational Analytics Using Spark and NoSQL Data Stores – DATAVERSITY
NoSQL data stores have emerged for scalable capture and real-time analysis of data. Apache Spark and Hadoop provide additional scalable analytics processing. This session looks at these technologies and how they can be used to support operational analytics to improve operational effectiveness. It also looks at an example of how operational analytics can be implemented in NoSQL environments using the Basho Data Platform with Apache Spark (a minimal PySpark sketch follows the list below):
• The emergence of NoSQL, Hadoop and Apache Spark
• NoSQL Use Cases
• The need for operational analytics
• Types of operational analysis
• Key requirements for operational analytics
• Operational analytics using the Basho Data Platform with Apache Spark.
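For a flavor of such operational analytics, here is a minimal PySpark aggregation over event data. It is a generic sketch: it reads from an in-memory DataFrame rather than the Basho Data Platform, and the column names are invented.

```python
# Illustrative operational-analytics sketch in PySpark (generic source,
# invented columns): per-action volume and latency in 5-minute windows.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ops-analytics-sketch").getOrCreate()

events = spark.createDataFrame(
    [("2016-05-01 10:00:00", "checkout", 120),
     ("2016-05-01 10:03:00", "checkout", 95),
     ("2016-05-01 10:04:00", "search", 10)],
    ["ts", "action", "latency_ms"])

summary = (events
    .withColumn("ts", F.to_timestamp("ts"))
    .groupBy(F.window("ts", "5 minutes"), "action")
    .agg(F.count("*").alias("events"),
         F.avg("latency_ms").alias("avg_latency_ms")))
summary.show(truncate=False)
```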
A brief overview of currently popular & available key/value, column-oriented & document-oriented databases, along with implementation suggestions for the CakePHP web application framework.
Data Modeling for Integration of NoSQL with a Data Warehouse – Daniel Upton
Learn to model data to be visible and accessible between NoSQL Big Data repositories and your RDBMS Data Warehouse. Learn how specific RDBMS Data Warehouse data modeling approaches establish flexible integration with NoSQL data sets that do not play by E.F. Codd’s rules.
Persistence Smoothie: Blending SQL and NoSQL (RubyNation Edition) – Michael Bleigh
Persistence Smoothie is a talk given at RubyNation 2010 about when, how, and why to use combinations of persistence engines (including both SQL and NoSQL options) with a live example. The code is available at http://github.com/mbleigh/persistence-smoothie
Big Challenges in Data Modeling: NoSQL and Data Modeling – DATAVERSITY
Big Data and NoSQL have led to big changes in the data environment, but are they all in the best interest of data? Are they technologies that "free us from the harsh limitations of relational databases"?
In this month's webinar, we will be answering questions like these, plus:
Have we managed to free organizations from having to do Data Modeling?
Is there a need for a Data Modeler on NoSQL projects?
If we build Data Models, which types will work?
If we build Data Models, how will they be used?
If we build Data Models, when will they be used?
Who will use Data Models?
Where does Data Quality happen?
Finally, we will wrap with 10 tips for data modelers in organizations incorporating NoSQL in their modern Data Architectures.
In this lecture we analyze key-value databases. First, we introduce key-value characteristics, advantages, and disadvantages.
Then we analyze the major key-value data stores, and finally we discuss DynamoDB.
In particular, we consider how DynamoDB is implemented (a small consistent-hashing sketch follows this list):
1. Motivation Background
2. Partitioning: Consistent Hashing
3. High Availability for writes: Vector Clocks
4. Handling temporary failures: Sloppy Quorum
5. Recovering from failures: Merkle Trees
6. Membership and failure detection: Gossip Protocol
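Here is the promised consistent-hashing sketch: a minimal Python illustration of the partitioning technique in point 2. Real systems add virtual nodes and replication, which this sketch omits.

```python
# Minimal consistent-hash ring, as used by Dynamo for partitioning.
import bisect
import hashlib

def _hash(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class Ring:
    """Each key is owned by the first node clockwise from its position."""
    def __init__(self, nodes):
        self._points = sorted((_hash(n), n) for n in nodes)

    def node_for(self, key: str) -> str:
        positions = [p for p, _ in self._points]
        i = bisect.bisect(positions, _hash(key)) % len(self._points)
        return self._points[i][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.node_for("customer:42"))
# Adding a node remaps only the keys between it and its predecessor,
# unlike modulo sharding, which remaps almost every key.
```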
Using Drupal 8 + D3 + Arduino to Create Real World Solutions – Kevin Wehmueller
Miss our Drupal GovCon session? That's OK! We're glad you're interested in hearing what we had to say! This slide deck covers the basic elements of our work with DAI to create the Hidrosonico (a working title), a proof-of-concept device that leverages Arduino, Drupal 8, and D3.js to broadcast and display water levels. Hopefully, this prototype will lead to a solution that can preempt flooding disasters, improve evacuation response times, and save lives.
Want to know more? Drop us a line at hello@taoti.com.
The Heart of Data Modeling: 7 Ways Your Agile Project is Managing Data Wrong – DATAVERSITY
Is your organization using agile approaches to systems development projects? Have you found that there are conflicting opinions about what should be done, when it should be done, and who should do it? Is there even a suggestion that data modeling isn’t needed on an Agile project? Are your data architects stuck in a waterfall world? Are you asking for “no more changes” to the data model? Do your developers think that “just the right documentation” means no modeling allowed? Does anyone even know where the reference data for the application is located? Or how it is updated?
In this month’s webinar, Karen will show you how data modeling and Agile approaches CAN work together to deliver quality information systems and solutions, with fewer dysfunctions and fewer tears.
2015 is knocking on the door and will be an exciting and surprising year for the BI industry. However, not everything will be a surprise for Panorama as we are always on top of the latest trends influencing the Business Intelligence community.
• What will the future hold for the industry?
• What are our BI experts’ thoughts, predictions and internal assessments on what new directions the Business Intelligence community will see in the coming year?
• Countdown of the most important trends in the industry
Watch the companion webinar: http://embt.co/1BIRvPw
Business users and analysts are often trying to solve a very specific data-related problem, and when researching it, may wonder why certain items don’t correlate. Maybe you need to reconcile old data and new data, and eliminate erroneous entries. How do you find what the various terms mean and where the relevant data resides? Business stakeholders need visibility to the organization’s models and metadata, but at the right level of detail for their use. Join this session to learn about business data access challenges, including:
+ What issues exist with current methods
+ What information business users really need
+ How to find that information
Karen Lopez will share tips and insights on working through the data challenges for business analysts and Josh Buckner will share a solution to address those concerns.
Information is at the heart of all architecture disciplines & why Conceptual ... – Christopher Bradley
Information is at the heart of all of the architecture disciplines, such as Business Architecture and Applications Architecture, and Conceptual Data Modelling supports them all.
Data modelling, which helps inform this, has been wrongly taught in many universities as being just for database design.
chris.bradley@dmadvisors.co.uk
Top Business Intelligence Trends for 2016 – Panorama Software
10 top BI trends for 2016 – by Panorama
- It's all about the insight
- Visual perception rules
- The learning suggestive system - AI gets real
- The data product chain becomes democratized
- Cloud (finally)
- “Mobile”
- Automated data integration
- Internet of Things data accelerating into reality
- Hadoop accelerators are the last chance for Hadoop
- Fading of the centralized on-premise DWH
Data has been increasing at an exponential rate, and organizations are either struggling to cope or rushing to take advantage by analyzing it. Hadoop is an excellent open source framework that addresses this big data problem.
I have used Hadoop within the financial sector for the last few years but could not find any resource or book that explains the usage of Hadoop for finance use cases. The best books I could find were, again, about Hadoop, Hive, or MapReduce patterns, with examples of counting words or Twitter messages in all possible ways.
I have written this book with the objective of explaining the basic usage of Hadoop and other products to tackle big data for finance use cases. I have touched on the majority of use cases, taking a very practical approach.
The book is available on:
http://www.amazon.co.uk/381/dp/B00X3TVGJY/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=&qid=
http://www.amazon.com/381/dp/B00X3TVGJY/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=&qid=
http://www.amazon.in/381/dp/B00X3TVGJY/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=&qid=
This was the first part of the presentation on "Road Map for Careers in Big Data," given in conjunction with Hortonworks/Aengus Rooney on 17th August 2016 in London, for those contemplating a move to Big Data from an often relational background.
Course 8 : How to start your big data project by Eric Rodriguez Betacowork
For more info about our Big Data courses, check out our website ➡️ https://www.betacowork.com/big-data/
---------
"Data is the new oil" - Many companies and professionals do not know how to use their data or are not aware of the added value they could gain from it.
It is in response to these problems that the project “Brussels: The Beating Heart of Big Data” was born.
This project, financed by the Region of Brussels Capital and organised by Betacowork, offers 3 training cycles of 10 courses on big data, at both beginner and advanced levels. These 3 cycles will be followed by a Hackathon weekend.
No prerequisites are required to start these courses. The aim of these courses is to familiarize participants with the principles of Big Data.
------
For more info about our Big Data courses, check out our website ➡️ https://www.betacowork.com/big-data/
Data is the fuel that launches firms ahead of the pack – often far ahead. Those who can collect, analyze, and quickly make use of data will have an ever-increasing competitive advantage. That's the promise of Big Data.
Hadoop is regarded as a key capability for implementing Big Data initiatives in the enterprise, but organizations have yet to realize its full business benefits. In this webinar, Pivotal and guest Forrester Research, Inc. identify the use cases driving Hadoop adoption, and explore what is needed to transform initial investments into results.
Learn about:
Challenges Hadoop introduces, and how the right tools and platforms can help address them
Shifts in the industry with regards to SQL and NoSQL systems and their implications to Big Data analytics
Applying in-memory technologies for data management systems, data analytics, transactional processing and operational databases
Watch the on-demand webinar here:
http://www.pivotal.io/big-data/pivotal-forrester-operationalizing-data-analytics-webinar
Learn how to maximize business value from all of your data here: http://www.pivotal.io/big-data/pivotal-hd
Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Science! – Sarah Aerni
Slides from the Pivotal Open Source Hub Meetup
"Data Science as a Commodity: Use MADlib, R, & other OSS Tools for Data Science!"
As the need for data science as a key differentiator grows in all industries, from large corporations to startups, the need to get to results quickly is enabled by sharing ideas and methods in the community. The data science team at Pivotal leverages and contributes to this community of publicly available and open source technologies as part of their practice. We will share the resources we use by highlighting specific toolkits for building models (e.g. MADlib, R) and visualization (e.g. Gephi and Circos) along with their benefits and limitations by sharing examples from Pivotal's data science engagements. At the end of this session we hope to have answered the questions: Where can I get started with Data Science? Which toolkit is most appropriate for building a model with my dataset? How can I visualize my results to have the greatest impact?
Bio: Sarah Aerni is a member of the Pivotal Data Science team with a focus on healthcare and life science. She has a background in the field of Bioinformatics, developing tools to help biomedical researchers understand their data. She holds a B.S. in Biology with a specialization in Bioinformatics and a minor in French Literature from UCSD, and an M.S. and Ph.D. in Biomedical Informatics from Stanford University. During her time as a researcher she focused on the interface between machine learning and biology, building computational models enabling research for a broad range of fields in biomedicine. She also co-founded a start-up providing informatics services to researchers and small companies. At Pivotal she works with customers in life science and healthcare building models to derive insight and business value from their data.
Data Integration is a key part of many of today’s data management challenges: from data warehousing, to MDM, to mergers & acquisitions. Issues can arise not only in trying to align technical formats from various databases and legacy systems, but in trying to achieve common business definitions and rules.
Join this webinar to see how a data model can help with both of these challenges – from ‘bottom-up’ technical integration to ‘top-down’ business alignment.
Agile & Data Modeling – How Can They Work Together?DATAVERSITY
A tenet of the Agile Manifesto is ‘Working software over comprehensive documentation’, and many have interpreted that to mean that data models are not necessary in the agile development environment. Others have seen the value of data models for achieving the other core tenets of ‘Customer Collaboration’ and ‘Responding to Change’.
This webinar will discuss how data models are being effectively used in today’s Agile development environment and the benefits that are being achieved from this approach.
We recently presented our technology solution for metadata discovery to the Boulder Business Intelligence Brains Trust in Colorado. (www.bbbt.us)
The whole session was also recorded on video, and there is a link to the recording at the end of the presentation.
Geek Sync | Avoid the Seven Mistakes Data Modelers Make in Aiding Data Govern...IDERA Software
You can watch the replay for this Geek Sync webcast, Avoid the Seven Mistakes Data Modelers Make in Aiding Data Governance, in the IDERA Resource Center, http://ow.ly/nCrq50A4q8G.
Data privacy, protection, and compliance legislation is becoming ever more important. In that context, organizations have been looking towards their data governance teams to make sure that they understand their data, know how it is classified, and where it resides.
In this session, join Karen Lopez in discussing the mistakes that data modelers make in supporting data governance programs — and that you should avoid! These mistakes include collaboration errors, data model security fails, data stewarding missteps, data model integrity harms, and more.
Newer compliance regulations can make these mistakes costly and difficult to recover from. Karen wants you to love your data — and your data model!
Speaker: Karen Lopez has more than 20 years of database design experience. She specializes in the practical application of design approaches, balancing development time frames with the need to deliver solutions that will support business agility and data quality needs. She’s known for her fun and engaging speaking and teaching style. She tweets about data, space exploration and her travel experiences at @datachick. Karen blogs at www.datamodel.com.
Panorama Necto uncovers the hidden insights in your data and presents them in beautiful dashboards powered by KPI alerts, all managed by a secure, centralized, state-of-the-art business intelligence platform.
Similar to NoSQL and Data Modeling for Data Modelers
Slide deck for the DGIQ SIG on AI Ethics.
Are you concerned about data and AI ethics? Do you worry about how to make sure the algorithms and systems that affect our lives are fair, honest, responsible, and respectful of our rights and values? Do you have opinions about how to build an organizational culture that cares about these topics?
Join us for what will surely be a lively and interesting session where you are the speakers.
Special interest group (SIG) discussions are group conversations on topics that are new, or specific to an audience segment. The format is casual and without any formal presentation. The objective is to engage all participants in an exchange of ideas, questions, and advice, so please come with a willingness to participate in the conversation.
A Designer's Favourite Security and Privacy Features in SQL Server and Azure ...Karen Lopez
SQL Server includes multiple features that focus on data security, privacy, and developer productivity. In this session, we will review the best features from a database designer’s and developer’s point of view.
– Always Encrypted
– Dynamic Data Masking
– Row Level Security
– Data Classification
– Assessments
– Defender for SQL Server
– Ledger Tables
…and more
We’ll look at new and older features, why you should consider them, where they work, where they don’t, who needs to be involved in using them, and what changes, if any, need to be made to applications or tools that you use with SQL Server.
You will learn:
– The pros and cons of implementing each feature
– How implementing these new features may impact existing applications
– 10 tips for enhancing SQL Server security and privacy protections
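To give a flavour of what adopting one of these features involves, here is a hypothetical sketch that enables Dynamic Data Masking on an assumed dbo.Customer table from Python via pyodbc; the server, database, table, column, and role names are all illustrative.

# Hypothetical sketch: apply Dynamic Data Masking to an assumed table.
# Connection details, dbo.Customer, Email, and SupportManagers are illustrative.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=myserver;DATABASE=mydb;Trusted_Connection=yes;"
)
cursor = conn.cursor()

# Mask the email address for non-privileged users.
cursor.execute(
    "ALTER TABLE dbo.Customer "
    "ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');"
)

# Roles that must see real values can be granted UNMASKED.
cursor.execute("GRANT UNMASKED TO SupportManagers;")
conn.commit()

Unprivileged queries then see masked values such as aXXX@XXXX.com while the stored data is unchanged, which is the appeal of masking over encryption for display-level protection.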
Designer's Favorite New Features in SQLServerKaren Lopez
A database designer's favourite features in SQL Server...with a bit of Azure SQL DB, too.
Always Encrypted
Row Level Security
Microsoft Purview
Azure Enabled SQL Server
Azure Defender for SQL
Azure Defender for Cloud
Dynamic Data Masking
Ledger Database and Tables
Data Privacy
Data Governance
Karen's Presentation to DAMA Chicago and other DAMA Chapters on 15 February 2023.
This presentation is less about data lakes than it is about Data Quality, and about how data professionals should think about designing and architecting systems that best reflect how data works in the real world.
Expert Cloud Data Backup and Recovery Best Practice.pptxKaren Lopez
We’ve been deploying backup solutions since the beginning of computing, and the foundations of backup and recovery have stayed the same: make sure backups run consistently and set recovery objectives. Yet systems in 2022 don’t work or act the same way they did decades ago. Cloud data backups have helped us meet the need for offsite backups and have changed how we budget for them. Ransomware has changed how we store them. The laws of physics might be more of an issue than when we had tapes stored in a safe down the hall. Cost models have changed, too.
In this session, Karen Lopez covers best practices for modern data recovery…and she will share stories of worst practices just to keep it real.
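As a small, hypothetical illustration of "set recovery objectives," here is a Python sketch that flags a backup as violating a recovery point objective (RPO); the 24-hour RPO and the timestamps are invented for the example.

# Hypothetical sketch: check a last-backup timestamp against an RPO.
# The 24-hour RPO and the timestamps below are illustrative.
from datetime import datetime, timedelta, timezone

RPO = timedelta(hours=24)

def rpo_violated(last_backup, now):
    # True when more time has passed since the last backup than the RPO allows.
    return now - last_backup > RPO

last = datetime(2022, 6, 1, 3, 0, tzinfo=timezone.utc)
now = datetime(2022, 6, 2, 5, 0, tzinfo=timezone.utc)
print(rpo_violated(last, now))  # True: 26 hours exceeds the 24-hour RPO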
Manage Your Time So It Doesn't Manage YouKaren Lopez
NASA Space Apps NYC Pre-Hackathon Symposium presentation by Karen Lopez, InfoAdvisors and NASA Datanaut. Karen presents on how to successfully manage your time and deliverables in the NASA Space Apps Challenge no matter where you are participating.
This one-hour presentation covers the tools and techniques for migrating SQL Server databases and data to Azure SQL DB or SQL Server on VM. Includes SSMA, DMA, DMS, and more.
Blockchain for the DBA and Data ProfessionalKaren Lopez
An overview of blockchain fundamentals, including examples of Oracle 20c Blockchain Tables. Includes concepts of trust, immutability, hashes, distributed nodes, and cryptography.
Blockchain for the DBA and Data ProfessionalKaren Lopez
With all the hype around blockchain, why should a DBA or other data professional care? In this session, we will cover the basics of blockchain as it applies to data and database processes:
Immutability
Verification
Distribution
Cryptography
Transactions
Trust
We will look at current offerings for blockchain features in Azure and in databases and data stores. Finally, we'll help you identify the types of business requirements that need blockchain technologies.
You will learn:
The valid uses of blockchain approaches in databases
How current technologies support blockchain approaches
The costs, benefits, and risks of blockchain
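For readers new to these concepts, here is a minimal sketch of the hash-chaining idea that underpins immutability and verification; it is a teaching toy, not any vendor's implementation.

# Minimal sketch of hash chaining: each block commits to the previous
# block's hash, so tampering anywhere breaks verification downstream.
import hashlib
import json

def block_hash(data, prev):
    payload = json.dumps({"data": data, "prev": prev}, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, data):
    prev = chain[-1]["hash"] if chain else "0" * 64
    chain.append({"data": data, "prev": prev, "hash": block_hash(data, prev)})

def verify(chain):
    for i, block in enumerate(chain):
        expected_prev = chain[i - 1]["hash"] if i else "0" * 64
        if block["prev"] != expected_prev:
            return False
        if block["hash"] != block_hash(block["data"], block["prev"]):
            return False
    return True

chain = []
append_block(chain, "credit 100")
append_block(chain, "debit 25")
print(verify(chain))               # True
chain[0]["data"] = "credit 1000"   # tamper with history
print(verify(chain))               # False: the hash links no longer check out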
Data Security and Protection in DevOps Karen Lopez
Presentation to the London #WinOps event, Sept 2019, focusing on data security, privacy, and protection in DevOps efforts. Includes data masking, dev and test data, Always Encrypted, and more.
Data Modeling for Security, Privacy and Data ProtectionKaren Lopez
Karen Lopez (@datachick/InfoAdvisors) 90-minute presentation on Data Security, Data Privacy, Compliance and how data modelers should discover, assess, and monitor these important data management responsibilities.
Designing for Data Security by Karen LopezKaren Lopez
As security and compliance become more important for organizations, especially in the age of GDPR, data breach notification, and other legislation, Karen covers the types of features data architects and designers should consider when building modern, protected, and defensive systems.
There is a lot of data modeling and database design terminology and jargon that uses the word "key." Do you know the difference between a surrogate key and a primary key? A super key and a candidate key? Could you explain them to a technical audience? A business user or an auditor?
In this presentation, Karen Lopez covers the concepts of primary keys, foreign keys, candidate keys, surrogate keys, and more.
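As a quick refresher on the distinctions the talk covers, here is an illustrative sketch in Python using SQLite; the table and column names are invented, and the same concepts apply to any relational DBMS.

# Illustrative sketch of key types using SQLite; names are invented.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON;")  # SQLite enforces FKs only when enabled

conn.execute("""
    CREATE TABLE Customer (
        CustomerId INTEGER PRIMARY KEY,   -- surrogate key: system-generated, no business meaning
        Email      TEXT NOT NULL UNIQUE,  -- candidate (natural) key: unique business identifier
        Name       TEXT NOT NULL
    );
""")
conn.execute("""
    CREATE TABLE CustomerOrder (
        OrderId    INTEGER PRIMARY KEY,
        CustomerId INTEGER NOT NULL REFERENCES Customer (CustomerId)  -- foreign key
    );
""")

conn.execute("INSERT INTO Customer VALUES (1, 'pat@example.com', 'Pat')")
conn.execute("INSERT INTO CustomerOrder VALUES (10, 1)")  # OK: parent row exists
# Inserting CustomerOrder (11, 99) would fail: no Customer row with key 99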
How to Survive as a Data Architect in a Polyglot Database WorldKaren Lopez
Karen Lopez talks to data architects and data modelers about how they can best deliver value on modern data-driven projects beyond relational database technologies. She covers NoSQL databases and datastores, which scenarios they best fit and which ones they don't. She ends with 10 tips for adding more value to polyschematic database solutions.
Karen's Favourite Features of SQL Server 2016Karen Lopez
Slides from a one hour webinar on Karen Lopez's favorite features from database designer's point of view. Topics include Always Encrypted, Data Masking, Row Level Security, Foreign Keys, JSON and more.
Notice an error? Let me know. I welcome this sort of feedback.
StarCompliance is a leading firm specializing in the recovery of stolen cryptocurrency. Our comprehensive services are designed to assist individuals and organizations in navigating the complex process of fraud reporting, investigation, and fund recovery. We combine cutting-edge technology with expert legal support to provide a robust solution for victims of crypto theft.
Our Services Include:
Reporting to Tracking Authorities:
We immediately notify all relevant centralized exchanges (CEX), decentralized exchanges (DEX), and wallet providers about the stolen cryptocurrency. This ensures that the stolen assets are flagged as scam-linked transactions, making it far more difficult for the thief to use them.
Assistance with Filing Police Reports:
We guide you through the process of filing a valid police report. Our support team provides detailed instructions on which police department to contact and helps you complete the necessary paperwork within the critical 72-hour window.
Launching the Refund Process:
Our team of experienced lawyers can initiate lawsuits on your behalf and represent you in various jurisdictions around the world. They work diligently to recover your stolen funds and ensure that justice is served.
At StarCompliance, we understand the urgency and stress involved in dealing with cryptocurrency theft. Our dedicated team works quickly and efficiently to provide you with the support and expertise needed to recover your assets. Trust us to be your partner in navigating the complexities of the crypto world and safeguarding your investments.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23...John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads and is expected to be a non-issue when the computation is performed on massive graphs.
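For orientation, here is a minimal sketch of the standard ("monolithic") power-iteration PageRank that the report uses as its baseline, including a uniform-teleport treatment of dead ends; it is a toy illustration, not the report's code.

# Minimal sketch of monolithic PageRank with dead-end (dangling) handling.
def pagerank(graph, damping=0.85, iters=50):
    # graph: dict mapping each vertex to a list of its out-neighbours
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in graph}
        for v, outs in graph.items():
            if outs:
                share = damping * rank[v] / len(outs)
                for u in outs:           # distribute rank along out-edges
                    nxt[u] += share
            else:
                for u in graph:          # dead end: spread rank uniformly
                    nxt[u] += damping * rank[v] / n
        rank = nxt
    return rank

print(pagerank({"a": ["b"], "b": ["a", "c"], "c": []}))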
As Europe's leading economic powerhouse and the fourth-largest #economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like #Russia and #China, #Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in #cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to #AdvancedPersistentThreats (#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Compressed Sparse Row (CSR) is an adjacency-list-based graph representation commonly used by graph algorithms such as PageRank.
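For concreteness, here is a toy CSR layout in Python for the three-vertex graph 0→1, 0→2, 1→2; this illustrates the representation itself, not the notes' C++/CUDA code.

# Toy CSR layout: edge targets are concatenated, offsets index into them.
offsets = [0, 2, 3, 3]   # offsets[v] .. offsets[v+1] span vertex v's out-edges
targets = [1, 2, 2]      # concatenated destination vertices

def out_neighbours(v):
    return targets[offsets[v]:offsets[v + 1]]

for v in range(len(offsets) - 1):
    print(v, "->", out_neighbours(v))   # 0 -> [1, 2], 1 -> [2], 2 -> []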
Multiply with different modes (map)
1. Performance of sequential vs. OpenMP-based vector multiply.
2. Comparing various launch configs for CUDA-based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs. bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential vs. OpenMP-based vector element sum.
2. Performance of memcpy vs. in-place CUDA-based vector element sum.
3. Comparing various launch configs for CUDA-based vector element sum (memcpy).
4. Comparing various launch configs for CUDA-based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA-based vector element sum (in-place).
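As a loose Python analogue of these map/reduce micro-benchmarks (the originals are OpenMP/CUDA), here is a sketch that times an element-wise multiply and an element sum; vector sizes and repeat counts are arbitrary.

# Loose analogue of the micro-benchmarks: time a map (multiply) and a
# reduce (sum) over a vector. Sizes and repeat counts are arbitrary.
import timeit

xs = list(range(100_000))
ys = list(range(100_000))

map_time = timeit.timeit(lambda: [a * b for a, b in zip(xs, ys)], number=10)
reduce_time = timeit.timeit(lambda: sum(xs), number=10)
print(f"multiply: {map_time:.3f}s  sum: {reduce_time:.3f}s")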