Data Modelers Still Have Jobs: Adjusting for the NoSQL Environment
 

Data modeling emerged in the 1970s in response to the needs of database designers. This accident of history has influenced perceptions and practices of data modeling in harmful ways. Most notably, business-focused requirements analysis has been wrongly commingled with relational modeling. Compounding the problem, vendors have produced data-modeling tools that blur the important distinction between the client’s problem and the technologist’s solution.

Enter NoSQL, with its promise of liberating practitioners from the tiresome burden of designing relational databases. The chance to dispense with relational modeling was embraced enthusiastically, but for many organizations, it has meant discarding the only rigorous activity that had any hope of formally expressing the client’s data needs. This is a textbook case of throwing out the baby with the bathwater. This presentation shows you how to save the baby, and your career as a data modeler.

Understanding the client’s data problem remains essential, regardless of the technology used to build the solution. For that matter, understanding the client’s data problem is the first step toward making an informed choice of technology for the solution.
Using concrete, real-world examples, the presenter will show the following:

- How abandoning modeling altogether is a recipe for disaster, even in—or especially in—NoSQL environments
- How experienced relational modelers can leverage their skills for NoSQL projects
- How the NoSQL context both simplifies and complicates the modeling endeavor
- How lessons learned modeling for NoSQL projects can make you a more effective modeler for any kind of project

  • Point of having a merged cell for physical: it’s all coming together – it’s increasingly difficult to distinguish the underlying physical model services…
    Here again, hypertext is not 1:1 with HTML – it’s beyond-the-basics hypertext as manifested, e.g., in Web publishing and collaboration-oriented systems/servers.
    XQuery is not mainstream today, but it is exceptionally powerful and was co-developed in conjunction with XPath 2.0.
  • Point of this slide: reinforce ability to discern major similarities/differences between two tools/services focused on similar domain, by comparing/contrasting model diagrams.
    Non-technical people can easily learn how to read/use this type of model – not the case with most logical and physical model diagramming techniques.
    Evernote conceptual model fragment example from http://www.quepublishing.com/articles/article.aspx?p=1684320
    Incomplete – a full conceptual model includes accompanying documentation, e.g., with entity definitions and examples.
    Microsoft OneNote 2010 conceptual model fragment example from http://www.quepublishing.com/articles/article.aspx?p=1684320
    Reason for including it: it provides an example, comparing it to the Evernote conceptual model fragment, of how easy it is to understand domains when using conceptual models – e.g., the fact that OneNote has a more elaborate info item containment structure, and supports tags at the item/paragraph level, while Evernote tagging is at the note/page level. That’s not meant to be a judgment call; the extent to which Evernote or OneNote is more useful is a function of your info item/note-taking needs.

Presentation Transcript

  • Data Modelers Save Their Careers: Surviving and Thriving with NoSQL
    Joe Maguire
    Data Quality Strategies, LLC
    http://www.DataQualityStrategies.com/
    © 2013 Data Quality Strategies, LLC
  • Thesis
    • Relational DBMS’s have dominated,
    • ...so relational modeling subsumed other forms, including conceptual modeling.
    • As R-DBMS wanes, so does relational modeling – and sadly, whatever it subsumed.
    • Conceptual modeling must be saved.
    • Relational modelers can step in to save it...
    • ...with some significant effort.
    25 June 2013 © 2013 Data Quality Strategies, LLC
  • My Perspective
    • Over three decades in industry
    • Career is a three-legged stool
      – Product development for software vendors
      – Solution design for enterprises
      – Author, Industry Analyst, Thought Leader
    • Specialize in
      – Modeling
      – Requirements analysis
      – Data architecture
      – Data quality
    • Joe.Maguire@DataQualityStrategies.com
  • Agenda
    • History
    • Current Events
    • Your Future as a Data Modeler
    • Q&A
  • A Big-Picture Framework
    Meta-model: Data Perspective
    • Conceptual: Entities; Attributes; Relationships; Identifiers
    • Logical: Tables; Columns; Primary and foreign keys
    • Physical: Indexes; Table spaces; Vertical and horizontal partitioning; Denormalizations
  • Good Ideas in the Framework
    • Information Hiding
      – e.g., conceptual excludes implementation details
    • The Type/Instance distinction
      – Models describe categories, data describes members
    • Application/Data Independence
      – Data modeling is separate from process modeling
    • User Requirements ≠ System Requirements
      – Users should not participate in logical and physical
    • Model-Driven Development
      – Forward and reverse engineering across model levels
  • A Big-Picture Framework, distorted
    Meta-model: Data Perspective
    • Relational: Entities / Tables; Attributes / Columns; Relationships / FKs; Identifiers / PKs
    • Physical: Indexes; Table spaces; Vertical and horizontal partitioning; Denormalizations
  • How the Distortion Happens
    • Tool Vendors Dismiss Conceptual Modeling
      – Because their tools cannot support it anyway
    • Info Mgmt Specialists Confuse Models with Reality
      – E.g., believing the relational model suffices to describe the universe
    • Institutionalized Expediency
      – We know about conceptual modeling, but to save time, we combine it with relational modeling...
      – ...then we formalize that into our dev processes...
      – ...and eventually, that becomes the “best practices.”
  • Distortions, Revisited
    • Summary of Distortions:
      – Distortion: Conceptual means vague
      – Distortion: Logical implies relational
        • Rather than implying XML, OO, KV Store, Array Database, Graph Database
    • Results of Distortions:
      – Two levels only: relational and physical
      – Relational modeling used for user requirements
  • Agenda
    • History
    • Current Events
    • Your Future as a Data Modeler
    • Q&A
  • Current Events: NoSQL
    • The “Just Say No” Interpretation
    Meta-model: Data Perspective
    • Logical / Relational: Entities / Tables; Attributes / Columns; Relationships / FKs; Identifiers / PKs
    • Physical, NO LONGER RELATIONAL: Schemas Based on Big Table Implementations; Alien DDL language; Limited Support from Modeling Tools
  • Current Events: NoSQL
    • The “Not Only SQL” Interpretation
      – Okay, so there might be some work for you
      – But you’re at risk of being marginalized
  • Agenda
    • History
    • Current Events
    • Your Future as a Data Modeler
    • Summary
    • Q&A
  • Your Future as a Modeler
    • Remaining Relevant
      – Selfishly: Saving your career
      – Nobly: Serving your client / company / customer
    • What You Can Do:
      – Wait for relational projects
      – Become a NoSQL database designer
      – Help your client choose data platforms
        • That starts with understanding the problems
          – which starts with CONCEPTUAL MODELING.
  • A New (?) Modeling Framework
    • Conceptual Modeling
    • Choosing a Logical Meta-model
    • Logical Modeling
    • Physical Modeling
    • Tool Support?
  • Conceptual Modeling
    • Behaviors and constructs will compare to relational modeling:
      – Keep some
      – Discard some
      – Stress some
      – Change some
  • Conceptual Data Model Example
  • Keep Some
    • Keep Entities
    • Keep Attributes
    • Keep Relationships
    • Keep Identifiers
    • Keep Maximum Cardinality of Relationships
  • Keep Entities
    • Minimum Expressiveness
    • Entities, Not Tables
      – Don’t express horizontal or vertical partitioning for performance
        • But yes if motivated by privacy/security/risk
    • Entity names, not table names
      – Honor user vocabulary, not IT naming standards
  • Keep Attributes
    • Honor The User Phenomenon
      – Attributes are part of user discourse
    • Attributes, Not Columns
      – Worry about scale (nominal, numeric, ordinal, Boolean, cyclic), not data type
      – Attribute names, not column names
    • Support In-Progress Models
      – During which attributes can become entities
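
One way to picture "scale, not data type" is the following minimal Python sketch. The `Scale` and `Attribute` names and the skater attributes are hypothetical illustrations, not part of the presentation; the point is only that a conceptual attribute records a user-facing name and a measurement scale, and nothing about storage.

```python
from dataclasses import dataclass
from enum import Enum


class Scale(Enum):
    """The five scales listed on the slide; no storage types appear here."""
    NOMINAL = "nominal"
    NUMERIC = "numeric"
    ORDINAL = "ordinal"
    BOOLEAN = "boolean"
    CYCLIC = "cyclic"


@dataclass(frozen=True)
class Attribute:
    name: str     # user vocabulary, not a column name
    scale: Scale  # what kind of comparison makes sense, not how it is stored


# Hypothetical attributes of a "skater" entity, for illustration only.
skater_attributes = [
    Attribute("name", Scale.NOMINAL),
    Attribute("skill level", Scale.ORDINAL),
    Attribute("birth month", Scale.CYCLIC),
]
```

Choosing VARCHAR versus INTEGER, or a document field versus a column family, can then wait until a logical meta-model has been picked.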
  • Keep Relationships
    • Minimum Expressiveness
      – Relationships are part of user discourse
    • Allow Many-Many and Collection Entities
      – If the latter seem illegal, you’ve been in IT too long
    • Relationships, not FKs
  • Keep Relationships
    • Relationships, not Foreign Keys
      – (achievement DOES NOT have code or creatureID)
  • Keep Relationships
    • Many-Many Allowed
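
The idea that foreign keys belong to the relational level, not the conceptual one, can be sketched in Python as follows. This is an illustrative assumption rather than the presenter's method: `Entity`, `Relationship`, `to_relational`, and the skater/event example are all hypothetical names. The conceptual objects carry no key columns, and those columns appear only when a many-many relationship is forward-engineered into a junction table.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Entity:
    name: str
    identifier: Tuple[str, ...]       # identifying attribute names
    attributes: Tuple[str, ...] = ()  # descriptive attributes; no foreign keys


@dataclass(frozen=True)
class Relationship:
    name: str
    left: Entity
    right: Entity  # many-many is allowed at the conceptual level


def to_relational(rel: Relationship) -> Dict[str, object]:
    """Forward-engineer one many-many relationship into a junction table.

    Foreign-key columns are introduced only here, at the logical level.
    """
    columns: List[str] = [f"{rel.left.name}_{a}" for a in rel.left.identifier]
    columns += [f"{rel.right.name}_{a}" for a in rel.right.identifier]
    return {"table": rel.name, "columns": columns, "primary_key": list(columns)}


# Hypothetical example: skaters enter events, many-to-many.
skater = Entity("skater", ("name",))
event = Entity("event", ("title",))
entry = Relationship("entry", skater, event)
junction = to_relational(entry)
```

A different logical meta-model (document, key-value, graph) would get its own transformation from the same conceptual objects, which is why the conceptual model stays free of keys.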
  • Keep Identifiers
    • Identifiers, Not PKs
      – IDs are not motivated by computerization, but by typography
      – IDs predate the information revolution
        • and the automotive revolution, for that matter
      – Allow collection entities
    • Support In-Progress Modeling
      – IDs help the modeler ferret out the homonym problem
  • Keep Identifiers
    • Identifiers, not PKs. (E.g., Collection Entities):
      – (each squad is identified by the skaters on it.)
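
A collection entity of this kind can be sketched in Python with a frozen set as the identifier. The `Squad` class and the skater names are hypothetical; the sketch only illustrates that two squads with the same members are the same squad, with no surrogate key involved.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Squad:
    # The identifier is the collection itself: the set of skaters on the squad.
    skaters: FrozenSet[str]


# Hypothetical skaters, for illustration only.
a = Squad(frozenset({"Ana", "Bo", "Cy"}))
b = Squad(frozenset({"Cy", "Ana", "Bo"}))  # same members, listed differently
c = Squad(frozenset({"Ana", "Bo"}))        # different membership

same_squad = a == b          # True: identity follows membership
distinct = len({a, b, c})    # 2: a and b collapse into one squad
```

Whether the eventual store uses a generated key for such an entity is a logical or physical decision; the conceptual identifier remains the membership.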
  • Discard Some
    • Discard Foreign Keys
      – They’re relational
    • Discard Minimum Cardinality
      – A function of process or policy, not data
      – Over-reported by users
    • Discard Most Constraints
      – A function of process or policy, not data
      – Are over-reported by users
  • Discard Minimum Cardinality
    • Must EVERY instance of meeting have a person?
      – No. E.g., Cassandra Summit 2014 already has a date and location but has zero persons associated with it.
    • More generally: Should the DBMS refuse to store incomplete data?
      – People get interrupted and want to save their partial work.
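
A minimal Python sketch of this point, under stated assumptions: the `Meeting` class and `missing_parts` helper are hypothetical, and the date and location values are placeholders because the slide does not give them. The record is accepted with zero persons, and completeness is reported as information rather than enforced as a constraint.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Meeting:
    name: str
    date: Optional[str] = None
    location: Optional[str] = None
    persons: Set[str] = field(default_factory=set)  # zero persons is allowed


def missing_parts(meeting: Meeting) -> List[str]:
    """Report what is incomplete instead of refusing to store the record."""
    gaps = []
    if meeting.date is None:
        gaps.append("date")
    if meeting.location is None:
        gaps.append("location")
    if not meeting.persons:
        gaps.append("persons")
    return gaps


# Placeholder values: the slide says a date and location exist but not what they are.
summit = Meeting("Cassandra Summit 2014", date="<date>", location="<location>")
draft = Meeting("Untitled planning meeting")  # interrupted, saved as partial work
```

If a process later requires at least one attendee before, say, invitations go out, that rule belongs to the process or policy, which is the slide's argument for leaving minimum cardinality out of the data model.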
  • Keep/Discard Rule of Thumb
    • Keep
      – Anything that helps you and the users together discover and name the user categories
    • Discard
      – Anything else
  • Conceptual Data Model Examples
  • Stress Some
    • Stress Consistency Requirements
      – Relational modelers (of non-distributed databases) have not been asking about these.
    • Stress Data Volume / Velocity Requirements
      – Can lead or force you to relax application-data independence
  • Change Some
    • Change Your Process
      – From math-y normalization to English-y conversation with users
      – Very difficult to achieve rigor conversationally
    • More help:
      – Mastering Data Modeling: A User-Driven Approach by Carlis & Maguire
  • A New Modeling Framework
    • Conceptual Modeling
    • Choosing a Logical Meta-Model
    • Logical Modeling
    • Physical Modeling
    • Tool Support?
  • Choosing a Logical Meta-Model
    • Don’t Assume Relational (Duh...)
    • Don’t Assume Big Table, KV-Store, Cassandra
    • Lots of Choices
      – Relational
      – Key-Value Store
      – XML/Document Database
      – Graph database
      – Array database
      – ...
  • A New Modeling Framework
    • Conceptual Modeling
    • Choosing a Logical Meta-Model
    • Logical Modeling
    • Physical Modeling
    • Tool Support?
  • Logical, Physical, and Tool Support
    • Minimal Support From Modeling Tools
      – Because few tools support conceptual modeling
      – Because vendors have not caught up to NoSQL yet
    • Community Needs to Develop Shapes
      – And the attendant transformations from conceptual shapes to Big-Table shapes
    • During Logical NoSQL Modeling, Process Requirements Will Infiltrate
  • Agenda
    • History
    • Current Events
    • Your Future as a Data Modeler
    • Summary
    • Q&A
  • Summary
    • Recommit to Conceptual Modeling for Requirements Analysis
      – Some but not all relational-modeling skills will apply
      – Must learn to focus on user communication, not nerdy stuff like intermediate normal forms
  • Summary
    • Remember the fundamentals, so that you can make informed decisions about relaxing them
      – Application-data independence (relax knowingly)
      – Distinguish problems from solutions (relax at your own peril)
      – Consistency level as a user requirement (as you ask, you’ll find immediate consistency is often negotiable)
  • Summary
    • Additional Benefits
      – Users will like you better
      – Agile developers will like you better
      – This framework works in traditional, all-SQL environments
  • Q&A
    • Joe.Maguire@DataQualityStrategies.com
    • www.DataQualityStrategies.com