Data-Centric Infrastructure for Agile Development

933 views
720 views

Published on

Most data centers are filled with rigid data servers that are tightly linked to specific applications, leading to data duplication, lengthy development cycles, and unnecessary costs. Learn how you can use an Enterprise NoSQL database platform to help create a flexible, agile data fabric that will allow you to iterate your application development, optimize your data, and reduce costs.

When your enterprise infrastructure is data-centric instead of application-centric, you make it easy for anyone to pull crucial data without spending unnecessary time and money on plumbing...freeing resources for building better applications. Learn how other companies have built –and benefited from– a data-centric infrastructure for agile development.

Ingest and manage all your data, documents, and semantic triples in a flexible, schema-agnostic platform – without sacrificing the ACID transactions, granular security, database management tools and other features you’ve come to expect in a mature database platform
Quickly build complex, interactive search applications
Deliver robust, real-time search and alerting within your applications
Use – and optimize – modern infrastructure including Hadoop and cloud to attain operational agility
Simplify implementation of data governance requirements around security, privacy, provenance, retention, continuity, and compliance – while reducing risk, cost, and time

Published in: Technology
0 Comments
3 Likes
Statistics
Notes
  • Be the first to comment

No Downloads
Views
Total views
933
On SlideShare
0
From Embeds
0
Number of Embeds
63
Actions
Shares
0
Downloads
21
Comments
0
Likes
3
Embeds 0
No embeds

No notes for slide

Data-Centric Infrastructure for Agile Development

  1. 1. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. The Data-Centered Data Center Presented by: Jim Clark, Senior Director of Product Management
  2. 2. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 2 THE WORLD IS VERY APPLICATION-CENTRIC
  3. 3. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 3 2. Determine needed data 3. Determine needed queries ? ? 1. Design the application 7. Load the data 8. Code the application5. Build a database 6. Design the ETL strategy 4. Design the schema and indexing strategy
  4. 4. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 4 OLTP Warehouse Data MartsArchives “Unstructured” “ ” Video Audio Signals, Logs, Streams Social Documents, Messages { } Metadata Search🔍 Reference Data
  5. 5. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 5 HOW DO YOU DETERMINE IN ADVANCE WHAT'S USEFUL? Love the application...can you go back and include the data from 1990 – 1995?
  6. 6. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 6 TOO MUCH DATA TO BE COPYING FOR EVERY NEW APPLICATION Serious?! Third time this month I'm moving that data around!
  7. 7. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 7 ETL CONSUMES ALL RESOURCES With all of the new data we're trying to get into the database, there's no time to build new features!
  8. 8. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 8 TOO MANY TECHNOLOGIES CREATES SCALING HEADACHES To scale this system, we've got to buy new hardware. We can take the old hardware and move it to this other system. That one can't get any bigger. Period.
  9. 9. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 9 TOO MUCH AND TOO MANY COPIES...YOU'VE LOST CONTROL Who's reading it? Who's editing it? Where's the master copy? What's happened to it over time? Is it reliable? How up-to-date is this data store? Are the security models consistent? Are there different backup models? Are the lifecycles, retention, disposal policies the same?
  10. 10. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 10 APPLICATION-CENTRIC DATA CENTER
  11. 11. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 11 APPLICATION-CENTRIC DATA CENTER
  12. 12. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 12 The data-centered data center
  13. 13. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 13 5. On-premises, Cloud... both! 3. Elasticity with no downtime 6. Create powerful data services 1. Hadoop 4. Manage the data lifecycle2. Low-cost Tiered Storage 7. Complete database platform How? 8. Enterprise Readiness
  14. 14. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 14 Enter Hadoop… Hadoop Staging Analytics Persistence
  15. 15. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 15
  16. 16. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 16 Legacy RDBMS  Indexes  Transactions  Security  Enterprise operations “NoSQL”  Flexible data model  Commodity scale out  Distributed, fault-tolerant  Hadoop sink/source Why must we choose?
  17. 17. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 17 Enterprise NoSQL  Flexible data model, comprehensive indexes o Documents: Hierarchy, text, values, tags—schema “when you need it” o Scalars: Aggregates and range filters, including geospatial o Triples: Linked facts and inferencing o Permissions: Users, roles, compartments, and privileges o Queries: Reverse indexes for alerting, matching  Ad hoc queries, lock-free reads  Real-time transformation  Strict consistency, security throughout
  18. 18. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 18 Data-centered Enterprise NoSQL HadoopMarkLogic
  19. 19. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 19 NoSQL  Online applications  Delivery  Decision-making  Real-time  Granular updates  Distributed indexes Hadoop  Offline analytics  Staging  Model-building  Long-haul batch  Write-once, read-many  Distributed file system Complementary approaches
  20. 20. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 20 TIERED STORAGE
  21. 21. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 21 With Tiered Storage You Can  Provide multiple Service Level Agreements (SLAs) in a single system  Decrease time and costs of ETL to bring offline content back online  Empower your operations team without imposing burdens on your developers
  22. 22. SLIDE: 22 © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Tiered Storage Here’s how you enable tiered storage…  Define data tiers based on a range index  Have content balanced into forests by tier  Move an entire tier to different storage  Query one tier… …or the other tier… …or both at once! All with no downtime, and 100% consistency!
  23. 23. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 23 OPERATIONAL TRADE STORE Case Study:
  24. 24. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 24 Tier 1 Bank: Operational trade store “What are the bank’s obligations?” ETL Trade execution Post-trade processing Reporting Analytics Trade stores Reference data
  25. 25. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 25 Legacy trade store challenges  Long development cycles for new instrument types  Complex combinations of ETL and data models  Limited visibility across the business  Governance risk, maintenance costs of siloed infrastructure  Varied SLAs and access patterns created inefficiencies
  26. 26. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 26 Preserving Context with Documents Trade Cashflows Party Identifier Net Payment Payment Date Party Reference Payer Party Trade ID Payment AmountReceiver Party Application Model Provider Model Persistence Model
  27. 27. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 27 Information lifecycle Active Historical Archive Time SSD DAS SAN Hadoop DAS SAN NAS Hadoop S3 NAS Hadoop S3
  28. 28. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 28 Active Active  Local 10K SAS, RAID10  Replication for HA  Merge overhead for updates  20 hosts, 320 shards  4 TB of SSD cache 96 TB
  29. 29. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 29 Compliance Active Compliance  Shared NAS  63 hosts  Effective 8 TB/host 504 96 TB
  30. 30. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 30 Active Compliance Analytic  Hadoop  120 hosts  Effective 12 TB/host  10 MarkLogic hosts Analytic 1,044 504 96 TB
  31. 31. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 31 Active Compliance Analytic Online migration TB
  32. 32. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 32 96 504 1,044 592 2,066 2,080 Total Size (TB) Total Cost ($000) Effective Unit Cost ($/GB) $4 Compliance $1.50 AnalyticOperational $25 ($/GB)
  33. 33. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 33 Align infrastructure with objectives  Data volumes are increasing, but IT budgets are not  Storage is the dominant factor in the overall cost  Value of data and pattern of access varies widely and changes over time  Last month’s news  Current quarter’s open transactions  Latest message traffic
  34. 34. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 34
  35. 35. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 35 ELASTICITY
  36. 36. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 36 With Elasticity You Can  Know when to scale  How much to scale  Programmatically expand and contract  On premises or in the cloud
  37. 37. SLIDE: 37 © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED. Elasticity Scale up and down with  Tools to understand in detail how your cluster is performing, and to find bottlenecks  Fine-grained tuning parameters for optimization of indexes, cache sizes, etc.  Cloud orchestration APIs to expand and contract clusters programmatically on-prem or in the cloud  Continuous, online rebalancing of content across nodes in a cluster to keep performance optimal for your cluster size
  38. 38. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 38
  39. 39. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 39 The data-centered data center Index once Single security model Flexible data model Transactions Elastic operations …when you need them Simplified governance
  40. 40. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 40 SECURE Minimize duplication, costly ETL, reduce risk REAL-TIME Enterprise-class database for real-time search, delivery & analytics THE DATA-CENTERED DATA CENTER RUN APPLICATIONS Run mission critical applications directly on HDFS
  41. 41. © COPYRIGHT 2013 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 41 Powerful Deliver more value, build more powerful applications Full Text Search Scalable Analytic Functions Alerting & Event Processing Geospatial Query In-database MapReduce Visualization Widgets Semantics: RDF & SPARQL Flexible Indexes JSON Storage REST & Java APIs Triple Index POWERFUL Deliver more value, build more powerful applications AGILE Prepare for and respond quickly to change BI Integration HDFS & Amazon S3 Storage Elastic Programmatic Controls & Metering Application Builder Information Studio SQL Support Hadoop Connector Tiered Storage Cloud Ready Schema- Agnostic mlcp Content Pump TRUSTED Enterprise-ready and secure for mission-critical apps ACID Transactions XA Distributed Transactions Database Rollback Backup/ Restore Automated Failover Journal Archiving Replication Point-in- time Recovery Monitoring & Management Role-based Security & LDAP Support Common Criteria Security Certification Configuration Management
  42. 42. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 42 Take-Aways  New and more data is both an opportunity and a threat  Last generation of data management is not sufficient  More copies, representations, transformations increase risk and slow innovation  Index once and reuse across workloads, lifecycle  NoSQL: indexing and updates for interactive apps  Hadoop: staging, persistence, and analytics
  43. 43. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 43 SEARCHDATABASE APPLICATION SERVICES
  44. 44. © COPYRIGHT 2014 MARKLOGIC CORPORATION. ALL RIGHTS RESERVED.SLIDE: 44 Any Questions?

×