Writing Space and the Cassandra NoSQL DBMS

By: Brian King at Pearson Education

Published in: Technology

  1. Writing Space and the Cassandra NoSQL DBMS
     Brian King (with thanks to Michael Aillon)
  2. Writing Space
  3. Why A Writing Space?
     “Writing is one of the most effective tools available to develop a student's critical thinking.”
  4. The Business Needs
     •  Efficient Administration Of Writing Assignments
     •  Scalable Classrooms (500+)
     •  Workflow Optimization / Automation
     •  Integrated Access to Assessment Tools
        o  Grammar Checking
        o  Auto-Scoring
        o  Plagiarism Detection (Source Check)
     •  Grading Rubrics
     •  Online Editing and Document Upload
     •  Peer Review
     •  Group Projects
  5. The Technical Goals
     •  Highly "Internet" Scalable
     •  Global Presence
     •  Continuous Availability (Fault Tolerance)
     •  Broad OS And Browser Support
     •  Mobile Device Support - "Mobile First"
     •  Low Cost (Systems, Maintenance, Integration)
     •  Write Once, Integrate “Anywhere”
     •  Gain Experience With Modern NoSQL Technologies
     •  REST Service-Based Architecture
     •  Model UI
  6. Writing Space - Instructor
  7. Writing Space - Student
  8. Cassandra
  9. What We Like
     •  Highly Scalable
     •  Easy Multi-Data Center Support
     •  Performance
     •  Distributed Ring Configuration (Master-less)
     •  Dynamic Schema, “Schema-less”
     •  Slice Queries
  10. What Challenged Us
     •  Eventual / Tunable Consistency
     •  Key-Name-Value Data Store (Column Based)
     •  Data Modeling Based On Core Queries
     •  All Rows in a CF (Column Family) Typically Don't Live On 1 Server
     •  However, All Columns For a Row Do
     •  RDBMS Mindset
     •  No Ad Hoc Queries
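
The two placement bullets on slide 10 (rows are spread across servers, but all columns of one row stay together) can be illustrated with a minimal sketch. This is not Cassandra's actual partitioner: the node names, the `owner` and `place` helpers, and the hash-modulo scheme are hypothetical simplifications of a token ring, used only to show that placement depends on the row key alone.

```python
import hashlib

# Hypothetical node names; the slides only say the ring has 4 servers (+ 1).
NODES = ["node-a", "node-b", "node-c", "node-d"]


def owner(row_key: str) -> str:
    """Pick a node from the row key alone (simplified stand-in for a token ring)."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    token = int.from_bytes(digest, "big")
    return NODES[token % len(NODES)]


def place(cells):
    """Group (row_key, column, value) cells by the node that owns each row key."""
    placement = {node: {} for node in NODES}
    for row_key, column, value in cells:
        placement[owner(row_key)].setdefault(row_key, {})[column] = value
    return placement


# Illustrative data only; these are not the Writing Space column names.
cells = [
    ("doc-1", "title", "Essay"),
    ("doc-1", "body", "..."),
    ("doc-2", "title", "Report"),
]
placement = place(cells)
for node, rows in placement.items():
    print(node, rows)
```

Because the column name never enters the hash, every column of `doc-1` lands on the same node, while different row keys may land on different nodes. That is why queries are modeled around the row key.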
  11. What Is Consistency?
     •  Write Consistency: Number Of Replicas Written To
     •  Read Consistency: Number Of Replicas Queried
     •  Replication Factor: Number Of Replicas For A Row
     •  Quorum Consistency Level (Read And Write):
        o  Option In Specifying Read/Write Consistency
        o  (Replication_Factor / 2) + 1, With The Division Rounded Down
        o  Ensures Strong Consistency
        o  While Maintaining High Availability
     •  With 4 Servers, Writing Space uses:
        o  Replication Factor = 3
        o  Read and Write Quorum Consistency
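
The quorum arithmetic on slide 11 can be worked through in a short sketch. The function names here are illustrative, not part of any Cassandra API. The overlap rule used (replicas read plus replicas written must exceed the replication factor) is the usual reason quorum reads combined with quorum writes are described as strongly consistent.

```python
def quorum(replication_factor: int) -> int:
    """Quorum size from the slide: (replication_factor / 2) + 1, rounded down."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor: int, read_replicas: int,
                           write_replicas: int) -> bool:
    """True when every read set must overlap every write set in at least one replica."""
    return read_replicas + write_replicas > replication_factor


rf = 3                      # replication factor stated on the slide
q = quorum(rf)              # replicas that must answer a quorum read or write
print(q)                                   # 2
print(reads_see_latest_write(rf, q, q))    # True
print(reads_see_latest_write(rf, 1, 1))    # False
print(rf - q)                              # 1 replica can be down
```

With a replication factor of 3, a quorum is 2 replicas, so reads and writes keep succeeding while one replica of a row is unavailable.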
  12. What's Not In Cassandra...
     Typical RDBMS Features Not Available (Yet):
     •  Referential Integrity Constraints / Foreign Keys
     •  Commit / Rollback
     •  Stored Procedures
     •  Joins
     •  Views
     •  Triggers
     •  Functions
     •  Security Privileges
     •  Rules
     •  Partitioned Table Definitions
  13. Cassandra In Writing Space
  14. Document Versioning...
  15. How We Modeled Our Data...
     Storage Strategy: Document-oriented
     (Diagram relationship labels: 1:M, 1:1)
  16. The Writing Space DB Infrastructure
  17. The Hardware
     •  Many Inexpensive Servers (Actually 4 + 1)
     •  Our Configuration:
        Processor: Xeon E5630, 2.53GHz, 4 Cores
        Memory: 96 GB
        Storage:
           Two Mirrored Spinning Disks For OS / Binaries
           Three Striped 480GB Solid State Drives
           (Providing 1.3 TB Local DB Storage)
     •  Peer to Peer Ring
     •  Hot Swappable - Fault Tolerant
     •  "What's Your Insurance Company?"
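
The storage figure on slide 17 can be checked with a little arithmetic. The sketch assumes that "480GB" is a decimal (vendor) size and that the quoted "1.3 TB" is in binary units. The slides do not say this, and filesystem overhead could also account for part of the difference.

```python
DRIVES = 3                 # three striped SSDs, per the slide
DRIVE_SIZE_GB = 480        # assumed to be decimal gigabytes (10**9 bytes)

raw_bytes = DRIVES * DRIVE_SIZE_GB * 10**9
raw_tb_decimal = raw_bytes / 10**12      # terabytes as drive vendors count them
raw_tib_binary = raw_bytes / 2**40       # tebibytes as most operating systems report

print(round(raw_tb_decimal, 2))   # 1.44
print(round(raw_tib_binary, 2))   # 1.31
```

Striping adds the drive capacities together with no redundancy, so the per-node capacity is the full sum and losing one SSD loses the stripe set. Under this setup, protection against that loss would come from Cassandra's replication rather than from the disks.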
  18. Why DataStax Cassandra?
     •  A Certified, Production Ready Version Of Cassandra
     •  24/7 World Class Support
     •  Integration With Hadoop
     •  Integration With Solr
     •  OpsCenter (Multi-Data Center Management Tool)
  19. Performance
     •  Doc Store and UI
     •  Load: 3x Anticipated Load
     •  Total Time Of Run: 1.75 hours
     •  Max Document Size: 10k (25k, 50k and 75k DS)
     Results
        Average Response Time: < 300ms
        Maximum Running Vusers: 684
        Total Throughput (bytes): 7,176,727,121
        Average Throughput (bytes/sec): 1,993,535
        Total Hits: 342,833
        Average Hits per Second: 95
        DB Server CPU < 0.3%
  20. Performance
     •  Document Store only
     •  Load: 100x Anticipated Load
     •  Total Time Of Run: 1 hour
     •  Document Size: 25k, 50k and 75k
     Results
        Average Response Time: < 100ms
        Maximum Running Vusers: 2,200
        Total Throughput (bytes): 2,291,522,553
        Average Throughput (bytes/sec): 565,808
        Total Hits: 834,640
        Average Hits per Second: 206
        DB Server CPU < 1%
  21. Wrapping It Up
  22. Cloud Decision Points
     •  Cost Savings
     •  Continuous Availability
     •  Performance / Dynamic (Elastic) Scalability
     •  Global Distribution Of Access Points
     •  Redundancy
     •  Disaster Recovery
     •  Resiliency To Node / Connectivity Losses A Must
  23. What Would We Do Differently?
     •  Think About Reporting Up Front
     •  Data Analytics – Hadoop and Solr Are Heavy Duty
     •  More Expensive Hardware?
     •  Different RAID Configuration (Not Striping)
     •  Get Training – Especially About Schema Design
  24. Final Thoughts...
     Consider The Human Element...
     •  Mind Shift For RDBMS Folks
     •  Need To “Let Go” Of The Idea That Data Must Be Normalized
     •  Experience Of Operations Team
     •  Netflix - 4 People Managing 800+ Nodes
     Global Enterprise
     •  Global Presence
     •  Disaster Recovery
     •  Internet Scale
  25. Writing Space and the Cassandra NoSQL DBMS
     Thank you! Questions?
     Brian.King@Pearson.com
