Cloudera Federal Forum 2014: The REDDISK Big Data Architecture

Paul Brown, CEO of Koverse, shares the story of Accumulo and how the project is applied to Hadoop and Big Data.

Published in: Technology, Business


  1. Red Disk, and Some Thoughts on Big Data in the Government. Paul Brown, Koverse, Inc.
  2. Accumulo Origin Story (Paul’s Version)
     • Thinking was:
        • We were way behind the curve
        • Data unification was the only way to survive
        • Google’s architecture is proven to scale and the design is available
     • Need to prove as soon as possible:
        • Scale/Unification in real world scenarios
        • Mission Impact
     • What we Learned along the way:
        • Needed Secure Indexes across datasets
        • “Productization” is critical to scaling success
        • We are way ahead of the curve…
  3. Why Accumulo and Hadoop
     • Interactive Query at Scale
     • Adaptive Schemas
     • Heterogeneous Data
     • Bulk Processing
     • Multiple Versions
  4. Adoption of Big Data
     • Home Grown (pre 2008)
     • Open Source
     • GOTS
     • COTS
  5. GOTS Phase
     • Goals:
        • Lower complexity
        • Mission Impact
        • Repeatability
     • Mission Impact
     • Sources and Methods
     • Technology
     • Core Principles
  6. Red Disk
     • Goals:
        • Lower the complexity and time associated with operationalizing data
        • “Product”: repeatable, documented, general purpose
        • Interoperability between systems
  7. Red Disk
     • RPMs
     • Key: New Apps, Existing Apps
     • Node Types: Hadoop/Accumulo, Red Disk, JBOSS, STORM
     • Hadoop and Accumulo
  8. Red Disk API -> UCD API
     • Pre-processing and data ingest: Storm
     • Bulk Analytics: MapReduce Input/Output Formats
     • CRUD and Query: REST services
  9. Red Disk
     • Kafka
     • DPF (UCD API)
     • Mission Apps
     • Storm Ingest
     • Analytics – NLP, etc
     • UCD Ingest / Query API
     • Raw Data
     • Indexing
     • Providers: Koverse, GAIA, etc
     • Accumulo, HDFS, etc
  10. UCD logical structure
     • UCD API Objects: Person, Place, Organization, Artifacts
     • Terms: Bob, Father Of, Joe
     • Statements: Bob Father of Joe; Bobby AKA Bob
  11. Review
     • Questions…
        • Red Disk
        • Accumulo
        • Anything else
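
The example statements on the UCD logical structure slide ("Bob Father Of Joe", "Bobby AKA Bob") read naturally as subject/predicate/object triples over terms. The slides do not show the actual UCD API, so the following is only a minimal sketch of that reading: the names `Statement`, `StatementStore`, `add`, and `about` are hypothetical, and the in-memory dictionary merely stands in for whatever index a real provider would keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    """One assertion linking two terms, e.g. Bob -[Father Of]-> Joe."""
    subject: str
    predicate: str
    obj: str


class StatementStore:
    """Hypothetical in-memory stand-in for a statement index (not the UCD API)."""

    def __init__(self):
        # Maps a lower-cased term to the statements that mention it.
        self._by_term = {}

    def add(self, subject, predicate, obj):
        statement = Statement(subject, predicate, obj)
        # Index under both subject and object so either side can be looked up.
        for term in {subject.lower(), obj.lower()}:
            self._by_term.setdefault(term, []).append(statement)
        return statement

    def about(self, term):
        """Return every statement that mentions the term as subject or object."""
        return list(self._by_term.get(term.lower(), []))


store = StatementStore()
store.add("Bob", "Father Of", "Joe")
store.add("Bobby", "AKA", "Bob")

for statement in store.about("Bob"):
    print(statement.subject, statement.predicate, statement.obj)
```

Looking up "Bob" returns both statements, which is the kind of cross-dataset unification the talk describes: the kinship statement and the alias statement are reachable from the same term.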
