Cloudera Federal Forum 2014: The REDDISK Big Data Architecture

Koverse CEO Paul Brown shares the story of Accumulo and how the project is applied to Hadoop and Big Data.

Published in: Technology, Business

Transcript

  • 1. Red Disk, and Some Thoughts on Big Data in the Government (Paul Brown, Koverse, Inc.)
  • 2. Accumulo Origin Story (Paul’s Version)
      • Thinking was:
          • We were way behind the curve
          • Data unification was the only way to survive
          • Google’s architecture is proven to scale and the design is available
      • Need to prove as soon as possible:
          • Scale/Unification in real-world scenarios
          • Mission Impact
      • What we learned along the way:
          • Needed secure indexes across datasets
          • “Productization” is critical to scaling success
          • We are way ahead of the curve…
  • 3. Why Accumulo and Hadoop
      • Interactive Query at Scale
      • Adaptive Schemas
      • Heterogeneous Data
      • Bulk Processing
      • Multiple Versions
  • 4. Adoption of Big Data
      • Home Grown (pre 2008)
      • Open Source
      • GOTS
      • COTS
  • 5. GOTS Phase
      • Goals:
          • Lower complexity
          • Mission Impact
          • Repeatability
      • [Diagram labels: Mission Impact, Sources and Methods, Technology, Core Principles]
  • 6. Red Disk
      • Goals:
          • Lower the complexity and time associated with operationalizing data
          • “Product”: repeatable, documented, general purpose
          • Interoperability between systems
  • 7. Red Disk
      • RPMs
      • Node Types: Hadoop/Accumulo, JBOSS, STORM
      • [Diagram labels: Key (New Apps, Existing Apps), Red Disk, Hadoop and Accumulo]
  • 8. Red Disk API -> UCD API
      • Pre-processing and data ingest: Storm
      • Bulk Analytics: MapReduce Input/Output Formats
      • CRUD and Query: REST services
  • 9. Red Disk
      • [Architecture diagram labels: Kafka; DPF (UCD API); Mission Apps; Storm Ingest; Analytics (NLP, etc.); UCD Ingest / Query API; Raw Data; Indexing; Providers: Koverse, GAIA, etc.; Accumulo, HDFS, etc.]
  • 10. UCD logical structure
      • [Diagram labels: UCD API; Objects (Person, Place, Organization, Artifacts); Terms (Bob, Joe, Father Of); Statements (Bob Father Of Joe; Bobby AKA Bob)]
  • 11. Review
      • Questions…
          • Red Disk
          • Accumulo
          • Anything else
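
The UCD logical structure on slide 10 (objects, terms, statements) can be pictured with a small in-memory model. The slides do not describe the real UCD API, so everything below is a hypothetical sketch: the names `UcdObject`, `Statement`, `UcdStore`, and `build_example` are invented for illustration. It also rests on one reading of the diagram, namely that a statement is a subject/predicate/object triple over terms and that "AKA" links an alias to a stored object.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UcdObject:
    """A typed entity, e.g. the Person named "Bob" (hypothetical shape)."""
    name: str
    kind: str


@dataclass(frozen=True)
class Statement:
    """A subject/predicate/object triple, e.g. Bob / Father Of / Joe."""
    subject: str
    predicate: str
    obj: str


class UcdStore:
    """Toy in-memory store; not the real UCD API."""

    def __init__(self):
        self.objects = {}       # name -> UcdObject
        self.statements = []    # list of Statement

    def put_object(self, name, kind):
        self.objects[name] = UcdObject(name, kind)

    def put_statement(self, subject, predicate, obj):
        self.statements.append(Statement(subject, predicate, obj))

    def terms(self):
        """Every distinct term used in any statement, sorted."""
        found = set()
        for s in self.statements:
            found.update((s.subject, s.predicate, s.obj))
        return sorted(found)

    def resolve(self, name):
        """Follow "AKA" statements from an alias to a stored object, if any."""
        seen = set()
        while name not in self.objects and name not in seen:
            seen.add(name)
            targets = [s.obj for s in self.statements
                       if s.subject == name and s.predicate == "AKA"]
            if not targets:
                return None
            name = targets[0]
        return self.objects.get(name)

    def query(self, subject=None, predicate=None):
        """Return statements matching the given subject and/or predicate."""
        return [s for s in self.statements
                if (subject is None or s.subject == subject)
                and (predicate is None or s.predicate == predicate)]


def build_example():
    """Load the labels shown on slide 10 into the toy store."""
    store = UcdStore()
    store.put_object("Bob", "Person")
    store.put_object("Joe", "Person")
    store.put_statement("Bob", "Father Of", "Joe")
    store.put_statement("Bobby", "AKA", "Bob")
    return store


if __name__ == "__main__":
    store = build_example()
    print(store.resolve("Bobby"))
    print(store.query(subject="Bob", predicate="Father Of"))
```

In this sketch, `put_object` and `put_statement` stand in for the ingest side and `resolve` and `query` for the query side; slide 8 says the real system exposes CRUD and query through REST services, which this toy model does not attempt to reproduce.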
