ZettaVox: Content Mining and Analysis Across Heterogeneous Compute Clouds__HadoopSummit2010

Hadoop Summit 2010 - Application Track
ZettaVox: Content Mining and Analysis Across Heterogeneous Compute Clouds
Mark Davis, Kitenga

Published in: Technology
  • Transcript

    • 1. ZettaVox: Content Mining and Analysis across Heterogeneous Compute Clouds
      • Mark Davis
      Kitenga, Inc.
    • 2. Session Agenda
      • The Company
      • The Problem
      • The Solution
      • Demo
    • 3.
      • Kitenga [1, 2]: (Maori) A view or perception
        • 2004-present
        • CTO: Mark Davis, InXight Software (Business Objects/SAP), Microsoft, Defense R&D
        • CEO: Anil Uberoi, Lucid Imagination, Amdocs, Sun
        • Solutions for Information Overload
      • [1] Kitenga is also a region in Uganda
      • [2] Kitenga is also a bed-and-breakfast in Clevedon, Auckland
      • 2953 Bunker Hill Lane, Santa Clara, CA
    • 4. Support Prediction Logic, Inc.
    • 5. The Never-Ending Problem
      • Multimedia Data
        • Video
        • Imagery
        • Audio
        • Sensor Streams
        • Biometric data
        • 3D
      • Text
        • Email
        • Web pages
        • Tweets
        • Posts
      • Enterprise Data
        • CDRs
        • Financial records
        • Access logs
    • 6. Solving the Problem is Hard
      • Content mining analysts, machine learning specialists, information retrieval specialists, software engineers: expensive and hard to find
      • Parallel supercomputers, racked clusters, systems management, enterprise storage solutions, gigabit switches, power management
      • Text analytics, ontologies, database reporting tools, ETL tools, business intelligence, open source components
    • 7. Defense Intelligence
      • Convert raw data into actionable intelligence
      • Situation Reports, Geotagged Imagery
      • Improve Force Effectiveness
      • ZettaVox: Named Entity Extraction, Image Tagging, Video Analytics, Linkage Analysis, Network Visualization, Search
      • Hadoop, GPUs, HDFS, HBase, Solr
    • 8. Pharmaceutical R&D
      • Increase speed of drug discovery
      • Patents, Genetic Sequence Data, Journal Articles
      • Faster Discovery
      • ZettaVox: Biological Named Entity Extraction, Author Name Extraction and Normalization, Linkage Analysis, Timelines, Faceted Search
      • Hadoop, HDFS, HBase, GPUs, Solr
    • 9. ZettaVox
      • Compose analysis workflows using out-of-the-box components
      • Interact with HDFS/Hadoop through a Rich Internet Application
      • Monitor system progress
      • Visualize and analyze results
      • Batch mode via XML and JSON
      • Heterogeneous compute resources
    • 10. Heterogeneous Compute Clouds
      • 42 U ≈ 84-168 cores
      • 2 PCIe slots
      • 15 multiprocessors
      • 480 cores
      • $0.13-$0.35/Gflop
      • Amazon AWS
      • Rackspace Mosso
      • Private Cloud
    • 11. Author Analysis Solutions
    • 12. Interact with HDFS
    • 13. Monitor Analysis Jobs
    • 14. Use and Visualize Results
    • 15. ZettaVox: Internet-scale cloud and cluster-based content mining
      • Current Approach
        • Slow analytics
        • Methods don’t scale
        • Expensive hardware
        • Expensive software
        • Capital investment
        • Expertise investment
      • ZettaVox
        • Hadoop with GPU support
        • Scalable SaaS
        • Out-of-the-box expertise
        • Rich user experience
    • 16. Questions?
      • Mark Davis
      • [email_address]
