ZettaVox: Content Mining and Analysis Across Heterogeneous Compute Clouds (Hadoop Summit 2010)

Hadoop Summit 2010 - Application Track
ZettaVox: Content Mining and Analysis Across Heterogeneous Compute Clouds
Mark Davis, Kitenga

Published in: Technology

    1. ZettaVox: Content Mining and Analysis across Heterogeneous Compute Clouds. Mark Davis, Kitenga, Inc.
    2. Session Agenda: The Company; The Problem; The Solution; Demo
    3. Kitenga [1,2]: (Maori) a view or perception
       • 2004-present
       • CTO: Mark Davis (InXight Software (Business Objects/SAP), Microsoft, Defense R&D)
       • CEO: Anil Uberoi (Lucid Imagination, Amdocs, Sun)
       • Solutions for Information Overload
       • 2953 Bunker Hill Lane, Santa Clara, CA
       [1] Kitenga is also a region in Uganda. [2] Kitenga is also a bed-and-breakfast in Clevedon, Auckland.
    4. Support Prediction Logic, Inc.
    5. The Never-Ending Problem: Multimedia Data, Video, Imagery, Audio, Sensor Streams, Biometric data, 3D, Text, Email, Web pages, Tweets, Posts, Enterprise Data, Enterprise data, CDRs, Financial records, Access logs
    6. Solving the Problem is Hard: Content mining analysts, Machine learning specialists, Information retrieval specialists, Software Engineers, Expensive and hard to find, Parallel Supercomputers, Racked clusters, Systems management, Enterprise storage solutions, Gigabit switches, Power management, Text analytics, Ontologies, Database reporting tools, ETL tools, Business intelligence, Open source components
    7. Convert raw data into actionable intelligence. Defense Intelligence: Situation Reports, Geotagged Imagery. Improve Force Effectiveness. ZettaVox: Named Entity Extraction, Image tagging, Video analytics, Linkage Analysis, Network Visualization, Search. Hadoop, GPUs, HDFS, HBase, Solr
    8. Increase speed of drug discovery. Pharmaceutical R&D: Patents, Genetic Sequence Data, Journal Articles. Faster Discovery. ZettaVox: Biological Named Entity Extraction, Author Name Extraction and Normalization, Linkage Analysis, Timelines, Faceted Search. Hadoop, HDFS, HBase, GPUs, Solr
    9. ZettaVox
       • Compose analysis workflows using out-of-the-box components
       • Interact with HDFS/Hadoop through Rich Internet Application
       • Monitor system progress
       • Visualize and analyze results
       • Batch mode via XML and JSON
       • Heterogeneous compute resources
    10. Heterogeneous Compute Clouds: 42 U ≈ 84-168 cores; 2 PCIe slots; 15 multiprocessors; 480 cores; $0.13-$0.35/Gflop; Amazon AWS; Rackspace Mosso; Private Cloud
    11. Author Analysis Solutions
    12. Interact with HDFS
    13. Monitor Analysis Jobs
    14. Use and Visualize Results
    15. ZettaVox vs. Current Approach
       • Current Approach: Slow analytics, Methods don’t scale, Expensive hardware, Expensive software, Capital investment, Expertise investment
       • ZettaVox: Hadoop with GPU support, Scalable SaaS, Out-of-the-box expertise, Rich user experience
       • ZettaVox: Internet-scale cloud and cluster-based content mining
    16. Questions? Mark Davis, [email_address]