Big Data Cloud June 3rd Meetup - Presentation by Mark Davis


  1. Big Data Cloud Meetup: Big Data & Cloud Computing - Help, Educate & Demystify. June 3rd 2011
  2. Kitenga, Mark Davis, CTO. June 3rd 2011 Meetup. Unlocking Big Data through Analytics and Search
  3. Big Data: Enormous transactional data; Enormous unstructured information; Too big for databases; New tools are needed
  4. Unstructured data explosion: Multimedia Content; Text; Imagery; Audio; Video; Sensor Streams; Biometric data; 3D; Text; Email; Documents; Web pages; Tweets; Posts; <5% Structured Enterprise Data; Data warehouse; CDRs; Financial records; Access logs
  5. Big Data: Trillions of user interactions/transactions == Big Data; >100M; <10M; <1M; Open source; MySQL; PHP; Data warehousing; Parallel SQL; Big hardware; NoSQL; Hadoop/MapReduce; HBase/Hive; Emerging technologies; Traditional (DBMS-based) solutions
  6. The Structured/Unstructured Chasm: SQL; RDBMS; Transactional Data; BI Tools; Search; Documents; Text Classification; Taxonomies; Ontologies
  7. Unstructured Analytics: Surfacing Metadata
  8. Information Extraction: Machine Learning; Finite State Transducer (x3); Part-of-Speech Tagging; Lemmatization; Tokenization
  9. Search + Analytics: Resource Integration; Facet Browsing; Facet Charting; Autosuggest; Spellcheck; Query Language; Indexing; Metadata Extraction
  10. Defense Intelligence: Analyst support staff needs to convert raw data into actionable intelligence; Named Entity Extraction; Image tagging; Video analytics; Linkage Analysis; Network Visualization; Search; Improve Force Effectiveness; Hadoop/MapReduce, GPUs, HDFS, HBase, Solr; Situation Reports; Geo-tagged Imagery; US Army; Navy; DHS; NSA
  11. CASE STUDY: US ARMY
      Analysis Bottlenecks: 200 data feeds; Unacceptable response time; Analysts avoid complete searches; Basic entity extraction; Slow analysis cycles; Distribution by PowerPoint. Enabling technologies: Oracle and custom thick clients
      The Solution: >200 data feeds; <0.5s queries; Fast analysis cycles; Machine Learning; Analytics; Biometrics; Linkage Analysis; Face recognition; Video tagging; Collaborative systems. Enabling technologies: GPU clouds, Hadoop/MapReduce, Katta, Lucene, NoSQL, HBase
  12. Pharma Bioinformatics: Increase speed of drug discovery; Biological Named Entity Extraction; Author Name Extraction and Normalization; Linkage Analysis; Timelines; Faceted Search; ZettaVox; Faster Discovery; Hadoop/MapReduce, HDFS, HBase, GPUs, Solr; Patents; Genetic Sequence Data; Journal Articles
  13. PharmaTreemap
  16. Demo
  18. Summary: Big Data spans unstructured and structured data; Effective tools for managing both require understanding their differences and similarities; Bridging the chasm between them means merging search and analytics
  19. Questions?
  20. Contact Info: mark@kitenga.com; http://www.kitenga.com; Kitenga, Inc., 2953 Bunker Hill Lane, Suite 400, Santa Clara, CA 95054; 1-(408)-462-KITE; 1-(253)-541-6799 (FAX)
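
The slides on Information Extraction and Search + Analytics name tokenization, metadata extraction, indexing, and facet browsing without showing how they fit together. The following is a minimal sketch of that combination in plain Python, not the presenter's system: the sample documents, the `TECH_TERMS` lookup set (a stand-in for a real entity extractor), and all function names are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical sample documents; not taken from the talk.
DOCS = {
    1: "Hadoop jobs index journal articles about gene sequencing.",
    2: "Analysts search patents and journal articles with Solr.",
    3: "Hadoop and HBase store sensor streams for later search.",
}

# Hypothetical lookup set standing in for a real entity extractor.
TECH_TERMS = {"hadoop", "hbase", "solr"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_metadata(tokens):
    """Return the set of known technology names found in the tokens."""
    return {t for t in tokens if t in TECH_TERMS}


def build_index(docs):
    """Build an inverted index (term -> doc ids) plus per-document metadata."""
    inverted = defaultdict(set)
    metadata = {}
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        for tok in tokens:
            inverted[tok].add(doc_id)
        metadata[doc_id] = extract_metadata(tokens)
    return inverted, metadata


def search_with_facets(query, inverted, metadata):
    """Match documents containing all query terms, then count facet values."""
    terms = tokenize(query)
    if not terms:
        return [], Counter()
    hits = set.intersection(*(inverted.get(t, set()) for t in terms))
    facets = Counter(tag for doc_id in hits for tag in metadata[doc_id])
    return sorted(hits), facets


inverted, metadata = build_index(DOCS)
hits, facets = search_with_facets("journal articles", inverted, metadata)
print(hits, dict(facets))
```

The search step returns matching document ids, and the facet counts summarize the extracted metadata over just those hits, which is the kind of merged search-and-analytics view the Summary slide describes. A production system would replace the lookup set with a trained extractor and the in-memory dictionaries with an engine such as Lucene or Solr.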