Hadoop Summit Japan 2011 Fall - LT by IBM

Data Discovery Tool for BigInsights (on top of Hadoop) - MapReduce with no coding.


Published in: Technology



  • 1. Data Discovery Tool BigSheets: MapReduce with No Coding?
      Atsushi Tsuchiya, Big Data Tiger Team, IBM Software
  • 2. Looking at Data
      • What would you do with Big data?
      • How to make use of it?
      • It is difficult! – too vague.
          • No specific problem that needs to be solved.
          • No specific question that needs to be answered.
      • All you know is that you need to improve the business.
      • But you have *data*.
      • So, what would you do first? Looking at Data!
  • 3. IBM with Hadoop
      • IBM has been working with the open source community for a long time.
          – Eclipse, Hadoop and so on …
      • BigInsights includes Hadoop.
  • 4. BigInsights
      • BigInsights is IBM's Hadoop product for Big data analytics.
          – Basic Edition (up to 10TB) – Free to use!
          – Enterprise Edition
      • Next version of BigInsights coming soon.
          – v1.2 available.
      • And many more
  • 5. BigInsights Components
      • BigInsights includes:
          – IBM Java
          – JAQL - a language developed by IBM (open source)
          – IBM Distribution of Hadoop
          – BigSheets - data exploration tool
          – FLEX scheduler for Adaptive MapReduce
          – Orchestrator (Workflow Engine)
          – SystemT (Text Analytics), SystemML (Machine Learning)
          – LDAP
          – Web Console / Developer Studio
  • 6. BigInsights – Basic Edition
      (Slide callout: "Will be updated in Nov release.")
      Function | Version | Basic Edition | Enterprise Edition
      Integrated install | | Inc | Inc
      Open source components:
      Hadoop (including common utilities, HDFS, MapReduce framework) | 0.20.2 | Inc | Inc
      Jaql (programming / query language) | 0.5.2 | Inc | Inc
      Pig (programming / query language) | 0.7 | Inc | Inc
      Flume (data collection/aggregation) | 0.9.1 | Inc | Inc
      Hive (data summarization/querying) | 0.5 | Inc | Inc
      Lucene (text search) | 3.0.2 | Inc | Inc
      Zookeeper (process coordination) | 3.2.2 | Inc | Inc
      Avro (data serialization) | 1.3.0 | Inc | Inc
      HBase (real time read/write) | 0.20.6 | Inc | Inc
      Oozie (workflow / job orchestration) | 2.2.2 | Inc | Inc
      Online documentation | | Inc | Inc
      Capability to integrate with DB2, InfoSphere Warehouse: two DB2 UDFs to submit jobs and read results from BigInsights | | Inc | Inc
  • 7. BigInsights – Enterprise Edition
      Function | Basic Edition | Enterprise Edition
      R Connector: Jaql module to invoke R statistical capabilities from BigInsights | n/a | Inc
      Netezza Connector: Jaql modules to read/write data from/to Netezza | n/a | Inc
      LDAP | n/a | Inc
      Web Console | n/a | Inc
      Workflow Engine | n/a | Inc
      Scheduler (Orchestrator) | n/a | Inc
      Text Analytics Module (System T) | n/a | Inc
      Eclipse support (for System T)* | n/a | Inc
      BigSheets – Data Discovery Tool | n/a | Inc
      IBM Optim Development Studio V2.2.1.0 | n/a | Inc
      Support by IBM | n/a | Inc
  • 8. BigSheets
      • A data exploration tool for Hadoop
      • Only comes with BigInsights Enterprise Edition
  • 9. BigSheets Concept Model
      No Coding is Required!
      [Diagram labels: Internet, Intranet, Logs, Other; Gather; Gather Massive; BigInsights; BigSheets; Get/Manipulate; Explore & Analyze (Enrich, Inspect, Explore); Publish Results]
  • 10. It's like a spreadsheet.
      Looks very familiar?!
  • 11. Visualizations
      • Predefined visualization
      • Customer Plug-in
      [Example chart: the number of coffee shops in each state in North America.]
  • 12. DEMO
  • 13. [Diagram: Internet, Intranet, Logs, Other; Gather; BigInsights; BigSheets]
      • BigInsights can gather data from
          – Predefined formats:
              • BigSheets data reader
              • Basic crawler data reader
              • Basic crawler data reader (binary support)
              • Character-delimited data reader
              • Tab Separated Value (TSV) data reader
              • JavaScript Object Notation (JSON) array reader
              • Comma Separated Value (CSV) data reader
          – Customer BigSheets Reader
  • 14. [Diagram: Internet, Intranet, Logs, Other; Gather; BigInsights; BigSheets]
      • BigInsights can import structured and unstructured data
          – CSV
          – Files
          – Network
              • http
              • hdfs
              • AWS (S3n/S3)
          – Other
              • Customer Importer
  • 15. [Diagram: Internet, Intranet, Logs, Other; Collection; BigInsights; BigSheets]
      A complete list of McDonald's in North America.
  • 16. [Diagram: Internet, Intranet, Logs, Other; BigInsights; BigSheets; steps: Import, Calculate, Reformat]
      A complete list of McDonald's in North America.
  • 17. [Diagram: Internet, Intranet, Logs, Other; BigInsights; BigSheets; visualizations: Column chart, Heat map]
  • 18. BigSheets in Action
      • Blockbuster: movie sales forecasting
          – From ABC News
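  • 19. Blockbuster – movie sales forecasting
      IBM BigInsights/BigSheets
      (1) Takes in a feed of the tweets posted over the weekend (about 200,000),
      (2) and within a few hours (until now, this had to wait until Monday morning) produces:
          – a sales forecast chart
          – sentiment analysis
      For example, this summer X-man was more popular (tweeted about more) than any other film → promotion and screening strategies can be adjusted frequently.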
  • 20. Conclusion
      • We all need to improve the business.
      • So, where would you start with Big data?
      Data Discovery is a key to start improving YOUR Business!
  • 21. Thank you!
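
As a rough illustration of the coding that slides 15 to 17 avoid, the per-state store count behind the column chart and heat map can be sketched as a tiny in-memory MapReduce in Python. This is a sketch under stated assumptions, not how BigSheets or BigInsights is implemented: the sample rows, the column names (name, city, state) and every function name below are hypothetical, and a real job would run on Hadoop rather than in a single process.

```python
import csv
import io
from collections import defaultdict
from itertools import chain

# Hypothetical sample data; the real collection used in the talk is not in the slides.
SAMPLE_CSV = """name,city,state
Store A,Seattle,WA
Store B,Spokane,WA
Store C,Portland,OR
Store D,Austin,TX
Store E,Dallas,TX
Store F,Houston,TX
"""


def read_csv(text):
    """Import step: parse comma-separated text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def map_phase(row):
    """Map step: emit one (state, 1) pair for each store row."""
    yield row["state"], 1


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one state."""
    return key, sum(values)


def stores_per_state(text):
    """Run import, map, shuffle and reduce, returning {state: store count}."""
    rows = read_csv(text)
    pairs = chain.from_iterable(map_phase(row) for row in rows)
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    # Print one line per state, sorted by state code, ready to feed a chart.
    for state, count in sorted(stores_per_state(SAMPLE_CSV).items()):
        print(state, count)
```

The resulting dictionary of counts per state is the kind of table a column chart or heat map would be drawn from; the point of the talk is that BigSheets lets an analyst reach that table from a spreadsheet-like interface without writing the map and reduce steps by hand.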