Data Discovery Tool
BigSheets
MapReduce with No Coding?
Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
Big Data Tiger Team
IBM Software
Looking at Data
• What would you do with Big data?
• How to make use of it?
• It is difficult! – too vague.
   • No specific problem that needs to be solved.
   • No specific question that needs to be answered.
• All you know is that you need to improve the business.
• But you have *data*
• So, what would you do first?
                Looking at Data!
IBM with Hadoop
• IBM has been working with the open source community for a long time.
  – Eclipse, Hadoop and so on …

• BigInsights includes Hadoop
BigInsights
• BigInsights is IBM's Hadoop product for Big data analytics.
  – Basic Edition (up to 10TB) – Free (can be used at no charge!)
  – Enterprise Edition

• Next version of BigInsights coming soon.
  – v1.2 available.

• And many more
BigInsights Components
• BigInsights includes:
  –   IBM Java
  –   JAQL           - language developed by IBM (open source)
  –   IBM Distribution of Hadoop
  –   BigSheets      - data exploration tool
  –   FLEX scheduler for Adaptive MapReduce
  –   Orchestrator (Workflow Engine)
  –   SystemT (Text Analytics), SystemML (Machine Learning)
  –   LDAP
  –   Web Console / Developer Studio
BigInsights – Basic Edition
Function                                                          Version*   Basic Edition   Enterprise Edition
Integrated Install                                                           Inc             Inc
Open Source components:
Hadoop (including common utilities, HDFS, MapReduce framework)    0.20.2     Inc             Inc
Jaql (programming / query language)                               0.5.2      Inc             Inc
Pig (programming / query language)                                0.7        Inc             Inc
Flume (data collection/aggregation)                               0.9.1      Inc             Inc
Hive (data summarization/querying)                                0.5        Inc             Inc
Lucene (text search)                                              3.0.2      Inc             Inc
Zookeeper (process coordination)                                  3.2.2      Inc             Inc
Avro (data serialization)                                         1.3.0      Inc             Inc
HBase (real time read/write)                                      0.20.6     Inc             Inc
Oozie (workflow / job orchestration)                              2.2.2      Inc             Inc
Online documentation                                                         Inc             Inc
Capability to integrate with DB2, InfoSphere Warehouse                       Inc             Inc
 Two DB2 UDFs to submit jobs, and read results from BigInsights
* Versions will be updated in the Nov release.
BigInsights – Enterprise Edition
Function                                                             Basic Edition   Enterprise Edition
R Connector
 Jaql module to invoke R statistical capabilities from BigInsights   n/a             Inc
Netezza Connector
 Jaql modules to read/write data from/to Netezza                     n/a             Inc
LDAP                                                                 n/a             Inc
Web Console                                                          n/a             Inc
Workflow Engine                                                      n/a             Inc
Scheduler (Orchestrator)                                             n/a             Inc
Text Analytics Module (System T)                                     n/a             Inc
Eclipse support (for System T)*                                      n/a             Inc
BigSheets – Data Discovery Tool                                      n/a             Inc
IBM Optim Development Studio V2.2.1.0                                n/a             Inc
Support by IBM                                                       n/a             Inc
BigSheets
• A data exploring tool for Hadoop
• Only comes with BigInsights Enterprise Edition
BigSheets Concept Model
[Diagram: sources (Internet, Intranet, Logs, Other) → Gather → BigSheets; BigSheets steps: Enrich, Inspect, Explore, Get/Manipulate, Publish; Massive Results in BigInsights → Explore & Analyze]
No Coding is Required!
It's like a spreadsheet.

                    Looks very familiar?!
Visualizations
• Predefined visualization
• Custom plug-in

                  The number of coffee shops in each state in North America.
DEMO
[Diagram: Gather. Sources: Internet, Intranet, Logs, Other. Target: BigSheets / BigInsights]

• BigInsights can gather data from
  – Predefined formats:
     •   BigSheets data reader
     •   Basic crawler data reader
     •   Basic crawler data reader (binary support)
     •   Character-delimited data reader
     •   Tab Separated Value (TSV) data reader
     •   JavaScript Object Notation (JSON) array reader
     •   Comma Separated Value (CSV) data reader

  – Custom BigSheets reader
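
As a rough illustration of what readers for these formats do, the sketch below turns CSV, TSV and JSON-array text into the same row-and-column shape. It is plain Python, not BigSheets code; the function names `read_delimited` and `read_json_array` and the sample data are hypothetical.

```python
import csv
import io
import json


def read_delimited(text, delimiter=","):
    """Parse character-delimited text into row dicts keyed by the header line."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_json_array(text):
    """Parse a JSON array of objects into row dicts."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array at the top level")
    return rows


# Hypothetical sample inputs, one per format.
csv_rows = read_delimited("name,state\nShop A,NY\nShop B,CA\n")
tsv_rows = read_delimited("name\tstate\nShop A\tNY\n", delimiter="\t")
json_rows = read_json_array('[{"name": "Shop A", "state": "NY"}]')
```

Whatever the input format, every reader ends with a list of rows that share column names, which is what lets a spreadsheet-style view sit on top of them.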
[Diagram: Gather. Sources: Internet, Intranet, Logs, Other. Target: BigSheets / BigInsights]

• BigInsights can import structured and unstructured data
  – CSV
  – Files
  – Network
     • http
     • hdfs
     • AWS (S3n/S3)
  – Other
     • Custom importer
[Diagram: Collection. Sources: Internet, Intranet, Logs, Other. Target: BigSheets / BigInsights]

A complete list of MacDonald's in North America.
[Diagram: Import → Reformat → Calculate. Sources: Internet, Intranet, Logs, Other. Target: BigSheets / BigInsights]

         A complete list of MacDonald's in North America.
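
In the demo these steps are done in the BigSheets UI without writing code. For comparison, a hand-written version of the same import, reformat and calculate flow might look like the sketch below. It counts stores per state with a map phase and a reduce phase in plain Python; the sample rows, the column names and the function names are all hypothetical and do not come from BigSheets.

```python
from collections import defaultdict

# Import: hypothetical sample rows standing in for an imported store list.
rows = [
    {"name": "Store 1", "address": "New York, NY"},
    {"name": "Store 2", "address": "Albany, NY"},
    {"name": "Store 3", "address": "Los Angeles, CA"},
]


def reformat(row):
    """Reformat: derive a 'state' column from the end of the address."""
    state = row["address"].rsplit(",", 1)[-1].strip()
    return {**row, "state": state}


def map_phase(rows):
    """Map: emit a (state, 1) pair for each row."""
    for row in rows:
        yield row["state"], 1


def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each state."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


# Calculate: number of stores per state.
stores_per_state = reduce_phase(map_phase(reformat(r) for r in rows))
print(stores_per_state)  # {'NY': 2, 'CA': 1}
```

A per-state count like this is the kind of result a column chart or heat map can then display.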
[Diagram: visualizations shown: Column chart, Heat map. Sources: Internet, Intranet, Logs, Other. Target: BigSheets / BigInsights]

BigSheets in Action
• Blockbuster: movie sales forecasting
 – From ABC News
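Blockbuster – Movie sales forecasting
    IBM BigInsights/BigSheets
                 (1) Takes in a feed of the tweets posted over the weekend (about 200,000),
                 (2) and within a few hours (until now, not before Monday morning)
                 - creates a sales forecast chart
                 - runs sentiment analysis
                 For example, this summer X-man was more popular (more tweeted about) than any other title → advertising and screening strategies are adjusted frequently.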
Conclusion

• We all need to improve the business.

• So, where would you start with Big data?

 Data Discovery is a key to start improving YOUR Business!
Thank you!

Hadoop Summit Japan 2011 Fall - LT by IBM

  • 1. Data Discovery Tool BigSheets: MapReduce with No Coding?
      Atsushi Tsuchiya (eAtsuhsi@JP.ibm.com)
      Big Data Tiger Team, IBM Software
  • 2. Looking at Data
      • What would you do with big data?
      • How do you make use of it?
      • It is difficult, because the goal is too vague.
          • No specific problem that needs to be solved.
          • No specific question that needs to be answered.
      • All you know is that you want to improve the business.
      • But you have *data*.
      • So, what would you do first? Look at the data!
  • 3. IBM with Hadoop
      • IBM has been working with the open source community for a long time.
        – Eclipse, Hadoop, and so on
      • BigInsights includes Hadoop.
  • 4. BigInsights
      • BigInsights is IBM's Hadoop product for big data analytics.
        – Basic Edition (up to 10 TB): free to use!
        – Enterprise Edition
      • The next version of BigInsights is coming soon.
        – v1.2 available.
      • And many more
  • 5. BigInsights Components
      • BigInsights includes:
        – IBM Java
        – JAQL: a language developed by IBM (open source)
        – IBM Distribution of Hadoop
        – BigSheets: a data exploration tool
        – FLEX scheduler for Adaptive MapReduce
        – Orchestrator (Workflow Engine)
        – SystemT (Text Analytics), SystemML (Machine Learning)
        – LDAP
        – Web Console / Developer Studio
  • 6. BigInsights – Basic Edition
      (Versions will be updated in the November release. Inc = included.)

      Function                                                        Version   Basic Edition   Enterprise Edition
      Integrated install                                                        Inc             Inc
      Open source components:
        Hadoop (including common utilities, HDFS, MapReduce framework) 0.20.2    Inc             Inc
        Jaql (programming / query language)                            0.5.2     Inc             Inc
        Pig (programming / query language)                             0.7       Inc             Inc
        Flume (data collection / aggregation)                          0.9.1     Inc             Inc
        Hive (data summarization / querying)                           0.5       Inc             Inc
        Lucene (text search)                                           3.0.2     Inc             Inc
        Zookeeper (process coordination)                               3.2.2     Inc             Inc
        Avro (data serialization)                                      1.3.0     Inc             Inc
        HBase (real-time read/write)                                   0.20.6    Inc             Inc
        Oozie (workflow / job orchestration)                           2.2.2     Inc             Inc
      Online documentation                                                      Inc             Inc
      Capability to integrate with DB2, InfoSphere Warehouse                    Inc             Inc
        (two DB2 UDFs to submit jobs and read results from BigInsights)
  • 7. BigInsights – Enterprise Edition

      Function                                                        Basic Edition   Enterprise Edition
      R Connector                                                     n/a             Inc
        (Jaql module to invoke R statistical capabilities from BigInsights)
      Netezza Connector                                               n/a             Inc
        (Jaql modules to read/write data from/to Netezza)
      LDAP                                                            n/a             Inc
      Web Console                                                     n/a             Inc
      Workflow Engine                                                 n/a             Inc
      Scheduler (Orchestrator)                                        n/a             Inc
      Text Analytics Module (System T)                                n/a             Inc
      Eclipse support (for System T)*                                 n/a             Inc
      BigSheets – Data Discovery Tool                                 n/a             Inc
      IBM Optim Development Studio V2.2.1.0                           n/a             Inc
      Support by IBM                                                  n/a             Inc
  • 8. BigSheets
      • A data exploration tool for Hadoop
      • Only comes with BigInsights Enterprise Edition
  • 9. BigSheets Concept Model
      No coding is required!
      [Diagram: BigSheets gathers data from the Internet, intranet, logs, and other sources into BigInsights. The user gets and manipulates the massive results in BigInsights, explores and analyzes them (enrich, inspect, explore), and publishes them.]
  • 10. It's like a spreadsheet. Looks very familiar?!
  • 11. Visualizations
      • Predefined visualizations
      • Custom plug-ins
      [Figure: the number of coffee shops in each state in North America.]
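The coffee shop chart is, at its core, a count of rows per state. BigSheets does this without code; purely as an illustration of the underlying aggregation, here is a minimal Python sketch. The `shops` rows, the `state` column name, and the `count_by_state` helper are all hypothetical and are not part of BigSheets.

```python
from collections import Counter


def count_by_state(rows):
    """Count how many rows fall into each state (hypothetical 'state' column)."""
    return Counter(row["state"] for row in rows)


# Made-up sample rows standing in for a gathered collection of shops.
shops = [
    {"name": "Shop A", "state": "CA"},
    {"name": "Shop B", "state": "NY"},
    {"name": "Shop C", "state": "CA"},
]

counts = count_by_state(shops)

# One line per state, largest count first, as input for a column chart.
for state, n in counts.most_common():
    print(state, n)
```

A column chart or heat map is then just a rendering of these per-state counts.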
  • 12. DEMO
  • 13. Gather
      [Diagram: BigSheets gathers data from the Internet, intranet, logs, and other sources into BigInsights.]
      • BigInsights can gather data from:
        – Predefined formats:
            • BigSheets data reader
            • Basic crawler data reader
            • Basic crawler data reader (binary support)
            • Character-delimited data reader
            • Tab Separated Value (TSV) data reader
            • JavaScript Object Notation (JSON) array reader
            • Comma Separated Value (CSV) data reader
        – Custom BigSheets reader
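To make the text formats in the reader list concrete, the sketch below parses the same two records from CSV, TSV, and a JSON array using only the Python standard library. It illustrates the formats themselves, not the BigSheets readers; the helper names `read_delimited` and `read_json_array` and the sample data are assumptions made for this example.

```python
import csv
import io
import json


def read_delimited(text, delimiter=","):
    """Parse character-delimited text with a header row into a list of dicts."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text), delimiter=delimiter)]


def read_json_array(text):
    """Parse a JSON array of objects into a list of dicts."""
    rows = json.loads(text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array at the top level")
    return rows


# The same two made-up records in three formats.
csv_text = "name,state\nShop A,NY\nShop B,CA\n"
tsv_text = "name\tstate\nShop A\tNY\nShop B\tCA\n"
json_text = '[{"name": "Shop A", "state": "NY"}, {"name": "Shop B", "state": "CA"}]'

csv_rows = read_delimited(csv_text)
tsv_rows = read_delimited(tsv_text, delimiter="\t")
json_rows = read_json_array(json_text)
```

All three calls yield the same list of row dictionaries, which is why a spreadsheet-style tool can present them identically once they are read.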
  • 14. Gather
      [Diagram: BigSheets gathers data from the Internet, intranet, logs, and other sources into BigInsights.]
      • BigInsights can import structured and unstructured data:
        – CSV
        – Files
        – Network
            • http
            • hdfs
            • AWS (S3n/S3)
        – Other
            • Custom importer
  • 15. Collection
      [Screenshot: a complete list of McDonald's locations in North America.]
  • 16. Calculate, Reformat, Import
      [Screenshot: a complete list of McDonald's locations in North America.]
  • 17. Column chart and heat map
      [Screenshot: the same data shown as a column chart and as a heat map.]
  • 18. BigSheets in Action
      • Blockbuster: movie sales forecasting (from ABC News)
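  • 19. Blockbuster – Movie Sales Forecasting
      IBM BigInsights/BigSheets
      (1) Receives a feed of the tweets posted over the weekend (about 200,000),
      (2) and within a few hours (previously, not until Monday morning):
        – creates a sales forecast chart
        – runs sentiment analysis
      For example, this summer X-Men was more popular (more tweeted about) than any other movie, so promotion and screening strategies can be adjusted frequently.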
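As a rough illustration of the idea on this slide (count how often each movie is tweeted about and how positive those tweets are), here is a toy Python sketch. It is not IBM's method: the word lists, the `score_tweets` helper, the movie titles, and the sample tweets are all made up, and real sentiment analysis (for example with SystemT) is far more sophisticated.

```python
import re
from collections import Counter

# Tiny, made-up sentiment lexicons for illustration only.
POSITIVE = {"great", "love", "awesome"}
NEGATIVE = {"boring", "bad", "awful"}


def score_tweets(tweets, titles):
    """Return (mentions, sentiment) counters keyed by movie title."""
    mentions = Counter()
    sentiment = Counter()
    for tweet in tweets:
        text = tweet.lower()
        words = set(re.findall(r"[a-z']+", text))
        # Naive polarity: positive words minus negative words in the tweet.
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for title in titles:
            if title.lower() in text:
                mentions[title] += 1
                sentiment[title] += polarity
    return mentions, sentiment


tweets = [
    "Movie A was great, love it",
    "movie a again this weekend",
    "Movie B was boring",
]
mentions, sentiment = score_tweets(tweets, ["Movie A", "Movie B"])
```

Ranking titles by `mentions` gives the "most tweeted about" view, while `sentiment` gives a crude sense of whether the buzz is favorable.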
  • 20. Conclusion
      • We all need to improve the business.
      • So, where would you start with big data?
      Data discovery is a key to start improving YOUR business!