The Briefing Room
Welcome




                       Host:
                       Eric Kavanagh
                       eric.kavanagh@bloorgroup.com




Twitter Tag: #briefr                                  The Briefing Room
Mission


  !   Reveal the essential characteristics of enterprise software,
      good and bad

  !   Provide a forum for detailed analysis of today's innovative
      technologies

  !   Give vendors a chance to explain their product to savvy
      analysts

  !   Allow audience members to pose serious questions... and get
      answers!




December: Innovators



 January: Big Data

 February: Analytics

 March: Data in Motion



Innovators



     !   Charles Babbage conceived the Analytical Engine in 1834.

     !   Automation and ease of use have driven innovation in
         computing ever since.

     !   The Cloud and Big Data are raising the bar.




Analyst: Robin Bloor




                         Robin Bloor is
                       Chief Analyst at
                       The Bloor Group


                          robin.bloor@bloorgroup.com




Cirro

    ! Cirro provides a single method to access any type of data,
      on any platform, in any environment.

    !   Its product suite consists of Cirro Data Hub, Analyst for
        Excel and Multi Store – all designed to remove complexity
        from Big Data analytics.

    ! Cirro’s products are cloud-based and can run in public,
      private and on-premises environments.




Mark Theissen

   Mark is CEO at Cirro. He is a respected analytics and data warehousing expert
   with more than 22 years in the industry. Most recently Mark was the worldwide
   data warehousing technical lead at Microsoft following the acquisition of
   DATAllegro. At DATAllegro Mark was the COO and a member of the board of
   directors. Prior to joining DATAllegro, Mark was Vice President and Research
   Lead at META Group (Gartner Group) for Enterprise Analytics Strategies, covering
   the data warehousing, business intelligence and data integration markets. Before
   META, Mark was VP of Professional Services at Accruent, where he was responsible
   for domestic and overseas services and operations. Mark has a BS in Computer
   Information Systems from Chapman University and an MBA from the University of
   California, Irvine.




Bringing Big Data to
    the Desktop




                       Corporate Overview


                              ©2012 Cirro Inc. All rights reserved.
The Big Data Dilemma




Accessing Big Data




      Incumbent Approach

      Hadoop Approach
What the Market Needs



       An enterprise data hub to
      access any type of data, on
         any platform, in any
             environment



The Enterprise Data Hub




Simplifying the Access to Your Data

   Conventional Approach: People manage the access to data
      [Diagram labels: Hive, Hadoop, MapReduce, Sqoop, Java, SQL (multiple versions),
       Install & Config, Access tool, Source Control, Database Management,
       Structured - Unstructured Mashups]

   Cirro Approach: Cirro Data Hub manages access to data
      [Diagram label: Cirro Data Hub]

Architecture Overview

Cirro Data Hub
       •  Cost-based federation optimizer
       •  Smart caching
       •  Dynamic optimization
       •  Normalized cost estimates
       •  Metadata for unstructured sources

Cirro Function Library
       •  Library of functions
       •  Logic to build complex specific formulas

Cirro Analyst
       •  Excel plug-in that allows analysts to explore & process Big Data
          and traditional data

Cirro Multi Store (optional)
       •  Pre-built structured/unstructured data store
       •  Used for holding data or additional workspace

Typical Deployment

 Excel Analyst Users
 •  Design Views
 •  Minimal IT Support
 •  Publish Views
 •  Data Exploration
 •  Analysis

 Data Consumers (Tableau, Business Objects, Other BI Tools)
 •  Access CDH Views via ODBC & JDBC across all data types

 Cirro Data Hub
 •  Cirro Function Library
 •  Proprietary MapReduce
 •  Custom Views

 IT Staff
 •  Programmers
 •  Developers
 •  DBAs
 •  Extend, Add Proprietary Functions to CFL

 Data Sources
 •  NoSQL: Splunk, Cassandra, MongoDB
 •  Hadoop: MapReduce, HQL, Hive, Hadoop Distributed File System
 •  RDBMS: Oracle, Teradata, MySQL, SQL, Vertica

Sample Use Case


    Summarize the number of tweets per hour with
      certain keywords from a raw twitter feed.

    Requirements:
      •  Use raw twitter data files in Hadoop
      •  Keywords stored in SQL table for easy
         manipulation
      •  Results into Tableau or Excel for visualization
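
A minimal Python sketch of what this use case computes, for orientation only. The sample tweets, the keyword set, and the function name tweets_per_hour are hypothetical stand-ins for the raw Twitter files in Hadoop and the keyword table in SQL; none of it is Cirro's or Hadoop's API.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample data standing in for the raw Twitter files in Hadoop.
TWEETS = [
    {"created_at": "2012-12-04 09:15:00", "text": "Big Data is everywhere"},
    {"created_at": "2012-12-04 09:40:00", "text": "Trying Hadoop for big data"},
    {"created_at": "2012-12-04 10:05:00", "text": "Lunch plans anyone?"},
    {"created_at": "2012-12-04 10:30:00", "text": "Hadoop cluster is up"},
]

# Hypothetical keyword list standing in for the SQL keyword table.
KEYWORDS = {"hadoop", "data"}


def tweets_per_hour(tweets, keywords):
    """Count, per hour, the tweets that mention at least one keyword."""
    counts = Counter()
    for tweet in tweets:
        # Lower-case and split on whitespace, as in the parsing steps.
        words = set(tweet["text"].lower().split())
        if words & keywords:
            ts = datetime.strptime(tweet["created_at"], "%Y-%m-%d %H:%M:%S")
            counts[ts.strftime("%Y-%m-%d %H:00")] += 1
    return dict(counts)


if __name__ == "__main__":
    for hour, n in sorted(tweets_per_hour(TWEETS, KEYWORDS).items()):
        print(hour, n)
```

The sketch runs on a handful of in-memory records; the point of the use case is that the real inputs live in different systems (Hadoop and a SQL database), which is where the effort goes.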




Too Many Skills, Coding, Processing

Write mapper/reducer in Java using development tool:
   • parse twitter text - convert to lower case - parse words - exclude common words - group words by hour

Import Java classes into Hadoop

Execute command line hadoop using CLI
   • bin/hadoop jar TwitterParse /home/cloudera/WordCount.jar /usr/tweet/input /usr/local/output -libjars

Move result into HIVE using JDBC SQL tool
   • create table output1 (text STRING,created_at STRING,count BIGINT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
   • LOAD DATA INPATH '/usr/data/1-88f1-864e22e77801/part*' OVERWRITE INTO TABLE output1

Move SQL table with keywords to HIVE through Sqoop using CLI
   • export --connect jdbc:mySQL://10.17.185.44/mytable --password mypasswd --username root --table words --export-dir '/home/cloudera/inputfile
   • create table mytable (word STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE
   • LOAD DATA INPATH '/home/cloudera/inputfile/part*' OVERWRITE INTO TABLE mytable

Run HIVE query using JDBC SQL tool
   • select a.text, a.created_at, a.count from output1 a join mytable b on (a.text = b.word)

Import results into Excel using Excel

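
The final Hive query on this slide is an inner join that keeps only the per-hour word counts whose word appears in the keyword table. The sketch below reproduces that join with Python's built-in sqlite3 module instead of Hive, purely to show the shape of the result. The table names output1 and mytable come from the slide; the sample rows and the helper names build_demo_db and keyword_counts are assumptions for illustration.

```python
import sqlite3

# Same join as on the slide, with identifiers quoted for SQLite and an
# ORDER BY added so the demo output is deterministic.
JOIN_SQL = """
SELECT a."text", a.created_at, a."count"
FROM output1 a
JOIN mytable b ON (a."text" = b.word)
ORDER BY a.created_at, a."text"
"""


def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE output1 ("text" TEXT, created_at TEXT, "count" INTEGER)'
    )
    conn.execute("CREATE TABLE mytable (word TEXT)")
    conn.executemany(
        "INSERT INTO output1 VALUES (?, ?, ?)",
        [
            ("hadoop", "2012-12-04 09", 1),
            ("data", "2012-12-04 09", 2),
            ("lunch", "2012-12-04 10", 1),
            ("hadoop", "2012-12-04 10", 1),
        ],
    )
    conn.executemany(
        "INSERT INTO mytable VALUES (?)", [("hadoop",), ("data",)]
    )
    return conn


def keyword_counts(conn):
    """Return (word, hour, count) rows for words present in the keyword table."""
    return conn.execute(JOIN_SQL).fetchall()


if __name__ == "__main__":
    for row in keyword_counts(build_demo_db()):
        print(row)
```

The row for "lunch" drops out because it has no match in the keyword table, which is the filtering effect the Hive join is used for here.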
Too Many Skills, Coding, Processing

[Overlaid on the steps from the previous slide:]

   B1=TwitterParse("/user/twitter/sample","text,created_at")
   B2=ToLower(B1,"text")
   B3=WordSeparate(B2,"text")
   B4=Exclude(B3,"text")
   B5=GroupBy(B4,"text,created_at")
   B6=Cirro_Match(B5,"text","MYSQL.KeyWords","word",C9)

   Results displayed at cell C9

Perceptions & Questions




                       Analyst:
                       Robin Bloor


Big Data, Hot Data?




               The Bloor Group
Hadoop & The Big Data Dynamic

Hadoop has become the de facto reservoir for data




Hadoop & The Big Data Dynamic

–  We witnessed something like this a long time
   ago, with ISAM files - before the advent of
   RDBMS
–  The difference this time is that Hadoop has an
   ecosystem and it is growing
–  Big Data (usually caught first by Hadoop) is
   mostly new data and mostly event data
–  Hadoop is not (yet) a performance engine. It is
   an all-purpose capability
–  It is delivering business benefits in a big way: it
   is hot….


BI Categories

HINDSIGHT    Regular reporting/operational BI, Excel

OVERSIGHT    Dashboards, OLAP, BPM, Excel

INSIGHT      Data mining, statistical analysis (trends and relationships)

FORESIGHT    Predictive analytics



The New BI Universe (?)




Data Sources

   [Diagram labels: Hadoop and Hadoop ++; Standard SQL; Graph DBMS, XML DBMS,
    flat files; NoSQL; Metadata Hub?]




Problems Of The Data Layer

–  Hadoop is capable of ETL and often used for ETL, but that usually
   involves coding of a kind
–  Hadoop is multi-role and hence can spawn multiple instances
–  The data layer is more complicated than it was and its complexity
   is increasing
–  BI tools, which had good-enough interfaces to RDBMS, don’t link to
   Hadoop directly, and probably shouldn’t
–  Point to point connectivity usually was, is and may always be a
   bad idea
–  A connectivity architecture is needed

            IT REQUIRES SIMPLE CONNECTORS
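
One way to see why point-to-point connectivity scales badly is a rough count of the connections to maintain. The Python sketch below assumes every BI tool must reach every data source; under that assumption direct links grow as the product of the two counts, while a hub needs one link per tool plus one per source. The function names and the example counts are illustrative assumptions, not figures from the presentation.

```python
def point_to_point_links(tools: int, sources: int) -> int:
    """Connections needed when each tool links directly to each source."""
    return tools * sources


def hub_links(tools: int, sources: int) -> int:
    """Connections needed when tools and sources each link once to a hub."""
    return tools + sources


if __name__ == "__main__":
    # Hypothetical environment: 4 BI tools and 10 data sources.
    tools, sources = 4, 10
    print("point to point:", point_to_point_links(tools, sources))
    print("via hub:", hub_links(tools, sources))
```

With these example counts the direct approach needs 40 connections against 14 through a hub, and the gap widens as either side grows.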

!  How would one use the Cirro Multi Store?
!  Which companies/products do you regard as
  competitors (either directly or close competitors)?

!  How does a Cirro implementation proceed, i.e.,
  where do you start, what are the medium term
  goals, what do you replace?

!  Conceptually a hub for the data layer is attractive.
  But how well does it scale out?



!  Can the hub be physically distributed, i.e., one
  logical instance with multiple physical instances?

!  How does your proprietary MapReduce differ from
  Hadoop MapReduce?

!  Is there any aspect of BI that you don’t or can’t
  cater for (CEP, Data governance, MDM, etc.)?




Upcoming Topics



   January: Big Data

   February: Analytics

   March: Data in Motion

   2013 Editorial Calendar
        www.insideanalysis.com




Thank You
                        for Your
                       Attention


Infochimps #1 Big Data Platform for the CloudBrian Krpec
 

Similar to Self-Service Access and Exploration of Big Data (20)

All Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the CloudAll Grown Up: Maturation of Analytics in the Cloud
All Grown Up: Maturation of Analytics in the Cloud
 
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens DoorsThe Anywhere Enterprise – How a Flexible Foundation Opens Doors
The Anywhere Enterprise – How a Flexible Foundation Opens Doors
 
A Strategic View of Enterprise Reporting and Analytics: The Data Funnel
A Strategic View of Enterprise Reporting and Analytics: The Data FunnelA Strategic View of Enterprise Reporting and Analytics: The Data Funnel
A Strategic View of Enterprise Reporting and Analytics: The Data Funnel
 
Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案Big Data 視覺化分析解決方案
Big Data 視覺化分析解決方案
 
Impala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on HadoopImpala Unlocks Interactive BI on Hadoop
Impala Unlocks Interactive BI on Hadoop
 
Business Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache HadoopBusiness Intelligence and Data Analytics Revolutionized with Apache Hadoop
Business Intelligence and Data Analytics Revolutionized with Apache Hadoop
 
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
How Apache Hadoop is Revolutionizing Business Intelligence and Data Analytics...
 
Hadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - JaspersoftHadoop Reporting and Analysis - Jaspersoft
Hadoop Reporting and Analysis - Jaspersoft
 
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
Hadoop World 2011: How Hadoop Revolutionized Business Intelligence and Advanc...
 
20150630 kca big-data-with-cloud_output
20150630 kca big-data-with-cloud_output20150630 kca big-data-with-cloud_output
20150630 kca big-data-with-cloud_output
 
All data accessible to all my organization - Presentation at OW2con'19, June...
 All data accessible to all my organization - Presentation at OW2con'19, June... All data accessible to all my organization - Presentation at OW2con'19, June...
All data accessible to all my organization - Presentation at OW2con'19, June...
 
What it takes to bring Hadoop to a production-ready state
What it takes to bring Hadoop to a production-ready stateWhat it takes to bring Hadoop to a production-ready state
What it takes to bring Hadoop to a production-ready state
 
Webinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafkaWebinar future dataintegration-datamesh-and-goldengatekafka
Webinar future dataintegration-datamesh-and-goldengatekafka
 
What is the Point of Hadoop
What is the Point of HadoopWhat is the Point of Hadoop
What is the Point of Hadoop
 
Getting Cloud Architecture Right the First Time Ver 2
Getting Cloud Architecture Right the First Time Ver 2Getting Cloud Architecture Right the First Time Ver 2
Getting Cloud Architecture Right the First Time Ver 2
 
The New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data ExplorationThe New Frontier: Optimizing Big Data Exploration
The New Frontier: Optimizing Big Data Exploration
 
2010/10 - Database Architechs - Data Services Summary
2010/10 - Database Architechs - Data Services Summary2010/10 - Database Architechs - Data Services Summary
2010/10 - Database Architechs - Data Services Summary
 
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
Hadoop in the Enterprise - Dr. Amr Awadallah @ Microstrategy World 2011
 
Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013Integrating hadoop - Big Data TechCon 2013
Integrating hadoop - Big Data TechCon 2013
 
Infochimps #1 Big Data Platform for the Cloud
Infochimps #1 Big Data Platform for the CloudInfochimps #1 Big Data Platform for the Cloud
Infochimps #1 Big Data Platform for the Cloud
 

More from Inside Analysis

An Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BIAn Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BIInside Analysis
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessInside Analysis
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationInside Analysis
 
Fit For Purpose: Preventing a Big Data Letdown
Fit For Purpose: Preventing a Big Data LetdownFit For Purpose: Preventing a Big Data Letdown
Fit For Purpose: Preventing a Big Data LetdownInside Analysis
 
To Serve and Protect: Making Sense of Hadoop Security
To Serve and Protect: Making Sense of Hadoop Security To Serve and Protect: Making Sense of Hadoop Security
To Serve and Protect: Making Sense of Hadoop Security Inside Analysis
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeInside Analysis
 
Introducing: A Complete Algebra of Data
Introducing: A Complete Algebra of DataIntroducing: A Complete Algebra of Data
Introducing: A Complete Algebra of DataInside Analysis
 
The Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionThe Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionInside Analysis
 
Ahead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsAhead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsInside Analysis
 
All Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingAll Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingInside Analysis
 
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLGoodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLInside Analysis
 
The Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelThe Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelInside Analysis
 
Structurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureStructurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureInside Analysis
 
SQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskSQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskInside Analysis
 
The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataInside Analysis
 
A Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseA Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseInside Analysis
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopInside Analysis
 
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldRethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldInside Analysis
 
DisrupTech - Dave Duggal
DisrupTech - Dave DuggalDisrupTech - Dave Duggal
DisrupTech - Dave DuggalInside Analysis
 

More from Inside Analysis (20)

An Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BIAn Ounce of Prevention: Forging Healthy BI
An Ounce of Prevention: Forging Healthy BI
 
Agile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for SuccessAgile, Automated, Aware: How to Model for Success
Agile, Automated, Aware: How to Model for Success
 
First in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter IntegrationFirst in Class: Optimizing the Data Lake for Tighter Integration
First in Class: Optimizing the Data Lake for Tighter Integration
 
Fit For Purpose: Preventing a Big Data Letdown
Fit For Purpose: Preventing a Big Data LetdownFit For Purpose: Preventing a Big Data Letdown
Fit For Purpose: Preventing a Big Data Letdown
 
To Serve and Protect: Making Sense of Hadoop Security
To Serve and Protect: Making Sense of Hadoop Security To Serve and Protect: Making Sense of Hadoop Security
To Serve and Protect: Making Sense of Hadoop Security
 
The Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On TimeThe Hadoop Guarantee: Keeping Analytics Running On Time
The Hadoop Guarantee: Keeping Analytics Running On Time
 
Introducing: A Complete Algebra of Data
Introducing: A Complete Algebra of DataIntroducing: A Complete Algebra of Data
Introducing: A Complete Algebra of Data
 
The Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop AdoptionThe Role of Data Wrangling in Driving Hadoop Adoption
The Role of Data Wrangling in Driving Hadoop Adoption
 
Ahead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time AnalyticsAhead of the Stream: How to Future-Proof Real-Time Analytics
Ahead of the Stream: How to Future-Proof Real-Time Analytics
 
All Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of EverythingAll Together Now: Connected Analytics for the Internet of Everything
All Together Now: Connected Analytics for the Internet of Everything
 
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETLGoodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
Goodbye, Bottlenecks: How Scale-Out and In-Memory Solve ETL
 
The Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global LevelThe Biggest Picture: Situational Awareness on a Global Level
The Biggest Picture: Situational Awareness on a Global Level
 
Structurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your ArchitectureStructurally Sound: How to Tame Your Architecture
Structurally Sound: How to Tame Your Architecture
 
SQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the RiskSQL In Hadoop: Big Data Innovation Without the Risk
SQL In Hadoop: Big Data Innovation Without the Risk
 
The Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big DataThe Perfect Fit: Scalable Graph for Big Data
The Perfect Fit: Scalable Graph for Big Data
 
A Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data WarehouseA Revolutionary Approach to Modernizing the Data Warehouse
A Revolutionary Approach to Modernizing the Data Warehouse
 
The Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of HadoopThe Maturity Model: Taking the Growing Pains Out of Hadoop
The Maturity Model: Taking the Growing Pains Out of Hadoop
 
Rethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile WorldRethinking Data Availability and Governance in a Mobile World
Rethinking Data Availability and Governance in a Mobile World
 
DisrupTech - Dave Duggal
DisrupTech - Dave DuggalDisrupTech - Dave Duggal
DisrupTech - Dave Duggal
 
Modus Operandi
Modus OperandiModus Operandi
Modus Operandi
 

Recently uploaded

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rick Flair
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersNicole Novielli
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demoHarshalMandlekar2
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii SoldatenkoFwdays
 

Recently uploaded (20)

The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...Rise of the Machines: Known As Drones...
Rise of the Machines: Known As Drones...
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
A Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software DevelopersA Journey Into the Emotions of Software Developers
A Journey Into the Emotions of Software Developers
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Sample pptx for embedding into website for demo
Sample pptx for embedding into website for demoSample pptx for embedding into website for demo
Sample pptx for embedding into website for demo
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 

Self-Service Access and Exploration of Big Data

  • 2. Welcome Host: Eric Kavanagh eric.kavanagh@bloorgroup.com Twitter Tag: #briefr The Briefing Room
  • 3. Mission
      - Reveal the essential characteristics of enterprise software, good and bad
      - Provide a forum for detailed analysis of today's innovative technologies
      - Give vendors a chance to explain their product to savvy analysts
      - Allow audience members to pose serious questions... and get answers!
  • 4. December: Innovators; January: Big Data; February: Analytics; March: Data in Motion
  • 5. Innovators
      - Charles Babbage conceived the Analytical Engine in 1834.
      - Automation and ease of use have driven innovation in computing ever since.
      - The Cloud and Big Data are raising the bar.
  • 6. Analyst: Robin Bloor. Robin Bloor is Chief Analyst at The Bloor Group (robin.bloor@bloorgroup.com).
  • 7. Cirro
      - Cirro provides a single method to access any type of data, on any platform, in any environment.
      - Its product suite consists of Cirro Data Hub, Analyst for Excel and Multi Store, all designed to remove complexity from Big Data analytics.
      - Cirro's products are cloud-based and can run in public, private and on-premises environments.
  • 8. Mark Theissen. Mark is CEO at Cirro. He is a respected analytics and data warehousing expert with more than 22 years in the industry. Most recently Mark was the worldwide data warehousing technical lead at Microsoft following the acquisition of DATAllegro. At DATAllegro Mark was the COO and a member of the board of directors. Prior to joining DATAllegro, Mark was Vice President and Research Lead at META Group (Gartner Group) for Enterprise Analytics Strategies, covering data warehousing, business intelligence and data integration markets. Before META, Mark was VP of Professional Services at Accruent, where he was responsible for domestic and overseas services and operations. Mark has a BS in Computer Information Systems from Chapman University and an MBA from the University of California, Irvine.
  • 9. Bringing Big Data to the Desktop Corporate Overview ©2012 Cirro Inc. All rights reserved.
  • 10. The Big Data Dilemma
  • 11. The Big Data Dilemma
  • 12. The Big Data Dilemma
  • 13. Accessing Big Data
  • 14. Accessing Big Data: Incumbent Approach vs. Hadoop Approach
  • 15. Accessing Big Data: Incumbent Approach vs. Hadoop Approach
  • 16. Accessing Big Data: Incumbent Approach vs. Hadoop Approach
  • 17. What the Market Needs: an enterprise data hub to access any type of data, on any platform, in any environment
  • 18. The Enterprise Data Hub
  • 19. Simplifying the Access to Your Data (diagram)
      - Conventional Approach: people manage the access to data. Diagram labels: Hadoop, MapReduce, Hive, Sqoop, Java, SQL, install & config, access tool, source control (multiple versions), database management, structured/unstructured mashups
      - Cirro Approach: Cirro Data Hub manages access to data
  • 20. Architecture Overview
      - Cirro Data Hub: cost-based federation optimizer; smart caching; dynamic optimization; normalized cost estimates; metadata for unstructured sources
      - Cirro Function Library: library of functions; logic to build complex specific formulas
      - Cirro Analyst: Excel plug-in that allows analysts to explore and process Big Data and traditional data
      - Cirro Multi Store (optional): pre-built structured/unstructured data store; used for holding data or additional workspace
  • 21. Typical Deployment (diagram)
      - Excel Analyst Users: design views; minimal IT support; publish views; data exploration; analysis
      - Data Consumers (Tableau, other BI tools): access CDH views via ODBC & JDBC across all data types
      - IT Staff (programmers, developers, DBAs): extend the Cirro Function Library; add proprietary business functions to CFL; proprietary MapReduce objects; custom views
      - Cirro Data Hub sits between these users and the data sources
      - Access methods shown: MapReduce, HQL, NoSQL, SQL
      - Sources shown: Hive, MapReduce, Hadoop Distributed File System, Splunk, Cassandra, MongoDB, Oracle, Teradata, MySQL, Vertica
  • 22. Sample Use Case: summarize the number of tweets per hour with certain keywords from a raw Twitter feed.
      Requirements:
      - Use raw Twitter data files in Hadoop
      - Keywords stored in SQL table for easy manipulation
      - Results into Tableau or Excel for visualization
  • 23. Too Many Skills, Coding, Processing
      - Write mapper/reducer in Java using development tool:
          parse Twitter text - convert to lower case - parse words - exclude common words - group words by hour
      - Import Java classes into Hadoop
      - Execute command-line hadoop using CLI:
          bin/hadoop jar TwitterParse /home/cloudera/WordCount.jar /usr/tweet/input /usr/local/output -libjars
      - Move result into Hive using JDBC SQL tool:
          create table output1 (text STRING, created_at STRING, count BIGINT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t' STORED AS TEXTFILE
          LOAD DATA INPATH '/usr/data/1-88f1-864e22e77801/part*' OVERWRITE INTO TABLE output1
      - Move SQL table with keywords to Hive through Sqoop using CLI:
          export --connect jdbc:mySQL://10.17.185.44/mytable --password mypasswd --username root --table words --export-dir '/home/cloudera/inputfile'
          create table mytable (word STRING) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE
          LOAD DATA INPATH '/home/cloudera/inputfile/part*' OVERWRITE INTO TABLE mytable
      - Run Hive query using JDBC SQL tool:
          select a.text, a.created_at, a.count from output1 a join mytable b on (a.text = b.word)
      - Import results into Excel using Excel
  • 24. Too Many Skills, Coding, Processing (the same manual steps as slide 23, overlaid with the equivalent Cirro formulas in Excel)
          B1=TwitterParse("/user/twitter/sample","text,created_at")
          B2=ToLower(B1,"text")
          B3=WordSeparate(B2,"text")
          B4=Exclude(B3,"text")
          B5=GroupBy(B4,"text,created_at")
          B6=Cirro_Match(B5,"text","MYSQL.KeyWords","word",C9)
      Results displayed at cell C9
  • 25. Bringing Big Data to the Desktop Corporate Overview
  • 26. Perceptions & Questions Analyst: Robin Bloor
  • 27. Big Data, Hot Data?
  • 28. Hadoop & The Big Data Dynamic Hadoop has become the de facto reservoir for data
  • 29. Hadoop & The Big Data Dynamic
      – We witnessed something like this a long time ago, with ISAM files, before the advent of RDBMS
      – The difference this time is that Hadoop has an ecosystem and it is growing
      – Big Data (usually caught first by Hadoop) is mostly new data and mostly event data
      – Hadoop is not (yet) a performance engine. It is an all-purpose capability
      – It is delivering business benefits in a big way: it is hot...
  • 30. BI Categories
      – HINDSIGHT: Regular reporting/operational BI, Excel
      – OVERSIGHT: Dashboards, OLAP, BPM, Excel
      – INSIGHT (trends and relationships): Data mining, statistical analysis
      – FORESIGHT: Predictive analytics
  • 31. The New BI Universe (?)
  • 32. Data Sources [diagram; labels as extracted: Graph DBMS, XML, Standard DBMS, NoSQL, SQL, Flat files, Hadoop and Metadata, Hadoop Hub?]
  • 33. Problems Of The Data Layer
      – Hadoop is capable of ETL and often used for ETL, but that usually involves coding of a kind
      – Hadoop is multi-role and hence can spawn multiple instances
      – BI tools, which had good-enough interfaces to RDBMS, don’t link to Hadoop directly, and probably shouldn’t
      – The data layer is more complicated than it was and its complexity is increasing
      – Point to point connectivity usually was, is and may always be a bad idea
      – A connectivity architecture is needed
      IT REQUIRES SIMPLE CONNECTORS
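One common way to see why point-to-point connectivity scales badly is to count connectors. The sketch below is an illustration rather than anything stated on the slide: it assumes every BI tool needs access to every data source, and compares that with routing everything through a single hub. The function names and the example counts are hypothetical.

```python
def point_to_point_links(sources, tools):
    """Connectors needed when every tool links directly to every source."""
    return sources * tools


def hub_links(sources, tools):
    """Connectors needed when every source and tool links once to a hub."""
    return sources + tools


if __name__ == "__main__":
    # Hypothetical environment: 6 data sources, 5 BI tools.
    sources, tools = 6, 5
    print("point to point:", point_to_point_links(sources, tools))
    print("via hub:", hub_links(sources, tools))
```

Under that assumption, adding one more data source costs one new connector with a hub, but one per tool with point-to-point links.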
  • 34.
      – How would one use the Cirro Multi Store?
      – Which companies/products do you regard as competitors (either directly or close competitors)?
      – How does a Cirro implementation proceed, i.e., where do you start, what are the medium term goals, what do you replace?
      – Conceptually a hub for the data layer is attractive. But how well does it scale out?
  • 35.
      – Can the hub be physically distributed, i.e., one logical instance with multiple physical instances?
      – How does your proprietary MapReduce differ from Hadoop MapReduce?
      – Is there any aspect of BI that you don’t or can’t cater for (CEP, Data governance, MDM, etc.)?
  • 36. Twitter Tag: #briefr The Briefing Room
  • 37. Upcoming Topics
      – January: Big Data
      – February: Analytics
      – March: Data in Motion
      2013 Editorial Calendar, www.insideanalysis.com
  • 38. Thank You for Your Attention