Eric.kavanagh@bloorgroup.com




Twitter Tag: #briefr                   The Briefing Room
•  Reveal the essential characteristics of enterprise
   software, good and bad

•  Provide a forum for detailed analysis of today's
   innovative technologies

•  Give vendors a chance to explain their product to
   savvy analysts

•  Allow audience members to pose serious questions...
   and get answers!



•  November: Cloud
•  December: Innovators
•  January: Big Data
•  February: Performance
•  March: Integration

•  The Data Warehouse was once considered the Holy Grail of
   Business Intelligence, but as data volumes increase
   exponentially, we’re finding that data warehousing cannot be
   all things for all users.

•  Hadoop was initially developed at Yahoo! to support a search
   engine project and has since turned into the poster child for
   open source Big Data processing.

•  While Hadoop is not a data warehouse, its capabilities can help
   organizations store and analyze huge volumes of data.




Mark Madsen is president of Third Nature, a technology research and
consulting firm focused on business intelligence, data integration and
data management. Mark is an award-winning author, architect and CTO whose
work has been featured in numerous industry publications. Over the past
ten years Mark received awards for his work from the American
Productivity & Quality Center, TDWI, and the Smithsonian Institution. He
is an international speaker and a contributor to Forbes Online and
Information Management. For more information or to contact Mark, follow
@markmadsen on Twitter or visit http://ThirdNature.net




•  Hortonworks is an enterprise software company that focuses on
   the development and support of Apache Hadoop.

•  Its product is the Hortonworks Data Platform, an open source
   platform for storing, processing and analyzing large volumes of
   data from many sources and in a variety of formats.

•  Hortonworks recently introduced its Hive ODBC Driver 1.0, which
   allows users to integrate its Hadoop platform with the BI apps
   running on top.




Jim Walker is the Director of Product Marketing at Hortonworks. He is a
recovering developer, professional marketer and amateur photographer
with nearly twenty years' experience building products and developing
emerging technologies. During his career, he has brought multiple
products to market in a variety of fields, including data loss
prevention, master data management and now big data. At Hortonworks, Jim
is focused on accelerating the development and adoption of Apache Hadoop.




Hadoop: What It Is & Isn’t
October 2012
Jim Walker
Director, Product Marketing
Hortonworks




© Hortonworks Inc. 2012       Page 9
Big Data: Organizational Game Changer

Transactions + Interactions + Observations = BIG DATA

[Figure: data sources plotted by data volume (vertical axis, megabytes to
petabytes) against increasing data variety and complexity (horizontal axis).]

•  Megabytes, ERP: purchase detail, purchase record, payment record
•  Gigabytes, CRM
•  Terabytes, WEB
•  Petabytes, BIG DATA

Data types shown across the CRM, WEB and BIG DATA tiers: segmentation,
offer details, offer history, customer touches, support contacts, web logs,
A/B testing, behavioral targeting, dynamic pricing, search marketing,
affiliate networks, dynamic funnels, mobile web, sentiment, user click
stream, SMS/MMS, speech to text, social interactions & feeds, spatial & GPS
coordinates, sensors / RFID / devices, business data feeds, external
demographics, user generated content, HD video, audio, images,
product/service logs.
What is a Data Driven Business?
     •  DEFINITION
        Better use of available data in the decision making process

     •  RULE
        Key metrics derived from data should be tied to goals

     •  PROVEN RESULTS
        Firms that adopt Data-Driven Decision Making have output and
        productivity that is 5-6% higher than what would be expected
        given their investments and usage of information technology*





        * “Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?” Brynjolfsson, Hitt and Kim (April 22, 2011)


Big Data: Optimize Outcomes at Scale

Media                optimize   Content
Intelligence         optimize   Detection
Finance              optimize   Algorithms
Advertising          optimize   Performance
Fraud                optimize   Prevention
Retail / Wholesale   optimize   Inventory turns
Manufacturing        optimize   Supply chains
Healthcare           optimize   Patient outcomes
Education            optimize   Learning outcomes
Government           optimize   Citizen services

Source: Geoffrey Moore. Hadoop Summit 2012 keynote presentation.
Enterprise Big Data Flows

[Figure: data flows between source data, a Big Data Platform, business
transaction systems and business intelligence tools.]

•  Sources: unstructured data, log files, exhaust data, social media,
   sensors and devices, DB data
•  Big Data Platform
•  Business Transactions & Interactions: CRM, ERP, Web, Mobile, Point of sale
•  Classic Data Integration & ETL
•  Business Intelligence & Analytics: Dashboards, Reports, Visualization, …

1  Capture Big Data: Collect data from all sources, structured & unstructured
2  Process: Transform, refine, aggregate, analyze, report
3  Distribute Results: Interoperate and share data with applications/analytics
4  Feedback: Use operational data w/in big data platform, preserve data
Data Platform Requirements for Big Data

Data Platform for Big Data

Capture
•  Collect data from all sources - structured and unstructured data
•  All speeds: batch, async, streaming, real-time

Process
•  Transform, refine, aggregate, analyze, report

Exchange
•  Deliver data with enterprise data systems
•  Share data with analytic applications and processing

Operate
•  Provision, monitor, diagnose, manage at scale
•  Reliability, availability, affordability, scalability, interoperability

Across all deployment models: Operating Systems, Virtual Platforms,
Cloud Platforms, Big Data Appliances
Apache Hadoop & Big Data Use Cases

Big Data: Transactions, Interactions, Observations

Refine   Explore   Enrich

Business Case
Operational Data Refinery
Hadoop as platform for ETL modernization

[Figure: Refine use case. Unstructured data, log files and DB data flow into
the Refinery (capture and archive, parse & cleanse, structure and join,
upload) and on to the Enterprise Data Warehouse.]

Capture
•  Capture new unstructured data and log files alongside existing sources
•  Retain inputs in raw form for audit and continuity purposes
Process
•  Parse the data & cleanse
•  Apply structure and definition
•  Join datasets together across disparate data sources
Exchange
•  Push to existing data warehouse for downstream consumption
•  Feeds operational reporting and online systems
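
The refinery steps above can be sketched in a few lines of plain Python. This is a minimal illustration, not Hortonworks code: the log format, the CUSTOMERS lookup and the function names are all hypothetical, and a real deployment would run the same logic as Pig, Hive or MapReduce jobs over files in HDFS.

```python
import re
from typing import Dict, List, NamedTuple, Optional

# Hypothetical raw inputs: web log lines plus a customer table from a database.
RAW_LOG_LINES = [
    "2012-10-01T09:15:02 user=42 action=view sku=A100",
    "2012-10-01T09:15:09 user=42 action=buy  sku=A100",
    "corrupted ### line",
    "2012-10-01T09:16:30 user=7 action=view sku=B200",
]
CUSTOMERS = {42: "Acme Corp", 7: "Globex"}

# Assumed log layout; real logs would need their own pattern.
LOG_PATTERN = re.compile(
    r"^(?P<ts>\S+)\s+user=(?P<user>\d+)\s+action=(?P<action>\w+)\s+sku=(?P<sku>\w+)$"
)


class Event(NamedTuple):
    ts: str
    user_id: int
    action: str
    sku: str


def parse(line: str) -> Optional[Event]:
    """Parse and cleanse: return a structured event, or None for bad input."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    return Event(match["ts"], int(match["user"]), match["action"], match["sku"])


def refine(lines: List[str], customers: Dict[int, str]) -> List[dict]:
    """Apply structure, then join each event with the customer table."""
    rows = []
    for line in lines:
        event = parse(line)
        if event is None:
            continue  # dropped from the output; the raw line is still archived
        rows.append({
            "ts": event.ts,
            "customer": customers.get(event.user_id, "unknown"),
            "action": event.action,
            "sku": event.sku,
        })
    return rows


# Exchange: these rows are what would be pushed to the data warehouse.
warehouse_rows = refine(RAW_LOG_LINES, CUSTOMERS)
```

RAW_LOG_LINES is never modified, which mirrors the "retain inputs in raw form" bullet: the unparseable line is skipped in the refined output but remains available for audit.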



Big Data Exploration & Visualization
Hadoop as agile, ad-hoc data mart

[Figure: Explore use case. Unstructured data, log files and DB data flow into
Explore (capture and archive, structure and join, categorize into tables),
then by upload or JDBC / ODBC to visualization tools and, optionally, an
EDW / datamart.]

Capture
•  Capture multi-structured data and retain inputs in raw form for
   iterative analysis
Process
•  Parse the data into queryable format
•  Explore & analyze using Hive, Pig, Mahout and other tools to discover value
•  Label data and type information for compatibility and later discovery
•  Pre-compute stats, groupings, patterns in data to accelerate analysis
Exchange
•  Use visualization tools to facilitate exploration and find key insights
•  Optionally move actionable insights into EDW or datamart
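
As a rough illustration of the "pre-compute stats, groupings" bullet, the sketch below groups hypothetical page-view events and stores summary figures once, so later exploration reads the small summary instead of rescanning raw data. The event fields and the precompute function are assumptions made for this example; on a cluster this would more likely be a Hive or Pig aggregation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parsed events: page viewed and response time in milliseconds.
EVENTS = [
    {"page": "/home", "ms": 120},
    {"page": "/home", "ms": 80},
    {"page": "/cart", "ms": 300},
]


def precompute(events, key, value):
    """Group events by `key` and summarize the numeric field `value`."""
    groups = defaultdict(list)
    for event in events:
        groups[event[key]].append(event[value])
    return {
        group: {"count": len(values), "mean": mean(values), "max": max(values)}
        for group, values in groups.items()
    }


# Computed once, then reused by every later query or visualization.
summary = precompute(EVENTS, "page", "ms")
```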
Application Enrichment
Deliver Hadoop analysis to online apps

[Figure: Enrich use case. Unstructured data, log files and DB data flow into
Enrich (capture, parse, derive/filter; scheduled & near real time), then to
a low-latency store (NoSQL, HBase) that serves online applications.]

Capture
•  Capture data that was once too bulky and unmanageable
Process
•  Uncover aggregate characteristics across data
•  Use Hive, Pig and MapReduce to identify patterns
•  Filter useful data from mass streams (Pig)
•  Micro or macro batch oriented schedules
Exchange
•  Push results to HBase or other NoSQL alternative for real time delivery
•  Use patterns to deliver right content/offer to the right person at the
   right time
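
The enrichment flow above can be sketched as a batch job that filters a click stream, derives one pattern per user, and publishes the result to a low-latency lookup table. Everything here is hypothetical: the click data, the NOISE filter and the function names are invented for illustration, and a plain dict stands in for HBase or another NoSQL store.

```python
from collections import Counter, defaultdict

# Hypothetical click stream: (user, category) pairs captured from logs.
CLICKS = [
    ("u1", "shoes"), ("u1", "shoes"), ("u1", "hats"),
    ("u2", "books"), ("u2", "bot-ping"), ("u2", "books"),
]
NOISE = {"bot-ping"}  # assumed junk events to filter out of the stream


def batch_job(clicks):
    """Filter the stream, then find each user's most-clicked category."""
    per_user = defaultdict(Counter)
    for user, category in clicks:
        if category in NOISE:
            continue
        per_user[user][category] += 1
    return {user: counts.most_common(1)[0][0] for user, counts in per_user.items()}


# Exchange: the dict plays the role of the low-latency store.
serving_store = {}
serving_store.update(batch_job(CLICKS))


def offer_for(user, default="bestsellers"):
    """Online lookup: a cheap key read, no analysis at request time."""
    return serving_store.get(user, default)
```

The design point is the split between the two halves: the expensive pattern-finding runs on a schedule, while the online application only does a single key lookup per request.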


Hadoop in Enterprise Data Architectures

[Figure: Hadoop alongside existing enterprise data systems.]

•  Existing Business Infrastructure: IDE & Dev Tools, ODS & Datamarts,
   Applications & Spreadsheets, Visualization & Intelligence,
   Discovery Tools, EDW
•  Web: Web Applications, Low Latency/NoSQL
•  New Tech: Datameer, Tableau, Karmasphere, Splunk
•  Operations: Custom, Existing
•  Hadoop components shown: Templeton, WebHDFS, Sqoop, Flume, HCatalog,
   HBase, Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper
•  Big Data Sources (transactions, observations, interactions): CRM, ERP,
   financials, Social Media, Exhaust Data, logs, files
Where Does It Fit into Your Business?

Use cases by vertical, under Refine, Explore and Enrich:

Retail & Web
   Refine:   Log Analysis/Site Optimization
   Explore:  Social Network Analysis
   Enrich:   Dynamic Pricing; Session & Content Optimization

Retail
   Refine:   Loyalty Program Optimization
   Explore:  Brand and Sentiment Analysis
   Enrich:   Dynamic Pricing/Targeted Offer

Intelligence
   Refine:   Threat Identification
   Explore:  Person of Interest Discovery
   Enrich:   Cross Jurisdiction Queries

Finance
   Refine:   Risk Modeling & Fraud Identification; Trade Performance Analytics
   Explore:  Surveillance and Fraud Detection; Customer Risk Analysis
   Enrich:   Real-time upsell, cross sales marketing offers

Energy
   Refine:   Smart Grid: Production Optimization
   Explore:  Grid Failure Prevention; Smart Meters
   Enrich:   Individual Power Grid

Manufacturing
   Refine:   Supply Chain Optimization
   Explore:  Customer Churn Analysis
   Enrich:   Dynamic Delivery; Replacement parts

Healthcare & Payer
   Refine:   Electronic Medical Records (EMPI)
   Explore:  Clinical Trials Analysis
   Enrich:   Insurance Premium Determination
Hortonworks Vision & Leadership

We believe that by the end of 2015, more than half the world's data will be
processed by Apache Hadoop.

Trusted
•  Stewards of core Hadoop
•  Original builders and operators of Hadoop
•  100+ years Hadoop development experience
•  Managed every viable, stable Hadoop release
•  HDP built on Hadoop 1.0

Open
•  100% open platform
•  No POS holdback
•  Open to the Hadoop community
•  Open to the Hadoop ecosystem
•  Closely aligned to Hadoop core

Innovative
•  Innovating current platform with HCatalog, Ambari, HA
•  Innovating future platform with YARN, HA
•  Complete vision for Hadoop-based platform
•  Enable the Hadoop ecosystem
Hortonworks Data Platform

•  Simplify deployment to get started quickly and easily
•  Monitor, manage any size cluster with familiar console and tools
•  Only platform to include data integration services to interact with
   any data
•  Metadata services open the platform for integration with existing
   applications
•  Dependable high availability architecture
•  Tested at scale to future-proof your cluster growth

✓  Reduce risks and cost of adoption
✓  Lower the total cost to administer and provision
✓  Integrate with your existing ecosystem
“In pioneer days they used oxen for heavy pulling, and when one ox
couldn't budge a log, they didn't try to grow a larger ox. We shouldn't be
trying for bigger computers, but for more systems of computers.”

                                                            Grace Hopper




© Third Nature Inc.
What’s different today?

We’re not getting more CPU speed, but more CPU cycles.

There are too many CPUs relative to other resources, creating an
imbalance in hardware platforms.

We therefore use nodes to aggregate memory, network bandwidth and IOPs.

Most software is designed for a single worker, not high degrees of
parallelism, and won’t scale well.
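
To make the "single worker versus many" point concrete, here is a small sketch of the split, process and merge pattern that parallel platforms rely on. It is an assumption-laden toy: threads in one Python process stand in for cluster nodes, so it shows the structure of the work rather than any real speedup, and the function names are invented for this example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def count_words(chunk):
    """Work done by one worker: count words in its share of the lines."""
    counts = Counter()
    for line in chunk:
        counts.update(line.split())
    return counts


def parallel_word_count(lines, workers=4):
    """Split the input, count each chunk independently, then merge."""
    # Round-robin split so every worker gets roughly equal work.
    chunks = [lines[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts together
    return total


LINES = ["big data big", "data platform", "big platform"]
totals = parallel_word_count(LINES, workers=2)
```

Software written this way scales by adding workers because no chunk depends on another; code written around one shared loop has no such seam to split along.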
  
Data volume is the oldest, easiest problem

[Image: Teradata]
Analytics makes the data volume problem bigger

Many of the processing problems are O(n²) or worse, so even moderate
data volumes can be a problem for DW architectures
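
A quick piece of arithmetic shows why O(n²) work hurts at moderate sizes. The example assumes an analysis that compares every record with every other record, such as naive duplicate detection; that choice of workload is an illustration, not something taken from the slides.

```python
def pairwise_comparisons(n):
    """Number of distinct record pairs among n records: n * (n - 1) / 2."""
    return n * (n - 1) // 2


# Each tenfold increase in rows costs roughly a hundredfold more comparisons.
for rows in (1_000, 10_000, 100_000):
    print(f"{rows:>7} rows -> {pairwise_comparisons(rows):>13,} comparisons")
```

At 100,000 rows, a table most warehouses would consider small, the naive approach already needs close to five billion comparisons.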
  
[Cartoon: "I need that data now." "It would be logical to keep all the
data in one place." "It will take 6 months."]

A common problem with new projects or unexpected business problems…
  
The proposed solution? Load Hadoop and analyze
Welcome to the Hadoop schema!

Why soft / no schema can be good:
•  Easier programming
•  Easier modeling since you don’t have to be perfect in advance, and
   it’s change-resilient
•  Join elimination = I/O savings (if no updates)
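
The soft-schema points above can be illustrated with records that carry their own structure and are interpreted only when read. The order records and field names below are hypothetical; the sketch uses JSON lines because they are easy to show, not because the slides prescribe a format.

```python
import json

# Hypothetical raw records. The second one has a field the first lacks.
RAW = [
    '{"order_id": 1, "customer": {"id": 42, "name": "Acme"}, "total": 19.5}',
    '{"order_id": 2, "customer": {"id": 7, "name": "Globex"}, "total": 5.0, "coupon": "FALL12"}',
]


def read_orders(lines):
    """Apply structure at read time; missing optional fields become None."""
    for line in lines:
        record = json.loads(line)
        yield {
            "order_id": record["order_id"],
            # The customer name is embedded in each order, so no join is needed.
            "customer_name": record["customer"]["name"],
            "total": record["total"],
            "coupon": record.get("coupon"),  # new field, old records still load
        }


orders = list(read_orders(RAW))
```

The embedded customer name is the join elimination: reads touch one record instead of two tables. It is also why the slide adds "if no updates", since a renamed customer would have to be rewritten in every order that copies the name.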
  
Whether to switch from a DB isn’t the right discussion

[Figure: the labels SQL?, SQL!, SQL, SQL... alongside Hadoop]
Strategy: There’s a pony in there somewhere
…but you need a unicorn to find the pony
Questions for discussion

   1. Is scale of data really that much of a problem for most
      organizations?
   2. Hadoop is designed for batch work – how good is it for
      interactive use? Real-time use cases?
   3. How do you define “platform”?
   4. ETL modernization is mentioned, but isn’t this a reversion
      to manual coding?
   5. How do you design for long-term use rather than one-off
      analysis projects?
   6. Does open source really matter for this part of the stack?
CC Image Attributions

Thanks to the people who supplied the creative commons licensed images
used in this presentation:

Phone dump - Richard Barnes
ponies in field.jpg - http://www.flickr.com/photos/bulle_de/352732514/
•  This Month: Database
•  November: Cloud
•  December: Innovators
•  January: Big Data
•  2013 Editorial Calendar (www.insideanalysis.com)




Twitter Tag: #briefr                 The Briefing Room
Twitter Tag: #briefr   The Briefing Room

More Related Content

What's hot

IAB/Winterberry Group Member Webinar: "From Information to Audiences--The Eme...
IAB/Winterberry Group Member Webinar: "From Information to Audiences--The Eme...IAB/Winterberry Group Member Webinar: "From Information to Audiences--The Eme...
IAB/Winterberry Group Member Webinar: "From Information to Audiences--The Eme...IABmembership
 
Right now corporatepresentation july 2011
Right now corporatepresentation july 2011Right now corporatepresentation july 2011
Right now corporatepresentation july 2011Frank Ragol
 
Cutting Big Data Down to Size with AMD and Dell
Cutting Big Data Down to Size with AMD and DellCutting Big Data Down to Size with AMD and Dell
Cutting Big Data Down to Size with AMD and DellAMD
 
The Future of ERP by Bertrand Andries
The Future of ERP by Bertrand Andries  The Future of ERP by Bertrand Andries
Hadoop: What It Is and What It's Not

  • 1.
  • 3. The Briefing Room
      • Reveal the essential characteristics of enterprise software, good and bad
      • Provide a forum for detailed analysis of today's innovative technologies
      • Give vendors a chance to explain their product to savvy analysts
      • Allow audience members to pose serious questions... and get answers!
      Twitter Tag: #briefr
  • 4.
      • November: Cloud
      • December: Innovators
      • January: Big Data
      • February: Performance
      • March: Integration
  • 5.
      • The Data Warehouse was once considered the Holy Grail of Business Intelligence, but as data volumes increase exponentially, we’re finding that data warehousing cannot be all things for all users.
      • Hadoop was initially developed at Yahoo! to support a search engine project and has since turned into the poster child for open source Big Data processing.
      • While Hadoop is not a data warehouse, its capabilities can help organizations store and analyze huge volumes of data.
  • 6. Mark Madsen is president of Third Nature, a technology research and consulting firm focused on business intelligence, data integration and data management. Mark is an award-winning author, architect and CTO whose work has been featured in numerous industry publications. Over the past ten years Mark received awards for his work from the American Productivity & Quality Center, TDWI, and the Smithsonian Institute. He is an international speaker, a contributor at Forbes Online and Information Management. For more information or to contact Mark, follow @markmadsen on Twitter or visit http://ThirdNature.net
  • 7.
      • Hortonworks is an enterprise software company that focuses on the development and support of Apache Hadoop.
      • Its product is the Hortonworks Data Platform, an open source platform for storing, processing and analyzing large volumes of data from many sources and in a variety of formats.
      • Hortonworks recently introduced its Hive ODBC Driver 1.0, which allows users to integrate its Hadoop platform with the BI apps running on top.
  • 8. Jim is the Director of Product Marketing at Hortonworks. He is a recovering developer, professional marketer and amateur photographer with nearly twenty years' experience building products and developing emerging technologies. During his career, he has brought multiple products to market in a variety of fields, including data loss prevention, master data management and now big data. At Hortonworks, Jim is focused on accelerating the development and adoption of Apache Hadoop.
  • 9. Hadoop: What It Is & Isn’t October 2012 Jim Walker Director, Product Marketing Hortonworks © Hortonworks Inc. 2012 Page 9
  • 10. Big Data: Organizational Game Changer. Transactions + Interactions + Observations = BIG DATA. [Chart: data volume (Megabytes, Gigabytes, Terabytes, Petabytes) against increasing data variety and complexity, with tiers ERP, CRM, WEB and BIG DATA. Chart labels: Purchase detail, Purchase record, Payment record, Segmentation, Offer details, Offer history, Customer Touches, Support Contacts, Web logs, A/B testing, Behavioral Targeting, Dynamic Pricing, Search Marketing, Affiliate Networks, Dynamic Funnels, Mobile Web, Sentiment, User Click Stream, SMS/MMS, Speech to Text, Social Interactions & Feeds, Spatial & GPS Coordinates, Sensors / RFID / Devices, Business Data Feeds, External Demographics, User Generated Content, HD Video, Audio, Images, Product/Service Logs.]
  • 11. What is a Data Driven Business? DEFINITION: Better use of available data in the decision making process. RULE: Key metrics derived from data should be tied to goals. PROVEN RESULTS: Firms that adopt Data-Driven Decision Making have output and productivity that is 5-6% higher than what would be expected given their investments and usage of information technology.* (* “Strength in Numbers: How Does Data-Driven Decisionmaking Affect Firm Performance?” Brynjolfsson, Hitt and Kim, April 22, 2011)
  • 12. Big Data: Optimize Outcomes at Scale. Media: optimize Content. Intelligence: optimize Detection. Finance: optimize Algorithms. Advertising: optimize Performance. Fraud: optimize Prevention. Retail / Wholesale: optimize Inventory turns. Manufacturing: optimize Supply chains. Healthcare: optimize Patient outcomes. Education: optimize Learning outcomes. Government: optimize Citizen services. Source: Geoffrey Moore, Hadoop Summit 2012 keynote presentation.
  • 13. Enterprise Big Data Flows. [Diagram: Unstructured Data, Log files, Exhaust Data, Social Media, Sensors/devices and DB data feed the Big Data Platform; Business Transactions & Interactions (CRM, ERP, Web, Mobile, Point of sale) connect through Classic Data Integration & ETL; results flow to Business Intelligence & Analytics (Dashboards, Reports, Visualization, …).] 1. Capture Big Data: collect data from all sources, structured & unstructured. 2. Process: transform, refine, aggregate, analyze, report. 3. Distribute Results: interoperate and share data with applications/analytics. 4. Feedback: use operational data w/in big data platform, preserve data.
  • 14. Data Platform Requirements for Big Data. Capture: collect data from all sources, structured and unstructured data; all speeds: batch, async, streaming, real-time. Process: transform, refine, aggregate, analyze, report. Exchange: deliver data with enterprise data systems; share data with analytic applications and processing. Operate: provision, monitor, diagnose, manage at scale; reliability, availability, affordability, scalability, interoperability. Across all deployment models: Operating Systems, Virtual Platforms, Cloud Platforms, Big Data Appliances.
  • 15. Apache Hadoop & Big Data Use Cases. Big Data: Transactions, Interactions, Observations. Business Case: Refine, Explore, Enrich.
  • 16. Operational Data Refinery: Hadoop as platform for ETL modernization (Refine). Capture: capture new unstructured data along with log files, all alongside existing sources; retain inputs in raw form for audit and continuity purposes. Process: parse the data & cleanse; apply structure and definition; join datasets together across disparate data sources. Exchange: push to existing data warehouse for downstream consumption; feeds operational reporting and online systems. [Diagram: Unstructured, Log files and DB data → Capture and archive → Parse & Cleanse → Structure and join → Upload from Refinery to Enterprise Data Warehouse.]
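The capture, process and exchange stages on slide 16 can be illustrated with a minimal, single-machine Python sketch. This is not Hadoop code and not Hortonworks' implementation; the log format, the field names and the `CUSTOMERS` lookup table are all hypothetical, chosen only to show the shape of a refinery-style pipeline.

```python
import csv
import io

# Hypothetical raw inputs: web log lines and a small customer table.
RAW_LOGS = [
    "2012-10-01 12:00:01 u1 /home",
    "bad line",
    "2012-10-01 12:00:05 u2 /cart ",
    "2012-10-01 12:01:10 u1 /checkout",
]
CUSTOMERS = {"u1": "Alice", "u2": "Bob"}


def parse_and_cleanse(lines):
    """Process: parse raw lines into dicts, skipping malformed ones."""
    for line in lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # cleanse: drop records that do not fit the format
        date, time, user_id, path = parts
        yield {"date": date, "time": time, "user_id": user_id, "path": path}


def join_customers(records, customers):
    """Process: join parsed records with the customer table on user_id."""
    for rec in records:
        name = customers.get(rec["user_id"])
        if name is not None:
            yield {**rec, "customer": name}


def refine(raw_lines, customers):
    """Capture the raw input unchanged, then parse, cleanse and join it."""
    archive = list(raw_lines)  # capture: retain inputs in raw form
    rows = list(join_customers(parse_and_cleanse(archive), customers))
    return archive, rows


def to_csv(rows):
    """Exchange: serialize refined rows for a downstream warehouse load."""
    buf = io.StringIO()
    fields = ["date", "time", "user_id", "path", "customer"]
    writer = csv.DictWriter(buf, fieldnames=fields)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


archive, rows = refine(RAW_LOGS, CUSTOMERS)
print(to_csv(rows))
```

The raw archive is kept separate from the refined rows, which mirrors the slide's point about retaining inputs for audit while pushing only structured output downstream.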
  • 17. Big Data Exploration & Visualization: Hadoop as agile, ad-hoc data mart (Explore). Capture: capture multi-structured data and retain inputs in raw form for iterative analysis. Process: parse the data into queryable format; explore & analyze using Hive, Pig, Mahout and other tools to discover value; label data and type information for compatibility and later discovery; pre-compute stats, groupings, patterns in data to accelerate analysis. Exchange: use visualization tools to facilitate exploration and find key insights; optionally move actionable insights into EDW or datamart. [Diagram: Unstructured, Log files and DB data → Capture and archive → Structure and join → Categorize into tables → Explore; JDBC / ODBC to Visualization Tools; optional upload to EDW / Datamart.]
  • 18. Application Enrichment: Deliver Hadoop analysis to online apps (Enrich). Capture: capture data that was once too bulky and unmanageable. Process: uncover aggregate characteristics across data; use Hive, Pig and MapReduce to identify patterns; filter useful data from mass streams (Pig); micro or macro batch oriented schedules. Exchange: push results to HBase or other NoSQL alternative for real time delivery; use patterns to deliver right content/offer to the right person at the right time. [Diagram: Unstructured, Log files and DB data → Capture → Parse → Derive/Filter → Enrich, scheduled & near real time; low latency via NoSQL, HBase to Online Applications.]
  • 19. Hadoop in Enterprise Data Architectures. [Diagram labels: Existing Business Infrastructure, Web, New Tech; Datameer, Tableau, Karmasphere, Splunk; IDE & Dev Tools; ODS & Datamarts; Applications & Spreadsheets; Visualization & Intelligence; Web Applications; Operations; Discovery Tools; EDW; Low Latency/NoSQL; Custom; Existing. Hadoop components: Templeton, WebHDFS, Sqoop, Flume, HCatalog, HBase, Pig, Hive, MapReduce, HDFS, Ambari, Oozie, HA, ZooKeeper. Big Data Sources (transactions, observations, interactions): Social Media, Exhaust Data, logs, files, CRM, ERP, financials.]
  • 20. Where Does It Fit into Your Business? [Table of use cases by vertical, under the columns Refine, Explore, Enrich.] Retail & Web: Log Analysis/Site Optimization; Social Network Analysis; Dynamic Pricing; Session & Content Optimization. Retail: Loyalty Program Optimization; Brand and Sentiment Analysis; Dynamic Pricing/Targeted Offer. Intelligence: Threat Identification; Person of Interest Discovery; Cross Jurisdiction Queries. Finance: Risk Modeling & Fraud Identification; Trade Performance Analytics; Surveillance and Fraud Detection; Customer Risk Analysis; Real-time upsell, cross sales marketing offers. Energy: Smart Grid: Production Optimization; Grid Failure Prevention; Smart Meters; Individual Power Grid. Manufacturing: Supply Chain Optimization; Customer Churn Analysis; Dynamic Delivery; Replacement parts. Healthcare & Payer: Electronic Medical Records (EMPI); Clinical Trials Analysis; Insurance Premium Determination.
  • 21. Hortonworks Vision & Leadership. We believe that by the end of 2015, more than half the world's data will be processed by Apache Hadoop. Trusted: stewards of core Hadoop; original builders and operators of Hadoop; 100+ years Hadoop development experience; managed every viable, stable Hadoop release; HDP built on Hadoop 1.0. Open: 100% open platform; no POS holdback; open to the Hadoop community; open to the Hadoop ecosystem; closely aligned to Hadoop core. Innovative: innovating current platform with HCatalog, Ambari, HA; innovating future platform with YARN, HA; complete vision for Hadoop-based platform; enable the Hadoop ecosystem.
  • 22. Hortonworks Data Platform. Simplify deployment to get started quickly and easily. Monitor and manage any size cluster with familiar console and tools. Only platform to include data integration services to interact with any data. Metadata services open the platform for integration with existing applications. Dependable high availability architecture. Tested at scale to future proof your cluster growth. ✓ Reduce risks and cost of adoption. ✓ Lower the total cost to administer and provision. ✓ Integrate with your existing ecosystem.
  • 24. “In pioneer days they used oxen for heavy pulling, and when one ox couldn't budge a log, they didn't try to grow a larger ox. We shouldn't be trying for bigger computers, but for more systems of computers.” Grace Hopper © Third Nature Inc.
  • 25. What’s different today? We’re not getting more CPU speed, but more CPU cycles. There are too many CPUs relative to other resources, creating an imbalance in hardware platforms. We therefore use nodes to aggregate memory, network bandwidth and IOPs. Most software is designed for a single worker, not high degrees of parallelism, and won’t scale well.
  • 26. Data volume is the oldest, easiest problem. [Image: Teradata]
  • 27. Analytics makes the data volume problem bigger. Many of the processing problems are O(n²) or worse, so moderate data can be a problem for DW architectures.
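To make the scaling claim on slide 27 concrete, the short Python sketch below compares operation counts for a linear scan with those for an all-pairs comparison. All-pairs comparison is used here only as an assumed example of an O(n²) analytic workload; the slide does not name a specific algorithm.

```python
def linear_ops(n):
    """Operations for a single pass over n records: O(n)."""
    return n


def pairwise_ops(n):
    """Number of unordered record pairs among n records: O(n^2)."""
    return n * (n - 1) // 2


# Each tenfold increase in records multiplies the pairwise work by about 100.
for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7}  linear={linear_ops(n):>7}  pairwise={pairwise_ops(n):>13}")
```

Growing the data tenfold raises the linear cost tenfold but the pairwise cost roughly a hundredfold, which is why a data set that is moderate in volume can still be expensive to analyze.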
  • 28. “I need that data now.” “It would be logical to keep all the data in one place.” “It will take 6 months.” A common problem with new projects or unexpected business problems…
  • 29. The proposed solution? Load Hadoop and analyze
  • 30. Welcome to the Hadoop schema! Why soft / no schema can be good: Easier programming. Easier modeling since you don’t have to be perfect in advance, and it’s change-resilient. Join elimination = I/O savings (if no updates).
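The "soft schema" point on slide 30 can be sketched in Python as applying a field list at read time rather than at load time. The records, field names and the `read_with_schema` helper are hypothetical illustrations, not part of any Hadoop API.

```python
import json

# Hypothetical raw records stored as-is; the second one gained a new field.
RAW = [
    '{"user": "u1", "page": "/home"}',
    '{"user": "u2", "page": "/cart", "referrer": "search"}',
]


def read_with_schema(lines, fields, default=None):
    """Project each raw record onto the requested fields at read time."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name, default) for name in fields}


# The original view of the data keeps working after the format changed...
v1 = list(read_with_schema(RAW, ["user", "page"]))
# ...and a later view can pick up the new field without reloading anything.
v2 = list(read_with_schema(RAW, ["user", "page", "referrer"]))
print(v1)
print(v2)
```

Because the stored data is never rewritten, adding `referrer` needs no migration; older records simply yield the default for fields they lack.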
  • 31. Whether to switch from a DB isn’t the right discussion. [Image labels: SQL? Hadoop. SQL! SQL. SQL...]
  • 32. Strategy: There’s a pony in there somewhere
  • 33. …but you need a unicorn to find the pony
  • 34. Questions for discussion. 1. Is scale of data really that much of a problem for most organizations? 2. Hadoop is designed for batch work; how good is it for interactive use? Real-time use cases? 3. How do you define “platform”? 4. ETL modernization is mentioned, but isn’t this a reversion to manual coding? 5. How do you design for long-term use rather than one-off analysis projects? 6. Does open source really matter for this part of the stack?
  • 35. CC Image Attributions. Thanks to the people who supplied the creative commons licensed images used in this presentation: Phone dump - Richard Barnes; ponies in field.jpg - http://www.flickr.com/photos/bulle_de/352732514/
  • 37. This Month: Database. November: Cloud. December: Innovators. January: Big Data. 2013 Editorial Calendar (www.insideanalysis.com)