Hortonworks for Financial Analysts Presentation

Hortonworks presentation from Cowen Big Data Day for financial industry analysts

  • Our commitment to Apache has already changed the market! Ultimately, contributing the code that matters and making it work is the currency in open source
  • Our commitment is to continue growing our contribution
  • For more information on the history of Hadoop, see: http://developer.yahoo.com/blogs/hadoop/posts/2011/01/the-backstory-of-yahoo-and-hadoop/

Presentation Transcript

  • Hortonworks
    Eric Baldeschwieler, Co-Founder and CEO
    September 2011
    Overview for Cowen Big Data Day 2011
    © Hortonworks Inc. 2011
  • Agenda
    Hortonworks
    Apache Hadoop
    Use cases
    Hadoop in the Enterprise
    Market
    Strategy
  • About Hortonworks – Basics
    Founded – July 1st, 2011
    22 architects & committers from Yahoo!
    Mission – Architect the future of Big Data
    Revolutionize and commoditize the storage and processing of Big Data via open source
    Vision – Half of the world's data will be stored in Hadoop within five years
  • About Hortonworks – Game Plan
    Support the growth of a huge Apache Hadoop ecosystem
    Invest in ease of use, management, and other enterprise features
    Define APIs for ISVs, OEMs and others to integrate with Apache Hadoop
    Continue to invest in advancing the Hadoop core, remain the experts
    Contribute all of our work to Apache
    Profit by providing training & support to the Hadoop community
  • Credentials
    Technical: key architects and committers from Yahoo! Hadoop engineering team
    Delivered every major Apache Hadoop release since 0.1
    Highest concentration of Apache Hadoop committers
    Driving innovation across entire Apache Hadoop stack
    Experience managing world’s largest deployment
    Access to Yahoo!’s 1,000+ users and 42k+ nodes for testing, QA, etc.
    Business operations: team of highly successful open source veterans
    Led by Rob Bearden, former COO of SpringSource & JBoss
    Investors: backed by Benchmark Capital and Yahoo!
  • What is Apache Hadoop?
    Set of open source projects
    Owned by Apache Software Foundation
    Transforms commodity hardware into a service that:
    Stores petabytes of data reliably (HDFS)
    Allows huge distributed computations (MapReduce)
    Key attributes:
    Redundant and reliable
    Doesn’t stop or lose data even if hardware fails
    Easy to program
    Extremely powerful
    Allows the development of big data algorithms & tools
    Batch processing centric
    Runs on commodity hardware
    Computers & network
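The MapReduce model named on the slide above can be illustrated with a minimal, single-process sketch. This is not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count task are assumptions chosen for illustration. In a real cluster the framework runs the map and reduce steps in parallel across many machines, over data stored in HDFS.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence over an in-memory dataset."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    lines = ["big data on commodity hardware", "big clusters store big data"]
    print(run_job(lines))
```

Because each map call sees only one record and each reduce call sees only one key, the work can be split across machines without the programmer managing the distribution, which is roughly what the slide means by "easy to program".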
  • Typical Hadoop Applications
    data analytics
    advertising optimization
    machine learning search ranking
    Mail anti-spam
    advertising data systems
    audience, ad and search pipelines
    ad selection
    Website personalization
    Content Optimization
    ad inventory prediction
    user interest prediction
  • Who Builds Hadoop? Lines of code contributed since Hadoop inception
  • Who Builds Hadoop? Lines of code contributed in 2011
  • Apache Hadoop: A Brief History
    2006 – present: Early adopters scale and productize Hadoop
    2008 – present: Other Internet companies add tools / frameworks, enhance Hadoop
    2010 – present: Service providers provide training, support, hosting
    Nascent / 2011: Wide enterprise adoption funds further development, enhancements
  • HADOOP @ YAHOO!
    40K+ Servers
    170 PB Storage
    5M+ Monthly Jobs
    1000+ Active users
    © Yahoo 2011
  • CASE STUDY
    YAHOO! HOMEPAGE
    Personalized for each visitor
    Result: twice the engagement
    News Interests
    Top Searches
    Recommended links
    +43% clicks vs. editor selected
    +79% clicks vs. randomly selected
    +160% clicks vs. one size fits all
  • CASE STUDY
    YAHOO! HOMEPAGE
    SCIENCE HADOOP CLUSTER
    • Serving Maps
    • Users - Interests
    • Five Minute Production
    • Weekly Categorization models
    » Machine learning to build ever better categorization models
    CATEGORIZATION MODELS (weekly)
    USER BEHAVIOR
    PRODUCTION HADOOP CLUSTER
    » Identify user interests using Categorization models
    SERVING MAPS (every 5 minutes)
    USER BEHAVIOR
    Build customized home pages with latest data (thousands / second)
    SERVING SYSTEMS
    ENGAGED USERS
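The two-cadence flow on the slide above (models rebuilt weekly, serving maps rebuilt every five minutes, pages assembled by lookup) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not Yahoo!'s implementation: the function names, the majority-vote "model", and the input shapes are all hypothetical.

```python
from collections import Counter, defaultdict


def train_categorization_model(labeled_clicks):
    """Weekly 'science' step: learn the most common category for each item."""
    votes = defaultdict(Counter)
    for item, category in labeled_clicks:
        votes[item][category] += 1
    return {item: counts.most_common(1)[0][0] for item, counts in votes.items()}


def build_serving_map(model, recent_events):
    """Five-minute 'production' step: map each user to ranked interests."""
    interests = defaultdict(Counter)
    for user, item in recent_events:
        category = model.get(item)
        if category is not None:  # items the model has never seen are skipped
            interests[user][category] += 1
    return {
        user: [category for category, _ in counts.most_common()]
        for user, counts in interests.items()
    }


def serve_home_page(serving_map, user, default=("top stories",)):
    """Serving step: a cheap lookup, falling back to a generic page."""
    return serving_map.get(user, list(default))


if __name__ == "__main__":
    model = train_categorization_model(
        [("a1", "sports"), ("a1", "sports"), ("a1", "finance"), ("a2", "finance")]
    )
    serving_map = build_serving_map(model, [("u1", "a1"), ("u1", "a1"), ("u1", "a2")])
    print(serve_home_page(serving_map, "u1"))
    print(serve_home_page(serving_map, "u2"))
```

The split keeps the expensive step (model building) on a slow schedule and the cheap step (scoring recent behavior) on a fast one, so the serving systems only ever perform a lookup.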
  • CASE STUDY
    YAHOO! MAIL
    Enabling quick response in the spam arms race
    SCIENCE
    • 450M mail boxes
    • 5B+ deliveries/day
    • Antispam models retrained every few hours on Hadoop
    PRODUCTION
    40% less spam than Hotmail and 55% less spam than Gmail
  • Hadoop in the Enterprise
  • Big Data Platforms: Cost per TB, Adoption
    Size of bubble = cost effectiveness of solution
    Source:
  • Traditional Enterprise Architecture: Data Silos + ETL
    Traditional Data Warehouses, BI & Analytics
    Serving Applications
    Web Serving
    NoSQL, RDBMS
    Traditional ETL & Message buses
    EDW
    Data Marts
    BI / Analytics
    Traditional ETL & Message buses
    Serving Logs
    Social Media
    Sensor Data
    Text Systems
    Unstructured Systems
  • Hadoop Enterprise Architecture: Connecting All of Your Big Data
    Traditional Data Warehouses, BI & Analytics
    Serving Applications
    Web Serving
    NoSQL, RDBMS
    Traditional ETL & Message buses
    EDW
    Data Marts
    BI / Analytics
    Apache Hadoop
    EsTsL (s = Store)
    Custom Analytics
    Traditional ETL & Message buses
    Serving Logs
    Social Media
    Sensor Data
    Text Systems
    Unstructured Systems
  • Hadoop Enterprise Architecture: Connecting All of Your Big Data
    Traditional Data Warehouses, BI & Analytics
    Serving Applications
    Web Serving
    NoSQL, RDBMS
    Traditional ETL & Message buses
    EDW
    Data Marts
    BI / Analytics
    Apache Hadoop
    EsTsL (s = Store)
    Custom Analytics
    Gartner predicts 800% data growth over next 5 years
    80-90% of data produced today is unstructured
    Traditional ETL & Message buses
    Serving Logs
    Social Media
    Sensor Data
    Text Systems
    Unstructured Systems
  • The Hadoop Market
  • Market Drivers for Apache Hadoop
    Business drivers
    Identified high value projects that require use of more data
    Belief that there is great ROI in mastering big data
    Financial drivers
    Growing cost of data systems as proportion of IT spend
    Cost advantage of commodity hardware + open source
    Enables departmental-level big data strategies
    Technical drivers
    Existing solutions failing under growing requirements
    3Vs - Volume, velocity, variety
    Proliferation of unstructured data
    Significant opportunity for Hadoop in enterprise data architectures
  • Market Opportunity for Hadoop
    Current
    Apache Hadoop can become de facto platform for managing unstructured data in the enterprise
    Enable new breed of applications to be built on top of Apache Hadoop
    Future
    Hadoop becomes the next generation enterprise data architecture
  • Market Dynamics
    Technology & knowledge gaps are preventing Apache Hadoop from becoming an enterprise standard
    Difficult to install and deploy Hadoop projects
    Lack of technical content to assist
    Demand for knowledgeable developers far exceeds supply
    Virtually every F500 company is constructing a Hadoop strategy
    But most are still in POC/experimentation phase with Hadoop
    Top ISV/OEMs working to create Hadoop strategies
    Driven by customer demand
    Community is becoming increasingly confused by all of the noise
    Multiple distributions, many vendor announcements
    Fear of market fragmentation
  • Conclusion
    There is not a Hadoop market to “win” today
    Most organizations haven’t moved to full-scale production
    Lack of mass adoption limiting short-term monetization opportunities
    Need to drive Apache Hadoop as a unifying standard
    In order to succeed, we need to enable the market
    Continue investment to overcome technology gaps
    Enable a vibrant partner ecosystem
    Expand availability of content and services to address knowledge gaps
    How will Hortonworks do that?
  • Hortonworks Strategy
  • Hortonworks Strategy #1: Overcome Technology Gaps
    Make Apache Hadoop projects easier to install, manage & use
    Regular sustaining releases
    Projects released as binary (RPM, .deb)
    Open source Management & Monitoring
    Make Apache Hadoop more robust
    Performance gains
    High availability
    Administration & monitoring
    All done within Apache Hadoop community
    • Develop collaboratively with community
    • Complete transparency
    • All code contributed back to Apache
    Anyone should be able to easily deploy the Hadoop projects from Apache
  • Hortonworks Strategy #2: Enable a Vibrant Ecosystem
    Unify the community around a strong Apache Hadoop offering
    Make Apache Hadoop easier to integrate & extend
    Work closely with partners to define and build open APIs
    Everything contributed back to Apache
    Provide enablement services as necessary to optimize integration
    Integration & Services Partners
    Hadoop Application Partners
    DW, Analytics & BI Partners
    Serving & Unstructured Data Systems Partners
    Hardware Partners
    Cloud & Hosting Platform Partners
  • Hortonworks Strategy #3: Overcome Knowledge Gaps
    Improve user experience with Apache Hadoop software
    Binaries, installers, etc.
    Expand Apache Hadoop technical content
    Core content on Apache.org
    Docs, installation guides, etc.
    Advanced tools on Hortonworks.com
    Best practices, screencasts, forums, etc.
    Extensive Hadoop training & certification program
    Expert technical support services
  • Rationale for Hortonworks Strategy
    Strong interest from community (enterprises and ISV/OEMs) in a complete, enterprise-viable, Apache Hadoop platform
    Strong desire for core to remain unified and strong, avoid UNIX wars II
    Freemium model seen as a barrier to growth and adoption
    Highly defensible because of Hortonworks leadership in core projects
    Proven experience executing open source business models
    Rob Bearden & Benchmark
  • Thank You.