Intel And Big Data: An Open Platform for Next-Gen Analytics
 

On Intel as a platform for big data: Boyd Davis, VP of the Intel Architecture Group and GM of the Datacenter Software Division, discusses Intel's contributions to and expansion of Hadoop, the foundational technology, as a means to enrich business intelligence and analysis from the edge to the cloud. Head to http://intel.com/bigdata to learn more.


Usage Rights

© All Rights Reserved


    Intel And Big Data: An Open Platform for Next-Gen Analytics Presentation Transcript

    • Open Platform for Next-Gen Analytics. Boyd Davis, VP, Intel Architecture Group; GM, Datacenter Software Division. @IntelITS
    • Legal Information: Today’s presentations contain forward-looking statements. All statements made that are not historical facts are subject to a number of risks and uncertainties, and actual results may differ materially. Please refer to our most recent Earnings Release and our most recent Form 10-Q or 10-K filing for more information on the risk factors that could cause actual results to differ. If we use any non-GAAP financial measures during the presentations, you will find on our website, intc.com, the required reconciliation to the most directly comparable GAAP financial measure. INFORMATION IN THIS DOCUMENT IS PROVIDED “AS IS”. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO THIS INFORMATION INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Intel product plans in this presentation do not constitute Intel plan of record product roadmaps. Please contact your Intel representative to obtain Intel's current plan of record product roadmaps. Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804
    • Making sense of one petabyte: 50x, to read in Library of Congress; 13y, to view as HD Video; 11s, to generate in 2012. http://blogs.loc.gov/digitalpreservation/2011/07/transferring-libraries-of-congress-of-data/
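The petabyte figures on this slide can be turned into rates with simple arithmetic. The following is a minimal sketch, assuming a decimal petabyte (10^15 bytes) and a 365.25-day year; neither assumption is stated on the slide, and the function names are illustrative only.

```python
# Back-of-envelope rates implied by the slide's petabyte figures.
# Assumptions (not from the slide): 1 PB = 10**15 bytes, 1 year = 365.25 days.
PETABYTE_BYTES = 10**15
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def implied_video_bitrate_mbps(years: float) -> float:
    """Average bitrate in megabits/s if one petabyte of video plays for `years`."""
    seconds = years * SECONDS_PER_YEAR
    return PETABYTE_BYTES * 8 / seconds / 1e6


def implied_generation_rate_tb_per_s(seconds: float) -> float:
    """Worldwide data rate in terabytes/s if one petabyte appears every `seconds`."""
    return PETABYTE_BYTES / seconds / 1e12


if __name__ == "__main__":
    print(f"13 years of video per PB: {implied_video_bitrate_mbps(13):.1f} Mbit/s")
    print(f"1 PB every 11 s: {implied_generation_rate_tb_per_s(11):.1f} TB/s")
```

Under these assumptions, 13 years of viewing corresponds to about 19.5 Mbit/s, which is plausible for HD video, and one petabyte every 11 seconds corresponds to about 91 TB/s.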
    • Analysis of data can transform society: enhance scientific understanding, drive innovation, and accelerate medical cures; create new business models and improve organizational processes; increase public safety and improve energy efficiency with smart grids.
    • Virtuous cycle of data-driven user experience [diagram: CLIENTS, CLOUD, INTELLIGENT SYSTEMS; richer user experiences, richer data to analyze, richer data from devices]
    • Intel at the intersection of forces behind big data. HPC (Intel® TrueScale Infiniband): enabling exascale computing on massive data sets. Cloud: helping enterprises build open interoperable clouds. Open Source: contributing code and fostering ecosystem. * Other names and brands may be claimed as the property of others.
    • Democratize data analysis from edge to cloud: unlock value in silicon; support open platforms; deliver software value.
    • History of Intel and Apache Hadoop* [timeline, 2009 to 2013. Milestones: Open Cirrus*, HiBench, Release 1.0 (2011), Release 2.0 (2012). Stages: Research, Benchmarking, Tuning, Optimization, Product. Sectors: Web, Retail, Healthcare, Smart City, Telco.] * Other names and brands may be claimed as the property of others.
    • Announcing availability of Intel® Distribution for Apache Hadoop* software: hardware-enhanced performance & security; enables partner innovation in analytics; strengthens Apache Hadoop* ecosystem. * Other names and brands may be claimed as the property of others.
    • Intel® Distribution for Apache Hadoop* software. Intel® Manager for Apache Hadoop software: Deployment, Configuration, Monitoring, Alerts, and Security. Components: Sqoop 1.4.1 (Data Exchange); Oozie 3.3.0 (Workflow); Pig 0.9.2 (Scripting); Mahout 0.7 (Machine Learning); R connectors (Statistics); Hive 0.9.0 (SQL Query); HBase 0.94.1 (Columnar Store); Zookeeper 3.4.5 (Coordination); YARN (MRv2) (Distributed Processing Framework); Flume 1.3.0 (Log Collector); HDFS 2.0.3 (Hadoop Distributed File System). Legend: Intel proprietary; Intel enhancements contributed back to open source; open source components included without change. All external names and brands are claimed as the property of others.
    • Intel® Distribution for Apache Hadoop* software •  Up to 20x faster decryption with AES-NI* •  Optimized with SSD and Cache Acceleration •  Up to 8.5X faster queries in Hive* •  Hardware-enhanced compression with AVX & SSE4.2 •  Automated tuning with Intel® Active Tuner *Based on internal testing
    • Sold with World-Class Intel Support: annual subscription with technical support; support coverage options: 24x7 or 8x5; via solution vendors and service providers.
    • Backed by broad portfolio of datacenter products: Software, Cache Acceleration Software, Server, Storage & Memory, Network.
    • Paul Perez, Vice President and GM, Data Center Group. * Other names and brands may be claimed as the property of others.
    • Intel portfolio delivers balanced performance: shown to improve 1 Terabyte sort from >4 hours to ~7 minutes. Baseline: Intel® Xeon 5690, 7200 HDD, 1GbE Adapter. Upgrades: Intel® Xeon® E5-2690 processor (~50% improved); Intel® SSD 520 Series (~80% improved); Intel® 10GbE Adapters (~50% improved); Intel® Distribution for Apache Hadoop* software (~40% improved). Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Source: Intel internal testing. For more information go to: intel.com/performance. Other brands and names are the property of their respective owners.
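The jump from about 4 hours to about 7 minutes can be sanity-checked with a little arithmetic. This is a minimal sketch, assuming that each "~N% improved" figure on the slide is a reduction in elapsed sort time and that the four upgrades compound multiplicatively. The slide states neither, and the pairing of labels with percentages below is one reading of the flattened chart; all names are illustrative.

```python
# Compound the per-upgrade time reductions quoted on the slide.
# Assumptions (not confirmed by the slide):
#   - each "~N% improved" is a fractional reduction in elapsed sort time
#   - upgrades are applied one after another, so reductions multiply
BASELINE_MINUTES = 4 * 60  # "4 hours" for the 1 TB sort

# (label, fractional reduction); the label pairing is an assumption.
STAGE_REDUCTIONS = [
    ("Xeon E5-2690 processor", 0.50),
    ("SSD 520 Series storage", 0.80),
    ("10GbE adapters", 0.50),
    ("Intel Distribution software", 0.40),
]


def compound_reductions(baseline_minutes, reductions):
    """Return [(label, remaining_minutes)] after applying each reduction in turn."""
    remaining = baseline_minutes
    timeline = []
    for label, fraction in reductions:
        remaining *= 1.0 - fraction
        timeline.append((label, remaining))
    return timeline


if __name__ == "__main__":
    for label, minutes in compound_reductions(BASELINE_MINUTES, STAGE_REDUCTIONS):
        print(f"after {label}: {minutes:.1f} min")
```

Under these assumptions the running time goes 240 → 120 → 24 → 12 → 7.2 minutes, which agrees with the "~7 minutes" headline and corresponds to roughly a 33x overall speedup. Because the reductions multiply, the final figure does not depend on the order or labelling of the stages.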
    • Proven in the enterprise. Using the Intel® Distribution to gain tremendous results IT * Other names and brands may be claimed as the property of others.
    • Satnam Alag, Vice President and CTO. * Other names and brands may be claimed as the property of others.
    • Delivering innovation in the open. Pipeline of innovation from Intel Labs: •  Machine Learning •  Data-Intensive Algorithms & Computer Architecture. Roadmap of open source from Intel Software: •  Project Panthera: Standard SQL on Apache Hadoop •  Project Rhino: Hardening Apache Hadoop
    • Lighting up unused data for big impact: Intel accelerating adoption of Hadoop + Apache Hadoop landing on Intel Xeon 2 years faster. [Chart: Intel® Xeon processor growth from big data use; y-axis: Units; x-axis: 2013 to 2017]
    • With broad support from the ecosystem * Other names and brands may be claimed as the property of others.
    • Enabling partner innovation in next-gen analytics: Richard Pledereder, Senior Vice President, SAP® HANA* Engineering; Steve Garrou, Vice President, Global Solutions; Ranga Rangachari, Vice President and GM, Storage Business; Paul Perez, Vice President and GM, Data Center Group
    • Summary: •  Intel announced Intel® Distribution for Apache Hadoop* software •  Delivers hardware-enhanced capabilities and software enhancements •  Backed by broad portfolio of Intel data center products •  Contributes to open source and supports Apache Hadoop •  Enabling ecosystem of partners to innovate on analytics solutions
    • Q&A
    • Legal Disclaimers: All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice. Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to: http://www.intel.com/products/processor_number Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request. Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit http://www.intel.com/go/virtualization No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit http://www.intel.com/technology/security Intel, Intel Xeon, Intel Atom, Intel Xeon Phi, Intel Itanium, the Intel Itanium logo, the Intel Xeon Phi logo, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. Other names and brands may be claimed as the property of others. Copyright © 2013, Intel Corporation. All rights reserved.
    • Apache Hadoop Performance Test Configuration: 4 hours to 7 minutes
      • Cluster Configuration: 1 Head Node (name node, job tracker); 10 Workers (data nodes, task trackers); 10-Gigabit Switch: Cisco Nexus 5020
      • Software Configuration: Intel Distribution for Apache Hadoop 2.1.1; Apache Hadoop 1.0.3; RHEL 6.3; Oracle Java 1.7.0_05
      • Head Node Hardware: 1 x Dell r710 1U servers; Intel: 2 x 3.47GHz Intel® Xeon® processor X5690; Memory: 48G RAM; Storage: 10K SAS HDD; Intel® Ethernet 10 Gigabit SFP+; Intel® Ethernet 1 Gigabit
      • Worker Node Hardware: 10 x Dell r720 2U servers; Intel: 2 x 2.90GHz Intel® Xeon® processor E5-2690; Memory: 128G RAM; Storage: 520 Series SSDs; Intel® Ethernet 10 Gigabit SFP+; Intel® Ethernet 1 Gigabit
      • Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. Note: The below disclaimer should be included whenever the general performance disclaimer is used, but should be numbered separately: Configurations: [describe config + what test used + who did testing]. For more information go to http://www.intel.com/performance