Intel’s Big Data and Hadoop Security Initiatives - StampedeCon 2014

At StampedeCon 2014, Todd Speck (Intel) presented "Intel’s Big Data and Hadoop Security Initiatives."

In this talk, we will cover various aspects of software and hardware initiatives that Intel is contributing to Hadoop as well as other aspects of our involvement in solutions for Big Data and Hadoop, with a special focus on security. We will discuss specific security initiatives as well as our recent partnership with Cloudera. You should leave the session with a clear understanding of Intel’s involvement and contributions to Hadoop today and coming in the near future.

Published in Technology, Business

  • 1. Intel Confidential (Do Not Forward). Intel in Big Data and the Internet of Things. Todd Speck, Director, Intel 3SO. 5/30/14
  • 2. Legal Disclaimers
      • All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
      • Intel processor numbers are not a measure of performance. Processor numbers differentiate features within each processor family, not across different processor families. Go to:
      • Intel processors, chipsets, and desktop boards may contain design defects or errors known as errata, which may cause the product to deviate from published specifications. Current characterized errata are available on request.
      • Intel® Virtualization Technology requires a computer system with an enabled Intel® processor, BIOS, and virtual machine monitor (VMM). Functionality, performance or other benefits will vary depending on hardware and software configurations. Software applications may not be compatible with all operating systems. Consult your PC manufacturer. For more information, visit
      • No computer system can provide absolute security under all conditions. Intel® Trusted Execution Technology (Intel® TXT) requires a computer system with Intel® Virtualization Technology, an Intel TXT-enabled processor, chipset, BIOS, Authenticated Code Modules and an Intel TXT-compatible measured launched environment (MLE). Intel TXT also requires the system to contain a TPM v1.s. For more information, visit
      • Requires a system with Intel® Turbo Boost Technology capability. Consult your PC manufacturer. Performance varies depending on hardware, software and system configuration. For more information, visit
      • Intel® AES-NI requires a computer system with an AES-NI enabled processor, as well as non-Intel software to execute the instructions in the correct sequence. AES-NI is available on select Intel® processors. For availability, consult your reseller or system manufacturer. For more information, see advanced-encryption-standard-instructions-aes-ni/
      • Intel product is manufactured on a lead-free process. Lead is below 1000 PPM per EU RoHS directive (2002/95/EC, Annex A). No exemptions required.
      • Halogen-free: Applies only to halogenated flame retardants and PVC in components. Halogens are below 900ppm bromine and 900ppm chlorine.
      • Intel, Intel Xeon, Intel Core microarchitecture, the Intel Xeon logo and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.
  • 3. Big Opportunity: Extract value from data
    THINGS (50 Billion) x DATA (35 ZB) = VALUE (Revenue Growth, Cost Savings, Margin Gain)
  • 4. Big Gap: Roadblocks on the journey
    THINGS (50 Billion) x DATA (35 ZB) = VALUE (Revenue Growth, Cost Savings, Margin Gain)
    NO SECURITY, NO INSIGHT, NO PROOF
      • Pay more for data management
      • Delay insights with batch processing
      • Waste time on misguided pilots
      • Use sub-optimal hardware
      • Hold back production deployment
      • Worry about attacks
      • Bring data to compute, fail to scale
      • Fail to show ROI
      • Store underutilized data
  • 5. Big Picture: Datacenter Inflection
      1. RISC to IA, UNIX to Linux [chart: Linux/x86 units vs. UNIX/RISC units, 2000 to 2013]
      2. Physical to Virtual, SW-only to HW-assisted [chart: virtualized vs. nonvirtualized, 2008 to 2013]
      3. Cluster to Cloud, ASIC to IA/Fabric [chart: public vs. private, 2010 to 2013]
      4. Big Data
    “In 2000 Intel saw Linux coming & invested heavily in Red Hat; in 2005 we saw virtualization happening and invested in VMware; in 2008 we started investing heavily in hyper-scale computing. We think big data & Hadoop will dwarf all of them.” Diane Bryant, SVP & GM Data Center Group, Intel
  • 6. Now introducing our Strategic Big Data Partner
    Enabling the Apache Hadoop ecosystem with joint leadership
    Other brands and names are the property of their respective owners.
  • 7. Big Deal: Cloudera + Intel Alliance
    Intel invests $740M in Cloudera
      • As Intel’s largest datacenter venture deal, represents Intel commitment to big data
      • Supports Cloudera’s ability to remain independent
    Intel & Cloudera drive innovation through open source
      • Accelerate evolution of Hadoop by joining forces on foundational technologies
      • Enable open source developers to innovate in and on top of the Hadoop platform
    Intel enables CDH to run best on Intel Architecture
      • Enables Cloudera to make best use of Intel data center technologies
      • Provides datacenter infrastructure for Cloudera development & benchmarking at scale
  • 8. Big Goal: Converge on one open source platform
      • Most stable, compatible, and mature Hadoop distribution
      • Leading SQL functionality & performance (Impala)
      • Deepest management and governance capabilities
      • 150 Hadoop developers
      • 100 open source committers
      • The only distribution with performance and security enhanced from the silicon up
      • Leading security capabilities including encryption, access control, and auditing
      • 50 Hadoop developers and 12 committers
      • Long-standing commitment to open source with 1000 developers working on Linux, KVM, Xen, Java, OpenStack, Hadoop
  • 9. Driving innovation through open source
    Ramp innovation in Apache Hadoop platform while reducing fragmentation
      • SQL: Project Gryphon, Impala -> Impala
      • Streaming: Apache Storm, Apache Spark Streaming -> Spark Streaming
      • Performance: Apache Tez, Apache Spark -> Spark
      • Security: Project Rhino, Apache Sentry -> Project Rhino (including Sentry)
      • Storage: Apache HDFS, Apache HBase -> Accelerated investment in both
  • 10. Enabling CDH to run best on Intel Architecture
    Software & Silicon co-evolve to deliver dramatic gains
      1. Push compute-intensive work down to the silicon: Encryption (AES-NI), Compression (SSE 4.2), Math (MKL)
      2. Increase main memory utilization up to 20X: improve Disk:Memory from 200:1 to 10:1
      3. Design for rack-scale architecture
  • 11. Focus of Joint Engineering (Feature / Target: Cloudera Enterprise)
    SECURITY
      • HDFS Encryption and extended file ACLs
      • Centralized authorization via Sentry
      • Simplified Kerberos
      • HBase cell-level authorization
      • Search: document and index security
      • Auditing & data lineage
    PERFORMANCE
      • Crypto acceleration with AES-NI
      • MR/Shuffle optimizations
      • Compression acceleration with SSE 4.2
      • Optimizations using AVX and other IA
      • Optimizations using MKL
      • Explore Xeon Phi with Java support
    MANAGEMENT
      • Service management extensions
      • Simplified cloud provisioning, including AWS support
      • Backup and Disaster Recovery
      • Deeper diagnostics of various modules
      • Support for Azure, VMware, OpenStack
      • Extended RBAC in Cloudera Manager
    APPLICATIONS
      • Certified w/ Intel Enterprise Edition of Lustre
      • Impala enhancements including low-latency SQL engine, SQL-92 analytic queries, and more
      • Spark support in CDH, including Spark on YARN, Spark security, and Spark streaming
      • SQL on HBase
      • Spark interoperability with Impala
      • Wire encryption for Spark
      • Pig integration with Spark
      • Spark/Sentry integration
  • 12. Cloudera Enterprise Data Hub powered by Apache Hadoop
    [diagram: Enterprise Data Hub, powered by Apache Hadoop]
      • Storage for Any Type of Data (Unified, Elastic, Resilient, Secure): Filesystem, Online NoSQL
      • Workload Management
      • Processing: Batch Processing, Analytic SQL, Search, Machine Learning, Stream Processing, 3rd Party Apps
      • System Management, Data Management
      • Attributes: Open Source, Scalable, Flexible, Cost-Effective
      • ✔ Managed ✔ Open Architecture ✔ Secure ✔ Governed
  • 13. Improving Apache Hadoop performance with IA
      • Compute: up to 50% faster (compared to previous generation)
      • Storage & Memory: up to 80% faster (SSD compared to HDD)
      • Network: up to 50% faster (10GbE compared to 1GbE)
    As measured by time to completion of 1TB sort on 10 node cluster. Source: Intel internal testing. For more information go to:
    Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
  • 14. Enabling ecosystem with joint leadership
      • Market leader in big data management systems
      • Largest base of paid customers & free users
      • Consistently delivering industry-leading capabilities around Apache Hadoop
      • Market leader in silicon
      • Long & successful history of investment and collaboration with software platforms
      • Global reach; market leading Hadoop distribution in China
  • 15. Joint customers leading the way (Revenue Growth, Cost Savings, Margin Gain)
      • Captures TBs of data from smart meters
      • Analyzes usage patterns to optimize consumption
      • $320M USD in utility savings
    “Utilities simply can’t cope with the vast volumes of smart meter data – not just with storing the data, but being able to analyze it and put it to use.” Drew Hylbert, VP Technology & Infrastructure, Opower
  • 16. Unlocking Big Data Value With Graph Analytics (Beta test via Intel)
    [diagram: Input Data (HDFS*, DB, Web, Docs), Construct Graph, Analyze Graph, Mine Graph, Insight & Prediction; Graph Query Processing & Storage]
      • Deliver a fully-integrated solution that is easy to program
      • Scale like Hadoop*; speed and accuracy of in-memory graph analytics and mining
      • Enable applications in network security, retail, life sciences, financial markets, etc.
  • 17. Enablement Professional Services (EPS)
  • 18. Summary: Faster Insights, Better Security, Less Complexity
    Accelerate innovation via open source software
      • Maintain an open horizontal platform for big data
      • Continue to enhance Apache Hadoop and related projects
    Enable CDH to run best on IA
      • Optimize performance across compute, storage, & network
      • Ensure platform security, enhanced by hardware
    Foster evolution of big data ecosystem
      • Establish usage models and industry standard benchmarks
      • Develop reference architectures and industry-wide solutions
  • 19. More Resources
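
Slide 10 pairs "Increase main memory utilization up to 20X" with the Disk:Memory figures 200:1 and 10:1. The slide does not say how the two relate; one plausible reading, offered here only as an assumption, is that the 20X figure is the change in memory provisioned per unit of disk when the ratio moves from 200:1 to 10:1. The function name below is hypothetical and exists only for this illustration.

```python
from fractions import Fraction


def memory_share_gain(old_disk_per_mem: int, new_disk_per_mem: int) -> Fraction:
    """Factor by which memory per unit of disk grows when the
    Disk:Memory ratio drops from old_disk_per_mem:1 to new_disk_per_mem:1.

    Assumption: this is how the slide's "up to 20X" relates to 200:1 and 10:1.
    """
    if old_disk_per_mem <= 0 or new_disk_per_mem <= 0:
        raise ValueError("ratios must be positive")
    return Fraction(old_disk_per_mem, new_disk_per_mem)


if __name__ == "__main__":
    gain = memory_share_gain(200, 10)
    print(f"Memory per unit of disk grows {gain}x")
```

Under that reading, the same disk capacity is backed by twenty times as much main memory, which is consistent with the slide's headline number.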