Cisco UCS with the Intel Distribution for Apache Hadoop Software



Cisco and Intel have a long history of collaboration and innovation that was first
demonstrated with the announcement of the Cisco Unified Computing System
in 2009. In their long-term collaboration, the two companies have worked
together to design and deliver the next generation of open standards-based big
data deployment architectures for enterprises. The solution combines the Intel
Distribution for Apache Hadoop software with the Cisco® Common Platform
Architecture (CPA) for Big Data. The result is an enterprise-class solution that
delivers performance and capacity while reducing risk and accelerating deployment.

Published in: Technology



  1. Cisco UCS with the Intel Distribution for Apache Hadoop Software

Solution Brief, February 2013

The Cisco Unified Computing System™ (Cisco UCS®) with the Intel® Distribution for Apache Hadoop software uses the power of hardware-enhanced software to deliver performance, capacity, and security for enterprise-class Hadoop deployments.

Cisco and Intel have a long history of collaboration and innovation that was first demonstrated with the announcement of the Cisco Unified Computing System in 2009. In their long-term collaboration, the two companies have worked together to design and deliver the next generation of open standards-based big data deployment architectures for enterprises. The solution combines the Intel Distribution for Apache Hadoop software with the Cisco® Common Platform Architecture (CPA) for Big Data. The result is an enterprise-class solution that delivers performance and capacity while reducing risk and accelerating deployment.

Highlights

Optimized for Performance
  • The Cisco Unified Computing System™ (Cisco UCS®) with the Intel® Distribution for Apache Hadoop software integrates feature-enhanced software with Cisco UCS servers based on Intelligent Intel® Xeon® processors to propel performance of the most challenging MapReduce and HBase workloads.

Ease of Deployment
  • Cisco UCS Manager and the Intel Manager for Apache Hadoop software automate server infrastructure deployment and scaling, reducing risk of configuration errors that can cause downtime.

Robust Manageability
  • The solution provides a single point of management for up to thousands of servers along with their network infrastructure.

Integration with Enterprise Applications
  • Big data and enterprise applications can coexist in the same system, sharing high-bandwidth connectivity so that analytic results can be quickly put to use.

Architectural Scalability
  • The solution is designed to grow to its maximum scale without the need for complex layers of switching infrastructure.

Enterprise-Class Support
  • Intel provides technical support and professional services for the Intel Distribution. Cisco provides support and services for Cisco UCS.

The Rise of Big Data Technology

Big data technology, and Apache Hadoop in particular, is finding use in an enormous number of applications and is being evaluated and adopted by enterprises of all sizes. As this important technology helps transform large volumes of data into actionable information, many organizations are struggling to deploy effective and reliable Hadoop infrastructure that performs and scales and is appropriate for mission-critical applications in the enterprise. Many of the challenges arise from the friction between the rapid pace of change inherent in open-source software and the need for enterprise-class performance, reliability, and support.

© 2013 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information.
  2. A Unique Solution from Industry Leaders

Cisco UCS with the Intel Distribution for Apache Hadoop software was developed by the two companies to help reduce the time and risk of Hadoop deployment by enhancing features and controlling the release cycle and then optimizing the resulting software for outstanding performance and scalability when it is run on the Cisco CPA. With its enterprise-class support, the solution is a customer-centered platform that can be rapidly deployed, scaled on demand, and secured. The solution has the performance and reliability that organizations need to support their enterprise applications.

Figure 1. Cisco CPA for Big Data Integrates with Enterprise Applications in a Single Management Domain (diagram labels: Cisco Common Platform Architecture for Big Data; Enterprise Applications on Cisco UCS Blade and Rack Servers; Automate Deployment, Managing, and Monitoring with Cisco UCS Manager; Cisco UCS 6200 Series Fabric Interconnects; Cisco Nexus® 2232PP 10GE Fabric Extenders; Cisco UCS C240 M3 Rack Servers; 10-Gbps Unified Fabric; 10 Gigabit Ethernet; SAN Storage)

Cisco UCS with the Intel Distribution for Apache Hadoop software features:

  • Powerful computing infrastructure: Cisco UCS servers are powered by the Intel® Xeon® processor E5 family, the core of a flexible and efficient data center that meets diverse business needs. This family of processors is designed to deliver versatility, with an outstanding combination of performance, built-in capabilities, and cost effectiveness. With these processors, I/O latency is dramatically reduced with Intel Integrated I/O, which helps eliminate data bottlenecks, streamline operations, and increase agility. Complementing the processing power of these servers is the massive storage capacity of Cisco UCS C240 M3 Rack Servers. The servers offer up to 24 Small Form-Factor (SFF) disk drives in the performance-optimized configuration or 12 Large Form-Factor (LFF) disk drives in the capacity-optimized configuration.

  • High-performance unified fabric: The solution's low-latency, lossless 10-Gbps unified fabric is fully redundant. Through its active-active configuration, the fabric delivers high performance and scalability for up to 160 servers in a single switching domain and thousands of servers in a single management domain.

  • Ease of deployment: Cisco UCS is the first unified system built from the beginning so that every aspect of server personality, configuration, and connectivity is set on demand, through Cisco UCS Manager. Through the powerful concept of Cisco service profiles, the Hadoop cluster's servers can be configured rapidly and automatically without the risk of configuration drift that can lead to errors that cause downtime. Unified management in Cisco UCS enables greater agility and more rapid deployment.

  • Robust manageability: Big data environments can consist of hundreds of servers, resulting in immense management complexity. Cisco UCS provides a single point of management for the entire system: for both blade servers supporting enterprise applications and rack servers supporting big data applications. With the system's self-aware, self-integrating infrastructure, IT departments can proactively monitor the system and reduce operating costs.

  • Integration with enterprise applications: Big data environments need high-speed connectivity to transfer results to enterprise applications. The Cisco solution can host the Intel Distribution for Apache Hadoop software and enterprise applications from vendors including Microsoft, Oracle, and SAP in the same management and connectivity domains, further simplifying data center management (Figure 1).

  • Architectural scalability: The system is designed with logically centralized connectivity management that is physically distributed across the racks and blade chassis that house big data and enterprise applications. After the initial system is established, it is designed to grow to maximum size without the need to add any new switching components or redesign the system's connectivity in any way. The solution can be deployed a rack at a time, with the initial rack hosting the system's fabric interconnects (described later in this document). Subsequent racks use Cisco fabric extenders, low-cost, low-power-consumption devices that bring the unified fabric to each server in the rack with no additional points of management.

  • Enterprise service and support: Enterprises using Apache Hadoop to help with business-critical decisions want to know that the vendors providing the solution have the expertise to help them quickly proceed through the initial design, deployment, and testing. They also need to have confidence that they will receive timely and professional support if a critical component fails. One of the factors that makes this solution unique is the collaboration between Cisco and Intel support to make Cisco UCS with the Intel Distribution for Apache Hadoop software a fully supported, enterprise-class solution.

  3. Intel Distribution for Apache Hadoop Software

The Intel Distribution for Apache Hadoop software is a controlled distribution based on the Apache Hadoop software, with feature enhancements, performance optimizations, and security options that are responsible for the solution's enterprise quality. The Intel Distribution for Apache Hadoop software includes (Figure 2):

  • Intel Manager: The Intel Manager for Apache Hadoop software streamlines Hadoop cluster configuration, management, and resource monitoring. This powerful, easy-to-use, web-based tool allows IT departments to focus critical resources and expertise on deriving business value from the Hadoop environment rather than worrying about the details of cluster management. The Intel Manager for Apache Hadoop software provides installation and configuration features, wizard-based cluster management, proactive cluster health checks, monitoring and logging, and secure authentication and authorization.

  • Hadoop Data Storage Framework (HDFS): HDFS is a distributed, scalable, and portable file system that stores data across the cluster nodes. The Intel Distribution for Apache Hadoop software includes compression and encryption for enhanced security and performance.

  • Data Processing Framework (MapReduce): This massively parallel computing framework is inspired by Google's MapReduce papers. The Intel Distribution for Apache Hadoop software includes dynamic replication capabilities that intelligently increase and decrease the number of data replicas according to workload characteristics.

  • Real-Time Query Processing Framework: This component includes HBase, a scalable, distributed, columnar data storage system for large tables, and the Hive data warehouse infrastructure for ad-hoc query processing. The Intel Distribution for Apache Hadoop software includes extensions to support big tables across geographically distributed data centers as well as feature additions that improve HBase and Hive performance.

Figure 2. The Solution Combines the Intel Distribution with the Cisco CPA (diagram layers, top to bottom: Intel Manager for Apache Hadoop Software: Deployment, Configuration, Monitoring, Alerting, and Security; Sqoop: Data Exchange; Flume: Log Collector; Hive: SQL-Like Query; Mahout: Data Mining; Pig: Scripting; Oozie: Workflow; R-Connector; HBase: Columnar Storage; Zookeeper: Coordination; MapReduce: Distributed Processing Framework; HDFS: Hadoop Distributed File System; Cisco Common Platform Architecture for Big Data)

  4. Cisco CPA for Big Data

Cisco UCS with the Intel Distribution for Apache Hadoop software is optimized for high performance on the Cisco Common Platform Architecture for Big Data. The Cisco CPA is a highly scalable architecture designed to meet a variety of scale-out application demands with transparent data and management integration capabilities.

The Cisco CPA is built using Cisco UCS, the first truly unified data center platform that combines industry-standard, x86-architecture servers with networking and storage access in a single system. Cisco UCS is smart infrastructure that is automatically configured through integrated, model-based management to simplify and accelerate deployment of enterprise-class applications and services running in bare-metal, virtualized, and cloud-computing environments. Benefits of the Intel Distribution for Apache Hadoop software available only from Cisco include the capability to unify both big data and enterprise applications in the same centralized management domain.

The Cisco CPA is built using the following components:

  • Cisco UCS 6200 Series Fabric Interconnects establish a single point of connectivity and management for the entire system. The fabric interconnects provide high-bandwidth, low-latency connectivity for servers, with integrated, unified management for all connected devices provided by Cisco UCS Manager. Deployed in redundant pairs, Cisco fabric interconnects offer the full active-active redundancy, performance, and exceptional scalability needed to support the large number of nodes that are typical in clusters serving big data applications. Cisco UCS Manager enables rapid and consistent server configuration using service profiles, automating ongoing system maintenance activities such as firmware updates across the entire cluster as a single operation. Cisco UCS Manager also offers advanced monitoring with options to raise alarms and send notifications about the health of the entire cluster.

  • Cisco Nexus 2200 Series Fabric Extenders bring the system's unified fabric to each rack, establishing a physically distributed but logically centralized network infrastructure. These low-cost, low-power-consumption devices act as remote line cards for the fabric interconnects, providing connectivity without adding the cost and management complexity that top-of-rack switches would require. The result is highly scalable and cost-effective connectivity for a large number of nodes.

  • Cisco UCS C240 M3 Rack Servers are designed for a wide range of computing, I/O, and storage-capacity demands in a compact two-rack-unit (2RU) design. Cisco UCS C240 M3 servers are powered by dual Intel Xeon processor E5-2600 series CPUs and support up to 768 GB of main memory (128 or 256 GB is typical for big data applications). These servers support a range of disk drive options as well as Cisco UCS virtual interface cards (VICs) optimized for high-bandwidth and low-latency cluster connectivity, with support for up to 256 virtual devices that are configured on demand through Cisco UCS Manager.

  5. Choice of Configuration

The solution is offered as reference architectures and as Cisco UCS SmartPlay solutions that can be purchased by ordering a single part number.

A single-rack configuration provides two fully redundant Cisco UCS 6200 Series Fabric Interconnects (to connect up to 10 racks and 160 servers), along with two Cisco Nexus® 2232PP 10GE Fabric Extenders and 16 Cisco UCS C240 M3 Rack Servers (in either high-performance or high-capacity configurations). Multirack configurations include two Cisco Nexus 2232PP fabric extenders and 16 Cisco UCS C240 M3 servers for every additional rack.

Each server in the configuration connects to the Cisco Unified Fabric through two active-active 10 Gigabit Ethernet links using a Cisco UCS VIC. Each high-performance rack can support up to 256 cores and 32-GBps I/O bandwidth. Each high-capacity rack can support up to 576 TB of raw storage.

Massive Scalability

The Cisco CPA supports the massive scalability that big data environments demand. Up to 160 servers are supported in a single switching domain with a pair of Cisco fabric interconnects. Scaling beyond 160 servers can be accomplished by interconnecting multiple Cisco UCS domains using Cisco Nexus® 6000 or 7000 Series Switches. With Cisco UCS Central Software, thousands of servers and hundreds of petabytes (PB) of storage can be managed through a single interface with the same automation that Cisco UCS Manager provides (Figure 3).

Figure 3. Cisco UCS with the Intel Distribution Can Scale to Thousands of Servers (diagram labels: Single Rack: 16 Servers; Single Cisco UCS Domain with Cisco UCS Manager: Up to 160 Servers; Multiple Cisco UCS Domains with Cisco UCS Central Software: Up to Thousands of Servers)

Cisco SmartPlay Configurations

Both the high-performance and high-capacity options are available through the Cisco SmartPlay program (Table 1). With only a single part number to order, the program makes it easy to quickly deploy a powerful and secure big data environment without the expense or risk entailed in designing and building a custom solution.

Conclusion

Big data technology is becoming compelling for business organizations of all sizes. But although organizations want software that can meet mission-critical needs, they are understandably concerned about the risk and stability of unsupported open-source software.
  6. Cisco UCS with the Intel Distribution for Apache Hadoop software provides critical technology enhancements that allow organizations to easily and safely deploy big data applications in enterprise environments. The combination of the Intel Distribution for Apache Hadoop software and Cisco UCS joins the power of big data with a dependable deployment model that can be implemented rapidly and customized for either high performance or high capacity using Cisco Unified Fabric and powerful and efficient Cisco UCS rack servers. Enterprise-class services can help with design, deployment, and testing, and organizations can continue to rely on these services through controlled and supported releases.

Whether you are deploying a large data center or buying single racks through the Cisco SmartPlay program, Cisco UCS with the Intel Distribution for Apache Hadoop software can be scaled to meet the challenges of any size of organization.

Table 1. Cisco SmartPlay Solutions Are Optimized for High Performance or High Capacity and Are Tested and Validated for Rapid Deployment

Base Rack Solution | Big Data High Capacity | Big Data High Performance
Part Number | UCS-EZ-BD-HC | UCS-EZ-BD-HP
Computing and Storage | 16 Cisco UCS C240 M3 Rack Servers, each with: 2 Intel Xeon processors E5-2640 at 2.5 GHz; 128 GB of memory; Cisco UCS P81E VIC; 12 LFF 3-TB 7.2K 3.5-inch SAS HDDs; LSI MegaRAID 9266-CV 8i card | 16 Cisco UCS C240 M3 Rack Servers, each with: 2 Intel Xeon processors E5-2690 at 2.9 GHz; 256 GB of memory; Cisco UCS P81E VIC; 24 SFF 1-TB 7.2K SFF SATA HDDs; LSI MegaRAID 9266-CV 8i card
Performance and Capacity per Rack | 192 cores, 16 GBps I/O bandwidth, 576 TB storage capacity (raw) or 720 TB (typical user storage capacity, 3-way replicated and compressed) | 256 cores, 32 GBps I/O bandwidth, 384 TB storage capacity (raw) or 480 TB (typical user storage capacity, 3-way replicated and compressed)
Network (both solutions) | 10-Gbps unified fabric supported by: 2 Cisco UCS 6296UP 96-Port Fabric Interconnects (supports up to 160 servers); 2 Cisco Nexus 2232PP 10GE Fabric Extenders

For More Information

  • For more information about the collaboration between Cisco and Intel, please visit com/go/intel.
  • For more information about Cisco UCS, please visit com/go/ucs.
  • For more information about the Cisco SmartPlay program, please visit
  • For more information about Cisco CPA for Big Data, please visit http://

Americas Headquarters: Cisco Systems, Inc., San Jose, CA
Asia Pacific Headquarters: Cisco Systems (USA) Pte. Ltd., Singapore
Europe Headquarters: Cisco Systems International BV Amsterdam, The Netherlands

Cisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website. Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to this URL: Third party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnership relationship between Cisco and any other company. Intel, the Intel logo, Xeon, and Xeon Inside are trademarks or registered trademarks of Intel Corporation in the U.S. and/or other countries (1110R) LE-37705-00 02/13
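
Worked example: checking the rack figures

The per-rack totals in Table 1 and the 160-server domain limit can be cross-checked with simple arithmetic. The Python sketch below is only an illustration under stated assumptions, not a Cisco or Intel sizing tool. The cores-per-CPU values (6 and 8) are not stated in the brief; they are derived here by dividing the table's per-rack core counts by 16 servers and 2 CPUs per server. The compression ratio is likewise inferred from the raw and user-capacity figures rather than quoted. All names (`RackOption`, `domain_plan`, `implied_compression_ratio`) are hypothetical.

```python
from dataclasses import dataclass
from math import ceil

# Figures taken from the brief.
SERVERS_PER_RACK = 16           # Cisco UCS C240 M3 servers per rack
CPUS_PER_SERVER = 2             # dual Intel Xeon E5-2600 series CPUs
FEX_PER_RACK = 2                # Cisco Nexus 2232PP fabric extenders per rack
MAX_SERVERS_PER_DOMAIN = 160    # one redundant pair of fabric interconnects


@dataclass(frozen=True)
class RackOption:
    part_number: str
    cores_per_cpu: int      # assumption: derived from Table 1, not stated there
    drives_per_server: int
    drive_tb: int

    def cores_per_rack(self) -> int:
        return SERVERS_PER_RACK * CPUS_PER_SERVER * self.cores_per_cpu

    def raw_tb_per_rack(self) -> int:
        return SERVERS_PER_RACK * self.drives_per_server * self.drive_tb


HIGH_CAPACITY = RackOption("UCS-EZ-BD-HC", cores_per_cpu=6,
                           drives_per_server=12, drive_tb=3)
HIGH_PERFORMANCE = RackOption("UCS-EZ-BD-HP", cores_per_cpu=8,
                              drives_per_server=24, drive_tb=1)


def domain_plan(servers: int) -> dict:
    """Racks and fabric extenders needed for `servers` in one UCS domain."""
    if not 0 < servers <= MAX_SERVERS_PER_DOMAIN:
        raise ValueError("a single domain supports 1 to 160 servers")
    racks = ceil(servers / SERVERS_PER_RACK)
    return {
        "racks": racks,
        "fabric_extenders": racks * FEX_PER_RACK,
        "fabric_interconnects": 2,  # one pair, hosted in the first rack
    }


def implied_compression_ratio(raw_tb: float, user_tb: float,
                              replicas: int = 3) -> float:
    """Compression ratio implied by raw vs. user capacity with N-way replication."""
    return user_tb * replicas / raw_tb


if __name__ == "__main__":
    for option in (HIGH_CAPACITY, HIGH_PERFORMANCE):
        print(option.part_number, option.cores_per_rack(), "cores,",
              option.raw_tb_per_rack(), "TB raw")
    print(domain_plan(160))
    print(implied_compression_ratio(576, 720))
```

Under these assumptions the sketch reproduces the table's 192 cores and 576 TB raw for the high-capacity rack and 256 cores and 384 TB raw for the high-performance rack. A full 160-server domain works out to 10 racks, matching the brief, and both user-capacity figures imply the same compression ratio of 3.75 after 3-way replication.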