Cisco UCS with MapR: Delivering Advanced Performance for Hadoop Workloads


Published on

MapR on the Cisco Unified Computing SystemTM (Cisco UCS®) delivers a fully optimized Hadoop solution that provides lights-out data center capabilities and ease of use with superior performance for different classes of Hadoop applications.

Published in: Technology

Comments are closed

  • Be the first to comment

  • Be the first to like this

No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Cisco UCS with MapR: Delivering Advanced Performance for Hadoop Workloads

  1. 1. © 2013 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information.MapR on the Cisco Unified ComputingSystem™ (Cisco UCS®) delivers a fullyoptimized Hadoop solution that provideslights-out data center capabilities andease of use with superior performance fordifferent classes of Hadoop applications.Big data technology, and Apache Hadoop in particular, are finding use in anenormous number of applications and are being evaluated and adopted byenterprises of all kinds. As this important technology helps transform large volumesof data into actionable information, many organizations are struggling to deployeffective and reliable Hadoop infrastructure that performs and scales and isappropriate for mission-critical applications in the enterprise. Deployed as part of acomprehensive data center architecture, the Cisco UCS with MapR solution deliversa powerful and flexible infrastructure that increases business and IT agility, reducestotal cost of ownership (TCO), and delivers exceptional return on investment (ROI) atscale, while fundamentally transforming the way that organizations do business withHadoop technology.MapR: A Complete Hadoop PlatformAs the technology leader in Hadoop, the MapR distribution provides enterprise-classHadoop solutions that are fast to develop and easy to administer. With significantinvestment in critical technologies, MapR offers the industry’s most comprehensiveHadoop platform, fully optimized for performance scalability. MapR’s distributiondelivers more than a dozen tested and validated Hadoop software modules over afortified data platform, offering exceptional ease of use, reliability, and performancefor Hadoop solutions (Figure 1).Cisco UCS with MapR:Delivering Advanced Performancefor Hadoop WorkloadsSolution BriefFebruary 2013HighlightsOptimized for Performance• Cisco UCS® with MapR delivers anintegrated Hadoop solution that isspecifically engineered to handle themost demanding MapReduce andHBase workloads.Highly Scalable Platform• The Cisco Unified ComputingSystem™ (Cisco UCS) provides ahighly scalable platform that can beoptimized and easily scaled for anysize of Hadoop cluster.Advanced Hadoop Distribution• MapR brings cutting-edge innovationto make Hadoop easy, dependable,fast, and ready for all big dataanalytics.Ease of Management• Cisco UCS Manager provides unified,embedded management of allcomputing, networking, and storage-access resources.Choice of Configurations• The solution provides a choice ofCisco UCS configurations, lettingorganizations select performance andcapacity as their needs dictate.Enterprise-Class Support andServices• Reference configurations from Ciscooffer confidence and help accelerateimplementation of successfulHadoop deployments with worldwideenterprise-class support from Ciscoand MapR.
  2. 2. Cisco UCS with MapR: DeliveringAdvanced Performance for Hadoop Workloads© 2013 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 2 of 4• Ease of use: The MapR 100 percentPOSIX-compliant system allowsusers to access the Hadoop clusterthrough industry-standard APIs suchas Network File Service (NFS), OpenDatabase Connectivity (ODBC), LinuxPluggable Authentication Modules(PAM), and Representational StateTransfer (REST). MapR also providesmultitenancy, data-placementcontrol, and hardware-levelmonitoring of the cluster.• Reliability: The MapR distributionprovides lights-out data centercapabilities for Hadoop. Featuresinclude self-healing of the criticalservices that maintain the clusternodes and jobs, snapshots thatallow point-in-time recovery ofdata, mirroring that allows wide-area intercluster replication, androlling upgrades that prevent servicedisruption.• Performance: The MapR distributionis twice as fast as any other Hadoopdistribution. To provide superiorand exceptional perfo rmance overother Hadoop distributions, MapRuses an optimized shuffle algorithm,direct access to the disk, built-incompression, and code written inadvanced C++ rather than Java. As aresult, the MapR distribution providesthe best hardware utilization whencompared to any other distribution.MapR technology innovations bringthese capabilities onto a single dataplatform built for all big data analytics.The platform supports a wide range ofapplications that process structuredand unstructured data stored in files aswell as NoSQL databases. This singledata platform for all Hadoop workloadsfurther solidifies MapR’s valueproposition of providing the best ROI forhardware utilization.Cisco UCS with MapRSolutionThe Cisco UCS solution for MapR isbased on the Cisco® Common PlatformArchitecture (CPA) for Big Data.The Cisco CPA is a highly scalablearchitecture designed to meet a varietyof scale-out application demands withtransparent data and managementintegration capabilities built using thefollowing components:• Cisco UCS 6200 Series FabricInterconnects provide high-bandwidth, low-latency connectivityfor servers, with integrated, unifiedmanagement provided for allconnected devices by Cisco UCSManager. Deployed in redundantpairs, Cisco fabric interconnectsoffer the full active-activeredundancy, performance, andexceptional scalability needed tosupport the large number of nodesthat are typical in clusters servingbig data applications. Cisco UCSManger enables rapid and consistentserver configuration using serviceprofiles, automating ongoing systemmaintenance activities such asfirmware updates across the entirecluster as a single operation. CiscoUCS Manager also offers advancedmonitoring with options to raisealarms and send notifications aboutthe health of the entire cluster.Self-Healing, High Availability,Snapshots, and MirroringStreaming Writes, Instant Recovery,Real-Time NoSQL, and Lightweight OLTPIndustry-Standard APIs, NFS,OBDC, LDAP, and RESTEnterprise IntegrationMapR Control System MapR Control SystemMapR Data PlatformMapR Data PlatformHCatalog Zookeeper MapReducePigHive HBase Cascading FlumeGangliaAvroReal TimeMission CriticalApacheProjectsFigure 1. The MapR Hadoop Distribution Assists Mission-Critical Applications bySupporting the Entire Apache Hadoop Stack over a Fortified Data Platform
  3. 3. Cisco UCS with MapR: DeliveringAdvanced Performance for Hadoop Workloads© 2013 Cisco Systems, Inc. All rights reserved. This document is Cisco Public Information. Page 3 of 4• Cisco UCS 2200 Series FabricExtenders extend the networkinto each rack, acting as remoteline cards for fabric interconnectsand providing highly scalable andextremely cost-effective connectivityfor a large number of nodes.• Cisco UCS C240 M3 Rack Serversare designed for a wide range ofcomputing, I/O, and storage-capacitydemands in a compact two-rack-unit(2RU) design. Cisco UCS C240 M3servers are powered by dual Intel®Xeon® processor E5-2600 seriesCPUs and support up to 768 GB ofmain memory (128 GB or 256 GBis typical for big data applications).These servers support a range ofdisk drive options as well as CiscoUCS virtual interface cards (VICs)optimized for high-bandwidth andlow-latency cluster connectivity, withsupport for up to 256 virtual devices.The solution is offered as referencearchitecture blueprints and as CiscoUCS SmartPlay solutions that can bepurchased by ordering a single partnumber.Reference ArchitectureAvailable reference architectureblueprints offer a choice of high-performance and high-capacity options,selected according to the specificcomputing and storage requirements ofthe organization.• High-performance option: Thehigh-performance option offers abalance of computing power and I/Obandwidth optimized to achieve anexcellent price-to-performance ratio.Equipped for performance, CiscoUCS C240 M3 Rack Servers arepowered by two Intel Xeon E5-2665processors (16 cores), with 256 GBof memory and 24 1-terabyte (TB)Small Form-Factor (SFF) disk drives.• High-capacity option: The high-capacity option is optimized for lowcost per terabyte and is built usingCisco UCS C240 M3 Rack Serverspowered by two Intel Xeon E5-2640processors (12 cores), with 128GB of memory and 12 3-TB LargeForm-Factor (LFF) disk drives.Reference architectures areavailable in both single and multirackconfigurations, with considerablecapacity built into the Cisco UnifiedFabric provided in each rack.Architectural ScalabilityThe single-rack configuration providestwo fully redundant Cisco UCS6248UP 48-Port Fabric Interconnects(to connect up to five racks) or twoCisco UCS 6296UP 96-Port FabricInterconnects (to connect up to 10racks and 160 servers), along withtwo Cisco Nexus 2232PP 10GE FabricExtenders and 16 Cisco UCS C240 M3Rack Servers (either high-performanceor high-capacity CPU configurations).Multirack configurations include twoCisco Nexus 2232PP fabric extendersand 16 Cisco UCS C240 M3 serversfor every additional rack.Single CiscoUCS Domain:Up to 160 ServersMultiple CiscoUCS Domains:Up to Thousandsof ServersSingle Rack16 ServersCisco UCS Central SoftwareCisco UCS ManagerFigure 2. The Cisco UCS with MapR Solution Can Scale to Thousands of Servers
  4. 4. Americas HeadquartersCisco Systems, Inc.San Jose, CAAsia Pacific HeadquartersCisco Systems (USA) Pte. Ltd.SingaporeEurope HeadquartersCisco Systems International BV Amsterdam,The NetherlandsCisco UCS with MapR: DeliveringAdvanced Performance for Hadoop WorkloadsCisco has more than 200 offices worldwide. Addresses, phone numbers, and fax numbers are listed on the Cisco Website at and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. To view a list of Cisco trademarks, go to thisURL: Third party trademarks mentioned are the property of their respective owners. The use of the word partner does not imply a partnershiprelationship between Cisco and any other company. (1110R) LE-38203-00 02/13Each server in the configurationconnects to the Cisco Unified Fabricthrough two active-active 10 GigabitEthernet links using a Cisco UCSVIC. Each high-performance rackcan support up to 256 cores and 32GBps (SATA) or 48 GBps (SAS) I/Obandwidth. Each high-capacity rack cansupport up to 576 TB of raw storage.Up to 160 servers are supported in asingle management domain with a pairof fabric interconnects. Scaling beyond160 servers can be accomplished byinterconnecting multiple Cisco UCSdomains using Cisco Nexus® 6000or 7000 Series Switches. With CiscoUCS Central Software, thousdandsof servers and hundreds of petabytes(PB) of storage can be managedthrough a single interface with the sameautomation that Cisco UCS Managerprovides (Figure 2).Cisco SmartPlay SolutionsBoth the high-performance and high-capacity options are available throughthe Cisco SmartPlay program (Table 1).With only a single part number to order,the program makes it easy to quicklydeploy a powerful and secure big dataenvironment without the expense orrisk entailed in designing and building acustom solution.ConclusionDespite the compelling power ofbig data technology, deploymentof successful, reliable, and high-performance infrastructure for ApacheHadoop can be daunting. The MapRHadoop distribution provides criticaltechnology advances to make Hadoopimplementation easy, dependable, andfast for enterprise-ready deployments.MapR has made crucial investmentsto improve the underlying technologywhile maintaining the Apache HadoopAPI and application compatibility forbroad applicability. The combination ofMapR and Cisco UCS brings the powerof the MapR distribution to a dependabledeployment model that can beimplemented rapidly and customized foreither high performance or high capacityusing Cisco Unified Fabric and powerfuland efficient Cisco UCS rack servers.Whether you are deploying a large datacenter or buying single racks throughthe Cisco SmartPlay program, the CiscoUCS with MapR solution can be sized tomeet the challenges of Hadoop.For More Information• For more information about the CiscoSmartPlay program, please visit• For more information about CiscoUCS big data solutions, please visit• For more information about the CiscoCPA for Big Data, please visit 1. Cisco SmartPlay Solutions Are Optimized for High Performance or High Capacity andAre Tested and Validated for Rapid DeploymentBase RackSolutionBig Data High Capacity Big Data High PerformancePart Number UCS-EZ-BD-HC UCS-EZ-BD-HPComputingand Storage16 Cisco UCS C240 M3 RackServers, each with:• 2 Intel Xeon processors E5-2640 at 2.5 GHz• 128 GB of memory• Cisco UCS P81E VIC• 12 LFF 3-TB 7.2K 3.5-inchSAS HDDs• LSI MegaRAID 9266-CV 8i card16 Cisco UCS C240 M3 RackServers, each with:• 2 Intel Xeon processors E5-2690 at 2.9 GHz• 256 GB of memory• Cisco UCS P81E VIC• 24 SFF 1-TB 7.2K SFF SATAHDDs• LSI MegaRAID 9266-CV 8i cardNetwork 10-Gbps unified fabric supported by:• 2 Cisco UCS 6296UP 96-Port Fabric Interconnects• 2 Cisco Nexus 2232PP 10GE Fabric Extenders