Fabric Architecture
A Big Idea for the Big Data infrastructure


Satheesh Nanniyur
Senior Product Line Manager
AMD Data Center Server Solutions (formerly, SeaMicro)
Agenda
•   Defining Big Data from an Infrastructure perspective
•   Fabric Architecture for Big Data
•   An overview of the Fabric Server and Fabric Storage
•   Illustrating Fabric Architecture Benefits for Hadoop
•   Conclusion
Have you come across Big Data?

• Apple’s virtual smartphone assistant, Siri, uses complex machine learning
  techniques
• Target’s “pregnancy prediction score”
    – NY Times: “How companies learn your secrets” – Feb 2012
So, what really is Big Data?

Business
• “Key basis of competition and growth…”

Observational
• “Too big, moves too fast, or doesn’t fit the structures of your
  database”
Mathematical
• “Every day, we create 2.5 “million trillion” (quintillion) bytes
  of data"
Systems
• “Exceeds the processing capacity of conventional database”
The Infrastructural definition of Big Data

Massive Storage
• Store “all” data not knowing its use in advance

Massive Compute
• Ask a query, and when you do, get the answer fast
Big Data infrastructure is not business as usual

Massive Storage
• Petabyte scale high density storage
• Flexible storage to compute ratio to meet evolving business needs

Massive Compute
• High density scale-out compute
• Power and space efficient infrastructure



“The IT architectural approach used in clustered environments such as a large Hadoop grid is radically different from the converged and virtualized IT environments”

IDC White Paper, “Big Data: What It Is and Why You Should Care”
Fabric Architecture for Big Data
The holy grail of Big Data Infrastructure

Imagine a world where you could simply stack up servers, with each server:

• Fraction of a rack unit
• Share over 5 PB of storage
• 10GE network with no cabling
• Flexible provisioning of storage
A deeper look at the traditional rack-mount architecture

[Diagram: Aggregation, ToR, and Nodes tiers, labeled “Cabling and Management”]

• Compromise between compute and storage density
• Rigid compute to storage ratio
• Oversubscribed network suited to north-south traffic, not the heavy
  east-west traffic that Big Data requires
• Too many adapters (NIC, storage controller) and cables that can fail
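
The oversubscription point above can be made concrete with a little arithmetic. This is only an illustrative sketch: the function name and the port counts and speeds in the example are hypothetical, not figures from the slides.

```python
def oversubscription_ratio(downlinks: int, downlink_gbps: float,
                           uplinks: int, uplink_gbps: float) -> float:
    """Ratio of total server-facing bandwidth to total uplink bandwidth on a ToR switch."""
    if uplinks <= 0 or uplink_gbps <= 0:
        raise ValueError("need at least one uplink with positive bandwidth")
    return (downlinks * downlink_gbps) / (uplinks * uplink_gbps)


# Hypothetical ToR switch: 40 server ports at 1 Gbps each, 2 uplinks at 10 Gbps each.
ratio = oversubscription_ratio(downlinks=40, downlink_gbps=1.0,
                               uplinks=2, uplink_gbps=10.0)
print(f"oversubscription {ratio:.1f}:1")
```

A ratio above 1 means the servers cannot all send through the uplinks at full rate at the same time, so traffic that has to leave the rack contends for bandwidth.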
Fabric with 3-D Torus for Big Data Infrastructure
Big Data is a big shift from North-South traffic to East-West

• High speed and low latency interconnection
• Switchless linear scalability that avoids bottlenecks
• Highly available network minimizing node loss and data reconstruction
• High density scale-out architecture with low power and space
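
To make the 3-D torus idea concrete, here is a minimal sketch of how neighbors and hop counts work in such a topology. It is a generic illustration, not SeaMicro's implementation: the function names and the 4 x 4 x 4 size are assumptions chosen only for the example.

```python
from itertools import product

Coord = tuple[int, int, int]


def torus_neighbors(node: Coord, dims: Coord) -> list[Coord]:
    """Return the six neighbors (X+, X-, Y+, Y-, Z+, Z-) of a node, wrapping at the edges."""
    neighbors = []
    for axis in range(3):
        for step in (1, -1):
            coord = list(node)
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def torus_hops(a: Coord, b: Coord, dims: Coord) -> int:
    """Minimum hop count between two nodes: per axis, take the shorter way around the ring."""
    return sum(min(abs(x - y), size - abs(x - y)) for x, y, size in zip(a, b, dims))


dims = (4, 4, 4)  # hypothetical torus size, chosen only for illustration
print(torus_neighbors((0, 0, 0), dims))
print(torus_hops((0, 0, 0), (3, 2, 1), dims))
# The farthest any node can be from the origin in this example torus.
print(max(torus_hops((0, 0, 0), n, dims) for n in product(range(4), repeat=3)))
```

Because every axis wraps around, each node has direct links to six neighbors without going through a switch, and the worst-case hop count grows with the ring length on each axis rather than with the total node count.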
An overview of the Fabric Server

[Diagram: SeaMicro Fabric Node with IOVT, with an x86 server attached over PCIe and fabric links labeled X+, X-, Y+, Y-, Z+, Z-]

• 512 x86 cores with 4TB DRAM in 10RU
• Up to 5 petabytes of storage
• Flexible Storage to Compute ratio
• 10GE network per server
• 160GE of uplink bandwidth
Fabric Storage ... for Big Data?
     Isn’t Big Data always deployed with DAS?

“... the rate of change was killing us, where the data volumes were practically doubling every month. Trying to keep up with that growth was an extreme challenge to say the least ...”

Customer quote from IDC white paper, “Big Data – What It Is and Why You Should Care”



[Chart: Compute plotted against Storage. Labels: “Rigid Storage to Compute Ratio (Traditional Rackmount)”, “Underutilized Compute & Network”, “Flexible Fabric Storage to Compute Ratio”]

• Add storage capacity independent of compute to increase cluster efficiency
• Flexibly provision storage capacity to meet evolving customer needs
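
The cost of a rigid storage to compute ratio can be sketched with simple arithmetic. Everything in this example is hypothetical (the function name, 36 TB per server, and a workload that needs 20 servers of compute); it only illustrates why growing storage by adding whole servers leaves compute underutilized.

```python
import math


def servers_for_capacity(target_tb: float, tb_per_server: float) -> int:
    """Servers needed when storage can only be added by adding whole servers (captive DAS)."""
    return math.ceil(target_tb / tb_per_server)


# Hypothetical cluster: each server carries 36 TB of direct-attached storage,
# and the workload only needs 20 servers' worth of compute.
TB_PER_SERVER = 36
COMPUTE_SERVERS_NEEDED = 20

for target_tb in (720, 1440, 2880):
    rigid = servers_for_capacity(target_tb, TB_PER_SERVER)
    idle = rigid - COMPUTE_SERVERS_NEEDED
    print(f"{target_tb} TB: rigid ratio needs {rigid} servers "
          f"({idle} more than compute requires); "
          f"decoupled storage keeps {COMPUTE_SERVERS_NEEDED}")
```

Each doubling of data doubles the server count under the rigid model, while a design that scales storage independently keeps the server count tied to the compute actually needed.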
Massive capacity scale-out Fabric Storage
•      Massive scale-out capacity with commodity drives
•      Decoupled from Compute and Network to grow storage
       independently



[Diagram: Traditional rackmount with captive DAS and a rigid storage to compute ratio, compared with Intel/AMD x86 servers connected through the Freedom Fabric to flexible scale-out Fabric Storage of up to 5PB]
Hadoop and the SMAQ stack

Built to scale linearly with massive scale-out storage (HDFS) and compute (MapReduce)

  Layer              Component
  Query              Pig, Hive
  Data Processing    MapReduce Framework
  Data Storage       HDFS
Hadoop data processing phases
Fabric Architecture cost-efficiently meets the Hadoop infrastructure needs

  Phase                               Dominant resource
  HDFS Input                          Storage intensive
  Map and Intermediate Data Write     Compute intensive
  Shuffle                             Network intensive
  Reduce                              Compute intensive
  HDFS Output                         Storage intensive

• 512 x86 cores with 4TB DRAM per Fabric Server in 10RU
• 5 petabytes of storage capacity with independent scale-out
• 10 Gbps inter-node bandwidth per server
• 160 Gbps shared uplink for inter-rack traffic
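
The Map, Shuffle, and Reduce phases named on this slide can be illustrated with a single-process word count. This is a teaching sketch, not Hadoop's API: the function names are hypothetical, and a small in-memory list stands in for an HDFS input split.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_phase(lines: Iterable[str]) -> Iterator[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in the input."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group intermediate values by key (in a cluster this moves data between nodes)."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: dict[str, list[int]]) -> dict[str, int]:
    """Reduce: combine each key's values into one result."""
    return {key: sum(values) for key, values in groups.items()}


# Stand-in for an HDFS input split; a real job would read blocks from HDFS.
lines = ["big data big fabric", "fabric server"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)
```

In a real cluster the shuffle step is where intermediate data crosses the network between nodes, which is why the slide marks it as the network-intensive phase.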
Hadoop resource usage pattern
Based on Terasort run on SeaMicro SM15000

[Chart: three panels. Compute: Map, Shuffle, Reduce. Storage: Map, Shuffle, Reduce. Network: Shuffle.]
Deployment Challenges of Hadoop
• Plan for peak utilization
    – Hadoop infrastructure utilization is bursty
• Compute, Storage, and Network mix dependent on
  application workload
    – Flexible ratios optimize deployment
• Power and Space Efficiency key to large scale
  deployment
• Administrative cost can increase as rapidly as your data
    – Simplified deployment and reduced hardware components
      decrease TCO
Fabric Server for Hadoop Deployment
Fabric Server offers 60% more compute and storage in the same
power and space envelope


                                            Traditional     SeaMicro Fabric
                                            Rackmount       Server
  Intel Xeon Cores                          320             512
  AMD Opteron Cores*                        320             1024
  Storage                                   720 TB          1136 TB
  Storage Scalability                       None            Up to 4PB
  Network B/W per server                    Up to 2Gbps     Up to 8Gbps
  Network Downlinks                         40              0
  ToR Switch                                2               0 (Built-in)
  Aggregation (End of Row) switch/router    1               1

Based on SeaMicro SM15000 and HP DL380 Gen8 2U dual socket octal core servers in a 42U rack
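
The headline percentage can be checked against the table with a few lines of arithmetic. The sketch below uses only figures from the table; the helper name is hypothetical, and the reading of "dual socket octal core" as 16 cores per server is an assumption.

```python
# Figures copied from the comparison table above.
rackmount = {"xeon_cores": 320, "storage_tb": 720, "downlinks": 40}
fabric = {"xeon_cores": 512, "storage_tb": 1136}


def percent_more(new: float, old: float) -> float:
    """Percentage by which `new` exceeds `old`."""
    return (new - old) / old * 100


core_gain = percent_more(fabric["xeon_cores"], rackmount["xeon_cores"])
storage_gain = percent_more(fabric["storage_tb"], rackmount["storage_tb"])
print(f"Xeon cores: {core_gain:.1f}% more")
print(f"Storage:    {storage_gain:.1f}% more")

# Assumption: "dual socket octal core" means 2 sockets x 8 cores = 16 cores per server.
cores_per_server = 2 * 8
servers = rackmount["xeon_cores"] // cores_per_server
print(f"Implied rackmount servers: {servers} ({servers * 2}U of 2U chassis)")
print(f"Downlinks per server: {rackmount['downlinks'] // servers}")
```

On these numbers the Xeon core count is 60% higher and storage is about 58% higher. Under the stated assumption, the 320-core baseline corresponds to 20 servers occupying 40U, with two network downlinks each.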
Summary

• Traditional architectures cannot scale to meet the needs of Big Data
• Efficient Big Data deployments need a flexible storage to compute ratio
• Conventional wisdom of reduced hardware components still holds
• Fabric Servers provide unprecedented density, bandwidth, and scalability
  for Big Data deployments
For more information, visit http://www.amd.com/seamicro or email
info@seamicro.com

Connect Wave/ connectwave Pitch Deck Presentation
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Training state-of-the-art general text embedding
Training state-of-the-art general text embeddingTraining state-of-the-art general text embedding
Training state-of-the-art general text embedding
 
The Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdfThe Future of Software Development - Devin AI Innovative Approach.pdf
The Future of Software Development - Devin AI Innovative Approach.pdf
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!Anypoint Exchange: It’s Not Just a Repo!
Anypoint Exchange: It’s Not Just a Repo!
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 

Sn wf12 amd fabric server (satheesh nanniyur) oct 12

  • 1.
  • 2. Fabric Architecture A Big Idea for the Big Data infrastructure Satheesh Nanniyur Senior Product Line Manager AMD Data Center Server Solutions (formerly, SeaMicro)
  • 3. Agenda • Defining Big Data from an Infrastructure perspective • Fabric Architecture for Big Data • An overview of the Fabric Server and Fabric Storage • Illustrating Fabric Architecture Benefits for Hadoop • Conclusion
  • 4. Have you come across Big Data? • Apple’s virtual smartphone assistant, Siri, uses complex machine learning techniques • Target’s “pregnancy prediction score” – NY Times: “How companies learn your secrets” – Feb 2012
  • 5. So, what really is Big Data? Business • “Key basis of competition and growth…” Observational • “Too big, moves too fast, or doesn’t fit the structures of your database” Mathematical • “Every day, we create 2.5 “million trillion” (quintillion) bytes of data" Systems • “Exceeds the processing capacity of conventional database”
  • 6. The Infrastructural definition of Big Data Massive Storage • Store “all” data not knowing its use in advance Massive Compute • Ask a query, and when you do, get the answer fast
  • 7. Big Data infrastructure is not business as usual Massive Storage • Petabyte scale high density storage • Flexible storage to compute ratio to meet evolving business needs Massive Compute • High density scale-out compute • Power and space efficient infrastructure • “The IT architectural approach used in clustered environments such as a large Hadoop grid is radically different from the converged and virtualized IT environments” (IDC White Paper, “Big Data: What It Is and Why You Should Care”)
  • 8. Fabric Architecture for Big Data The holy grail of Big Data Infrastructure Imagine a world where you could simply stack up servers, with each server: • Flexible provisioning of storage • Fraction of a rack unit • Share over 5 PB of storage • 10GE network with no cabling
  • 9. A deeper look at the traditional rack-mount architecture Aggregation ToR Cabling and Management Nodes • Compromise between Compute and Storage density • Rigid compute to storage ratio • Oversubscribed network suitable for north-south traffic, not heavy east-west required for Big Data • Too many adapters (NIC, Storage Ctlr) and cabling that can fail
  • 10. Fabric with 3-D Torus for Big Data Infrastructure • Big Data is a big shift from North-South traffic to East-West • High Speed and Low Latency Interconnection • Switchless Linear Scalability that avoids bottlenecks • Highly available network minimizing node loss and data reconstruction • High density scale-out architecture with low power and space
  • 11. An overview of the Fabric Server • 512 x86 cores with 4TB DRAM in 10RU • Up to 5 petabytes of storage • Flexible Storage to Compute ratio • 10GE network per server • 160GE of uplink bandwidth [Diagram labels: x86 Server, PCIe, SeaMicro Fabric Node with IOVT, links X+, X-, Y+, Y-, Z+, Z-]
  • 12. Fabric Storage ... for Big Data? Isn’t Big Data always deployed with DAS? “...the rate of change was killing us, where the data volumes were practically doubling every month. Trying to keep up with that growth was an extreme challenge to say the least...” Customer quote from IDC white paper - “Big Data – What It Is and Why You Should Care” • Add storage capacity independent of compute to increase cluster efficiency • Flexibly provision storage capacity to meet evolving customer needs [Diagram labels: Underutilized Compute & Network; Rigid Storage to Compute Ratio (Traditional Rackmount); Flexible Fabric Storage to Compute Ratio; Compute; Storage]
  • 13. Massive capacity scale-out Fabric Storage • Massive scale-out capacity with commodity drives • Decoupled from Compute and Network to grow storage independently [Diagram labels: Traditional Rackmount: Captive DAS with Rigid Storage to Compute Ratio; Freedom Fabric: Flexible scale-out Fabric Storage up to 5PB; Intel/AMD x86 servers]
  • 14. Hadoop and the SMAQ stack Built to scale linearly with massive scale-out storage (HDFS) and compute (MapReduce) • Query: Pig, Hive • Data Processing Framework: MapReduce • Data Storage: HDFS
  • 15. Hadoop data processing phases Fabric Architecture cost efficiently meets the Hadoop infrastructure needs • HDFS Input (Storage Intensive) • Map and Intermediate Data Write (Compute Intensive) • Shuffle (Network Intensive) • Reduce (Compute Intensive) • HDFS Output (Storage Intensive) • 512 x86 cores with 4TB DRAM per Fabric Server in 10RU • 5 Petabytes of storage capacity with independent scale-out • 10 Gbps Inter-Node Bandwidth per server • 160 Gbps shared uplink for Inter-Rack traffic
  • 16. Hadoop resource usage pattern Based on Terasort run on SeaMicro SM15000 [Charts: Compute, Storage, and Network usage across the Map, Shuffle, and Reduce phases]
  • 17. Deployment Challenges of Hadoop • Plan for peak utilization – Hadoop infrastructure utilization is bursty • Compute, Storage, and Network mix dependent on application workload – Flexible ratios optimize deployment • Power and Space Efficiency key to large scale deployment • Administrative cost can increase as rapidly as your data – Simplified deployment and reduced hardware components decrease TCO
  • 18. Fabric Server for Hadoop Deployment Fabric Server offers 60% more compute and storage in the same power and space envelope
        
                                                   Traditional Rackmount    SeaMicro Fabric Server
        Intel Xeon Cores                           320                      512
        AMD Opteron Cores*                         320                      1024
        Storage                                    720 TB                   1136 TB
        Storage Scalability                        None                     Up to 4PB
        Network B/W per server                     Up to 2Gbps              Up to 8Gbps
        Network Downlinks                          40                       0
        ToR Switch                                 2                        0 (Built-in)
        Aggregation (End of Row) switch/router     1                        1
        
        Based on SeaMicro SM15000 and HP DL380 Gen8 2U dual socket octal core servers in a 42U rack
  • 19. Summary • Traditional architectures cannot scale to meet the needs of Big Data • Efficient Big Data deployments need flexible storage to compute ratio • Conventional wisdom of reduced hardware components still holds • Fabric Servers provide unprecedented density, bandwidth, and scalability for Big Data deployments
  • 20. For more information, visit http://www.amd.com/seamicro or email info@seamicro.com
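
The "60% more compute and storage" claim on slide 18 can be checked against that slide's own comparison figures. The following is a minimal sketch rather than anything from the deck: the dictionary keys and the helper names `percent_gain` and `compare` are hypothetical, and only the four numbers (320 and 512 Intel Xeon cores, 720 TB and 1136 TB of storage) are taken from the slide.

```python
# Figures copied from the slide 18 comparison (Intel Xeon cores and storage).
# The key names are illustrative assumptions, not terms used in the deck.
TRADITIONAL = {"xeon_cores": 320, "storage_tb": 720}
FABRIC = {"xeon_cores": 512, "storage_tb": 1136}


def percent_gain(baseline: float, candidate: float) -> float:
    """Return how much larger candidate is than baseline, in percent."""
    return (candidate / baseline - 1.0) * 100.0


def compare(baseline: dict, candidate: dict) -> dict:
    """Percent gain for every metric present in both configurations."""
    return {
        metric: percent_gain(baseline[metric], candidate[metric])
        for metric in baseline
        if metric in candidate
    }


if __name__ == "__main__":
    for metric, gain in compare(TRADITIONAL, FABRIC).items():
        print(f"{metric}: {gain:.1f}% more")
```

With these inputs the core count works out to 60% more and the storage figure to a little under 58% more, so the headline number matches the compute row exactly and the storage row approximately.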