CONFIDENTIAL – For use by Hitachi Data Systems employees and other audiences under NDA only.
Big Data Webex
Sascha Oehl

Is it real?

WHERE FROM
Future
If only we knew the future, we could make the right decisions in the present.

Life Sciences Research
Location-Based Advertising
One-to-One Marketing
On-Demand Maintenance
Satellite Images
Fraud Detection
Churn Analysis
Risk Analysis
Sentiment Analysis
Geomation Farming
Oil Exploration
Network Monitoring
Asset Tracking
Traffic Flow Optimization
Seismic Monitoring

Car insurance
Premiums first calculated only by my driving ability
Then by my car's horsepower
Then also by car type
Then also by region
Then also by marital status, children, occupation, age, parking space, living situation, …

Supermarket
Layout first by functionality
Then by the store manager's intuition
Then by customer surveys
Then by analysis of individual purchases
Then by analysis of my shopping behavior
Then by analysis of my shopping behavior across stores

Automotive supplier
Just-in-time delivery
Need for a forecast of what will be required
SAP APO (Advanced Planning and Optimization)
Predicting the future (what will be needed) based on the past and the present

Evolution

Classic decision making
HiPPO
Highest Paid Person's Opinion
Gut feeling / the decision maker's experience

Upheaval in decision making
Analysis of data
Fast reactions

Big Data Transforms How We Capture and Capitalize on Data
It is one of the biggest drivers of IT spend today!
Business Process: Database Data (OLTP)
Machine: Sensor Data, Complex Data (M2M, Log Files, Satellite Imaging, Bioinformatics, Sensors, Recording, Video)
Human: Enterprise Content, External Sources (Email, Documents, Web Logs, Social)
Relative scale: 1x, 10x, 100x

Store, Understand, Control
Lots of data in
• More data sources
• More data volume
• Faster storage and analysis
• Fewer algorithms
• Lower data quality
Insight out!
• Faster decisions
• More precise control
• Less effort

Consequence

Databases
Classic
The purpose is known
Indexes are created
The application is clear
You store what you need
Big Data
You store
No clarity about what you will search for
No indexing

Infrastructure
Classic
Planned and custom-built
Redundancy and security
Big Data
Fast and cheap
Scalable
Simple

Analysis
Classic
Complicated algorithms
Predicting the future from the past
Big Data
Interpreting the present in connection with the past
(What happened the last time such a constellation occurred?)
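
The Big Data approach above, asking what happened the last time a similar constellation occurred, can be sketched as a nearest-neighbour lookup over past observations. This is only an illustration under stated assumptions, not the presenter's method: the feature tuples, the `history` records, and the function name `most_similar_outcome` are all hypothetical.

```python
import math


def most_similar_outcome(history, current):
    """Return the outcome of the past situation closest to `current`.

    `history` is a list of (features, outcome) pairs; `features` and
    `current` are equal-length tuples of numbers (hypothetical data).
    """
    def distance(a, b):
        # Euclidean distance between two feature tuples
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    _, outcome = min(history, key=lambda rec: distance(rec[0], current))
    return outcome


# Hypothetical past situations: (temperature in °C, day of week) -> best seller
history = [
    ((30.0, 6), "ice cream"),
    ((5.0, 1), "soup"),
    ((18.0, 3), "sandwiches"),
]

print(most_similar_outcome(history, (28.0, 5)))
```

No model is trained here; the present is simply matched against stored history, which is why this style of analysis depends on keeping large volumes of raw data.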

Decision
Classic
Monthly / weekly decisions
Big Data
Daily / hourly decisions

Implementation

HDS Infrastructure for Big Data
HDS Solution – OPAD: One Platform for All Data
Big Data, Dark Data
Multi-protocol / Multi-data Type
Virtualized
Data Mobility
Universal Management
Infrastructure On Demand
Resilience and Protection
HDI/HCP, UCP
SAP, Oracle, Microsoft®, HNAS, Hadoop

Hadoop
Software platform for processing large volumes of unstructured and semi-structured data.
Distributes the work across many servers.
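
How work is split up and recombined can be illustrated with the MapReduce pattern that Hadoop uses for data processing. The following is a minimal single-process sketch in plain Python, not Hadoop code: each text chunk stands in for the block of data one server would process, and the function names are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce step: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(chunks):
    # In a real cluster each chunk would be mapped on a different server.
    emitted = []
    for chunk in chunks:
        emitted.extend(map_phase(chunk))
    return reduce_phase(shuffle(emitted))


chunks = ["big data is big", "data is everywhere"]
print(word_count(chunks))
```

Because each map call only sees its own chunk, the map steps are independent of one another, which is what allows the work to be spread over many servers.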

Cloudera
A commercial distribution of Apache Hadoop

Hadoop Ecosystem
Client access: Hue, Hive, Pig
Data access: Flume, Sqoop
Data mining: Mahout
Orchestration: Oozie
Data processing: MapReduce, TaskTracker
Coordination: ZooKeeper
Data storage: HDFS, HBase
Network
Java Virtual Machine
Operating system – Linux (Ubuntu, Red Hat, …) / Windows
Hardware

Hadoop Node Types
Edge Node: accepts requests
  Client access: Hue, Hive, Pig
  Orchestration: Oozie
  Coordination: ZooKeeper
Cluster Name Node
  Management of the distributed file system
  JobTracker
Data Node (3-…)
  Data storage: HDFS, HBase
  Data processing: MapReduce, TaskTracker
Common to all node types
  Network
  Java Virtual Machine
  Operating system – Linux (Ubuntu, Red Hat, …) / Windows
  Hardware

Hadoop: Second Copy for the Name Nodes
Active Cluster Name Node
  Management of the distributed file system
  JobTracker
Standby Cluster Name Node
  Housekeeping
  Backup copy
  Management of the distributed file system after takeover
  JobTracker after takeover
Shared FS
Common to both nodes
  Network
  Java Virtual Machine
  Operating system – Linux (Ubuntu, Red Hat, …) / Windows
  Hardware

Data Warehouse
  Application server
  Database server
  Storage
Big Data
  Edge
  Cluster Name Node
  Data Nodes

Hadoop Reference Architecture
• Pre-tested, pre-integrated hardware and software
[Diagram: Management node, Name Node, Secondary Name Node, and Data Nodes (HDFS, TaskTracker) connected via LAN]
• Can be customized to fit any application
‒ Customer purchases Cloudera and other applications from vendors
• Comprehensive Big Data services
‒ Red Hat Linux, Cloudera
• Management node
‒ 1 x Compute Rack 210H server
‒ 2 x 6-core E2620 @ 2 GHz processors
‒ 64 GB RAM
‒ 2 x GigE (onboard)
‒ 2 x 300 GB SAS 10K RPM
• Networking
‒ 2 x Cisco Nexus 3348
• Hadoop Cluster Name Nodes (primary + secondary)
‒ 2 x Hitachi Compute Rack 220S server
‒ 2 x 8-core E2470 @ 2.3 GHz
‒ 64 GB RAM
‒ 2 x GigE (onboard)
‒ 12 x 3.5-inch 3 TB NL-SAS 7200 RPM drives
• Data redundancy through RAID 5 (11+1)
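
For the Name Node configuration, the RAID 5 (11+1) layout means that one drive's worth of capacity in the group of twelve holds parity. A rough sketch of the resulting usable capacity follows; the function name is hypothetical, and the figure ignores file system overhead and the difference between decimal and binary terabytes.

```python
def raid5_usable_tb(total_drives, drive_tb):
    """Usable capacity of one RAID 5 group: one drive's worth holds parity."""
    if total_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (total_drives - 1) * drive_tb


# 11+1 layout from the slide: 12 drives of 3 TB each
print(raid5_usable_tb(12, 3))
```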
• Hadoop Data Nodes
‒ 3… x Hitachi Compute Rack 220S server
‒ 2 x 8-core E2470 @ 2.3 GHz
‒ 64 GB RAM
‒ 2 x GigE (onboard)
‒ 12 x 3.5-inch 3 TB NL-SAS 7200 RPM drives
• Data redundancy through copies of the data on multiple nodes
• Performance increases by adding more nodes
Basis for sizing the solution
‒ TeraSort – 120 MB/s per node
‒ TestDFSIO – 75 MB/s write per node
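
The per-node benchmark figures for the Data Nodes can be turned into a first sizing estimate. The sketch below is a back-of-the-envelope calculation, not an official sizing tool: it assumes throughput scales linearly with the number of nodes, and the replication factor of 3 is the HDFS default rather than a value stated in the slides. The function names are hypothetical.

```python
import math

TERASORT_MB_S_PER_NODE = 120     # from the slide
DFSIO_WRITE_MB_S_PER_NODE = 75   # from the slide
MIN_DATA_NODES = 3               # the architecture starts at three data nodes


def nodes_for_throughput(target_mb_s, per_node_mb_s):
    """Data nodes needed to reach a target rate, assuming linear scaling."""
    return max(MIN_DATA_NODES, math.ceil(target_mb_s / per_node_mb_s))


def usable_capacity_tb(nodes, drives_per_node=12, drive_tb=3, replication=3):
    """Approximate HDFS capacity; replication=3 is an assumption (HDFS default)."""
    return nodes * drives_per_node * drive_tb / replication


# Example: a hypothetical target of 1000 MB/s sustained write throughput
nodes = nodes_for_throughput(1000, DFSIO_WRITE_MB_S_PER_NODE)
print(nodes, usable_capacity_tb(nodes))
```

Real clusters rarely scale perfectly linearly, so a figure like this is a starting point for a sizing discussion rather than a guarantee.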

Mirroring
• Disaster recovery protection
‒ Use of DistCp
Primary site: Cluster A
Secondary site: Cluster B
Parallel copy over IP
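
DistCp runs as a MapReduce job, which is where the parallel copy between the two clusters comes from. The sketch below builds a `hadoop distcp` call from Python. The host names, port, and path are hypothetical placeholders, and the helper names are illustrative; `-update` (copy only missing or changed files) and `-m` (maximum number of parallel map tasks) are standard DistCp options.

```python
import subprocess


def build_distcp_command(source_uri, target_uri, max_maps=None):
    """Build the argument list for a `hadoop distcp` mirror run."""
    cmd = ["hadoop", "distcp", "-update"]
    if max_maps is not None:
        cmd += ["-m", str(max_maps)]
    return cmd + [source_uri, target_uri]


def mirror(source_uri, target_uri, max_maps=None):
    """Run DistCp; requires a configured Hadoop client on the PATH."""
    subprocess.run(build_distcp_command(source_uri, target_uri, max_maps), check=True)


# Hypothetical Name Node addresses for Cluster A (primary) and Cluster B (secondary)
cmd = build_distcp_command(
    "hdfs://cluster-a-nn:8020/data",
    "hdfs://cluster-b-nn:8020/data",
    max_maps=20,
)
print(" ".join(cmd))
```

Only the command is built and printed here; calling `mirror(...)` would actually start the copy and therefore needs access to both clusters.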

Data Backup
• Backup
Primary site: Cluster A
Incremental backup of the HDFS files
Backup of the Name Nodes

Hitachi Hadoop Appliance
Scalable from 3 to many Data Nodes
Uses market-leading integration
Clear performance expectations
Available

Questions?
