WHY HADOOP?
• PROCESSING VERY LARGE VOLUMES OF DATA
• MACHINES THAT CAN PROCESS LARGE DATA QUICKLY ARE EXPENSIVE
• EFFICIENT, RELIABLE, AND EASY TO USE
• OPEN SOURCE
HADOOP
• OPEN-SOURCE SOFTWARE FROM APACHE FOR RELIABLE, HIGHLY SCALABLE DISTRIBUTED COMPUTING
• DISTRIBUTED PROCESSING OF LARGE DATA SETS ACROSS A CLUSTER USING A SIMPLE PROGRAMMING MODEL
• ABLE TO DETECT AND HANDLE FAILURES AT THE APPLICATION LAYER TO PROVIDE A HIGH-AVAILABILITY
SERVICE ON EVERY CLUSTER
HADOOP
• HDFS
  • NAME NODE
  • DATA NODE
• MAP/REDUCE
  • JOB TRACKER
  • TASK TRACKER
HDFS (HADOOP DISTRIBUTED FILE SYSTEM)
• HADOOP'S DATA STORAGE LAYER, MADE UP OF STORAGE NODES
• CAN STORE LARGE AMOUNTS OF DATA
• HIGH AVAILABILITY (EVERY PIECE OF DATA IS REPLICATED)
• DATA IS SPLIT INTO BLOCKS BEFORE IT IS WRITTEN TO HDFS
• CONSISTS OF DATANODES AND A NAMENODE
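
The splitting and replication described above can be sketched in a few lines of Python. This is a toy model, not HDFS code: the function names, the node names, and the round-robin placement are assumptions made for illustration (real HDFS uses its own placement policy).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: list[str], replication: int) -> dict[int, list[str]]:
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is an assumption for this sketch only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor cannot exceed the number of nodes")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


# Hypothetical 10-byte "file", 4-byte blocks, 4 storage nodes, 2 copies per block.
blocks = split_into_blocks(b"abcdefghij", 4)
placement = place_replicas(len(blocks), ["dn1", "dn2", "dn3", "dn4"], 2)
```

Because every block lives on more than one node in this model, losing a single node never removes the only copy of a block, which is the idea behind the high-availability bullet.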
NAME NODE
• STORES THE ADDRESSES OF THE DATA PLACED ON THE DATA NODES (METADATA)
• MANAGES THE CLUSTER CONFIGURATION
• MAPS DATA BLOCKS TO DATANODES
• ONE RUNNING NAMENODE PER CLUSTER
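
As a rough illustration of the metadata role, the sketch below keeps two lookup tables: file path to block IDs, and block ID to the data nodes holding that block. The class name `ToyNameNode`, the block ID format, and the paths are hypothetical and do not reflect the real NameNode data structures.

```python
class ToyNameNode:
    """Holds only metadata: which blocks make up a file and where they live."""

    def __init__(self) -> None:
        self.files: dict[str, list[str]] = {}       # path -> ordered block ids
        self.locations: dict[str, list[str]] = {}   # block id -> data node names

    def add_file(self, path: str, block_locations: list[list[str]]) -> None:
        """Register a file; block_locations has one list of nodes per block."""
        block_ids = []
        for index, nodes in enumerate(block_locations):
            block_id = f"{path}#blk{index}"  # made-up id format for this sketch
            self.locations[block_id] = list(nodes)
            block_ids.append(block_id)
        self.files[path] = block_ids

    def locate(self, path: str) -> list[tuple[str, list[str]]]:
        """Return (block id, nodes) pairs in file order, as a client would need."""
        return [(block_id, self.locations[block_id]) for block_id in self.files[path]]


name_node = ToyNameNode()
name_node.add_file("/logs/a.txt", [["dn1", "dn2"], ["dn2", "dn3"]])
where = name_node.locate("/logs/a.txt")
```

Note that the toy name node never touches file contents; it only answers the question of where each block can be read from.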
DATA NODE
• STORES THE BLOCKS OF FILES
• A CLUSTER CONSISTS OF SEVERAL DATANODES
• BLOCK SIZE IS SET BY THE ADMIN (TYPICALLY 64 MB, 128 MB, ETC.)
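
To see what the block size means in practice, the arithmetic below counts how many blocks a file needs and how large the final block is. The helper names and the example file sizes are assumptions chosen for illustration.

```python
MB = 1024 * 1024


def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed, rounding up with integer arithmetic."""
    return (file_size + block_size - 1) // block_size


def last_block_size(file_size: int, block_size: int) -> int:
    """Size of the final block, which may be smaller than block_size."""
    remainder = file_size % block_size
    return remainder if remainder else min(file_size, block_size)


# Hypothetical files: 200 MB with 64 MB blocks, and 1024 MB with 128 MB blocks.
small = (block_count(200 * MB, 64 * MB), last_block_size(200 * MB, 64 * MB) // MB)
large = (block_count(1024 * MB, 128 * MB), last_block_size(1024 * MB, 128 * MB) // MB)
```

Here the 200 MB file needs 4 blocks with an 8 MB final block, and the 1024 MB file needs 8 full blocks.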
MAP/REDUCE
• PROGRAMMING MODEL FOR DISTRIBUTED DATA PROCESSING
• PROCESSING IS SPLIT INTO TWO PHASES: THE MAP PHASE AND THE REDUCE PHASE
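
The two phases can be imitated on a single machine with a minimal sketch like the one below. It is not the Hadoop API: `run_map_reduce`, the grouping step written out between the phases, and the city temperature readings are all assumptions for illustration.

```python
from collections import defaultdict


def run_map_reduce(records, map_fn, reduce_fn):
    """Run a map phase, group values by key, then run a reduce phase."""
    # Map phase: each record yields zero or more (key, value) pairs.
    intermediate = []
    for record in records:
        intermediate.extend(map_fn(record))

    # Grouping step between the phases: collect all values that share a key.
    groups = defaultdict(list)
    for key, value in intermediate:
        groups[key].append(value)

    # Reduce phase: collapse each key's values into one result.
    return {key: reduce_fn(key, values) for key, values in groups.items()}


# Hypothetical input: (city, temperature) readings; find the maximum per city.
readings = [("Jakarta", 31), ("Bandung", 24), ("Jakarta", 33), ("Bandung", 26)]
hottest = run_map_reduce(
    readings,
    map_fn=lambda reading: [(reading[0], reading[1])],
    reduce_fn=lambda city, temps: max(temps),
)
```

In a real cluster the map calls and reduce calls would run on different machines; the sketch only shows how the work divides into the two phases.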
WORD COUNT EXAMPLE
• MAPPER
  • INPUT: VALUE: LINES OF TEXT OF INPUT
  • OUTPUT: KEY: WORD, VALUE: 1
• REDUCER
  • INPUT: KEY: WORD, VALUE: SET OF COUNTS
  • OUTPUT: KEY: WORD, VALUE: SUM
• LAUNCHING PROGRAM
  • DEFINES THIS JOB
  • SUBMITS JOB TO CLUSTER
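
The mapper and reducer contracts listed above can be written out as a small local Python sketch. It runs in one process rather than on a cluster; `run_job` is a hypothetical stand-in for the launching program, and lowercasing plus whitespace splitting are simplifying assumptions about what counts as a word.

```python
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Input: one line of text. Output: a (word, 1) pair per word."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Input: a word and all of its counts. Output: (word, sum of counts)."""
    return word, sum(counts)


def run_job(lines):
    """Stand-in for the launching program: wires the mapper to the reducer."""
    # Sorting by word puts equal keys next to each other so groupby can batch them.
    pairs = sorted(
        (pair for line in lines for pair in mapper(line)),
        key=itemgetter(0),
    )
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in groupby(pairs, key=itemgetter(0))
    )


totals = run_job(["the quick brown fox", "the lazy dog", "The fox"])
```

With these three sample lines, `totals` maps "the" to 3, "fox" to 2, and every other word to 1.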
WORD COUNT DATAFLOW
THANK YOU
