SlideShare a Scribd company logo
1 of 23
Download to read offline
1 
کارگاه پردازش داده توزیع شده 
پردیس- شهیدبهشتی 
دانشکده علوم و مهندسی کامپیوتر 
درس: پایگاه داده توزیع شده 
استاد: دکتر هادی طباطبایی 
ارائه: ابوالفضل صدیقی 
آبان ۱۳۹۳
Distributed Data Processing 
School of Computer Science and Engineering 
A. Sedighi 
@amirsedighi 
Hexican.com 
sedighi@gmail.com
3 
Every Game needs it's Playing Yard
4 
Every Game needs it's Playing Yard
What can I do on a Single Machine? 
5 
● MVC Programming 
● Regular Biz Apps 
● 100 GBs Data 
● Web Surfing 
● ...
6 
Linux Cluster
7
8
9 
Introduction 
This is a 4 sessions, hands-on, step-by-step 
tutorial on setting up, a Linux cluster on your 
machine (Notebook or PC), to try a few number 
of big-data processing frameworks and tools.
10 
What we are going to do? 
● Your notebook, or a PC is just enough for starting. 
– Setting your Linux cluster up. 
● Distributed Log Management and Realtime Search-Engines 
– What is Elasticsearch? 
– Elasticsearch on the cluster. 
– Monitoring and Usage. 
● The most popular Distributed Data Processing Framework. 
– What is Apache Hadoop? 
– Apache Hadoop on the cluster. 
– Using Scenarios.
11 
What we would Learn? 
● Leveraging our knowledge of Big-Data. 
● Getting familiar with distributed data processing. 
● Maximizing availability and reliability. 
● Increasing data storage capacity. 
● Leveraging data processing performance. 
● Data locality is a silver bullet. 
● Increasing cluster utilization. 
● Taming giants by giving them a try.
12 
Preparing the Linux Cluster - 
VirtualBox
13 
Preparing the Cluster - Hosting 
● VirtualBox 
– Memory Size, Disk Capacity and CPU cores. 
– Network Interfaces. 
● NAT, provides Internet. 
● Host-Only, provides cluster communication.
14 
Preparing the Cluster – Adding a 
Host-Only Network
15 
Preparing the Cluster – Adding a 
NAT Interface
16 
Preparing the Cluster – Adding a 
Host-Only Interface
17 
Preparing the Cluster – First Node 
● Creating a Linux machine inside VirtualBox. 
● Installing Linux. (I've used Ubuntu 12.04) 
– Check Samba 
– Check OpenSSH 
● Give the first node all. 
– Having an “install” folder on. 
– Having primitives such as Java installed on. 
● Shutting down the first node.
Preparing the Cluster – Cloning, The 
18 
Virtual Box Side 
● Cloning the first node. (tutorial)
Preparing the Cluster – Cloning, the 
19 
Linux side 
● Turning the new node on. 
● Network configuration 
– sudo nano /etc/hosts 
– sudo nano /etc/hostname 
– sudo nano /etc/network/interfaces 
– sudo rm /etc/udev/rules.d/70-persistent-net.rules 
● sudo reboot
20 
Preparing the Cluster – No 
Password Login 
● Do this: 
– ssh-keygen 
– ssh-copy-id -i ~/.ssh/id_rsa.pub user@host 
● Or this: 
– ssh-keygen -t dsa -p '' -f ~/.ssh/id_dsa 
– scp .ssh/id_rsa.pub user@host:~/master_key 
– ssh user@host 
– cat master_key >> ./ssh/authorized_keys
21 
Preparing the Cluster – Distributed 
Shell 
● Do it like a Commander 
– Installing DSH (Optional)
22 
Preparing the Cluster – Enjoy it 
● To scale your cluster just repeat the cloning 
step.
23 
Next? 
● An introduction to distributed Log Management 
and analytical search-engines. 
– How Elasticsearch works? 
– Workshop. 
● An introduction to Apache Hadoop 
– How Apache Hadoop works? 
– Workshop.

More Related Content

What's hot

Install hadoop in a cluster
Install hadoop in a clusterInstall hadoop in a cluster
Install hadoop in a clusterXuhong Zhang
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEOpenStack
 
Ceph-Mesos framework
Ceph-Mesos frameworkCeph-Mesos framework
Ceph-Mesos frameworkZhongyue Luo
 
LCA 2012: High Availability Sprint
LCA 2012: High Availability SprintLCA 2012: High Availability Sprint
LCA 2012: High Availability Sprinthastexo
 
Setting up repositories: Technical Requirements, Repository Software, Metad...
Setting up repositories:  Technical Requirements,  Repository Software, Metad...Setting up repositories:  Technical Requirements,  Repository Software, Metad...
Setting up repositories: Technical Requirements, Repository Software, Metad...Iryna Kuchma
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redisiammutex
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSSaumitra Srivastav
 
Guava Overview Part 2 Bucharest JUG #2
Guava Overview Part 2 Bucharest JUG #2 Guava Overview Part 2 Bucharest JUG #2
Guava Overview Part 2 Bucharest JUG #2 Andrei Savu
 
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"Fwdays
 
Get mysql clusterrunning-windows
Get mysql clusterrunning-windowsGet mysql clusterrunning-windows
Get mysql clusterrunning-windowsJoeSg
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friendslucenerevolution
 
Container Security via Monitoring and Orchestration - Container Security Summit
Container Security via Monitoring and Orchestration - Container Security SummitContainer Security via Monitoring and Orchestration - Container Security Summit
Container Security via Monitoring and Orchestration - Container Security SummitDavid Timothy Strauss
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...Glenn K. Lockwood
 
Create a RESTful API with NodeJS, Express and MongoDB
Create a RESTful API with NodeJS, Express and MongoDBCreate a RESTful API with NodeJS, Express and MongoDB
Create a RESTful API with NodeJS, Express and MongoDBHengki Sihombing
 
Hidden gems in Apache Jackrabbit and BloomReach Forge
Hidden gems in Apache Jackrabbit and BloomReach ForgeHidden gems in Apache Jackrabbit and BloomReach Forge
Hidden gems in Apache Jackrabbit and BloomReach ForgeWoonsan Ko
 
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案Hua Chu
 
dba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisdba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisLiviu Costea
 

What's hot (20)

Install hadoop in a cluster
Install hadoop in a clusterInstall hadoop in a cluster
Install hadoop in a cluster
 
Containers > VMs
Containers > VMsContainers > VMs
Containers > VMs
 
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSEUnderstanding blue store, Ceph's new storage backend - Tim Serong, SUSE
Understanding blue store, Ceph's new storage backend - Tim Serong, SUSE
 
Ceph-Mesos framework
Ceph-Mesos frameworkCeph-Mesos framework
Ceph-Mesos framework
 
LCA 2012: High Availability Sprint
LCA 2012: High Availability SprintLCA 2012: High Availability Sprint
LCA 2012: High Availability Sprint
 
Setting up repositories: Technical Requirements, Repository Software, Metad...
Setting up repositories:  Technical Requirements,  Repository Software, Metad...Setting up repositories:  Technical Requirements,  Repository Software, Metad...
Setting up repositories: Technical Requirements, Repository Software, Metad...
 
深入了解Redis
深入了解Redis深入了解Redis
深入了解Redis
 
Friends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFSFriends of Solr - Nutch & HDFS
Friends of Solr - Nutch & HDFS
 
Guava
GuavaGuava
Guava
 
Guava Overview Part 2 Bucharest JUG #2
Guava Overview Part 2 Bucharest JUG #2 Guava Overview Part 2 Bucharest JUG #2
Guava Overview Part 2 Bucharest JUG #2
 
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"
Philipp Krenn "Elasticsearch (R)Evolution — You Know, for Search…"
 
Get mysql clusterrunning-windows
Get mysql clusterrunning-windowsGet mysql clusterrunning-windows
Get mysql clusterrunning-windows
 
Large Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and FriendsLarge Scale Crawling with Apache Nutch and Friends
Large Scale Crawling with Apache Nutch and Friends
 
Caching. api. http 1.1
Caching. api. http 1.1Caching. api. http 1.1
Caching. api. http 1.1
 
Container Security via Monitoring and Orchestration - Container Security Summit
Container Security via Monitoring and Orchestration - Container Security SummitContainer Security via Monitoring and Orchestration - Container Security Summit
Container Security via Monitoring and Orchestration - Container Security Summit
 
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
The Proto-Burst Buffer: Experience with the flash-based file system on SDSC's...
 
Create a RESTful API with NodeJS, Express and MongoDB
Create a RESTful API with NodeJS, Express and MongoDBCreate a RESTful API with NodeJS, Express and MongoDB
Create a RESTful API with NodeJS, Express and MongoDB
 
Hidden gems in Apache Jackrabbit and BloomReach Forge
Hidden gems in Apache Jackrabbit and BloomReach ForgeHidden gems in Apache Jackrabbit and BloomReach Forge
Hidden gems in Apache Jackrabbit and BloomReach Forge
 
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案
TWJUG 2016 - Mogilefs, 簡約可靠的儲存方案
 
dba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redisdba_lounge_Iasi: Everybody likes redis
dba_lounge_Iasi: Everybody likes redis
 

Viewers also liked

Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Amir Sedighi
 
Big Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMBig Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMAmir Sedighi
 
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupCase Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupAmir Sedighi
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache KafkaAmir Sedighi
 
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگآشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگAmir Sedighi
 
Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Amir Sedighi
 

Viewers also liked (7)

Dark data
Dark dataDark data
Dark data
 
Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM Big Data and Machine Learning Workshop - Day 7 @ UTACM
Big Data and Machine Learning Workshop - Day 7 @ UTACM
 
Big Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACMBig Data and Machine Learning Workshop - Day 5 @ UTACM
Big Data and Machine Learning Workshop - Day 5 @ UTACM
 
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User GroupCase Studies on Big-Data Processing and Streaming - Iranian Java User Group
Case Studies on Big-Data Processing and Streaming - Iranian Java User Group
 
An Introduction to Apache Kafka
An Introduction to Apache KafkaAn Introduction to Apache Kafka
An Introduction to Apache Kafka
 
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگآشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
آشنایی با داده‌های بزرگ و تکنیک‌های برنامه‌سازی برای پردازش داده‌های بزرگ
 
Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015Big Data Processing Utilizing Open-source Technologies - May 2015
Big Data Processing Utilizing Open-source Technologies - May 2015
 

Similar to Distributed Data Processing Workshop - SBU

Node in Real Time - The Beginning
Node in Real Time - The BeginningNode in Real Time - The Beginning
Node in Real Time - The BeginningAxilis
 
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...Peter Martin
 
New Jersey Red Hat Users Group Presentation: Provisioning anywhere
New Jersey Red Hat Users Group Presentation: Provisioning anywhereNew Jersey Red Hat Users Group Presentation: Provisioning anywhere
New Jersey Red Hat Users Group Presentation: Provisioning anywhereRodrique Heron
 
Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 2012Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 201244CON
 
Building SuperComputers @ Home
Building SuperComputers @ HomeBuilding SuperComputers @ Home
Building SuperComputers @ HomeAbhishek Parolkar
 
Network Automation: Ansible 101
Network Automation: Ansible 101Network Automation: Ansible 101
Network Automation: Ansible 101APNIC
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013dotCloud
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Docker, Inc.
 
Deploy Mediawiki Using FIWARE Lab Facilities
Deploy Mediawiki Using FIWARE Lab FacilitiesDeploy Mediawiki Using FIWARE Lab Facilities
Deploy Mediawiki Using FIWARE Lab FacilitiesFIWARE
 
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...OpenStack Korea Community
 
OpenStack Integration with OpenContrail and OpenDaylight
OpenStack Integration with OpenContrail and OpenDaylightOpenStack Integration with OpenContrail and OpenDaylight
OpenStack Integration with OpenContrail and OpenDaylightSyed Moneeb
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012Philip Polstra
 
Cobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale EnvironmentsCobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale EnvironmentsMichael Zhang
 
Introduction to Stacki - World's fastest Linux server provisioning Tool
Introduction to Stacki - World's fastest Linux server provisioning ToolIntroduction to Stacki - World's fastest Linux server provisioning Tool
Introduction to Stacki - World's fastest Linux server provisioning ToolSuresh Paulraj
 
NFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center OperationsNFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center OperationsCumulus Networks
 

Similar to Distributed Data Processing Workshop - SBU (20)

Node in Real Time - The Beginning
Node in Real Time - The BeginningNode in Real Time - The Beginning
Node in Real Time - The Beginning
 
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...
Joomla on Raspberry Pi using Nginx - Nederlandse Linux Gebruikers Group novem...
 
New Jersey Red Hat Users Group Presentation: Provisioning anywhere
New Jersey Red Hat Users Group Presentation: Provisioning anywhereNew Jersey Red Hat Users Group Presentation: Provisioning anywhere
New Jersey Red Hat Users Group Presentation: Provisioning anywhere
 
Polstra 44con2012
Polstra 44con2012Polstra 44con2012
Polstra 44con2012
 
Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 2012Hacking and Forensics on the Go - 44CON 2012
Hacking and Forensics on the Go - 44CON 2012
 
Building SuperComputers @ Home
Building SuperComputers @ HomeBuilding SuperComputers @ Home
Building SuperComputers @ Home
 
Network Automation: Ansible 101
Network Automation: Ansible 101Network Automation: Ansible 101
Network Automation: Ansible 101
 
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013Lightweight Virtualization with Linux Containers and Docker | YaC 2013
Lightweight Virtualization with Linux Containers and Docker | YaC 2013
 
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013Lightweight Virtualization with Linux Containers and Docker I YaC 2013
Lightweight Virtualization with Linux Containers and Docker I YaC 2013
 
Deploy Mediawiki Using FIWARE Lab Facilities
Deploy Mediawiki Using FIWARE Lab FacilitiesDeploy Mediawiki Using FIWARE Lab Facilities
Deploy Mediawiki Using FIWARE Lab Facilities
 
Deploy MediaWiki usgin Fiware Lab Facilities
Deploy MediaWiki usgin Fiware Lab FacilitiesDeploy MediaWiki usgin Fiware Lab Facilities
Deploy MediaWiki usgin Fiware Lab Facilities
 
IPv6 training guide - Yuval Shaul
IPv6 training guide - Yuval ShaulIPv6 training guide - Yuval Shaul
IPv6 training guide - Yuval Shaul
 
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
[OpenStack Day in Korea 2015] Track 1-6 - 갈라파고스의 이구아나, 인프라에 오픈소스를 올리다. 그래서 보이...
 
IPv6 at CSCS
IPv6 at CSCSIPv6 at CSCS
IPv6 at CSCS
 
OpenStack Integration with OpenContrail and OpenDaylight
OpenStack Integration with OpenContrail and OpenDaylightOpenStack Integration with OpenContrail and OpenDaylight
OpenStack Integration with OpenContrail and OpenDaylight
 
The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012The Deck by Phil Polstra GrrCON2012
The Deck by Phil Polstra GrrCON2012
 
Cobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale EnvironmentsCobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale Environments
 
Cobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale EnvironmentsCobbler, Func and Puppet: Tools for Large Scale Environments
Cobbler, Func and Puppet: Tools for Large Scale Environments
 
Introduction to Stacki - World's fastest Linux server provisioning Tool
Introduction to Stacki - World's fastest Linux server provisioning ToolIntroduction to Stacki - World's fastest Linux server provisioning Tool
Introduction to Stacki - World's fastest Linux server provisioning Tool
 
NFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center OperationsNFD9 - Matt Peterson, Data Center Operations
NFD9 - Matt Peterson, Data Center Operations
 

More from Amir Sedighi

Big Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMBig Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Amir Sedighi
 
Big Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMBig Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMBig Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMAmir Sedighi
 
Big Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMBig Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMAmir Sedighi
 
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranTwo Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranAmir Sedighi
 
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionHelio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionAmir Sedighi
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingAmir Sedighi
 

More from Amir Sedighi (8)

Big Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACMBig Data and Machine Learning Workshop - Day 6 @ UTACM
Big Data and Machine Learning Workshop - Day 6 @ UTACM
 
Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM Big Data and Machine Learning Workshop - Day 4 @ UTACM
Big Data and Machine Learning Workshop - Day 4 @ UTACM
 
Big Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACMBig Data and Machine Learning Workshop - Day 3 @ UTACM
Big Data and Machine Learning Workshop - Day 3 @ UTACM
 
Big Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACMBig Data and Machine Learning Workshop - Day 2 @ UTACM
Big Data and Machine Learning Workshop - Day 2 @ UTACM
 
Big Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACMBig Data and Machine Learning Workshop - Day 1 @ UTACM
Big Data and Machine Learning Workshop - Day 1 @ UTACM
 
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in IranTwo Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
Two Case Studies Big-Data and Machine Learning at Scale Solutions in Iran
 
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring SolutionHelio, a Continues Real-Time Fraud Detection and Monitoring Solution
Helio, a Continues Real-Time Fraud Detection and Monitoring Solution
 
Opensource Frameworks and BigData Processing
Opensource Frameworks and BigData ProcessingOpensource Frameworks and BigData Processing
Opensource Frameworks and BigData Processing
 

Recently uploaded

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Delhi Call girls
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightDelhi Call girls
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz1
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...shambhavirathore45
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxolyaivanovalion
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Delhi Call girls
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxolyaivanovalion
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxolyaivanovalion
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfMarinCaroMartnezBerg
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfRachmat Ramadhan H
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...amitlee9823
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusTimothy Spann
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 

Recently uploaded (20)

VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
Best VIP Call Girls Noida Sector 39 Call Me: 8448380779
 
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 nightCheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
Cheap Rate Call girls Sarita Vihar Delhi 9205541914 shot 1500 night
 
Invezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signalsInvezz.com - Grow your wealth with trading signals
Invezz.com - Grow your wealth with trading signals
 
Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...Determinants of health, dimensions of health, positive health and spectrum of...
Determinants of health, dimensions of health, positive health and spectrum of...
 
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls Punjabi Bagh 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Zuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptxZuja dropshipping via API with DroFx.pptx
Zuja dropshipping via API with DroFx.pptx
 
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
Best VIP Call Girls Noida Sector 22 Call Me: 8448380779
 
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Saket (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
VidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptxVidaXL dropshipping via API with DroFx.pptx
VidaXL dropshipping via API with DroFx.pptx
 
CebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptxCebaBaby dropshipping via API with DroFX.pptx
CebaBaby dropshipping via API with DroFX.pptx
 
FESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdfFESE Capital Markets Fact Sheet 2024 Q1.pdf
FESE Capital Markets Fact Sheet 2024 Q1.pdf
 
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdfMarket Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
Market Analysis in the 5 Largest Economic Countries in Southeast Asia.pdf
 
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
Chintamani Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore ...
 
Generative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and MilvusGenerative AI on Enterprise Cloud with NiFi and Milvus
Generative AI on Enterprise Cloud with NiFi and Milvus
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 

Distributed Data Processing Workshop - SBU

  • 1. 1 کارگاه پردازش داده توزیع شده پردیس- شهیدبهشتی دانشکده علوم و مهندسی کامپیوتر درس: پایگاه داده توزیع شده استاد: دکتر هادی طباطبایی ارائه: ابوالفضل صدیقی آبان ۱۳۹۳
  • 2. Distributed Data Processing School of Computer Science and Engineering A. Sedighi @amirsedighi Hexican.com sedighi@gmail.com
  • 3. 3 Every Game needs it's Playing Yard
  • 4. 4 Every Game needs it's Playing Yard
  • 5. What can I do on a Single Machine? 5 ● MVC Programming ● Regular Biz Apps ● 100 GBs Data ● Web Surfing ● ...
  • 7. 7
  • 8. 8
  • 9. 9 Introduction This is a 4 sessions, hands-on, step-by-step tutorial on setting up, a Linux cluster on your machine (Notebook or PC), to try a few number of big-data processing frameworks and tools.
  • 10. 10 What we are going to do? ● Your notebook, or a PC is just enough for starting. – Setting your Linux cluster up. ● Distributed Log Management and Realtime Search-Engines – What is Elasticsearch? – Elasticsearch on the cluster. – Monitoring and Usage. ● The most popular Distributed Data Processing Framework. – What is Apache Hadoop? – Apache Hadoop on the cluster. – Using Scenarios.
  • 11. 11 What we would Learn? ● Leveraging our knowledge of Big-Data. ● Getting familiar with distributed data processing. ● Maximizing availability and reliability. ● Increasing data storage capacity. ● Leveraging data processing performance. ● Data locality is a silver bullet. ● Increasing cluster utilization. ● Taming giants by giving them a try.
  • 12. 12 Preparing the Linux Cluster - VirtualBox
  • 13. 13 Preparing the Cluster - Hosting ● VirtualBox – Memory Size, Disk Capacity and CPU cores. – Network Interfaces. ● NAT, provides Internet. ● Host-Only, provides cluster communication.
  • 14. 14 Preparing the Cluster – Adding a Host-Only Network
  • 15. 15 Preparing the Cluster – Adding a NAT Interface
  • 16. 16 Preparing the Cluster – Adding a Host-Only Interface
  • 17. 17 Preparing the Cluster – First Node ● Creating a Linux machine inside VirtualBox. ● Installing Linux. (I've used Ubuntu 12.04) – Check Samba – Check OpenSSH ● Give the first node all. – Having an “install” folder on. – Having primitives such as Java installed on. ● Shutting down the first node.
  • 18. Preparing the Cluster – Cloning, The 18 Virtual Box Side ● Cloning the first node. (tutorial)
  • 19. Preparing the Cluster – Cloning, the 19 Linux side ● Turning the new node on. ● Network configuration – sudo nano /etc/hosts – sudo nano /etc/hostname – sudo nano /etc/network/interfaces – sudo rm /etc/udev/rules.d/70-persistent-net.rules ● sudo reboot
  • 20. 20 Preparing the Cluster – No Password Login ● Do this: – ssh-keygen – ssh-copy-id -i ~/.ssh/id_rsa.pub user@host ● Or this: – ssh-keygen -t dsa -p '' -f ~/.ssh/id_dsa – scp .ssh/id_rsa.pub user@host:~/master_key – ssh user@host – cat master_key >> ./ssh/authorized_keys
  • 21. 21 Preparing the Cluster – Distributed Shell ● Do it like a Commander – Installing DSH (Optional)
  • 22. 22 Preparing the Cluster – Enjoy it ● To scale your cluster just repeat the cloning step.
  • 23. 23 Next? ● An introduction to distributed Log Management and analytical search-engines. – How Elasticsearch works? – Workshop. ● An introduction to Apache Hadoop – How Apache Hadoop works? – Workshop.