SlideShare a Scribd company logo
1 of 41
Apache Spark
Agenda
Hadoop vs Spark: Big ‘Big Data’ question
Spark Ecosystem
What is RDD
Operations on RDD: Actions vs
Transformations
Running in cluster
Task schedulers
Spark Streaming
Dataframes API
Let’s remember: MapReduce
Apache Hadoop MapReduce
Hadoop VS/AND Spark
Hadoop: DFS
Spark: Speed (RAM)
Spark ecosystem
Glossary
Job
RDD
Stages
Tasks
DAG
Executor
Driver
Simple Example
RDD: Resilient Distributed Dataset
Represents an immutable, partitioned collection of elements that can be
operated in parallel with failure recovery possibilities.
Example
Hadoop RDD
getPartitions = HDFS blocks
getDependencies = None
compute = load block in memory
getPrefferedLocations = HDFS block locations
partitioner = None
MapPartitions RDD
getPartitions = same as parent
getDependencies = parent RDD
compute = compute parent and apply map()
getPrefferedLocations = same as parent
partitioner = None
RDD: Resilient Distributed Dataset
RDD Example
RDD Example
RDD Operations
● Transformations
○ Apply user function to every element in a partition
○ Apply aggregation function to a whole dataset
(groupBy, sortBy)
○ Provide functionality for repartitioning (repartition,
partitionBy)
● Actions
○ Materialize computation results (collect, count,
take)
○ Store RDDs in memory or on disk (cache, persist)
RDD Dependencies
DAG: Directed Acyclic Graph
All the operators in a job
are used to construct a
DAG (Directed Acyclic
Graph). The DAG is
optimized by rearranging
and combining operators
where possible.
DAG Example
DAG Scheduler
The DAG scheduler divides
operators into stages of
tasks. A stage is comprised
of tasks based on partitions
of the input data. Pipelines
operators together.
DAG Scheduler example
RDD Persistence: persist() & cache()
When you persist an RDD, each node stores any partitions of it that it computes in memory
and reuses them in other actions on that dataset (or datasets derived from it).
Storage levels: MEMORY_ONLY (default), MEMORY_AND_DISK,
MEMORY_ONLY_SER, MEMORY_AND_DISK_SER, DISK_ONLY,
MEMORY_ONLY_2, MEMORY_AND_DISK_2, etc.
Removing data: least-recently-used (LRU) fashion or RDD.unpersist() method.
Job execution
Task Schedulers
Standalone
Default
FIFO strategy
Controls number of CPU
cores and executor
memory
YARN
Hadoop oriented
Takes all available
resources
Was designed for
stateless batch jobs
that can be restarted
easily if they fail.
Mesos
Resource oriented
Dynamic sharing or CPU
cores
Less predictive latency
Spark Driver (application)
Running in cluster
Memory usage
• Execution memory
• Storage for data needed during tasks execution
• Shuffle-related data
• Storage memory
• Cached RDDs
• Possible to borrow from execution memory
• User memory
• User data structures and internal metadata
• Safeguarding against OOM
• Reserved memory
• Memory needed for running executor itself
Spark Streaming
Spark Streaming: Basic Concept
Spark Streaming: Architecture
Spark Streaming receives live input data streams and divides the data into
batches, which are then processed by the Spark engine to generate the final
stream of results in batches.
Discretized Streams (DStreams)
Windowed computations
Spark Streaming checkpoints
• Create heavy objects in foreachRDD
• Default persistence level of DStreams keeps the data serialized in memory.
• Checkpointing (metadata and received data)
• Automatic restart (task manager)
• Max receiving rate
• Level of Parallelism
• Kryo serialization
Spark Streaming Example
Spark Dataframes
(SQL)
Apache Hive
• Hadoop product
• Stores metadata in the relational database, but data only in HDFS
• Is not suited for real time data processing
• Best used for batch jobs over large datasets of immutable data (web logs)
Is a good choice if you:
• Want to query the data
• When you’re familiar with SQL
About Spark SQL
Part of Spark core since April 2014
Works with structured data
Mixes SQL queries with Spark programs
Connect to any datasource (files, Hive
tables, external databases, RDDs)
Spark Dataframes
Spark Dataframes
Spark SQL
Spark SQL with schema
Dataframes benchmark
Q&A

More Related Content

What's hot

Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architectureSohil Jain
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introductionsudhakara st
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark FundamentalsZahra Eskandari
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideWhizlabs
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...Simplilearn
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark Summit
 
Spark overview
Spark overviewSpark overview
Spark overviewLisa Hua
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationDatabricks
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark Aakashdata
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark InternalsPietro Michiardi
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...Edureka!
 
Programming in Spark using PySpark
Programming in Spark using PySpark      Programming in Spark using PySpark
Programming in Spark using PySpark Mostafa
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshConfluentInc1
 
Apache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & librariesApache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & librariesWalaa Hamdy Assy
 

What's hot (20)

Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
Apache Spark Introduction
Apache Spark IntroductionApache Spark Introduction
Apache Spark Introduction
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark Fundamentals
 
Learn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive GuideLearn Apache Spark: A Comprehensive Guide
Learn Apache Spark: A Comprehensive Guide
 
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
What Is Apache Spark? | Introduction To Apache Spark | Apache Spark Tutorial ...
 
Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
 
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
Spark + Parquet In Depth: Spark Summit East Talk by Emily Curtin and Robbie S...
 
Spark overview
Spark overviewSpark overview
Spark overview
 
Apache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper OptimizationApache Spark Core—Deep Dive—Proper Optimization
Apache Spark Core—Deep Dive—Proper Optimization
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
 
Apache Spark Overview
Apache Spark OverviewApache Spark Overview
Apache Spark Overview
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Spark architecture
Spark architectureSpark architecture
Spark architecture
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Map reduce vs spark
Map reduce vs sparkMap reduce vs spark
Map reduce vs spark
 
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
What is Apache Spark | Apache Spark Tutorial For Beginners | Apache Spark Tra...
 
Programming in Spark using PySpark
Programming in Spark using PySpark      Programming in Spark using PySpark
Programming in Spark using PySpark
 
Apache Kafka® and the Data Mesh
Apache Kafka® and the Data MeshApache Kafka® and the Data Mesh
Apache Kafka® and the Data Mesh
 
Apache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & librariesApache spark - Architecture , Overview & libraries
Apache spark - Architecture , Overview & libraries
 

Viewers also liked

Qa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana KharchukQa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana KharchukDataArt
 
Propiedad intelectual del soft ware
Propiedad intelectual del soft warePropiedad intelectual del soft ware
Propiedad intelectual del soft wareJoel Quintana
 
The Rental Policies You Need to Know About
The Rental Policies You Need to Know AboutThe Rental Policies You Need to Know About
The Rental Policies You Need to Know AboutUrbanBound
 
IR
IRIR
IRMAK
 
Роман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слонРоман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слонDataArt
 
Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"DataArt
 
photos
photosphotos
photosdiakxr
 
Visiting unpleasent places
Visiting unpleasent placesVisiting unpleasent places
Visiting unpleasent placesArpanasa
 
Estrategika nuevos productos proteccion
Estrategika nuevos productos proteccionEstrategika nuevos productos proteccion
Estrategika nuevos productos proteccionJUAN CARLOS CALDERON
 
Bit trade labs sovereign identity fintech summit 2016
Bit trade labs sovereign identity   fintech summit 2016Bit trade labs sovereign identity   fintech summit 2016
Bit trade labs sovereign identity fintech summit 2016Glen Frost
 
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...DataArt
 
นิทาน
นิทานนิทาน
นิทานExitOfLove
 
Reader’s theater (1)
Reader’s theater (1)Reader’s theater (1)
Reader’s theater (1)IIPCONX
 
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"DataArt
 
Uses and gratification theory
Uses and gratification theoryUses and gratification theory
Uses and gratification theoryAbbey Cotterill
 
Android wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt KharkovAndroid wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt KharkovDataArt
 

Viewers also liked (20)

Qa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana KharchukQa talk-test manager-Oksana Kharchuk
Qa talk-test manager-Oksana Kharchuk
 
Propiedad intelectual del soft ware
Propiedad intelectual del soft warePropiedad intelectual del soft ware
Propiedad intelectual del soft ware
 
The Rental Policies You Need to Know About
The Rental Policies You Need to Know AboutThe Rental Policies You Need to Know About
The Rental Policies You Need to Know About
 
IR
IRIR
IR
 
Роман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слонРоман Еникеев - PHP или откуда взялся слон
Роман Еникеев - PHP или откуда взялся слон
 
Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"Андрей Беляев "Мыслить как заказчик"
Андрей Беляев "Мыслить как заказчик"
 
photos
photosphotos
photos
 
Visiting unpleasent places
Visiting unpleasent placesVisiting unpleasent places
Visiting unpleasent places
 
Mapas etiquetas
Mapas etiquetasMapas etiquetas
Mapas etiquetas
 
Estrategika nuevos productos proteccion
Estrategika nuevos productos proteccionEstrategika nuevos productos proteccion
Estrategika nuevos productos proteccion
 
Biblioterapia
BiblioterapiaBiblioterapia
Biblioterapia
 
Bit trade labs sovereign identity fintech summit 2016
Bit trade labs sovereign identity   fintech summit 2016Bit trade labs sovereign identity   fintech summit 2016
Bit trade labs sovereign identity fintech summit 2016
 
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
Арсений Жижелев «Наблюдение за игровым миром Аллодов (Play+Scala+Slick+Postgr...
 
นิทาน
นิทานนิทาน
นิทาน
 
Reader’s theater (1)
Reader’s theater (1)Reader’s theater (1)
Reader’s theater (1)
 
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
Игорь Савка "Как выжить в безнадежном проекте. Личный опыт"
 
Uses and gratification theory
Uses and gratification theoryUses and gratification theory
Uses and gratification theory
 
Joint venture
Joint ventureJoint venture
Joint venture
 
Bio pharma vessels & tanks
Bio pharma vessels & tanksBio pharma vessels & tanks
Bio pharma vessels & tanks
 
Android wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt KharkovAndroid wear, Alexey Rybakov DataArt Kharkov
Android wear, Alexey Rybakov DataArt Kharkov
 

Similar to Apache Spark overview

Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Djamel Zouaoui
 
Geek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and ScalaGeek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and ScalaAtif Akhtar
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxRahul Borate
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014cdmaxime
 
Apache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.pptApache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.pptbhargavi804095
 
Spark from the Surface
Spark from the SurfaceSpark from the Surface
Spark from the SurfaceJosi Aranda
 
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2 Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2 Olalekan Fuad Elesin
 
Apache Spark Introduction.pdf
Apache Spark Introduction.pdfApache Spark Introduction.pdf
Apache Spark Introduction.pdfMaheshPandit16
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache SparkGao Yunzhong
 
TriHUG talk on Spark and Shark
TriHUG talk on Spark and SharkTriHUG talk on Spark and Shark
TriHUG talk on Spark and Sharktrihug
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irdatastack
 
Apache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup TalkApache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup TalkEren Avşaroğulları
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Managerharidasnss
 
Bigdata processing with Spark - part II
Bigdata processing with Spark - part IIBigdata processing with Spark - part II
Bigdata processing with Spark - part IIArjen de Vries
 

Similar to Apache Spark overview (20)

Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming Paris Data Geek - Spark Streaming
Paris Data Geek - Spark Streaming
 
Apache Spark on HDinsight Training
Apache Spark on HDinsight TrainingApache Spark on HDinsight Training
Apache Spark on HDinsight Training
 
Spark architechure.pptx
Spark architechure.pptxSpark architechure.pptx
Spark architechure.pptx
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Intro to Apache Spark
Intro to Apache SparkIntro to Apache Spark
Intro to Apache Spark
 
Geek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and ScalaGeek Night - Functional Data Processing using Spark and Scala
Geek Night - Functional Data Processing using Spark and Scala
 
Big data overview
Big data overviewBig data overview
Big data overview
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
 
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
Apache Spark - Las Vegas Big Data Meetup Dec 3rd 2014
 
Apache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.pptApache Spark™ is a multi-language engine for executing data-S5.ppt
Apache Spark™ is a multi-language engine for executing data-S5.ppt
 
Spark from the Surface
Spark from the SurfaceSpark from the Surface
Spark from the Surface
 
Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2 Introduction to Apache Spark :: Lagos Scala Meetup session 2
Introduction to Apache Spark :: Lagos Scala Meetup session 2
 
Apache Spark Introduction.pdf
Apache Spark Introduction.pdfApache Spark Introduction.pdf
Apache Spark Introduction.pdf
 
Study Notes: Apache Spark
Study Notes: Apache SparkStudy Notes: Apache Spark
Study Notes: Apache Spark
 
TriHUG talk on Spark and Shark
TriHUG talk on Spark and SharkTriHUG talk on Spark and Shark
TriHUG talk on Spark and Shark
 
Big data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.irBig data vahidamiri-tabriz-13960226-datastack.ir
Big data vahidamiri-tabriz-13960226-datastack.ir
 
Apache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup TalkApache Spark Fundamentals Meetup Talk
Apache Spark Fundamentals Meetup Talk
 
Apache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource ManagerApache spark on Hadoop Yarn Resource Manager
Apache spark on Hadoop Yarn Resource Manager
 
Bigdata processing with Spark - part II
Bigdata processing with Spark - part IIBigdata processing with Spark - part II
Bigdata processing with Spark - part II
 
Apache Spark
Apache SparkApache Spark
Apache Spark
 

More from DataArt

DataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human ApproachDataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human ApproachDataArt
 
DataArt Healthcare & Life Sciences
DataArt Healthcare & Life SciencesDataArt Healthcare & Life Sciences
DataArt Healthcare & Life SciencesDataArt
 
DataArt Financial Services and Capital Markets
DataArt Financial Services and Capital MarketsDataArt Financial Services and Capital Markets
DataArt Financial Services and Capital MarketsDataArt
 
About DataArt HR Partners
About DataArt HR PartnersAbout DataArt HR Partners
About DataArt HR PartnersDataArt
 
Event management в IT
Event management в ITEvent management в IT
Event management в ITDataArt
 
Digital Marketing from inside
Digital Marketing from insideDigital Marketing from inside
Digital Marketing from insideDataArt
 
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)DataArt
 
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проектDevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проектDataArt
 
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArtIT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArtDataArt
 
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
 «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han... «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...DataArt
 
Communication in QA's life
Communication in QA's lifeCommunication in QA's life
Communication in QA's lifeDataArt
 
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьмиНельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьмиDataArt
 
Знакомьтесь, DevOps
Знакомьтесь, DevOpsЗнакомьтесь, DevOps
Знакомьтесь, DevOpsDataArt
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real lifeDataArt
 
Codeless: автоматизация тестирования
Codeless: автоматизация тестированияCodeless: автоматизация тестирования
Codeless: автоматизация тестированияDataArt
 
Selenoid
SelenoidSelenoid
SelenoidDataArt
 
Selenide
SelenideSelenide
SelenideDataArt
 
A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"DataArt
 
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...DataArt
 
IT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNGIT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNGDataArt
 

More from DataArt (20)

DataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human ApproachDataArt Custom Software Engineering with a Human Approach
DataArt Custom Software Engineering with a Human Approach
 
DataArt Healthcare & Life Sciences
DataArt Healthcare & Life SciencesDataArt Healthcare & Life Sciences
DataArt Healthcare & Life Sciences
 
DataArt Financial Services and Capital Markets
DataArt Financial Services and Capital MarketsDataArt Financial Services and Capital Markets
DataArt Financial Services and Capital Markets
 
About DataArt HR Partners
About DataArt HR PartnersAbout DataArt HR Partners
About DataArt HR Partners
 
Event management в IT
Event management в ITEvent management в IT
Event management в IT
 
Digital Marketing from inside
Digital Marketing from insideDigital Marketing from inside
Digital Marketing from inside
 
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)What's new in Android, Igor Malytsky ( Google Post I|O Tour)
What's new in Android, Igor Malytsky ( Google Post I|O Tour)
 
DevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проектDevOps Workshop:Что бывает, когда DevOps приходит на проект
DevOps Workshop:Что бывает, когда DevOps приходит на проект
 
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArtIT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
IT Talk Kharkiv: «‎Soft skills в IT. Польза или вред? Максим Бастион, DataArt
 
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
 «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han... «Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
«Ноль копеек. Спастись от выгорания» — Сергей Чеботарев (Head of Design, Han...
 
Communication in QA's life
Communication in QA's lifeCommunication in QA's life
Communication in QA's life
 
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьмиНельзя просто так взять и договориться, или как мы работали со сложными людьми
Нельзя просто так взять и договориться, или как мы работали со сложными людьми
 
Знакомьтесь, DevOps
Знакомьтесь, DevOpsЗнакомьтесь, DevOps
Знакомьтесь, DevOps
 
DevOps in real life
DevOps in real lifeDevOps in real life
DevOps in real life
 
Codeless: автоматизация тестирования
Codeless: автоматизация тестированияCodeless: автоматизация тестирования
Codeless: автоматизация тестирования
 
Selenoid
SelenoidSelenoid
Selenoid
 
Selenide
SelenideSelenide
Selenide
 
A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"A. Sirota "Building an Automation Solution based on Appium"
A. Sirota "Building an Automation Solution based on Appium"
 
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
Эмоциональный интеллект или как не сойти с ума в условиях сложного и динамичн...
 
IT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNGIT talk: Как я перестал бояться и полюбил TestNG
IT talk: Как я перестал бояться и полюбил TestNG
 

Recently uploaded

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSAnaAcapella
 
Major project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesMajor project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesAmanpreetKaur157993
 
Improved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppImproved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppCeline George
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdfFICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdfPondicherry University
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文中 央社
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...Nguyen Thanh Tu Collection
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxannathomasp01
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesPooky Knightsmith
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMELOISARIVERA8
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024Borja Sotomayor
 
The Liver & Gallbladder (Anatomy & Physiology).pptx
The Liver &  Gallbladder (Anatomy & Physiology).pptxThe Liver &  Gallbladder (Anatomy & Physiology).pptx
The Liver & Gallbladder (Anatomy & Physiology).pptxVishal Singh
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxLimon Prince
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnershipsexpandedwebsite
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................MirzaAbrarBaig5
 

Recently uploaded (20)

Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPSSpellings Wk 4 and Wk 5 for Grade 4 at CAPS
Spellings Wk 4 and Wk 5 for Grade 4 at CAPS
 
Major project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategiesMajor project report on Tata Motors and its marketing strategies
Major project report on Tata Motors and its marketing strategies
 
Improved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio AppImproved Approval Flow in Odoo 17 Studio App
Improved Approval Flow in Odoo 17 Studio App
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdfFICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
FICTIONAL SALESMAN/SALESMAN SNSW 2024.pdf
 
OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...OS-operating systems- ch05 (CPU Scheduling) ...
OS-operating systems- ch05 (CPU Scheduling) ...
 
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文會考英文
 
Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"Mattingly "AI and Prompt Design: LLMs with NER"
Mattingly "AI and Prompt Design: LLMs with NER"
 
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
TỔNG HỢP HƠN 100 ĐỀ THI THỬ TỐT NGHIỆP THPT TOÁN 2024 - TỪ CÁC TRƯỜNG, TRƯỜNG...
 
Supporting Newcomer Multilingual Learners
Supporting Newcomer  Multilingual LearnersSupporting Newcomer  Multilingual Learners
Supporting Newcomer Multilingual Learners
 
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptxCOMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
COMMUNICATING NEGATIVE NEWS - APPROACHES .pptx
 
ESSENTIAL of (CS/IT/IS) class 07 (Networks)
ESSENTIAL of (CS/IT/IS) class 07 (Networks)ESSENTIAL of (CS/IT/IS) class 07 (Networks)
ESSENTIAL of (CS/IT/IS) class 07 (Networks)
 
Trauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical PrinciplesTrauma-Informed Leadership - Five Practical Principles
Trauma-Informed Leadership - Five Practical Principles
 
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUMDEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
DEMONSTRATION LESSON IN ENGLISH 4 MATATAG CURRICULUM
 
UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024UChicago CMSC 23320 - The Best Commit Messages of 2024
UChicago CMSC 23320 - The Best Commit Messages of 2024
 
The Liver & Gallbladder (Anatomy & Physiology).pptx
The Liver &  Gallbladder (Anatomy & Physiology).pptxThe Liver &  Gallbladder (Anatomy & Physiology).pptx
The Liver & Gallbladder (Anatomy & Physiology).pptx
 
Including Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdfIncluding Mental Health Support in Project Delivery, 14 May.pdf
Including Mental Health Support in Project Delivery, 14 May.pdf
 
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptxAnalyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
Analyzing and resolving a communication crisis in Dhaka textiles LTD.pptx
 
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community PartnershipsSpring gala 2024 photo slideshow - Celebrating School-Community Partnerships
Spring gala 2024 photo slideshow - Celebrating School-Community Partnerships
 
male presentation...pdf.................
male presentation...pdf.................male presentation...pdf.................
male presentation...pdf.................
 

Apache Spark overview

Editor's Notes

  1. Исполнение в кластере Параллельность Отказоустойчивость Скорость Различные форматы данных Мониторинг и распределение ресурсов
  2. Spark’s cache is fault-tolerant – if any partition of an RDD is lost, it will automatically be recomputed using the transformations that originally created it. Пример: вычитываем данные из файла, берем имя работника, его должность, зарплату и возраст. Фильтруем по нужным должностям. Потом хоть аггрегировать: среднюю зп по должности и по возрасту.
  3. window length - The duration of the window (3 in the figure). sliding interval - The interval at which the window operation is performed (2 in the figure).
  4. Пример с созданием коннекшена к базе