APACHE SPARK
Introduction
Apache Spark is an open-source cluster computing system
that aims to make data analytics fast, both to run and
to write.
Apache Spark is a fast, in-memory data processing engine
with smart, expressive development APIs in Scala, Java,
Python, and R that allow data workers to efficiently
execute machine learning algorithms that require fast
iterative access to datasets.
Speed
• Run programs up to 100x faster than Hadoop
MapReduce in memory, or 10x faster on disk.
• Apache Spark has an advanced DAG (directed acyclic
graph) execution engine that supports acyclic data flow
and in-memory computing.
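To make the idea of a DAG of work concrete, here is a minimal sketch in plain Python. It is not Spark's scheduler or API: the step names are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for the engine. It only shows that a job described as an acyclic graph of dependent steps can be ordered so that every step runs after the steps it depends on.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical job: each key is a step, each value is the set of
# steps that must finish before it can start.
job = {
    "read_logs": set(),
    "read_users": set(),
    "parse_logs": {"read_logs"},
    "join": {"parse_logs", "read_users"},
    "aggregate": {"join"},
}

# Because the graph has no cycles, a valid execution order exists:
# every step appears after all of its dependencies.
order = list(TopologicalSorter(job).static_order())
print(order)
```

Steps with no dependency between them (here `read_logs` and `read_users`) could run in parallel, which is the property a DAG-based engine can exploit.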
Ease of Use
• Write applications quickly in Java, Scala, Python, and R.
• Spark offers over 80 high-level operators that make
it easy to build parallel apps, and you can use
it interactively from the Scala, Python, and R shells.
Generality
• Combine SQL, streaming, and complex analytics.
• Spark powers a stack of libraries including SQL and
DataFrames, MLlib for machine learning, GraphX,
and Spark Streaming. You can combine these
libraries seamlessly in the same application.
Runs Everywhere
• Spark runs on Hadoop, Mesos, standalone, or in
the cloud. It can access diverse data sources
including HDFS, Cassandra, HBase, and S3.
[Figure labels: Spark, Spark SQL, Hadoop, Hive, HDFS, HBase]
Spark makes it easy to get started writing powerful Big Data applications
• Spark uses a different data storage model, resilient
distributed datasets (RDDs), with a clever way of
guaranteeing fault tolerance that minimizes network I/O.
• Spark has become another data processing engine in the
Hadoop ecosystem, which is good for businesses and the
community because it adds capability to the Hadoop
stack.
• Spark enables applications in Hadoop clusters to run up
to 100x faster in memory, and 10x faster even when
running on disk. It does this by reducing the
number of reads and writes to disk: intermediate
processing data is kept in memory.
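The effect of keeping intermediate data in memory can be illustrated with a small plain-Python sketch. This is a toy model, not Spark code: `CountingStore`, `iterate_from_disk`, and `iterate_in_memory` are hypothetical names, and the "disk" is just a list that counts how often it is read. It compares an iterative job that reloads its input on every pass with one that loads it once and reuses the in-memory copy.

```python
class CountingStore:
    """Stands in for disk or HDFS; counts how many times the data is read."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def load(self):
        self.reads += 1
        return list(self._records)


def iterate_from_disk(store, iterations):
    """Re-read the input from the store on every pass."""
    total = 0
    for _ in range(iterations):
        total += sum(store.load())
    return total


def iterate_in_memory(store, iterations):
    """Read the input once, keep it in memory, and reuse it on every pass."""
    cached = store.load()
    total = 0
    for _ in range(iterations):
        total += sum(cached)
    return total


disk_store = CountingStore(range(1000))
disk_total = iterate_from_disk(disk_store, iterations=10)

memory_store = CountingStore(range(1000))
memory_total = iterate_in_memory(memory_store, iterations=10)

# Same answer, but the in-memory version touches the store only once.
print(disk_store.reads, memory_store.reads)
```

Both versions compute the same result; only the number of reads differs, which is why iterative workloads benefit most from in-memory data.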
Spark SQL
• Spark SQL is a component on top of Spark Core that
introduces a data abstraction called
SchemaRDD (renamed DataFrame in Spark 1.3), which
provides support for structured
and semi-structured data.
Spark advantages
• Iterative algorithms in machine learning
• Interactive data mining and data processing
• Data warehousing: Spark supports Apache Hive-compatible
data warehousing that can run up to 100x faster than
Hive.
• Stream processing: log processing and fraud
detection in live streams for alerts, aggregates, and
analysis
• Sensor data processing: where data is fetched and
joined from multiple sources, in-memory datasets are
really helpful because they are easy and fast to process.
Spark Shell
• Spark provides an interactive shell, a powerful tool
for analyzing data interactively. It is available in
Scala and Python. Spark's primary
abstraction is a distributed collection of items called
a Resilient Distributed Dataset (RDD). RDDs can be
created from Hadoop InputFormats (such as HDFS
files) or by transforming other RDDs.
RDD Transformations
• An RDD transformation returns a pointer to a new RDD and
lets you create dependencies between RDDs. Each
RDD in a dependency chain (string of dependencies) has a
function for calculating its data and a pointer
(dependency) to its parent RDD.
• Spark is lazy, so nothing is executed until you call
an action (such as collect or count) that triggers job
creation and execution. Transformations on their own
only extend the dependency chain.
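The dependency chain and lazy evaluation described above can be sketched with a toy class in plain Python. This is an illustration under stated assumptions, not Spark's implementation or API: `ToyRDD` and `lineage_depth` are hypothetical names, and the data lives in an ordinary list rather than being distributed. Each object holds a pointer to its parent and a function for computing its data, and no function runs until the `collect` action is called.

```python
class ToyRDD:
    """Toy stand-in for an RDD: a parent pointer plus a compute function."""

    def __init__(self, parent=None, compute=None, source=None):
        self.parent = parent      # dependency on the parent dataset (None for a source)
        self._compute = compute   # derives this dataset from the parent's data
        self._source = source     # raw data, only set on a source dataset

    # Transformations: return a new ToyRDD immediately and do no work.
    def map(self, fn):
        return ToyRDD(parent=self, compute=lambda rows: [fn(r) for r in rows])

    def filter(self, pred):
        return ToyRDD(parent=self, compute=lambda rows: [r for r in rows if pred(r)])

    # Action: walk the dependency chain and actually compute the result.
    def collect(self):
        if self.parent is None:
            return list(self._source)
        return self._compute(self.parent.collect())

    def lineage_depth(self):
        """Number of transformations between this dataset and its source."""
        depth, node = 0, self
        while node.parent is not None:
            depth, node = depth + 1, node.parent
        return depth


calls = []

def double(x):
    calls.append(x)   # record each invocation so laziness is observable
    return x * 2

numbers = ToyRDD(source=[1, 2, 3, 4])
evens_doubled = numbers.map(double).filter(lambda x: x % 4 == 0)

calls_before_action = len(calls)   # only the plan has been built so far
result = evens_doubled.collect()   # the action triggers the computation
print(calls_before_action, result)
```

Because every dataset remembers how it was derived from its parent, a lost result can be recomputed by calling `collect` again, which is the idea behind lineage-based fault tolerance.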
