HADOOP VS
SPARK
YOUR PRESENTER – SAMPAT KUMAR BUDANKAYALA
• Sr. Big Data Analyst @ Harman Solutions
• Over 4.5 years of Big Data experience across 15-20 projects.
• Specialist in building Data Lake projects, Data Security, Streaming
Solutions (real-time ingestion), Linear Regression and building
Recommendation Systems.
• Email: sampatbigdata@gmail.com
• Linkedin:
AGENDA
• Around the Globe (Spark and Hadoop)
• Big Data, Big Data Stack, Apache Hadoop, Apache Spark.
• What is Hadoop and What is Spark ?
• Spark vs Hadoop and the combination effect.
• Q & A
Around the Globe:
NEWS:
----------
• Is it Spark ‘vs’ or ‘and’ Hadoop?
• Apache Spark is continuing beyond Apache Hadoop.
SURVEYS:
--------------
• Big Data, the analysis of large quantities of data to gain new insight, has become a ubiquitous phrase in
recent years. Data is growing at a staggering rate day by day, and Hadoop is one of the technologies that
handles Big Data efficiently.
• Hadoop uses the MapReduce programming model to process jobs over large data volumes.
http://www.ijetae.com/files/Volume4Issue5/IJETAE_0514_15.pdf
• Hadoop's historic focus on batch processing of data was well supported by MapReduce, but there is an
appetite for more flexible developer tools to support the larger market of 'mid-size' datasets and use
cases that call for real-time processing.
http://www.marketwired.com/press-release/survey-indicates-apache-spark-gaining-developer-adoption-as-big-datas-projects-require-1986162.htm
Around the Globe (cont.):
Big Data, Big Data Stack, Apache Spark and Hadoop
Big Data
----------
• Big data is a term that describes large volumes of data: structured, semi-structured and unstructured.
• But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can
be analyzed for insights that lead to better decisions and strategic business moves.
• The concept gained momentum in the early 2000s when industry analysts articulated the now-mainstream
definition of big data as the three Vs:
Volume – Organizations collect data from a variety of sources, including business transactions, social media and
information from sensor or machine-to-machine data. In the past, storing it would’ve been a problem – but new
technologies (such as Hadoop) have eased the burden.
Velocity – Data streams in at an unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and
smart metering are driving the need to deal with torrents of data in near-real time.
Variety – Data comes in all types of formats – from structured, numeric data in traditional databases to unstructured text
documents, email, video, audio, stock ticker data and financial transactions.
https://www.zettaset.com/index.php/info-center/what-is-big-data/
Big Data Stack
-------------------
Apache Hadoop
---------------------
• Hadoop is a framework designed to work with data sets far larger than conventional systems can handle.
• Hadoop distributes this data across a set of machines. Its real power comes from its ability to scale to
hundreds or thousands of computers, each containing several processor cores.
• Many big enterprises believe that within a few years more than half of the world’s data will be stored in Hadoop.
• Hadoop mainly consists of:
1. Hadoop Distributed File System (HDFS): a distributed file system that provides storage and fault tolerance.
2. Hadoop MapReduce: a parallel programming model that processes vast quantities of data via
distributed computing across the cluster.
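The MapReduce model named above can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce phases using a word count, not Hadoop's actual Java API; the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the sample input are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in order on an in-memory list of lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input, standing in for file blocks stored in HDFS.
    sample = ["spark and hadoop", "hadoop stores data", "spark processes data"]
    print(word_count(sample))
```

In a real cluster the map and reduce calls run on different machines and the shuffle moves data over the network; here everything runs in one process so the data flow is easy to follow.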
Apache Spark
---------------------
• Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs to
allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast
iterative access to datasets.
• Apache Spark consists of Spark Core and a set of libraries. The core is the distributed execution engine, and the
Java, Scala, and Python APIs offer a platform for distributed ETL application development.
• Spark is designed for data science and its abstraction makes data science easier. Data scientists commonly use
machine learning – a set of techniques and algorithms that can learn from data.
• The Spark project claims that programs can run up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
Spark Vs Hadoop and the combination effect
Performance
-----------------
• Apache Spark processes data in memory, while Hadoop MapReduce writes results back to disk after each map or
reduce step, so Spark should outperform Hadoop MapReduce.
• Nonetheless, Spark needs a lot of memory. Much like standard DBs, it loads a process into memory and keeps it
there until further notice, for the sake of caching.
• If Spark runs on Hadoop YARN alongside other resource-demanding services, or if the data is too big to fit entirely in
memory, Spark's performance could degrade significantly.
• MapReduce, however, kills its processes as soon as a job is done, so it can easily run alongside other services with
minor performance differences.
• Bottom line: Spark performs better when all the data fits in memory, especially on dedicated clusters; Hadoop
MapReduce is designed for data that doesn’t fit in memory, and it can run well alongside other services.
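The difference between re-reading data on every step and keeping it in memory can be illustrated with a small sketch. It uses plain Python rather than Spark or Hadoop, and it counts reads instead of measuring time; the CountingSource class and both functions are hypothetical names introduced for this example.

```python
from typing import List


class CountingSource:
    """Hypothetical stand-in for a dataset on disk; counts how often it is read."""

    def __init__(self, values: List[float]) -> None:
        self._values = values
        self.reads = 0

    def load(self) -> List[float]:
        """Simulate one full read of the dataset from storage."""
        self.reads += 1
        return list(self._values)


def iterate_without_cache(source: CountingSource, passes: int) -> float:
    """Re-read the source on every pass, as a chain of disk-backed jobs would."""
    total = 0.0
    for _ in range(passes):
        total += sum(source.load())
    return total


def iterate_with_cache(source: CountingSource, passes: int) -> float:
    """Read the source once, keep it in memory, and reuse it on every pass."""
    cached = source.load()
    total = 0.0
    for _ in range(passes):
        total += sum(cached)
    return total


if __name__ == "__main__":
    disk_backed = CountingSource([1.0, 2.0, 3.0])
    in_memory = CountingSource([1.0, 2.0, 3.0])
    iterate_without_cache(disk_backed, passes=5)
    iterate_with_cache(in_memory, passes=5)
    print("reads without cache:", disk_backed.reads)
    print("reads with cache:", in_memory.reads)
```

The cached version only helps while the whole list fits in memory, which mirrors the trade-off described in the bullets above.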
Ease of Use:
-----------------
• Spark has comfortable APIs for Java, Scala and Python, and also includes Spark SQL (the successor to Shark) for
the SQL savvy.
• Hadoop MapReduce is written in Java and is infamous for being very difficult to program. Pig makes it easier,
though it requires some time to learn the syntax, and Hive adds SQL compatibility to the plate.
• MapReduce doesn’t have an interactive mode, although Hive includes a command line interface. Projects like
Impala, Presto and Tez aim to bring full interactive querying to Hadoop.
• Bottom line: Spark is easier to program and includes an interactive mode; Hadoop MapReduce is more difficult to
program, but many tools are available to make it easier.
Cost:
-----------------
• Both Spark and Hadoop MapReduce are open source, but money still needs to be spent on machines and staff.
• Hardware Requirements.
• The memory in the Spark cluster should be at least as large as the amount of data you need to process, because
the data has to fit into the memory for optimal performance. So, if you need to process really Big Data, Hadoop
will definitely be the cheaper option since hard disk space comes at a much lower rate than memory space.
• Furthermore, there is a wide array of Hadoop-as-a-service offerings and Hadoop-based services, which help to skip the
hardware and staffing requirements. In comparison, there are few Spark-as-a-service options and they are all very
new.
• Bottom line: Spark is more cost-effective according to the benchmarks, though staffing could be more costly;
Hadoop MapReduce could be cheaper because more personnel are available and because of Hadoop-as-a-service
offerings.
Data Processing:
----------------------
• Apache Spark can do more than plain data processing: it can process graphs and use the existing machine-learning
libraries.
• Spark can do real-time processing as well as batch processing.
• Hadoop MapReduce is great for batch processing. If you want a real-time option you’ll need to use another
platform like Storm or Impala, and for graph processing you can use Giraph. MapReduce used to have Apache
Mahout for machine learning, but the Mahout project has since moved away from MapReduce in favor of Spark and H2O.
• Bottom line: Spark is key for real-time data processing; Hadoop MapReduce is key for batch processing.
Failure Tolerance:
----------------------
• Spark has retries per task and speculative execution, just like MapReduce. Nonetheless, because MapReduce
writes intermediate results to hard drives, a job whose process crashes mid-execution can continue where it left off,
which can save time, whereas Spark has to recompute any lost in-memory data, in the worst case from the beginning.
• Bottom line: Spark and Hadoop MapReduce both have good failure tolerance, but Hadoop MapReduce is
slightly more tolerant.
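The per-task retry behaviour mentioned above can be sketched as follows. This is a plain-Python illustration of the idea, not the scheduler code of Spark or MapReduce; run_with_retries, make_flaky_task and the limit of four attempts are assumptions chosen for this example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_with_retries(task: Callable[[], T], max_attempts: int = 4) -> T:
    """Run task, retrying on failure up to max_attempts times in total."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(max_attempts):
        try:
            return task()
        except Exception as error:  # a real scheduler would be more selective
            last_error = error
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def make_flaky_task(failures_before_success: int) -> Callable[[], str]:
    """Build a hypothetical task that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def task() -> str:
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise IOError("simulated executor failure")
        return f"done after {state['calls']} calls"

    return task


if __name__ == "__main__":
    # Two simulated failures are absorbed by the retry loop.
    print(run_with_retries(make_flaky_task(2)))
```

A task that keeps failing past the attempt limit raises an error, at which point a framework would typically fail the whole job.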
Security:
------------------
• Spark is a bit bare at the moment when it comes to security.
• Spark can run on YARN and use HDFS, which means that it can also enjoy Kerberos authentication, HDFS file
permissions and encryption between nodes.
• Hadoop MapReduce can enjoy all the Hadoop security benefits and integrate with Hadoop security projects, like
Knox Gateway and Sentry.
• Bottom line: Spark security is still in its infancy; Hadoop MapReduce has more security features and projects.
Practical Demo on Performance and
Ease of Using APIs
Reference Links:
https://www.xplenty.com/blog/2014/11/apache-spark-vs-hadoop-mapreduce/

More Related Content

What's hot

Hadoop Presentation - PPT
Hadoop Presentation - PPTHadoop Presentation - PPT
Hadoop Presentation - PPTAnand Pandey
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadhMithlesh Sadh
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentialsqureshihamid
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data ArchitectureGuido Schmutz
 
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...CloudxLab
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkDatabricks
 
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...Simplilearn
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark Aakashdata
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation HadoopVarun Narang
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptxAlex Ivy
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL DatabasesDerek Stainer
 

What's hot (20)

Hadoop Presentation - PPT
Hadoop Presentation - PPTHadoop Presentation - PPT
Hadoop Presentation - PPT
 
Big data by Mithlesh sadh
Big data by Mithlesh sadhBig data by Mithlesh sadh
Big data by Mithlesh sadh
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Big Data
Big DataBig Data
Big Data
 
Snowflake essentials
Snowflake essentialsSnowflake essentials
Snowflake essentials
 
Big Data Architecture
Big Data ArchitectureBig Data Architecture
Big Data Architecture
 
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
Apache Spark - Dataframes & Spark SQL - Part 1 | Big Data Hadoop Spark Tutori...
 
Unit-3_BDA.ppt
Unit-3_BDA.pptUnit-3_BDA.ppt
Unit-3_BDA.ppt
 
Simplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache SparkSimplifying Big Data Analytics with Apache Spark
Simplifying Big Data Analytics with Apache Spark
 
Hadoop Tutorial For Beginners
Hadoop Tutorial For BeginnersHadoop Tutorial For Beginners
Hadoop Tutorial For Beginners
 
Big data and Hadoop
Big data and HadoopBig data and Hadoop
Big data and Hadoop
 
Big Data
Big DataBig Data
Big Data
 
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
Pig Tutorial | Apache Pig Tutorial | What Is Pig In Hadoop? | Apache Pig Arch...
 
Introduction to apache spark
Introduction to apache spark Introduction to apache spark
Introduction to apache spark
 
Seminar Presentation Hadoop
Seminar Presentation HadoopSeminar Presentation Hadoop
Seminar Presentation Hadoop
 
Big data
Big dataBig data
Big data
 
Introduction to HDFS
Introduction to HDFSIntroduction to HDFS
Introduction to HDFS
 
Databricks Platform.pptx
Databricks Platform.pptxDatabricks Platform.pptx
Databricks Platform.pptx
 
Introduction to NoSQL Databases
Introduction to NoSQL DatabasesIntroduction to NoSQL Databases
Introduction to NoSQL Databases
 
Presentation on Big Data
Presentation on Big DataPresentation on Big Data
Presentation on Big Data
 

Viewers also liked

Architecting DevOps Ready Application
Architecting DevOps Ready Application Architecting DevOps Ready Application
Architecting DevOps Ready Application Agile Testing Alliance
 
Making DevOps a reality for Legacy Enterprise Monolithic Products
Making DevOps a reality for Legacy Enterprise Monolithic ProductsMaking DevOps a reality for Legacy Enterprise Monolithic Products
Making DevOps a reality for Legacy Enterprise Monolithic ProductsAgile Testing Alliance
 
A systemic approach to shaping a DevOps culture
A systemic approach to shaping a DevOps cultureA systemic approach to shaping a DevOps culture
A systemic approach to shaping a DevOps cultureMasa Maeda
 
Distributed And Scaled (DiSc) Agile PMO
Distributed And Scaled (DiSc) Agile PMODistributed And Scaled (DiSc) Agile PMO
Distributed And Scaled (DiSc) Agile PMOAgile Testing Alliance
 
Addressing the challenges of delivering Microservice applications in the ente...
Addressing the challenges of delivering Microservice applications in the ente...Addressing the challenges of delivering Microservice applications in the ente...
Addressing the challenges of delivering Microservice applications in the ente...Agile Testing Alliance
 
Design Thinking Approach for Analytics
Design Thinking Approach for AnalyticsDesign Thinking Approach for Analytics
Design Thinking Approach for AnalyticsAgile Testing Alliance
 
Demonetization, IoT and related thoughts!
Demonetization, IoT and related thoughts!Demonetization, IoT and related thoughts!
Demonetization, IoT and related thoughts!Agile Testing Alliance
 
Prediction Of Muscle Power In Elderly Using Functional Screening Data
Prediction Of Muscle Power In Elderly Using Functional Screening DataPrediction Of Muscle Power In Elderly Using Functional Screening Data
Prediction Of Muscle Power In Elderly Using Functional Screening DataAgile Testing Alliance
 
Linuxkit and Moby - A Sneek Peek into The Future of Container Ecosystem
Linuxkit and Moby - A Sneek Peek into The Future of Container EcosystemLinuxkit and Moby - A Sneek Peek into The Future of Container Ecosystem
Linuxkit and Moby - A Sneek Peek into The Future of Container EcosystemAgile Testing Alliance
 
DevOps In Mobility World With Microsoft Technology
DevOps In Mobility World With Microsoft Technology DevOps In Mobility World With Microsoft Technology
DevOps In Mobility World With Microsoft Technology Agile Testing Alliance
 
Industrial Approach IOT: Practical Approach
Industrial Approach IOT: Practical Approach Industrial Approach IOT: Practical Approach
Industrial Approach IOT: Practical Approach Agile Testing Alliance
 
Key Success (And Failure) modes for your Large Scale DevOps Transformation
Key Success (And Failure) modes for your Large Scale DevOps TransformationKey Success (And Failure) modes for your Large Scale DevOps Transformation
Key Success (And Failure) modes for your Large Scale DevOps TransformationAgile Testing Alliance
 
Strengthening CX through Agile Ecosystems
Strengthening CX through Agile EcosystemsStrengthening CX through Agile Ecosystems
Strengthening CX through Agile EcosystemsAgile Testing Alliance
 

Viewers also liked (20)

Architecting DevOps Ready Application
Architecting DevOps Ready Application Architecting DevOps Ready Application
Architecting DevOps Ready Application
 
Making DevOps a reality for Legacy Enterprise Monolithic Products
Making DevOps a reality for Legacy Enterprise Monolithic ProductsMaking DevOps a reality for Legacy Enterprise Monolithic Products
Making DevOps a reality for Legacy Enterprise Monolithic Products
 
Salesforce: CI,CD & CT
Salesforce: CI,CD & CTSalesforce: CI,CD & CT
Salesforce: CI,CD & CT
 
Windows Automation with Ansible
Windows Automation with Ansible Windows Automation with Ansible
Windows Automation with Ansible
 
A systemic approach to shaping a DevOps culture
A systemic approach to shaping a DevOps cultureA systemic approach to shaping a DevOps culture
A systemic approach to shaping a DevOps culture
 
Distributed And Scaled (DiSc) Agile PMO
Distributed And Scaled (DiSc) Agile PMODistributed And Scaled (DiSc) Agile PMO
Distributed And Scaled (DiSc) Agile PMO
 
Addressing the challenges of delivering Microservice applications in the ente...
Addressing the challenges of delivering Microservice applications in the ente...Addressing the challenges of delivering Microservice applications in the ente...
Addressing the challenges of delivering Microservice applications in the ente...
 
Design Thinking Approach for Analytics
Design Thinking Approach for AnalyticsDesign Thinking Approach for Analytics
Design Thinking Approach for Analytics
 
Demonetization, IoT and related thoughts!
Demonetization, IoT and related thoughts!Demonetization, IoT and related thoughts!
Demonetization, IoT and related thoughts!
 
Monitoring With Prometheus
Monitoring With PrometheusMonitoring With Prometheus
Monitoring With Prometheus
 
DevOps++ Global Summit 2017
DevOps++ Global Summit 2017DevOps++ Global Summit 2017
DevOps++ Global Summit 2017
 
Prediction Of Muscle Power In Elderly Using Functional Screening Data
Prediction Of Muscle Power In Elderly Using Functional Screening DataPrediction Of Muscle Power In Elderly Using Functional Screening Data
Prediction Of Muscle Power In Elderly Using Functional Screening Data
 
Linuxkit and Moby - A Sneek Peek into The Future of Container Ecosystem
Linuxkit and Moby - A Sneek Peek into The Future of Container EcosystemLinuxkit and Moby - A Sneek Peek into The Future of Container Ecosystem
Linuxkit and Moby - A Sneek Peek into The Future of Container Ecosystem
 
DevOps In Mobility World With Microsoft Technology
DevOps In Mobility World With Microsoft Technology DevOps In Mobility World With Microsoft Technology
DevOps In Mobility World With Microsoft Technology
 
Industrial Approach IOT: Practical Approach
Industrial Approach IOT: Practical Approach Industrial Approach IOT: Practical Approach
Industrial Approach IOT: Practical Approach
 
Key Success (And Failure) modes for your Large Scale DevOps Transformation
Key Success (And Failure) modes for your Large Scale DevOps TransformationKey Success (And Failure) modes for your Large Scale DevOps Transformation
Key Success (And Failure) modes for your Large Scale DevOps Transformation
 
Strengthening CX through Agile Ecosystems
Strengthening CX through Agile EcosystemsStrengthening CX through Agile Ecosystems
Strengthening CX through Agile Ecosystems
 
About Agile Testing Alliance (ATA)
About Agile Testing Alliance (ATA)About Agile Testing Alliance (ATA)
About Agile Testing Alliance (ATA)
 
BDaas- BigData as a service
BDaas- BigData as a service  BDaas- BigData as a service
BDaas- BigData as a service
 
Robotic Process Automation
Robotic Process Automation Robotic Process Automation
Robotic Process Automation
 

Similar to Introduction To Big Data with Hadoop and Spark - For Batch and Real Time Processing

Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to sparkHome
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs sparkamarkayam
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxRahul Borate
 
Spark_Talha.pptx
Spark_Talha.pptxSpark_Talha.pptx
Spark_Talha.pptxITLAb21
 
Big_data_analytics_NoSql_Module-4_Session
Big_data_analytics_NoSql_Module-4_SessionBig_data_analytics_NoSql_Module-4_Session
Big_data_analytics_NoSql_Module-4_SessionRUHULAMINHAZARIKA
 
Hadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data FrameworkHadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data FrameworkAlaina Carter
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark FundamentalsZahra Eskandari
 
Apache Spark in Scientific Applications
Apache Spark in Scientific ApplicationsApache Spark in Scientific Applications
Apache Spark in Scientific ApplicationsDr. Mirko Kämpf
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsDr. Mirko Kämpf
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkSlim Baltagi
 

Similar to Introduction To Big Data with Hadoop and Spark - For Batch and Real Time Processing (20)

Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
 
Hadoop vs spark
Hadoop vs sparkHadoop vs spark
Hadoop vs spark
 
Why Spark over Hadoop?
Why Spark over Hadoop?Why Spark over Hadoop?
Why Spark over Hadoop?
 
finap ppt conference.pptx
finap ppt conference.pptxfinap ppt conference.pptx
finap ppt conference.pptx
 
spark_v1_2
spark_v1_2spark_v1_2
spark_v1_2
 
Unit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptxUnit II Real Time Data Processing tools.pptx
Unit II Real Time Data Processing tools.pptx
 
Apache Spark PDF
Apache Spark PDFApache Spark PDF
Apache Spark PDF
 
Spark_Talha.pptx
Spark_Talha.pptxSpark_Talha.pptx
Spark_Talha.pptx
 
Big_data_analytics_NoSql_Module-4_Session
Big_data_analytics_NoSql_Module-4_SessionBig_data_analytics_NoSql_Module-4_Session
Big_data_analytics_NoSql_Module-4_Session
 
Big data with java
Big data with javaBig data with java
Big data with java
 
Hadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data FrameworkHadoop Vs Spark — Choosing the Right Big Data Framework
Hadoop Vs Spark — Choosing the Right Big Data Framework
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark Fundamentals
 
SparkPaper
SparkPaperSparkPaper
SparkPaper
 
Apache Spark in Scientific Applications
Apache Spark in Scientific ApplicationsApache Spark in Scientific Applications
Apache Spark in Scientific Applications
 
Apache Spark in Scientific Applciations
Apache Spark in Scientific ApplciationsApache Spark in Scientific Applciations
Apache Spark in Scientific Applciations
 
APACHE SPARK.pptx
APACHE SPARK.pptxAPACHE SPARK.pptx
APACHE SPARK.pptx
 
View on big data technologies
View on big data technologiesView on big data technologies
View on big data technologies
 
Transitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to SparkTransitioning Compute Models: Hadoop MapReduce to Spark
Transitioning Compute Models: Hadoop MapReduce to Spark
 
Getting started big data
Getting started big dataGetting started big data
Getting started big data
 
Spark_Part 1
Spark_Part 1Spark_Part 1
Spark_Part 1
 

More from Agile Testing Alliance

#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...
#Interactive Session by Anindita Rath and Mahathee Dandibhotla, "From Good to...Agile Testing Alliance
 
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...#Interactive Session by  Ajay Balamurugadas, "Where Are The Real Testers In T...
#Interactive Session by Ajay Balamurugadas, "Where Are The Real Testers In T...Agile Testing Alliance
 
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...#Interactive Session by  Jishnu Nambiar and  Mayur Ovhal, "Monitoring Web Per...
#Interactive Session by Jishnu Nambiar and Mayur Ovhal, "Monitoring Web Per...Agile Testing Alliance
 
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...
#Interactive Session by Pradipta Biswas and Sucheta Saurabh Chitale, "Navigat...Agile Testing Alliance
 
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...
#Interactive Session by Apoorva Ram, "The Art of Storytelling for Testers" at...Agile Testing Alliance
 
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.
#Interactive Session by Nikhil Jain, "Catch All Mail With Graph" at #ATAGTR2023.Agile Testing Alliance
 
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...
#Interactive Session by Ashok Kumar S, "Test Data the key to robust test cove...Agile Testing Alliance
 
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...
#Interactive Session by Seema Kohli, "Test Leadership in the Era of Artificia...Agile Testing Alliance
 
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...
#Interactive Session by Ashwini Lalit, RRR of Test Automation Maintenance" at...Agile Testing Alliance
 
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...
#Interactive Session by Srithanga Aishvarya T, "Machine Learning Model to aut...Agile Testing Alliance
 
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...
#Interactive Session by Kirti Ranjan Satapathy and Nandini K, "Elements of Qu...Agile Testing Alliance
 
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...
#Interactive Session by Sudhir Upadhyay and Ashish Kumar, "Strengthening Test...Agile Testing Alliance
 
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...
#Interactive Session by Sayan Deb Kundu, "Testing Gen AI Applications" at #AT...Agile Testing Alliance
 
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...
#Interactive Session by Dinesh Boravke, "Zero Defects – Myth or Reality" at #...Agile Testing Alliance
 
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...#Interactive Session by  Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...
#Interactive Session by Saby Saurabh Bhardwaj, "Redefine Quality Assurance –...Agile Testing Alliance
 
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...
#Keynote Session by Sanjay Kumar, "Innovation Inspired Testing!!" at #ATAGTR2...Agile Testing Alliance
 
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.
#Keynote Session by Schalk Cronje, "Don’t Containerize me" at #ATAGTR2023.Agile Testing Alliance
 
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...
#Interactive Session by Chidambaram Vetrivel and Venkatesh Belde, "Revolution...Agile Testing Alliance
 
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...#Interactive Session by Aniket Diwakar Kadukar and  Padimiti Vaidik Eswar Dat...
#Interactive Session by Aniket Diwakar Kadukar and Padimiti Vaidik Eswar Dat...Agile Testing Alliance
 
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...
#Interactive Session by Vivek Patle and Jahnavi Umarji, "Empowering Functiona...Agile Testing Alliance
 


Introduction To Big Data with Hadoop and Spark - For Batch and Real Time Processing

  • 2. YOUR PRESENTER – SAMPAT KUMAR BUDANKAYALA
    • Sr. Big Data Analyst @ Harman Solutions
    • Over 4.5 years of Big Data experience, working on 15-20 projects.
    • Specialist in building data lake projects, data security, streaming solutions (real-time ingestion), linear regression and recommendation systems.
    • Email: sampatbigdata@gmail.com
    • Linkedin:
  • 3. AGENDA
    • Around the Globe (Spark and Hadoop)
    • Big Data, Big Data Stack, Apache Hadoop, Apache Spark
    • What is Hadoop and what is Spark?
    • Spark vs Hadoop and the combination effect
    • Q & A
  • 4. Around the Globe:
    NEWS:
    ----------
    • Is it Spark ‘vs’ or ‘and’ Hadoop?
    • Apache Spark is continuing beyond Apache Hadoop.
    SURVEYS:
    --------------
    • Big Data, the analysis of large quantities of data to gain new insight, has become a ubiquitous phrase in recent years. The volume of data grows at a staggering rate day by day, and Hadoop is one of the efficient technologies for dealing with it.
    • Hadoop uses the MapReduce programming model to process large-volume data jobs.
      http://www.ijetae.com/files/Volume4Issue5/IJETAE_0514_15.pdf
    • Hadoop's historic focus on batch processing of data was well supported by MapReduce, but there is an appetite for more flexible developer tools to support the larger market of 'mid-size' datasets and use cases that call for real-time processing.
      http://www.marketwired.com/press-release/survey-indicates-apache-spark-gaining-developer-adoption-as-big-datas-projects-require-1986162.htm
  • 6. Big Data, Big Data Stack, Apache Spark and Hadoop
    Big Data
    ----------
    • Big data is a term that describes large volumes of data: structured, semi-structured and unstructured.
    • But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
    • The concept gained momentum in the early 2000s when industry analysts articulated the now-mainstream definition of big data as the three Vs:
      Volume – Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. In the past, storing it would’ve been a problem, but new technologies (such as Hadoop) have eased the burden.
      Velocity – Data streams in at an unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time.
      Variety – Data comes in all types of formats, from structured, numeric data in traditional databases to unstructured text documents, email, video, audio, stock ticker data and financial transactions.
      https://www.zettaset.com/index.php/info-center/what-is-big-data/
  • 7. Big Data, Big Data Stack, Apache Spark and Hadoop
    Big Data Stack
    -------------------
  • 8. Big Data, Big Data Stack, Apache Spark and Hadoop
    Apache Hadoop
    ---------------------
    • Hadoop is a framework designed to work with data sets far larger than conventional systems can handle.
    • Hadoop distributes this data across a set of machines. Its real power comes from its ability to scale to hundreds or thousands of computers, each containing several processor cores.
    • Many big enterprises believe that within a few years more than half of the world’s data will be stored in Hadoop.
    • Hadoop mainly consists of:
      1. Hadoop Distributed File System (HDFS): a distributed file system that provides storage and fault tolerance.
      2. Hadoop MapReduce: a parallel programming model that processes vast quantities of data through distributed computing across the cluster.
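The MapReduce model named on this slide can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce steps on a single machine, not Hadoop's actual Java API; the function names (map_phase, shuffle, reduce_phase, word_count) are made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over a list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster the map and reduce calls would run on different machines and the shuffle would move data over the network; here everything runs in one process so the three steps are easy to follow.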
  • 9. Big Data, Big Data Stack, Apache Spark and Hadoop
    Apache Spark
    ---------------------
    • Apache Spark is a fast, in-memory data processing engine with elegant and expressive development APIs that allow data workers to efficiently execute streaming, machine learning or SQL workloads that require fast iterative access to datasets.
    • Apache Spark consists of Spark Core and a set of libraries. The core is the distributed execution engine, and the Java, Scala and Python APIs offer a platform for distributed ETL application development.
    • Spark is designed for data science, and its abstractions make data science easier. Data scientists commonly use machine learning, a set of techniques and algorithms that can learn from data.
    • Spark is advertised as running programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk.
  • 10. Spark vs Hadoop and the combination effect
    Performance
    -----------------
    • Apache Spark processes data in memory, while Hadoop MapReduce persists back to disk after a map or reduce action, so Spark should outperform Hadoop MapReduce.
    • Nonetheless, Spark needs a lot of memory. Much like standard DBs, it loads a process into memory and keeps it there until further notice, for the sake of caching.
    • If Spark runs on Hadoop YARN with other resource-demanding services, or if the data is too big to fit entirely into memory, then there could be major performance degradations for Spark.
    • MapReduce, however, kills its processes as soon as a job is done, so it can easily run alongside other services with minor performance differences.
    • Bottom line: Spark performs better when all the data fits in memory, especially on dedicated clusters; Hadoop MapReduce is designed for data that doesn’t fit in memory, and it can run well alongside other services.
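The difference between re-reading input on every pass and keeping it in memory can be illustrated with a small plain-Python sketch. It does not use Spark or Hadoop; CountingSource is a hypothetical stand-in for a dataset on disk that simply counts how often it is read in full.

```python
class CountingSource:
    """Stand-in for a dataset on disk; counts how often it is fully read."""

    def __init__(self, records):
        self._records = list(records)
        self.reads = 0

    def read(self):
        self.reads += 1
        return list(self._records)


def iterate_from_disk(source, iterations):
    """Re-read the input on every pass, as a chain of disk-backed jobs would."""
    total = 0
    for _ in range(iterations):
        total += sum(source.read())
    return total


def iterate_cached(source, iterations):
    """Read once, keep the data in memory, and reuse it on every pass."""
    cached = source.read()
    total = 0
    for _ in range(iterations):
        total += sum(cached)
    return total
```

Both functions return the same result, but the cached variant touches the source once instead of once per iteration. That saving only holds while the cached data actually fits in memory, which is the caveat the slide raises.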
  • 11. Spark vs Hadoop and the combination effect
    Ease of Use
    -----------------
    • Spark has comfortable APIs for Java, Scala and Python, and also includes Spark SQL (the successor to Shark) for the SQL savvy.
    • Hadoop MapReduce is written in Java and is infamous for being very difficult to program. Pig makes it easier, though it takes some time to learn the syntax, and Hive adds SQL compatibility.
    • MapReduce doesn’t have an interactive mode, although Hive includes a command line interface. Projects like Impala, Presto and Tez aim to bring full interactive querying to Hadoop.
    • Bottom line: Spark is easier to program and includes an interactive mode; Hadoop MapReduce is more difficult to program, but many tools are available to make it easier.
  • 12. Spark vs Hadoop and the combination effect
    Cost
    -----------------
    • Both Spark and Hadoop MapReduce are open source, but money still needs to be spent on machines and staff.
    • Hardware requirements: the memory in the Spark cluster should be at least as large as the amount of data you need to process, because the data has to fit into memory for optimal performance. So, if you need to process really big data, Hadoop will definitely be the cheaper option, since hard disk space comes at a much lower rate than memory space.
    • Furthermore, there is a wide array of Hadoop-as-a-service offerings and Hadoop-based services, which help to skip the hardware and staffing requirements. In comparison, there are few Spark-as-a-service options and they are all very new.
    • Bottom line: Spark is more cost-effective according to the benchmarks, though staffing could be more costly; Hadoop MapReduce could be cheaper because more personnel are available and because of Hadoop-as-a-service offerings.
  • 13. Spark vs Hadoop and the combination effect
    Data Processing
    ----------------------
    • Apache Spark can do more than plain data processing: it can process graphs and use the existing machine-learning libraries.
    • Spark can do real-time processing as well as batch processing.
    • Hadoop MapReduce is great for batch processing. If you want a real-time option you’ll need to use another platform like Storm or Impala, and for graph processing you can use Giraph. MapReduce used to have Apache Mahout for machine learning, but the elephant riders have ditched it in favor of Spark and H2O.
    • Bottom line: Spark is key for real-time data processing; Hadoop MapReduce is key for batch processing.
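One way a single engine can cover both batch and near-real-time work is to cut an incoming stream into small batches and run the same batch logic on each; Spark Streaming takes a micro-batch approach of this kind. The plain-Python sketch below illustrates the idea only. It sizes batches by record count rather than by time interval, which is a simplification, and all function names are made up for this example.

```python
from itertools import islice


def micro_batches(stream, batch_size):
    """Yield successive lists of up to batch_size records from an iterable."""
    iterator = iter(stream)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def count_events(batch):
    """The batch job: count occurrences of each event type in one batch."""
    counts = {}
    for event in batch:
        counts[event] = counts.get(event, 0) + 1
    return counts


def process_stream(stream, batch_size):
    """Apply the same batch function to every micro-batch, keeping running totals."""
    totals = {}
    for batch in micro_batches(stream, batch_size):
        for event, n in count_events(batch).items():
            totals[event] = totals.get(event, 0) + n
        yield dict(totals)
```

Calling count_events on a whole dataset is the batch case; process_stream reuses it unchanged on each slice of the stream and yields updated totals after every micro-batch.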
  • 14. Spark vs Hadoop and the combination effect
    Failure Tolerance
    ----------------------
    • Spark has retries per task and speculative execution, just like MapReduce. Nonetheless, because MapReduce persists intermediate results to hard drives, a process that crashes in the middle of execution can continue where it left off, which can save time, whereas Spark has to start processing from the beginning.
    • Bottom line: Spark and Hadoop MapReduce both have good failure tolerance, but Hadoop MapReduce is slightly more tolerant.
    Security
    ------------------
    • Spark is a bit bare at the moment when it comes to security.
    • Spark can run on YARN and use HDFS, which means that it can also enjoy Kerberos authentication, HDFS file permissions and encryption between nodes.
    • Hadoop MapReduce can enjoy all the Hadoop security benefits and integrate with Hadoop security projects, like Knox Gateway and Sentry.
    • Bottom line: Spark security is still in its infancy; Hadoop MapReduce has more security features and projects.
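The per-task retries mentioned under Failure Tolerance can be sketched as follows. This is a plain-Python illustration of the idea, not the scheduler of either framework; run_with_retries, run_job and the max_attempts parameter are hypothetical names chosen for this example.

```python
def run_with_retries(task, max_attempts):
    """Run task(); on failure, retry that task alone up to max_attempts times.

    Returns a (result, attempts_used) tuple.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception as error:  # any failure triggers a retry of this task only
            last_error = error
    raise RuntimeError(f"task failed after {max_attempts} attempts") from last_error


def run_job(tasks, max_attempts=3):
    """Run independent tasks in order; a failing task is retried, not the whole job."""
    return [run_with_retries(task, max_attempts)[0] for task in tasks]
```

The point of retrying at task granularity is that one flaky task does not force the work already finished by other tasks to be redone.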
  • 15. Practical Demo on Performance and Ease of Using APIs