Marcin Szymaniuk

Apache Spark
Data intensive processing in practice
www.tantusdata.com
About me
• Data Engineer @TantusData
• Have worked for: Spotify, Apple, telcos, startups
• Cluster installations, application architecture and development, training, data team support
marcin@tantusdata.com
marcin.szymaniuk@gmail.com
@mszymani
Agenda
• What is Spark?
• Use case overview
• Architecture
• Big picture
What is Spark?
• Engine for distributed data processing
• Java, Scala, R, Python, SQL API
• Streaming, Machine Learning
Mobile app company
[Diagram: a mobile APP producing EVENTS]
Spark API - DataFrame
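import org.apache.spark.sql.functions.{avg, max}

// Per-user maximum and average event value, joined with the user's name.
eventsDf
  .groupBy("userId")
  .agg(
    max("value").alias("maxVal"),
    avg("value").alias("avgValue")
  )
  .join(usersDf, usersDf("id") === eventsDf("userId"))
  .select("userId", "maxVal", "avgValue", "name")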
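To make the semantics of the DataFrame query concrete, here is a minimal sketch of the same group, aggregate, and join steps on plain Scala collections. It is only a model of what the query computes, not how Spark executes it. The case classes, the sample rows, and the name DataFrameQueryModel are assumptions made for illustration.

```scala
// Plain-Scala model of the DataFrame query; all names and sample rows are hypothetical.
object DataFrameQueryModel {
  final case class Event(userId: Int, value: Double)
  final case class User(id: Int, name: String)
  final case class Row(userId: Int, maxVal: Double, avgValue: Double, name: String)

  def query(events: Seq[Event], users: Seq[User]): Seq[Row] = {
    val nameById = users.map(u => u.id -> u.name).toMap
    events
      .groupBy(_.userId) // groupBy("userId")
      .toSeq
      .flatMap { case (userId, es) =>
        val values = es.map(_.value)
        // Inner join: a group without a matching user produces no row.
        nameById.get(userId)
          .map(name => Row(userId, values.max, values.sum / values.size, name))
          .toList
      }
      .sortBy(_.userId) // sorted only to make the printed output stable
  }

  def main(args: Array[String]): Unit = {
    val events = Seq(Event(1, 10.0), Event(1, 20.0), Event(2, 5.0))
    val users = Seq(User(1, "Ann"), User(2, "Bob"))
    query(events, users).foreach(println)
  }
}
```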
General use cases
• ETL (diagram: data loaded into HDFS)
• KPIs: acquisition, activation, retention, referral, revenue
• A/B tests
• Anonymization
• Fraud
• Churn
• ML
• …
Network improvement
• Score historical customer network quality
• Define a model predicting churn based on the historical score
• Simulate a base station upgrade and calculate the expected score after the upgrade
• Use the simulated score with the churn prediction model
Bring analysis to data
[Diagram: a sample of the DATA is moved to R / Python / SAS]
• Sample only (region, latest month…)
• Coarse aggregates, e.g. month vs. hour (1:720)
Bring analysis to data
DATA
• Analyze all data
• Faster analysis
• No extra data copies (GDPR!)
• Many solutions are already implemented (MLlib, GraphX…)
Geospatial data
• General map service
• Self driving cars
[Diagram labels: Map V1, Car + AI, Map V2, Map OS, Editors, Vendors]
[Diagram: the processing is spread across three JVMs]
Spark use cases - recap
• Massive dataset processing - distribute it!
• Computation-intensive processing - distribute it!
• SQL-like interface - analyst friendly
• Functional programming for complex logic
Deep dive
• Spark execution model
• Partitioning
• Caching
RDD / DataFrame
[Diagram: an RDD split into Partition 1, Partition 2, Partition 3]
Narrow transformation
f: x → toUpperCase(x)
rdd.map(f)
RDD (Partitions 1-3): foo | Bar baz | blah blah | LOREm ipsum dolor | sit amet
New RDD (Partitions 1-3): FOO | BAR BAZ | BLAH BLAH | LOREM IPSUM DOLOR | SIT AMET
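The narrow transformation can be sketched without Spark by modeling an RDD as a vector of partitions. This is a simplified model, not Spark's API. The type alias, the object name, and the assignment of the sample strings to partitions are assumptions.

```scala
// Simplified model of a narrow transformation; not Spark's API.
object NarrowTransformationModel {
  // An "RDD" is modeled as a sequence of partitions.
  type Partitions[A] = Vector[Vector[A]]

  // map is narrow: each output partition depends on exactly one input partition,
  // so every partition can be processed independently and no data moves between them.
  def mapPartitions[A, B](rdd: Partitions[A])(f: A => B): Partitions[B] =
    rdd.map(partition => partition.map(f))

  def main(args: Array[String]): Unit = {
    // Hypothetical placement of the slide's strings into three partitions.
    val rdd: Partitions[String] = Vector(
      Vector("foo"),
      Vector("Bar baz", "blah blah"),
      Vector("LOREm ipsum dolor", "sit amet")
    )
    mapPartitions(rdd)(_.toUpperCase).foreach(println)
  }
}
```

The number of partitions and the placement of each record are unchanged by the transformation, which is why several narrow transformations in a row can run inside a single task.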
Narrow transformation
RDD           NEW RDD           NEW RDD2
Partition 1 → New Partition 1 → Partition 1
Partition 2 → New Partition 2 → Partition 2
Partition 3 → New Partition 3 → Partition 3
Partitions, tasks
RDD           NEW RDD       NEW RDD2
Partition 1 → Partition 1 → Partition 1   (TASK 1)
Partition 2 → Partition 2 → Partition 2   (TASK 2)
Partition 3 → Partition 3 → Partition 3
Wide transformations
groupByKey
RDD (Partitions 1-3, keys mixed across partitions):
1 | CLICK   2 | VIEW   1 | SEND   1 | CLICK   3 | SEND   2 | SEND   2 | VIEW   3 | CALL   3 | SEND   1 | CLICK
New RDD (Partitions 1-3, all records of a key end up in the same partition):
1 | CALL   1 | CLICK   1 | SEND   1 | CLICK   1 | CLICK   1 | CALL
2 | VIEW   2 | SEND   2 | VIEW
3 | SEND   3 | CALL   3 | SEND
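The data movement behind groupByKey can be modeled on plain collections as follows. This is a rough sketch, not Spark's implementation. The partitioner used here (key 1 goes to the first partition, key 2 to the second, and so on) is an assumption chosen to mirror the slide.

```scala
// Simplified model of a wide transformation (shuffle); not Spark's implementation.
object WideTransformationModel {
  type Record = (Int, String)

  // Hypothetical partitioner: key 1 -> partition 0, key 2 -> partition 1, ...
  def targetPartition(key: Int, numPartitions: Int): Int =
    java.lang.Math.floorMod(key - 1, numPartitions)

  // groupByKey is wide: every output partition may need records from every input partition.
  def groupByKey(input: Vector[Vector[Record]], numPartitions: Int): Vector[Map[Int, Vector[String]]] = {
    val all = input.flatten // the "shuffle": records leave their source partition
    Vector.tabulate(numPartitions) { p =>
      all
        .filter { case (key, _) => targetPartition(key, numPartitions) == p }
        .groupBy(_._1)
        .map { case (key, records) => key -> records.map(_._2) }
    }
  }

  def main(args: Array[String]): Unit = {
    val rdd = Vector(
      Vector(1 -> "CLICK", 2 -> "VIEW", 1 -> "SEND"),
      Vector(1 -> "CLICK", 3 -> "SEND", 2 -> "SEND"),
      Vector(2 -> "VIEW", 3 -> "CALL", 3 -> "SEND", 1 -> "CLICK")
    )
    groupByKey(rdd, 3).foreach(println)
  }
}
```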
Spark application
[Diagram: RDD transformations grouped into STAGE 1, STAGE 2 … STAGE N, with a SHUFFLE between stages]
Simplest scenario ever
[Diagram: each TASK reads from HDFS, adds the columns (ADDCOL), and writes back to HDFS]
import org.apache.spark.sql.functions.{col, dayofmonth, month, year}

val df = spark.read.parquet("…")
df
  .withColumn("year", year(col("timestamp")))
  .withColumn("month", month(col("timestamp")))
  .withColumn("day", dayofmonth(col("timestamp")))
  .write.save(output)
Simplest scenario ever
• 1TB of raw event data
• 1000 files x 1GB
• 1GB file = 8 blocks (128MB per block)
• 8000 blocks = 8000 tasks
[Diagram: 8000 independent tasks, each HDFS → TASK (ADDCOL, ADDCOL) → HDFS]
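The task count on this slide follows from simple arithmetic, sketched below. The sketch assumes one task per HDFS block, which is the slide's simplification; in a real job the number of input partitions also depends on the file format and on Spark's input split settings. The object and function names are hypothetical.

```scala
// Back-of-the-envelope estimate of input tasks, assuming one task per HDFS block.
object TaskCountEstimate {
  // Number of blocks needed for one file (ceiling division).
  def blocksPerFile(fileSizeMb: Long, blockSizeMb: Long): Long =
    (fileSizeMb + blockSizeMb - 1) / blockSizeMb

  def inputTasks(numFiles: Long, fileSizeMb: Long, blockSizeMb: Long): Long =
    numFiles * blocksPerFile(fileSizeMb, blockSizeMb)

  def main(args: Array[String]): Unit = {
    // 1000 files of 1GB (1024MB) with 128MB blocks: 8 blocks per file.
    println(inputTasks(numFiles = 1000L, fileSizeMb = 1024L, blockSizeMb = 128L)) // prints 8000
  }
}
```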
Executors
--executor-cores 3 --executor-memory 10g
[Diagram: the DRIVER tracks PENDING TASKS and COMPLETE TASKS and schedules tasks on the EXECUTORs]
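How long the pending-task queue takes to drain depends on how many tasks can run at once. A minimal sketch of that arithmetic follows, assuming each task occupies one core (Spark's default) and all tasks take about the same time. The executor counts used below are hypothetical, since the slides do not state how many executors were requested.

```scala
// Rough estimate of task "waves"; executor counts below are hypothetical.
object ExecutorWaves {
  // Task slots available at the same time, assuming one core per task.
  def concurrentTasks(numExecutors: Int, coresPerExecutor: Int): Int =
    numExecutors * coresPerExecutor

  // Number of rounds needed to run all tasks (ceiling division).
  def waves(totalTasks: Int, numExecutors: Int, coresPerExecutor: Int): Int = {
    val slots = concurrentTasks(numExecutors, coresPerExecutor)
    (totalTasks + slots - 1) / slots
  }

  def main(args: Array[String]): Unit = {
    // 8000 tasks with --executor-cores 3
    println(waves(totalTasks = 8000, numExecutors = 3, coresPerExecutor = 3))   // prints 889
    println(waves(totalTasks = 8000, numExecutors = 100, coresPerExecutor = 3)) // prints 27
  }
}
```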
Join
• 10TB of events, uniform distribution
• 1GB of users
[Diagram: map-side TASKs read from HDFS and write shuffle buckets (Bucket 1, Bucket 2, …) to LOCAL disk; reduce-side TASKs each fetch their bucket from every map task and write the result to HDFS]
.config("spark.sql.shuffle.partitions", "200")
10TB/200 = 50GB/TASK
Problems with join
• Spill to disk
• Timeouts
• GC overhead limit exceeded
• OOM
• ExecutorLostFailure
What to do?
• Understand your data!
• Control the level of parallelism
.config("spark.sql.shuffle.partitions", "2000")
rdd.join(anotherRDD, 2000)
.repartition(2000)
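The effect of the partition count on the shuffle volume per task can be estimated as below. This is a sketch under the slides' assumption of a uniform key distribution, using decimal units (10TB = 10,000,000MB). The 200MB target in the last call is a hypothetical figure for illustration, not a recommendation from the talk.

```scala
// Shuffle sizing arithmetic, assuming uniformly distributed keys and decimal units.
object ShufflePartitionSizing {
  // Data volume handled by each reduce task.
  def mbPerTask(totalMb: Long, partitions: Long): Long =
    totalMb / partitions

  // Partitions needed to stay at or below a target volume per task (ceiling division).
  def partitionsFor(totalMb: Long, targetMbPerTask: Long): Long =
    (totalMb + targetMbPerTask - 1) / targetMbPerTask

  def main(args: Array[String]): Unit = {
    val tenTbInMb = 10000000L
    println(mbPerTask(tenTbInMb, 200L))     // prints 50000 (50GB per task)
    println(mbPerTask(tenTbInMb, 2000L))    // prints 5000 (5GB per task)
    println(partitionsFor(tenTbInMb, 200L)) // prints 50000 (hypothetical 200MB target)
  }
}
```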
Skewed join
• 10TB of events
• One user with 1TB of events, others are uniformly distributed
[Diagram: the same shuffle as in the uniform join, but one reduce-side TASK receives 1TB]
Skewed join
• Bad data?
• Wrong logic?
• Just ok?
Skewed Join
Before salting, the users table has one row per userId (1, 2, 3, …) and almost all events belong to userId 1.

Events (a salt value in 1..3 is added to each row):
eventId  userId  …  salt
af8      1          1
bf9      1          2
ff1      1          1
881      1          3
91f      2          2
cc6      1          3
b22      1          3
ee4      1          1

Users (each row is replicated once per salt value):
userId  …  salt
1          1
1          2
1          3
2          1
2          2
2          3
3          1
3          2
3          3
…          …
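The salting idea from the tables above can be sketched on plain Scala collections. This is a model of the technique, not Spark code: events get a salt, users are replicated once per salt value, and the join key becomes (userId, salt). Deriving the salt from the eventId hash is an assumption made so the example is reproducible; a random salt is the more common choice. All names are hypothetical.

```scala
// Plain-Scala model of a salted join; names and the salt function are hypothetical.
object SaltedJoinModel {
  final case class Event(eventId: String, userId: Int)
  final case class User(userId: Int)

  // Stand-in for a random salt in 1..numSalts, derived from the eventId hash.
  def saltOf(e: Event, numSalts: Int): Int =
    java.lang.Math.floorMod(e.eventId.hashCode, numSalts) + 1

  def saltedJoin(events: Seq[Event], users: Seq[User], numSalts: Int): Seq[(Event, User)] = {
    // Replicate each user once per salt value.
    val replicatedUsers: Map[(Int, Int), User] =
      users.flatMap(u => (1 to numSalts).map(s => (u.userId, s) -> u)).toMap
    // Join on the composite key (userId, salt).
    events.flatMap { e =>
      replicatedUsers.get((e.userId, saltOf(e, numSalts))).map(u => (e, u)).toList
    }
  }

  // Size of the largest group of events sharing a join key.
  def largestGroup[K](events: Seq[Event])(key: Event => K): Int =
    if (events.isEmpty) 0 else events.groupBy(key).values.map(_.size).max

  def main(args: Array[String]): Unit = {
    val events = Seq("af8" -> 1, "bf9" -> 1, "ff1" -> 1, "881" -> 1,
                     "91f" -> 2, "cc6" -> 1, "b22" -> 1, "ee4" -> 1)
      .map { case (id, userId) => Event(id, userId) }
    val users = Seq(User(1), User(2), User(3))
    println(largestGroup(events)(_.userId))                   // prints 7
    println(largestGroup(events)(e => (e.userId, saltOf(e, 3))))
    println(saltedJoin(events, users, 3).size)                // prints 8
  }
}
```

The join result is the same as without salt, but the rows of the heavy key are spread over several join keys. The price is that the smaller table grows by a factor equal to the number of salt values.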
Skewed Join - recap
• Know your data!
• Fix the data?
• Improve the logic?
• Add salt?
Cache
val rdd1 = calculate1()
rdd1.
  …
  saveAsTextFile(…)
rdd1.
  …
  saveAsTextFile(…)
Without caching, calculate1() is executed twice!
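Why the computation runs twice can be shown with a tiny model of a lazily evaluated dataset. This is not Spark's API; the LazyDataset class and its collect and cache methods are hypothetical stand-ins that only mimic the behavior: every action re-runs the computation unless the result has been cached.

```scala
// Minimal model of lazy evaluation and caching; LazyDataset is hypothetical, not Spark's API.
object CacheModel {
  final class LazyDataset[A](compute: () => Vector[A]) {
    // An "action": re-runs the whole computation on every call.
    def collect(): Vector[A] = compute()

    // Returns a dataset that computes once, on the first action, and then reuses the result.
    def cache(): LazyDataset[A] = {
      lazy val data = compute()
      new LazyDataset(() => data)
    }
  }

  def main(args: Array[String]): Unit = {
    var runs = 0
    val rdd1 = new LazyDataset(() => { runs += 1; Vector(1, 2, 3) })

    rdd1.collect()
    rdd1.collect()
    println(runs) // prints 2: the computation ran once per action

    val cached = rdd1.cache()
    cached.collect()
    cached.collect()
    println(runs) // prints 3: only one more run, shared by both actions
  }
}
```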
Cache + PageRank
[Diagram: iterative PageRank computation over the RANKS and LINKS datasets]
Cache
• Transformations are lazy!
• Re-using RDD/DF means re-calculation!
• Branch in execution plan is a candidate for caching
• You cannot control priority - it's LRU
• Know the size of your RDDs/DF - check Spark UI.
Other gotchas
• Broadcasting
• Sizing executors
• Locality
• Off heap memory
• …
Challenges ahead
• File formats
• Data evolution
• Jobs orchestration
• Monitoring
• Anomaly detection
• ML models
• …
[Diagram: OPS, DEVELOPERS, and BUSINESS ANALYTICS sharing common tools and knowledge]
Conclusions
• Spark can help you with data processing at scale
• You should know how it works
• Think about the big picture from day one
Q&A
• marcin@tantusdata.com
• marcin.szymaniuk@gmail.com
• @mszymani
Thank you!
