twitter: @rabbitonweb,
email: paul.szulc@gmail.com
Apache Spark 101
Large Scale Data Processing
by Paweł Szulc (@rabbitonweb) [@ApacheSpark]
email: paul.szulc@gmail.com
blog: http://www.rabbitonweb.com
in 50 minutes
Why?
buzzword: Big Data
Big Data is like...
“Big Data is like teenage sex: everyone
talks about it, nobody really knows how
to do it, everyone thinks everyone else
is doing it, so everyone claims they are
doing it”
Big Data is all about...
“Every two days, we generate as much
information as we did from the dawn of
civilization until 2003”
-- Eric Schmidt
Former CEO Google
source: http://papyrus.greenville.edu/2014/03/selfiesteem/
Big Data is all about...
● well, the data :)
● It is said that 2.5 exabytes (2.5×10^18 bytes) of data are created around the world every single day
● At that volume you can no longer use standard tools and methods of evaluation
Challenges of Big Data
● The gathering
● Processing and discovery
● Present it to business
● Hardware and network failures
What was before?
To the rescue
MAP REDUCE
“'MapReduce' is a framework for processing
parallelizable problems across huge datasets
using a cluster, taking into consideration
scalability and fault-tolerance”
MapReduce - phases
MapReduce is composed of sequences of two phases:
1. Map
2. Reduce
Map Reduce - key/value
“In MapReduce, no value stands on its own.
Every value has a key associated with it. Keys
identify related values.
The mapping and reducing functions receive
not just values, but (key, value) pairs. The
output of each of these functions is the same:
both a key and a value.”
Word Count
● The “Hello World” of the Big Data world.
● Given an input of multiple lines, extract all words together with their number of occurrences
Input:
To be or not to be
Let it be
Be me
It must be
Let it be
Output:
be 6
it 3
to 2
let 2
or 1
not 1
must 1
me 1
Input → Splitting → Mapping → Shuffling → Reducing → Final result
Input:
To be or not to be
Let it be
Be me
It must be
Let it be
Splitting:
split 1: To be or not to be / Let it be
split 2: It must be / Let it be
split 3: Be me
Mapping:
split 1: to 1, be 1, or 1, not 1, to 1, be 1, let 1, it 1, be 1
split 2: it 1, must 1, be 1, let 1, it 1, be 1
split 3: be 1, me 1
Shuffling:
be: 1, 1, 1, 1, 1, 1
to: 1, 1
or: 1
not: 1
let: 1, 1
it: 1, 1, 1
must: 1
me: 1
Reducing:
be 6, to 2, or 1, not 1, let 2, it 3, must 1, me 1
Final result:
be 6
it 3
to 2
let 2
or 1
not 1
must 1
me 1
Word count - pseudo-code
function map(String name, String document):
    for each word w in document:
        emit (w, 1)

function reduce(String word, Iterator partialCounts):
    sum = 0
    for each pc in partialCounts:
        sum += ParseInt(pc)
    emit (word, sum)
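The pseudo-code can be tried out without a cluster. Below is a minimal sketch in plain Scala collections, not Hadoop or Spark code: the object name MapReduceSketch and the three phase functions are made up for illustration, and the shuffle is simply an in-memory groupBy.

```scala
object MapReduceSketch {
  // Map phase: turn one input line into (word, 1) pairs.
  def mapPhase(line: String): Seq[(String, Int)] =
    line.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq.map(word => (word, 1))

  // Shuffle phase: bring all values that share a key together.
  // (Illustrative only: a real cluster moves data between machines here.)
  def shufflePhase(pairs: Seq[(String, Int)]): Map[String, Seq[Int]] =
    pairs.groupBy(_._1).map { case (word, group) => (word, group.map(_._2)) }

  // Reduce phase: collapse the values of one key into a single count.
  def reducePhase(word: String, partialCounts: Seq[Int]): (String, Int) =
    (word, partialCounts.sum)

  def wordCount(lines: Seq[String]): Map[String, Int] =
    shufflePhase(lines.flatMap(mapPhase)).map { case (w, counts) => reducePhase(w, counts) }

  def main(args: Array[String]): Unit = {
    val input = Seq("To be or not to be", "Let it be", "Be me", "It must be", "Let it be")
    wordCount(input).toSeq
      .sortBy { case (w, c) => (-c, w) }
      .foreach { case (w, c) => println(s"$w $c") }
  }
}
```

Every function takes and returns (key, value) pairs, which is the shape the key/value slide describes.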
Why Apache Spark?
We have had an open-source MapReduce implementation (Hadoop) running successfully for the last 10 years. Why bother?
Problems with Map Reduce
1. MapReduce provides a difficult programming
model for developers
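Word count: Hadoop implementation
import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token of every input line.
  public static class Map extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      String line = value.toString();
      StringTokenizer tokenizer = new StringTokenizer(line);
      while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the partial counts collected for one word.
  public static class Reduce extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) { sum += val.get(); }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf, "wordcount");
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    job.setMapperClass(Map.class);
    job.setReducerClass(Reduce.class);
    job.setInputFormatClass(TextInputFormat.class);
    // The slide is cut off here. The lines below are an assumption,
    // following the usual Hadoop WordCount example.
    job.setOutputFormatClass(TextOutputFormat.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    job.waitForCompletion(true);
  }
}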
Hadoop addressing the issue
● Hive - SQL on Hadoop Cluster,
● Declarative language
● But…
Declarative?
select count(distinct user_id) from logs;
select count(*) from (select distinct user_id from logs) t;
Problems with Map Reduce
1. MapReduce provides a difficult programming
model for developers
2. It suffers from a number of performance
issues
Performance issues
● Map-Reduce pair combination
● Output saved to the file
● Iterative algorithms go through IO path again
and again
● Poor API (key, value), even basic join
requires expensive code
Problems with Map Reduce
1. MapReduce provides a difficult programming
model for developers
2. It suffers from a number of performance
issues
3. While batch-mode analysis is still important, reacting to events as they arrive has become more important (MapReduce lacks support for “almost” real-time processing)
Spark to the rescue
1. Intuitive programming model
Word count once again
Scala solution:
val wc = scala.io.Source.fromFile(args(0)).getLines
  .map(line => line.toLowerCase)
  .flatMap(line => line.split(" ")).toSeq
  .groupBy(word => word)
  .map { case (word, group) => (word, group.size) }
Spark solution (in Scala):
val wc = new SparkContext("local", "Word Count").textFile(args(0))
  .map(line => line.toLowerCase)
  .flatMap(line => line.split(" "))
  .groupBy(word => word)
  .map { case (word, group) => (word, group.size) }
Spark to the rescue
1. Intuitive programming model
2. Performance boost
Spark performance - not tied to map-reduce cycle
map → groupBy → map → reduceByKey
[Diagram: the operations run as tasks over the partitions; the tasks are grouped into stage1 and stage2]
Wait for calculations on all partitions before moving on
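The idea of stages can be modelled in a few lines. This is a toy sketch in plain Scala, not Spark internals: StageSketch, stage1, shuffle and stage2 are hypothetical names, and a "partition" is just a Seq. It assumes that per-partition operations run independently, and that regrouping by key is the point where the output of every partition is needed.

```scala
object StageSketch {
  type Partition[A] = Seq[A]

  // Stage 1: runs independently on each partition (one "task" per partition).
  // The chained maps are applied element by element through an iterator,
  // so no intermediate data set is built between them.
  def stage1(partition: Partition[String]): Seq[(String, Int)] =
    partition.iterator
      .map(_.toLowerCase)
      .map(word => (word, 1))
      .toList

  // Stage boundary: regrouping by key needs the output of every stage-1 task.
  def shuffle(outputs: Seq[Seq[(String, Int)]], numPartitions: Int): Seq[Partition[(String, Int)]] = {
    val byTarget = outputs.flatten.groupBy { case (key, _) =>
      Math.floorMod(key.hashCode, numPartitions)
    }
    (0 until numPartitions).map(i => byTarget.getOrElse(i, Seq.empty))
  }

  // Stage 2: again one task per partition, now summing the values per key.
  def stage2(partition: Partition[(String, Int)]): Map[String, Int] =
    partition.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }

  def run(partitions: Seq[Partition[String]]): Map[String, Int] = {
    val stage1Out = partitions.map(stage1)               // all tasks of stage 1
    val shuffled  = shuffle(stage1Out, partitions.size)  // wait for all partitions
    shuffled.map(stage2).foldLeft(Map.empty[String, Int])(_ ++ _)
  }
}
```

After the shuffle each key lives in exactly one partition, which is why the per-partition results of stage 2 can simply be merged.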
Spark performance - shuffle optimization
map groupBy join
Optimization: shuffle avoided if
data is already partitioned
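Why existing partitioning lets a join skip the shuffle can be shown with a small model. This is a sketch in plain Scala under stated assumptions, not the Spark API: CoPartitionSketch, partitionOf, partitionBy and joinCoPartitioned are invented names, and hashing the key modulo the partition count stands in for a hash partitioner.

```scala
object CoPartitionSketch {
  // Hypothetical stand-in for a hash partitioner: key -> partition index.
  def partitionOf(key: String, numPartitions: Int): Int =
    Math.floorMod(key.hashCode, numPartitions)

  // The shuffle: move every record to the partition its key hashes to.
  def partitionBy[V](records: Seq[(String, V)], numPartitions: Int): Vector[Seq[(String, V)]] = {
    val grouped = records.groupBy { case (k, _) => partitionOf(k, numPartitions) }
    Vector.tabulate(numPartitions)(i => grouped.getOrElse(i, Seq.empty))
  }

  // Join two data sets that are already partitioned the same way:
  // partition i of the left side only ever meets partition i of the right side,
  // so no record has to change partition.
  def joinCoPartitioned[A, B](left: Vector[Seq[(String, A)]],
                              right: Vector[Seq[(String, B)]]): Vector[Seq[(String, (A, B))]] = {
    require(left.size == right.size, "both sides must use the same number of partitions")
    left.zip(right).map { case (l, r) =>
      for ((k1, a) <- l; (k2, b) <- r if k1 == k2) yield (k1, (a, b))
    }
  }
}
```

The join works partition by partition only because both sides were placed with the same partitionOf function and the same partition count.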
Spark performance - caching
Spark performance - vs Hadoop (1)
“Run programs up to 100x faster than Hadoop
MapReduce in memory, or 10x faster on disk.”
Spark performance - vs Hadoop (3)
“(...) we decided to participate in the Sort Benchmark (...),
an industry benchmark on how fast a system can sort 100
TB of data (1 trillion records). The previous world record
was 72 minutes, set by (...) Hadoop (...) cluster of 2100
nodes. Using Spark on 206 nodes, we completed the
benchmark in 23 minutes. This means that Spark sorted
the same data 3X faster using 10X fewer machines. All (...)
without using Spark’s in-memory cache.”
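The "3X faster using 10X fewer machines" summary follows from the figures in the quote. A short worked check in Scala, using only the numbers quoted above (the object and value names are illustrative):

```scala
object SortBenchmarkRatio {
  // Figures taken from the quote above.
  val hadoopMinutes = 72.0
  val hadoopNodes   = 2100.0
  val sparkMinutes  = 23.0
  val sparkNodes    = 206.0

  // How many times faster the Spark run finished.
  val speedup: Double = hadoopMinutes / sparkMinutes

  // How many times fewer machines the Spark run used.
  val machineRatio: Double = hadoopNodes / sparkNodes

  def main(args: Array[String]): Unit =
    println(f"speedup: $speedup%.2fx, machines: $machineRatio%.2fx fewer")
}
```

72 / 23 is a little over 3 and 2100 / 206 is a little over 10, which matches the rounded figures in the quote.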
Spark to the rescue
1. Intuitive programming model
2. Performance boost
3. Spark Streaming
○ but also: graphs, machine learning and SQL
How?
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Cluster (Standalone, Yarn, Mesos)
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
SPARK API:
1. Scala
2. Java
3. Python
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
SPARK API:
1. Scala
2. Java
3. Python
Master
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
Executor 1
Executor 2
Executor 3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
Executor 1
Executor 2
Executor 3
HDFS,
GlusterFS,
locality
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1 T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1 T2 T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
Executor 3
T1
T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
EDeDEADutor
3
T1
T2
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2 T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
Cluster (Standalone, Yarn, Mesos)
Master
val master = “spark://host:pt”
val conf = new SparkConf()
.setMaster(master)
val sc = new SparkContext
(conf)
val logs =
sc.textFile(“logs.
txt”)
println(logs.count())
Executor 1
Executor 2
T1 T2
T3
twitter: @rabbitonweb,
email: paul.szulc@gmail.com
The Big Picture
Driver Program
val master = "spark://host:pt"
val conf = new SparkConf()
  .setMaster(master)
val sc = new SparkContext(conf)
val logs = sc.textFile("logs.txt")
println(logs.count())
Cluster (Standalone, YARN, Mesos)
Master
Executor 1
Executor 2
T1 T2 T3
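The picture above can be imitated on a single JVM. The following is only a toy sketch, assuming plain Scala Futures stand in for executors: the "driver" splits the input into partitions, starts one task per partition (T1, T2, T3) and adds up the partial counts. The names BigPictureSketch, partition and distributedCount are hypothetical and are not part of Spark, and Spark's real scheduler does far more than this.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BigPictureSketch {
  // Split the input into roughly equal partitions; each partition becomes one task.
  def partition(lines: Seq[String], numPartitions: Int): Seq[Seq[String]] = {
    require(numPartitions > 0, "numPartitions must be positive")
    val size = math.max(1, math.ceil(lines.size.toDouble / numPartitions).toInt)
    lines.grouped(size).toList
  }

  // "Driver" side: start one task per partition, then combine the partial counts.
  def distributedCount(lines: Seq[String], numPartitions: Int): Long = {
    val tasks: Seq[Future[Long]] =
      partition(lines, numPartitions).map(part => Future(part.size.toLong))
    Await.result(Future.sequence(tasks), 10.seconds).sum
  }
}
```

Calling BigPictureSketch.distributedCount(lines, 3) plays the role of logs.count(): three tasks run on the thread pool and only the final sum comes back to the caller.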
RDD - the definition
RDD stands for resilient distributed dataset
Resilient - if data is lost, it can be recreated
Distributed - stored across the nodes of the cluster
Dataset - initial data comes from a file or can be created programmatically
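"Resilient" is easier to picture with a toy model. The sketch below assumes a dataset that keeps the recipe it was built from, so a lost copy can be rebuilt on demand; Spark does something similar per partition by tracking lineage. ToyDataset, fromCollection and lose are hypothetical names for illustration only, and unlike this toy, Spark does not keep computed data in memory unless asked to.

```scala
object LineageSketch {
  // A toy dataset that remembers how to rebuild itself instead of relying on stored copies.
  final class ToyDataset[A](recipe: () => Vector[A]) {
    private var cached: Option[Vector[A]] = None

    // Return the data, recomputing it from the recipe if the copy is missing.
    def data: Vector[A] = cached.getOrElse {
      val rebuilt = recipe()
      cached = Some(rebuilt)
      rebuilt
    }

    // Simulate losing the in-memory copy (for example, a node going down).
    def lose(): Unit = { cached = None }

    // Deriving a dataset only extends the recipe; nothing is computed yet.
    def map[B](f: A => B): ToyDataset[B] = new ToyDataset[B](() => data.map(f))
  }

  // Hypothetical helper: build a toy dataset from a local collection.
  def fromCollection[A](xs: Seq[A]): ToyDataset[A] = new ToyDataset[A](() => xs.toVector)
}
```

After lose() is called on a dataset and on its parent, asking for data again walks the recipes back to the original collection and returns the same values.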
RDD - example
val logs = sc.textFile("hdfs://logs.txt")
From Hadoop Distributed File System
This is the RDD
RDD - example
val logs = sc.textFile("/home/rabbit/logs.txt")
From local file system (must be available on executors)
This is the RDD
RDD - example
val logs = sc.parallelize(List(1, 2, 3, 4))
Programmatically from a collection of elements
This is the RDD
RDD - example
val logs = sc.textFile("logs.txt")
val lcLogs = logs.map(_.toLowerCase)
Creates a new RDD
val errors = lcLogs.filter(_.contains("error"))
And yet another RDD
Performance Alert?!?!
RDD - Operations
1. Transformations
a. Map
b. Filter
c. FlatMap
d. Sample
e. Union
f. Intersection
g. Distinct
h. GroupByKey
i. ….
2. Actions
a. Reduce
b. Collect
c. Count
d. First
e. Take(n)
f. TakeSample
g. SaveAsTextFile
h. ….
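The split between transformations and actions answers the performance worry: transformations only describe work, and an action runs it. The sketch below shows the same idea without Spark, assuming a lazy Scala collection view as a stand-in for an RDD. LazyPipelineSketch, errorLines, countErrors and the linesProcessed counter are hypothetical names used only for this illustration.

```scala
object LazyPipelineSketch {
  // Counts how many lines the map step actually touches, to make laziness visible.
  var linesProcessed = 0

  // "Transformations": describe the pipeline on a lazy view; no element is processed here.
  def errorLines(logs: Seq[String]): Iterable[String] =
    logs.view
      .map { line => linesProcessed += 1; line.toLowerCase }
      .filter(_.contains("error"))

  // "Action": forcing the view runs map and filter and returns a plain value.
  def countErrors(logs: Seq[String]): Int = errorLines(logs).size
}
```

Building errorLines leaves linesProcessed at zero. The counter moves only once the size is requested, and no lower-cased copy of the whole log is stored in between.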
RDD - example
val logs = sc.textFile("logs.txt")
val lcLogs = logs.map(_.toLowerCase)
val errors = lcLogs.filter(_.contains("error"))
val numberOfErrors = errors.count
This will trigger the computation
This will be the calculated value (Long)
DEMO (1)
with Spark REPL
Spark Stack
Why Spark Streaming
A need to process data in near real time:
● monitoring
● web log analysis
● fraud detection
● online ads
Problem: no single framework for both batch and stream processing
How does Spark Streaming work?
live streamed data -> Spark Streaming -> small RDDs (RDD, RDD, RDD) -> Spark Core -> output data
Spark Streaming - Usage
val ssc = new StreamingContext(conf, Seconds(1))
Similar to SparkContext, we need an entry point for the new API
val lines = ssc.socketTextStream("localhost", 9999)
A DStream is created (think of it as a streamed RDD)
val words = lines.flatMap(_.split(" "))
val pairs = words.map(word => (word, 1))
val wordCounts = pairs.reduceByKey(_ + _)
Same operations as on an RDD
wordCounts.print()
An output operation such as print() must be registered before start()
ssc.start()
ssc.awaitTermination()
Keeps the application running until the stream is stopped
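The micro-batch idea behind this API can be tried without a cluster. This is a rough sketch, assuming events carry a timestamp and the batch interval is one second as in Seconds(1): the stream is chopped into small batches and the same word count runs on each batch. MicroBatchSketch, Event, toBatches, wordCounts and countPerBatch are hypothetical names; this is plain Scala collections code, not the DStream implementation.

```scala
object MicroBatchSketch {
  // A hypothetical event: arrival time in milliseconds plus one line of text.
  final case class Event(timeMs: Long, line: String)

  // Chop the stream into small batches, one per interval, keyed by batch index.
  def toBatches(events: Seq[Event], intervalMs: Long): Map[Long, Seq[String]] =
    events.groupBy(_.timeMs / intervalMs).map { case (batch, es) => batch -> es.map(_.line) }

  // The same word count as in the slide, written against an ordinary collection.
  def wordCounts(lines: Seq[String]): Map[String, Int] =
    lines.flatMap(_.split(" "))
      .map(word => (word, 1))
      .groupBy(_._1)
      .map { case (word, pairs) => word -> pairs.map(_._2).sum }

  // Apply the word count to every batch independently.
  def countPerBatch(events: Seq[Event], intervalMs: Long): Map[Long, Map[String, Int]] =
    toBatches(events, intervalMs).map { case (batch, lines) => batch -> wordCounts(lines) }
}
```

Each batch gets its own counts, which matches what wordCounts.print() would show interval by interval; counts are not carried over from one batch to the next.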
Q&A
… if I manage
paul.szulc@gmail.com, @rabbitonweb
http://www.rabbitonweb.com
Thank you
very much!

More Related Content

What's hot

When to Post on Social Media
When to Post on Social MediaWhen to Post on Social Media
When to Post on Social Media
Daniel Howard
 
Social Media 101: Creating Your Digital Fingerprint
Social Media 101: Creating Your Digital FingerprintSocial Media 101: Creating Your Digital Fingerprint
Social Media 101: Creating Your Digital Fingerprint
Kristi Casey Sanders, CMP, CMM, DES, HMCC
 
A complete guide to the best times to post on social media (and more!)
A complete guide to the best times to post on social media (and more!)A complete guide to the best times to post on social media (and more!)
A complete guide to the best times to post on social media (and more!)
Marketing Wallah
 
Social Media 201: Content Curation, Time Management and Audience Engagement
Social Media 201: Content Curation, Time Management and Audience EngagementSocial Media 201: Content Curation, Time Management and Audience Engagement
Social Media 201: Content Curation, Time Management and Audience Engagement
Kristi Casey Sanders, CMP, CMM, DES, HMCC
 
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
Distilled
 
How to Take a Month Off From Content - Pubtelligence
How to Take a Month Off From Content - PubtelligenceHow to Take a Month Off From Content - Pubtelligence
How to Take a Month Off From Content - Pubtelligence
Ashley Segura
 
The Science of Monitoring Yourself
The Science of Monitoring YourselfThe Science of Monitoring Yourself
The Science of Monitoring Yourself
Mary Thengvall
 
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress MeetupHow to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
Ashley Segura
 
Digital Summit Denver: Reusable Content
Digital Summit Denver: Reusable ContentDigital Summit Denver: Reusable Content
Digital Summit Denver: Reusable Content
Ashley Segura
 
How to Reuse Your Content - Search Marketing Summit Sydney
How to Reuse Your Content - Search Marketing Summit SydneyHow to Reuse Your Content - Search Marketing Summit Sydney
How to Reuse Your Content - Search Marketing Summit Sydney
Ashley Segura
 
Twitter Visualization: Semtech Data Feb 2013
Twitter Visualization: Semtech Data Feb 2013Twitter Visualization: Semtech Data Feb 2013
Twitter Visualization: Semtech Data Feb 2013
Boulder Equity Analytics
 
Diversity (in Media)
Diversity (in Media)Diversity (in Media)
Diversity (in Media)
Arjen de Vries
 
WordPress As The Foundation For Your Digital Ecosystem
WordPress As The Foundation  For Your Digital EcosystemWordPress As The Foundation  For Your Digital Ecosystem
WordPress As The Foundation For Your Digital Ecosystem
Sean Nicholson, JD
 
Twitterology - The Science of Twitter
Twitterology - The Science of TwitterTwitterology - The Science of Twitter
Twitterology - The Science of Twitter
Bruno Gonçalves
 
From Desktop to Home: Optimizing for Voice
From Desktop to Home: Optimizing for VoiceFrom Desktop to Home: Optimizing for Voice
From Desktop to Home: Optimizing for Voice
Peter "Dr. Pete" Meyers
 
Human Mobility (with Mobile Devices)
Human Mobility (with Mobile Devices)Human Mobility (with Mobile Devices)
Human Mobility (with Mobile Devices)
Bruno Gonçalves
 
Personal research environment presentation
Personal research environment presentationPersonal research environment presentation
Personal research environment presentation
Daniela Gachago
 

What's hot (19)

When to Post on Social Media
When to Post on Social MediaWhen to Post on Social Media
When to Post on Social Media
 
Social Media 101: Creating Your Digital Fingerprint
Social Media 101: Creating Your Digital FingerprintSocial Media 101: Creating Your Digital Fingerprint
Social Media 101: Creating Your Digital Fingerprint
 
A complete guide to the best times to post on social media (and more!)
A complete guide to the best times to post on social media (and more!)A complete guide to the best times to post on social media (and more!)
A complete guide to the best times to post on social media (and more!)
 
BotBoosted_v11
BotBoosted_v11BotBoosted_v11
BotBoosted_v11
 
Social Media 201: Content Curation, Time Management and Audience Engagement
Social Media 201: Content Curation, Time Management and Audience EngagementSocial Media 201: Content Curation, Time Management and Audience Engagement
Social Media 201: Content Curation, Time Management and Audience Engagement
 
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
SearchLove San Diego 2018 | Ashley Ward | Reuse, Recycle: How to Repurpose Yo...
 
How to Take a Month Off From Content - Pubtelligence
How to Take a Month Off From Content - PubtelligenceHow to Take a Month Off From Content - Pubtelligence
How to Take a Month Off From Content - Pubtelligence
 
The Science of Monitoring Yourself
The Science of Monitoring YourselfThe Science of Monitoring Yourself
The Science of Monitoring Yourself
 
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress MeetupHow to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
How to Reuse Content For Your Website - Melbourne SEO Wordpress Meetup
 
Digital Summit Denver: Reusable Content
Digital Summit Denver: Reusable ContentDigital Summit Denver: Reusable Content
Digital Summit Denver: Reusable Content
 
How to Reuse Your Content - Search Marketing Summit Sydney
How to Reuse Your Content - Search Marketing Summit SydneyHow to Reuse Your Content - Search Marketing Summit Sydney
How to Reuse Your Content - Search Marketing Summit Sydney
 
Twitter Visualization: Semtech Data Feb 2013
Twitter Visualization: Semtech Data Feb 2013Twitter Visualization: Semtech Data Feb 2013
Twitter Visualization: Semtech Data Feb 2013
 
Diversity (in Media)
Diversity (in Media)Diversity (in Media)
Diversity (in Media)
 
Twitter analysis
Twitter analysisTwitter analysis
Twitter analysis
 
WordPress As The Foundation For Your Digital Ecosystem
WordPress As The Foundation  For Your Digital EcosystemWordPress As The Foundation  For Your Digital Ecosystem
WordPress As The Foundation For Your Digital Ecosystem
 
Twitterology - The Science of Twitter
Twitterology - The Science of TwitterTwitterology - The Science of Twitter
Twitterology - The Science of Twitter
 
From Desktop to Home: Optimizing for Voice
From Desktop to Home: Optimizing for VoiceFrom Desktop to Home: Optimizing for Voice
From Desktop to Home: Optimizing for Voice
 
Human Mobility (with Mobile Devices)
Human Mobility (with Mobile Devices)Human Mobility (with Mobile Devices)
Human Mobility (with Mobile Devices)
 
Personal research environment presentation
Personal research environment presentationPersonal research environment presentation
Personal research environment presentation
 

Viewers also liked

Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
Abdullah Çetin ÇAVDAR
 
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time AnalyticsFrom Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
SingleStore
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
Pietro Michiardi
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
Rahul Jain
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
Keeyong Han
 
Data analysis scala_spark
Data analysis scala_sparkData analysis scala_spark
Data analysis scala_spark
Yiguang Hu
 
Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
Ankara Big Data Meetup
 
Multi Screen Hell
Multi Screen HellMulti Screen Hell
Multi Screen Hell
Abdullah Çetin ÇAVDAR
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
SingleStore
 
Ventajas de los servicios de internet
Ventajas de los servicios de internetVentajas de los servicios de internet
Ventajas de los servicios de internetLicenia García
 
Informe anual-2013
Informe anual-2013Informe anual-2013
Informe anual-2013
celamex
 
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
Marc Garriga
 
Pdf pajatoquilla
Pdf pajatoquillaPdf pajatoquilla
Pdf pajatoquilla
camihurtado1
 
Presentación1
Presentación1Presentación1
Presentación1
Leyla Fernandez
 
Value Plus July Edition - 2015
Value Plus July Edition - 2015Value Plus July Edition - 2015
Value Plus July Edition - 2015
Redington Value Distribution
 
Sukrit-The ultimate wellness guide
Sukrit-The ultimate wellness guide Sukrit-The ultimate wellness guide
Sukrit-The ultimate wellness guide
VIHANGAM YOGA
 
Automotive Troubleshooting With An Oscilloscope.
Automotive Troubleshooting With An Oscilloscope.Automotive Troubleshooting With An Oscilloscope.
Automotive Troubleshooting With An Oscilloscope.
Jeffrey Bledsoe
 
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
guest5cb03c8
 

Viewers also liked (20)

Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
 
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time AnalyticsFrom Spark to Ignition: Fueling Your Business on Real-Time Analytics
From Spark to Ignition: Fueling Your Business on Real-Time Analytics
 
Introduction to Spark Internals
Introduction to Spark InternalsIntroduction to Spark Internals
Introduction to Spark Internals
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
AWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data AnalyticsAWS Redshift Introduction - Big Data Analytics
AWS Redshift Introduction - Big Data Analytics
 
Data analysis scala_spark
Data analysis scala_sparkData analysis scala_spark
Data analysis scala_spark
 
Apache Spark 101
Apache Spark 101Apache Spark 101
Apache Spark 101
 
Multi Screen Hell
Multi Screen HellMulti Screen Hell
Multi Screen Hell
 
Journey to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme GrowthJourney to the Real-Time Analytics in Extreme Growth
Journey to the Real-Time Analytics in Extreme Growth
 
Ventajas de los servicios de internet
Ventajas de los servicios de internetVentajas de los servicios de internet
Ventajas de los servicios de internet
 
Informe anual-2013
Informe anual-2013Informe anual-2013
Informe anual-2013
 
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
Los datos abiertos (#OpenData): un valor imprescindible para las start-ups de...
 
Pdf pajatoquilla
Pdf pajatoquillaPdf pajatoquilla
Pdf pajatoquilla
 
Presentación1
Presentación1Presentación1
Presentación1
 
Value Plus July Edition - 2015
Value Plus July Edition - 2015Value Plus July Edition - 2015
Value Plus July Edition - 2015
 
Sukrit-The ultimate wellness guide
Sukrit-The ultimate wellness guide Sukrit-The ultimate wellness guide
Sukrit-The ultimate wellness guide
 
Automotive Troubleshooting With An Oscilloscope.
Automotive Troubleshooting With An Oscilloscope.Automotive Troubleshooting With An Oscilloscope.
Automotive Troubleshooting With An Oscilloscope.
 
La cèl·lula
La cèl·lulaLa cèl·lula
La cèl·lula
 
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
Aspectos Especiales De La RehabilitacióN Desde AtencióN Primaria De Salud Par...
 
The Acquisition of Conrail Corporation
The Acquisition of Conrail CorporationThe Acquisition of Conrail Corporation
The Acquisition of Conrail Corporation
 

Similar to Apache Spark 101 [in 50 min]

Apache spark workshop
Apache spark workshopApache spark workshop
Apache spark workshop
Pawel Szulc
 
20151020 Metis
20151020 Metis20151020 Metis
20151020 Metis
Dean Malmgren
 
From jUnit to Mutationtesting
From jUnit to MutationtestingFrom jUnit to Mutationtesting
From jUnit to Mutationtesting
Sven Ruppert
 
Deep Learning for Folks Without (or With!) a Ph.D.
Deep Learning for Folks Without (or With!) a Ph.D.Deep Learning for Folks Without (or With!) a Ph.D.
Deep Learning for Folks Without (or With!) a Ph.D.
Douglas Starnes
 
Faster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypesFaster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypes
OSCON Byrum
 
Ai &amp; ml
Ai &amp; mlAi &amp; ml
Ai &amp; ml
Avilay Parekh
 
Python slide
Python slidePython slide
Mathematics and technology(2)
Mathematics and technology(2)Mathematics and technology(2)
Mathematics and technology(2)Jonathan Martin
 
Causal inference-for-profit | Dan McKinley | DN18
Causal inference-for-profit | Dan McKinley | DN18Causal inference-for-profit | Dan McKinley | DN18
Causal inference-for-profit | Dan McKinley | DN18
DataconomyGmbH
 
DN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
DN18 | A/B Testing: Lessons Learned | Dan McKinley | MailchimpDN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
DN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
Dataconomy Media
 
Brand New JavaScript - ECMAScript 2015
Brand New JavaScript - ECMAScript 2015Brand New JavaScript - ECMAScript 2015
Brand New JavaScript - ECMAScript 2015
Gil Fink
 
Metadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge ProductionMetadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge Production
Kevin Rundblad
 
Mastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loopsMastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loops
Ruth Marvin
 
Algorithms - Future Decoded 2016
Algorithms - Future Decoded 2016Algorithms - Future Decoded 2016
Algorithms - Future Decoded 2016
Frank Krueger
 
It Probably Works - QCon 2015
It Probably Works - QCon 2015It Probably Works - QCon 2015
It Probably Works - QCon 2015
Fastly
 
session_01_react_.pptx
session_01_react_.pptxsession_01_react_.pptx
session_01_react_.pptx
AyaBenkabbour1
 
Replication in Data Science - A Dance Between Data Science & Machine Learning...
Replication in Data Science - A Dance Between Data Science & Machine Learning...Replication in Data Science - A Dance Between Data Science & Machine Learning...
Replication in Data Science - A Dance Between Data Science & Machine Learning...
June Andrews
 
Python for scientific computing
Python for scientific computingPython for scientific computing
Python for scientific computingGo Asgard
 
Computing Social Score of Web Aritfacts
Computing Social Score of Web AritfactsComputing Social Score of Web Aritfacts
Computing Social Score of Web Aritfacts
Venkatesh J N
 
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Databricks
 

Similar to Apache Spark 101 [in 50 min] (20)

Apache spark workshop
Apache spark workshopApache spark workshop
Apache spark workshop
 
20151020 Metis
20151020 Metis20151020 Metis
20151020 Metis
 
From jUnit to Mutationtesting
From jUnit to MutationtestingFrom jUnit to Mutationtesting
From jUnit to Mutationtesting
 
Deep Learning for Folks Without (or With!) a Ph.D.
Deep Learning for Folks Without (or With!) a Ph.D.Deep Learning for Folks Without (or With!) a Ph.D.
Deep Learning for Folks Without (or With!) a Ph.D.
 
Faster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypesFaster! Faster! Accelerate your business with blazing prototypes
Faster! Faster! Accelerate your business with blazing prototypes
 
Ai &amp; ml
Ai &amp; mlAi &amp; ml
Ai &amp; ml
 
Python slide
Python slidePython slide
Python slide
 
Mathematics and technology(2)
Mathematics and technology(2)Mathematics and technology(2)
Mathematics and technology(2)
 
Causal inference-for-profit | Dan McKinley | DN18
Causal inference-for-profit | Dan McKinley | DN18Causal inference-for-profit | Dan McKinley | DN18
Causal inference-for-profit | Dan McKinley | DN18
 
DN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
DN18 | A/B Testing: Lessons Learned | Dan McKinley | MailchimpDN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
DN18 | A/B Testing: Lessons Learned | Dan McKinley | Mailchimp
 
Brand New JavaScript - ECMAScript 2015
Brand New JavaScript - ECMAScript 2015Brand New JavaScript - ECMAScript 2015
Brand New JavaScript - ECMAScript 2015
 
Metadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge ProductionMetadata in a Crowd: Shared Knowledge Production
Metadata in a Crowd: Shared Knowledge Production
 
Mastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loopsMastering Python lesson3b_for_loops
Mastering Python lesson3b_for_loops
 
Algorithms - Future Decoded 2016
Algorithms - Future Decoded 2016Algorithms - Future Decoded 2016
Algorithms - Future Decoded 2016
 
It Probably Works - QCon 2015
It Probably Works - QCon 2015It Probably Works - QCon 2015
It Probably Works - QCon 2015
 
session_01_react_.pptx
session_01_react_.pptxsession_01_react_.pptx
session_01_react_.pptx
 
Replication in Data Science - A Dance Between Data Science & Machine Learning...
Replication in Data Science - A Dance Between Data Science & Machine Learning...Replication in Data Science - A Dance Between Data Science & Machine Learning...
Replication in Data Science - A Dance Between Data Science & Machine Learning...
 
Python for scientific computing
Python for scientific computingPython for scientific computing
Python for scientific computing
 
Computing Social Score of Web Aritfacts
Computing Social Score of Web AritfactsComputing Social Score of Web Aritfacts
Computing Social Score of Web Aritfacts
 
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
Taking Jupyter Notebooks and Apache Spark to the Next Level PixieDust with Da...
 

More from Pawel Szulc

Getting acquainted with Lens
Getting acquainted with LensGetting acquainted with Lens
Getting acquainted with Lens
Pawel Szulc
 
Impossibility
ImpossibilityImpossibility
Impossibility
Pawel Szulc
 
Maintainable Software Architecture in Haskell (with Polysemy)
Maintainable Software Architecture in Haskell (with Polysemy)Maintainable Software Architecture in Haskell (with Polysemy)
Maintainable Software Architecture in Haskell (with Polysemy)
Pawel Szulc
 
Painless Haskell
Painless HaskellPainless Haskell
Painless Haskell
Pawel Szulc
 
Trip with monads
Trip with monadsTrip with monads
Trip with monads
Pawel Szulc
 
Trip with monads
Trip with monadsTrip with monads
Trip with monads
Pawel Szulc
 
Illogical engineers
Illogical engineersIllogical engineers
Illogical engineers
Pawel Szulc
 
RChain - Understanding Distributed Calculi
RChain - Understanding Distributed CalculiRChain - Understanding Distributed Calculi
RChain - Understanding Distributed Calculi
Pawel Szulc
 
Illogical engineers
Illogical engineersIllogical engineers
Illogical engineers
Pawel Szulc
 
Understanding distributed calculi in Haskell
Understanding distributed calculi in HaskellUnderstanding distributed calculi in Haskell
Understanding distributed calculi in Haskell
Pawel Szulc
 
Software engineering the genesis
Software engineering  the genesisSoftware engineering  the genesis
Software engineering the genesis
Pawel Szulc
 
Make your programs Free
Make your programs FreeMake your programs Free
Make your programs Free
Pawel Szulc
 
Going bananas with recursion schemes for fixed point data types
Going bananas with recursion schemes for fixed point data typesGoing bananas with recursion schemes for fixed point data types
Going bananas with recursion schemes for fixed point data types
Pawel Szulc
 
“Going bananas with recursion schemes for fixed point data types”
“Going bananas with recursion schemes for fixed point data types”“Going bananas with recursion schemes for fixed point data types”
“Going bananas with recursion schemes for fixed point data types”
Pawel Szulc
 
Writing your own RDD for fun and profit
Writing your own RDD for fun and profitWriting your own RDD for fun and profit
Writing your own RDD for fun and profit
Pawel Szulc
 
The cats toolbox a quick tour of some basic typeclasses
The cats toolbox  a quick tour of some basic typeclassesThe cats toolbox  a quick tour of some basic typeclasses
The cats toolbox a quick tour of some basic typeclasses
Pawel Szulc
 
Introduction to type classes
Introduction to type classesIntroduction to type classes
Introduction to type classes
Pawel Szulc
 
Functional Programming & Event Sourcing - a pair made in heaven
Functional Programming & Event Sourcing - a pair made in heavenFunctional Programming & Event Sourcing - a pair made in heaven
Functional Programming & Event Sourcing - a pair made in heaven
Pawel Szulc
 
Introduction to type classes in 30 min
Introduction to type classes in 30 minIntroduction to type classes in 30 min
Introduction to type classes in 30 min
Pawel Szulc
 
Real world gobbledygook
Real world gobbledygookReal world gobbledygook
Real world gobbledygook
Pawel Szulc
 

More from Pawel Szulc (20)

Getting acquainted with Lens
Getting acquainted with LensGetting acquainted with Lens
Getting acquainted with Lens
 
Impossibility
ImpossibilityImpossibility
Impossibility
 
Maintainable Software Architecture in Haskell (with Polysemy)
Maintainable Software Architecture in Haskell (with Polysemy)Maintainable Software Architecture in Haskell (with Polysemy)
Maintainable Software Architecture in Haskell (with Polysemy)
 
Painless Haskell
Painless HaskellPainless Haskell
Painless Haskell
 
Trip with monads
Trip with monadsTrip with monads
Trip with monads
 
Trip with monads
Trip with monadsTrip with monads
Trip with monads
 
Illogical engineers
Illogical engineersIllogical engineers
Illogical engineers
 
RChain - Understanding Distributed Calculi
RChain - Understanding Distributed CalculiRChain - Understanding Distributed Calculi
RChain - Understanding Distributed Calculi
 
Illogical engineers
Illogical engineersIllogical engineers
Illogical engineers
 
Understanding distributed calculi in Haskell
Understanding distributed calculi in HaskellUnderstanding distributed calculi in Haskell
Understanding distributed calculi in Haskell
 
Software engineering the genesis
Software engineering  the genesisSoftware engineering  the genesis
Software engineering the genesis
 
Make your programs Free
Make your programs FreeMake your programs Free
Make your programs Free
 
Going bananas with recursion schemes for fixed point data types
Going bananas with recursion schemes for fixed point data typesGoing bananas with recursion schemes for fixed point data types
Going bananas with recursion schemes for fixed point data types
 
“Going bananas with recursion schemes for fixed point data types”
“Going bananas with recursion schemes for fixed point data types”“Going bananas with recursion schemes for fixed point data types”
“Going bananas with recursion schemes for fixed point data types”
 
Writing your own RDD for fun and profit
Writing your own RDD for fun and profitWriting your own RDD for fun and profit
Writing your own RDD for fun and profit
 
The cats toolbox a quick tour of some basic typeclasses
The cats toolbox  a quick tour of some basic typeclassesThe cats toolbox  a quick tour of some basic typeclasses
The cats toolbox a quick tour of some basic typeclasses
 
Introduction to type classes
Introduction to type classesIntroduction to type classes
Introduction to type classes
 
Functional Programming & Event Sourcing - a pair made in heaven
Functional Programming & Event Sourcing - a pair made in heavenFunctional Programming & Event Sourcing - a pair made in heaven
Functional Programming & Event Sourcing - a pair made in heaven
 
Introduction to type classes in 30 min
 
Real world gobbledygook
 


Apache Spark 101 [in 50 min]