3. Scientists have always needed the best instruments that the technology of their time allowed them to build
Examples: the microscope (Santiago Ramón y Cajal), the Large Hadron Collider (CERN)
4. And supercomputers today can be considered the ultimate scientific instrument, enabling progress in science
5. The Evolution of the Research Paradigm
High Performance Computing means numerical simulation and big data analysis, which allow us to:
• reduce expense
• avoid dangerous experiments
• build knowledge where experiments are impossible or not affordable
6. HPC is an enabler for all scientific fields
• Life Sciences & Medicine
• Earth Sciences
• Astro, High Energy & Plasma Physics
• Materials, Chemistry & Nanoscience
• Engineering
• Neuroscience
7. The emergent focus on big data requires a transition of computing facilities to a data-centric paradigm too
However, traditional HPC systems are designed according to the compute-centric paradigm
8. We have experimented with this in our HPC facility in Barcelona, and this is what I’m going to talk about today!
How can existing traditional HPC infrastructure evolve to meet the new demands?
12. Joint Research Centres with IT Companies
• BSC-Microsoft Research Centre
• BSC-IBM Technology Center for Supercomputing
• Intel-BSC Exascale Lab
• BSC-NVIDIA CUDA Center of Excellence
15. The MareNostrum 3 Supercomputer
Over 10^15 floating-point operations per second (1 Petaflop/s)
– Nearly 50,000 cores
– 100.8 TB of memory
– 2,000 TB of disk storage
16. The third of three brothers
• 2004: MareNostrum 1
– Nearly 5×10^13 floating-point operations per second
– Nearly 5,000 cores
– 236 TB disk storage
• 2006: MareNostrum 2
– Nearly 10^14 floating-point operations per second
– Over 10,000 cores
– 460 TB disk storage
• 2012: MareNostrum 3
17. MareNostrum’s ancestors in the chapel
A parallel system inside the same chapel:
• Grandparent: a parallel system with 8 parallel typewriter units; processing capacity of over 1,000 operations (beats) per minute.
• Grandmother: a parallel storage system with 14 drawer devices; storage capacity of over 100 MB.
23. Spark4MN in action
• We performed a system-level performance evaluation & tuning on MN3
• Examples of some results:
– Speed-up
– Scale-up
– Parallelism
24. Example 1: K-means Speed-up
More dimensions mean a smaller speed-up because of increased shuffling (the same number of centroids is shuffled, but each one is bigger).
• Times for running k-means for 10 iterations (a timing sketch follows below).
• Problem size kept constant at 100 GB (10M1000D = 10M vectors of 1000 dimensions).
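A minimal sketch of how such a timing run can look with Spark MLlib, in the spirit of the benchmark (the HDFS path, k = 100 and the caching strategy are our illustrative assumptions, not Spark4MN's actual configuration):

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Run in the Spark shell, where sc (the SparkContext) is predefined.
// Parse one 1000-dimensional vector per input line (10M1000D-style data).
val data = sc.textFile("hdfs://.../10M1000D")
  .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
  .cache()
data.count() // materialize the cache so the timing excludes input I/O

val start = System.nanoTime()
val model = KMeans.train(data, k = 100, maxIterations = 10) // 10 iterations, as in the benchmark
println(s"k-means took ${(System.nanoTime() - start) / 1e9} s")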
25. Example 2: K-means Scale-up
• We modify both the number of records and the number of machines.
• Ideally, all the plots should be horizontal; our system behaves close to that.
26. Example 3: Configuring task parallelism
Varying the number of tasks over the same number of cores for k-means, the best-performing configuration has as many partitions as cores: one task per core is better (see the sketch below).
• Median times for running k-means for 10 iterations with different numbers of partitions.
• In our benchmarks the number of tasks is equal to the number of RDD partitions.
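For illustration, the partition count can be fixed when the RDD is created or by repartitioning; the core count below is a made-up example, not MN3's:

// Run in the Spark shell (sc predefined). One task per core:
// request exactly as many partitions as the job has cores.
val totalCores = 256 // illustrative figure, e.g. 16 nodes x 16 cores
val data = sc.textFile("hdfs://.../10M1000D", minPartitions = totalCores)

// Or adjust an existing RDD; each partition becomes one task.
val repartitioned = data.repartition(totalCores)
println(s"partitions = ${repartitioned.partitions.length}")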
27. Example 3: Configuring task parallelism
• Using sort-by-key: a more shuffle-intensive scenario (sketched below)
– We sort 1 billion records using 64 nodes & different partition sizes
– Contrary to the previous case, we observe speed-ups when there are 2 partitions per core
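A sketch of the shuffle-heavy pattern in the Spark shell (the core count and record count are placeholders, not the benchmark's exact setup):

// Run in the Spark shell (sc predefined). Sort key/value pairs across
// the cluster with 2 partitions per core.
val cores = 1024 // illustrative: 64 nodes x 16 cores
val pairs = sc.parallelize(0L until 10000000L) // stand-in for the 1B records
  .map(i => (scala.util.Random.nextLong(), i))
val sorted = pairs.sortByKey(ascending = true, numPartitions = 2 * cores)
sorted.count() // forces the shuffle to execute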
28. Example 4: sort-by-key
• How many concurrent tasks can an executor supervise?
Having two 8-core executors instead of eight 2-core ones improves the running time by a factor of 2.79, leaving all other parameters the same.
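The executor shape can be expressed through standard Spark properties; a sketch whose values mirror the finding above (the app name is made up, and spark.executor.instances applies under YARN deployments):

import org.apache.spark.{SparkConf, SparkContext}

// Two "fat" 8-core executors instead of eight 2-core ones.
val conf = new SparkConf()
  .setAppName("sort-by-key-fat-executors")
  .set("spark.executor.instances", "2") // two executors...
  .set("spark.executor.cores", "8")     // ...with 8 cores each
val sc = new SparkContext(conf)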
29. More results on Friday at the Santa Clara conference!
2015 IEEE International Conference on Big Data, October 29-November 1, Santa Clara, CA, USA
34. Data processing capacity scaling with large input datasets
The performance of Spark workloads degrades with large volumes of data due to a substantial increase in garbage-collection and file-I/O time.
Spark workloads do not saturate the available bandwidth, and hence their performance is bound by DRAM latency.
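One way to surface the garbage-collection cost on the executors is standard JVM GC logging; a sketch (the Spark property is standard, the choice of flags is ours):

// Enable GC logging in every executor JVM; Spark's web UI also reports
// per-task "GC Time" for the same diagnosis.
val conf = new org.apache.spark.SparkConf()
  .set("spark.executor.extraJavaOptions",
    "-XX:+PrintGCDetails -XX:+PrintGCTimeStamps")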
35. More results in:
• A. J. Awan, M. Brorsson, V. Vlassov and E. Ayguadé, "Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server", 5th IEEE International Conference on Big Data and Cloud Computing (BDCloud), Aug 2015, Dalian, China (Best Paper Award).
• A. J. Awan, M. Brorsson, V. Vlassov and E. Ayguadé, "How Data Volume Affects Spark Based Data Analytics on a Scale-up Server", 6th International Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware (BPOE), held in conjunction with the 41st International Conference on Very Large Data Bases, Sep 2015, Hawaii, USA.
37. BSC programming model: COMPSs
– A sequential programming model
– Abstracts the application from the underlying distributed infrastructure
– Exploits the inherent parallelism at runtime
38. We are studying the comparison and interaction between these two programming models on platforms like MareNostrum 3
39. Profiling Spark with BSC’s HPC tools
• Relying on over 20 years of HPC experience & tools for profiling
• Preliminary work: developed the Hadoop Instrumentation Toolkit, capturing CPU, memory, page faults, and process/communication activity
40. Project ALOJA: Benchmarking Spark
• An open initiative to explore and produce a systematic study of Hadoop/Spark efficiency on different software and hardware
• An online repository that allows comparing, side by side, all execution parameters (50,000+ runs over 100+ hardware configurations)
42. Preliminary work
• Multimedia Big Data Computing: work with three kinds of data at the same time:
– social network relationships
– audiovisual content
– metadata
44. Example of tools created: Vectorization
Necessary for visual similarity search, visual clustering, classification, etc.
45. Available in our GitHub: bsc.spark.image
scala> import bsc.spark.image.ImageUtils
…
scala> val images = ImageUtils.seqFile("hdfs://...", sc)
scala> val dictionary = ImageUtils.BoWDictionary(images)
scala> val vectors = dictionary.getBags(images)
…
scala> import org.apache.spark.mllib.classification.NaiveBayes
scala> val splits = vectors.randomSplit(Array(0.6, 0.4), seed = 11L)
scala> val training = splits(0)
scala> val test = splits(1)
scala> val model = NaiveBayes.train(training, lambda = 1.0)
…
46. Applications: Locality Sensitive Hashing
e.g. near-replica detection (visual spam detection, copyright infringement)
(Figure: image patches pass through feature detection to keypoints, then through feature description (SIFT, SURF, ORB, etc.) to descriptors)
Features are sketched, i.e. embedded into a Hamming space as short binary codes.
Similar features are hashed into similar buckets in a hash table.
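As an illustration of the hashing step, a minimal bit-sampling LSH sketch in plain Scala (our toy example, not the BSC pipeline's code): descriptors that differ in few bits usually fall into the same bucket.

// Minimal bit-sampling LSH in a Hamming space: the hash is the
// concatenation of a fixed sample of bit positions of the descriptor.
object LshSketch {
  val sampledBits = Seq(0, 3, 5, 10) // bit positions forming the hash (fixed for the demo)
  def hash(descriptor: Vector[Int]): String =
    sampledBits.map(descriptor(_)).mkString

  def main(args: Array[String]): Unit = {
    val d1 = Vector(0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1, 0)
    val d2 = Vector(0, 0, 1, 0, 0, 1, 1, 1, 1, 1, 1, 0) // near-replica: 1 bit differs
    val buckets = Seq(d1, d2).groupBy(hash) // near-replicas share a bucket
    buckets.foreach { case (h, ds) => println(s"bucket $h -> ${ds.size} descriptors") }
  }
}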
47. Current work: Computer Vision
• Makes very productive use of (convolutional) neural networks
• SIFT features, used for decades, have become unnecessary
49. BSC vision: giving computers a greater ability to understand information, and to learn, reason, and act upon it
50. Old wine in a new bottle? Artificial Intelligence plays an important role
• The term itself dates from the 1950s.
• Periods of hype and high expectations have alternated with periods of setback and disappointment.
51. Why Now?
1. Alongside the explosion of data … algorithms can now be “trained” by exposing them to large data sets that were previously unavailable.
2. And the computing power necessary to implement these algorithms is now available.
53. This new type of computing requires:
1. the continuous development of supercomputing systems,
2. enabling the convergence of advanced analytic algorithms
3. and big data technologies.
(Diagram: research at the intersection of data, supercomputers, big data technologies and advanced analytic algorithms)
55. Cognitive Computing requires a transition of computing facilities into a new paradigm too
Name? … We use “Cognitive Computing”
(Timeline: yesterday, today, tomorrow)
61. Welcome to our academic activities
• Teaching Spark @ Master courses
• Using Spark @ Final Master Thesis
• Using Spark @ Research activity
• NEW Spark book in Spanish (Editorial UOC), with a foreword by Matei Zaharia; presentation on November 3, 2015