1. @dez_blanchfield
Distributed & Parallel computing – a slice of my personal journey
• Mainframes
• Mini-Computers ( i.e. PDP / VAX )
• Micro-Computers ( 386 PC Servers )
• Desktop Servers ( UltraSPARC 2 )
• High End Proprietary Clusters
• Home Built COTS Clusters
• OpenSource platforms
• Software vs Hardware
3. @dez_blanchfield
The Challenges Of Large Scale Parallel and Distributed Computing
• Parallel vs Distributed
• Parallel Computing is Rocket Science
• Filesystems evolved to meet
demand for distributed storage
• Distributed became more accessible
• Distributed also won the cost battle
• Open Frameworks are king
• Distributed increasing rapidly
• Everyone wants their own Hadoop
• DIY is fun but comes with
wide-ranging costs / risks
4. @dez_blanchfield
Closed vs Open Platforms – Where have we come from
• Proprietary Systems
• IBM SP2 ( AIX )
• Sun Solaris
• Sun Grid Engine ( Solaris SPARC + x86 )
• A+ Edition ( Fujitsu Mainframes )
• E10k -> E15k big iron
• Custom Systems ( Top 500 list )
• Nations rule the roost here!!
• Open Frameworks
• PVM / MPI / MPICH (Chameleon Lib)
• SMP
• Beowulf Linux clusters
• Rocks Cluster Distribution
• Hadoop v1 / Hadoop 2 & YARN
5. @dez_blanchfield
Searching for little green men at Home ( aka SETI@Home )
• SETI@Home, launched in 1999, and
later BOINC put distributed
computing on millions of screens and
introduced many of the core concepts
we now take for granted in
distributed computing
• SETI@Home put a tiny piece of a
supercomputer on your desktop, and
let you participate in a global project
for social good. For the most part
it cost you nothing to participate, as it
only used spare CPU cycles that
would otherwise have gone to waste
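A minimal sketch of the work-unit idea described above, not SETI@Home's or BOINC's actual code. It assumes a recording can be cut into independent chunks, simulates volunteers with local threads, and uses a placeholder analysis (the strongest sample). All function names here are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_work_units(samples, unit_size):
    """Cut one long recording into small, independent work units."""
    return [samples[i:i + unit_size] for i in range(0, len(samples), unit_size)]


def analyse_work_unit(unit):
    """Placeholder analysis (assumption): report the strongest sample in the unit."""
    return max(unit)


def run_project(samples, unit_size=4, volunteers=3):
    """Hand work units to simulated volunteers and combine their results."""
    units = split_into_work_units(samples, unit_size)
    # Each thread stands in for a volunteer desktop donating spare cycles.
    with ThreadPoolExecutor(max_workers=volunteers) as pool:
        results = list(pool.map(analyse_work_unit, units))
    # The project server only has to merge the small per-unit answers.
    return max(results)


if __name__ == "__main__":
    recording = [1, 5, 3, 9, 2, 8, 7, 4, 6, 0]
    print(run_project(recording))
```

Because no work unit depends on another, volunteers can join, leave, or run slowly without blocking the rest of the project.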
6. @dez_blanchfield
1943 – “5 x Computers should about do it”
“I think there is a world market for maybe five computers.”
Attributed to Thomas J. Watson
7. @dez_blanchfield
What Happened – How did the shift happen so quickly
• Open Source & the FSF
• BSD, Minix & Linux
• Network & Clustered Filesystems
• Networking, in particular Ethernet
• COTS Hardware
• Disks & RAM got a lot cheaper
• Modern CPU Design
• Multi-threading architectures
• Multi-core backplanes
• Smarter Memory Designs
8. @dez_blanchfield
What Happened – Yellow Elephants & a big Yahoo
• Doug Cutting
• Mike Cafarella
• The internet got very big very fast
• 2nd generation search engines
• Google’s MapReduce and
Google File System papers
• 2006 & The Nutch Search Engine
• Distributing Index & Search at scale
• Yahoo became home to open
source Big Data
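For readers new to the MapReduce model mentioned on this slide, here is a minimal single-process sketch of its three steps (map, shuffle, reduce) using a word count. It is an illustration only, not Hadoop's or Google's API. The function names are hypothetical, and a real framework would run the map and reduce steps on many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data", "big yellow elephant"]))
```

The map and reduce functions never share state, which is what lets a framework spread them across a cluster.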
11. @dez_blanchfield
Where are we today – Pitfalls & Brick Walls
• Distributed Computing is Hard!!
• Now everyone CAN try it
• But not everyone SHOULD do it
• Rocket science does still apply
• Small clusters with one or two
workloads are safe
• Once you scale out to hundreds of
workloads and thousands of users
you will find pain, and lots of it
• You will not solve your
performance issues in a timely or
cost-effective manner on your own
12. @dez_blanchfield
Big Hadoop Systems need automated Performance Management
• Automated Performance Management
is a must: humans can’t respond fast
enough to resolve multi-workload
issues, and Ganglia alone isn’t enough
• Systems are required to manage
systems: to monitor, discover, and
respond instantly, so that investments
in Hadoop deliver good outcomes
• Even sophisticated large Hadoop
implementations in Govt. & Large
Enterprise that do have rocket
scientists struggle with performance
issues that arise from multi-tenant,
multi-workload use of Hadoop
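The "monitor, discover, respond" loop above could be sketched roughly as follows. This is a toy illustration under stated assumptions, not how any particular product or Hadoop scheduler works. The fair-share rule, the `tolerance` factor, and every function name are hypothetical.

```python
def fair_share(capacity, usage):
    """Assumption: each workload is entitled to an equal slice of capacity."""
    return capacity / len(usage)


def discover_offenders(usage, capacity, tolerance=1.5):
    """Discover: find workloads using more than tolerance x their fair share."""
    limit = fair_share(capacity, usage) * tolerance
    return sorted(name for name, used in usage.items() if used > limit)


def respond(usage, capacity, tolerance=1.5):
    """Respond: propose a cap for each offender, with no human in the loop."""
    limit = fair_share(capacity, usage) * tolerance
    return {name: limit for name in discover_offenders(usage, capacity, tolerance)}


if __name__ == "__main__":
    # Monitor: one sample of per-workload resource usage (made-up numbers).
    sample = {"etl": 60.0, "adhoc": 20.0, "reports": 10.0}
    print(respond(sample, capacity=90.0))
```

In a real cluster this check would run continuously against live metrics. The point of the sketch is only that the decision is made by software in the time between two samples, rather than by an operator reading a dashboard.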