The document summarizes the 19th ACM HPDC conference and VTDC workshop held in 2010. It provides an overview of the accepted papers and talks, including the topics covered, presenters and their affiliations, and high-level discussions. Key areas included distributed storage systems, virtualization technologies, data-intensive computing, workflows, and cloud/grid resources.
2. HPDC 2010
• June 20-25, 2010
• Full paper acceptance: 25% (23/91); short paper acceptance: 51% (22/43)
• 100
http://hpdc2010.eecs.northwestern.edu/
3. HPDC 2010 @ Chicago
• Workshops
  – ScienceCloud: Workshop on Scientific Cloud Computing
  – Emerging Computational Methods for the Life Sciences
  – MDQCS: Managing Data Quality for Collaborative Science
  – Large-Scale System and Application Performance
  – CLADE: Challenges of Large Applications in Distributed Environments
  – DIDC: Data Intensive Distributed Computing
  – MAPREDUCE: MapReduce and its Applications
  – VTDC: Virtualization Technologies for Distributed Computing
• Co-located event
  – Open Grid Forum (OGF 29)
4. VTDC Invited Talks
• Virtualization Technologies in Distributed Architecture: The Grid5000 Recipe
  Adrien Lebre (EMN)
  – 2008
    • Xen MAC/IP
  – Sky computing based Nimbus Saline HiperNet VMScript Shrinker
• An Introduction to the V3VEE Project and the Palacios Virtual Machine Monitor
  Peter Dinda (Northwestern Univ.)
  – HPC VMM
  – Palacios + Kitten (Sandia LWK) Type-I VMM
    • Virtualized RedStorm (Cray XT3) [J. Lange, IPDPS 2010]
  – Virtuoso (not Virtuozzo)
• Future Grid: Supporting Next Generation Data Intensive Cyberinfrastructure
  Geoffrey Fox (Indiana Univ.)
  – TeraGrid IU
5. VTDC (1/2)
• Cluster-Wide Context Switch of Virtualized Jobs
  Fabien Hermenier, Adrien Lebre, Jean-Marc Menaud (INRIA)
  – VM
  – CWCS: VM
• Pools of Virtual Boxes: Building Campus Grids with Virtual Machines
  David Herzfeld, Lars Olson, Craig Struble (Marquette University)
  – Windows host Condor pool VirtualBox
  – 300
• Janus: A Cross-Layer Soft Real-Time Architecture for Virtualization
  Raoul Rivas, Ahsan Arefin, Klara Nahrstedt (University of Illinois at Urbana-Champaign)
  – Xen
  – VMM RT scheduler OS RT task RT task VCPU 1 1
6. VTDC (2/2)
• DistriBit: A Distributed Dynamic Binary Translator System for Thin Client Computing
  Haibing Guan, Yindong Yang, Kai Chen (Shanghai Jiao Tong University), Yindong Ge, Liang Liu, Ying Chen (IBM Research-China)
  – DBT 2
  – DBT CrossBit
• Scaling Virtual Organization Clusters over a Wide Area Network using the Kestrel Workload Management System
  Lance Stout, Michael Fenn, Michael Murphy, Sebastien Goasguen (Clemson University)
  – Kestrel: XMPP
  – IPOP/Condor
• Storage Deduplication for Virtual Ad Hoc Network Testbed By File-Level Block Sharing
  Chang-Han Jong (University of Maryland), Cho-Yu Jason Chiang, Taichuan Lu, Alexander Poylisher, Constantin Serban (Telcordia Technologies)
  – FS dedup dedup
  – Xen iSCSI Storage VM
7. HPDC Keynote
• How Not to Think about Parallel Programming
  Guy Steele Jr. (Sun/Oracle)
  – Moore’s Law Jack Dongarra 2 2
  – Accumulators are BAD. Divide-and-conquer is GOOD
    • sum = 0
    • Google MapReduce Reduce
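
The "accumulators vs. divide-and-conquer" point above can be illustrated with a small sketch. It assumes the `sum = 0` fragment refers to the usual running-total loop; the function names are hypothetical and not taken from the talk. The accumulator version forces a left-to-right order because every step depends on the previous total, while the divide-and-conquer version splits the input into two independent halves that could, in principle, be evaluated in parallel and then combined, which is the same shape as a reduce step with an associative operator.

```python
def sum_accumulator(xs):
    """Running-total loop: each iteration depends on the previous one."""
    total = 0  # the "sum = 0" accumulator
    for x in xs:
        total += x  # strictly sequential dependency chain
    return total


def sum_divide_and_conquer(xs, lo=0, hi=None):
    """Tree-shaped sum over xs[lo:hi].

    The two recursive calls share no state, so a parallel runtime could
    evaluate them concurrently. This sketch runs them sequentially and only
    shows the structure. It relies on + being associative.
    """
    if hi is None:
        hi = len(xs)
    n = hi - lo
    if n == 0:
        return 0  # identity element for +
    if n == 1:
        return xs[lo]
    mid = lo + n // 2
    left = sum_divide_and_conquer(xs, lo, mid)   # independent subproblem
    right = sum_divide_and_conquer(xs, mid, hi)  # independent subproblem
    return left + right  # combine step


if __name__ == "__main__":
    data = [1, 2, 3, 4, 5]
    print(sum_accumulator(data), sum_divide_and_conquer(data))
```

Both functions return the same value for integer input; the difference is only in the dependency structure, a chain of length n versus a tree of depth about log2(n).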
• Data Intensive Scalable Computing
  Randal Bryant (CMU)
  – HPC DISC
  – MapReduce MapReduceMerge Dryad FlumeJava
• XXX
  Robert Harrison (ORNL)
8. S1: Best Papers
• Horizon: Efficient Deadline-Driven Disk I/O Management for Distributed Storage Systems
  Anna Povzner (UCSC), Darren Sawyer (NetApp), Scott Brandt (UCSC)
  – Deadline sensitive SCAN I/O scheduler
    • SSD
  – NetApp ONTAP
• Run-time Optimizations for Replicated Dataflows on Heterogeneous Environments
  George Teodoro (Universidade Federal de Minas Gerais), Timothy Hartley, Umit Catalyurek (Ohio State University), Renato Ferreira (Universidade Federal de Minas Gerais)
  – CPU GPU
  – CPU GPU
9. S2: Workflows
• DataSpaces: An Interaction and Coordination Framework for Coupled Simulation Workflows
  Ciprian Docan, Manish Parashar (Rutgers), Scott Klasky (Oak Ridge National Lab)
  – DART (Decoupled and Asynchronous Remote Data Transfer) + DHT
• ParaTrac: A Fine-Grained Profiler for Data-Intensive Workflows
  Nan Dun, Kenjiro Taura, Akinori Yonezawa (University of Tokyo)
  – DAG
• Performance Analysis of Dynamic Workflow Scheduling in Multicluster Grids
  Ozan Sonmez, Nezih Yigitbasi, Saeid Abrishami, Alexandru Iosup, Dick Epema (Delft University of Technology)
  – 7
10. S3: Resources and Clouds
• Software Architecture Definition for On-demand Cloud Provisioning
  Clovis Chapman, Wolfgang Emmerich (University College London), Fermin Galan Marquez (Telefonica I+D), Stuart Clayman, Alex Galis (University College London)
  – FP7 RESERVOIR (Resources and Services Virtualization without Barriers)
• High Occupancy Resource Allocation for Grid and Cloud systems, a Study with DRIVE
  Kyle Chard, Kris Bubendorfer, Peter Komisarczuk (Victoria University of Wellington)
• Highly Available Component Sharing in Large-Scale Multi-Tenant Cloud Systems
  Juan Du, Xiaohui Gu, Douglas Reeves (North Carolina State University)
11. S4: MapReduce and Debugging
• MOON: MapReduce On Opportunistic eNvironments
  Heshan Lin (Virginia Tech), Xiaosong Ma (North Carolina State University and Oak Ridge National Lab), Jeremy Archuleta, Wu-chun Feng, Mark Gardner (Virginia Tech), Zhe Zhang (Oak Ridge National Lab)
  – Volatile PC PC
• MRAP: A Novel MapReduce-based Framework to Support HPC Analytics Applications with Access Patterns
  Saba Sehrish, Grant Mackey, Jun Wang (University of Central Florida), John Bent (Los Alamos National Lab)
  – splitter …
  – HPC MapReduce
• Data Centric Highly Parallel Debugging
  David Abramson, Minh Ngoc Dinh, Donny Kurniawan (Monash University), Bob Moench, Luiz DeRose (Cray)
12. S5: Data Centers and Virtualization
• Thermal Aware Server Provisioning For Internet Data Centers
  Zahra Abbasi, Georgios Varsamopoulos, Sandeep Gupta (Arizona State University)
• I/O Scheduling Model of Virtual Machine Based on Multi-core Dynamical Partitioning
  Yanyan Hu, Xiang Long, Jiong Zhang, Jun He, Li Xia (Beihang University)
  – IO CPU Xen
• A Practical Way to Extend Shared Memory Support Beyond a Motherboard at Low Cost
  Hector Montaner, Federico Silla, Jose Duato (Universitat Politècnica de València)
  – HyperTransport
    • RMC
  – HTX
13. S6: Storage and I/O
• A GPU Accelerated Storage System
  Samer Al-Kiswany, Abdullah Gharaibeh, Sathish Gopalakrishnan, Matei Ripeanu (University of British Columbia)
  – CAS GPU
  – CrystalGPU
• Computation Mapping for Multi-Level Storage Cache Hierarchies
  Mahmut Kandemir, Sai Muralidhara, Mustafa Karakoy (Pennsylvania State University), Seung Woo Son (Argonne National Lab)
  – IO
  – IO loop iteration distribution
• Cashing in on Hints for Better Prefetching and Caching in PVFS and MPI-IO
  Christina Patrick, Mahmut Kandemir (Pennsylvania State University), Mustafa Karaköy (Imperial College), Seung Woo Son (Argonne National Lab), Alok Choudhary (Northwestern University)
  – I/O I/O
14. S7: Applications and Provenance
• Dimension Reduction and Visualization of Large High-Dimensional Data via Interpolation
  Seung-Hee Bae, Jong Youl Choi, Xiaohong Qiu, Geoffrey Fox (Indiana University)
• New Caching Techniques for Web Search Engines
  Mauricio Marin, Veronica Gil-Costa, Carlos Gomez-Pantoja (Yahoo! Research Latin America)
  – Broker Master location cache search node Slave Top-K cache
• Mendel: Efficiently Verifying the Lineage of Data Modified in Multiple Trust Domains
  Ashish Gehani, Minyoung Kim (SRI International)
  – Data provenance lineage
15. S8: Communication and Scheduling
• PV-EASY: A Strict Fairness Guaranteed and Prediction Enabled Scheduler in Parallel Job Scheduling
  Yulai Yuan, Guangwen Yang, Yongwei Wu (Tsinghua University)
  – EASY backfilling
• XCo: Explicit Coordination to Prevent Network Fabric Congestion in Cloud Computing Cluster Platforms
  Vijay Shankar Rajanna, Smit Shah, Anand Jahagirdar, Kartik Gopalan (SUNY Binghamton)
  – TCP incast/short flows
• Scalability of Communicators and Groups in MPI
  Humaira Kamal, Seyed Mirtaheri, Alan Wagner (University of British Columbia)
  – FG-MPI (Fine-Grain MPI) MPICH2
    • MPI proclet