SUNDAY:


HPC databases workshop:

rasdaman:

   • adding arrays to SQL queries
   • array query operators
          • general array constructor
          • subset trim & slice
          • array nest/unnest
          • matrix multiplication
          • histograms
          • formal encoding (e.g. C, C++, Java arrays)
          • nested queries
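A minimal numpy sketch of the operator semantics above (my own illustration, not rasdaman's rasql syntax): a trim keeps the rank, a slice/section drops a dimension, and matrix multiplication and histograms are ordinary array operations.

```python
import numpy as np

a = np.arange(24).reshape(4, 6)   # small 2-D array ("general array constructor")

trim = a[1:3, 2:5]                # trim: subinterval per dimension, rank preserved
sect = a[2, :]                    # slice (section): fix one index, rank drops by one

hist = np.histogram(a, bins=4)[0] # histogram over all cell values

b = np.arange(18).reshape(6, 3)
prod = a @ b                      # matrix multiplication as an array operation

print(trim.shape, sect.shape, prod.shape)  # (2, 3) (6,) (4, 3)
```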
   • storage mapping: variants
          • coordinate-free sequence
          • BLOBs
          • ROLAP
          • imaging multidimensional OLAP
   • tiled array storage
          • regular
          • directional
          • area of interest
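Regular tiling is easy to sketch; the following is my own illustration with arbitrary tile sizes, not rasdaman's storage code. Edge tiles shrink when the array shape is not a multiple of the tile size.

```python
import numpy as np

def regular_tiles(a, th, tw):
    """Split a 2-D array into fixed-size tiles (regular tiling);
    edge tiles may be smaller than th x tw."""
    tiles = {}
    for i in range(0, a.shape[0], th):
        for j in range(0, a.shape[1], tw):
            tiles[(i, j)] = a[i:i + th, j:j + tw]
    return tiles

a = np.arange(36).reshape(6, 6)
tiles = regular_tiles(a, 4, 4)
print(sorted(tiles))          # [(0, 0), (0, 4), (4, 0), (4, 4)]
print(tiles[(0, 4)].shape)    # (4, 2) -- an edge tile
```

Directional and area-of-interest tiling differ only in how the tile boundaries are chosen, not in this basic mechanism.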
   • In-Situ Databases
          • approach: reference external files
          • related: SciQL
   • adding tertiary storage
          • tapes
          • problem: spatial clustering
          • approach: super-tiles = all of the particular index nodes (Reiner 2001)
   • Query processing
          • optimization 1: query rewriting
          • optimization 2: JIT compilation
                  • approach: cluster suitable ops
                  • compile & dynamically bind
                  • benefit: speed up complex, repeated operations
                  • variation: compile code for GPU
   • Intra operator parallelization
          • ...too fast
   • query processing in a federation
          • query splitting
          • work in progress
   • examples
          • human brain imaging
          • gene expression analysis (impressive db queries) -> output JPEG, correlations, ..
          • geo service standardization (OGC, SIC)
   • use cases/ e.g.:
          • satellite imaging
          • 3D clients/visualization
   • history of array DBMSs
          • array as table
• conclusion
          • awesome for science and so on..

Need the slides: lots of enhanced SQL statement examples.


Energy Efficient HPC:

A lot of information in the slides and talk (graphs, etc.).
Extremely interesting. You should read the slides yourself, if you are interested:
http://eehpcwg.lbl.gov/documents


Data-aware networking workshop:

GridFTP (Fatih University, TR):

https://sites.google.com/a/lbl.gov/ndm2012/home/accepted-papers (first one)

    • intro: pipelining, parallelism, concurrency
    • pipelining:
           • useful for large number of small files
           • higher throughputs on small files (1MB)
           • nr. of files affects total throughput but not the optimal pipelining level
           • throughput increases as number of files increases,..
           • BDP = BW * RTT, the optimal window size (pfo)
           • ....
    • parallelism:
           • when the buffer size is too small compared to the BDP
           • advantageous with large files
    • concurrency:
           • advantages over parallelism:
                   • parallelism deteriorates performance with small files (use pipelining)
                   • concurrency + pipelining has better perf. than concurrency + parallelism + pipelining
                   • small RTT: quicker ascent to the peak throughput
                   • ...
    • rules of thumb:
           • always use pipelining
                   • set different levels
           • keep chunks as big as possible
           • use concurrency with pipelining w. small files and small # files
           • add parallelism to concurrency and pipelining with bigger files
           • use parallelism when # files is insufficient to feed BDP
    • recursive chunk size division
           • mean-based algorithm to construct clusters of files with different optimal pipelining levels
           • calculate the optimal pipelining level by dividing the BDP by the mean file size of the chunk
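The BDP rule and the pipelining-level calculation above can be sketched as follows (my own illustration; the example bandwidth, RTT, and file size are made up):

```python
import math

def bdp_bytes(bw_gbps, rtt_ms):
    # bandwidth-delay product: BDP = BW * RTT, in bytes
    return bw_gbps * 1e9 / 8 * (rtt_ms / 1e3)

def pipelining_level(bdp, mean_file_size):
    # rule from the talk: divide the BDP by the mean file size of the chunk
    return max(1, math.ceil(bdp / mean_file_size))

bdp = bdp_bytes(10, 100)                 # 10 Gbps, 100 ms RTT
print(bdp)                               # 125000000.0 (125 MB in flight)
print(pipelining_level(bdp, 1 << 20))    # 120 for 1 MB files
```

The clustering step would group files whose sizes yield the same level, so each chunk gets one pipelining setting.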
    • results
           • awesome (slides needed, graphs and so on,..)

Sandhya Narayan, Hadoop acceleration in an OpenFlow-based cluster:

    • overview of SDN/openflow
          • use case: hadoop
    • hadoop overview
             • hadoop acceleration approaches (usual stuff)
             • overview mapreduce pipeline (ibid)
             • overview of hadoop network traffic (ibid)
    • floodlight as openflow controller
    • openflow switch: openvswitch and link (research link)
    • queues in openflow (for different bandwidths: 50 Mbps, 200 Mbps, ..)
    • improvement in latency due to BW queues
    • conclusion: SDN is awesome, but we don't use much of it now.
    • further work: QoS, dynamic hadoop flows
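For reference, a sketch of what such bandwidth queues look like on Open vSwitch (rates match the talk's 50/200 Mbps example; the port and bridge names and the flow match are placeholders, and this is my reconstruction, not the paper's actual setup):

```shell
# Two HTB queues on a port, capped at 50 Mbps and 200 Mbps
ovs-vsctl set port eth0 qos=@newqos -- \
  --id=@newqos create qos type=linux-htb queues:0=@q0 queues:1=@q1 -- \
  --id=@q0 create queue other-config:max-rate=50000000 -- \
  --id=@q1 create queue other-config:max-rate=200000000

# An OpenFlow controller (e.g. Floodlight) then steers selected Hadoop
# flows into a queue; done by hand it would look like:
ovs-ofctl add-flow br0 "priority=100,ip,nw_dst=10.0.0.2,actions=set_queue:1,normal"
```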

no news there.


Mehmet Balman, Streaming Exa Scale data over 100Gbps Networks:

    • lots-of-small-files problem! file-centric tools are not high speed; latency is still a problem
    • framework for a memory-mapped network channel (MemzNet)
           • blocks
           • memory caches are logically mapped between client and server
           • advantages:
                  • decoupling i/o and network ops (front/backend)
                  • not limited by file size characteristics
                  • moving climate files efficiently (gridftp, fopen,..)
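The decoupling idea can be sketched as a bounded block cache between an I/O front-end and a network back-end (my own toy illustration, not MemzNet's implementation):

```python
import os
import queue
import tempfile
import threading

BLOCK_SIZE = 4 * 1024 * 1024           # 4 MB blocks, as in the SC11 demo

def reader(path, blocks):
    """I/O front-end: chops a file into fixed-size blocks, so the
    transfer is not limited by file-size characteristics."""
    with open(path, "rb") as f:
        while True:
            data = f.read(BLOCK_SIZE)
            if not data:
                break
            blocks.put(data)
    blocks.put(None)                    # end-of-stream marker

def sender(blocks, send):
    """Network back-end: drains the block cache, decoupled from file layout."""
    while True:
        data = blocks.get()
        if data is None:
            break
        send(data)

blocks = queue.Queue(maxsize=256)       # bounded cache between the two halves
out = []
t = threading.Thread(target=sender, args=(blocks, out.append))
t.start()

# demo: a 4 MB + 1 byte file yields exactly two blocks
tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"x" * (BLOCK_SIZE + 1))
tmp.close()
reader(tmp.name, blocks)
t.join()
os.unlink(tmp.name)
print(len(out))                         # 2
```

In the real system the cache lives in shared memory mapped between client and server; a thread queue just makes the front/back-end split visible.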
    • SC11 100Gbps demo
           • CMIP3 data (35 TB) over GPFS at NERSC
           • block size 4 MB
           • each block's data section was aligned to the system page size
           • 1 GB cache
           • testbed overview:
                  • many tcp streams
                  • effect: very high CPU usage
    • MemzNet's performance (buffer size 5 MB)

No new information at all.



MONDAY:


parallel storage workshop:

keynote (eric barton)

    • http://www.pdsw.org/keynote.shtml
    • http://www.pdsw.org/pdsw12/slides/keynote-FF-IO-Storage.pdf

poster sessions
  slides and papers available online: http://www.pdsw.org/index.shtml

slides (papers if no slides available at the time):
   1. http://www.pdsw.org/pdsw12/papers/he-pdsw12.pdf
   2. http://www.pdsw.org/pdsw12/slides/crume-slides-pdsw12.pdf
   3. http://www.pdsw.org/pdsw12/papers/grawinkle-pdsw12.pdf - no slides yet
   4. http://www.pdsw.org/pdsw12/papers/kim-pdsw12.pdf - no slides yet
   5. http://www.pdsw.org/pdsw12/slides/jwchoi_sc_SAN.pdf
   6. http://www.pdsw.org/pdsw12/slides/ren-tablefs_giga_pdsw.pdf
   7. http://www.pdsw.org/pdsw12/papers/goodell-pdsw12.pdf - no slides yet
   8. http://www.pdsw.org/pdsw12/slides/watkins-datamods-pdsw12.pdf
   9. http://www.pdsw.org/pdsw12/papers/carns-pdsw12.pdf - no slides (yet?)


HFT workshop:

http://www.cs.usfca.edu/~mfdixon/whpcf12/whpcf_12_program.html

2nd keynote - nvidia (john ashley) - how not to be roadkill

    • overview
    • background: EE, realtime data, big data, datamining, geospatial,..
    • drivers - power and heat
    • drivers - financial regulators
    • drivers - the world as we don't know it:
           • no arch. for everything, multi-arch
           • hadoop isn't the answer to everything
           • need to optimize cost and risk
           • need tools and techniques to implement across heterogeneous solutions
           • need metrics to identify tradeoffs
                   • example:
                           • hanweck - reduced capital expenditure 10x, operating expenditure 13x
                           • citadel - each GPU saves 180.6K USD / year
                           • JPMC - 80 percent operating expenditure savings through GPUs
    • drivers - information advantage
           • is knowledge power?
                   • profit = f(knowledge, capital, capability)
                   • low latency/hft teams know this,..
           • knowing what your competition does
           • are you in the red with respect to capability to price and risk deals,..
                   • analytically? better models? faster?
                   • computationally? new technology -> time to market
           • JPMorgan runs GPUs for risk analysis
    • crossing the road w/o getting hit
           • technology
                   • no longer hw agnostic
                   • heterogeneous
                   • suitable
                   • data is the new bottleneck
    • skills
           • parallel thinking
                   • data awareness
                   • multi-paradigm, multi-programming
                   • experimentalism
                   • hft guys are into all of this and so on,...
           • parallel thinking
                   • chunking work
                           • distribution
                           • tiling
                   • cyclic reduction, parallel solvers, swarm optimization, monte carlo
                   • numerical issues
                   • awareness of discrete math issues, SP/DP
                   • numerical stability, async. algos, red/black coloring, multi-level grid solvers
    • data awareness
           • not just hadoop
           • efficient organization and delivery of data to compute is key
           • dataflow programming is key
           • hpc programmers already know this
           • examples:
                  • structure of arrays vs array of structures, esp. as vector units get wider
                  • tiled algorithms vs naive algorithms drastically improve performance
           • some firms still believe that language-optimized and hardware-aware programming is wrong
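The SoA-vs-AoS point can be seen directly in numpy: the same field is contiguous in a structure-of-arrays layout but strided in an array-of-structures layout (illustration only; field names are made up):

```python
import numpy as np

n = 100_000
# array of structures: fields interleaved per record
aos = np.zeros(n, dtype=[("price", "f8"), ("qty", "f8"), ("ts", "f8")])
# structure of arrays: each field is its own contiguous array
soa_price = np.zeros(n)

aos["price"][:] = 1.0
soa_price[:] = 1.0

# the SoA field walks memory 8 bytes at a time; the AoS field view
# must skip the whole 24-byte record, which defeats wide vector units
print(soa_price.strides)        # (8,)
print(aos["price"].strides)     # (24,)
```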
    • experimentalism
           • innovate
           • avoid analysis paralysis
           • define relevant metrics, collect them, and then act
    • STAC-A2: a benchmark focused on metrics and the business problem
           • can be used to compare a range of innovative potential solutions
           • gives free rein to parallel and data-sensitive computing
    • case study
           • CARMA: standalone ARM + GPU micro-server dev kit; GPU attached over a narrow PCIe link
                  • monte carlo based
                  • MPI
                  • CARMA rocks for HFT
                  • speed
                  • low power consumption

SC12 workshop writeup
