The impact of supercomputers on MSR

MSR Next Generation 2014@HKUST

Published in: Data & Analytics

  1. 1. The impact of supercomputers on MSR. Y. Kamei, A. Osaka, C. Huang, N. Ubayashi. MSR Next Generation 2014@HKUST
  2. 2. Who am I? ❖ Yasutaka Kamei http://posl.ait.kyushu-u.ac.jp/~kamei/ ❖ My research interests are • Understanding OSS Collaboration • Improving Software Quality • Scaling up MSR Analysis [Figure labels: Summer, Winter]
  3. 3. Today... ❖ Bring messages from the HPC community to the MSR community. • Make use of High Performance Computing (HPC) in MSR.
  4. 4. 2014: A Space Odyssey ❖ MSR researchers will soon explore treasure in the Universe. [Timeline: 2004, 2014]
  5. 5. 2014: A Space Odyssey ❖ MSR researchers will soon explore treasure in the Universe. • "Diversity in software engineering research" @ FSE 2013: 20,028 projects as the Universe [Timeline: 2004, 2014]
  6. 6. 2014: A Space Odyssey ❖ MSR researchers will soon explore treasure in the Universe. • "Diversity in software engineering research" @ FSE 2013: 20,028 projects as the Universe • Challenges in Mining Whole Software Universe [Timeline: 2004, 2014]
  7. 7. One solution is ❖ Supercomputer ❖ In the case of FX10, • CPU: 16 cores per node • Memory: 32 GB per node • 4,800 nodes
  8. 8. However… ❖ The adoption rate for HPC is still low. • "Domain-specific techniques for using HPC?" • "Only Fortran and C?" • "My tool is implemented by ..."
  9. 9. Prof. Chiba says ❖ Through collaboration in the CREST project: "We can use Java, Ruby and Python on FX10!"
  10. 10. Case Study ❖ Evaluate the impact that HPC can have on MSR analyses. ❖ Apply HPC (FX10) to Code Clone Detection.
  11. 11. Code Clone ❖ A code fragment that has identical or similar code fragments (Hotta et al., CSMR 2012). [Figure: copy and paste produces clone fragments, which together form a code clone.]
  12. 12. Type-3 Clones ❖ Programmers often make some changes to code fragments after copy-and-paste (Zhang et al., ICSM 2012). Example: final public void daload() { countLabels = 0; try { position++; bCodeStream[i++] = OPC_daload; } catch (Exception e) { resizeByteArray(OPC_daload); } }
  13. 13. Type-3 Clones ❖ Programmers often make some changes to code fragments after copy-and-paste (Zhang et al., ICSM 2012). Original: final public void daload() { countLabels = 0; try { position++; bCodeStream[i++] = OPC_daload; } catch (Exception e) { resizeByteArray(OPC_daload); } } After copy-and-paste: final public void daload() { countLabels = 0; try { position++; bCodeStream[i++] = OPC_daload; } catch (Exception e) { resizeByteArray(OPC_daload); } }
  14. 14. Type-3 Clones ❖ Programmers often make some changes to code fragments after copy-and-paste (Zhang et al., ICSM 2012). Original: final public void daload() { countLabels = 0; try { position++; bCodeStream[i++] = OPC_daload; } catch (Exception e) { resizeByteArray(OPC_daload); } } After copy-and-paste, with a gap (added code fragment): final public void daload() { countLabels = 0; stackDepth += 2; if (stackDepth > stackMax) stackMax = stackDepth; try { position++; bCodeStream[i++] = OPC_daload; } catch (Exception e) { resizeByteArray(OPC_daload); } } The two fragments are Type-3 clones.
  15. 15. Our collaborator ❖ Dr. Keisuke Hotta • Postdoc, Osaka University, Japan • Visiting Researcher, Bremen University, Germany ❖ Helps our group use Scorpio (a jar file), a PDG-based Type-3 clone detection tool.
  16. 16. Case Study Setting ❖ Environment • Desktop 1: Intel® Core™ i7, 16 GB memory per node, 12 cores × 1 node • Desktop 2: Xeon E5-2630 v2, 144 GB memory per node, 12 cores × 1 node • FX10: SPARC64™ IXfx, 32 GB memory per node, 16 cores × 190 nodes ❖ Dataset • Apache CXF • LOC: 830K • Size: 150 MB
  17. 17. FX10 is much faster! Time: 127h28m 42s (Desktop 1), 2h15m (Desktop 2), 16m58s (FX10)
  18. 18. How to run Scorpio on FX10 ❖ Only 20-30 lines of (bash) code are needed to run Scorpio on FX10. #!/bin/bash #PJM -L "rscgrp=debug" #PJM -L "node=190" #PJM -L "elapse=30:00" #PJM -j #PJM -S module load Java … java scorpio.jar • How many nodes do we use? (node=190) • How long do we use FX10? (elapse=30:00) • What are output options? (-j, -S)
  19. 19. Our current challenges • Apache CXF: 6,000 files (Done) • All Apache projects: 770,000 files (Doing) • UCI Dataset: 390,000,000 files (ToDo)
  20. 20. Recap of earlier slides: Today... • 2014: A Space Odyssey • Challenges in Mining Whole Software Universe • Case Study • FX10 is much faster!
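
The job script from slide 18 can be sketched as a small bash helper that writes the script to a file. This is only an illustration under stated assumptions: the helper name `write_job_script` and the output file name `scorpio_job.sh` are hypothetical, the `-jar` flag on the launch line is an assumption (the slide abbreviates it as `java scorpio.jar`), and Scorpio's own arguments are elided on the slide, so they are left as a placeholder comment.

```shell
#!/bin/bash
# Sketch: generate a PJM job script like the one shown on slide 18.
# Only the #PJM directives that appear on the slide are used.

# write_job_script NODES ELAPSE OUTFILE   (hypothetical helper)
#   NODES   answers "How many nodes do we use?"
#   ELAPSE  answers "How long do we use FX10?"
write_job_script() {
  local nodes="$1" elapse="$2" outfile="$3"
  cat > "$outfile" <<EOF
#!/bin/bash
#PJM -L "rscgrp=debug"
#PJM -L "node=${nodes}"
#PJM -L "elapse=${elapse}"
#PJM -j
#PJM -S
module load Java
# Scorpio's arguments are not given on the slide; add them to the next line.
# The -jar flag is an assumption: the slide shows only "java scorpio.jar".
java -jar scorpio.jar
EOF
}

# Values from the case study: 190 nodes, 30 minutes elapsed-time limit.
write_job_script 190 "30:00" scorpio_job.sh
```

The generated file would then be handed to the job scheduler on the FX10 login node. Keeping the node count and time limit as parameters makes it easy to rerun the same job at a different scale, for example when moving from Apache CXF to all Apache projects.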
