1. The impact of supercomputers on MSR
Y. Kamei, A. Osaka, C. Huang, N. Ubayashi
MSR Next Generation 2014@HKUST
2. Who am I?
❖ Yasutaka Kamei
http://posl.ait.kyushu-u.ac.jp/~kamei/
❖ My research interests are
Summer Winter
• Understanding OSS Collaboration
• Improving Software Quality
• Scaling up MSR Analysis
3. Today...
❖ Bring messages from the HPC community to the MSR community.
• Make use of High Performance Computing (HPC) in MSR.
HPC MSR
4. 2014: A Space Odyssey
❖ MSR researchers will soon explore treasure in the Universe.
2004 2014
5. 2014: A Space Odyssey
❖ MSR researchers will soon explore treasure in the Universe.
Diversity in software engineering research @ FSE 2013
20,028 projects as the Universe
2004 2014
6. 2014: A Space Odyssey
❖ MSR researchers will soon explore treasure in the Universe.
Diversity in software engineering research @ FSE 2013
20,028 projects as the Universe
Challenges in Mining Whole Software Universe
2004 2014
7. One solution is
❖ Supercomputer
❖ In the case of FX10,
• CPU: 16 cores per node
• Memory: 32 GByte per node
• Nodes: 4,800
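To give a feel for the scale, the FX10 figures on this slide can be multiplied out. This is only a minimal sketch: it assumes the 16 cores and 32 GByte are per-node values, and the constant and function names are hypothetical.

```python
# Figures as stated on the slide (assumed to be per-node values).
CORES_PER_NODE = 16
MEMORY_PER_NODE_GBYTE = 32
TOTAL_NODES = 4_800


def capacity(nodes: int) -> tuple[int, int]:
    """Return (cores, memory in GByte) available on the given number of nodes."""
    return nodes * CORES_PER_NODE, nodes * MEMORY_PER_NODE_GBYTE


if __name__ == "__main__":
    cores, memory = capacity(TOTAL_NODES)
    print(f"{TOTAL_NODES:,} nodes: {cores:,} cores, {memory:,} GByte")
```

Under that assumption, the full machine offers 76,800 cores and 153,600 GByte of memory in total.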
8. However…
❖ The adoption rate for HPC is still low.
• Domain-specific techniques for using HPC?
• Only Fortran and C?
• My tool is implemented by
9. Prof. Chiba says
❖ Through collaboration in the CREST project, we can use Java, Ruby, and Python on FX10!
10. Case Study
❖ Evaluate the impact that HPC can have on MSR analyses.
❖ Apply HPC (FX10) to Code Clone Detection.
11. Code Clone
❖ A code fragment that is identical or similar to another code fragment.
(Figure: copy and paste creates clone fragments; a set of clone fragments forms a code clone.)
Hotta et al. CSMR 2012
12. Type-3 Clones
❖ Programmers often make some changes to code fragments after copy-and-paste.
Zhang et al. ICSM 2012
final public void daload() {
    countLabels = 0;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
13. Type-3 Clones
❖ Programmers often make some changes to code fragments after copy-and-paste.
Zhang et al. ICSM 2012
final public void daload() {
    countLabels = 0;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
copy-and-paste:
final public void daload() {
    countLabels = 0;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
14. Type-3 Clones
❖ Programmers often make some changes to code fragments after copy-and-paste.
Zhang et al. ICSM 2012
final public void daload() {
    countLabels = 0;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
copy-and-paste:
final public void daload() {
    countLabels = 0;
    // gap: code fragment added after copy-and-paste
    stackDepth += 2;
    if (stackDepth > stackMax)
        stackMax = stackDepth;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
The two fragments are Type-3 clones.
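The gap between the two fragments can be made concrete with a line-level diff. The sketch below is only an illustration of what a gap is; it is not how Scorpio works (Scorpio is PDG-based), and `find_gaps` is a hypothetical helper built on Python's standard `difflib`. The `>` in the `if` condition is an assumption, since the operator is missing from the slide text.

```python
import difflib

# Original fragment, as shown on the slide.
ORIGINAL = """\
final public void daload() {
    countLabels = 0;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
""".splitlines()

# Pasted copy with the added code fragment (the `>` operator is assumed).
MODIFIED = """\
final public void daload() {
    countLabels = 0;
    stackDepth += 2;
    if (stackDepth > stackMax)
        stackMax = stackDepth;
    try {
        position++;
        bCodeStream[i++] = OPC_daload;
    } catch (Exception e) {
        resizeByteArray(OPC_daload);
    }
}
""".splitlines()


def find_gaps(original, modified):
    """Return (gaps, similarity) for two fragments.

    gaps is a list of runs of lines present only in `modified`;
    similarity is difflib's ratio over whitespace-stripped lines.
    """
    a = [line.strip() for line in original]
    b = [line.strip() for line in modified]
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    gaps = [
        b[j1:j2]
        for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
        if tag == "insert"
    ]
    return gaps, matcher.ratio()


if __name__ == "__main__":
    gaps, similarity = find_gaps(ORIGINAL, MODIFIED)
    for gap in gaps:
        print("gap:", gap)
    print(f"similarity: {similarity:.2f}")
```

A textual diff like this only finds inserted lines in otherwise identical code; it would miss clones whose statements were reordered or renamed, which is one reason to use a dedicated clone detection tool.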
15. Our collaborator
❖ Dr. Keisuke Hotta
• Postdoc, Osaka University, Japan
• Visiting Researcher, Bremen University, Germany
❖ Helps our group use Scorpio (a jar file), a PDG-based (program dependence graph) Type-3 clone detection tool.
17. FX10 is much faster!
Time:
• Desktop 1: 127h28m42s
• Desktop 2: 2h15m
• FX10: 16m58s
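The size of the gap is easier to judge as a speedup factor. The following is a minimal sketch that assumes the chart labels map to the bars in order (Desktop 1: 127h28m42s, Desktop 2: 2h15m, FX10: 16m58s); `to_seconds` and `speedup` are hypothetical helper names.

```python
import re

# Times read off the slide; the bar-to-label mapping is an assumption.
TIMES = {
    "Desktop 1": "127h28m42s",
    "Desktop 2": "2h15m",
    "FX10": "16m58s",
}


def to_seconds(text: str) -> int:
    """Convert a duration such as '2h15m' or '16m58s' to seconds."""
    match = re.fullmatch(r"(?:(\d+)h)?(?:(\d+)m)?(?:(\d+)s)?", text)
    if not text or match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    hours, minutes, seconds = (int(group or 0) for group in match.groups())
    return hours * 3600 + minutes * 60 + seconds


def speedup(baseline: str, improved: str) -> float:
    """Return how many times faster `improved` is than `baseline`."""
    return to_seconds(baseline) / to_seconds(improved)


if __name__ == "__main__":
    for name in ("Desktop 1", "Desktop 2"):
        factor = speedup(TIMES[name], TIMES["FX10"])
        print(f"FX10 vs {name}: about {factor:.0f}x faster")
```

Under that reading of the chart, FX10 finishes roughly 451 times faster than Desktop 1 and roughly 8 times faster than Desktop 2.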
18. How to run Scorpio on FX10
❖ Only 20-30 lines of (bash) code are needed to run Scorpio on FX10.
#!/bin/bash
#PJM -L "rscgrp=debug"
# How many nodes do we use?
#PJM -L "node=190"
# How long do we use FX10?
#PJM -L "elapse=30:00"
# What are output options?
#PJM -j
#PJM -S
module load Java
…⋯
java -jar scorpio.jar
19. Our current challenges
• Done: Apache CXF (6,000 files)
• Doing: Apache All Projects (770,000 files)
• ToDo: UCI Dataset (390,000,000 files)
20. Challenges in Mining Whole Software Universe
Recap of earlier slides:
❖ Today: make use of High Performance Computing (HPC) in MSR.
❖ 2014: A Space Odyssey: 20,028 projects as the Universe (Diversity in software engineering research @ FSE 2013).
❖ Case Study: apply HPC (FX10) to Code Clone Detection.
❖ FX10 is much faster! Desktop 1: 127h28m42s, Desktop 2: 2h15m, FX10: 16m58s.