Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
01
02
Let's start!
03
Background
Find new clients
Face big code bases
Need quick analysis
Need quick results
•
•
•
•
04
Code size
Google: 2 billions LOC
Facebook: 61 million LOC
Me: from 20K to 2M LOC
•
•
•
05
I hate reading...
06
It's life, Jim, but not as we know it!
07
Code is data!
08
Well, big data
09
Snapshot
10
Temporal
11
Who uses
SonarQube?
12
SonarQube
Don't get me wrong: I love it!
BUT... it's not that easy to get data in.
It needs to be tuned for each team.
It'...
First steps
14
Count the
lines!
15
Size does matter
Gives you an estimate (on how much reading is needed).
Most of the code bases are polyglot. Ratio between...
cloc
17
Usage
cloc ‐‐help
cloc ‐‐write‐lang‐def=lang.defs
01.
02.
18
Usage
cloc ‐‐csv 
     ‐‐quite 
     ‐‐progress‐rate=0
     ‐‐ignored=files.ignored 
     ‐‐exclude‐dir=test,build
     ‐‐...
Language definitions
Gradle
    filter remove_matches ^s*//
    filter remove_inline //.*$
    filter call_regexp_common C...
Where are the
pictures?
21
We have
stats, let's
plot them!22
Excel?
Probably not.
23
Pie charts are
boring!
24
Infographics!
25
d3.js
26
d3.js
Javascript library for data visualizations
Tons of examples
Many libraries built on top of d3.js
Several books
•
•
•...
Useful charts
Calendar View ­ https://bl.ocks.org/mbostock/4063318
Buble Chart ­ https://bl.ocks.org/mbostock/4063269
Hier...
Demo
29
Many alternatives
vis.js
raphael.js
sigma.js
many, many, many more
•
•
•
•
30
Tableau Public
31
Tableau Public
32
Tableau Public
33
Demo
34
Design your
own!
35
SVG + Inkscape
36
Home­made Inforgraphics
37
Code City
38
Code City
39
Implementations
eXcentia plugin for SonarQube
JSCity
•
•
40
Demo
41
Temporal
analysis
42
GitHub
43
GitHub
44
FishEye
45
Gource
Software projects are displayed by Gource as an animated tree with
the root directory of the project at its centre....
Launch gource
gource ‐s 0.1 ‐1280x720 
gource ‐‐output‐custom‐log log1.txt 
gource ‐s 0.1 ‐1280x720 log1.txt
01.
02.
03.
47
Log format
1444125624|Luke Daley|M|/ratpack‐core/.../WiretapPublisher.java
1444125624|Luke Daley|M|/ratpack‐core/.../Yield...
Demo
49
Code Maat
Code Maat is a command line tool used to mine and analyze data from
version­control systems
“
50
Launch Maat
git log ‐‐pretty=format:'[%h] %aN %ad %s' 
    ‐‐date=short ‐‐numstat ‐‐after=YYYY‐MM‐DD
01.
02.
51
Launch Maat
maat ‐l git.log ‐c git ‐a summary 
maat ‐l ratpack_evo.log ‐c git ‐a revisions > ratpack_freqs.csv
python scri...
Analysis types
abs­churn, age, author­churn, authors, communication, coupling,
entity­churn, entity­effort, entity­ownersh...
Demo
54
Let's talk big
data
55
ElasticSearch
Distributed, scalable, and highly available
Real­time search and analytics capabilities
Sophisticated RESTfu...
Index data
$ curl ‐XPUT 'http://localhost:9200/gitlog/commit/123345' ‐d '{
  "commitId" : "123345",
  "timestamp" : "2009‐...
Kibana
Flexible analytics and visualization platform
Real­time summary and charting of streaming data
Intuitive interface ...
Kibana
59
Demo
60
Structure
61
Structure101
62
Demo
63
Graphs are everywhere!
64
jQAssistant
jQAssistant is a QA tool which allows the definition and validation of
project specific rules on a structural ...
jQAssistant
jqassistant scan ‐f binaries
jqassistant server
01.
02.
66
Query your code
MATCH
  (class:Class)‐[:DECLARES]‐>(method:Method)
RETURN
  class.fqn, count(method) as Methods
ORDER BY
 ...
Query different versions
match
  (artifact:Artifact)‐[:CONTAINS]‐>(type:Type)
return
  artifact.fileName as Artifact, coll...
neo4j
69
Demo
70
Conclusion
71
Adam Tornhill
72
Hotspot Analysis
use hotspots to identify maintenance problems.
use hotspots for risk management.
hotspots point to code r...
Conclusion
Extract data from your code!
Visualize it and search for hot spots!
Search for new facts and knowledge!
Become ...
Next time...
75
when you...
76
push your
code
77
REMEMBER
78
79
Reading
material
80
Your Code as a Crime Scene
81
Metrics and Models in SQE
82
Software Metrics and Metrology
83
Links: images
http://abstrusegoose.com/432
http://camarenaphoto.tumblr.com/post/112238079516/its­life­jim­but­
not­as­we­k...
Links: tools
https://github.com/AlDanial/cloc
http://gource.io/
http://wettel.github.io/codecity­wof.html
https://github.c...
Links: tools
http://www.sonarqube.org/
https://www.atlassian.com/pt/software/fisheye/overview
https://www.elastic.co/produ...
Links: tools
https://github.com/ThoughtWorksStudios/saikuro_treemap
https://www.youtube.com/watch?v=iiIytERhV9o
https://co...
That's all!
88
Thank you!
89
90
Upcoming SlideShare
Loading in …5
×

Visualizing Java code bases for Jfokus 2017

400 views

Published on

Code lines are added at high speed? Too many developers? How do you see the big picture? What processes are going on with your large code base? This presentation will show how to leverage existing code analysis tools (Cloc, Structure101, SonarQube) and combine them with data storage (ElasticSearch, Neo4j) and visualization tools (Gource, D3, Inkscape, Kibana) to at least make some sense out of millions of code lines and their history. During his consulting work, author often meets unfamiliar and at the same time large Java code bases that need quick analysis and input for decision making. That's where visualizations come to into play by helping mining important knowledge directly from the code statistics.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Visualizing Java code bases for Jfokus 2017

  1. 1. 01
  2. 2. 02
  3. 3. Let's start! 03
  4. 4. Background Find new clients Face big code bases Need quick analysis Need quick results • • • • 04
  5. 5. Code size Google: 2 billions LOC Facebook: 61 million LOC Me: from 20K to 2M LOC • • • 05
  6. 6. I hate reading... 06
  7. 7. It's life, Jim, but not as we know it! 07
  8. 8. Code is data! 08
  9. 9. Well, big data 09
  10. 10. Snapshot 10
  11. 11. Temporal 11
  12. 12. Who uses SonarQube? 12
  13. 13. SonarQube Don't get me wrong: I love it! BUT... it's not that easy to get data in. It needs to be tuned for each team. It's not easy to make your team to use the data. Data is impersonal. • • • • • 13
  14. 14. First steps 14
  15. 15. Count the lines! 15
  16. 16. Size does matter Gives you an estimate (on how much reading is needed). Most of the code bases are polyglot. Ratio between languages can tell something. Ratio between test code, comments, blank lines is also interesting. • • • 16
  17. 17. cloc 17
  18. 18. Usage cloc ‐‐help cloc ‐‐write‐lang‐def=lang.defs 01. 02. 18
  19. 19. Usage cloc ‐‐csv       ‐‐quite       ‐‐progress‐rate=0      ‐‐ignored=files.ignored       ‐‐exclude‐dir=test,build      ‐‐read‐lang‐def=lang.defs       ‐‐out=data.csv      . 01. 02. 03. 04. 05. 06. 07. 08. 19
  20. 20. Language definitions Gradle     filter remove_matches ^s*//     filter remove_inline //.*$     filter call_regexp_common C     extension gradle     3rd_gen_scale 4.10 01. 02. 03. 04. 05. 06. 20
  21. 21. Where are the pictures? 21
  22. 22. We have stats, let's plot them!22
  23. 23. Excel? Probably not. 23
  24. 24. Pie charts are boring! 24
  25. 25. Infographics! 25
  26. 26. d3.js 26
  27. 27. d3.js Javascript library for data visualizations Tons of examples Many libraries built on top of d3.js Several books • • • • 27
  28. 28. Useful charts Calendar View ­ https://bl.ocks.org/mbostock/4063318 Buble Chart ­ https://bl.ocks.org/mbostock/4063269 Hierarchical Edge Bundling ­ https://bl.ocks.org/mbostock/1044242 • • • 28
  29. 29. Demo 29
  30. 30. Many alternatives vis.js raphael.js sigma.js many, many, many more • • • • 30
  31. 31. Tableau Public 31
  32. 32. Tableau Public 32
  33. 33. Tableau Public 33
  34. 34. Demo 34
  35. 35. Design your own! 35
  36. 36. SVG + Inkscape 36
  37. 37. Home­made Inforgraphics 37
  38. 38. Code City 38
  39. 39. Code City 39
  40. 40. Implementations eXcentia plugin for SonarQube JSCity • • 40
  41. 41. Demo 41
  42. 42. Temporal analysis 42
  43. 43. GitHub 43
  44. 44. GitHub 44
  45. 45. FishEye 45
  46. 46. Gource Software projects are displayed by Gource as an animated tree with the root directory of the project at its centre.  Directories appear as branches with files as leaves.  Developers can be seen working on the tree at the times they contributed to the project. “ 46
  47. 47. Launch gource gource ‐s 0.1 ‐1280x720  gource ‐‐output‐custom‐log log1.txt  gource ‐s 0.1 ‐1280x720 log1.txt 01. 02. 03. 47
  48. 48. Log format 1444125624|Luke Daley|M|/ratpack‐core/.../WiretapPublisher.java 1444125624|Luke Daley|M|/ratpack‐core/.../YieldingPublisher.java 1444306114|Stian Lindhom|M|/ratpack‐manual/.../13‐http.md 1444312172|Andrey Antukh|M|/ratpack‐core/.../WebSocketEngine.java 01. 02. 03. 04. 48
  49. 49. Demo 49
  50. 50. Code Maat Code Maat is a command line tool used to mine and analyze data from version­control systems “ 50
  51. 51. Launch Maat git log ‐‐pretty=format:'[%h] %aN %ad %s'      ‐‐date=short ‐‐numstat ‐‐after=YYYY‐MM‐DD 01. 02. 51
  52. 52. Launch Maat maat ‐l git.log ‐c git ‐a summary  maat ‐l ratpack_evo.log ‐c git ‐a revisions > ratpack_freqs.csv python scripts/merge_comp_freqs.py ratpack_freqs.csv ratpack_lines 01. 02. 03. 52
  53. 53. Analysis types abs­churn, age, author­churn, authors, communication, coupling, entity­churn, entity­effort, entity­ownership, fragmentation, identity, main­dev, main­dev­by­revs, messages, refactoring­main­dev, revisions, soc, summary 53
  54. 54. Demo 54
  55. 55. Let's talk big data 55
  56. 56. ElasticSearch Distributed, scalable, and highly available Real­time search and analytics capabilities Sophisticated RESTful API • • • 56
  57. 57. Index data $ curl ‐XPUT 'http://localhost:9200/gitlog/commit/123345' ‐d '{   "commitId" : "123345",   "timestamp" : "2009‐11‐15T14:12:12",   "message" : "git into es" }' 01. 02. 03. 04. 05. 57
  58. 58. Kibana Flexible analytics and visualization platform Real­time summary and charting of streaming data Intuitive interface for a variety of users Instant sharing and embedding of dashboards • • • • 58
  59. 59. Kibana 59
  60. 60. Demo 60
  61. 61. Structure 61
  62. 62. Structure101 62
  63. 63. Demo 63
  64. 64. Graphs are everywhere! 64
  65. 65. jQAssistant jQAssistant is a QA tool which allows the definition and validation of project specific rules on a structural level.  It is built upon the graph database Neo4j and can easily be plugged into the build process to automate detection of constraint violations and generate reports about user defined concepts and metrics. “ 65
  66. 66. jQAssistant jqassistant scan ‐f binaries jqassistant server 01. 02. 66
  67. 67. Query your code MATCH   (class:Class)‐[:DECLARES]‐>(method:Method) RETURN   class.fqn, count(method) as Methods ORDER BY   Methods DESC LIMIT 20 01. 02. 03. 04. 05. 06. 07. 67
  68. 68. Query different versions match   (artifact:Artifact)‐[:CONTAINS]‐>(type:Type) return   artifact.fileName as Artifact, collect(type.fqn) as Types 01. 02. 03. 04. 68
  69. 69. neo4j 69
  70. 70. Demo 70
  71. 71. Conclusion 71
  72. 72. Adam Tornhill 72
  73. 73. Hotspot Analysis use hotspots to identify maintenance problems. use hotspots for risk management. hotspots point to code review candidates. hotspots are input to exploratory tests. • • • • 73
  74. 74. Conclusion Extract data from your code! Visualize it and search for hot spots! Search for new facts and knowledge! Become data scientist or data journalist! • • • • 74
  75. 75. Next time... 75
  76. 76. when you... 76
  77. 77. push your code 77
  78. 78. REMEMBER 78
  79. 79. 79
  80. 80. Reading material 80
  81. 81. Your Code as a Crime Scene 81
  82. 82. Metrics and Models in SQE 82
  83. 83. Software Metrics and Metrology 83
  84. 84. Links: images http://abstrusegoose.com/432 http://camarenaphoto.tumblr.com/post/112238079516/its­life­jim­but­ not­as­we­know­it­spock http://technology.ie/big­data­looks­like/ http://www.informationisbeautiful.net/visualizations/million­lines­of­ code/ http://githut.info/ http://emmanueloga.com/2013/10/07/Graphs­are­Everywhere­An­ overview­of­GraphConnect­San­Francisco­2013.html • • • • • • 84
  85. 85. Links: tools https://github.com/AlDanial/cloc http://gource.io/ http://wettel.github.io/codecity­wof.html https://github.com/adamtornhill/code­maat http://d3js.org/ http://visjs.org/ • • • • • • 85
  86. 86. Links: tools http://www.sonarqube.org/ https://www.atlassian.com/pt/software/fisheye/overview https://www.elastic.co/products/elasticsearch https://www.elastic.co/products/kibana http://jqassistant.org/ http://neo4j.com/ • • • • • • 86
  87. 87. Links: tools https://github.com/ThoughtWorksStudios/saikuro_treemap https://www.youtube.com/watch?v=iiIytERhV9o https://codescene.io http://www.adamtornhill.com/articles/software­ revolution/part1/index.html • • • • 87
  88. 88. That's all! 88
  89. 89. Thank you! 89
  90. 90. 90

×