DEveloper COmpanion for Documented and annotatEd code Reference
The DECODER project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 824231.
DECODER Project Overview
Virgile Prevosto
with DECODER's partners
Paris Open Source Summit
2019-12-11
Persistent Knowledge Monitor
Database storing all relevant
documents for a software project
documentation (manuals,
comments, bug tracker, ...)
formal specifications
source code
analysis and testing results
and evolutions (e.g. commits)
Tools for feeding the database
Tools for querying the database
Common schema to ease interactions
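As a rough illustration of the idea above (one store, tools that feed it, tools that query it, all sharing one schema), here is a minimal in-memory sketch. The `Artifact` fields, the `KINDS` set, and the `InMemoryPKM` class are hypothetical names chosen for this example; they are not the actual DECODER schema or API.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Hypothetical document kinds, loosely taken from the list on this slide.
KINDS = {"documentation", "specification", "source", "analysis_result", "commit"}


@dataclass
class Artifact:
    """One PKM entry; the field names are illustrative, not the DECODER schema."""
    id: str
    kind: str
    path: str
    content: str
    links: List[str] = field(default_factory=list)  # ids of related artifacts


class InMemoryPKM:
    """Toy stand-in for the database: feed with add(), query with by_kind() and linked_to()."""

    def __init__(self) -> None:
        self._store: Dict[str, Artifact] = {}

    def add(self, artifact: Artifact) -> None:
        # Reject entries that do not follow the shared (assumed) schema.
        if artifact.kind not in KINDS:
            raise ValueError(f"unknown kind: {artifact.kind}")
        self._store[artifact.id] = artifact

    def by_kind(self, kind: str) -> List[Artifact]:
        return [a for a in self._store.values() if a.kind == kind]

    def linked_to(self, artifact_id: str) -> List[Artifact]:
        """Artifacts that reference the given id, e.g. specs or results about a source file."""
        return [a for a in self._store.values() if artifact_id in a.links]


pkm = InMemoryPKM()
pkm.add(Artifact("src:1", "source", "driver.c", "int probe(void) { return 0; }"))
pkm.add(Artifact("spec:1", "specification", "driver.acsl", "ensures \\result == 0;", links=["src:1"]))
pkm.add(Artifact("res:1", "analysis_result", "report.json", "no alarm", links=["src:1"]))
related = [a.id for a in pkm.linked_to("src:1")]
```

Because every tool writes the same record shape, a query such as `linked_to` can return specifications and analysis results side by side without knowing which tool produced them.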
[Figure: DECODER architecture diagram. The PKM is connected to the following tools: Testar, Frama-C, OpenJML, Specification synthesis, Moskitt, Documentation generation, NLP knowledge formalizer, NLP knowledge extractor, Syntactic analyses, Augmented IDE. Data-flow labels: ASFM models; inferred properties; analysis results; UML models; models, properties; code, documentation; raw code information; advanced queries on code.]
2019 DEveloper COmpanion for Documented and annotatEd code Reference 2019-12-11
Encompassing the Whole Development Lifecycle
Requirements
Preliminary Design
Detailed Design
Implementation
Unit Checking
Integration Checking
System Validation
Maintenance
Evolution
[Figure: development lifecycle diagram annotated with DECODER contributions: NLP spec generation; Modeling (ASFM); doc generation; IDE queries; invariant generation; PKM traceability; doc generation; PKM impact analysis.]
From Informal to Formal Documents
Natural Language Processing
Knowledge extraction from informal documents and correspondence with relevant code pieces
Knowledge extraction from code and semi-automated documentation generation
Abstract Semi-Formal Models (ASFM)
Graphical language to describe effects of a function on the data structures involved
Semi-automated generation of ASFM diagrams
Animation of the diagrams (graphical debugging)
Use Cases
Evaluation and Improvement of DECODER toolset
Linux Drivers: Quickly and accurately assess the quality of an external Linux driver for inclusion in embedded systems
OpenCV: Build a better knowledge of the OpenCV API and its usage in some applications
MyThaiStar: UI/UX design and verification
Java: Usage of the DECODER toolset on selected open source Java projects
PKM Meta Model
PKM Server Design
JSON as main interchange format
Start working on a JSON Schema
Take advantage of existing proposals:
SARIF
JCDB
Others?
Back-end: Document-oriented DB
MongoDB: licence issues
CouchDB
OrientDB: graph model used by Testar
Others?
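To make the JSON interchange idea concrete, the sketch below parses one JSON document and checks it before it would be stored in a document-oriented back-end. The field names and the `validate` helper are assumptions made for this example; this is a hand-rolled check using only the standard library, not a JSON Schema validator and not the schema drafted by the project.

```python
import json

# Hypothetical required fields and their Python types; not the DECODER schema.
REQUIRED = {"id": str, "kind": str, "path": str, "content": str}


def validate(doc: dict) -> list:
    """Return a list of problems; an empty list means the document is accepted."""
    problems = []
    for name, expected in REQUIRED.items():
        if name not in doc:
            problems.append(f"missing field: {name}")
        elif not isinstance(doc[name], expected):
            problems.append(f"wrong type for {name}")
    return problems


# A tool would emit a document like this one as its interchange payload.
raw = '{"id": "src:1", "kind": "source", "path": "driver.c", "content": "int x;"}'
doc = json.loads(raw)

errors_ok = validate(doc)                              # well-formed document
errors_bad = validate({"id": 1, "kind": "source"})     # wrong type, missing fields
round_trip = json.loads(json.dumps(doc)) == doc        # JSON survives serialization
```

A real JSON Schema would express the same constraints declaratively, so that every tool feeding the server could share a single definition.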
Information Extraction
Dataset Gathering
Collect existing datasets (code and documents) outside Decoder
Start looking at use cases (MyThaiStar and OpenCV)
DeepAPI training corpus for natural language/call sequences correspondence
Initial experiments
Code to NL:
feature extraction and token grouping
NL to Code:
Consider programming language as a foreign language
Neural Machine Translation
Objective: compute semantic similarity between source code and informal description
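The objective above can be illustrated with a much simpler stand-in than the neural approaches mentioned on this slide: a bag-of-words cosine similarity between the tokens of a code fragment and those of a description. This is only a lexical baseline, swapped in for illustration; the `tokens` and `cosine` helpers and the sample strings are hypothetical and are not the project's method.

```python
import math
import re
from collections import Counter


def tokens(text: str) -> list:
    """Split text or code into lowercase word tokens, breaking snake_case and camelCase."""
    words = re.findall(r"[A-Za-z]+", text)  # letters only, so '_' and digits act as separators
    parts = []
    for w in words:
        # Capitalized or lowercase runs ("Image", "file") and acronyms ("HTTP").
        parts.extend(re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", w))
    return [p.lower() for p in parts]


def cosine(a: list, b: list) -> float:
    """Cosine similarity between two token lists, using raw term counts."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


code = "def read_image_file(path): return open(path, 'rb').read()"
desc_related = "Read an image file from the given path"
desc_unrelated = "Send an email to the administrator"

sim_related = cosine(tokens(code), tokens(desc_related))
sim_unrelated = cosine(tokens(code), tokens(desc_unrelated))
```

A lexical score like this only captures shared vocabulary; a learned model would be needed to relate code and text that use different words for the same concept.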
Dissemination Activities
Public website at https://decoder-project.eu
Present on LinkedIn and Twitter
Decoder poster, presented at OW2 Conf
Contact with other projects
https://openreq.eu/
https://www.chariotproject.eu/
Current Roadmap
Progress according to plan
PKM implementation underway
First draft of schema design finalized
Early version of MongoDB-based server taking shape
NLP and code processing have produced their first results
Use cases investigation and methodology discussions ramping up
Thank You!
The DECODER project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement number 824231.
If you need further information, please contact the coordinator: TECHNIKON Forschungs- und Planungsgesellschaft mbH, Burgplatz 3a, 9500 Villach, AUSTRIA, Tel: +43 4242 233 55, Fax: +43 4242 233 55 77, E-Mail: coordination@DECODER.eu
The information in this document is provided “as is”, and no guarantee or warranty is given that the information is fit for any particular purpose. The content of this document reflects only the author’s view – the European Commission is not responsible for any use that may be made of the information it contains. The users use the information at their sole risk and liability.
