A Business Perspective on Use-Case-Driven Challenges for Software
Architectures to Document Study and Variable Information
IASSIST 2013
29.05.2013
Thomas Bosch
GESIS, Germany
thomas.bosch@gesis.org
boschthomas@blogspot.com
Matthäus Zloch
GESIS, Germany
matthaeus.zloch@gesis.org
Dennis Wegener
GESIS, Germany
dennis.wegener@gesis.org
Outline
• general information about MISSY
• next generation MISSY
• software architecture overview
• presentation
• business logic
general information about MISSY
• Microdata Information System (MISSY)
• currently, MISSY contains only the microcensus survey (largest
household survey in Europe)
• MISSY provides detailed information about individual data sets
• MISSY facilitates the data usage for research
general information about MISSY
• MISSY contains metadata of microdata
• MISSY is split into two parts
• Missy Web for metadata presentation (end-user front-end)
• Missy Editor for metadata documentation (back-end)
• MISSY documents approx. 500 variables and questions per year
• MISSY covers 25 survey years, starting from 1973
next generation MISSY
further studies
we integrate further studies (e.g. EU-SILC, EU-LFS, EVS, …)
MISSY Editor
we implement the Missy Editor as a web application
modern web project architecture
we design a modern web project architecture
• multitier software architecture
• Model-View-Controller (MVC) pattern
• Apache Maven as project management software
next generation MISSY
physical persistence
MISSY supports multiple types of physical persistence
open source
we publish MISSY as an Open Source project
import
MISSY provides an import from SPSS and XML
export
MISSY provides an export to multiple formats like DDI-L, DDI-C, DDI-RDF, …
software architecture
presentation
presentation control
business logic
data storage access
data storage
presentation
general information about microcensus
variables by thematic classification and year
list of variables by year
details of variables with statistics
variable-time matrix
questionnaire catalogue
question flow diagram
business logic
data model architecture
DDI-RDF Discovery Vocabulary
• contains only a small subset of DDI-XML + additional axioms
• the conceptual model is derived from use cases which are typical in
the statistical community
• statistical domain experts formulated these use cases, which are
seen as the most significant for solving frequent problems
• increase visibility of microdata
• increase use of microdata
• enable inferencing on microdata
• harmonize microdata (make microdata comparable)
DDI-RDF Discovery Vocabulary
• enables users to
• publish
• discover
microdata and metadata about microdata (research and survey
data) in the Web of Linked Data
• link microdata to other microdata,
making the data and the results of research (e.g. publications) more closely
connected
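A minimal sketch of what publishing and linking could look like in practice, assuming variable metadata is serialized as N-Triples. The example.org IRIs, the class name DiscoTriplesSketch, and the choice of skos:prefLabel and rdfs:seeAlso are illustrative assumptions; the disco:Variable term is taken from the draft vocabulary and may differ in the final specification.

```java
import java.util.ArrayList;
import java.util.List;

public class DiscoTriplesSketch {

    // Assumed namespace of the draft vocabulary (the draft URL plus '#').
    static final String DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#";
    static final String RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type";
    static final String SKOS_PREF_LABEL = "http://www.w3.org/2004/02/skos/core#prefLabel";
    static final String RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso";

    /** One N-Triples statement whose object is an IRI. */
    static String iriTriple(String subject, String predicate, String object) {
        return "<" + subject + "> <" + predicate + "> <" + object + "> .";
    }

    /** One N-Triples statement whose object is a plain literal. */
    static String literalTriple(String subject, String predicate, String text) {
        // Escape backslashes first, then double quotes.
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + subject + "> <" + predicate + "> \"" + escaped + "\" .";
    }

    /** Describes a variable and links it to a related resource, e.g. a publication. */
    static List<String> describeVariable(String variableIri, String label, String relatedIri) {
        List<String> triples = new ArrayList<>();
        triples.add(iriTriple(variableIri, RDF_TYPE, DISCO + "Variable"));
        triples.add(literalTriple(variableIri, SKOS_PREF_LABEL, label));
        triples.add(iriTriple(variableIri, RDFS_SEE_ALSO, relatedIri));
        return triples;
    }

    public static void main(String[] args) {
        // Hypothetical IRIs, used for illustration only.
        describeVariable(
                "http://example.org/missy/variable/ef1",
                "Sex",
                "http://example.org/publication/42")
            .forEach(System.out::println);
    }
}
```

Once such statements are published on the Web, any Linked Data client can discover the variable by its type and follow the link to the related resource.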
DDI-RDF Discovery Vocabulary
• availability of (meta)data
• Microdata may be available (typically as CSV files)
• In most cases, metadata about microdata is NOT available
• contains major types of metadata of DDI-C and DDI-L
• mappings from DDI-XML to DDI-RDF
• no straightforward mapping from DDI-RDF to DDI-XML
• enables better support for the LD community
• partly no corresponding constructs in DDI-XML
• 26 experts from the statistics and the Linked Data communities,
from 12 different countries, have contributed
how to extend DISCO?
use case 'variable details'
What comes next?
• What does the “next generation MISSY” look like under the
hood?
• How is the data model implemented?
• How does inheritance at data model level work?
• How does persistence work?
• Which modules/APIs does the MISSY Software System offer?
thank you for your attention…
• feel free to download the sources from GitHub!
https://github.com/missy-project
• have a look at the unofficial draft of DDI-RDF!
[planned as specification by the DDI Alliance by 2013]
http://rdf-vocabulary.ddialliance.org/discovery
give us feedback!
feel free to criticize!
backup
software architecture
• standard technologies to develop software
• multitier software architecture
• Model-View-Controller (MVC) pattern
• Apache Maven as project management software
• multitier architecture separates the project into logical parts
multitier software architecture
• presentation
• users can access the web application using their internet browser
• presentation control
• Maven module responsible for the view the user gets when interacting with
the web application
• business logic
• Maven modules defining the data models (DISCO, MISSY)
• data storage access
• Maven modules defining persistence functionalities for data model
components regardless of the actual type of physical persistence
• data storage
• Maven modules implementing concrete persistence functionalities (e.g. DDI-XML,
DDI-RDF, RDBs) for data model components
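The separation between the data storage access and data storage tiers can be sketched as follows. This is not the MISSY code base: all type and method names (Variable, VariableDao, InMemoryVariableDao, renderLabel) are hypothetical, and an in-memory map stands in for a real DDI-XML, DDI-RDF or relational back end.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class PersistenceSketch {

    /** Business logic tier: a minimal, hypothetical data model component. */
    static final class Variable {
        final String id;
        final String label;

        Variable(String id, String label) {
            this.id = id;
            this.label = label;
        }
    }

    /** Data storage access tier: a contract that does not depend on the storage type. */
    interface VariableDao {
        void save(Variable variable);

        Optional<Variable> findById(String id);
    }

    /** Data storage tier: one concrete strategy (an in-memory stand-in, assumed for illustration). */
    static final class InMemoryVariableDao implements VariableDao {
        private final Map<String, Variable> rows = new LinkedHashMap<>();

        @Override
        public void save(Variable variable) {
            rows.put(variable.id, variable);
        }

        @Override
        public Optional<Variable> findById(String id) {
            return Optional.ofNullable(rows.get(id));
        }
    }

    /** Presentation control tier: depends only on the interface, never on a concrete store. */
    static String renderLabel(VariableDao dao, String id) {
        return dao.findById(id)
                .map(v -> v.id + ": " + v.label)
                .orElse("unknown variable " + id);
    }

    public static void main(String[] args) {
        VariableDao dao = new InMemoryVariableDao();
        dao.save(new Variable("v1", "Sex"));
        System.out.println(renderLabel(dao, "v1"));
        System.out.println(renderLabel(dao, "v2"));
    }
}
```

Because the upper tiers only see the interface, swapping the in-memory class for an XML-, RDF- or database-backed implementation would not require changes to the presentation control code.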
