Trends in Data Publishing
Barend Mons
Barend Mons slides from #ISMB 2014: Trends in data publishing
Barend Mons slides from #ISMB 2014: Trends in data publishing. Talk 3 in the "What Bioinformaticians need to know about digital publishing beyond the PDF2" workshop at ISMB 2014, Boston, 16th July 2014

Notes
  • A soothing oversimplification.
  • Mind the orange buckets…
  • The full data cycle, and where the boundaries of ELIXIR are (ref: our discussion on the ‘left side’: data design and planning, and analysis/modelling).
    I put the third ‘flag’ there to emphasize that TRAINING of respected data experts is a remit of ELIXIR and that the NL node in GOBLET will participate. Argument: the better the data collection in future big data experiments is planned and designed, the easier it is for ELIXIR to deal with the data properly later and serve them up for analysis at the end of the cycle. I can also emphasize that BBMRI etc. are more on the left and could partner with ELIXIR on the right.
    We can discuss this slide once more before the meeting to align our messages.
  • Sub-scenes: attribute servers, lots of user databases, etc.
  • Barend Mons slides from #ISMB 2014: Trends in data publishing

    1. Trends in Data Publishing, Barend Mons
    2. Simplified eScience: RO’s; All Core Legacy Information + WorkFlows; User; New dataset; New Insights
    3. X; AREAL SURVEY; DEEP EXCAVATION; ‘Why would I believe this association’???
    4. Why I gave up spending most of my energy on text mining years ago (2005, 6:142)
    5. Data loss is real and significant… (Nature news, 19 December 2013) …and so is Data growth
    6. The Data cycle in eScience
    7. Prof. Carole Goble
    8. www.datafairport.org
    9. F A I R
       Findable: PID for each concept used; PID assignment (authorities); ARTA-service
       Accessible: Machines can Map (IMS-service); License on data elements; Authentication/Authorization
       Interoperable: Machines understand data; Download-link-data formats; Workflows (Research Objects)
       Re-usable: Functionally Interlinked; Harmonized; Citable and available for...
    10. A simplified diagram of a Digital (data) Object irrespective of technological choices and naming: PID; 'provenance' (user defined); Data (elements); Metadata (intrinsic)
    11. Digital Object Architecture: PID; 'provenance' (user defined); Data (elements); Metadata (intrinsic). […]s are Digital Objects. Nanopublications are Research Objects. Some Research Objects are
    12. Data as increasingly FAIR Digital Objects. The same four-part object (PID; Metadata (intrinsic); 'provenance' (user defined); Data (elements)) is shown six times, labelled in turn:
        Totally UNFAIR
        Findable Usable for Humans
        FAIR metadata
        FAIR data, restricted access
        FAIR data, Open Access
        FAIR data, Open Access/Functionally Linked
    13. (no transcribed text; page label 15)
    14. (no transcribed text; page label 16)
    15. (no transcribed text; page label 17)
    16. Combine FANTOM5 & LOVD
    17. 16 TSS; 10 Tissues
    18. 3 tissues: Heart muscle, Skeletal muscle, Cerebral Cortex; RNA detected in many more tissues; 25 TSS; ???
    19. Diagram labels: Repositories; Data Owners; (supp) data; Databases; ELIXIR FAIR Data Search Index; End-users; ELIXIR semantic data repository; ELIXIR Data FAIR Port; ELIXIR federated data; FAIR L1, FAIR L2, FAIR L3, FAIR L4; Search for datasets; Download data (sub)sets in many formats (xml, rdf, json etc); ASPs, Inhouse IT, Bioinformatics etc.; Tools & Applications; nodes (listed twice): Elixir Fin., Elixir Esp., Elixir Nor., Elixir UK, Elixir SWE, Elixir NL, ..; markers 3, 1, 2, 4
    20. Develop ELIXIR-NL; Engage key data owners; Training & Education; Develop Infrastructure & service; Sustainable funding; Policy Alignment (markers 1, 2, 3, 6, 5, 4)
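
The four-part Digital Object on slide 10 (PID, user-defined 'provenance', data elements, intrinsic metadata) could be modelled roughly as below. This is only an illustrative sketch: the class name `DigitalObject`, the helper `missing_parts`, the field types, and the placeholder PID string are all assumptions made here, not anything defined in the slides or by ELIXIR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class DigitalObject:
    """Hypothetical container with the four parts named on slide 10."""
    pid: str  # persistent identifier (a made-up placeholder is used below)
    provenance: Dict[str, Any] = field(default_factory=dict)  # user-defined provenance
    data: List[Any] = field(default_factory=list)             # data elements
    metadata: Dict[str, Any] = field(default_factory=dict)    # intrinsic metadata


def missing_parts(obj: DigitalObject) -> List[str]:
    """Return the names of the parts that are empty, in slide order."""
    parts = {
        "pid": obj.pid,
        "provenance": obj.provenance,
        "data": obj.data,
        "metadata": obj.metadata,
    }
    return [name for name, value in parts.items() if not value]


# Usage: an object that carries a PID and data but no metadata or provenance yet.
obj = DigitalObject(pid="example-pid-0001", data=[{"value": 42}])
print(missing_parts(obj))
```

The check only reports which parts are empty. It deliberately does not try to assign one of the slide-12 labels, since the slides do not state the criteria for each stage.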
