CSIRO investing in the future of data - John Morrissey
1. CSIRO investing in the future of data
INFORMATION MANAGEMENT & TECHNOLOGY
John Morrissey | eResearch Planner
22 July 2016
2. CSIRO
CSIRO investing in the future of data | John Morrissey2 |
• ~5300 talented staff
• $1 billion+ budget
• Working with 2800+ industry partners
• 55 sites across Australia
• Top 1% of global research agencies
• Each year 6 CSIRO technologies contribute $5 billion to the economy
3. The ongoing problem….
Science data assets:
• Undescribed …
• Inaccessible …
• Undiscoverable, unusable, uncitable …
• On a really wide range of infrastructure …
• In a really wide range of preservation-unfriendly formats …
• Unconnected …
4. Some elements to connect
• Systems
• Infrastructure
• Processes (e.g. Quality Control, Approval)
• Legal
• Licensing
• Intellectual Property
• Culture
• Training
• Fulfilling needs
• … … …
• Policy
5. Data Access Portal
Functions
Self serve Deposit
Describe
Create Citation
Restrict
License
Approve
Store
Publish
Discover
Access
Manage
6. Goals for a data repository
• Persistent access
• Version control
• Self service
• Scalable storage
• Minimal use of expensive spinning disk storage
• Cheaper tape storage added as required – fast throughput when data is optimally “encapsulated” on tape
• Integration with Bowen Research Cloud storage – used by projects for working storage
7. Decision workflows for data and software
8. The Data Management Ecosystem …
Collaboration: industry, universities, other organisations
Vocab Service
9. Like an onion …
Data management ecosystem
Collaboration: industry, universities, other organisations
14. What’s next?
Policy
• Supported by infrastructure services that make compliance easier
• Data management planning, with tools to support this and return value to
end user
• Management support within research projects required to allocate resources
to data management
Development
• Storage
– Better integration with existing network storage for simpler ingest
– More access options
• Services, vocabularies, semantic web
• Provenance
• Object / file level metadata
15. What’s even more exciting?
• Researchers wanting to add “plug-in” functions to the DAP
• Researchers writing whole-of-program data management
roadmaps for their business units, heavily referencing DAP and
enterprise-developed tools.
• Continuation of the “working with research groups” model to
implement:
• Semantic enablement and vocabularies
• Provenance
• Reuse of DAP metadata in other tools
16. [Diagram: a Data User who is interested in (views, downloads) Data Collection A is likely interested in Similar Data Collections]
In the current implementation, similar datasets are determined based on:
• Title
• Description
• Keyword
• Fields of research
• Data Contributor
• Activity
• Related Collection (specified by data
depositors)
A Recommender System for Research Data
Data Sources
• DAP Web Service
• Offline files (ANZSRC, Activity)
• Server logs (download, views)*
*will be included in future
New development work by Dr Anusuriya Devaraju, Postdoctoral Fellow, CSIRO Mineral Resources
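The slide lists the metadata fields used to judge similarity but not how they are combined. The sketch below is one possible reading, not the DAP implementation: it scores two collections with a weighted sum of Jaccard overlaps. The `Collection` record, the `WEIGHTS` values, and the function names are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Collection:
    # Hypothetical record: field names mirror the slide's criteria,
    # not the actual DAP metadata schema.
    id: str
    title: str
    description: str
    keywords: set = field(default_factory=set)
    fields_of_research: set = field(default_factory=set)
    contributors: set = field(default_factory=set)
    activity: str = ""
    related: set = field(default_factory=set)  # ids named by the depositor


def tokens(text):
    """Lowercase alphanumeric word tokens of a free-text field."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Set overlap in [0, 1]; two empty sets count as no evidence (0.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Illustrative weights only (assumed; they sum to 1.0).
WEIGHTS = {
    "title": 0.2, "description": 0.2, "keywords": 0.2,
    "fields_of_research": 0.15, "contributors": 0.1,
    "activity": 0.05, "related": 0.1,
}


def similarity(a, b):
    """Weighted sum of per-field overlaps between two collections."""
    scores = {
        "title": jaccard(tokens(a.title), tokens(b.title)),
        "description": jaccard(tokens(a.description), tokens(b.description)),
        "keywords": jaccard(a.keywords, b.keywords),
        "fields_of_research": jaccard(a.fields_of_research, b.fields_of_research),
        "contributors": jaccard(a.contributors, b.contributors),
        "activity": 1.0 if a.activity and a.activity == b.activity else 0.0,
        "related": 1.0 if b.id in a.related or a.id in b.related else 0.0,
    }
    return sum(WEIGHTS[name] * value for name, value in scores.items())


def recommend(target, catalogue, top_n=3):
    """Ids of the top_n most similar collections, skipping zero scores."""
    scored = [(similarity(target, c), c.id) for c in catalogue if c.id != target.id]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cid for score, cid in scored[:top_n] if score > 0]
```

A production recommender would more likely use TF-IDF or learned embeddings for the free-text fields; plain token overlap is used here only to keep the sketch dependency-free. Server-log signals (views, downloads), which the slide marks as future work, are deliberately left out.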
17. You may also like:
• ..
• ..
• ..
An Overview of the DAP Recommender System
SQL database
Research Data
Recommender Model
Recommendation Service
DAP Metadata
Store
Web Service
Other Data Sources (e.g., server logs, auxiliary data)
Data View
Data Download
Data Deposit (Post-Process)
18. Information Management & Technology
John Morrissey
eResearch Planner
t +61 2 6124 1411
e john.morrissey@csiro.au
w www.csiro.au
Thank you
Editor's Notes
Staff # as at 3 March 2016 = 5319
2014–15 budget = $1.2 billion
--------------------
Today we have around 5300 talented people working out of 50-plus centres in Australia and internationally.
We are a billion-dollar organisation.
We generate $485+ million in external revenue, so nearly 40 per cent of our revenue is externally sourced.
Our people work closely with industry and communities to leave a lasting legacy.
Our ability to achieve results is shown by the quality of our research. We are in the top 1% of global research institutions in 15 of 22 research fields and in the top 0.1% in four research fields.
CSIRO is the key connector of institutions in the Australian system for some areas. CSIRO is the most central Australian institution in 6 research fields – Agricultural Sciences, Environment/Ecology, Plant and Animal Sciences, Geosciences, Chemistry and Materials Science.
CSIRO works with 1,208 SMEs and 2,877 customers each year. We’re always looking for ways we can help business and industry.