RDSI
CREATING AUSTRALIA’S
RESEARCH DATA CLOUD
2|
Table of Contents
03 Chapter I: In the Beginning
13 Chapter II: Consulting the community
23 Chapter III: The data collections
38 Chapter IV: An integrated infrastructure
43 Chapter V: Moving data
48 Chapter VI: Accessing data
52 Chapter VII: Protecting data
56 Chapter VIII: Working with data
59 Chapter IX: Selecting solutions
67 Chapter X: Case studies
74 Chapter XI: Looking back and looking ahead
91 Chapter XII: The team
In the beginning…
Chapter I
4|
…there was nothing
At the very beginning of this
project there was nothing in the
sector for collaborative storage.
Really nothing. So it started
from scratch, to persuade people
that it was a good idea to store
data, except not just in your
university or in your research
group.
– Dr Nick Tate
RDSI Project Director
5|
No discoveries without
data
RDSI is in my mind, entirely about the delivery
of data and the amassing of the data. It is my
belief that without bringing collections together,
without making them accessible, you can’t make
new discoveries.
‒ Prof Nathan Bindoff
Director TPAC and Professor of Physical Oceanography
6|
A Major Gap
I was working for the Australian Research
Council when we were doing projects on the
cutting edge in research which crosses
disciplinary boundaries. Like many at the same
time we were convinced the future of research
would be shaped by the adoption of digital
technologies, increased computational power,
high performance computing and massive data
storage and tools to make it possible to link and
re-use existing research data. After considerable
federal investment and matching efforts from
state governments and universities, there was
one major gap: massive data stores to service
the growing eResearch communities existing and
coming in to being.
‒ Prof Doug McEachern
RDSI Project Board
7|
How it started
There is growing recognition that new ways to conduct research have
emerged and are being validated across most research disciplines.
Adding to traditional forms of research that rely on experiment, theory
and testing hypotheses using data, it is now evident that researchers
also:
» collect increasingly larger sets of data as a primary form of research;
and
» use modelling tools to assist them in deriving patterns, perceptions and
trends that can form the basis for establishing and confirming
hypotheses.
Information and communications technology (ICT) is the cornerstone to
such new approaches, providing the means not only for increasingly
powerful computer-enabled simulation and modelling, but also the very
avenue to manage and integrate the increasing volume and complexity
of datasets and collections. Hence, ICT is not only a resource to
administer and manage research but also to drive and innovate the ways
in which research is conducted.
– Strategic Roadmap for Australian Research Infrastructure, 2008, p19
In 2008, the Government’s Strategic Roadmap for
Australian Research Infrastructure highlighted the
need to manage and integrate the increasing volume
and complexity of research datasets and collections.
8|
Research storage
before RDSI
Part of my role here at QCIF is to help research
groups with their storage needs. In the days
before RDSI it caught me by a huge surprise
that even at a major university it was quite
hard for a research group to get the right kind
of storage, to have some better way of
collaborating than sending a spreadsheet across
the globe.
I remember the first meeting I had with one
professor at UQ who had come from Oxford. He
said, ‘Look, I’m a centre director, I’m a
professor, and I have spent six months just
trying to get a little bit of storage. I’m worried I
should be doing something else.’
‒ Graham Chen
eResearch Manager, QCIF
9|
Research storage
before RDSI
A retiring academic in Marine Science came to
us at the University of Sydney library. He had
spent his whole life collecting data from the
beaches of Australia. He asked us, ‘What do I do
with all of this? It’s in various formats, it’s
valuable, but I don’t know what to do with it.’
We approached the university, but at that stage
the view was that when research results were
published, the research was finished. Why would
they want to store the data and why would
they want to share it?
‒ John Shipp
RDSI Project Board
10|
Research storage
before RDSI
We asked another researcher what she was
doing to curate her data. She said, ‘Every week I
download it onto CD. I make three copies: one
for home, one for the office, and one I send to
my mother in Perth.’
Even today I have a lingering fantasy that
somewhere in Perth there is a little old lady
with these CDs stacked up around her and one
day they will cave in on top of her and she will
be crushed by her daughter’s research. This is
just imagination, but it’s an indication of what
people were doing because they didn’t have
facilities to curate their data properly.
‒ John Shipp
RDSI Project Board
11|
2010: RDSI begins
Dates:
2010-2014
Funding:
$50m
Program:
Education Investment Fund (EIF) under the Super Science
Initiative
Lead agent:
The University of Queensland
Chair:
Prof Max Lu
Deputy Vice-Chancellor (Research)
Project Director:
Dr Nick Tate
The aim of the RDSI Project is that researchers will
be able to use and manipulate significant
collections of data that were previously either
unavailable or difficult to access, and that there will
be a consistent means of accessing this data.
The Project will be realised through the creation
and development of data storage infrastructure
accessed through a common infrastructure layer
and provided by agencies within the sector, or
commercial providers, or both.
12|
Where should we put
the data?
The question was, where should we put the
data? This was stopping progress across
research. People couldn’t keep storing it on
their desktops, and there were no onshore cloud
services at the time. That left us with two
options: either one big data centre or a small
number of sites around Australia. The problem
with one site is then you have to put everything
else around it, and you lose innovation. Several
is better.
‒ Dr Rhys Francis
RDSI Board
Consulting the
community
Chapter II
14|
Opening a Dialogue
Consultation timeline, June 2011 to May 2012:
» ReDS Strawman Consultation Series: 3 June – 1 July
» DaSh Strawman Consultation Series: 17 June
» Vendor Briefing: 1 Aug
» ReDS Tinman Consultation: 28 Sep
» DaSh Tinman Consultation: 11 Nov
» ReDS Consultation Series: 9 Feb – 2 Mar
15|
Consulting the
community
Because the Education Investment Fund has a
restriction that you can’t use the funding for
operation, we needed operational partners who
could provide the operational working funds. We
didn’t know if anybody would be interested in
putting their hand up to do that. We were
therefore consulting to see who would be
interested, under what circumstances they
might be interested, and what would be possible
for them to achieve that would meet the project
objectives and the needs of researchers. By doing
that we were able to put together a plan.
‒ Dr Nick Tate
RDSI Project Director
16|
From Straw Man
to Tin Man
It was a large community engagement exercise.
For each program, we would find out who
might be interested, get them together for a
workshop, bounce around ideas, and then
consolidate the thinking into a document. We
had a lot of good feedback—in some cases
probably 100 individual contributions to the
Straw Man documents. We would then develop
a Tin Man document for each of the programs
and work through those, to ultimately lead into
the business plan for the project. It matured the
thinking of the project quite quickly, and it
didn’t feel like an invention being planted on the
community from outside. This was something
the community built itself.
‒ Dr Markus Buchhorn
Research Data Manager, RDSI Project
17|
Building awareness
A moment stands out in my mind, from one of
the early workshops. A lady was there who was
researching children affected by asthma. She
wasn’t sure if her data would be suitable because
it was not a large dataset. I assured her that it
was the potential for reuse, not the size, that
was important. By the end of the workshop she
was excited about the possibilities for enhancing
her research.
‒ Mary Sharp
Infrastructure Advisory Panel
18|
Node Development
The NoDe Development programme was
designed to identify, strengthen and develop
research data centres so that they were able to
hold and process high data volumes. These data
centres became Nodes of the RDSI project, and
their operators were provided with
establishment funding from this programme.
The Node Development programme funded
the development of eight high capacity Nodes:
six Primary Nodes located in Brisbane, Sydney,
Canberra, Melbourne, Adelaide and Perth, with
two additional Nodes in Townsville and Hobart.
19|
Identifying the Nodes
RDSI went through a process of calling for
proposals. We were all looking at what our
individual contribution to the national
infrastructure would be, and I remember in
Queensland we focused on life sciences and eco
sciences as being the main areas where we
thought that there was a really significant body
of research expertise.
‒ Rob Cook
CEO, QCIF
20|
Responding to the call
for Nodes
We put a proposal to our members that
Intersect propose to become a Node of RDSI.
Intersect has twelve university members, quite a
lot, and the thing that is interesting is that all
twelve quickly said, ‘Yes, this is a really good
idea.’ So we responded to the call for proposals
and we were one of the Nodes selected from the
country.
‒ Dr Ian Gibson
CEO, Intersect Australia
21|
Key to better research
outcomes
When we think about the services we offer at
eResearch SA, we think about what we can do
to make researchers more effective. It’s not our
job to produce research outcomes, but it is our
job to enable researchers to create better
research outcomes. RDSI has been key to that.
‒ Mary Hobson
CEO, eResearch SA
22|
RDSI funded Nodes
Timeline figure (January to December 2012), dates marked:
22/03/2012, 5/04/2012, 21/05/2012, 25/06/2012, 5/07/2012,
10/07/2012, 4/09/2012 and 2/11/2012.
The data
collections
Chapter III
24|
Research Data Services
The ReDS programme was designed to identify
research data holdings of lasting value and
importance and contribute funding to their
development at the most appropriate Node.
ReDS delivers storage services in support of
significant data collections, research data sets
and associated access tools which are
aggregated into related holdings that add value
to each other through co-location.
25|
Selecting collections
The value of data is only realised when it’s used,
so having data that is going to be reused was a
critical part of whether it could be stored
through the ReDS program. Every collection
that has been given an allocation through the
ReDS program meets certain criteria that
indicate or demonstrate its national significance.
‒ Peter Hicks
ReDS Program Manager, RDSI Project
26|
Merit criteria
Determining criteria by which research data
might be assessed for merit was difficult across
disciplines. You could assess a collection based on
how many people it would be shared with. But
what might be an appropriate audience size for
sharing climate data would not be the same for
sharing medical data or humanities data. It was
very challenging to find merit criteria that
would carry the same weight across all
disciplines.
‒ Dr Frankie Stevens
Research Data Manager, RDSI Project
27|
Data storage is a
commitment
Many of the Nodes had experience with merit
allocation processes used for high performance
computing. But with HPC, the resources are
used, and at the end of the cycle the whole
process is repeated. When you try to think about
that for data, for long-term storage, you’re not
thinking about a resource that goes away in six
months. You’re making a long-term
commitment. And so the assessment of merit
was a much bigger challenge.
‒ Dr Markus Buchhorn
Research Data Manager, RDSI Project
28|
Identifying collections
The Intersect model is we have staff located on
campus with our member universities, a team of
roughly 15 people already talking to research
groups, telling them about options and services
available to them. So that team were now
looking for collections that might benefit from
being on the RDSI infrastructure.
‒ Dr Ian Gibson
CEO, Intersect Australia
29|
Exposing science
agency collections
At NCI we’ve worked very closely with the
science agencies we are associated with—CSIRO,
Geoscience Australia, and the Bureau of
Meteorology—to expose the national and
international collections that have been locked
up inside those agencies. And the win-win is
that the exposure to the national community is
of value because these collections otherwise
would not have been available, and the
confluence of them enables transdisciplinary
research. The advantage to the agencies is that
the availability of the copy here puts that data
in a rich computational environment, much
richer than they have internally. That’s the
nature of the win-win that’s been possible.
‒ Prof Lindsay Botten
Director, National Computational Infrastructure
30|
The long tail of
research data
Coordinated research groups like astronomers
are able to collect their data and curate it, and
eventually they will find the storage. But the
humanities, medicine, the environmental
sciences, they were less united in their purpose,
and I always saw RDSI as a vehicle for bringing
those people together and providing them the
nascent infrastructure to store their data and
make it available.
‒ John Shipp
RDSI Board
31|
Data at RDSI Nodes
Last updated 14/12/2014
32|
55 Petabytes of data
available in over 70
Petabytes of storage
That these are huge numbers is beyond question;
perhaps more astonishing is that this is an order
of magnitude increase in just 4 years.
Even more encouraging is that this data is
spread across every one of the 22 research
disciplines. From Humanities to Radio
Astronomy, no Field of Research has been left
untouched.
33|
Collections by Field of
Research
Last updated 14/12/2014
34|
About large allocations
The RDSI project allocated up to $9.4M of funds to
support large collections at Nodes. Nodes were given
the opportunity to submit funding proposals for
collections that were too large to be funded under the
initial ReDS agreement. Large collection proposals
were evaluated by the RDSI Project Board, which
decided that a total of 5 proposals would be funded.
35|
Large allocations
I’m really pleased about the large allocations.
Having that strategic view at the research
community level of how storage is used to
aggregate data from particular domains in a
way that enables advanced research, is a critical
outcome of the ReDS programme.
‒ Peter Hicks
ReDS Programme Manager, RDSI Project
36|
A national medical
research data storage
facility
Recognising that there was a potential need to
support health and medical research data
collections, we put in a proposal to establish a
national medical data storage facility using
funds from the ReDS special allocation process.
We put that proposal to the medical research
community and we were overwhelmed with
responses. Forty-seven institutions around
Australia nominated major collections they
would like to store. So what we have now is an
opportunity to build a national medical research
data storage facility. This is something that is
quite uncommon globally. It will allow
researchers to get the serendipitous second use
outcomes and impacts from that data.
‒ Dr Ian Gibson
CEO, Intersect Australia
37|
The large allocations
Murchison Widefield Array Data Archive: The Murchison Widefield
Array (MWA) project is funded for operations over a two-year period,
which commenced in July 2013. These data sets, of international
importance, will assist a global Astronomy and Astrophysics
community to do research into the main science goals of the MWA,
which include: exploration of the early Universe and the search for
signals from the first stars and galaxies; exploration of the transient
and dynamic Universe; studies of the Earth's ionosphere; and the study
of astrophysics related to objects in our galaxy and in the distant
galaxies.
Participating RDSI Node is iVEC.
National Environmental Research Data Centre: The National
Environmental Research Data Collection (NERDC) comprises
international and national reference collections spanning five fields.
This multidisciplinary confluence of collections: (a) spans the
lithosphere, crust, biosphere, hydrosphere, troposphere, and the
stratosphere; (b) encapsulates the complex interactions within, and
amongst, these layers, and (c) will enable new, transdisciplinary
approaches to research.
Participating RDSI Node is NCI.
National Genomics Data Storage Facility: Genomics is a critical and
complex science for the understanding of living forms and the way
they are impacted by the environment, for improving medicine, and
understanding food amongst many other uses. The facility will store
and make available large volumes of genome data generated at the
leading national centres, as well as essential national and
international genome libraries.
Participating RDSI Nodes are Intersect, VicNode and QCIF.
Australian Coordinated Characterisation Data Space: The ACCDS
underpins national-scale research programs, in particular two
recently established characterisation-intensive ARC Centres of
Excellence: the ARC Centre of Excellence in Advanced Molecular
Imaging (CAMI), which will develop innovative imaging
technologies to explore the immune system; and the ARC Centre of
Excellence for Integrative Brain Function (CIBF), which is tackling
the challenging problems involved in understanding how the human
brain works.
Participating RDSI Nodes are VicNode, Intersect and QCIF.
Australian National Medical Research Data Storage Facility: The
foundation data sets for this collection represent major national
assets supporting research into Australia’s most significant diseases
including heart disease, mental illness, the major cancers, as well as
the increasing problems of lifestyle diseases such as diabetes and
obesity. Importantly, children’s health and the health of our aging
population are both well supported in the foundation data sets of
the ANMRDSF.
Participating RDSI Nodes are Intersect, VicNode and QCIF.
An integrated
infrastructure
Chapter IV
39|
Data Sharing
Programme
The DaSh programme was designed to develop
the DaSh Collaboration Network (DaShNet)
and the DaSh Technical Architecture. The
integration of these two major parts of the
DaSh programme became the DaSh Technical
Framework, which describes the network, data
movement capabilities, security and identity
matters, data access, cloud gateway access,
test platform for the programme’s components
and workflow automation capabilities for the
RDSI-funded Nodes.
40|
Why is it hard?
Why is it hard, relative to other programs? It’s
hard because data has a narrative that’s
different for every data stream. Every data-
oriented organisation thinks their data is special
and different from everybody else’s. Really it
should be about revealing data, gathering it
together, and developing tools to express and
analyse it.
‒ Prof Nathan Bindoff
TPAC Director and Professor of Physical Oceanography
41|
Innovative and
ambitious
RDSI was an innovative and ambitious
project. It seems like it ought to be
straightforward, but it’s actually very hard to
present datasets in a meaningful way to the
world. It’s not a simple problem. RDSI has
highlighted what’s possible.
‒ Jim McGovern
Infrastructure Advisory Panel
42|
Building technical skills
across the sector
One of the things the project has been able to
do is to fund technical and data specialists at
the Nodes. Many of the techniques and skills for
storing, moving, and accessing large sets of data
were new to the project and to the Nodes. Being
able to invest in building up a community of
people in the sector with these skills and
capabilities has been one of the important
contributions of RDSI.
‒ Viviani Paz
RDSI Project Manager
Moving data
Chapter V
44|
The need for a fast
network
One of the significant technical challenges to
consider was moving data. We experienced this
early in the project as the ARCS Data Fabric
drew to a close and people were trying to move
the Integrated Marine Observing System data
from Perth to Hobart and Brisbane. The data
was going at the speed of congealed porridge
running down a hill. It was clear that the
network capabilities—not just fast networks, but
the techniques, the tools, the software for using
them—weren’t in place. So the Project has
solved a very serious challenge. When AARNet
put in the data transfer network, they are now
certifying that they’re getting 95 percent use of
the capacity of a 10-gigabit link. That is
enormous compared with what was there. It’s
not just the bandwidth that’s changed.
‒ Dr Nick Tate
RDSI Project Director
45|
The challenge in
moving data
Without a doubt, the biggest challenge you have
when moving data is that a lot of these datasets
are built upon many files that are inherently
small in nature, and they’re quite difficult to
move over large distances. Even though you
might have a lot of bandwidth available to you,
you can find that 10 terabytes of data can take
weeks if not months to transfer.
‒ Brendan Davey
Deputy Director, TPAC
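A rough back-of-the-envelope sketch shows why many small files can dominate transfer time even on a fast link. Only the 10-terabyte dataset size and the 10-gigabit link at 95 percent use come from the quotes in this chapter; the file count, the per-file overhead, and the function name `transfer_seconds` are illustrative assumptions, not RDSI measurements.

```python
def transfer_seconds(total_bytes, n_files, link_bits_per_s, efficiency,
                     per_file_overhead_s):
    """Estimate transfer time as streaming time plus a fixed cost per file."""
    streaming = total_bytes * 8 / (link_bits_per_s * efficiency)
    overhead = n_files * per_file_overhead_s
    return streaming + overhead


TOTAL_BYTES = 10 * 10**12   # 10 terabytes (decimal)
LINK = 10 * 10**9           # 10-gigabit link, in bits per second
EFFICIENCY = 0.95           # 95 percent use of link capacity

# Assumed case 1: the whole dataset as a single stream, no per-file cost.
one_big = transfer_seconds(TOTAL_BYTES, 1, LINK, EFFICIENCY, 0.0)

# Assumed case 2: 10 million files of 1 MB each, with a hypothetical
# 0.2 seconds of setup and round-trip overhead per file.
many_small = transfer_seconds(TOTAL_BYTES, 10_000_000, LINK, EFFICIENCY, 0.2)

print(f"single stream:    {one_big / 3600:.1f} hours")
print(f"many small files: {many_small / 86400:.1f} days")
```

Under these assumptions the single stream takes a little over two hours, while the many-small-files case takes roughly three weeks, almost all of it per-file overhead rather than bandwidth.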
46|
Network
enhancements,
AARNet and NRN
Our goal was to ensure we could get multiple
10 gigabits of capacity into the RDSI sites. We
were looking to get every Node connected to the
AARNet backbone with capacity significantly
over and above what a big university would
have, so they could provide services to a
community. And to do that we needed to
ensure there were redundant fibre paths, and
the appropriate network infrastructure on those
fibres to deliver that capacity.
‒ Peter Elford
Network Program Manager (2013), RDSI Project
47|
Connecting Nodes to
Nodes, and Nodes to
Researchers
The Data Sharing Network (DaShNet) is a reliable high-speed network
service built over the new AARNet4 backbone network. It connects
RDSI-funded Nodes to each other and researchers around Australia. It
can ultimately support up to 100 gigabits per second, significantly
increasing data transfer rates across the country.
Accessing data
Chapter VI
49|
Making it easy to find
and access data
Mediaflux will enable the research community to
have useful data management tools that are
consistent across Australia, so that people will be
interacting with research data in the same way,
irrespective of where they’re located.
‒ Dr Frankie Stevens
Research Data Manager, RDSI Project
50|
Mediaflux and the
Nodes
51|
Identity and access
management
In Australia we already had the Australian
Access Federation, a trust fabric for identity
management. But one of the things that became
very plain early on is that although it handles
web access for everyone, it can’t yet do access
via other methodologies. This is where the
project got into territory where we were cutting
new ground.
‒ Richard Northam
Node Development Manager, RDSI Project
Protecting data
Chapter VII
53|
Securing the data
One of the toughest parts of looking at the
security models for this project was to
understand how the Nodes would collaborate
and share responsibility for the data and data
transfers. How they would manage the
relationships with researchers to give them a
level of comfort that the integrity of their data
is being maintained. When your research data
has been under your direct control and then it
goes outside your own perimeter, there’s a
concern that you don’t know what’s happening
with it. So for me, it’s been a management job
of perception more than anything else.
‒ Mark McPherson
RDSI Security Policy Manager
54|
Will my data be safe?
One of our biggest challenges in getting people
on board is to assure them that their data is
going to be safe, it’s going to be secure, that
they’ll have access to it, and their partners who
use the data will have access to it.
‒ Brendan Davey
Deputy Director, TPAC
55|
Removing the
obstacles
We were initially concerned about whether
researchers would adopt a central data storage
facility at all. We drew up a list of 10 obstacles,
and you could be pretty sure that if you started
talking to a researcher about putting their data
on RDSI, they’d start going through these 10
obstacles one by one without you prompting
them. ‘I can’t touch it anymore, it’s insecure,
you’re only here until the end of 2014, I might
have to pay for it,’ and so on. One by one we’ve
been able to make these obstacles insignificant.
‒ Rob Cook
CEO, QCIF
Working with data
Chapter VIII
57|
Connecting with
compute facilities
The Raijin supercomputer at NCI
“Data is a vital enabler of research.
Big data can only be handled in a
rich computational environment. It’s
not the data alone. It’s not the
compute alone. It’s the confluence of
the two. People advance their
research by being able to have well-
managed, integrated collections of
data where they can explore new
ideas by having a confluence of
different datasets available to them.”
‒ Prof Lindsay Botten
Director, National Computational
Infrastructure
58|
Services researchers
need
RDSI has allowed us a platform to develop the
services that researchers actually need. It has
completely revolutionised not only our thinking
but our outlook and our future.
‒ Mary Hobson
CEO, eResearch SA
Selecting solutions
Chapter IX
60|
Vendor Panel
The Vendor Panel programme, implemented in partnership with the
Council of Australian University Directors of Information Technology
(CAUDIT), was created to facilitate the procurement of storage
related infrastructure, software and services. The purposes for the
programme were twofold. Firstly to allow Nodes, universities and
other authorised users to avoid lengthy tendering processes by
using an appropriately constructed panel, and secondly to support
volume pricing across Nodes and the wider Higher Education and
Research Sector.
61|
An open mind towards
solutions
In many of the research infrastructure projects
we’ve seen before, there has been a focus on
adopting only open source solutions or solutions
developed by other researchers. We took the
view that we should go with a completely open
mind to the process. You can use commercial or
open source or other not-for-profit
infrastructure software. It doesn’t matter. The
important thing is to pick the best that’s
available at the time and to make sure it’s
affordable, and that’s why we’ve been
negotiating collectively. For example, the
software we’re using in our data transfer
network is a commercial product and by
negotiating effectively, we’ve been able to
acquire it at a price that allows the sector to do
things which have not been possible in the past.
With the Vendor Panel transferring to CAUDIT
over the coming months, this is a legacy for the
sector as a whole.
‒ Dr Nick Tate
RDSI Project Director
62|
Vendors on the Panel
Twenty-two and counting
Last updated 14/12/2014
63|
Saving Money for the
Sector
CIOs have challenges in navigating procurement
for IT services, testing the market, keeping up
with suppliers. The Vendor Panel simplified
storage options for the whole sector and was a
catalyst for CIOs to open a dialogue with the
research community. It took data storage down
from being too complicated, too hard, to being
a commodity product you can buy and use as
you need. Probably we won’t realise the benefits
for a few years, but it saved the sector an
enormous amount of money.
‒ Peter Nikoletatos
RDSI Project Board
64|
A rich set of proposals
I appreciated the opportunity to work alongside
my colleagues in evaluating a rich set of
proposals from a diverse group of
respondents. The experience contributed to the
expansion of my knowledge around the complex
nature of research support structures. The
professional and pragmatic driving force within
RDSI brought great satisfaction in knowing that
the effort would in time be a major enabler of
Research within Australia.
‒ Rick Van Haeften
Infrastructure Advisory Panel and Independent Vendor
Panel Evaluation Committee
65|
Towards public cloud
This is potentially the last generation of serious
storage the Nodes will own. The project has
looked extensively at how public cloud could
complement in-house storage and compute, and
we’ve established agreements with Amazon Web
Services to eliminate costs in moving research
data in and out of the public cloud, to enable
the Nodes to make some informed decisions
about that in the future.
‒ Paul Campbell
Vendor Engagement Specialist, RDSI Project
66|
A data storage
ecosystem
When it comes to large-scale infrastructure,
organisations for the foreseeable future will use a
hybrid solution. They will have some capability
internally, some from private cloud providers
such as Intersect, and some from public cloud
infrastructure. We are part of an ecosystem that
allows the data to flow around these different
parties which collectively provide an ongoing
solution to data storage.
‒ Dr Ian Gibson
CEO, Intersect Australia
Case Studies
Chapter X
From July 2013, the RDSI Project began collecting use cases on how
research groups across Australia are using collections stored at RDSI
nodes, and why RDSI-funded storage is important to their research.
From high energy physics to the humanities, from climate to cancer
research, researchers are discovering common needs around research
data. They all need to preserve and store their data, access and share it
with collaborators, bring disparate collections together to be analysed
by common tools, and in many cases, reuse data that was collected by
someone else or for a different purpose.
68|
A major new human
genome collection
The challenge:
The Garvan Institute of Medical Research is
sequencing over 4000 healthy human
genomes to create a Medical Genomics
Reference Bank for researchers around the
world.
How RDSI is helping:
The sequencing will generate 4.5 petabytes
of data over the next 3 years. Storage
through RDSI helps reduce costs to
researchers and allows the data to be
moved easily among Nodes for analysis.
The outcome:
Australian researchers are positioned to
take a leading role in emerging genomics
research through access to cost-effective
genome sequencing and genomic
collections of international importance.
Image courtesy of P. Morris, Garvan Institute
“We see this as providing Australia with a seat at the
table and an opportunity to be amongst the world
leaders in an area that’s emerging so rapidly.”
– A/Prof Marcel Dinger
Head of Clinical Genomics and Genome Informatics, Garvan Institute
69|
Access to what was
once inaccessible
The challenge:
Award-winning natural history
cinematographer and marine scientist
Richard Fitzpatrick has 20 years’ worth of
film footage of the complex behaviours of
ocean and terrestrial creatures.
How RDSI is helping:
Through RDSI storage, Richard is now able
for the first time to make this collection of
over 1 petabyte of data accessible and
searchable by researchers everywhere.
The outcome:
The footage is being used by the
Queensland Government to track turtle
hatchling success rates at Raine Island and
by JCU to study the ecology and biology of
venomous box jellyfish.
“The fact that it’s now searchable is
just huge. There are 5000 hours of
footage, and now you can go in and
chase stuff yourself. That in itself is
monumental.”
– A/Prof Jamie Seymour
Director, Tropical Australian Venom
Research Unit
70|
Opening the door to
use large datasets in a
HPC environment
The challenge:
Geoscience Australia is bringing together 30
years of earth observation satellite images
into a Data Cube that creates a geographic
time machine, allowing scientists to apply
the data to big problems such as managing
flood and fire risk.
How RDSI is helping:
RDSI storage makes available data that was
previously locked within agencies. RDSI
storage within the NCI computational
environment opens the door for using HPC
to work with these large datasets.
The outcome:
The National Flood Risk Information Project
(NFRIP) is using the Data Cube to create a
portal showing areas of land where surface
water has been observed from satellites in
the past, to raise community awareness of
flood risks.
“We now have hundreds of terabytes of satellite data
covering all of Australia going back 30 years, and we
wanted to begin applying it to big problems.”
‒ Dr Adam Lewis
National Earth & Marine Observations Group, Geoscience Australia
71|
Preserving collections
The challenge:
Collections containing real examples of the
use of speech, language, and music were
stored in locations disparate from one
another. Accessibility was difficult, and
some collections were at risk of being lost.
How RDSI is helping:
A growing number of these collections have
been brought together by the human
communication science community, stored
through RDSI, and are now accessible
through the Alveo virtual laboratory, funded
by NeCTAR.
The outcome:
These collections are being preserved and
used in new ways by linguists, psychologists,
musicologists, and computational scientists.
72|
End-to-end research
data management
The challenge:
The Australian Synchrotron needed to
protect, store, provide access to, and allow
researchers to share, reuse, and validate
data from Synchrotron beamline
experiments.
How RDSI is helping:
Technical staff from the Monash eResearch
Centre and VicNode worked with the
Synchrotron to develop a solution, which
uses RDSI storage to store and provide
access to the data. It also uses the NeCTAR
Research Cloud and DOIs from ANDS.
The outcome:
Store.Synchrotron is the first persistent,
open data store in the world for a
synchrotron. Thousands of datasets have
been stored in the permanent, accessible
archive.
“Ours is going to be the only system in the world where
all of the primary data from the beamlines, every frame,
will go into the store. And it will be there. This is an
absolute world first.”
‒ Dr Tom Caradoc-Davies
Principal Scientist for Macromolecular Crystallography, Australian
Synchrotron
73|
Data reuse and new
collaborations
The challenge:
Dr Jeremy VanDerWal at JCU creates
models to study how climate change will
affect bird and animal populations. The
models were previously behind university
firewalls. Providing access to others was
difficult.
How RDSI is helping:
RDSI storage integrated with the NeCTAR
Research Cloud allows these models to be
available and easily run by other
researchers. As a result, use of the models
by other groups is growing rapidly.
The outcome:
A group of National Resource Management
Regions (NRMs) has found them so
beneficial they are funding Dr VanDerWal’s
group to include fresh water species
information.
“Previously my thinking was limited by the small amount
of storage and computing that was available to me. I
always had to summarise down and minimise the data. I
don’t have to do that now. I don’t have to worry about
the live disk limitation or the compute resources. Now I
can keep doing the research as I’d like to see it done.”
‒ Dr Jeremy VanDerWal
Centre for Tropical Biodiversity and Climate Change, James Cook
University
Looking back and
looking ahead
Chapter XI
75|
Project Success
The RDSI Project set out to transform the way
in which research data in Australia is stored and
made available to its potential users. By any
measure, the project has been successful in
achieving this.
When the contract for the project was signed
between the University of Queensland and the
Commonwealth Government on Christmas Eve
2010, it has been estimated that there was a
total of about 5 Petabytes of research data
stored throughout the sector and that much of
this was inaccessible to most researchers. By the
end of the project in 2015, it is expected that
over 55 Petabytes of data will be available in
over 70 Petabytes of storage. Even more
importantly, this will be stored in facilities that
are able to make it collaboratively available to
researchers.
‒ Dr Nick Tate
RDSI Project Director
76|
Are researchers finding
it valuable?
Researchers are voting with their data. They’re
bringing it to the Nodes, they’re putting it on.
It’s a great leap forward.
‒ Peter Elford
Director, Government Relations, AARNet
77|
Cultural change
In addition to putting the tin on the ground
to store data, a key success of the RDSI
Project has been to facilitate cultural change
around collaboration and sharing.
‒ Brian Anker
Chair, RDSI Project Board
78|
Supporting research
activities
Coming from an IT background, I learned so
much from working on this project about
supporting research activities in the next decade.
It’s not just about compute and store, it’s about
collaboration. It’s about connecting, about
access, about identity. It’s about protecting the
work. It’s about curation of data. It’s about
making it available for groups, not just for now
but in the future. It’s just so big, it takes a while
to get your head around it.
‒ Peter Nikoletatos
RDSI Project Board
79|
Fingers into the future
With a lot of projects there is a start, a finish,
and you move onto the next thing. This thing
really has fingers into the future in being able to
act as a building block for other initiatives to
build on.
‒ Brian Anker
Chair, RDSI Project Board
80|
Data access
Our views of data have changed as RDSI has evolved.
When the project started, everyone was talking about
data storage. But as you start storing the data, you
realise the real problem is access. In the early days, the
access mechanisms available were extremely primitive.
Now we have the data stored, we have the Mediaflux tool
to make data easy to find and access, we have Aspera
for moving large quantities of data. These weren’t even
really imagined when we started the RDSI project. As we
go forward, people will begin to realise that the real
value that’s been delivered by the system is the
organisation of the data into a way that people can find
it, access it, and manage it. That’s a big change that
RDSI has made.
‒ Rob Cook
CEO, QCIF
81|
Thinking outside the
box
A lot of people say to me, ‘This is great. I can
now go to one location, I can have access to this
dataset and to this other dataset right
alongside, whereas before I’d have to go to
multiple locations to get all the data I needed.’
And what I’m hearing in the wider research
community is that people are starting to think
outside the box. They can now suddenly combine
two datasets from different disciplines together
and potentially do new science. So it’s quite
exciting.
‒ Brendan Davey
Deputy Director, TPAC
82|
Focus on research
Once you’ve solved the problem of knowing how
to move data around and knowing where to put
it, you can start to focus on other things. And
that’s really where RDSI will continue to change
research in Victoria. You won’t need to focus on
where to put the data and whether or not it’s a
good idea to put it there. You can move on. As
an operator, the best thing is when you do get
people to use these services and they never, ever
call you again. Because it means the service is
humming away.
‒ Dr Steven Manos
Manager Research Services in ITS, The University of
Melbourne
83|
The value of data
Researchers are beginning to realise the value of
their data. A professor in pathology from The
University of Melbourne was one of our first
major consumers of data storage. We had gone
in to speak to her. Their archiving solution was a
set of hard drives on a shelf, and she wanted
advice on a better solution. She said, ‘Well, I pay
$70,000 a year in liquid nitrogen to preserve
my physical tissue samples. Why wouldn’t I pay
the equivalent to look after my digital assets?’
For her, the value proposition was obvious.
‒ Dr Steven Manos
Manager Research Services in ITS, The University of
Melbourne
84|
A million-fold increase
in scale
In 1996 I became the Chairman of the World
Ocean Circulation Experiment Data Products
Committee, which was all about assembling the
data from this billion dollar international
project. I had 12 organisations working for me
and in those 12, probably 35 effective full-time
staff delivering up data on a regular basis. To
give you a sense of the scale, the final product in
2002 from this billion dollar experiment fit
onto a single DVD. It was just 4.7 gigabytes, but
we won accolades because that dataset, huge at
the time, was delivered across the Internet.
Now in 2014, we have over 30 petabytes of
data approved for ingestion into the RDSI Nodes
across Australia. So that’s more than a million-fold
increase in data, with fewer people
involved, and with the additional challenge that
the datasets come from a much more diverse
research community.
‒ Prof Nathan Bindoff
Director of TPAC and Professor of Physical Oceanography
85|
Putting in a chair lift
Sharing research data for everyone’s benefit
involves taking a risk. You have to climb a hill.
With the RDSI investment, the government has
put in a chair lift to help the research sector get
up the hill to see the benefit on the other side.
‒ Prof Liz Sonenberg
RDSI Project Board
86|
Enhanced research
capacity
The success of RDSI is unparalleled and its legacy
is a substantially enhanced research
capacity. To keep Australia competitive in
international research we need more assured
funding to support not just the large capital
investment needed to continually build our
computational capacity and data storage but to
fund the people to ensure these investments are
worthwhile and that they serve the needs of our
cutting edge researchers.
‒ Prof Doug McEachern
RDSI Project Board
87|
Moving out of the data
ice age
An interesting lesson that wasn’t clear to me
when we started is that we’re really at a very
early stage of maturity in dealing with data. We
didn’t realise that we were in the ice age; it was
all frozen. You can get a real sense of excitement
from recognising what people will be able to do
with data when it becomes so easy to use. And
it will, you know.
‒ Rob Cook
CEO, QCIF
88|
Evolution
I think the RDSI Project has taken us through a
significant learning curve. Data is a tricky thing.
It’s so multi-dimensional, and it has so many
owners. Working with the multiplicity of
interests is quite challenging. And so the place
we’ve ended up is by evolution. You would not
have been able to write it down on a piece of
paper on Day One.
‒ Prof Lindsay Botten
Director, National Computational Infrastructure
89|
The pathway is real
It’s important to understand that it’s not just
about where we have ended up. It’s about the
pathway. The pathway is real.
‒ Dr Rhys Francis
RDSI Project Board
90|
Passing the Baton
It has been quite a journey over the past 4 years
as together we have created this extraordinary
infrastructure for the research sector. The
project has tackled a rich tapestry of issues and
challenges, but with the help of all our
stakeholders we now have a result to be proud
of.
Researchers have made significant gains in the
way they interact with their data by being able
to concentrate on their research rather than
worrying about the volume of data they are
producing or the mechanisms to store that data.
We now pass the baton for continued
development and support of this national
infrastructure to the RDSI Node Operators, who
will lead the next step in its evolution. We wish
them luck and long lasting sustainability.
‒ Dr Nick Tate
RDSI Project Director
The team
Chapter XII
92|
Project Office Communications
Project Director
Dr Nick Tate
n.tate@rdsi.uq.edu.au
+61 7 3365 2019 | +61 412 674 010
Communications Officer
Asher Vennell
a.vennell@rdsi.uq.edu.au
+61 408 517 376
Project Manager
Viviani Paz
v.paz@rdsi.uq.edu.au
+61 7 3365 2033 | +61 402 280 257
Storyteller
Patricia McMillan
patricia@patriciamcmillan.com
+61 434 602 050
Office Manager
Toni Walkinshaw
office@rdsi.uq.edu.au
+61 7 3365 2030 | +61 419 477 490
Solutions Specialist
Loretta Davis
loretta.davis@rdsi.uq.edu.au
+61 407 370 474
93|
Data Sharing (DaSh)
Vendor Engagement Specialist
Paul Campbell
paul.campbell@cogentia.com.au
+61 7 3878 2666 | +61 402 002 266
Security Policy Manager
Mark McPherson
mark.mcpherson@rdsi.uq.edu.au
+61 418 425 872
Node Development (NoDe)
NoDe Manager
Richard Northam
richard.northam@rdsi.uq.edu.au
+61 417 044 625
Research Data Services (ReDS)
ReDS Programme Manager
Peter Hicks
peter.hicks@rdsi.uq.edu.au
+61 401 103 640
Research Data Manager
Dr Markus Buchhorn
Markus.buchhorn@rdsi.uq.edu.au
+61 417 281 429
Research Data Manager
Dr Frankie Stevens
frankie.stevens@rdsi.uq.edu.au
+61 435 657 730
94|
Interacting with the
community
I’ve really enjoyed the breadth of interaction
we’ve been able to have as a project team across
the stakeholders – the research communities, the
universities and science agencies, the Nodes. It’s
been wonderful to be able to talk to all of those
stakeholder groups. One of the things that’s been
most interesting for me has been to see the
different approaches those groups bring to
data—how research data management might be
viewed at the institutional level, versus the state
level, versus a person working in a laboratory.
‒ Dr Frankie Stevens
Research Data Manager, RDSI Project
95|
From principle to
practice
When I first started with RDSI, communications
were focused on the overall brand awareness of
the project. As the Nodes' capabilities improved
and became operational, there was a
fundamental shift towards their
accomplishments. The metamorphosis from
principle to practice has seen collective available
data go from zero to over 16 petabytes.
‒ Asher Vennell
Senior Communications Officer, RDSI Project
96|
Exceeding expectations
Having worked in distributed research
environments, I had an understanding of
research data needs. As I was welcomed into
the project team it became apparent that RDSI
was not only meeting data storage
requirements, but exceeding all researcher
expectations.
‒ Toni Walkinshaw
Office Manager, RDSI Project
97|
From principle to
practice
The early stages of a major project like this are
hard work, and you don’t get to see the real
value until later when people begin to use it. I
had the opportunity towards the end of the
project to talk with researchers who
were adding data or using data via the
Nodes. Their response to this new capability was
overwhelming. They described how it was
enabling new collaborations, giving them access
to data that had previously been locked away,
and fuelling new research they would not have
been able to do without it.
‒ Patricia McMillan
Storyteller, RDSI Project
98|
Nodes
Board Members
Brian Anker
Independent Chair
Dr Rhys Francis
Director - eResearch Futures P/L
Professor Doug McEachern
Former Pro Vice-Chancellor Research and Innovation - The University of Western Australia
Professor Anton Middelberg
Deputy Vice-Chancellor (Research) - The University of Queensland
Former Board Members
Peter Nikoletatos
Executive Director and Chief Information Officer - La Trobe University
John Shipp
Vice-President - Australian Library and Information Association
Professor Liz Sonenberg
Pro Vice-Chancellor (Research Collaboration) - The University of Melbourne
Professor Max Lu
Provost and Senior Vice-President - The University of Queensland
Professor Jill Trewhella
Deputy Vice-Chancellor (Research) - The University of Sydney
The end

More Related Content

Viewers also liked

Viewers also liked (12)

Drama
DramaDrama
Drama
 
Untitled Presentation
Untitled PresentationUntitled Presentation
Untitled Presentation
 
Q7 Evaluation
Q7 EvaluationQ7 Evaluation
Q7 Evaluation
 
Drama
DramaDrama
Drama
 
2.
2.2.
2.
 
The male gaze theory
The male gaze theoryThe male gaze theory
The male gaze theory
 
Metabolizam
MetabolizamMetabolizam
Metabolizam
 
Proteini
ProteiniProteini
Proteini
 
Nukleofilna supstitucija
Nukleofilna supstitucijaNukleofilna supstitucija
Nukleofilna supstitucija
 
Metabolizam aminokiselina
Metabolizam aminokiselinaMetabolizam aminokiselina
Metabolizam aminokiselina
 
Regulacija homeostaze glukoze
Regulacija homeostaze glukozeRegulacija homeostaze glukoze
Regulacija homeostaze glukoze
 
VODA, ZAGAĐIVANJE VODE I MERE ZAŠTITE
VODA, ZAGAĐIVANJE VODE I MERE ZAŠTITEVODA, ZAGAĐIVANJE VODE I MERE ZAŠTITE
VODA, ZAGAĐIVANJE VODE I MERE ZAŠTITE
 

Similar to RDSI: Creating Australia's Research Data Cloud

Capacity Building and Implementation Guidelines for Open Science: Practices o...
Capacity Building and Implementation Guidelines for Open Science: Practices o...Capacity Building and Implementation Guidelines for Open Science: Practices o...
Capacity Building and Implementation Guidelines for Open Science: Practices o...Academy of Science of South Africa (ASSAf)
 
Open Data and Big Data Capacity Building Initiative
Open Data and Big Data Capacity Building InitiativeOpen Data and Big Data Capacity Building Initiative
Open Data and Big Data Capacity Building InitiativeCIARD Movement
 
Libraries and Research Data Management – What Works? Lessons Learned from the...
Libraries and Research Data Management – What Works? Lessons Learned from the...Libraries and Research Data Management – What Works? Lessons Learned from the...
Libraries and Research Data Management – What Works? Lessons Learned from the...LIBER Europe
 
The current challenges of upgrading the infrastructure
The current challenges of upgrading the infrastructureThe current challenges of upgrading the infrastructure
The current challenges of upgrading the infrastructureArhiv družboslovnih podatkov
 
Mobilising a nation: RDM education and training in South Africa
Mobilising a nation: RDM education and training in South AfricaMobilising a nation: RDM education and training in South Africa
Mobilising a nation: RDM education and training in South Africaheila1
 
Vitae tomorrows-researchers
Vitae tomorrows-researchersVitae tomorrows-researchers
Vitae tomorrows-researchersMatthew Dovey
 
Systems and Services: Adding Value For Research
Systems and Services: Adding Value For ResearchSystems and Services: Adding Value For Research
Systems and Services: Adding Value For ResearchARDC
 
Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Robin Rice
 
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and RealityA VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality Paul Courtney
 
Recognizing Research Technologists in the Research Process
Recognizing Research Technologists in the Research ProcessRecognizing Research Technologists in the Research Process
Recognizing Research Technologists in the Research ProcessMatthew Dovey
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...Andrew Sallans
 
Researchers: how and why manage research data; CDU Darwin 070915
Researchers: how and why manage research data; CDU Darwin 070915Researchers: how and why manage research data; CDU Darwin 070915
Researchers: how and why manage research data; CDU Darwin 070915Richard Ferrers
 
Data Management Plan Advising? A New Business Venture for Libraries
Data Management Plan Advising?  A New Business Venture for LibrariesData Management Plan Advising?  A New Business Venture for Libraries
Data Management Plan Advising? A New Business Venture for LibrariesAndrew Sallans
 
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...Academy of Science of South Africa (ASSAf)
 
Supporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingSupporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingLisa Haddow
 

Similar to RDSI: Creating Australia's Research Data Cloud (20)

Capacity Building and Implementation Guidelines for Open Science: Practices o...
Capacity Building and Implementation Guidelines for Open Science: Practices o...Capacity Building and Implementation Guidelines for Open Science: Practices o...
Capacity Building and Implementation Guidelines for Open Science: Practices o...
 
CODATA, Open Science Policies and Capacity Building by Simon Hodson
CODATA, Open Science Policies and Capacity Building by Simon HodsonCODATA, Open Science Policies and Capacity Building by Simon Hodson
CODATA, Open Science Policies and Capacity Building by Simon Hodson
 
Open Data and Big Data Capacity Building Initiative
Open Data and Big Data Capacity Building InitiativeOpen Data and Big Data Capacity Building Initiative
Open Data and Big Data Capacity Building Initiative
 
Libraries and Research Data Management – What Works? Lessons Learned from the...
Libraries and Research Data Management – What Works? Lessons Learned from the...Libraries and Research Data Management – What Works? Lessons Learned from the...
Libraries and Research Data Management – What Works? Lessons Learned from the...
 
African Open Science Platform: Pilot Phase
African Open Science Platform: Pilot PhaseAfrican Open Science Platform: Pilot Phase
African Open Science Platform: Pilot Phase
 
The current challenges of upgrading the infrastructure
The current challenges of upgrading the infrastructureThe current challenges of upgrading the infrastructure
The current challenges of upgrading the infrastructure
 
Mobilising a nation: RDM education and training in South Africa
Mobilising a nation: RDM education and training in South AfricaMobilising a nation: RDM education and training in South Africa
Mobilising a nation: RDM education and training in South Africa
 
Vitae tomorrows-researchers
Vitae tomorrows-researchersVitae tomorrows-researchers
Vitae tomorrows-researchers
 
CODATA: Open Data, FAIR Data and Open Science/Simon Hodson
CODATA: Open Data, FAIR Data and Open Science/Simon HodsonCODATA: Open Data, FAIR Data and Open Science/Simon Hodson
CODATA: Open Data, FAIR Data and Open Science/Simon Hodson
 
Digging into Data Funders Forum
Digging into Data Funders ForumDigging into Data Funders Forum
Digging into Data Funders Forum
 
Systems and Services: Adding Value For Research
Systems and Services: Adding Value For ResearchSystems and Services: Adding Value For Research
Systems and Services: Adding Value For Research
 
Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...Services, policy, guidance and training: Improving research data management a...
Services, policy, guidance and training: Improving research data management a...
 
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and RealityA VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality
A VIVO VIEW OF CANCER RESEARCH: Dream, Vision and Reality
 
Recognizing Research Technologists in the Research Process
Recognizing Research Technologists in the Research ProcessRecognizing Research Technologists in the Research Process
Recognizing Research Technologists in the Research Process
 
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...UVa Library Scientific Data Consulting Group (SciDaC):  New Partnerships and...
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and...
 
Researchers: how and why manage research data; CDU Darwin 070915
Researchers: how and why manage research data; CDU Darwin 070915Researchers: how and why manage research data; CDU Darwin 070915
Researchers: how and why manage research data; CDU Darwin 070915
 
Rdaeu russia_fg_1_july2014_final
Rdaeu  russia_fg_1_july2014_finalRdaeu  russia_fg_1_july2014_final
Rdaeu russia_fg_1_july2014_final
 
Data Management Plan Advising? A New Business Venture for Libraries
Data Management Plan Advising?  A New Business Venture for LibrariesData Management Plan Advising?  A New Business Venture for Libraries
Data Management Plan Advising? A New Business Venture for Libraries
 
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policie...
 
Supporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of StirlingSupporting Research Data Management at the University of Stirling
Supporting Research Data Management at the University of Stirling
 

Recently uploaded

ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationAadityaSharma884161
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...JhezDiaz1
 
Types of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxTypes of Journalistic Writing Grade 8.pptx
Types of Journalistic Writing Grade 8.pptxEyham Joco
 
Hierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementHierarchy of management that covers different levels of management
Hierarchy of management that covers different levels of managementmkooblal
 
Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Procuring digital preservation CAN be quick and painless with our new dynamic...
Procuring digital preservation CAN be quick and painless with our new dynamic...Jisc
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Planning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxPlanning a health career 4th Quarter.pptx
Planning a health career 4th Quarter.pptxLigayaBacuel1
 
Judging the Relevance and worth of ideas part 2.pptx
Judging the Relevance  and worth of ideas part 2.pptxJudging the Relevance  and worth of ideas part 2.pptx
Judging the Relevance and worth of ideas part 2.pptxSherlyMaeNeri
 
Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Influencing policy (training slides from Fast Track Impact)
Influencing policy (training slides from Fast Track Impact)Mark Reed
 
Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Computed Fields and api Depends in the Odoo 17
Computed Fields and api Depends in the Odoo 17Celine George
 
AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.AmericanHighSchoolsprezentacijaoskolama.
AmericanHighSchoolsprezentacijaoskolama.arsicmarija21
 
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxMULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptx
MULTIDISCIPLINRY NATURE OF THE ENVIRONMENTAL STUDIES.pptxAnupkumar Sharma
 
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...
HỌC TỐT TIẾNG ANH 11 THEO CHƯƠNG TRÌNH GLOBAL SUCCESS ĐÁP ÁN CHI TIẾT - CẢ NĂ...Nguyen Thanh Tu Collection
 
Full Stack Web Development Course for Beginners
Full Stack Web Development Course  for BeginnersFull Stack Web Development Course  for Beginners
Full Stack Web Development Course for BeginnersSabitha Banu
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
Grade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxGrade 9 Q4-MELC1-Active and Passive Voice.pptx
Grade 9 Q4-MELC1-Active and Passive Voice.pptxChelloAnnAsuncion2
 
ENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomENGLISH6-Q4-W3.pptxqurter our high choom
ENGLISH6-Q4-W3.pptxqurter our high choomnelietumpap1
 

Recently uploaded (20)

ROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint PresentationROOT CAUSE ANALYSIS PowerPoint Presentation
ROOT CAUSE ANALYSIS PowerPoint Presentation
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
ENGLISH 7_Q4_LESSON 2_ Employing a Variety of Strategies for Effective Interp...

RDSI: Creating Australia's Research Data Cloud

  • 2. Table of Contents: Chapter I, In the Beginning (03); Chapter II, Consulting the community (13); Chapter III, The data collections (23); Chapter IV, An integrated infrastructure (38); Chapter V, Moving data (43); Chapter VI, Accessing data (48); Chapter VII, Protecting data (52); Chapter VIII, Working with data (56); Chapter IX, Selecting solutions (59); Chapter X, Case studies (67); Chapter XI, Looking back and looking ahead (74); Chapter XII, The team (91)
  • 4. …there was nothing: At the very beginning of this project there was nothing in the sector for collaborative storage. Really nothing. So it started from scratch, to persuade people that it was a good idea to store data, except not just in your university or in your research group. – Dr Nick Tate, RDSI Project Director
  • 5. No discoveries without data: RDSI is, in my mind, entirely about the delivery of data and the amassing of the data. It is my belief that without bringing collections together, without making them accessible, you can’t make new discoveries. – Prof Nathan Bindoff, Director, TPAC and Professor of Physical Oceanography
  • 6. A Major Gap: I was working for the Australian Research Council when we were doing projects on the cutting edge in research which crosses disciplinary boundaries. Like many at the same time we were convinced the future of research would be shaped by the adoption of digital technologies, increased computational power, high performance computing and massive data storage and tools to make it possible to link and re-use existing research data. After considerable federal investment and matching efforts from state governments and universities, there was one major gap: massive data stores to service the growing eResearch communities existing and coming into being. – Prof Doug McEachern, RDSI Project Board
  • 7. How it started: There is growing recognition that new ways to conduct research have emerged and are being validated across most research disciplines. Adding to traditional forms of research that rely on experiment, theory and testing hypotheses using data, it is now evident that researchers also: » collect increasingly larger sets of data as a primary form of research; and » use modelling tools to assist them in deriving patterns, perceptions and trends that can form the basis for establishing and confirming hypotheses. Information and communications technology (ICT) is the cornerstone to such new approaches, providing the means not only for increasingly powerful computer-enabled simulation and modelling, but also the very avenue to manage and integrate the increasing volume and complexity of datasets and collections. Hence, ICT is not only a resource to administer and manage research but also to drive and innovate the ways in which research is conducted. – Strategic Roadmap for Australian Research Infrastructure, 2008, p. 19. In 2008, the Government’s Strategic Roadmap for Australian Research Infrastructure highlighted the need to manage and integrate the increasing volume and complexity of research datasets and collections.
  • 8. Research storage before RDSI: Part of my role here at QCIF is to help research groups with their storage needs. In the days before RDSI it caught me by a huge surprise that even at a major university it was quite hard for a research group to get the right kind of storage, to have some better way of collaborating than sending a spreadsheet across the globe. I remember the first meeting I had with one professor at UQ who had come from Oxford. He said, ‘Look, I’m a centre director, I’m a professor, and I have spent six months just trying to get a little bit of storage. I’m worried I should be doing something else.’ – Graham Chen, eResearch Manager, QCIF
  • 9. Research storage before RDSI: A retiring academic in Marine Science came to us at the University of Sydney library. He had spent his whole life collecting data from the beaches of Australia. He asked us, ‘What do I do with all of this? It’s in various formats, it’s valuable, but I don’t know what to do with it.’ We approached the university, but at that stage the view was that when research results were published, the research was finished. Why would they want to store the data and why would they want to share it? – John Shipp, RDSI Project Board
  • 10. Research storage before RDSI: We asked another researcher what she was doing to curate her data. She said, ‘Every week I download it onto CD. I make three copies: one for home, one for the office, and one I send to my mother in Perth.’ Even today I have a lingering fantasy that somewhere in Perth there is a little old lady with these CDs stacked up around her and one day they will cave in on top of her and she will be crushed by her daughter’s research. This is just imagination, but it’s an indication of what people were doing because they didn’t have facilities to curate their data properly. – John Shipp, RDSI Project Board
  • 11. 2010: RDSI begins. Dates: 2010-2014. Funding: $50m. Program: Education Investment Fund (EIF) under the Super Science Initiative. Lead agent: The University of Queensland. Chair: Prof Max Lu, Deputy Vice-Chancellor (Research). Project Director: Dr Nick Tate. The aim of the RDSI Project is that researchers will be able to use and manipulate significant collections of data that were previously either unavailable or difficult to access, and that there will be a consistent means of accessing this data. The Project will be realised through the creation and development of data storage infrastructure accessed through a common infrastructure layer and provided by agencies within the sector, or commercial providers, or both.
  • 12. Where should we put the data? The question was, where should we put the data? This was stopping progress across research. People couldn’t keep storing it on their desktops, and there were no onshore cloud services at the time. That left us with two options: either one big data centre or a small number of sites around Australia. The problem with one site is then you have to put everything else around it, and you lose innovation. Several is better. – Dr Rhys Francis, RDSI Board
  • 14. Opening a Dialogue (consultation timeline, June 2011 to May 2012): ReDS Strawman Consultation Series, 3 June – 1 July; DaSh Strawman Consultation Series, 17 June; Vendor Briefing, 1 Aug; ReDS Tinman Consultation, 28 Sep; DaSh Tinman Consultation, 11 Nov; ReDS Consultation Series, 9 Feb – 2 Mar
  • 15. Consulting the community: Because the Education Investment Fund has a restriction that you can’t use the funding for operation, we needed operational partners who could provide the operational working funds. We didn’t know if anybody would be interested in putting their hand up to do that. We were therefore consulting to see who would be interested, under what circumstances they might be interested, and what would be possible for them to achieve that would meet the project objectives and the needs of researchers. By doing that we were able to put together a plan. – Dr Nick Tate, RDSI Project Director
  • 16. From Straw Man to Tin Man: It was a large community engagement exercise. For each program, we would find out who might be interested, get them together for a workshop, bounce around ideas, and then consolidate the thinking into a document. We had a lot of good feedback, in some cases probably 100 individual contributions to the Straw Man documents. We would then develop a Tin Man document for each of the programs and work through those, to ultimately lead into the business plan for the project. It matured the thinking of the project quite quickly, and it didn’t feel like an invention being planted on the community from outside. This was something the community built itself. – Dr Markus Buchhorn, Research Data Manager, RDSI Project
  • 17. Building awareness: A moment stands out in my mind, from one of the early workshops. A lady was there who was researching children affected by asthma. She wasn’t sure if her data would be suitable because it was not a large dataset. I assured her that it was the potential for reuse, not the size, that was important. By the end of the workshop she was excited about the possibilities for enhancing her research. – Mary Sharp, Infrastructure Advisory Panel
  • 18. Node Development: The NoDe Development programme was designed to identify, strengthen and develop research data centres so that they were able to hold and process high data volumes. These data centres became Nodes of the RDSI project, and their operators were provided with establishment funding from this programme. The Node Development programme funded the development of eight high capacity Nodes: six Primary Nodes located in Brisbane, Sydney, Canberra, Melbourne, Adelaide and Perth, with two additional Nodes in Townsville and Hobart.
  • 19. Identifying the Nodes: RDSI went through a process of calling for proposals. We were all looking at what our individual contribution to the national infrastructure would be, and I remember in Queensland we focused on life sciences and eco sciences as being the main areas where we thought that there was a really significant body of research expertise. – Rob Cook, CEO, QCIF
  • 20. Responding to the call for Nodes: We put a proposal to our members that Intersect propose to become a Node of RDSI. Intersect has twelve university members, quite a lot, and the thing that is interesting is that all twelve quickly said, ‘Yes, this is a really good idea.’ So we responded to the call for proposals and we were one of the Nodes selected from the country. – Dr Ian Gibson, CEO, Intersect Australia
  • 21. Key to better research outcomes: When we think about the services we offer at eResearch SA, we think about what we can do to make researchers more effective. It’s not our job to produce research outcomes, but it is our job to enable researchers to create better research outcomes. RDSI has been key to that. – Mary Hobson, CEO, eResearch SA
  • 24. Research Data Services: The ReDS programme was designed to identify research data holdings of lasting value and importance and contribute funding to their development at the most appropriate Node. ReDS delivers storage services in support of significant data collections, research data sets and associated access tools which are aggregated into related holdings that add value to each other through co-location.
  • 25. Selecting collections: The value of data is only realised when it’s used, so having data that is going to be reused was a critical part of whether it could be stored through the ReDS program. Every collection that has been given an allocation through the ReDS program meets certain criteria that indicate or demonstrate its national significance. – Peter Hicks, ReDS Program Manager, RDSI Project
  • 26. Merit criteria: Determining criteria by which research data might be assessed for merit was difficult across disciplines. You could assess a collection based on how many people it would be shared with. But what might be an appropriate audience size for sharing climate data would not be the same for sharing medical data or humanities data. It was very challenging to find merit criteria that would carry the same weight across all disciplines. – Dr Frankie Stevens, Research Data Manager, RDSI Project
  • 27. Data storage is a commitment: Many of the Nodes had experience with merit allocation processes used for high performance computing. But with HPC, the resources are used, and at the end of the cycle the whole process is repeated. When you try to think about that for data, for long-term storage, you’re not thinking about a resource that goes away in six months. You’re making a long-term commitment. And so the assessment of merit was a much bigger challenge. – Dr Markus Buchhorn, Research Data Manager, RDSI Project
  • 28. Identifying collections: The Intersect model is we have staff located on campus with our member universities, a team of roughly 15 people already talking to research groups, telling them about options and services available to them. So that team were now looking for collections that might benefit from being on the RDSI infrastructure. – Dr Ian Gibson, CEO, Intersect Australia
  • 29. Exposing science agency collections: At NCI we’ve worked very closely with the science agencies we are associated with (CSIRO, Geoscience Australia, and the Bureau of Meteorology) to expose the national and international collections that have been locked up inside those agencies. And the win-win is that the exposure to the national community is of value because these collections otherwise would not have been available, and the confluence of them enables transdisciplinary research. The advantage to the agencies is that the availability of the copy here puts that data in a rich computational environment, much richer than they have internally. That’s the nature of the win-win that’s been possible. – Prof Lindsay Botten, Director, National Computational Infrastructure
  • 30. The long tail of research data: Coordinated research groups like astronomers are able to collect their data and curate it, and eventually they will find the storage. But the humanities, medicine, the environmental sciences, they were less united in their purpose, and I always saw RDSI as a vehicle for bringing those people together and providing them the nascent infrastructure to store their data and make it available. – John Shipp, RDSI Board
  • 32. 55 Petabytes of data available in over 70 Petabytes of storage: That these are huge numbers is beyond question; perhaps more astonishing is that this is an order of magnitude increase in just 4 years. Even more encouraging is that this data is spread across every one of the 22 research disciplines. From Humanities to Radio Astronomy, no Field of Research has been left untouched.
  • 33. Collections by Field of Research (graph; last updated 14/12/2014)
  • 34. About large allocations: The RDSI project allocated up to $9.4M of funds to support large collections at Nodes. Nodes were given the opportunity to submit funding proposals for collections that were too large to be funded under the initial ReDS agreement. Large collection proposals were evaluated by the RDSI Project Board, which decided that a total of five proposals would be funded.
  • 35. Large allocations: I’m really pleased about the large allocations. Having that strategic view at the research community level of how storage is used to aggregate data from particular domains in a way that enables advanced research is a critical outcome of the ReDS programme. – Peter Hicks, ReDS Programme Manager, RDSI Project
  • 36. A national medical research data storage facility: Recognising that there was a potential need to support health and medical research data collections, we put in a proposal to establish a national medical data storage facility using funds from the ReDS special allocation process. We put that proposal to the medical research community and we were overwhelmed with responses. Forty-seven institutions around Australia nominated major collections they would like to store. So what we have now is an opportunity to build a national medical research data storage facility. This is something that is quite uncommon globally. It will allow researchers to get the serendipitous second use outcomes and impacts from that data. – Dr Ian Gibson, CEO, Intersect Australia
  • 37. The large allocations:
      » Murchison Widefield Array Data Archive: The Murchison Widefield Array (MWA) project is funded for operations over a two-year period, which commenced in July 2013. These data sets, of international importance, will assist a global Astronomy and Astrophysics community to do research into the main science goals of the MWA, which include: exploration of the early Universe and the search for signals from the first stars and galaxies; exploration of the transient and dynamic Universe; studies of the Earth's ionosphere; and the study of astrophysics related to objects in our galaxy and in the distant galaxies. Participating RDSI Node is iVEC.
      » National Environmental Research Data Centre: The National Environmental Research Data Collection (NERDC) comprises international and national reference collections spanning five fields. This multidisciplinary confluence of collections: (a) spans the lithosphere, crust, biosphere, hydrosphere, troposphere, and the stratosphere; (b) encapsulates the complex interactions within, and amongst, these layers; and (c) will enable new, transdisciplinary approaches to research. Participating RDSI Node is NCI.
      » National Genomics Data Storage Facility: Genomics is a critical and complex science for the understanding of living forms and the way they are impacted by the environment, for improving medicine, and understanding food amongst many other uses. The facility will store and make available large volumes of genome data generated at the leading national centres, as well as essential national and international genome libraries. Participating RDSI Nodes are Intersect, VicNode and QCIF.
      » Australian Coordinated Characterisation Data Space: The ACCDS underpins national-scale research programs, in particular two recently established characterisation-intensive ARC Centres of Excellence: the ARC Centre of Excellence in Advanced Molecular Imaging (CAMI), which will develop innovative imaging technologies to explore the immune system; and the ARC Centre of Excellence for Integrative Brain Function (CIBF), which is tackling the challenging problems involved in understanding how the human brain works. Participating RDSI Nodes are VicNode, Intersect and QCIF.
      » Australian National Medical Research Data Storage Facility: The foundation data sets for this collection represent major national assets supporting research into Australia’s most significant diseases including heart disease, mental illness, the major cancers, as well as the increasing problems of lifestyle diseases such as diabetes and obesity. Importantly, children’s health and the health of our aging population are both well supported in the foundation data sets of the ANMRDSF. Participating RDSI Nodes are Intersect, VicNode and QCIF.
  • 39. Data Sharing Programme: The DaSh programme was designed to develop the DaSh Collaboration Network (DaShNet) and the DaSh Technical Architecture. The integration of these two major parts of the DaSh programme became the DaSh Technical Framework, which describes the network, data movement capabilities, security and identity matters, data access, cloud gateway access, test platform for the programme’s components and workflow automation capabilities for the RDSI-funded Nodes.
  • 40. Why is it hard? Why is it hard, relative to other programs? It’s hard because data has a narrative that’s different for every data stream. Every data-oriented organisation thinks their data is special and different from everybody else’s. Really it should be about revealing data, gathering it together, and developing tools to express and analyse it. – Prof Nathan Bindoff, TPAC Director and Professor of Physical Oceanography
  • 41. Innovative and ambitious: RDSI was an innovative and ambitious project. It seems like it ought to be straightforward, but it’s actually very hard to present datasets in a meaningful way to the world. It’s not a simple problem. RDSI has highlighted what’s possible. – Jim McGovern, Infrastructure Advisory Panel
  • 42. Building technical skills across the sector: One of the things the project has been able to do is to fund technical and data specialists at the Nodes. Many of the techniques and skills for storing, moving, and accessing large sets of data were new to the project and to the Nodes. Being able to invest in building up a community of people in the sector with these skills and capabilities has been one of the important contributions of RDSI. – Viviani Paz, RDSI Project Manager
  • 44. The need for a fast network: One of the significant technical challenges to consider was moving data. We experienced this early in the project as the ARCS Data Fabric drew to a close and people were trying to move the Integrated Marine Observing System data from Perth to Hobart and Brisbane. The data was going at the speed of congealed porridge running down a hill. It was clear that the network capabilities (not just fast networks, but the techniques, the tools, the software for using them) weren’t in place. So the Project has solved a very serious challenge. When AARNet put in the data transfer network, they are now certifying that they’re getting 95 percent use of the capacity of a 10-gigabit link. That is enormous compared with what was there. It’s not just the bandwidth that’s changed. – Dr Nick Tate, RDSI Project Director
  • 45. The challenge in moving data: Without a doubt, the biggest challenge you have when moving data is that a lot of these datasets are built upon many files that are inherently small in nature, and they’re quite difficult to move over large distances. Even though you might have a lot of bandwidth available to you, you can find that 10 terabytes of data can take weeks if not months to transfer. – Brendan Davey, Deputy Director, TPAC
  • 46. Network enhancements, AARNet and NRN: Our goal was to ensure we could get multiple 10 gigabits of capacity into the RDSI sites. We were looking to get every Node connected to the AARNet backbone with capacity significantly over and above what a big university would have, so they could provide services to a community. And to do that we needed to ensure there were redundant fibre paths, and the appropriate network infrastructure on those fibres to deliver that capacity. – Peter Elford, Network Program Manager (2013), RDSI Project
  • 47. Connecting Nodes to Nodes, and Nodes to Researchers: The Data Sharing Network (DaShNet) is a reliable high-speed network service built over the new AARNet4 backbone network. It connects RDSI-funded Nodes to each other and to researchers around Australia. It can ultimately support up to 100 gigabits per second, significantly increasing data transfer rates across the country.
  • 49. Making it easy to find and access data: Mediaflux will enable the research community to have useful data management tools that are consistent across Australia, so that people will be interacting with research data in the same way, irrespective of where they’re located. – Dr Frankie Stevens, Research Data Manager, RDSI Project
  • 51. Identity and access management: In Australia we already had the Australian Access Federation, a trust fabric for identity management. But one of the things that became very plain early on is that although it handles web access for everyone, it can’t yet do access via other methodologies. This is where the project got into territory where we were cutting new ground. – Richard Northam, Node Development Manager, RDSI Project
  • 53. Securing the data: One of the toughest parts of looking at the security models for this project was to understand how the Nodes would collaborate and share responsibility for the data and data transfers. How they would manage the relationships with researchers to give them a level of comfort that the integrity of their data is being maintained. When your research data has been under your direct control and then it goes outside your own perimeter, there’s a concern that you don’t know what’s happening with it. So for me, it’s been a management job of perception more than anything else. – Mark McPherson, RDSI Security Policy Manager
  • 54. Will my data be safe? One of our biggest challenges in getting people on board is to assure them that their data is going to be safe, it’s going to be secure, that they’ll have access to it, and their partners who use the data will have access to it. – Brendan Davey, Deputy Director, TPAC
  • 55. Removing the obstacles: We were initially concerned about whether researchers would adopt a central data storage facility at all. We drew up a list of 10 obstacles, and you could be pretty sure that if you started talking to a researcher about putting their data on RDSI, they’d start going through these 10 obstacles one by one without you prompting them. ‘I can’t touch it anymore, it’s insecure, you’re only here until the end of 2014, I might have to pay for it,’ and so on. One by one we’ve been able to make these obstacles insignificant. – Rob Cook, CEO, QCIF
  • 57. Connecting with compute facilities (image: the Raijin supercomputer at NCI): “Data is a vital enabler of research. Big data can only be handled in a rich computational environment. It’s not the data alone. It’s not the compute alone. It’s the confluence of the two. People advance their research by being able to have well-managed, integrated collections of data where they can explore new ideas by having a confluence of different datasets available to them.” – Prof Lindsay Botten, Director, National Computational Infrastructure
  • 58. Services researchers need: RDSI has allowed us a platform to develop the services that researchers actually need. It has completely revolutionised not only our thinking but our outlook and our future. – Mary Hobson, CEO, eResearch SA
  • 60. Vendor Panel: The Vendor Panel programme, implemented in partnership with the Council of Australian University Directors of Information Technology (CAUDIT), was created to facilitate the procurement of storage-related infrastructure, software and services. The purposes of the programme were twofold: firstly, to allow Nodes, universities and other authorised users to avoid lengthy tendering processes by using an appropriately constructed panel, and secondly, to support volume pricing across Nodes and the wider Higher Education and Research Sector.
  • 61. An open mind towards solutions: In many of the research infrastructure projects we’ve seen before, there has been a focus on adopting only open source solutions or solutions developed by other researchers. We took the view that we should go with a completely open mind to the process. You can use commercial or open source or other not-for-profit infrastructure software. It doesn’t matter. The important thing is to pick the best that’s available at the time and to make sure it’s affordable, and that’s why we’ve been negotiating collectively. For example, the software we’re using in our data transfer network is a commercial product and by negotiating effectively, we’ve been able to acquire it at a price that allows the sector to do things which have not been possible in the past. With the Vendor Panel transferring to CAUDIT over the coming months, this is a legacy for the sector as a whole. – Dr Nick Tate, RDSI Project Director
  • 63. Saving Money for the Sector: CIOs have challenges in navigating procurement for IT services, testing the market, keeping up with suppliers. The Vendor Panel simplified storage options for the whole sector and was a catalyst for CIOs to open a dialogue with the research community. It took data storage down from being too complicated, too hard, to being a commodity product you can buy and use as you need. Probably we won’t realise the benefits for a few years, but it saved the sector an enormous amount of money. – Peter Nikoletatos, RDSI Project Board
  • 64. A rich set of proposals: I appreciated the opportunity to work alongside my colleagues in evaluating a rich set of proposals from a diverse group of respondents. The experience contributed to the expansion of my knowledge around the complex nature of research support structures. The professional and pragmatic driving force within RDSI brought great satisfaction in knowing that the effort would in time be a major enabler of Research within Australia. – Rick Van Haeften, Infrastructure Advisory Panel and Independent Vendor Panel Evaluation Committee
  • 65. Towards public cloud: This is potentially the last generation of serious storage the Nodes will own. The project has looked extensively at how public cloud could complement in-house storage and compute, and we’ve established agreements with Amazon Web Services to eliminate costs in moving research data in and out of the public cloud, to enable the Nodes to make some informed decisions about that in the future. – Paul Campbell, Vendor Engagement Specialist, RDSI Project
  • 66. A data storage ecosystem: When it comes to large-scale infrastructure, organisations for the foreseeable future will use a hybrid solution. They will have some capability internally, some from private cloud providers such as Intersect, and some from public cloud infrastructure. We are part of an ecosystem that allows the data to flow around these different parties which collectively provide an ongoing solution to data storage. – Dr Ian Gibson, CEO, Intersect Australia
  • 67. Chapter X: Case Studies. From July 2013, the RDSI Project began collecting use cases on how research groups across Australia are using collections stored at RDSI Nodes, and why RDSI-funded storage is important to their research. From high energy physics to the humanities, from climate to cancer research, researchers are discovering common needs around research data. They all need to preserve and store their data, access and share it with collaborators, bring disparate collections together to be analysed by common tools, and in many cases, reuse data that was collected by someone else or for a different purpose.
  • 68. A major new human genome collection. The challenge: The Garvan Institute of Medical Research is sequencing over 4000 healthy human genomes to create a Medical Genomics Reference Bank for researchers around the world. How RDSI is helping: The sequencing will generate 4.5 petabytes of data over the next 3 years. Storage through RDSI helps reduce costs to researchers and allows the data to be moved easily among Nodes for analysis. The outcome: Australian researchers are positioned to take a leading role in emerging genomics research through access to cost-effective genome sequencing and genomic collections of international importance. “We see this as providing Australia with a seat at the table and an opportunity to be amongst the world leaders in an area that’s emerging so rapidly.” – A/Prof Marcel Dinger, Head of Clinical Genomics and Genome Informatics, Garvan Institute. (Image courtesy of P. Morris, Garvan Institute.)
  • 69. 69| Access to what was once inaccessible The challenge: Award-winning natural history cinematographer and marine scientist Richard Fitzpatrick has 20 years’ worth of film footage of the complex behaviours of ocean and terrestrial creatures. How RDSI is helping: Through RDSI storage, Richard is now able for the first time to make this collection of over 1 petabyte of data accessible and searchable by researchers everywhere. The outcome: The footage is being used by the Queensland Government to track turtle hatchling success rates at Raine Island and by JCU to study the ecology and biology of venomous box jellyfish. “The fact that it’s now searchable is just huge. There are 5000 hours of footage, and now you can go in and chase stuff yourself. That in itself is monumental.” – A/Prof Jamie Seymour Director, Tropical Australian Venom Research Unit
  • 70. 70| Opening the door to use large datasets in a HPC environment The challenge: Geoscience Australia is bringing together 30 years of earth observation satellite images into a Data Cube that creates a geographic time machine, allowing scientists to apply the data to big problems such as managing flood and fire risk. How RDSI is helping: RDSI storage makes available data that was previously locked within agencies. RDSI storage within the NCI computational environment opens the door for using HPC to work with these large datasets. The outcome: The National Flood Risk Information Project (NFRIP) is using the Data Cube to create a portal showing areas of land where surface water has been observed from satellites in the past, to raise community awareness of flood risks. “We now have hundreds of terabytes of satellite data covering all of Australia going back 30 years, and we wanted to begin applying it to big problems.” ‒ Dr Adam Lewis National Earth & Marine Observations Group, Geoscience Australia
  • 71. 71| Preserving collections The challenge: Collections containing real examples of the use of speech, language, and music were stored in locations disparate from one another. Accessibility was difficult, and some collections were at risk of being lost. How RDSI is helping: A growing number of these collections have been brought together by the human communication science community, stored through RDSI, and are now accessible through the Alveo virtual laboratory, funded by NeCTAR. The outcome: These collections are being preserved and used in new ways by linguists, psychologists, musicologists, and computational scientists.
  • 72. 72| End-to-end research data management The challenge: The Australian Synchrotron needed to protect, store, provide access to, and allow researchers to share, reuse, and validate data from Synchrotron beamline experiments. How RDSI is helping: Technical staff from the Monash eResearch Centre and VicNode worked with the Synchrotron to develop a solution, which uses RDSI storage to store and provide access to the data. It also uses the NeCTAR Research Cloud and DOIs from ANDS. The outcome: Store.Synchrotron is the first persistent, open data store in the world for a synchrotron. Thousands of datasets have been stored in the permanent, accessible archive. “Ours is going to be the only system in the world where all of the primary data from the beamlines, every frame, will go into the store. And it will be there. This is an absolute world first.” ‒ Dr Tom Caradoc-Davies Principal Scientist for Macromolecular Crystallography, Australian Synchrotron
  • 73. 73| Data reuse and new collaborations The challenge: Dr Jeremy VanDerWal at JCU creates models to study how climate change will affect bird and animal populations. The models were previously behind university firewalls. Providing access to others was difficult. How RDSI is helping: RDSI storage integrated with the NeCTAR Research Cloud allows these models to be available and easily run by other researchers. As a result, use of the models by other groups is growing rapidly. The outcome: A group of Natural Resource Management Regions (NRMs) has found them so beneficial they are funding Dr VanDerWal’s group to include fresh water species information. “Previously my thinking was limited by the small amount of storage and computing that was available to me. I always had to summarise down and minimise the data. I don’t have to do that now. I don’t have to worry about the live disk limitation or the compute resources. Now I can keep doing the research as I’d like to see it done.” ‒ Dr Jeremy VanDerWal Centre for Tropical Biodiversity and Climate Change, James Cook University
  • 75. 75| Project Success The RDSI Project set out to transform the way in which research data in Australia is stored and made available to its potential users. By any measure, the project has been successful in achieving this. When the contract for the project was signed between the University of Queensland and the Commonwealth Government on Christmas Eve 2010, it was estimated that there was a total of about 5 Petabytes of research data stored throughout the sector and that much of this was inaccessible to most researchers. By the end of the project in 2015, it is expected that over 55 Petabytes of data will be available in over 70 Petabytes of storage. Even more importantly, this will be stored in facilities that are able to make it collaboratively available to researchers. ‒ Dr Nick Tate RDSI Project Director
  • 76. 76| Are researchers finding it valuable? Researchers are voting with their data. They’re bringing it to the Nodes, they’re putting it on. It’s a great leap forward. ‒ Peter Elford Director, Government Relations, AARNet
  • 77. 77| Cultural change In addition to putting the tin on the ground to store data, a key success of the RDSI Project has been to facilitate cultural change around collaboration and sharing. ‒ Brian Anker Chair, RDSI Project Board
  • 78. 78| Supporting research activities Coming from an IT background, I learned so much from working on this project about supporting research activities in the next decade. It’s not just about compute and store, it’s about collaboration. It’s about connecting, about access, about identity. It’s about protecting the work. It’s about curation of data. It’s about making it available for groups, not just for now but in the future. It’s just so big, it takes a while to get your head around it. ‒ Peter Nikoletatos RDSI Project Board
  • 79. 79| Fingers into the future With a lot of projects there is a start, a finish, and you move onto the next thing. This thing really has fingers into the future in being able to act as a building block for other initiatives to build on. ‒ Brian Anker Chair, RDSI Project Board
  • 80. 80| Data access Our views of data have changed as RDSI has evolved. When the project started, everyone was talking about data storage. But as you start storing the data, you realise the real problem is access. In the early days, the access mechanisms available were extremely primitive. Now we have the data stored, we have the Mediaflux tool to make data easy to find and access, we have Aspera for moving large quantities of data. These weren’t even really imagined when we started the RDSI project. As we go forward, people will begin to realise that the real value that’s been delivered by the system is the organisation of the data into a way that people can find it, access it, and manage it. That’s a big change that RDSI has made. ‒ Rob Cook CEO, QCIF
  • 81. 81| Thinking outside the box A lot of people say to me, ‘This is great. I can now go to one location, I can have access to this dataset and to this other dataset right alongside, whereas before I’d have to go to multiple locations to get all the data I needed.’ And what I’m hearing in the wider research community is that people are starting to think outside the box. They can now suddenly combine two datasets from different disciplines together and potentially do new science. So it’s quite exciting. ‒ Brendan Davey Deputy Director, TPAC
  • 82. 82| Focus on research Once you’ve solved the problem of knowing how to move data around and knowing where to put it, you can start to focus on other things. And that’s really where RDSI will continue to change research in Victoria. You won’t need to focus on where to put the data and whether or not it’s a good idea to put it there. You can move on. As an operator, the best thing is when you do get people to use these services and they never, ever call you again. Because it means the service is humming away. ‒ Dr Steven Manos Manager Research Services in ITS, The University of Melbourne
  • 83. 83| The value of data Researchers are beginning to realise the value of their data. A professor in pathology from The University of Melbourne was one of our first major consumers of data storage. We had gone in to speak to her. Their archiving solution was a set of hard drives on a shelf, and she wanted advice on a better solution. She said, ‘Well, I pay $70,000 a year in liquid nitrogen to preserve my physical tissue samples. Why wouldn’t I pay the equivalent to look after my digital assets?’ For her, the value proposition was obvious. ‒ Dr Steven Manos Manager Research Services in ITS, The University of Melbourne
  • 84. 84| A million-fold increase in scale In 1996 I became the Chairman of the World Ocean Circulation Experiment Data Products Committee, which was all about assembling the data from this billion dollar international project. I had 12 organisations working for me and in those 12, probably 35 effective full-time staff delivering up data on a regular basis. To give you a sense of the scale, the final product in 2002 from this billion dollar experiment fit onto a single DVD. It was just 4.7 gigabytes, but we won accolades because that dataset, huge at the time, was delivered across the Internet. Now in 2014, we have over 30 petabytes of data approved for ingestion into the RDSI Nodes across Australia. So that’s more than a million-fold increase in data, with fewer people involved, and with the additional challenge that the datasets come from a much more diverse research community. ‒ Prof Nathan Bindoff Director of TPAC and Professor of Physical Oceanography
  • 85. 85| Putting in a chairlift Sharing research data for everyone’s benefit involves taking a risk. You have to climb a hill. With the RDSI investment, the government has put in a chair lift to help the research sector get up the hill to see the benefit on the other side. ‒ Prof Liz Sonenberg RDSI Project Board
  • 86. 86| Enhanced research capacity The success of RDSI is unparalleled and its legacy is a substantially enhanced research capacity. To keep Australia competitive in international research we need more assured funding to support not just the large capital investment needed to continually build our computational capacity and data storage but to fund the people to ensure these investments are worthwhile and that they serve the needs of our cutting edge researchers. ‒ Prof Doug McEachern RDSI Project Board
  • 87. 87| Moving out of the data ice age An interesting lesson that wasn’t clear to me when we started is that we’re really at a very early stage of maturity in dealing with data. We didn’t realise that we were in the ice age; it was all frozen. You can get a real sense of excitement from recognising what people will be able to do with data when it becomes so easy to use. And it will, you know. ‒ Rob Cook CEO, QCIF
  • 88. 88| Evolution I think the RDSI Project has taken us through a significant learning curve. Data is a tricky thing. It’s so multi-dimensional, and it has so many owners. Working with the multiplicity of interests is quite challenging. And so the place we’ve ended up is by evolution. You would not have been able to write it down on a piece of paper on Day One. ‒ Prof Lindsay Botten Director, National Computational Infrastructure
  • 89. 89| The pathway is real It’s important to understand that it’s not just about where we have ended up. It’s about the pathway. The pathway is real. ‒ Dr Rhys Francis RDSI Project Board
  • 90. 90| Passing the Baton It has been quite a journey over the past 4 years as together we have created this extraordinary infrastructure for the research sector. The project has tackled a rich tapestry of issues and challenges, but with the help of all our stakeholders we now have a result to be proud of. Researchers have made significant gains in the way they interact with their data by being able to concentrate on their research rather than worrying about the volume of data they are producing or the mechanisms to store that data. We now pass the baton for continued development and support of this national infrastructure to the RDSI Node Operators, who will lead the next step in its evolution. We wish them luck and long lasting sustainability. ‒ Dr Nick Tate RDSI Project Director
  • 92. 92| Project Office Communications Project Director: Dr Nick Tate, n.tate@rdsi.uq.edu.au, +61 7 3365 2019 | +61 412 674 010; Communications Officer: Asher Vennell, a.vennell@rdsi.uq.edu.au, +61 408 517 376; Project Manager: Viviani Paz, v.paz@rdsi.uq.edu.au, +61 7 3365 2033 | +61 402 280 257; Storyteller: Patricia McMillan, patricia@patriciamcmillan.com, +61 434 602 050; Office Manager: Toni Walkinshaw, office@rdsi.uq.edu.au, +61 7 3365 2030 | +61 419 477 490; Solutions Specialist: Loretta Davis, loretta.davis@rdsi.uq.edu.au, +61 407 370 474
  • 93. 93| Data Sharing (DaSh) Vendor Engagement Specialist: Paul Campbell, paul.campbell@cogentia.com.au, +61 7 3878 2666 | +61 402 002 266; Security Policy Manager: Mark McPherson, mark.mcpherson@rdsi.uq.edu.au, +61 418 425 872. Node Development (NoDe) NoDe Manager: Richard Northam, richard.northam@rdsi.uq.edu.au, +61 417 044 625. Research Data Services (ReDS) ReDS Programme Manager: Peter Hicks, peter.hicks@rdsi.uq.edu.au, +61 401 103 640; Research Data Manager: Dr Markus Buchhorn, Markus.buchhorn@rdsi.uq.edu.au, +61 417 281 429; Research Data Manager: Dr Frankie Stevens, frankie.stevens@rdsi.uq.edu.au, +61 435 657 730
  • 94. 94| Interacting with the community I’ve really enjoyed the breadth of interaction we’ve been able to have as a project team across the stakeholders: the research communities, the universities and science agencies, the Nodes. It’s been wonderful to be able to talk to all of those stakeholder groups. One of the things that’s been most interesting for me has been to see the different approaches those groups bring to data: how research data management might be viewed at the institutional level, versus the state level, versus a person working in a laboratory. ‒ Dr Frankie Stevens Research Data Manager, RDSI Project
  • 95. 95| From principle to practice When I first started with RDSI, communications were focused on the overall brand awareness of the project. As the Nodes’ capabilities improved and became operational, there was a fundamental shift towards their accomplishments. The metamorphosis from principle to practice has seen collective available data go from zero to over 16 petabytes. ‒ Asher Vennell Senior Communications Officer, RDSI Project
  • 96. 96| Exceeding expectations Having worked in distributed research environments, I had an understanding of research data needs. As I was welcomed into the project team it became apparent that RDSI was not only meeting data storage requirements, but exceeding all researcher expectations. ‒ Toni Walkinshaw Office Manager, RDSI Project
  • 97. 97| From principle to practice The early stages of a major project like this are hard work, and you don’t get to see the real value until later when people begin to use it. I had the opportunity towards the end of the project to talk with researchers who were adding data or using data via the Nodes. Their response to this new capability was overwhelming. They described how it was enabling new collaborations, giving them access to data that had previously been locked away, and fuelling new research they would not have been able to do without it. ‒ Patricia McMillan Storyteller, RDSI Project
  • 99. Board Members: Brian Anker, Independent Chair; Dr Rhys Francis, Director - eResearch Futures P/L; Professor Doug McEachern, Former Pro Vice-Chancellor Research and Innovation - The University of Western Australia; Professor Anton Middelberg, Deputy Vice-Chancellor (Research) - The University of Queensland. Former Board Members: Peter Nikoletatos, Executive Director and Chief Information Officer - La Trobe University; John Shipp, Vice-President - Australian Library and Information Association; Professor Liz Sonenberg, Pro Vice-Chancellor (Research Collaboration) - The University of Melbourne; Professor Max Lu, Provost and Senior Vice-President - The University of Queensland; Professor Jill Trewhella, Deputy Vice-Chancellor (Research) - The University of Sydney
  • 100. The end