"Plans are worthless, but planning is essential"
1. “Plans are worthless, but planning is essential”
Creating the culture and technology for an international data infrastructure
Mark A. Parsons
Secretary General
CASRAI Canada ReConnect14
Ottawa, Canada
20 November 2014
Unless otherwise noted, the slides in this presentation are licensed by Mark A. Parsons under a Creative Commons Attribution-Share Alike 3.0 License
2. All of society’s grand challenges require diverse
(often large) data to be shared and integrated
across cultures, scales, and technologies.
3. Research Data Alliance
Vision
Researchers and innovators openly share data across
technologies, disciplines, and countries to address the
grand challenges of society.
Mission
RDA builds the social and technical bridges that enable
open sharing of data.
8. Dynamics of Infrastructure
Edwards, et al. 2007 Understanding Infrastructure: Dynamics,
Tensions, and Design.
• Infrastructures become “ubiquitous, accessible, reliable, and
transparent” as they mature.
• Systems → Networks → Inter-networks
• “system-building, characterized by the deliberate and successful
design of technology-based services.”
• “technology transfer across domains and locations results in
variations on the original design, as well as the emergence of
competing systems.”
• Finally, “a process of consolidation characterized by gateways
that allow dissimilar systems to be linked into networks.”
11. Bridges and Gateways
Gateways are often wrongly understood as “technologies,” i.e. hardware or software alone.
A more accurate approach conceives them as combining a technical solution with a social choice,
i.e. a standard, both of which must be integrated into existing users’ communities of practice.
Because of this, gateways rarely perform perfectly.
— Edwards et al. 2007
13. From Interregional Highways: Message from the President of the United States Transmitting a Report of the National
Interregional Highway Committee, Outlining and Recommending a National System of Interregional Highways, 12 Jan. 1944.
CC-BY Eric Fischer http://www.flickr.com/photos/walkingsf/8270270785/
16. Ranch Exit
CC-BY-SA Ken Lund http://www.flickr.com/photos/kenlund/2381991900/
17. Themes from A. Tsing on Collaboration
Friction—An ethnography of global connection
• “Actual existing universalisms are
hybrid, transient, and involved in
constant reformulation through
dialogue.” They work out through
friction.
• “There is no reason to think
collaborators have common goals.”
• Unity and diversity cover each
other up. Need to remember the
local.
22. Ashby’s Law of Requisite Variety
Only variety absorbs variety
23. Map of the internet by the Opte Project [CC-BY] via Wikimedia
Commons
24. Networks or ecosystems often rely on “weak” links, so partner and
build relationships. (See Barabási A-L and R Albert. 1999 and others)
25. But what does this all have to do with
RDA?
1. RDA focusses on developing “gateways”
2. RDA doesn’t do “architecture,” but it does provide a level of unity.
26. Deliverables that make data work
“Create - Adopt - Use”
• Adopted code, policy, specifications, standards, or practices that
enable data sharing
• “Harvestable” efforts for which 12-18 months of work can eliminate
a roadblock
• Efforts that have substantive applicability to
groups within the data community but may
not apply to all
• Efforts that can start today
RDA Principles
Openness
Consensus
Balance
Harmonization
Community Driven
Non-profit
28. RDA Working Groups
1. Brokering Governance*
2. Data Citation WG
3. Data Description Registry Interoperability
4. Data Foundation and Terminology WG
5. Data Type Registries WG
6. Metadata Standards Directory Working Group
7. PID Information Types WG
8. Practical Policy WG
9. RDA/CODATA Summer Schools in Data Science and Cloud Computing in the Developing World*
10. RDA/WDS Publishing Data Bibliometrics WG
11. RDA/WDS Publishing Data Services WG
12. RDA/WDS Publishing Data Workflows WG
13. Repository Audit and Certification DSA–WDS Partnership WG
14. Standardisation of Data Categories and Codes WG
15. The BioSharing Registry: connecting data policies, standards & databases in life sciences*
16. Urban Quality of Life Indicators*
17. Wheat Data Interoperability WG
* in review
29. Initial Products—adopt one today!
• A basic vocabulary of foundational terminology and query tool to make sure we know what
we’re talking about.
• A data type model and registry (“MIME-types” for data) to help tools interpret, display, and
process data.
• A persistent identifier type registry to help search engines understand what they are pointing to
and retrieving.
• Coming soon:
• A basic set of machine actionable rules to enhance trust
• A metadata standards directory so we can describe similar things consistently
• A dynamic-data citation methodology so we can reference precise subsets of changing
data.
• Semantically linked terms describing wheat data so we can share harvest and related
information around the world
• A unified repository certification scheme to reduce confusion and improve trust.
30. But what does this all have to do with
RDA?
1. RDA focusses on developing “gateways”
2. RDA doesn’t do “architecture,” but it does provide a level of unity.
3. RDA plays both globally and locally—Think “glocal”.
31. Distribution of 2,353 Individual RDA Members in 96 Countries
12 September 2014
By sector: Academia 63%, Government 18%, Private 13%, Other 6%
By region: Europe 50%, North America 36%, Austral-pacific 5%, Asia 5%, Africa 3%, South America 1%
Map courtesy traveltip.org
32. Regional RDAs
• Australian National Data Service, RDA/United States, RDA/Europe,
• Implement RDA deliverables locally and enhance adoption.
• Ensure regional or national issues are addressed globally.
• Support plenaries and support attendance at plenaries.
33. But what does this all have to do with
RDA?
1. RDA focusses on developing “gateways”
2. RDA doesn’t do “architecture,” but it does provide a level of unity.
3. RDA plays both globally and locally—Think glocal.
4. RDA fosters relationships, interfaces, and connections.
5. RDA provides a “neutral place” to identify and work through friction.
34. RDA Interest Groups
1. Agricultural Data Interoperability IG
2. Big Data Analytics IG
3. Biodiversity Data Integration IG
4. Brokering IG
5. Community Capability Model IG
6. Data Fabric IG
7. Data for Development
8. Data in Context IG
9. Defining Urban Data Exchange for Science IG*
10. Development of cloud computing capacity and education in developing world research
11. Digital Practices in History and Ethnography IG
12. Domain Repositories Interest Group
13. Education and Training on handling of research data
14. ELIXIR Bridging Force IG*
15. Engagement IG
16. Federated Identity Management
17. Geospatial IG*
18. Libraries for Research Data*
19. Long tail of research data IG
20. Marine Data Harmonization IG
21. Metabolomics
22. Metadata IG
23. PID Interest Group
24. Preservation e-Infrastructure IG
25. RDA/CODATA Legal Interoperability IG
26. RDA/CODATA Materials Data, Infrastructure & Interoperability IG
27. RDA/WDS Certification of Digital Repositories IG
28. RDA/WDS Publishing Data Cost Recovery for Data Centres
29. RDA/WDS Publishing Data IG
30. Reproducibility IG*
31. Research data needs of the Photon and Neutron Science community
32. Research Data Provenance
33. Service Management IG
34. Structural Biology IG
35. Toxicogenomics Interoperability IG
* in review
37. Get involved!
• Join RDA as an individual member supporting our principles at
http://rd-alliance.org
• Join as an Organisational Member (nominal fee) or an
Organisational Affiliate (jointly sponsored efforts).
• Initiate or join an Interest Group
• Propose or join a Working Group
• Attend the RDA Plenaries
Coming together is a beginning;
keeping together is progress;
working together is success.
—Henry Ford
38. Summary
• Infrastructure is created in phases with the final consolidation phase relying
on gateways and bridges.
• Diversity is a central problem, but only diversity absorbs diversity.
• Networking and interconnection are the way to solve complex problems.
• Need to be constantly, but lightly, managing tension between bottom-up
chaos and stifling, top-down control.
• We are in a more global and democratic world, but also a more local world.
Coalition politics with new kinds of coalitions because there are new kinds of
identity.
• Data science needs to focus on relationships, connections, interfaces.
• You must participate “glocally” to succeed.
• Responding to change is more important than following a plan.
• RDA provides mechanisms to address all of the above!