Brian King from the CGIAR Big Data platform introducing open science infrastructures for food security, and some of the trade-offs that exist in their setup.
Interoperability: Linked Open Data
Linking Open Data cloud diagram 2017, by Andrejs Abele, John P. McCrae, Paul Buitelaar, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/
The Platform for Big Data in Agriculture seeks to ‘big data enable’ the CGIAR as a global network of research institutions working in the public good, and to enable the agriculture development sector as a whole to make best use of these technologies.
We have found in the CGIAR that it is unlikely that there is one solution that is optimal for all (we would expect similar dynamics across institutions). The CGIAR centers have very different conditions with regard to:
• Scientific focus
• IT staff availability and salaries
• Out-sourcing options
• Internet cost and reliability
• Compliance requirements
• NGO and Educational discount options
• Existing IT architecture
When the cost structure differs, the solution should be adapted to high- or low-cost inputs.
If we implement the same solution across all centers, it will rarely be financially optimal.
Data integrity (unified processes for collecting, organizing, storing and using data) is extremely helpful for unlocking the value of data. Yet it also contributes to the seductive idea of “one platform to rule them all”: an expensive endeavor that, even if successful, may still provoke standards battles.
There appears to be a natural trade-off between a data infrastructure's breadth of utility and its fitness for purpose (the curve).
If you try to build the whole curve, you may be setting yourself up for failure. If you build the first part of the curve and perhaps complement it with tools or approaches (sort of the AWS approach), you will be providing value.
If you develop a specialized tool for a purpose without dropping a bundle on being everything to everyone, you are better set up for success. These two ends of the curve, or spectrum, need to inform each other, and there is probably not an optimal spot between them.
So how do we connect the ends of the curve? I think data interoperability and ‘integrate-ability’ are the key.
We have a broad, but linked domain space: food security, or “sustainable, adaptive global food systems” that we are generating data on.
Data are linked using a semantic web, three-word naming approach that makes data resources searchable, along with their connections to other data resources.
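To make the idea concrete, here is a minimal sketch in plain Python. It assumes the "three-word naming approach" refers to subject-predicate-object statements, as used in RDF. The dataset names, predicates and places below are hypothetical and do not come from the CGIAR catalogue; `match` and `linked_datasets` are illustrative helpers, not part of any real library.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Hypothetical descriptions of three datasets, one statement per line.
TRIPLES: List[Triple] = [
    ("dataset:maize-trials", "hasTopic", "topic:maize"),
    ("dataset:maize-trials", "locatedIn", "place:Kenya"),
    ("dataset:soil-survey", "hasTopic", "topic:soil"),
    ("dataset:soil-survey", "locatedIn", "place:Kenya"),
    ("dataset:rice-pests", "hasTopic", "topic:rice"),
    ("dataset:rice-pests", "locatedIn", "place:Vietnam"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return every triple that agrees with the given parts; None matches anything."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def linked_datasets(triples: List[Triple], dataset: str) -> List[str]:
    """Find other subjects that share at least one (predicate, object) pair with `dataset`."""
    linked = set()
    for _, pred, obj in match(triples, s=dataset):
        for other, _, _ in match(triples, p=pred, o=obj):
            if other != dataset:
                linked.add(other)
    return sorted(linked)


if __name__ == "__main__":
    # Search: which datasets are located in Kenya?
    print([s for s, _, _ in match(TRIPLES, p="locatedIn", o="place:Kenya")])
    # Connections: which datasets are linked to the maize trials?
    print(linked_datasets(TRIPLES, "dataset:maize-trials"))
```

The same pattern query serves both purposes named above: finding a data resource, and finding its connections to other resources through shared objects.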
This is a peek at the map interface to view the location of datasets of all CGIAR data that is discoverable. We are in the process of RDF-izing these data to enable cross-domain searching and integration. This will be complemented with some ability to pull these and other datasets into a common analytic environment if and as needed, across the CGIAR, and perhaps across partners as well.
There is a TON of data out there. How can we capture and synergize an unexploited “surplus of measurement” and enable its productive use?
If we are successful, we should start to see ‘two-sided’ network effects, where additions and improvements to each data infrastructure help drive and add value to other data infrastructures. There is extensive business literature on ‘two-sided network’ business models. What we stand to see may actually be ‘multi-sided network effects’.
We unlock the ‘surplus of measurement’ and accelerate learning and adaptation as a result.
It will also accelerate and enable an approach to science that moves beyond the “census of things” to one where the “connections among things” may be just as important for building that adaptation.
Perhaps to the ‘internet of things’?
For example, disease patterns have been identified more rapidly using network analysis in the epidemiology realm. We are beginning to see this applied to pest and disease modeling in agriculture. Our sector needs to be equipped in that way.
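As a rough illustration of what "network analysis" can mean here, the sketch below computes degree centrality on a small contact network. The farms and their links are invented for the example, and degree centrality is only one simple measure, chosen for brevity; it is not claimed to be the method used in the epidemiology work mentioned above.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

# Hypothetical contacts between farms (shared equipment, seed exchange, etc.).
EDGES: List[Tuple[str, str]] = [
    ("farm_A", "farm_B"),
    ("farm_A", "farm_C"),
    ("farm_A", "farm_D"),
    ("farm_B", "farm_C"),
    ("farm_D", "farm_E"),
]


def build_graph(edges: List[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from a list of edges."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree_centrality(graph: Dict[str, Set[str]]) -> Dict[str, float]:
    """Share of the other nodes that each node is directly connected to."""
    n = len(graph)
    if n < 2:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}


if __name__ == "__main__":
    scores = degree_centrality(build_graph(EDGES))
    # Rank farms from most to least connected.
    for farm, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(farm, score)
```

In this toy network the most connected farm is the one a pest or disease would most plausibly pass through, which is the kind of "connections among things" question a census of individual farms cannot answer.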
Continuing to pull on this thread toward network science should reveal new connections among things in our ‘sustainable intensification indicators framework’, make those connections better understood, and equip us with new ways to build that adaptation.