Critical Infrastructure
to Promote Data
Synthesis into
Evidence-Based
Nutrient Management
Sylvie M. Brouder
Jeff J. Volenec
Agronomy Department, Purdue University, West Lafayette, IN
72nd SWCS International Annual Conference, Madison, WI
July 31, 2017
Motivation…
1997: In the beginning…
Purdue’s Water Quality Field Station: A
Core Facility and bleak landscape
without a DMP safety net…
WQFS Video
Data in the stacks: Libraries and
the research data afterlife
Why Libraries? The skill sets, the
thought process, professional value
system (“public good”), & public
expectation of infallibility /
persistence are right
for the problem…
I like
Perez
He has
an ugly
girlfriend
An ugly
girlfriend
means he has
no confidence
Inexplicable human
behavior = buy lottery
tickets but also
insurance
Vanity Fair, 12/2011: Michael
Lewis asks “Why do professional
baseball executives make such
colossal mistakes?”
Best metric: on-base %
Moneyball = my science /
knowledge translation epiphany…
Inspiration/Lessons from
Moneyball, Conservation
Agriculture, Ext./Rec Development
and TamiFlu?
Michael Lewis’
Moneyball: The Art of
Winning an Unfair
Game
R. Navarrete, 2013
Agriculture and the non-big data problem: Short
data life cycles, long-tail data, and data lost to the
dark side… enlightenment from medicine
[Figure: schematic of number of data sets versus data size, distinguishing organized big data, long-tail data, and dark data, with the "literature limit" marked. Adapted from Ferguson et al., 2014, “Big data from small data: data sharing in the long-tail of neuroscience”]
Examples of valuable but dark data in Agricultural Research ~
Recommendations must come from the “preponderance” of all
evidence (not just the novel result that makes it to a journal…)
Dark research data?
• Orphaned data ~ data collected
but not used in experimental
analysis (increasingly prevalent)
• Null or failed studies (reproving
the null hypothesis) ~ no impact
studies need to contribute to a
“preponderance” of evidence
• Confirmatory studies ~ not novel
so may not be publishable but still
needed for preponderance of
evidence
Dark non-research data
• Data from on-farm
collaboratives and farmer-driven
research efforts
• Data collected by farmers, CCAs,
etc. in current management
protocols (e.g. farm records)
• Monitoring data off equipment,
etc.
• Other??
The long and winding road,
That leads to your door… J. Lennon, P. McCartney
Non-compliant
“Digital Natives”
Persistent Players: M.S. Bracke, J.J. Volenec, R. Turco, S. Brandt, T.S. Murrell
Assoc. Dean Plaut, Dean Mullins, M. Witt, P. Fixen, J. Carlson, …
2004 proposal rejection
Natural
hazards
Encouraging
directional
indicators
My current vision for evidence-based nutrient recommendations: 10 steps
to real-time data uptake, analysis & customized recommendations (working
backwards)
10. Customized, credible, nutrient management recommendation …
• Self-improving
• References the users’ data
• Can be modified for non agronomic priorities (risk consideration, time horizons, etc.)
Steps 6 – 9: The cool stuff via the Analytical Framework
6. Automatic reanalysis w/ accruing data
7. Machine learning / artificial intelligence
strategies to minimize human resources
8. Combination analytical strategies that
are directed by scientist using proven
theories & data mining (“unsupervised”)
strategies to surface overlooked linkages,
drivers & proxy measures
9. Tools for “unpacking” the analytical
result to explore new/unexpected
results & discoveries
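A minimal sketch of what step 6 (automatic reanalysis with accruing data) could look like, assuming each incoming study reports an effect size and its standard error. The fixed-effect, inverse-variance pooling used here is an illustrative assumption, not a method stated in the slides, and the class name `RunningPooledEstimate` is hypothetical. Null and confirmatory studies enter the pool exactly like novel ones.

```python
import math


class RunningPooledEstimate:
    """Fixed-effect inverse-variance pooling, updated one study at a time.

    Hypothetical helper for illustration; a real framework would likely
    need random effects, moderators, and quality weighting.
    """

    def __init__(self):
        self.sum_w = 0.0    # running sum of weights (1 / SE^2)
        self.sum_wx = 0.0   # running sum of weight * effect
        self.n = 0          # number of studies pooled so far

    def add(self, effect, std_error):
        """Fold in one new study and return the updated (estimate, SE)."""
        if std_error <= 0:
            raise ValueError("std_error must be positive")
        weight = 1.0 / std_error ** 2
        self.sum_w += weight
        self.sum_wx += weight * effect
        self.n += 1
        return self.estimate()

    def estimate(self):
        """Return the pooled effect and its standard error."""
        if self.n == 0:
            raise ValueError("no studies have been added yet")
        return self.sum_wx / self.sum_w, math.sqrt(1.0 / self.sum_w)
```

Each call to `add` reuses the running sums, so the pooled estimate is refreshed as soon as a new dataset arrives without reprocessing earlier studies.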
The Foundation: A bit less cool but essential…
1. User enters data via web
portal
2. Portal has embedded
workflows for ease of use &
auto quality assurance/quality
control (QA/QC)
3. Data anonymized at entry
according to mutually
acceptable terms & conditions
4. User data combined with
existing research data
5. Data archived and preserved
in a “trusted” repository
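Steps 2 and 3 can be sketched as follows, under loud assumptions: the field names (`farm_id`, `soil_test_p_ppm`, `yield_mg_ha`), the plausibility ranges, and the function names are all hypothetical, and a salted hash is strictly pseudonymization, so the real terms & conditions would decide whether it is sufficient.

```python
import hashlib

# Hypothetical plausibility ranges; real limits would come from agronomists.
PLAUSIBLE_RANGES = {
    "soil_test_p_ppm": (0.0, 500.0),
    "yield_mg_ha": (0.0, 30.0),
}


def qa_qc(record):
    """Return a list of human-readable problems found in one record."""
    problems = []
    for field, (low, high) in PLAUSIBLE_RANGES.items():
        value = record.get(field)
        if value is None:
            problems.append(f"{field}: missing")
        elif not (low <= value <= high):
            problems.append(f"{field}: {value} outside [{low}, {high}]")
    return problems


def anonymize(record, salt):
    """Replace the farm identifier with a salted hash and drop the raw ID."""
    cleaned = {k: v for k, v in record.items() if k != "farm_id"}
    digest = hashlib.sha256((salt + record["farm_id"]).encode("utf-8")).hexdigest()
    cleaned["source_key"] = digest[:12]
    return cleaned


def ingest(record, salt):
    """Run QA/QC first; only records that pass are anonymized and accepted."""
    problems = qa_qc(record)
    if problems:
        return {"accepted": False, "problems": problems}
    return {"accepted": True, "record": anonymize(record, salt)}
```

Running the checks at entry means the user sees problems while the field notes are still at hand, rather than after the data have been merged with other sources.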
The Data Repository….
Impediments/Challenges Confronting Data
Generators and Downstream Data Users
• Meta-data standards
• Data standards
• Minimum data sets
• Provenance
• Repositories
• Data publishing
• Dataset versioning
• Data discovery and retrieval
• Data granularity
• Scholarship of data publishing
• Data ownership
• Business models for data
• Education about data management, including re-education
Our Focus: Pressing technological challenges to informatics
for all agronomic efforts concern data workflow…
• Data dispersion
– Take advantage of small
datasets collected by many
researchers (not everything
is “BIG”)
• Data heterogeneity
– Varied protocols reflecting
local culture & variation in
1° purpose
• Data provenance
– Need to track data through
multi-step process of
aggregation, modeling,
analysis
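One way to picture the provenance requirement is a log that records a fingerprint of the data going into and coming out of every processing step. The sketch below is only an illustration: the helper names, the column names (`site`, `yield_mg_ha`), and the choice of a truncated SHA-256 hash are assumptions, not part of PURR or the 4R-RR.

```python
import hashlib
import json


def fingerprint(rows):
    """Stable short hash of a dataset held as a list of dicts."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:16]


def apply_step(rows, log, name, func):
    """Run one processing step and append a provenance entry to the log."""
    result = func(rows)
    log.append({
        "step": name,
        "input": fingerprint(rows),
        "output": fingerprint(result),
        "n_rows": len(result),
    })
    return result


def drop_missing(rows):
    """Example cleaning step: discard rows with no yield value."""
    return [r for r in rows if r["yield_mg_ha"] is not None]


def mean_by_site(rows):
    """Example aggregation step: mean yield per site."""
    by_site = {}
    for r in rows:
        by_site.setdefault(r["site"], []).append(r["yield_mg_ha"])
    return [
        {"site": site, "mean_yield_mg_ha": sum(vals) / len(vals)}
        for site, vals in sorted(by_site.items())
    ]
```

Because each entry's output fingerprint should match the next entry's input fingerprint, a break in the chain shows where data were altered outside the recorded workflow.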
Storage is not enough!!!!!
What is a data repository
• It is: an emerging mechanism
for extending data lifecycles
• Moving beyond storage to
preservation & curation
• Example: Research Repositories,
Data Publications
• It is not: just storage, nor a
website, a database, a network…
Repository Issues – No Perfect Solution (yet) for Data, a Public
Good
Examples considered:
DataONE (NSF): soft money, renewed for a second 5-yr term; become a node?
DRYAD: http://www.datadryad.org/; requires linkage to a publication; what
happens to unpublished, negative results critical to systematic reviews?
Professional Societies: Alliance of Crop, Soil, and Environmental Science
Societies (ACSESS) - expand Digital Library of e-pubs into a repository?
Enhance data discovery.
New Ag Data Commons at USDA National Agricultural Library
Purdue Univ. Research Repository (PURR) & the 4R-RR: Attached to an
institution with a long legacy; storage for at least 10 yrs, then what?
Where are we (PURR / 4R-RR) focusing in the
“data value chain” ~ working behind or upstream
of the “interoperability curtain”
[Figure: the data value chain]
Conceive (Exp. Des., Data Mgt. Planning) → Collect (Clean, Rectify) → Describe (Data Dictionaries, Meta Data) → Discover → Aggregate (Code / APIs, Derivative Data) → Synthesize (BD Analytics, Statistical Meta-analysis) → Create New Knowledge
Interoperability curtain
• Prepare (Preprocess) Data → Create tools & workflows: largely out of sight; sparsely populated w/ expertise & solutions
• Produce “Transformative” (Headline) Results → Advance Science: high visibility; crowded w/ expertise & solutions
PURR / 4R-RR Goals: Facilitating best practices
for data sharing…
• Discoverable ~ findable with common search engines
• Accessible ~ downloadable and subject to manipulation
• Intelligible ~ human and machine readable, suitably described, access
rights clearly stated
• Assessable ~ provenance clear & quality/reliability should be evident
• Usable ~ data should be in a generically “actionable” format (not a
pdf!)
• No-nos:
• Simply posting to a website (non-persistent)
• Requirements: New curriculum and infrastructure…
Purdue University Research Repository (PURR)
most useful agronomic tool since the RCBD
PURR can assign
unique DOI to aid
data discovery and
provenance
PURR is a “Hub”
Cyber-environment;
includes tools,
models, workspace
along with storage
and publication
capabilities.
So much more than “data storage”….
Purdue University Research Repository: What libraries were/are
to books, PURR is/will be to data (plus so much more!)
You can search for (“google”, web of
science, …) data published via PURR
NAL terms; important unique terms (Grant #)
The workflow is predetermined when
publishing ~ you are prompted to be
comprehensive in the info you provide ~
PU Lib. Information Specialists review it
prior to publication…
Supporting documents accessible with
datasets…. Alfalfa P/K study
A rudimentary Data Dictionary
LOCKSS: PURR relieves the researcher of the
responsibilities of ensuring data security
Per PURR Policy:
You cannot post
sensitive data
unless you have
removed
identifiers…
The 4R Fund Research Repository: Foundational infrastructure for collaboration & synthesis
in nutrient management research & recommendation development (a repository w/in a
repository)
Scott Brandt,
Purdue
University
Libraries
And not
by me…
Includes librarians
who possess the
professional skills
to design
workflows that will
help organize &
store things
(data!!) so
something can be
discovered /
accessed / used.
PURR Process: Plan, Collaborate, Publish, Archive
Key attribute: Linking of project with archival space ~
data are not accessible to others until you “publish”
Write Data Mngmt. Plan → Create Project in PURR → Collab. w/ Research Team → Upload Data, Working Files → Finalize Dataset (version) → Upload Support. Materials → Create Data Pub. → Publish w/ DOI → PURR Archives 10+ Yr
Before publishing: Private ~ viewable only to your “team”
After publishing: Searchable, Accessible, Retrievable, Reusable
Policy discussion point with the 4R Fund/IPNI: how long of an embargo post
project completion….????
Hands-on Help: Ag. Research Librarians will help 4R-funded Researchers with
Workflows, Policies and Procedures for Curation, Preservation and Publication of
Their Data Including:
Persistent data formats
Licensing data (privacy requirements & policy)
Meta-data / other tags for data discovery
Versioning of accruing data sets
Supporting documentation
Data publishing
Assigning DOIs
PURR has a 10yr commitment to data set preservation / options beyond 10yr
Policies / mechanisms for novel public/private partnerships for data stewardship
Business models for open access data
Hands-on help from Agronomists: Best practices
have to become easier to do than not to do…
• Tough Lessons: Single biggest mistake we
have & can make is “build it and they will
come” & not providing enough help
• Many datasets need “special treatment”
• 4R-RR “Data Buddy”
Work one-on-one w/ PIs to help transition
their data from their computers to PURR
Assist with standards: data and meta-data
Make certain minimum data sets are
acquired.
• Challenge: “Data buddies” are
hard to find!
• Youth: not enough wisdom about
the culture of the science
• Established Scientists: don’t have
the data skills, time or both
Meadow Creek Students Partner as Data Buddies
Image: http://www.hebisd.edu/media/images/articles/2763f.jpg
Final thought: Description not prescription
•Second biggest
mistake…?
•Templates!!!!
•Solution:
•Data dictionaries
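To illustrate "description not prescription", here is a rough sketch of a data dictionary that travels with a dataset and describes the columns as the researcher recorded them, instead of forcing the data into a fixed template. The column names, units, and helper functions are hypothetical examples, not the dictionary used for any PURR dataset.

```python
# Hypothetical data dictionary: one entry per column, written by the data owner.
DATA_DICTIONARY = {
    "plot": {"type": str, "unit": None, "description": "Plot identifier"},
    "k_rate": {"type": float, "unit": "kg K2O/ha", "description": "Potassium applied"},
    "yield": {"type": float, "unit": "Mg/ha", "description": "Dry matter yield"},
}


def check_against_dictionary(rows, dictionary):
    """List columns that are undescribed or whose values have the wrong type."""
    issues = []
    for i, row in enumerate(rows):
        for name in row:
            if name not in dictionary:
                issues.append(f"row {i}: column '{name}' is not described")
        for name, spec in dictionary.items():
            if name in row and not isinstance(row[name], spec["type"]):
                issues.append(f"row {i}: '{name}' should be {spec['type'].__name__}")
    return issues


def describe(dictionary):
    """Render the dictionary as plain text for human readers."""
    lines = []
    for name, spec in dictionary.items():
        unit = f" [{spec['unit']}]" if spec["unit"] else ""
        lines.append(f"{name}{unit}: {spec['description']}")
    return lines
```

The same dictionary serves both audiences: `describe` gives a human-readable summary, while `check_against_dictionary` lets a machine flag columns that nobody has explained yet.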

Use of Raffias’ species (Raphia spp.) and its impact on socioeconomic charact...Use of Raffias’ species (Raphia spp.) and its impact on socioeconomic charact...
Use of Raffias’ species (Raphia spp.) and its impact on socioeconomic charact...
 
一比一原版(Adelaide毕业证)阿德莱德大学毕业证成绩单
一比一原版(Adelaide毕业证)阿德莱德大学毕业证成绩单一比一原版(Adelaide毕业证)阿德莱德大学毕业证成绩单
一比一原版(Adelaide毕业证)阿德莱德大学毕业证成绩单
 
一比一原版(SUT毕业证)斯威本科技大学毕业证成绩单
一比一原版(SUT毕业证)斯威本科技大学毕业证成绩单一比一原版(SUT毕业证)斯威本科技大学毕业证成绩单
一比一原版(SUT毕业证)斯威本科技大学毕业证成绩单
 
A systematic review of the implementation of Industry 4.0 in human resources
A systematic review of the implementation of Industry 4.0 in human resourcesA systematic review of the implementation of Industry 4.0 in human resources
A systematic review of the implementation of Industry 4.0 in human resources
 
The State Board for Water Pollution - The Water Act 1974 .pptx
The State Board for  Water Pollution - The Water Act 1974  .pptxThe State Board for  Water Pollution - The Water Act 1974  .pptx
The State Board for Water Pollution - The Water Act 1974 .pptx
 
一比一原版EUR毕业证鹿特丹伊拉斯姆斯大学毕业证成绩单如何办理
一比一原版EUR毕业证鹿特丹伊拉斯姆斯大学毕业证成绩单如何办理一比一原版EUR毕业证鹿特丹伊拉斯姆斯大学毕业证成绩单如何办理
一比一原版EUR毕业证鹿特丹伊拉斯姆斯大学毕业证成绩单如何办理
 
DRAFT NRW Recreation Strategy - People and Nature thriving together
DRAFT NRW Recreation Strategy - People and Nature thriving togetherDRAFT NRW Recreation Strategy - People and Nature thriving together
DRAFT NRW Recreation Strategy - People and Nature thriving together
 
@@how to Join @occult for money ritual..☎️+2349022657119.
@@how to Join @occult for money ritual..☎️+2349022657119.@@how to Join @occult for money ritual..☎️+2349022657119.
@@how to Join @occult for money ritual..☎️+2349022657119.
 
Prevalence, biochemical and hematological study of diabetic patients
Prevalence, biochemical and hematological study of diabetic patientsPrevalence, biochemical and hematological study of diabetic patients
Prevalence, biochemical and hematological study of diabetic patients
 
一比一原版(Lincoln毕业证)新西兰林肯大学毕业证成绩单
一比一原版(Lincoln毕业证)新西兰林肯大学毕业证成绩单一比一原版(Lincoln毕业证)新西兰林肯大学毕业证成绩单
一比一原版(Lincoln毕业证)新西兰林肯大学毕业证成绩单
 
Powers and Functions of CPCB - The Water Act 1974.pdf
Powers and Functions of CPCB - The Water Act 1974.pdfPowers and Functions of CPCB - The Water Act 1974.pdf
Powers and Functions of CPCB - The Water Act 1974.pdf
 

Critical infrastructure to promote data synthesis

  • 1. Critical Infrastructure to Promote Data Synthesis into Evidence-Based Nutrient Management Sylvie M. Brouder Jeff. Volenec Agronomy Department, Purdue University, West Lafayette, IN 72nd SWCS International Annual Conference, Madison, WI July 31, 2017
  • 3. 1997: In the beginning… Purdue’s Water Quality Field Station: A Core Facility and bleak landscape without a DMP safety net…
  • 5. Data in the stacks: Libraries and the research data afterlife
  • 6. Why Libraries? The skill sets, the thought process, professional value system (“public good”), & public expectation of infallibility / persistence are right for the problem…
  • 7. I like Perez He has an ugly girlfriend An ugly girlfriend means he has no confidence Inexplicable human behavior = buy lottery tickets but also insurance Vanity Fair, 12/2011: Michael Lewis asks “Why do professional baseball executives make such colossal mistakes?” Best metric: on-base % Moneyball = my science / knowledge translation epiphany…
  • 8. Inspiration/Lessons from Moneyball, Conservation Agriculture, Ext./Rec Development and TamiFlu? Michael Lewis’ Moneyball: The Art of Winning an Unfair Game R. Navarrete, 2013
  • 9. Agriculture and the non-big data problem: Short data life cycles, long-tail data, and data lost to the dark side… enlightenment from medicine Number of data sets Data Size Organized big data Long-tail data Dark data Schematic adapted from Ferguson et al., 2014, “Big data from small data: data sharing in the long-tail of neuroscience” Literature limit
  • 10. Examples of valuable but dark data in Agricultural Research ~ Recommendations must come from the “preponderance” of all evidence (not just the novel result that makes it to a journal…) Dark research data? • Orphaned data ~ data collected but not used in experimental analysis (increasingly prevalent) • Null or failed studies (reproving the null hypothesis) ~ no impact studies need to contribute to a “preponderance” of evidence • Confirmatory studies ~ not novel so may not be publishable but still needed for preponderance of evidence Dark non-research data • Data from on-farm collaboratives and farmer-driven research efforts • Data collected by farmers, CCAs, etc. in current management protocols (e.g. farm records) • Monitoring data off equipment, etc. • Other??
  • 11. The long and winding road, That leads to your door… J. Lennon, P. McCartney Non-compliant “Digital Natives” Persistent Players: M.S. Bracke, J.J. Volenec, R. Turco, S. Brandt, T.S. Murrell Assoc. Dean Plaut, Dean Mullins, M. Witt, P. Fixen, J. Carlson, … 2004 proposal rejection Natural hazards Encouraging directional indicators
  • 12. My current vision for evidence-based nutrient recommendations: 10 steps to real-time data uptake, analysis & customized recommendations (working backwards) 10. Customized, credible, nutrient management recommendation … • Self-improving • References the users’ data • Can be modified for non-agronomic priorities (risk consideration, time horizons, etc.)
  • 13. Steps 6 – 9: The cool stuff via the Analytical Framework 6. Automatic reanalysis w/ accruing data 7. Machine learning / artificial intelligence strategies to minimize human resources 8. Combination analytical strategies that are directed by scientist using proven theories & data mining (“unsupervised”) strategies to surface overlooked linkages, drivers & proxy measures 9. Tools for “unpacking” the analytical result to explore new/unexpected results & discoveries
  • 14. The Foundation: A bit less cool but essential… 1. User enters data via web portal 2. Portal has embedded workflows for ease of use & auto quality assurance/quality control (QA/QC) 3. Data anonymized at entry according to mutually acceptable terms & conditions 4. User data combined with existing research data 5. Data archived and preserved in a “trusted” repository The Data Repository….
  • 15. Impediments/Challenges Confronting Data Generators and Downstream Data Users Meta-data standards Data standards Minimum data sets Provenance Repositories Data publishing Dataset versioning • Data discovery and retrieval • Data granularity • Scholarship of data publishing • Data ownership • Business models for data • Education about data management, including re-education
  • 16. Our Focus: Pressing technological challenges to informatics for all agronomic efforts concern data workflow… • Data dispersion – Take advantage of small datasets collected by many researchers (not everything is “BIG”) • Data heterogeneity – Varied protocols reflecting local culture & variation in 1° purpose • Data provenance – Need to track data through multi-step process of aggregation, modeling, analysis Storage is not enough!!!!!
  • 17. What is a data repository • It is: an emerging mechanism for extending data lifecycles • Moving beyond storage to preservation & curation • Example: Research Repositories, Data Publications • It is not: just storage, nor a website, a database, a network…
  • 18. Repository Issues – No Perfect Solution (yet) for Data, a Public Good Examples considered: DataOne-NSF, Soft money-renewed for a second 5 yr term; become a node? DRYAD: http://www.datadryad.org/; requires linkage to a publication; what happens to unpublished, negative results critical to systematic reviews? Professional Societies: Association of Crops, Soils, & Environmental Science Societies (ACSESS) - expand Digital Library of e-pubs into a repository? Enhance data discovery. New Ag Data Commons at USDA National Agricultural Library Purdue Univ. Research Repository (PURR) & the 4R-RR: Attached to an Institution with a long legacy; Storage for at least 10 yrs - then what?
  • 19. Where are we (PURR / 4R-RR) focusing in the “data value chain” ~ working behind or upstream of the “interoperability curtain” Conceive • Exp. Des. • Data Mgt. Planning Collect • Clean • Rectify Describe • Data Dictionaries • Meta Data Discover Aggregate • Code / APIs • Derivative Data Synthesize • BD Analytics • Statistical Meta Anal.. Create New Knowledge Interoperability Produce “Transformative” (Headline) Results → Advance Science Prepare (Preprocess) Data → Create tools & workflows Largely out of sight; sparsely populated w/ expertise & solutions High visibility; crowded w/ expertise & solutions
  • 20. PURR / 4R-RR Goals: Facilitating best practices for data sharing… • Discoverable ~ findable with common search engines • Accessible ~ downloadable and subject to manipulation • Intelligible ~ human and machine readable, suitably described, access rights clearly stated • Assessable ~ provenance clear & quality/reliability should be evident • Usable ~ data should be in a generically “actionable” format (not a pdf!) • No-nos: • Simply posting to a website (non-persistent) • Requirements: New curriculum and infrastructure…
  • 21. Purdue University Research Repository (PURR) most useful agronomic tool since the RCBD PURR can assign unique DOI to aid data discovery and provenance PURR is a “Hub” Cyber-environment; includes tools, models, workspace along with storage and publication capabilities. So much more than “data storage”….
  • 22. Purdue University Research Repository: What libraries were/are to books, PURR is/will be to data (plus so much more!)
  • 23. You can search for (“google”, web of science, …) data published via PURR NAL terms; important unique terms (Grant #)
  • 24. The workflow is predetermined when publishing ~ you are prompted to be comprehensive in the info you provide ~ PU Lib. Information Specialists review it prior to publication…
  • 25. Supporting documents accessible with datasets…. Alfalfa P/K study A rudimentary Data Dictionary
  • 26. LOCKSS: PURR relieves the researcher of the responsibilities of ensuring data security Per PURR Policy: You cannot post sensitive data unless you have removed identifiers…
  • 27. The 4R Fund Research Repository: Foundational infrastructure for collaboration & synthesis in nutrient management research & recommendation development (a repository w/in a repository) Scott Brandt, Purdue University Libraries And not by me… Includes librarians who possess the professional skills to design workflows that will help organize & store things (data!!) so something can be discovered / accessed / used. PURR Process: Plan, Collaborate, Publish, Archive
  • 28. Key attribute: Linking of project with archival space ~ data are not accessible to others until you “publish” Write Data Mngmt. Plan Create Project in PURR Collab. w/ Research Team Upload Data, Working Files Finalize Dataset (version) Upload Support. Materials Create Data Pub. Publish w/ DOI PURR Archives 10+ Yr Private ~ viewable only to your “team” Searchable, Accessible, Retrievable, Reusable Policy discussion point with the 4R Fund/IPNI: how long of an embargo post project completion….????
  • 29. Hands-on Help: Ag. Research Librarians will help 4R-funded Researchers with Workflows, Policies and Procedures for Curation, Preservation and Publication of Their Data Including: Persistent data formats Licensing data (privacy requirements & policy) Meta-data / other tags for data discovery Versioning of accruing data sets Supporting documentation Data publishing Assigning DOIs PURR has a 10yr commitment to data set preservation / options beyond 10yr Policies / mechanisms for novel public/private partnerships for data stewardship Business models for open access data
  • 30. Hands-on help from Agronomists: Best practices have to become easier to do than not to do… • Tough Lessons: Single biggest mistake we have & can make is “build it and they will come” & not providing enough help • Many datasets need “special treatment” • 4R-RR “Data Buddy” Work one-on-one w/ PIs to help transition their data from their computers to PURR Assist with standards: data and meta-data Make certain minimum data sets are acquired. • Challenge: “Data buddies” are hard to find! • Youth: not enough wisdom about the culture of the science • Established Scientists: don’t have the data skills, time or both Meadow Creek Students Partner as Data Buddies (image: http://www.hebisd.edu/media/images/articles/2763f.jpg)
  • 31. Final thought: Description not prescription •Second biggest mistake…? •Templates!!!! •Solution: •Data dictionaries
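
The closing point above (describe the data with a data dictionary rather than prescribe a template) can be illustrated with a small sketch. This is not PURR's or the 4R-RR's actual tooling: the `FieldSpec` structure, the `check_rows` helper, and the column names, units, and ranges are all hypothetical, chosen only to resemble an alfalfa P/K dataset. The idea is that each researcher keeps their own column layout and supplies a machine-readable description of it, which can then drive basic QA/QC.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldSpec:
    """One data-dictionary entry describing a single column (hypothetical structure)."""
    name: str
    description: str
    unit: str
    dtype: type
    minimum: Optional[float] = None
    maximum: Optional[float] = None


# Hypothetical dictionary for an alfalfa P/K dataset; names, units and ranges are assumptions.
DATA_DICTIONARY = [
    FieldSpec("plot_id", "Field plot identifier", "none", str),
    FieldSpec("k_rate", "Potassium fertilizer rate applied", "kg K2O/ha", float, 0.0, 600.0),
    FieldSpec("yield_dm", "Alfalfa dry matter yield", "Mg/ha", float, 0.0, 30.0),
]


def check_rows(rows, dictionary):
    """Return a list of human-readable problems found when comparing rows to the dictionary."""
    problems = []
    known = {spec.name for spec in dictionary}
    for i, row in enumerate(rows):
        for spec in dictionary:
            if spec.name not in row:
                problems.append(f"row {i}: missing '{spec.name}'")
                continue
            try:
                value = spec.dtype(row[spec.name])
            except (TypeError, ValueError):
                problems.append(f"row {i}: '{spec.name}' is not {spec.dtype.__name__}")
                continue
            if spec.minimum is not None and value < spec.minimum:
                problems.append(f"row {i}: '{spec.name}' below {spec.minimum} {spec.unit}")
            if spec.maximum is not None and value > spec.maximum:
                problems.append(f"row {i}: '{spec.name}' above {spec.maximum} {spec.unit}")
        # Columns present in the data but absent from the dictionary are undescribed.
        for column in row:
            if column not in known:
                problems.append(f"row {i}: undescribed column '{column}'")
    return problems


if __name__ == "__main__":
    sample = [
        {"plot_id": "A1", "k_rate": "100", "yield_dm": "12.5"},
        {"plot_id": "A2", "k_rate": "-5", "yield_dm": "abc", "notes": "x"},
    ]
    for problem in check_rows(sample, DATA_DICTIONARY):
        print(problem)
```

Because the dictionary travels with the dataset, the same check works for any column layout a researcher already uses, which is the sense in which description is less burdensome than a prescribed template.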