1. The changing landscape of science
Chelle L. Gentemann
Earth & Space Research
cgentemann@esr.org
Research supported by the Schmidt
Family Foundation, Saildrone, Inc., and
NASA Physical Oceanography
Image Credit: Saildrone, Inc.
2. Background: sharing data
• So… sharing is easy… if you want to.
1990s: tape drives
2018: cheap local storage
Cloud storage options
3. Sharing data
Figure credit: Grandjean, Martin (2014). "La connaissance est un réseau". Les Cahiers du Numérique
10 (3): 37-54. DOI:10.3166/LCN.10.3.37-54
4. But people still don’t share data….
Legal reasons…
data includes confidential information
data collection funded by agency/institution with a closed policy
security (export control) issues
Career / funding:
I collected this data. It is mine.
I won’t advance if I don’t publish, I won’t get more funding on this topic.
Other concerns:
Other researchers will misuse or misinterpret the data
It takes too much effort to document the data
Other people helped produce the data; they will never agree to share it
I’m a scientist, not a data provider; this isn’t my job
5. Sharing data
Figure credit: Grandjean, Martin (2014). "La connaissance est un réseau". Les Cahiers du Numérique
10 (3): 37-54. DOI:10.3166/LCN.10.3.37-54
7. Digital Object Identifier (DOI)
Share data, but ensure credit.
If you are a civil servant, public
domain applies.
Otherwise license your data.
https://creativecommons.org/licenses/
Image credit: https://www.ands.org.au/__data/assets/pdf_file/0006/715155/Digital-Object-Identifiers.pdf
Image credit: Creative Commons guide, sourced from "How To Attribute Creative Commons Photos"
9. Journal policies are changing
https://publications.agu.org/author-resource-center/publication-policies/data-policy/
AGU affirmed in its 2012 position statement that “Earth and space science data should
be widely accessible in multiple formats and long‐term preservation of data is an integral
responsibility of scientists and sponsoring institutions.” Following this statement and to
advance scientific exploration and discovery, and allow a full assessment of results
presented in AGU’s journals, all data necessary to understand, evaluate, replicate, and
build upon the reported research must be made available and accessible whenever
possible.
For the purposes of this policy, data include, but are not limited to, the following:
Data used to generate, or be displayed in, figures, graphs, plots, videos, animations, or
tables in a paper.
New protocols or methods used to generate the data in a paper.
New code/computer software used to generate results or analyses reported in the paper.
Derived data products reported or described in a paper.
10. Federal Policy
• 2013 OSTP memo, “Increasing Access to the Results of Federally Funded Scientific
Research”, aims to ensure “that, to the greatest extent and with the fewest
constraints possible … , the direct results of federally funded scientific research are
made available to and useful for the public, industry, and the scientific community.
Such results include peer-reviewed publications and digital data.”
• 2016 OMB memo, “Federal Source Code Policy: Achieving Efficiency, Transparency,
and Innovation through Reusable and Open Source Software” (M-16-21), requires
agencies to consider the value of publishing the code they develop as open-source
software and to establish requirements for releasing custom-developed source code.
11. NOAA open data
• In 2004, NOAA policy stated: “NOAA will
promote the open and unrestricted exchange of
environmental information worldwide.”
• As NOAA CIO Zach Goldstein put it: “It’s our job
to get that data out there. The data doesn’t
belong to us, it belongs to the American
people.”
• In 2017, NOAA created a Chief Data Officer
position
http://odimpact.org/files/case-studies-noaa.pdf
Quote from GovLab interview with Zachary Goldstein, Chief Information Officer,
NOAA, September 3, 2015.
13. Open source software
• The next logical step after open data is open
software
– Reproducibility
– Advancing science
– Accelerating science
• Tools are now mature enough to make open
software relatively easy (GitHub, GitLab,
Bitbucket)
14. Open source collaborations expanding
Come gather 'round people
Wherever you roam
And admit that the waters
Around you have grown
And accept it that soon
You'll be drenched to the bone.
If your time to you
Is worth savin'
Then you better start swimmin'
Or you'll sink like a stone
For the times they are a-changin'.
-Bob Dylan
17. Expand
Openly sharing software creates a traceable
resource for your work. Use GitHub + Zenodo to get a DOI.
Putting figures on figshare under a CC-BY license lets
people reuse them easily and ensures credit.
Putting data in Dataverse or an institutional archive,
with an assigned DOI and license, ensures credit and
visibility.
Make it easy to find your work
Make it easy to cite your work
Make it easy to collaborate with you
18. Using Saildrone autonomous in situ data for satellite validation
and research into upper ocean physics and ecology
Co-Investigators: S. Akella, I. Cetinić, Y. Chao, M. Chin, M. Daugharty,
K. Dohan, J. Dorman, M. Fewings, X. Flores-Vidal, B. Fox-Kemper, B.
Franz, M. García-Reyes, J. Gomez Valdes, E. Hazen, J. Høyer, J.
Largier, P. Mazzini, J. Scott, W. Sydeman, J. Vazquez, F. Veron, J.
Werdell, L. Yu, K. Zaba.
Institutions: Brown University, CODAR Ocean Sensors, Danish
Meteorological Institute, Earth and Space Research, Ensenada Center for
Scientific Research and Higher Education, Farallon Institute, NASA Jet
Propulsion Laboratory, NASA GMAO, NASA GSFC, Remote Sensing
Solutions, San Francisco State University, Science Systems and
Applications Inc., Scripps Institution of Oceanography, Universities Space
Research Association, University of Baja California, University of
California Davis, University of California Santa Cruz, University of
Connecticut, University of Miami, University of Rhode Island, University of
Delaware, Woods Hole Oceanographic Institution.
Project funded by:
Saildrone Inc. &
The Schmidt Family Foundation
Image credit: Saildrone, Inc.
C. Gentemann, ESR
P. Minnett, U. Miami
P. Cornillon, U. Rhode Island
22. Baja Cruise
11 April – 11 June 2018
60-day cruise
Along-wind and across-wind
sampling of fronts
Data freely available (format
finalized soon) on Google
Drive (soon NASA PO.DAAC)
CC-BY-NC license
Software repository for
project:
https://github.com/cgentemann/Saildrone
23. April 11 - June 11, 2018
4 temperature loggers added by NASA
Physical Oceanography Program
295mm
500mm
985mm
1420mm
1785mm
Assistant Scientists
Saildrone Engineers
24. Baja Cruise: real time data
Data explorer: web interface provided by Saildrone to visualize
data while the cruise is under way. Data from instruments on the
Saildrone, as well as model analyses (SST, SSS, currents, etc.),
are shown.
25. Direct and task USV
Data explorer: web interface for tasking the USV. This allows
control of the USV to sample fronts and adjust the track as they
move. Points are set, each with a ‘width’ that controls how far
the USV is allowed to deviate from the track.
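One way to picture the ‘width’ setting is as a limit on the cross-track distance between the vehicle and the planned leg. The sketch below is illustrative only and is not Saildrone's tasking software: it assumes a spherical Earth, great-circle legs between two points, and that ‘width’ means the maximum allowed distance from the track. All function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical-Earth assumption


def _bearing(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2, in radians."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlon)
    return math.atan2(y, x)


def _angular_distance(lat1, lon1, lat2, lon2):
    """Great-circle angular distance (haversine formula), in radians."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * math.asin(min(1.0, math.sqrt(a)))


def cross_track_distance_m(start, end, position):
    """Distance in metres from `position` to the great circle through `start` and `end`.

    Each argument is a (lat, lon) tuple in degrees.
    """
    d13 = _angular_distance(*start, *position)
    theta13 = _bearing(*start, *position)
    theta12 = _bearing(*start, *end)
    return abs(math.asin(math.sin(d13) * math.sin(theta13 - theta12))) * EARTH_RADIUS_M


def within_corridor(start, end, position, width_m):
    """True if the vehicle is no farther than `width_m` from the planned leg."""
    return cross_track_distance_m(start, end, position) <= width_m
```

For example, `within_corridor((0.0, 0.0), (0.0, 1.0), (0.01, 0.5), 2000.0)` checks whether a vehicle 0.01° north of an equatorial leg is inside a 2 km limit.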
26. VIIRS SST and USV track
Track designed to sample
different types of fronts, provide
data for validation of satellite
environmental products, and
areas with diurnal warming
events
27. Physics of observation
At 55° the emissivity is 0.96
Emissivity changes with the angle
of observation
In reality, reflected
radiance is quasi-specular
B(Tskin)
B(Tsky)
The CT15 measures both the skin temperature and the
reflected sky temperature
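The relationship on this slide can be written as measured radiance = ε·B(Tskin) + (1 − ε)·B(Tsky), which can be inverted for Tskin if the sky radiance is known. The sketch below is a simplified illustration, not the processing actually used for the CT15: it assumes a single wavelength (10 µm, chosen only for illustration; a real radiometer integrates over its spectral response) and purely specular reflection. The emissivity of 0.96 comes from the slide; everything else is an assumption.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
K_B = 1.380649e-23   # Boltzmann constant, J/K


def planck_radiance(temp_k, wavelength_m):
    """Blackbody spectral radiance B(T) at one wavelength (W m^-2 sr^-1 m^-1)."""
    c1 = 2.0 * H * C**2 / wavelength_m**5
    c2 = H * C / (wavelength_m * K_B)
    return c1 / math.expm1(c2 / temp_k)


def inverse_planck(radiance, wavelength_m):
    """Temperature (K) of a blackbody emitting `radiance` at this wavelength."""
    c1 = 2.0 * H * C**2 / wavelength_m**5
    c2 = H * C / (wavelength_m * K_B)
    return c2 / math.log1p(c1 / radiance)


def measured_radiance(t_skin_k, t_sky_k, emissivity, wavelength_m):
    """Radiance seen by the sensor: emitted skin signal plus reflected sky signal."""
    emitted = emissivity * planck_radiance(t_skin_k, wavelength_m)
    reflected = (1.0 - emissivity) * planck_radiance(t_sky_k, wavelength_m)
    return emitted + reflected


def skin_temperature(radiance, t_sky_k, emissivity, wavelength_m):
    """Remove the reflected sky term, then invert the Planck function for Tskin."""
    reflected = (1.0 - emissivity) * planck_radiance(t_sky_k, wavelength_m)
    return inverse_planck((radiance - reflected) / emissivity, wavelength_m)
```

Because a clear sky is usually much colder than the sea surface, skipping the sky correction (calling `inverse_planck` directly on the measured radiance) yields a temperature colder than the true skin temperature in this model.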
28. Skin minus bulk difference
Noise due to reflected sky radiation?
Diurnal warming of surface skin layer
29. Bulk SST
Two measurements of ‘bulk’ SST at 0.6 m depth, one from the O2 sensor and one
from a CTD. A comparison between the two ‘bulk’ SSTs is shown below. There is a
*very* small, wind-speed-dependent difference, but the two sensors are essentially
measuring the same temperature independently, to O(0.01).
Bulk SST is a high-quality
observation; the bias is NOT in
the bulk SST
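A comparison like this comes down to the mean and spread of the sensor-to-sensor difference, overall and binned by wind speed. The sketch below shows one minimal way to compute those numbers; the function names and bin edges are hypothetical, and it does not reproduce the analysis behind the slide.

```python
from statistics import mean, stdev


def difference_stats(sst_a, sst_b):
    """Mean and sample standard deviation of the pointwise difference a - b."""
    diffs = [a - b for a, b in zip(sst_a, sst_b)]
    return mean(diffs), stdev(diffs)


def mean_difference_by_wind(sst_a, sst_b, wind_speed, bin_edges):
    """Mean a - b within each wind-speed bin [edge_i, edge_i+1).

    Returns one value per bin, or None for bins with no samples.
    """
    results = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        diffs = [a - b for a, b, w in zip(sst_a, sst_b, wind_speed) if lo <= w < hi]
        results.append(mean(diffs) if diffs else None)
    return results
```

A mean difference near zero in every wind bin would support the claim that the two sensors agree; a trend across bins would indicate the wind-speed dependence mentioned above.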
30. Collocated Saildrone data with GOES 16 SST data
Cloudy day in SD area
Clear sky in SD area, cloud contamination in other areas
SD track for the day shown in pink. 24-hour average of GOES SSTs shown in
image. Missing data means that it was cloudy. Cold ‘speckle’ shows cloud
contamination in SSTs. Cloud mask not perfect.
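Collocation of this kind pairs each Saildrone observation with the nearest satellite grid cell and drops cells flagged as cloudy. The sketch below is a simplified illustration under stated assumptions: a regular latitude/longitude grid, cloudy cells stored as NaN, no time matching, and a hypothetical `max_offset_deg` tolerance. It is not the matchup procedure used for the figures here.

```python
import math


def nearest_index(axis_values, target):
    """Index of the grid coordinate closest to `target`."""
    return min(range(len(axis_values)), key=lambda i: abs(axis_values[i] - target))


def collocate(track, grid_lats, grid_lons, grid_sst, max_offset_deg=0.05):
    """Pair each (lat, lon, sst) track point with the nearest satellite grid cell.

    Cloudy cells (NaN) are skipped, as are track points farther than
    `max_offset_deg` from the nearest cell centre in latitude or longitude.
    Returns a list of (in_situ_sst, satellite_sst) tuples.
    """
    matchups = []
    for lat, lon, insitu_sst in track:
        i = nearest_index(grid_lats, lat)
        j = nearest_index(grid_lons, lon)
        if abs(grid_lats[i] - lat) > max_offset_deg:
            continue
        if abs(grid_lons[j] - lon) > max_offset_deg:
            continue
        sat_sst = grid_sst[i][j]
        if math.isnan(sat_sst):
            continue  # no clear-sky retrieval in this cell
        matchups.append((insitu_sst, sat_sst))
    return matchups


def mean_bias(matchups):
    """Mean satellite-minus-in-situ difference over all matchups."""
    return sum(sat - insitu for insitu, sat in matchups) / len(matchups)
```

Note that NaN screening only removes cells the cloud mask caught; residual cloud contamination, such as the cold ‘speckle’ described above, would still enter the matchups and pull the mean bias negative.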
31. Time series of collocated GOES and SD data
Cloudy, no GOES matchups, smaller skin SST bias
Clear sky, GOES SST matchups, large negative bias
32. Time series of collocated GOES and SD data: VERIFY, verify, verify.
Cloudy, no GOES matchups, smaller skin SST bias
Clear sky, GOES SST matchups, large negative bias
33. Upper ocean diurnal warming
Mixing in the upper ocean
This is how solar energy is transferred
into the ocean resulting in the seasonal
cycle of temperature
How does upper ocean stratification
change rates of mixing?
35. Next steps
• Organize research / publications on topics:
• Quality of Saildrone observations, what looks good, what
needs to be flagged, share flagged values
• Prof. Gomez collocated drifters, circulation study
• Zaba collocated glider ADCP analysis
• Satellite – buoy – Saildrone collocation analysis,
validation of SST, ocean color, salinity
• HF Radar surface currents, OSCAR currents, and
Saildrone validation
• Coastal front analysis
• Offshore front analysis
• Across-wind / along-wind front differences
• Circulation in frontal regions
• Baroclinic instability waves along fronts
• Diurnal warming in surface layer
37. 2019-2022: 5 Arctic Cruises
Image credit: NOAA PMEL
Image credit: Saildrone
NASA Physical
Oceanography
Program
38. Summary
Open data policy for data from the first day of the cruise.
Open source software repository on GitHub.
Partners on the project have given talks and shared the
Google Drive link.
Initially, data were shared via Google Drive with anyone who
requested access, under a CC-BY-NC license.
Data will be formally shared via the NASA Physical Oceanography
Distributed Active Archive Center (PO.DAAC) soon.
Format and metadata are finalized.
Working on documentation.