A Data Scientist Perspective on Data Curation in the Digital Era

Perspective from a marine geoscientist turned data scientist on career opportunities and educational requirements in the "Era of Big Data"

Published in: Education, Technology, Business

  1. Symposium on Digital Curation in the Era of Big Data: Career Opportunities and Educational Requirements: A Data Scientist Perspective
     Dr. Vicki Lynn Ferrini, Lamont-Doherty Earth Observatory
  2. Background (What I do)
     • Data Documentation (Metadata)
     • Data Management
     • Data Discovery & Access Tools
     • Develop/Implement QA/QC
     • Data Syntheses
     • Data Compliance Tools
     • Education Materials
     • Delivery to National Data Centers, Libraries
     • Data Publication & Links to Scientific Literature
     • Data Integration, Visualization & Analysis Tools
     • Best Practice Guidelines for Optimizing Acquisition
     “Support, sustain, and advance the geosciences by providing data services for observational solid earth data from the Ocean, Earth, and Polar Sciences.”
     rvdata.us
  3. Scientific Data Continuum
     THEN: Data Producers, Scientific Literature, Data Consumers
     NOW: Data Producers, Scientific Literature, Data Providers, Data Consumers
     Varying Goals/Perspectives/Needs
  4. Perspective of Data Producers
     Domain Specialists
     • Goal: Scientific Discovery
     • Data Acquisition & Reduction
     • Data Assembly
     • Visualization, Integration & Interpretation
     • Scientific Standards
     • Technical & Operational Limitations
     • Data documentation
     • Varies by domain
     • Often difficult
     • Heterogeneous
  5. Perspective of Data Consumers
     Domain Specialists & Public
     • Goal: Discovery
     • Data Discoverability & Access
     • Cross-disciplinary
     • Scientific Standards
     • Interpretation
     • Increased importance of documentation
     • Data not self-generated
     • Data Quality/Reliability
     • Data Use/Misuse
  6. Perspective of Data Providers
     • Goal: Access/Preservation/Re-Use
     • Data Formats & Standards
     • Data Documentation & Preservation Techniques
     • Scientific & Metadata Standards
     • Data Citation
     • Data Transfer Mechanisms
     • System Usability
     • Interoperability/Linked Data
     • Needs of Diversity of User Community
     • Knowledge of Content
     Human & Digital Bridge between Producers & Consumers
  7. At the Intersection: The Data Scientist
     Data Producers, Data Consumers, Data Providers
  8. Data Stewardship Continuum
     DATA PRODUCERS, DATA PROVIDERS, DATA CONSUMERS
     Data Scientist
  9. Key Attributes of Data Scientists
     • Knowledge spanning full scientific data stewardship continuum
     • Domain Experience
     • Content & applications
     • Data acquisition & reduction practices
     • Nuances of Data
     • Technical knowledge
     • Evolving Technologies
     • Data Acquisition & Management
     • Metadata
  10. Key Attributes of Data Scientists
      • Other skills (seldom taught)
      • Communication & Organization
      • Understand cultural aspects of user community
      • People/Project Management
      • Balance between micro- and macro-perspectives
  11. Key Attributes Tech Team Members
      • Basic knowledge of content OR interest/curiosity
      • Experience with Data Production/Consumption
      • Technical skills:
        – web development & technology
        – geospatially enabled data management tools
        – experience with data analysis tools
        – ability to work in a variety of tech environments
      • Complementary skill sets
      • Innovation & creativity
      • Willingness to ask questions
        – assumptions can be dangerous
  12. Challenges & Opportunities
      • Difficult to find right balance between technical skills and interest in content
        – Team dynamics, management approaches evolving
        – Increasing opportunities to engage/educate computer scientists in domain science
      • Data producers are slow to join the digital era
        – Educational opportunities
        – Scientific benefits continue to grow
        – New generation incorporating data sharing into scientific workflow
      • Difficult to keep pace with evolving technologies
        – Educational & Professional Development opportunities
  13. The Future?
      Data Scientists, Data Producers, Data Consumers, Data Providers
