SlideShare a Scribd company logo
1 of 49
Download to read offline
Dat: version and share
your data
Karissa McKelvey
Software Developer and Project Manager
and Science Evangelist and Designer (I wear a lot of hats)
U.S. Open Data
@karissamck
karissa $ ~
dat is a non profit
Reproducible Research
“A rule of thumb … is that half of published
research cannot be replicated”
How do we replicate research today?
How do we replicate research today?
collaborate on
How do we replicate research today?
collaborate on
data analysis
How do we collaborate today?
How do we collaborate today?
How do we collaborate today?
How do we collaborate today?
????????
How do we replicate research today?
me@home $ dat push
me@campus $ dat pull
you@work $ dat clone
dat workflow
• import
• version
• publish
• replicate
.csv.csvdata
you
.csv.csvdata
you
.csv.csvdata
you
.csv.csvdata
import
you
$ dat init
$ dat add dataset cities
$ dat add rows cities cities.csv
$ dat add files cities city_model.gz
import
.csv.csvdata
import
http://my-data.bids.edu
you
$ dat listen
.csv.csvdata
import
http://my-data.bids.edu
publish
you
.csv.csvdata
import
http://my-data.bids.edu
publish
you
$ dat clone
.csv.csvdata
import
http://my-data.bids.edu
publish
.csv.csvdata
you
Versioning
$ dat add files cities us_cities_viz.png
This will override us_cities_viz.png at
c2342. OK?
$ dat cities add rows updated_data.csv
This will update 3,434,245 rows. OK?
$ dat push
.csv.csvdata
import
http://my-data.bids.edu
publish
.csv.csvdata
you
http://my-data.bids.edu
publish
.csv.csvdata
http://my-data.indiana.edu
.csv
.png.png.png
.csv.csv
.csv.csv.R
.csv.csv.py
INTEROPERABILITY in Python and R
.csv
.png.png.png
.csv.csv
.csv.csv.R
.csv.csv.py
ECOSYSTEM
• Goal: manipulate datasets with scripting
• Supported keywords: run, pipe, map, reduce,
fork, keyword
• Bash-like
• Platform-independent
• Uses node.js streams (fast!)
Datscript
Top: Datscript “pipe” command
Bottom: Equivalent command in bash
Datscript: pipeline example
Datscript: example commands
background
- executes command, but doesn’t wait for it to finish


map
- pipes first argument into rest of arguments
run
- a serial command (executes and finishes command)
Karissa McKelvey - @karissamck
Melanie Cebula - @melaniecebula
http://dat-data.com
.csv
.png.png.png
.csv.csv
.csv.csv.R
.csv.csv.py
INTEROPERABILITY in Python and R
.csv
.png.png.png
.csv.csv
.csv.csv.R
.csv.csv.py
ECOSYSTEM

More Related Content

Similar to Dat 5 minute Lightning Talk

Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasMerce Crosas
 
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015gdusbabek
 
Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesDaniel S. Katz
 
Reproducibility and Dataverse
Reproducibility and DataverseReproducibility and Dataverse
Reproducibility and Dataversephilipdurbin
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache SparkMatthew Rowe
 
Hide the Stack: Toward Usable Linked Data
Hide the Stack:Toward Usable Linked DataHide the Stack:Toward Usable Linked Data
Hide the Stack: Toward Usable Linked Dataaba-sah
 
Big Data in NATO and Your Role
Big Data in NATO and Your RoleBig Data in NATO and Your Role
Big Data in NATO and Your RoleJay Gendron
 
Exploring the Semantic Web
Exploring the Semantic WebExploring the Semantic Web
Exploring the Semantic WebRoberto García
 
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Benjamin Nussbaum
 
SciSoftDays Talk - Howison: Spreading the work in software ecosystems
SciSoftDays Talk - Howison: Spreading the work in software ecosystemsSciSoftDays Talk - Howison: Spreading the work in software ecosystems
SciSoftDays Talk - Howison: Spreading the work in software ecosystemsJames Howison
 
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...ASIS&T
 
No more waiting! Tools that work Today to reveal dataset use
No more waiting!  Tools that work Today to reveal dataset useNo more waiting!  Tools that work Today to reveal dataset use
No more waiting! Tools that work Today to reveal dataset useHeather Piwowar
 
Hofstra University - Overview of Big Data
Hofstra University - Overview of Big DataHofstra University - Overview of Big Data
Hofstra University - Overview of Big Datasarasioux
 
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...Jennifer Davis
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningRahul Jain
 
Sharing - Collecting our DAH Thoughts
Sharing  - Collecting our DAH ThoughtsSharing  - Collecting our DAH Thoughts
Sharing - Collecting our DAH ThoughtsShawn Day
 
Is everything we used to do wrong?
Is everything we used to do wrong?Is everything we used to do wrong?
Is everything we used to do wrong?Russ Weakley
 

Similar to Dat 5 minute Lightning Talk (20)

Tf in-context
Tf in-contextTf in-context
Tf in-context
 
MCN 2013 Verso Presentation
MCN 2013 Verso PresentationMCN 2013 Verso Presentation
MCN 2013 Verso Presentation
 
Abcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosasAbcd iqs ssoftware-projects-mercecrosas
Abcd iqs ssoftware-projects-mercecrosas
 
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015
 
Scientific Software Challenges and Community Responses
Scientific Software Challenges and Community ResponsesScientific Software Challenges and Community Responses
Scientific Software Challenges and Community Responses
 
Reproducibility and Dataverse
Reproducibility and DataverseReproducibility and Dataverse
Reproducibility and Dataverse
 
Social Computing Research with Apache Spark
Social Computing Research with Apache SparkSocial Computing Research with Apache Spark
Social Computing Research with Apache Spark
 
Hide the Stack: Toward Usable Linked Data
Hide the Stack:Toward Usable Linked DataHide the Stack:Toward Usable Linked Data
Hide the Stack: Toward Usable Linked Data
 
Big Data in NATO and Your Role
Big Data in NATO and Your RoleBig Data in NATO and Your Role
Big Data in NATO and Your Role
 
Exploring the Semantic Web
Exploring the Semantic WebExploring the Semantic Web
Exploring the Semantic Web
 
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
Knowledge Graphs - Journey to the Connected Enterprise - Data Strategy and An...
 
Howison si2 keynote
Howison si2 keynoteHowison si2 keynote
Howison si2 keynote
 
SciSoftDays Talk - Howison: Spreading the work in software ecosystems
SciSoftDays Talk - Howison: Spreading the work in software ecosystemsSciSoftDays Talk - Howison: Spreading the work in software ecosystems
SciSoftDays Talk - Howison: Spreading the work in software ecosystems
 
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...
RDAP13 Heather Piwowar: Data Citation and Altmetrics Panel: Tools that work t...
 
No more waiting! Tools that work Today to reveal dataset use
No more waiting!  Tools that work Today to reveal dataset useNo more waiting!  Tools that work Today to reveal dataset use
No more waiting! Tools that work Today to reveal dataset use
 
Hofstra University - Overview of Big Data
Hofstra University - Overview of Big DataHofstra University - Overview of Big Data
Hofstra University - Overview of Big Data
 
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...
Magic Myth and the DevOps, ANTIDOTES TO LEARNED HELPLESSNESS AND FEAR CULTURE...
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Sharing - Collecting our DAH Thoughts
Sharing  - Collecting our DAH ThoughtsSharing  - Collecting our DAH Thoughts
Sharing - Collecting our DAH Thoughts
 
Is everything we used to do wrong?
Is everything we used to do wrong?Is everything we used to do wrong?
Is everything we used to do wrong?
 

Recently uploaded

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Researchmichael115558
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Pooja Nehwal
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Delhi Call girls
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% SecurePooja Nehwal
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...only4webmaster01
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceDelhi Call girls
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...amitlee9823
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfadriantubila
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramMoniSankarHazra
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...amitlee9823
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...amitlee9823
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...ZurliaSoop
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...amitlee9823
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsJoseMangaJr1
 

Recently uploaded (20)

Discover Why Less is More in B2B Research
Discover Why Less is More in B2B ResearchDiscover Why Less is More in B2B Research
Discover Why Less is More in B2B Research
 
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
Thane Call Girls 7091864438 Call Girls in Thane Escort service book now -
 
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts ServiceCall Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
Call Girls In Shalimar Bagh ( Delhi) 9953330565 Escorts Service
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
Call Girls in Sarai Kale Khan Delhi 💯 Call Us 🔝9205541914 🔝( Delhi) Escorts S...
 
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% SecureCall me @ 9892124323  Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
Call me @ 9892124323 Cheap Rate Call Girls in Vashi with Real Photo 100% Secure
 
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
Call Girls Indiranagar Just Call 👗 9155563397 👗 Top Class Call Girl Service B...
 
Sampling (random) method and Non random.ppt
Sampling (random) method and Non random.pptSampling (random) method and Non random.ppt
Sampling (random) method and Non random.ppt
 
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort ServiceBDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
BDSM⚡Call Girls in Mandawali Delhi >༒8448380779 Escort Service
 
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
Escorts Service Kumaraswamy Layout ☎ 7737669865☎ Book Your One night Stand (B...
 
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdfAccredited-Transport-Cooperatives-Jan-2021-Web.pdf
Accredited-Transport-Cooperatives-Jan-2021-Web.pdf
 
Capstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics ProgramCapstone Project on IBM Data Analytics Program
Capstone Project on IBM Data Analytics Program
 
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
Call Girls Hsr Layout Just Call 👗 7737669865 👗 Top Class Call Girl Service Ba...
 
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
Junnasandra Call Girls: 🍓 7737669865 🍓 High Profile Model Escorts | Bangalore...
 
Predicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science ProjectPredicting Loan Approval: A Data Science Project
Predicting Loan Approval: A Data Science Project
 
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Anomaly detection and data imputation within time series
Anomaly detection and data imputation within time seriesAnomaly detection and data imputation within time series
Anomaly detection and data imputation within time series
 
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
Call Girls Jalahalli Just Call 👗 7737669865 👗 Top Class Call Girl Service Ban...
 
Probability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter LessonsProbability Grade 10 Third Quarter Lessons
Probability Grade 10 Third Quarter Lessons
 

Dat 5 minute Lightning Talk