NETWORKED SCIENCE AND MACHINE LEARNING
JOAQUIN VANSCHOREN (TU/E), 2014
#OpenML
1610
GALILEO GALILEI
DISCOVERS SATURN’S RINGS
‘SMAISMRMILMEPOETALEUMIBUNENUGTTAUIRAS’
How do you convince scientists to
share their discoveries?
17TH CENTURY
JOURNAL SYSTEM
REPUTATION-BASED ECONOMY
NETWORKED SCIENCE TODAY
Online scholarly tools
Share data, code impossible to print in journals
Collect, organise, analyse all data
Collaborate in real time with hundreds of scientists
SCALING UP COLLABORATION
• Large-scale collaborations change the way we
make discoveries
• Massively collaborative science
• Open data: mapping and mining
• Citizen science
DESIGNED SERENDIPITY
• Many scientists have complementary
expertise
• Right expertise at the right time
• Ideas spark new ideas, questions get
answered, data and tools reused in
unexpected ways
• ‘Happy accidents’ common in large collaborations
DYNAMIC DISTRIBUTION OF LABOR
• Scientists have complementary skills:
generate ideas, experiment, analyse,
interpret
• Right skills, resources, time at the right
time
• Dramatically speeds up progress
• What is impossibly hard for one scientist is routine
for another
SCALING UP COLLABORATION
• Online tools: contribute any amount at any time
• Encourage small contributions
• Subtasks that can be attacked independently
• Rich, structured information commons
• Architecture of attention
• Honor code
How do you convince scientists to share
their ideas, data, code?
MASSIVELY COLLABORATIVE SCIENCE
POLYMATHS
POLYMATH PROJECTS
• Designed serendipity
• Broadcast question hoping that many minds may find a
solution
• “find myself having thoughts I would not have had
without some chance remark of another contributor”
• Dynamic division of labor
• Throwing out ideas, criticising, testing ideas,
synthesising, reformulating, coordinating,…
WHY SHARE IDEAS?
• Authorship: contributions clearly visible, self-reporting publication
• Visibility: earn respect from notable peers
• Scalability: over many projects, concentrate on
where you have special insight and advantage
• Interaction: share ideas early (before others),
ideas are quickly developed, corrected
OPEN DATA
SDSS
SLOAN DIGITAL SKY SURVEY
• Designed serendipity
• Broadcast data, believing that many minds will ask
unanticipated questions
• More data than a single person can comprehend:
challenge is asking the right questions
• Dynamic division of labor
• Collect data, ask questions, mine the data
WHY SHARE DATA?
• Fame: releasing the data yields more citations:
people more likely to build on it
• Funding: sharing data increases value of
research to community as a whole, increasing
chances of continued funding
CITIZEN SCIENCE
GALAXY ZOO
GALAXY ZOO
• Designed serendipity
• Unexpected observations reported on forum
• Accidental discovery of new classes of objects: green
pea galaxies, passive red spirals, Hanny’s Voorwerp
• Dynamic division of labor
• Huge task subdivided into many small tasks which can
be easily learned
WHY VOLUNTEER?
• Discovery: being the first to see a galaxy
• Progress: understanding universe, beating
cancer,…
• Fun: gamification
• Learning: learning more about a science/topic
• Community: meeting like-minded people
MACHINE LEARNING
• Good candidate for networked science
• Highly complex data, code, workflows, yet most work
published in papers (graphs, pseudocode)
• Experiments are not shared online: impossible to
build on prior work, start each time from scratch
• Low generalisability: studies contradict
• Low reproducibility: code, experiment details missing
• Place to share data in fine detail, and organise it to work more
effectively, be more visible, collaborate, tackle hard problems
• Links to data available anywhere online, integrated in popular
machine learning environments (WEKA, R, MOA, RapidMiner)
• Website to find data, code, results; discuss, compare, visualise
Data Tasks Flows Runs Studies
Demo
DATA
FLOWS
TASKS
RUNS
RUNS: DATASETS
RUNS: FLOWS
UNEXPECTED
Plugins
WEKA PLUGIN
MOA PLUGIN
R PLUGIN
RAPIDMINER
1. OPERATOR TO DOWNLOAD TASK (TASK TYPE SPECIFIC)
2. SUBWORKFLOW THAT SOLVES THE TASK, GENERATES RESULTS
3. OPERATOR FOR UPLOADING RESULTS
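The three RapidMiner steps (download a task, solve it in a subworkflow, upload the results) can be sketched roughly as below. This is only an illustration under stated assumptions: `Task`, `InMemoryServer`, `majority_class_flow` and `run_workflow` are hypothetical names invented for this sketch, the server is an in-memory stand-in, and none of it is the actual RapidMiner or OpenML API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Every name in this sketch is hypothetical; it is not the OpenML or RapidMiner API.


@dataclass
class Task:
    """A toy task: labelled training rows plus unlabelled test rows."""
    task_id: int
    task_type: str
    train: List[Tuple[List[float], str]]
    test: List[List[float]]


@dataclass
class InMemoryServer:
    """Stand-in for a server that hands out tasks and stores uploaded runs."""
    tasks: Dict[int, Task]
    runs: List[dict] = field(default_factory=list)

    def download_task(self, task_id: int) -> Task:
        return self.tasks[task_id]

    def upload_run(self, task_id: int, flow_name: str, predictions: List[str]) -> int:
        run_id = len(self.runs) + 1
        self.runs.append({
            "run_id": run_id,
            "task_id": task_id,
            "flow": flow_name,
            "predictions": predictions,
        })
        return run_id


def majority_class_flow(task: Task) -> List[str]:
    """A deliberately trivial 'subworkflow': always predict the most common label."""
    labels = [label for _, label in task.train]
    # sorted() makes tie-breaking deterministic (alphabetically first label wins).
    majority = max(sorted(set(labels)), key=labels.count)
    return [majority for _ in task.test]


def run_workflow(server: InMemoryServer, task_id: int,
                 flow: Callable[[Task], List[str]], flow_name: str) -> int:
    task = server.download_task(task_id)                        # step 1: download task
    predictions = flow(task)                                    # step 2: solve the task
    return server.upload_run(task_id, flow_name, predictions)   # step 3: upload results
```

Keeping the solving step as a plain function argument mirrors the idea that only the middle step is specific to the researcher; the download and upload steps stay the same for every flow.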
OPENML CONNECT
• Library for Java
• Package for R
• In progress: Module for Python
• In progress: Command-line tools
FOR SCIENCE
DESIGNED SERENDIPITY
• ‘Impossible’ questions become possible by reusing
prior experiments
• Answer routine questions in minutes
• Mine all collected results for patterns: meta-learning
• Browse all data for unexpected results
• Reuse code, data in novel ways
DYNAMIC DIVISION OF LABOR
• Scientists can focus attention on important problems by
adding data, collaborate with community
• Large collaborations: OpenML organizes all results to
follow progress
• Benchmark studies: only run algorithms you know well,
reuse all other results
• Students, citizen scientists can contribute data, runs
through plugins
EXAMPLE: META-QSAR PROJECT
• Large amounts of QSAR data available
• Not known which machine learning techniques are best
• OpenML used to try many algorithms and learn when
to use which techniques
• Applications in fighting malaria
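One simple way to "learn when to use which technique" from stored results is a nearest-neighbour recommender over dataset meta-features. The sketch below is an assumption chosen for illustration, not the method used in the Meta-QSAR project: the function name `recommend`, the meta-features (number of rows, number of features) and the algorithm labels are all hypothetical toy values.

```python
import math
from typing import List, Tuple

# Hypothetical sketch: 1-nearest-neighbour meta-learning, not the Meta-QSAR method.


def recommend(meta_features: List[float],
              history: List[Tuple[List[float], str]]) -> str:
    """Return the best-known algorithm of the most similar past dataset.

    `history` pairs each earlier dataset's meta-features with the name of the
    algorithm that performed best on it.
    """
    if not history:
        raise ValueError("history must not be empty")
    # Pick the past dataset whose meta-features are closest in Euclidean distance.
    nearest = min(history, key=lambda item: math.dist(item[0], meta_features))
    return nearest[1]


# Toy usage with made-up meta-features: (number of rows, number of features).
toy_history = [
    ([100.0, 10.0], "random forest"),
    ([10000.0, 500.0], "linear model"),
]
suggestion = recommend([120.0, 12.0], toy_history)
```

In practice the meta-features would need scaling so that one large-valued feature does not dominate the distance; that step is left out to keep the sketch short.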
BEYOND JOURNALS
• Enriches research output, linked to papers
• Freely accessible
• Organized online
• Low threshold for students
• Continuously updated
• Immensely detailed
• Reproducible
• Stimulates online discussion
• Diminishes publication bias
SCALABILITY
• Easy to make small contributions: add data, code, run
experiments using plugins, leave comments
• Split up complex studies: OpenML tasks
• Rich, structured data: all data, flows, runs, users linked.
Keyword search, filters, SQL endpoint
• Data easily filtered: easy to focus on your interests
• Enforce scientific standards: task types, verifiability,
server-side evaluations, clear attribution, honor code
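To illustrate what linked data, flows and runs behind an SQL endpoint make possible, here is a minimal sketch using an in-memory SQLite database. The table names, column names and values are toy assumptions made up for this example; they are not OpenML's actual schema or real results.

```python
import sqlite3

# Hypothetical toy schema and values; not the real OpenML database.


def build_toy_commons() -> sqlite3.Connection:
    """Create an in-memory database where every run links a dataset to a flow."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE flow (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE run (
            id INTEGER PRIMARY KEY,
            dataset_id INTEGER REFERENCES dataset(id),
            flow_id INTEGER REFERENCES flow(id),
            uploader TEXT,
            accuracy REAL
        );
        INSERT INTO dataset VALUES (1, 'toy-a'), (2, 'toy-b');
        INSERT INTO flow VALUES (1, 'flow-x'), (2, 'flow-y');
        INSERT INTO run VALUES
            (1, 1, 1, 'alice', 0.80),
            (2, 1, 2, 'bob', 0.90),
            (3, 2, 1, 'alice', 0.70);
    """)
    return conn


def best_flow_per_dataset(conn: sqlite3.Connection) -> list:
    """For each dataset, return (dataset, flow, accuracy) of its best run."""
    return conn.execute("""
        SELECT d.name, f.name, r.accuracy
        FROM run r
        JOIN dataset d ON d.id = r.dataset_id
        JOIN flow f ON f.id = r.flow_id
        WHERE r.accuracy = (
            SELECT MAX(r2.accuracy) FROM run r2 WHERE r2.dataset_id = r.dataset_id
        )
        ORDER BY d.name
    """).fetchall()
```

Because every run carries its dataset, flow and uploader, questions such as "which flow did best on each dataset" reduce to a join instead of a new round of experiments.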
FOR SCIENTISTS
MORE TIME
• OpenML assists in most routinizable work:
• Find code and data online
• Set up, run & organize experiments
• Relate to state-of-the-art (benchmarks)
• Annotate code and data
• Full log of your research
• Keep control of your data, code, experiments
• Follow experiments on the go (mobile devices)
MORE KNOWLEDGE
• Your results linked to everybody else’s
• Larger, more general studies
• Answer more questions
• Mine all combined results
• Find unexpected results
• Interact with others on global scale, get help
• Collaborate with scientists from other fields
MORE CREDIT
• Citation
• OpenML attributes data, flows, runs, tells others how to cite it
• Easier for others to find
• Altmetrics: track how often your work is reused
• Productivity: contribute efficiently to many studies
• Visibility: collaborate, climb leaderboards, self-publish (tweet)
• Funding: convincing way to make data open
• No publication bias: unexpected results
FUTURE WORK
• OpenML studies: online representation of paper:
data, code, runs, discussions,…
• Social layer: control visibility: public, friends, private
• Collaborative leaderboards: all top-3 contributors
• Discussion forum for unexpected results
• More data types, tasks
SPREAD THE WORD, WORK OPENLY
#OpenML
