SlideShare a Scribd company logo
1 of 51
Dark Matter,
Public Health,
     and
  Scientific
 Computing
 Greg Wilson
“Dark Matter Developers”
Scott Hanselman (March 2012)

[We] hypothesize that there is another kind
of developer than the ones we meet all the
time. We call them Dark Matter Developers.
They don't read a lot of blogs, they never write blogs,
they don't go to user groups, they don't tweet or
facebook, and you don't often see them at large
conferences... [They] aren't chasing the latest beta
or pushing...limits, they are just producing.


http://www.hanselman.com/blog/DarkMatterDevelopersTheUnseen99.aspx
                                                                     2
I'm Not That Optimistic
●   95% of scientists have their heads down
    –   Doing science instead of talking about using GPU
        clouds to personalize collaborative management
        of reproducible peta-scale workflows
●   Not because they don't
    want to do all that Star
    Trek stuff
●   But because e-science
    is out of reach

                                                           3
Shiny Toys




             4
Grimy Reality




                5
So Here We Are

You            Them




                      6
Surely You're Exaggerating
1.   How many graduate students write shell
     scripts to analyze each new data set
     instead of running those analyses by hand?




                                              7
Surely You're Exaggerating
2.   How many of them use version control
     to keep track of their work and collaborate
     with colleagues?




                                                   8
Surely You're Exaggerating
3.   How many routinely break large problems
     into pieces small enough to be
     - comprehensible,
     - testable, and
     - reusable?




                                               9
Surely You're Exaggerating
3.   How many routinely break large problems
     into pieces small enough to be
     - comprehensible,
     - testable, and
     - reusable?
     And how many know those are the
     same things?




                                               10
“Not My Problem”
     Actually your biggest problem




If someone is chronically malnourished as a
child, giving them a CT scanner when they're
      20 has no effect on their wellbeing.

                                               11
Their Biggest Problem Too

      Them                      Us




   “I don't know much     “Let's talk about
    about this, but I'd     brain scans.
like some clean water.”    They're cool.”

                                              12
We've Left the Majority Behind




                                 13
We've Left the Majority Behind

             Other than
             Googling for things,
             the majority of scientists
             do not use computers
             more effectively today
             than they did 28 years ago.


                                          14
The Goalposts
A scientist is computationally competent if she
can build, use, validate, and share software to:
● Manage and process data


●   Tell if it's been processed correctly
●   Find and fix problems when it hasn't been
●   Keep track of what she's done
●   Share work with others
●   Do all of these things efficiently
                                                   15
A Driver's License Exam
●   Developed with the Software Sustainability
    Institute for the DiRAC Consortium
●   Formative assessment
    –   Do you know what you need to know in order to
        get the most out of this gear?
●   Pencils ready?


            Note: actual exam allows for several different
        programming languages, version control systems, etc.
                                                               16
A Driver's License Exam
1.   Check out a working copy of the exam
2.   materials from Subversion.
3.
4.
5.
6.
7.
     Could do it easily               1.0
     Could struggle through           0.5
     Wouldn't know where to start     0.0
     Don't understand the question   -1.0
                                            17
A Driver's License Exam
1.   Use find and grep in a pipe to create a
2.   list of all .dat files in the working copy,
3.   redirect the output to a file, and add that
4.   file to the repository.
5.
6.
7.
     Could do it easily                   1.0
     Could struggle through               0.5
     Wouldn't know where to start         0.0
     Don't understand the question       -1.0
                                                   18
A Driver's License Exam
1.   Write a shell script that runs a legacy
2.   program for each parameter in a set.
3.
4.
5.
6.
7.
     Could do it easily                   1.0
     Could struggle through               0.5
     Wouldn't know where to start         0.0
     Don't understand the question       -1.0
                                                19
A Driver's License Exam
1.   Edit the Makefile provided so that if any
2.   .dat file changes, analyze.py is re-run to
3.   create the corresponding .out file.
4.
5.
6.
7.
     Could do it easily                  1.0
     Could struggle through              0.5
     Wouldn't know where to start        0.0
     Don't understand the question      -1.0
                                                  20
A Driver's License Exam
1.   Write four unit tests to exercise a function
2.   that calculates running sums. Explain why
3.   your four tests are most likely to uncover
4.   bugs in the function.
5.
6.
7.
     Could do it easily                  1.0
     Could struggle through              0.5
     Wouldn't know where to start        0.0
     Don't understand the question      -1.0
                                                    21
A Driver's License Exam
1.   Explain when and how a function could
2.   pass your tests, but still fail on real data.
3.
4.
5.
6.
7.
     Could do it easily                     1.0
     Could struggle through                 0.5
     Wouldn't know where to start           0.0
     Don't understand the question         -1.0
                                                     22
A Driver's License Exam
1.   Do a code review of the legacy program
2.   from Q3 (about 50 lines long) and list
3.   the four most important improvements
4.   you would make.
5.
6.
7.
     Could do it easily               1.0
     Could struggle through           0.5
     Wouldn't know where to start     0.0
     Don't understand the question   -1.0
                                              23
How Are We Doing?
a)   How did you do?




                             24
How Are We Doing?
a)   How did you do?


b)   How do you think most scientists do?




                                            25
How Are We Doing?
a)   How did you do?


b)   How do you think most scientists do?


c)   Do you think a scientist could use
     a peta-scale GPU provenance
     cloud without these skills?


                                            26
How Are We Doing?
a)   How did you do?


b)   How do you think most scientists do?


c)   Do you think a scientist could use debug
     a peta-scale GPU provenance
     cloud without these skills?


                                                27
How Are We Doing?
a)   How did you do?


b)   How do you think most scientists do?


c)   Do you think a scientist could use debug
     a peta-scale GPU provenance        extend
     cloud without these skills?


                                                 28
So Why Is This Your Problem?
If you're only helping the small minority
lucky enough to have acquired the skills
that mastering your shiny toy depends on...




your potential user base is many times
smaller than it could be.
                                              29
It Is Therefore Obvious That...
●   Put more computing courses in the curriculum!
    –   Except the curriculum is already full




                                                    30
It Is Therefore Obvious That...
●   Put a little computing in every course!
    –   Still adds up: 5 minutes/lecture = 4 courses/degree
    –   First thing cut when running late
    –   The blind leading the blind




                                                              31
It Is Therefore Obvious That...
●   Move all our teaching online!
    –   Where do I start?
    –   Signal vs. noise
    –   “It's not a MOOC, it's a MESS”
                Everyone who understands something
                correctly understands it the same way.
                Everyone who misunderstands something
                misunderstands differently.


                                                         32
What Has Worked
●   Target graduate students




                               33
What Has Worked
●   Target graduate students
●   Intensive short courses (2 days to 2 weeks)




                                                  34
What Has Worked
●   Target graduate students
●   Intensive short courses (2 days to 2 weeks)
●   Focus on practical skills




                                                  35
What Has Worked
●   Target graduate students
●   Intensive short courses (2 days to 2 weeks)
●   Focus on practical skills
●   Follow up with 6-8 weeks of once-a-week
    online tutorials and help
    –   Still tuning this part




                                                  36
37
What We Teach
●   Unix shell
●   Python
●   Version control
●   Testing
●   Matrix programming,
    image processing,
    SQL, ...

                                 38
What We Teach
●   Unix shell
●   Python
●   Version control
●   Testing
●   Matrix programming,
    image processing,
    SQL, ...

                                 39
“That's Merely Useful”
●   None of this is publishable any longer
    –   Which means it's ineffective career-wise for
        academic computer scientists




                                                       40
“That's Merely Useful”
●   None of this is publishable any longer
    –   Which means it's ineffective career-wise for
        academic computer scientists
●   But it works
    –   Two independent assessments have validated the
        impact of what we do for 1/3-2/3 of learners
    –   That's at least a six-fold increase in the number of
        scientists who can get work done without karōshi


                                                               41
Majestic Equality

Anatole France (1844-1924)

“The law, in its majestic equality, forbids
the rich and poor alike to sleep under
bridges, to beg in the streets, and to
steal bread.”




                                              42
Majestic Equality

Anatole France (1844-1924)

“The law, in its majestic equality, forbids
the rich and poor alike to sleep under
bridges, to beg in the streets, and to
steal bread.”


Thanks to modern computers,
every scientist can now devote her working life
to installation and configuration issues.
                                                  43
If you build a man a fire,
      you'll keep him warm for a night.
           If you set a man on fire,
you'll keep him warm for the rest of his life.

                           — Terry Pratchett
                                                 44
Host a Workshop




                  45
Teach a Workshop
●   All our materials are CC-BY licensed
●   We will help train you




                                           46
Shine Some Light

●   Include a few lines about your software stack
    and working practices in your scientific papers
●   Ask about them when reviewing
    –   Where's the repository containing the code?
    –   What's the coverage of your unit tests?
    –   How did you track provenance?




                                                      47
Be the Change You Want to See


   Would you rather increase the
    productivity of the top 5% of
        scientists ten-fold?

   Or double the productivity of the
             other 95%?


                                       48
Be the Change You Want to See


   Would you rather increase the
    productivity of the top 5% of
        scientists ten-fold?

   Or double the productivity of the
             other 95%?


                                       49
Be the Change You Want to See


   Would you rather see your best
   ideas left on the shelf because
        they're out of reach?

  Or raise a generation who can all
 do the things we think are exciting?


                                        50
http://software-carpentry.org




                                51

More Related Content

What's hot

Getting Things Done for Technical Communicators
Getting Things Done for Technical CommunicatorsGetting Things Done for Technical Communicators
Getting Things Done for Technical CommunicatorsKaren Mardahl
 
The Elusive Nature of Context: Why We Need It and Were We Might Find It
The Elusive Nature of Context: Why We Need It and Were We Might Find ItThe Elusive Nature of Context: Why We Need It and Were We Might Find It
The Elusive Nature of Context: Why We Need It and Were We Might Find ItGail Murphy
 
Getting Things Done for Technical Communicators at TCUK14
Getting Things Done for Technical Communicators at TCUK14Getting Things Done for Technical Communicators at TCUK14
Getting Things Done for Technical Communicators at TCUK14Karen Mardahl
 
Fdn016 term 2 week 1 and 2
Fdn016 term 2 week 1 and 2Fdn016 term 2 week 1 and 2
Fdn016 term 2 week 1 and 2Tim Curtis
 
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...AI Frontiers
 
Hack Schooling Presentation for TIE Colorado June 2013
Hack Schooling Presentation for TIE Colorado June 2013Hack Schooling Presentation for TIE Colorado June 2013
Hack Schooling Presentation for TIE Colorado June 2013Michelle Cordy
 
SAD08 - Working With Others
SAD08 - Working With OthersSAD08 - Working With Others
SAD08 - Working With OthersMichael Heron
 
How to get what you really want from Testing' with Michael Bolton
How to get what you really want from Testing' with Michael BoltonHow to get what you really want from Testing' with Michael Bolton
How to get what you really want from Testing' with Michael BoltonTEST Huddle
 
Planning for Uncertainty
Planning for UncertaintyPlanning for Uncertainty
Planning for UncertaintyMarcin Czenko
 
Exploratory Testing in an Agile Context
Exploratory Testing in an Agile ContextExploratory Testing in an Agile Context
Exploratory Testing in an Agile ContextElisabeth Hendrickson
 
How to avoid drastic project change (using stochastic stability)
How to avoid drastic project change (using stochastic stability)How to avoid drastic project change (using stochastic stability)
How to avoid drastic project change (using stochastic stability)CS, NcState
 
James Langley presentation about Computer science & ICT curriculum
James Langley presentation about Computer science & ICT curriculumJames Langley presentation about Computer science & ICT curriculum
James Langley presentation about Computer science & ICT curriculumpetzanet.HR Kurikulum
 
Day 7 mit workshop newsletter 08 13
Day 7 mit workshop newsletter 08 13Day 7 mit workshop newsletter 08 13
Day 7 mit workshop newsletter 08 13Ken Lechtanski
 
A Technological Revolution in Automated Software Development
A Technological Revolution in Automated Software DevelopmentA Technological Revolution in Automated Software Development
A Technological Revolution in Automated Software DevelopmentGraham Kendall
 
BPM Cluster Meeting 2014
BPM Cluster Meeting 2014BPM Cluster Meeting 2014
BPM Cluster Meeting 2014Jan Claes
 

What's hot (16)

Getting Things Done for Technical Communicators
Getting Things Done for Technical CommunicatorsGetting Things Done for Technical Communicators
Getting Things Done for Technical Communicators
 
The Elusive Nature of Context: Why We Need It and Were We Might Find It
The Elusive Nature of Context: Why We Need It and Were We Might Find ItThe Elusive Nature of Context: Why We Need It and Were We Might Find It
The Elusive Nature of Context: Why We Need It and Were We Might Find It
 
Getting Things Done for Technical Communicators at TCUK14
Getting Things Done for Technical Communicators at TCUK14Getting Things Done for Technical Communicators at TCUK14
Getting Things Done for Technical Communicators at TCUK14
 
Fdn016 term 2 week 1 and 2
Fdn016 term 2 week 1 and 2Fdn016 term 2 week 1 and 2
Fdn016 term 2 week 1 and 2
 
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...
Andres Rodriguez at AI Frontiers: Catalyzing Deep Learning's Impact in the En...
 
Hack Schooling Presentation for TIE Colorado June 2013
Hack Schooling Presentation for TIE Colorado June 2013Hack Schooling Presentation for TIE Colorado June 2013
Hack Schooling Presentation for TIE Colorado June 2013
 
SAD08 - Working With Others
SAD08 - Working With OthersSAD08 - Working With Others
SAD08 - Working With Others
 
How to get what you really want from Testing' with Michael Bolton
How to get what you really want from Testing' with Michael BoltonHow to get what you really want from Testing' with Michael Bolton
How to get what you really want from Testing' with Michael Bolton
 
Planning for Uncertainty
Planning for UncertaintyPlanning for Uncertainty
Planning for Uncertainty
 
Exploratory Testing in an Agile Context
Exploratory Testing in an Agile ContextExploratory Testing in an Agile Context
Exploratory Testing in an Agile Context
 
Testing for everyone
Testing for everyoneTesting for everyone
Testing for everyone
 
How to avoid drastic project change (using stochastic stability)
How to avoid drastic project change (using stochastic stability)How to avoid drastic project change (using stochastic stability)
How to avoid drastic project change (using stochastic stability)
 
James Langley presentation about Computer science & ICT curriculum
James Langley presentation about Computer science & ICT curriculumJames Langley presentation about Computer science & ICT curriculum
James Langley presentation about Computer science & ICT curriculum
 
Day 7 mit workshop newsletter 08 13
Day 7 mit workshop newsletter 08 13Day 7 mit workshop newsletter 08 13
Day 7 mit workshop newsletter 08 13
 
A Technological Revolution in Automated Software Development
A Technological Revolution in Automated Software DevelopmentA Technological Revolution in Automated Software Development
A Technological Revolution in Automated Software Development
 
BPM Cluster Meeting 2014
BPM Cluster Meeting 2014BPM Cluster Meeting 2014
BPM Cluster Meeting 2014
 

Similar to Dark Matter, Public Health, and Scientific Computing

Software Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesSoftware Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesAron Ahmadia
 
Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Aron Ahmadia
 
30% faster coder on-boarding when you have a code cookbook
30% faster coder on-boarding when you have a code cookbook30% faster coder on-boarding when you have a code cookbook
30% faster coder on-boarding when you have a code cookbookGabriel Paunescu 🤖
 
top developer mistakes
top developer mistakes top developer mistakes
top developer mistakes Hanokh Aloni
 
Git Makes Me Angry Inside - DrupalCon Prague
Git Makes Me Angry Inside - DrupalCon PragueGit Makes Me Angry Inside - DrupalCon Prague
Git Makes Me Angry Inside - DrupalCon PragueEmma Jane Hogbin Westby
 
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012TEST Huddle
 
Become a Better Developer with Debugging Techniques for Drupal (and more!)
Become a Better Developer with Debugging Techniques for Drupal (and more!)Become a Better Developer with Debugging Techniques for Drupal (and more!)
Become a Better Developer with Debugging Techniques for Drupal (and more!)Acquia
 
Webdev and programming
Webdev and programming  Webdev and programming
Webdev and programming George Ingram
 
Life in the tech trenches (2015)
Life in the tech trenches (2015)Life in the tech trenches (2015)
Life in the tech trenches (2015)Julien SIMON
 
CTO Crunch avec Julien Simon, Viadeo
CTO Crunch avec Julien Simon, ViadeoCTO Crunch avec Julien Simon, Viadeo
CTO Crunch avec Julien Simon, ViadeoFrance Digitale
 
Originate - Think In Hours Not Sprints
Originate - Think In Hours Not SprintsOriginate - Think In Hours Not Sprints
Originate - Think In Hours Not SprintsRob Meadows
 
Surviving the technical interview
Surviving the technical interviewSurviving the technical interview
Surviving the technical interviewEric Brooke
 
User Experience Basics for Product Management
User Experience Basics for Product ManagementUser Experience Basics for Product Management
User Experience Basics for Product ManagementRoger Hart
 
GDG-Start your Programing Career.pptx
GDG-Start your Programing Career.pptxGDG-Start your Programing Career.pptx
GDG-Start your Programing Career.pptxMohamed Essam
 

Similar to Dark Matter, Public Health, and Scientific Computing (20)

Software Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical SciencesSoftware Carpentry for the Geophysical Sciences
Software Carpentry for the Geophysical Sciences
 
Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013Software Carpentry and the Hydrological Sciences @ AGU 2013
Software Carpentry and the Hydrological Sciences @ AGU 2013
 
30% faster coder on-boarding when you have a code cookbook
30% faster coder on-boarding when you have a code cookbook30% faster coder on-boarding when you have a code cookbook
30% faster coder on-boarding when you have a code cookbook
 
top developer mistakes
top developer mistakes top developer mistakes
top developer mistakes
 
Git Makes Me Angry Inside - DrupalCon Prague
Git Makes Me Angry Inside - DrupalCon PragueGit Makes Me Angry Inside - DrupalCon Prague
Git Makes Me Angry Inside - DrupalCon Prague
 
Stepping Outside
Stepping OutsideStepping Outside
Stepping Outside
 
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012
Graham Thomas - 10 Great but Now Overlooked Tools - EuroSTAR 2012
 
Is Python still production ready ? Ludovic Gasc
Is Python still production ready ? Ludovic GascIs Python still production ready ? Ludovic Gasc
Is Python still production ready ? Ludovic Gasc
 
Developer disciplines
Developer disciplinesDeveloper disciplines
Developer disciplines
 
Become a Better Developer with Debugging Techniques for Drupal (and more!)
Become a Better Developer with Debugging Techniques for Drupal (and more!)Become a Better Developer with Debugging Techniques for Drupal (and more!)
Become a Better Developer with Debugging Techniques for Drupal (and more!)
 
Webdev and programming
Webdev and programming  Webdev and programming
Webdev and programming
 
Life in the tech trenches (2015)
Life in the tech trenches (2015)Life in the tech trenches (2015)
Life in the tech trenches (2015)
 
CTO Crunch avec Julien Simon, Viadeo
CTO Crunch avec Julien Simon, ViadeoCTO Crunch avec Julien Simon, Viadeo
CTO Crunch avec Julien Simon, Viadeo
 
Originate - Think In Hours Not Sprints
Originate - Think In Hours Not SprintsOriginate - Think In Hours Not Sprints
Originate - Think In Hours Not Sprints
 
Design Sprints
Design SprintsDesign Sprints
Design Sprints
 
Surviving the technical interview
Surviving the technical interviewSurviving the technical interview
Surviving the technical interview
 
User Experience Basics for Product Management
User Experience Basics for Product ManagementUser Experience Basics for Product Management
User Experience Basics for Product Management
 
Spaghetti gate
Spaghetti gateSpaghetti gate
Spaghetti gate
 
GDG-Start your Programing Career.pptx
GDG-Start your Programing Career.pptxGDG-Start your Programing Career.pptx
GDG-Start your Programing Career.pptx
 
Best pratice
Best praticeBest pratice
Best pratice
 

Recently uploaded

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeThiyagu K
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfAdmir Softic
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfagholdier
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024Janet Corral
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Disha Kariya
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajanpragatimahajan3
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...Sapna Thakur
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfchloefrazer622
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxVishalSingh1417
 

Recently uploaded (20)

Measures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and ModeMeasures of Central Tendency: Mean, Median and Mode
Measures of Central Tendency: Mean, Median and Mode
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
Key note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdfKey note speaker Neum_Admir Softic_ENG.pdf
Key note speaker Neum_Admir Softic_ENG.pdf
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Holdier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdfHoldier Curriculum Vitae (April 2024).pdf
Holdier Curriculum Vitae (April 2024).pdf
 
General AI for Medical Educators April 2024
General AI for Medical Educators April 2024General AI for Medical Educators April 2024
General AI for Medical Educators April 2024
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
social pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajansocial pharmacy d-pharm 1st year by Pragati K. Mahajan
social pharmacy d-pharm 1st year by Pragati K. Mahajan
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
BAG TECHNIQUE Bag technique-a tool making use of public health bag through wh...
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Unit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptxUnit-IV- Pharma. Marketing Channels.pptx
Unit-IV- Pharma. Marketing Channels.pptx
 

Dark Matter, Public Health, and Scientific Computing

  • 1. Dark Matter, Public Health, and Scientific Computing Greg Wilson
  • 2. “Dark Matter Developers” Scott Hanselman (March 2012) [We] hypothesize that there is another kind of developer than the ones we meet all the time. We call them Dark Matter Developers. They don't read a lot of blogs, they never write blogs, they don't go to user groups, they don't tweet or facebook, and you don't often see them at large conferences... [They] aren't chasing the latest beta or pushing...limits, they are just producing. http://www.hanselman.com/blog/DarkMatterDevelopersTheUnseen99.aspx 2
  • 3. I'm Not That Optimistic ● 95% of scientists have their heads down – Doing science instead of talking about using GPU clouds to personalize collaborative management of reproducible peta-scale workflows ● Not because they don't want to do all that Star Trek stuff ● But because e-science is out of reach 3
  • 6. So Here We Are You Them 6
  • 7. Surely You're Exaggerating 1. How many graduate students write shell scripts to analyze each new data set instead of running those analyses by hand? 7
  • 8. Surely You're Exaggerating 2. How many of them use version control to keep track of their work and collaborate with colleagues? 8
  • 9. Surely You're Exaggerating 3. How many routinely break large problems into pieces small enough to be - comprehensible, - testable, and - reusable? 9
  • 10. Surely You're Exaggerating 3. How many routinely break large problems into pieces small enough to be - comprehensible, - testable, and - reusable? And how many know those are the same things? 10
  • 11. “Not My Problem” Actually your biggest problem If someone is chronically malnourished as a child, giving them a CT scanner when they're 20 has no effect on their wellbeing. 11
  • 12. Their Biggest Problem Too Them Us “I don't know much “Let's talk about about this, but I'd brain scans. like some clean water.” They're cool.” 12
  • 13. We've Left the Majority Behind 13
  • 14. We've Left the Majority Behind Other than Googling for things, the majority of scientists do not use computers more effectively today than they did 28 years ago. 14
  • 15. The Goalposts A scientist is computationally competent if she can build, use, validate, and share software to: ● Manage and process data ● Tell if it's been processed correctly ● Find and fix problems when it hasn't been ● Keep track of what she's done ● Share work with others ● Do all of these things efficiently 15
  • 16. A Driver's License Exam ● Developed with the Software Sustainability Institute for the DiRAC Consortium ● Formative assessment – Do you know what you need to know in order to get the most out of this gear? ● Pencils ready? Note: actual exam allows for several different programming languages, version control systems, etc. 16
  • 17. A Driver's License Exam 1. Check out a working copy of the exam 2. materials from Subversion. 3. 4. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 17
  • 18. A Driver's License Exam 1. Use find and grep in a pipe to create a 2. list of all .dat files in the working copy, 3. redirect the output to a file, and add that 4. file to the repository. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 18
  • 19. A Driver's License Exam 1. Write a shell script that runs a legacy 2. program for each parameter in a set. 3. 4. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 19
  • 20. A Driver's License Exam 1. Edit the Makefile provided so that if any 2. .dat file changes, analyze.py is re-run to 3. create the corresponding .out file. 4. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 20
  • 21. A Driver's License Exam 1. Write four unit tests to exercise a function 2. that calculates running sums. Explain why 3. your four tests are most likely to uncover 4. bugs in the function. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 21
  • 22. A Driver's License Exam 1. Explain when and how a function could 2. pass your tests, but still fail on real data. 3. 4. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 22
  • 23. A Driver's License Exam 1. Do a code review of the legacy program 2. from Q3 (about 50 lines long) and list 3. the four most important improvements 4. you would make. 5. 6. 7. Could do it easily 1.0 Could struggle through 0.5 Wouldn't know where to start 0.0 Don't understand the question -1.0 23
  • 24. How Are We Doing? a) How did you do? 24
  • 25. How Are We Doing? a) How did you do? b) How do you think most scientists do? 25
  • 26. How Are We Doing? a) How did you do? b) How do you think most scientists do? c) Do you think a scientist could use a peta-scale GPU provenance cloud without these skills? 26
  • 27. How Are We Doing? a) How did you do? b) How do you think most scientists do? c) Do you think a scientist could use debug a peta-scale GPU provenance cloud without these skills? 27
  • 28. How Are We Doing? a) How did you do? b) How do you think most scientists do? c) Do you think a scientist could use debug a peta-scale GPU provenance extend cloud without these skills? 28
  • 29. So Why Is This Your Problem? If you're only helping the small minority lucky enough to have acquired the skills that mastering your shiny toy depends on... your potential user base is many times smaller than it could be. 29
  • 30. It Is Therefore Obvious That... ● Put more computing courses in the curriculum! – Except the curriculum is already full 30
  • 31. It Is Therefore Obvious That... ● Put a little computing in every course! – Still adds up: 5 minutes/lecture = 4 courses/degree – First thing cut when running late – The blind leading the blind 31
  • 32. It Is Therefore Obvious That... ● Move all our teaching online! – Where do I start? – Signal vs. noise – “It's not a MOOC, it's a MESS” Everyone who understands something correctly understands it the same way. Everyone who misunderstands something misunderstands differently. 32
  • 33. What Has Worked ● Target graduate students 33
  • 34. What Has Worked ● Target graduate students ● Intensive short courses (2 days to 2 weeks) 34
  • 35. What Has Worked ● Target graduate students ● Intensive short courses (2 days to 2 weeks) ● Focus on practical skills 35
  • 36. What Has Worked ● Target graduate students ● Intensive short courses (2 days to 2 weeks) ● Focus on practical skills ● Follow up with 6-8 weeks of once-a-week online tutorials and help – Still tuning this part 36
  • 37. 37
  • 38. What We Teach ● Unix shell ● Python ● Version control ● Testing ● Matrix programming, image processing, SQL, ... 38
  • 39. What We Teach ● Unix shell ● Python ● Version control ● Testing ● Matrix programming, image processing, SQL, ... 39
  • 40. “That's Merely Useful” ● None of this is publishable any longer – Which means it's ineffective career-wise for academic computer scientists 40
  • 41. “That's Merely Useful” ● None of this is publishable any longer – Which means it's ineffective career-wise for academic computer scientists ● But it works – Two independent assessments have validated the impact of what we do for 1/3-2/3 of learners – That's at least a six-fold increase in the number of scientists who can get work done without karōshi 41
  • 42. Majestic Equality Anatole France (1844-1924) “The law, in its majestic equality, forbids the rich and poor alike to sleep under bridges, to beg in the streets, and to steal bread.” 42
  • 43. Majestic Equality Anatole France (1844-1924) “The law, in its majestic equality, forbids the rich and poor alike to sleep under bridges, to beg in the streets, and to steal bread.” Thanks to modern computers, every scientist can now devote her working life to installation and configuration issues. 43
  • 44. If you build a man a fire, you'll keep him warm for a night. If you set a man on fire, you'll keep him warm for the rest of his life. — Terry Pratchett 44
  • 46. Teach a Workshop ● All our materials are CC-BY licensed ● We will help train you 46
  • 47. Shine Some Light ● Include a few lines about your software stack and working practices in your scientific papers ● Ask about them when reviewing – Where's the repository containing the code? – What's the coverage of your unit tests? – How did you track provenance? 47
  • 48. Be the Change You Want to See Would you rather increase the productivity of the top 5% of scientists ten-fold? Or double the productivity of the other 95%? 48
  • 49. Be the Change You Want to See Would you rather increase the productivity of the top 5% of scientists ten-fold? Or double the productivity of the other 95%? 49
  • 50. Be the Change You Want to See Would you rather see your best ideas left on the shelf because they're out of reach? Or raise a generation who can all do the things we think are exciting? 50