Software Carpentry for the Geophysical Sciences

  1. ESIP Winter Meeting, January 2014. Software Carpentry for the Geophysical Sciences. Aron Ahmadia, Software Carpentry and US Army ERDC-CHL
  2. This Talk is a Fork: "Dark Matter, Public Health, and Scientific Computing" by Greg Wilson. http://www.slideshare.net/gvwilson/dark-matter-publichealth-and-scientific-computing
  3. ESIP Winter Meeting, January 2014. Software Carpentry for the Geophysical Sciences. Aron Ahmadia, Software Carpentry and US Army ERDC-CHL
  4. Computation in the Geophysical Sciences. Aron Ahmadia, Coastal and Hydraulics Lab, US Army ERDC
  5. http://farm8.staticflickr.com/7407/11340654005_16dd86b6ab_o.png
  6. This talk is not about GIS. I still brought some maps to this talk for later…
  7. Computational Modeling within ERDC
  8. Computational Modeling within ERDC
  9. Computational Modeling within ERDC
  10. This talk is not about computational modeling
  11. This talk is about geoscientists… constructing software
  12. “Dark Matter Developers”, Scott Hanselman (March 2012): "[We] hypothesize that there is another kind of developer than the ones we meet all the time. We call them Dark Matter Developers. They don't read a lot of blogs, they never write blogs, they don't go to user groups, they don't tweet or facebook, and you don't often see them at large conferences... [They] aren't chasing the latest beta or pushing...limits, they are just producing." http://www.hanselman.com/blog/DarkMatterDevelopersTheUnseen99.aspx
  13. We’re Not That Optimistic • 90% of scientists have their heads down - Producing science and solving problems, not worrying about the latest in semantic interoperability or standardized metadata • Not because they don't want to do all that Star Trek stuff • But because the majority of the field isn’t there yet.
  14. Shiny Toys
  15. Grimy Reality
  16. So Here We Are: Us / Them
  17. Surely You're Exaggerating 1. How many graduate students write shell scripts to analyze each new data set instead of running those analyses by hand?
  18. Surely You're Exaggerating 2. How many of them use version control to keep track of their work and collaborate with colleagues?
  19. Surely You're Exaggerating 3. How many routinely break large problems into pieces small enough to be - comprehensible, - testable, and - reusable?
  20. Surely You're Exaggerating 3. How many routinely break large problems into pieces small enough to be - comprehensible, - testable, and - reusable? And how many know those are the same things?
  21. We've left the majority behind
  22. We've left the majority behind. Other than Googling for things, the majority of scientists do not use computers more effectively today than they did 28 years ago.
  23. “Not My Problem” (actually your biggest problem) • Applications: Yesterday’s models and tools are solving today’s million-dollar problems, because nobody knows how to update them. • Collaboration: Hundreds of person-years of effort are redundantly combining to create a year’s worth of results. • Credibility: Many published results can’t be trusted, because the scientific code isn’t modular, can’t be tested, and sometimes isn’t even released. How can we do better?
  24. Jeff Hammond, Argonne Leadership Computing Facility: "Your code will outlive your supercomputer. Most successful supercomputers last 5 years. The most successful codes may reach 5 decades. There are currently 40-year-old modeling codes still being used."
  25. Where Are Your Goalposts? An Earth Scientist is computationally competent if she can build, use, validate, and share software to: • Manage software and data • Tell if it's working correctly • Find and fix problems when it isn’t • Keep track of what she’s done • Share work with others • Do all of these things efficiently
  26. A Driver's License Exam • Developed with the Software Sustainability Institute for the DiRAC Consortium • Formative assessment - Do you know what you need to know in order to get the most out of this gear? • Pencils ready? Note: the actual exam allows for several different programming languages, version control systems, etc.
  27. A Driver's License Exam, question 1 of 7: Check out a working copy of the exam materials from a git repository. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
  28. A Driver's License Exam, question 2 of 7: Use find and grep in a pipe to create a list of all .dat files in the working copy, redirect the output to a file, and add that file to the repository. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
  29. A Driver's License Exam, question 3 of 7: Write a shell script that runs a legacy program for each parameter in a set. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
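The exam question asks for a shell script; since the exam is noted as allowing several languages, the same idea (run one program once per parameter and collect the output) can be sketched in Python. This is only an illustration, not part of the exam materials: `run_for_each` and `LEGACY_COMMAND` are hypothetical names, and the "legacy program" is a stand-in one-liner that prints the square of its argument.

```python
import subprocess
import sys


def run_for_each(command, parameters):
    """Run `command` once per parameter and return {parameter: stdout text}."""
    outputs = {}
    for parameter in parameters:
        completed = subprocess.run(
            command + [str(parameter)],  # parameter is passed as the last argument
            capture_output=True,
            text=True,
            check=True,  # raise CalledProcessError if any run fails
        )
        outputs[parameter] = completed.stdout.strip()
    return outputs


# Hypothetical stand-in for the legacy program: prints the square of its argument.
LEGACY_COMMAND = [sys.executable, "-c", "import sys; print(int(sys.argv[1]) ** 2)"]

results = run_for_each(LEGACY_COMMAND, [1, 2, 3])
for parameter, output in results.items():
    print(parameter, output)
```

In practice `LEGACY_COMMAND` would be replaced by the path to the real executable; passing `check=True` means a failed run stops the sweep instead of silently producing a gap in the results.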
  30. A Driver's License Exam, question 4 of 7: Edit the Makefile provided so that if any .dat file changes, analyze.py is re-run to create the corresponding .out file. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
  31. A Driver's License Exam, question 5 of 7: Write four unit tests to exercise a function that calculates running sums. Explain why your four tests are most likely to uncover bugs in the function. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
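One possible answer to the running-sums question, sketched in Python under assumptions: the exam does not give the function, so `running_sum` below is a hypothetical implementation written only so the tests have something to exercise. The four tests target the usual trouble spots: an empty input, a single element, sign changes, and accidental modification of the caller's data.

```python
def running_sum(values):
    """Return the cumulative sums of `values`: [a, a+b, a+b+c, ...]."""
    totals = []
    current = 0
    for value in values:
        current += value
        totals.append(current)
    return totals


def test_empty_input():
    # Boundary case: nothing to sum should give an empty list, not an error.
    assert running_sum([]) == []


def test_single_value():
    # Smallest non-empty case: catches off-by-one errors in the loop.
    assert running_sum([7]) == [7]


def test_mixed_signs():
    # Negative values catch code that assumes the totals only ever grow.
    assert running_sum([1, -2, 3]) == [1, -1, 2]


def test_input_not_modified():
    # The caller's list must be left untouched.
    data = [1, 2, 3]
    running_sum(data)
    assert data == [1, 2, 3]
```

The functions follow the naming convention that test runners such as pytest discover automatically; they can also be called directly.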
  32. A Driver's License Exam, question 6 of 7: Explain when and how a function could pass your tests, but still fail on real data. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
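Two common ways this happens can be shown with a short Python sketch. It assumes a hypothetical `running_sum` function (not taken from the exam) whose tests only used small integers: floating-point round-off breaks exact-equality expectations, and a single missing value stored as NaN contaminates every later total.

```python
import math


def running_sum(values):
    """Return the cumulative sums of `values` (hypothetical implementation)."""
    totals = []
    current = 0
    for value in values:
        current += value
        totals.append(current)
    return totals


# A test on small integers passes with exact equality.
assert running_sum([1, 2, 3]) == [1, 3, 6]

# Real measurements are floats: ten additions of 0.1 do not give exactly 1.0.
float_totals = running_sum([0.1] * 10)
print(float_totals[-1] == 1.0)              # False: round-off has accumulated
print(math.isclose(float_totals[-1], 1.0))  # True: compare with a tolerance instead

# Real data has gaps: one NaN makes every subsequent running total NaN.
gappy_totals = running_sum([1.0, float("nan"), 2.0])
print(math.isnan(gappy_totals[-1]))         # True
```

Neither failure is a bug in the loop itself; both come from properties of the data that the integer-only tests never exercised.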
  33. A Driver's License Exam, question 7 of 7: Do a code review of the legacy program from Q3 (about 50 lines long) and list the four most important improvements you would make. Scoring: could do it easily (1), could struggle through (0.5), wouldn’t know where to start (0), don’t understand the question (-1).
  34. How Are We Doing? A. How well did you do?
  35. How Are We Doing? A. How well did you do? B. How well do you think most earth scientists would do?
  36. How Are We Doing? A. How well did you do? B. How well do you think most earth scientists would do? C. Do you think an earth scientist could use your software without having these skills?
  37. How Are We Doing? A. How well did you do? B. How well do you think most earth scientists would do? C. Do you think an earth scientist could use your software without having these skills? D. Or a grasp of the principles behind them?
  38. So Why Is This Your Problem? If you're only helping the (small) minority lucky enough to have acquired the base skills that use of your tools depends on... then your potential user base is many times smaller than it could be.
  39. It Is Therefore Obvious That... • Put more computing courses in the curriculum! - Except the curriculum is already full
  40. It Is Therefore Obvious That... • Put a little computing in every course! - Still adds up: 5 minutes/lecture = 4 courses/degree - First thing cut when running late - The blind leading the blind
  41. What Has Worked: Target graduate students. Intensive short courses. Focus on practical skills.
  42. What Has Worked: http://software-carpentry.org
  43. [image-only slide]
  44. What We Teach (Core Materials) • Unix shell: automate repetitive tasks • Git and GitHub: manage, collaborate, and share work with distributed version control systems • Python or R: grow scripts, programs, and libraries in structured, modular, testable, and reusable ways • Databases: distinguish between structured and unstructured data, and know when to use each
  45. How We Teach • Openly licensed content • Managed as a public project on GitHub • IPython Notebooks • RStudio and RMarkdown • Continuum Anaconda and Enthought Canopy • Practical, exercise-driven lessons and lectures • Teams of volunteer instructors
  46. [image-only slide]
  47. “That's Merely Useful” • None of this is publishable any longer - Which means it's ineffective career-wise for academic computer scientists • But it works - Two independent assessments have validated the impact of what we do
  48. Software Carpentry is International
  49. Get Involved!
  50. Shine Some Light • Include a few lines about your software stack and working practices in your scientific papers • Ask about them when reviewing - Where's the repository containing the code? - What's the coverage of your unit tests? - How did you track provenance?
  51. Attend a Workshop
  52. Host a Workshop
  53. Host a Workshop
  54. Teach a Workshop • All our materials are CC-BY licensed • We will help train you
  55. Earth Science Workshops • We are developing geophysical and environmental sciences course material that extends the Software Carpentry training material • Randal Olson and Aron Ahmadia will be teaching Software Carpentry to Earth Scientists at MIT and ERDC this month • Are you interested in hosting? Teaching? Contributing material? Get in touch!
  56. Thanks! • Web: http://aron.ahmadia.net • Email: aron@ahmadia.net • Twitter: @ahmadia • Software Carpentry: http://software-carpentry.org • US Army Engineer Research and Development Center: http://www.erdc.usace.army.mil/ • ERDC Proteus (high resolution models): https://github.com/erdc-cm/proteus