The road from good software engineering to good science...is a two way street

537 views
463 views

Published on

Invited talk at SETQA-2009 workshop

http://compbio.uchsc.edu/SETQA-NLP2009/index.shtml

Published in: Technology
0 Comments
0 Likes
Statistics
Notes
  • Be the first to comment

  • Be the first to like this

No Downloads
Views
Total views
537
On SlideShare
0
From Embeds
0
Number of Embeds
4
Actions
Shares
0
Downloads
3
Comments
0
Likes
0
Embeds 0
No embeds

No notes for slide

The road from good software engineering to good science...is a two way street

  1. 1. The road from good software engineering
  2. 2. to good science
  3. 3. ...is a two way street...
  4. 4. Good
  5. 6. Quality
  6. 7. Science
  7. 8. Develop theories or models that let us make predictions about the world
  8. 9. Our world is language...
  9. 10. Do our work so that others can reproduce our results
  10. 11. Science
  11. 12. Reproduce Results
  12. 13. Good Software Engineering
  13. 14. ...are those methods that result in software that anyone can use, anytime, anywhere...
  14. 15. ...to reproduce our results...
  15. 16. Experimental Results in NLP/CL Papers
  16. 17. Results are output from test cases
  17. 18. Empirical / experimental results that you publish are the test cases for your ideas
  18. 19. ...and your software...
  19. 20. Can't discount role of softwawe
  20. 21. ...although many try...
  21. 22. “It's really the ideas that count...”
  22. 23. “Well, the algorithm is described in the paper...”
  23. 24. “It's really just a prototype...”
  24. 25. “Well, I got a new computer and I don't think the software made it to the new one...”
  25. 26. “Ummm....my student left and I don't quite know how he did all this...”
  26. 27. I did this experiment on X
  27. 28. Here are the results...
  28. 29. Accept them
  29. 30. No, the software isn't available
  30. 31. Neither is the data
  31. 32. I simply assume you have 8 months available to reinvent my method
  32. 33. And that you can do that from an incomplete description
  33. 34. Cheers!
  34. 35. That's many things
  35. 36. It's not science
  36. 37. Empiricism is Not a Matter of Faith Computational Linguistics September 2008
  37. 38. Software and NLP
  38. 41. Good Software
  39. 42. It should work
  40. 43. Anytime
  41. 44. Anywhere
  42. 45. For Anyone
  43. 46. ...and it should certainly work for you 6 months in the future
  44. 47. ...or 5 years from now...
  45. 48. ...and it should work for someone else now, and 5 years from now..
  46. 49. ...even after you've moved on and aren't answering email, even after the project is over
  47. 50. If your software can do that, it's pretty well engineered
  48. 51. Will your software work in 40 years?
  49. 52. You should hope so
  50. 53. Make choices that make that at least possible
  51. 54. Think of your software as a time capsule
  52. 55. Think of it as your chance for immortality
  53. 57. How many hours have you spent away from loved ones, friends, adventure, nature, romance, and life...to create software?
  54. 58. At least make it last...
  55. 59. Let someone 100 years from now unpack your code and data, and be able to read it, understand it, run it, and modify it
  56. 60. Maybe it's your grandchildren, wondering why they never saw you?
  57. 61. Let yourself be able to do the same thing in 10 years
  58. 62. Will the Linux Kernel be available and running in X years?
  59. 63. There's a good chance
  60. 64. Company won't go out of business
  61. 65. ANSI C will be around a long time
  62. 66. Virtualization will keep architectures alive even when hardware is gone
  63. 67. Make choices that give your code (and your legacy) a chance too
  64. 68. Don't rely on the newest priceiest weirdest goofball proprietary bleeding edge hardware and software
  65. 71. Anyone
  66. 72. ...with $200?
  67. 73. ...with $20,000
  68. 74. ...with a PhD in Computer Science?
  69. 75. ...and a staff of 10?
  70. 76. ...with 4 weeks available to debug?
  71. 77. ...and another 6 months to reimplement?
  72. 78. Good
  73. 79. Software won't stop children from dying of war and disease...
  74. 80. So be a little humble
  75. 81. Appreciate your good fortune
  76. 82. And push yourself a little harder
  77. 83. Don't act all whiny and put-upon when people ask you for code or data
  78. 84. Especially since a lot of that work is funded by people working very hard making less in a month than we make a in a week
  79. 85. ...or that we spend going to NAACL
  80. 86. Appreciate our good fortune
  81. 87. Good Science
  82. 88. Produces theories that make reliable predictions about the world
  83. 89. Reproducible Experimental Results
  84. 90. Anytime
  85. 91. Anywhere
  86. 92. For Anyone
  87. 93. Gravity
  88. 94. A Good Theory
  89. 95. Works now
  90. 96. Will work in 10 years
  91. 97. Works here
  92. 98. Works on the moon
  93. 99. Works for me
  94. 100. Works for you
  95. 101. Falling Objects
  96. 102. Gravity is a force, not an artifact
  97. 103. Telescope
  98. 104. Works anytime, anywhere, for anyone
  99. 105. The old ones still work
  100. 107. We share the big ones...
  101. 109. If we have access to the same resources, we can reproduce each other's results
  102. 110. If we have access to the same resources, we can reproduce each other's results
  103. 111. We need to work a lot harder (and engineer systems a lot better) to make that happen
  104. 112. Not convinced?
  105. 113. Conduct the following experiment
  106. 114. Randomly select 1 of your papers
  107. 115. Reproduce your results
  108. 116. If you can't...
  109. 117. Do you think anyone else can?
  110. 118. What if nobody could have reproduced Galileo's falling objects experimental results? Would we simply believe him?
  111. 119. They barely believed him at the time
  112. 120. If you can, try it for someone else's papers
  113. 121. Lessons from Science
  114. 122. We don't get it right the first time
  115. 123. If I have seen further it is only by standing on the shoulders of giants
  116. 125. (who were mostly wrong)
  117. 126. Aristotle (384 – 322 BC)
  118. 127. There are 4 elements
  119. 128. The heavens are different
  120. 129. Different rules apply
  121. 130. Before the telescope, the heavens really were different
  122. 131. Other planets were balls of fire, like the stars, like the sun
  123. 133. Ptolemy (90 – 168)
  124. 136. Crazy?
  125. 137. Very reliably predicts the movement of heavenly bodies
  126. 138. Instrumentalist
  127. 139. A theory that reliably explains and predicts the existing data
  128. 140. Realistic
  129. 141. A theory that describes things as they “really” are
  130. 144. A great instrumentalist A great data preserver and collector
  131. 145. Copernicus (1473 - 1543)
  132. 147. Wasn't much of an observer
  133. 149. Found Ptolmey's model overly complicated
  134. 150. Wanted a simpler explanation
  135. 151. Wanted a simpler explanation
  136. 152. Came up with another model that was consistent with Ptolmey's data
  137. 154. Great!
  138. 155. (Well, better)
  139. 156. Uniform Motion
  140. 157. Perfect circles
  141. 160. Brahe (1546 - 1601)
  142. 162. A great observational astronomer, the last naked eye astronomer
  143. 164. Galileo (1564 - 1642)
  144. 167. 1609 Telescope
  145. 168. 1610 Observed 4 moons of Jupiter
  146. 169. Back to Tycho
  147. 170. Made remarkably accurate observations for 20 years
  148. 171. Knew about Copernicus
  149. 172. Arrived at his own theory
  150. 174. A hybrid model
  151. 175. Fits and predicts the observed data
  152. 176. Data Sharing
  153. 178. Kepler (1571 - 1630)
  154. 180. Why are there 6 planets?
  155. 181. Why are they so positioned?
  156. 182. Geometry and Perfect Solids
  157. 184. In 1601 Tycho bequeathed his data...
  158. 185. Kepler's Laws of Planetary Motion
  159. 186. Varying velocity
  160. 187. Elliptical Orbits
  161. 188. ...around the Sun
  162. 189. It was left to Newton to work out what held the planets in place and made them move...
  163. 190. History of Science?
  164. 191. We are wrong many many times before we are right
  165. 192. Progress happens when people leave their data and instruments behind
  166. 193. Ptolemy (90 - 168) Copernicus (1473 - 1543) Tycho (1546 – 1601) Galileo (1564 - 1642) Kepler (1571 - 1630) Newton (1642 - 1727)
  167. 194. Good science and good software assume you don't get it right at first
  168. 195. Leave something behind for your successors to build on
  169. 196. Here lies our fried Ted We're sad that's he dead http://www.d.umn.edu/~tpederse

×