Path Dependent Development (PyCon AU)

(The image on page 3 is the traditional fast/good/cheap trade-off. Something glitched in the conversion.)

The decisions we make in getting software ready to ship can have lasting consequences for later versions. Early priorities can end up setting the direction for the whole project.

My presentation from PyCon AU 2012, including bonus slides that were cut before the talk due to time limitations.

  1. Path Dependent Development
      Nick Coghlan (@ncoghlan_dev)
      Red Hat Toolsmith
      CPython Core Developer
  2. Usefully Wrong
      “All models are wrong. Some models are useful.”
      “... the practical question is: How wrong do they have to be to not be useful?”
      George E. P. Box (statistician), “Empirical Model-Building”
  3. Choose Any Two?
  4. Path Dependence
      ● “good enough to be useful” -> ship it
      ● The decisions we make leave their mark on the software we ship
      ● These marks remain long after the scope of the software expands to other use cases
  5. What is “Good Enough”?
      ● Depends on your priorities and resources
        – What are you building?
        – Why are you building it?
        – Who are you building it for?
        – Who is building it?
        – What are you building it with?
        – How much risk can you tolerate?
  6. Context Matters
      ● Building an intranet web service
        – Trusted network
        – Enforced user base
      ● Building a web startup
        – Hostile network
        – Business lives or dies by user choice
      ● Building hardware control and management systems
        – Usage driven by hardware
        – Software as a necessary evil
  7. Trade-Offs Needed: Inquire Within
  8. Functionality
      ● Doing one (or a few) things well is often better than doing a lot of things badly
      ● Adding functionality later is usually easier to sell than taking it away (no matter how broken it turns out to be)
  9. Flexibility
      ● Don't make things configurable
      ● Configurability = testing and maintenance pain
      ● Do separate concerns (if you make it configurable later, only one place needs to change)
      ● Do use flexible support tools
        – SQLAlchemy makes it easy to change databases
        – Django locks in some major decisions (like ORM and templating language) but provides a rich ecosystem of prebuilt components that work well together
  10. Security
      ● A lot of software is still insecure by default
        – Unhashed (or poorly hashed) passwords
        – Unencrypted communications channels
      ● Multiple layers of defence can hide this
      ● Try to make the “easy option” and the “secure option” one and the same
      ● Can be very hard to fix poor security choices
  11. Reinventing Wheels
      ● Reuse means dependency management
      ● Often simpler to roll your own to start
      ● With good modularity, easy to replace later
      ● Watch for increasing complexity
  12. Documentation
      ● How sophisticated are users expected to be?
        – Installed by developers? Admins? End users?
        – Intended for domain experts only?
      ● Is it stable enough to document?
      ● Documentation can highlight design flaws
  13. Test Quality
      ● Fine-grained tests pinpoint failures easily
      ● Coarse-grained tests are often easier to write
      ● Can easily start with coarse-grained tests, then add more fine-grained tests to narrow down failures
      ● Slow tests are better than no tests
      ● External dependencies are better than no tests
      ● Regression tests are great, but don't let them block fixes for problems that can't be reproduced reliably
  14. Code Reviews
      ● Code is written to:
        – Tell the computer what to do
        – Tell future maintainers what it does
      ● Tests cover the first, reviews the second
      ● Debatable value for small teams
      ● Highly valuable for large teams
      ● Needs appropriate tools
  15. Many More Possibilities...
      ● Performance & Scalability
      ● Reliability
      ● Usability
      ● Maintainability
      ● Business Risk
      ● Automation
      ● ...
  16. Managing Path Dependence
  17. Exit Strategies
      ● Know what you're not doing
      ● Have a vague idea how to fix it when needed
      ● Actual fixes will depend on future needs
      ● Sometimes, the only right answer is “No”
  18. Patterns and Processes
      ● Keep your options open
      ● Minimise current complexity
      ● This is not easy
        – Software architecture and design patterns
        – Software processes and methodologies
      ● If you don't have a test suite, start there
  19. Prototyping vs Implementation
      ● Two very different modes of development
      ● Prototyping
        – Exploration
        – Trying to figure out what is feasible
      ● Implementation
        – Already known to be feasible
        – Making it happen to a known specification
      ● Big difference in priorities!
  20. Social Implications
      ● Design decisions are context dependent
      ● Easy to criticise in hindsight
      ● Design trade-offs can influence community
      ● Actually getting better at building software
      ● Ambitions are (more than?) keeping pace
  21. Path Dependence in Action
  22. An Innocent Start
      ● PulpDist: Mirroring network based on rsync
      ● Simple job definitions
          {
            "remote_server": "localhost",
            "remote_path": "/demo/simple/",
            "local_path": "/var/www/pub/sync_demo_raw/",
            ...
          }
      ● Simple custom validator for JSON data
        – Checks on individual values
        – Overall sanity checks on full jobs
  23. Don't Repeat Yourself
      ● Simple format turned out to be too simple
        – Hard to modify given multiple jobs from same source
      ● Enhanced format with reusable elements
          {
            "mirror_id": "local_copy",
            "tree_id": "simple_sync",
            "site_id": "bne",
            ...
          }
      ● Simple validator was no longer adequate
  24. What To Do?
      ● Upgrade the existing validator
        – Possible, but tedious to test properly
        – Not a good wheel to reinvent
      ● JSON validation library
        – Research would be starting from scratch
        – Hard to assess quality quickly
      ● Relational database
        – Enforces the constraints by its very nature
        – Error quality would likely be poor
  25. Two Birds...
      ● For validation, I needed to:
        – Ensure identifiers were unique
        – Ensure cross references were valid
      ● For UI purposes I also needed:
        – To filter by component identifiers
        – To sort by various fields
      ● Sound familiar?
  26. ...One Stone
      ● An in-memory SQLite database was perfect
      ● But writing SQL by hand is still horrible
      ● SQLAlchemy in target environment
      ● Problem solved!
        – Config loaded into DB after simple field validation
        – If the DB accepts it, references are also valid
  27. How Does The Story End?
      ● Still some very rough edges
        – SQLite error messages are quite user hostile
        – Schema changes are triple-keyed
      ● Future changes?
        – Master in database, JSON only as export?
        – Improved error messages?
        – Switch to an actual JSON schema engine?
      ● Well, that depends :)
  28. Q&A
      Pulp: http://pulpproject.org/
      PulpDist: https://fedorahosted.org/pulpdist/
      CPython Sprints Monday & Tuesday
  29. Additional Trade-Offs
  30. Performance & Scalability
      ● Don't stress about it if you don't need to
      ● Start with measurement infrastructure
      ● If simple is fast enough, stick with simple
  31. Reliability
      ● Not all software is mission critical
      ● Pay attention to failure modes
      ● Error quality matters
  32. Usability
      ● Humans are still a lot smarter than computers
      ● If users have no choice, they'll usually cope
      ● Hence, awful UX in most “enterprise” software
  33. Maintainability & Business Risks
      ● The Bus Factor
        – Most startups = 1
        – Large companies want it to be higher
      ● Developer docs (including comments)
      ● Legal risks (copyrights, patents, trademarks)
  34. Automation
      ● Critical to speeding up release cycles
      ● Is a process stable enough to automate?
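
The validation approach on slides 25 and 26 (simple field checks first, then load the config into an in-memory SQLite database and let its constraints catch duplicate identifiers and dangling cross references) can be sketched roughly as follows. This is a minimal illustration, not PulpDist's actual code: it uses Python's standard `sqlite3` module rather than SQLAlchemy so that it runs without extra dependencies, and the `sites`/`trees`/`mirrors` section names, the table layout, and the `check_fields` and `validate_config` helpers are all assumptions made for the example. Only the `site_id`, `tree_id` and `mirror_id` field names come from the slides.

```python
import sqlite3

# Hypothetical schema: one table per kind of reusable element. Primary keys
# enforce unique identifiers; REFERENCES clauses enforce valid cross references.
SCHEMA = """
CREATE TABLE sites (site_id TEXT PRIMARY KEY);
CREATE TABLE trees (tree_id TEXT PRIMARY KEY);
CREATE TABLE mirrors (
    mirror_id TEXT PRIMARY KEY,
    tree_id   TEXT NOT NULL REFERENCES trees (tree_id),
    site_id   TEXT NOT NULL REFERENCES sites (site_id)
);
"""

# Required fields per section, listed in table column order.
# The section names are assumptions, not the real PulpDist format.
REQUIRED = {
    "sites": ("site_id",),
    "trees": ("tree_id",),
    "mirrors": ("mirror_id", "tree_id", "site_id"),
}


def check_fields(config):
    """Simple per-field validation done before touching the database."""
    for section, fields in REQUIRED.items():
        for entry in config.get(section, []):
            for field in fields:
                if not isinstance(entry.get(field), str):
                    return f"{section}: {field!r} must be a string"
    return None


def validate_config(config):
    """Return None if the config is consistent, otherwise an error string."""
    problem = check_fields(config)
    if problem is not None:
        return problem
    conn = sqlite3.connect(":memory:")
    try:
        # SQLite only enforces foreign keys when this pragma is enabled.
        conn.execute("PRAGMA foreign_keys = ON")
        conn.executescript(SCHEMA)
        with conn:  # commit on success, roll back on error
            # Referenced tables come first so mirrors can point at them.
            for section, fields in REQUIRED.items():
                placeholders = ", ".join(":" + f for f in fields)
                conn.executemany(
                    f"INSERT INTO {section} VALUES ({placeholders})",
                    config.get(section, []),
                )
    except sqlite3.IntegrityError as exc:
        return str(exc)
    finally:
        conn.close()
    return None


good = {
    "sites": [{"site_id": "bne"}],
    "trees": [{"tree_id": "simple_sync"}],
    "mirrors": [
        {"mirror_id": "local_copy", "tree_id": "simple_sync", "site_id": "bne"},
    ],
}
# Same config, but the mirror points at a tree that was never defined.
broken = dict(good, mirrors=[
    {"mirror_id": "local_copy", "tree_id": "no_such_tree", "site_id": "bne"},
])

print(validate_config(good))
print(validate_config(broken))
```

In this sketch the error string for the broken config is whatever SQLite reports for the failed constraint, which gives a feel for the "user hostile" error messages mentioned on slide 27: the database knows a constraint failed, but says little about which config entry caused it.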
