Python For Large Company?

    Notes on slide 1

    C / C++ / Java: a lot of projects … definitely the reference languages. C# is rising but still has few projects. Ruby is not well established … yet? Only for the web? Python already has a lot of projects and has the lowest LOC per project.

    Python For Large Company? - Presentation Transcript

    1. Python In Large Companies?
      Sébastien Tandel
      sebastien.tandel@corp.terra.com.br
      sebastien.tandel@gmail.com
    2. Plan
      About Terra
      The 7 steps
      Prototype
      Define the Goals
      Integration
      Some Libs
      Prove It Works
      Evangelize
      Next Steps
      Conclusions
    3. About Terra : Web Portal
      Largest Latin American web portal
      Located in 18 countries
      1000s of servers
      Brazil :
      ~7M unique visitors / day
      ~70M pageviews / day
    4. About Terra
      Source: Nielsen NetView (June 2009)
    5. About Terra : Email Platform
      I’m part of the email team.
      Some stats :
      +10M mailboxes
      +30M inbound emails per day
      +30M outbound emails per day
      avg : 300 mail/s, peak : 600 mail/s
      Systems
      Main systems : SMTP, LMTP, POP, IMAP, Webmail
      Total of +30 systems to design/develop/maintain
      Main languages used C / C++
    6. About Terra
      • Several “official” languages at Terra :
      PHP, C, C++, Java, C#, Erlang
      • Average # is one per team!
      • No official “scripting” language (Python, Perl or other)
      Why? From what I hear
      Performance
      Integration with other systems (& legacy)
      Costs / benefits?
      Buzzword fear
      Labor market
    7. Flash Python Overview
      Python is …
      Interpreted
      Dynamically Typed
      Really Concise
      Multi-paradigm : procedural, OO, functional
      Exceptions : helpful for robustness, debug (no strace ;))
      Garbage Collector : don’t worry about allocation/free
    8. Step 1 : Prototype
    9. Step 1 : Prototype
      Buggy system re-written as prototype in Python
      Surprise! Worked a lot better than its C cousin
      Prototype is now in production!
      I spread the word about this rewrite to the people around me
      Some technical people liked the idea
      One person was not so enthusiastic … my manager
      Cons:
      no integration with homemade systems
      Just one example
    10. Step 1 : Prototype
      Introducing new ideas is a long and tough road
    11. Step 2 : Define the Goals
      Performance critical systems :
      postfix, lmtp, imap / pop
    12. Step 2 : Define the Goals
      • Performance critical systems
      Web-based systems
      Webmail, ECP
    13. Step 2 : Define the Goals
      • Performance critical systems
      • Web-based systems
      Backend systems :
      spamreporter, cleaner, clx trainer, base trainer, mfsbuilder, migrador, nnentrega, smigol, …
    14. Step 2 : Define the Goals
      • Performance critical systems
      • Web-based systems
      • Backend systems
      Almost nonexistent systems (though interesting ones) :
      Mailboxes stats, logs analysis (stats and user behavior characterization)
    15. Step 2 : Define the Goals
      • Performance critical systems
      • Web-based systems
      • Backend systems
      • Stats / User behavior characterization, …
      System / Integration tests scripts
    16. Step 2 : Define the Goals
      • Performance critical systems
      • Web-based systems
      • Backend systems
      • Stats / User behavior characterization, …
      • System / Integration tests scripts
      The Grail :
      • Python can be used for ALL except Performance Critical Systems
    17. Step 3 : Integration
    18. Step 3 : Integration
      Python could be used with every system
      but how can I interface with the homemade (legacy) systems?
    19. Step 3 : Integration
      Various ways to create Python Bindings :
      Python C API: the “hard” way
    20. Step 3 : Integration
      Various ways to create Python Bindings :
      Python C API: the “hard” way
      swig : the lazy way
      won’t create a Pythonic API for you
    21. Step 3 : Integration
      Various ways to create Python Bindings :
      Python C API : the “hard” way
      swig : the lazy way
      ctypes: the stupidly easy way
      from ctypes import cdll
      l = cdll.LoadLibrary("libc.so.6")
      l.mkdir(b"python-mkdir-test", 0o755)
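A caveat worth illustrating for the ctypes route: unless told otherwise, ctypes assumes a C function returns an int and guesses how to convert the arguments. Declaring `argtypes` and `restype` makes a call safer. The following is a minimal sketch, assuming a POSIX system where the C library can be located; `strlen` is used only as a harmless example function and is not taken from the slides.

```python
import ctypes
import ctypes.util

# Locate the C library. If find_library returns None, CDLL(None) falls back
# to the symbols already loaded in the current process (POSIX assumption).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

# Byte strings map to char*; a Python int is returned.
length = libc.strlen(b"python-mkdir-test")
print(length)
```

With the signature declared, passing a wrong argument type raises `ctypes.ArgumentError` in Python instead of corrupting memory in C.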
    22. Step 3 : Integration
      Various ways to create Python Bindings :
      Python C API : the “hard” way
      swig : the lazy way
      ctypes : the stupidly easy way
      Cython : write python, compile with gcc
    23. Step 3 : Integration
      Wrote bindings to interface with all major internal systems (thanks to ctypes)
      With pythonic API! 
    24. Step 3 : Integration
      from trrauth import TrrAuth
      auth = TrrAuth("IMAP")
      auth.open_userpass("standel", "1q2w3e", "terra")
      auth.attributes = ["short_name", "id_perm", "antispam"]
    25. Step 3 : Integration
      from trrauth import TrrAuth
      auth = TrrAuth("IMAP")
      auth.open_userpass("standel", "1q2w3e", "terra")
      auth.attributes = ["short_name", "id_perm", "antispam"]
      print("%s : %s" % (auth.short_name, auth.id_perm))
    26. Step 3 : Integration
      from trrauth import TrrAuth
      auth = TrrAuth("IMAP")
      auth.open_userpass("standel", "1q2w3e", "terra")
      auth.attributes = ["short_name", "id_perm", "antispam"]
      for attr, value in auth:
          print("%s : %s" % (attr, value))
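How a binding can offer this kind of attribute access and iteration is not shown in the slides. The sketch below is one possible approach and not the actual `trrauth` code: the class name `AuthRecord` is hypothetical, and a plain dict with made-up values stands in for the lookup that a ctypes-backed library would perform.

```python
class AuthRecord(object):
    """Hypothetical wrapper; the dict stands in for a legacy C lookup."""

    def __init__(self, backend):
        self._backend = backend   # stand-in for the ctypes-backed library
        self.attributes = []      # names the caller wants to read

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails, so _backend and
        # attributes themselves never pass through here once they are set.
        if name not in self.__dict__.get("attributes", ()):
            raise AttributeError(name)
        return self._backend[name]

    def __iter__(self):
        # Yield (attribute, value) pairs in the order they were requested.
        for name in self.attributes:
            yield name, self._backend[name]


# Made-up data, for illustration only.
auth = AuthRecord({"short_name": "standel", "id_perm": 42, "antispam": "on"})
auth.attributes = ["short_name", "id_perm"]
pairs = list(auth)
print(auth.short_name, pairs)
```

Attributes that were not requested stay hidden, which keeps the Python side from reading more of the legacy record than it asked for.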
    27. Step 4 : Some Libs
    28. Step 4 : Some Libs : Master / Slave
      Master responsible for :
      Forking the slaves
      Reading a “list” of tasks
      Distribution of the tasks to the slaves
      Slave responsible for :
      Execution of the task
      Return execution status to the master
      Key characteristics :
      Slave death detection
      Handles unhandled exceptions (+ hook)
      Master <-> slave protocol allows temporary error codes
      Timeout of the tasks
    29. Step 4 : Some Libs : Master / Slave
      One neat characteristic :
      A bug can reach production with minimal impact
      If unhandled exception occurs
      Only one slave dies
      It is detected and master will fork a new one (if needed)
      The lib handles the exception :
      Default behavior : prints to console
      User defined (callback) : e.g. write the stack trace to a file!
      Cherry on the cake : getting specific production data about faulty task
    30. from robustpools.process_pool import master_task_list
      from robustpools.process_pool import slave_task_list
      m_config = { 'INFINITE_LOOP' : 0 }
      # Task object : exposes the task number as a read-only id
      class list_task(object):
          def __init__(self, list, num, timeout_validity=600):
              self.__num = num
          def _id(self):
              return self.__num
          id = property(_id)
      # Slave : executes a task and returns its status to the master
      class list_slave(slave_task_list):
          def __init__(self):
              super(list_slave, self).__init__(list_task)
          def run(self, task):
              print(task.id)
              return 0, "ok"
      # Master : starts 5 slaves and hands them the 10 tasks
      tasks = range(10)
      m = master_task_list(tasks, num_slave=5, slave_class=list_slave, config=m_config)
      m.start()
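The internals of the robustpools library are not shown in the slides. As an illustration of the death detection idea only, here is a minimal sketch built on the standard `multiprocessing` module. It assumes a POSIX system (fork), all function names are hypothetical, and task timeouts and exception hooks are left out. A slave that hits an unhandled exception exits, the master sees end-of-file on that slave's pipe, records the failed task and forks a replacement.

```python
import multiprocessing as mp
from multiprocessing.connection import wait

ctx = mp.get_context("fork")  # assumption: POSIX, where fork is available


def slave_loop(conn, run):
    """Slave: run each task received and report its status to the master."""
    while True:
        task = conn.recv()
        if task is None:  # shutdown request from the master
            return
        conn.send((task, run(task)))


def spawn_slave(run):
    master_end, slave_end = ctx.Pipe()
    proc = ctx.Process(target=slave_loop, args=(slave_end, run), daemon=True)
    proc.start()
    slave_end.close()  # the master keeps only its own end of the pipe
    return master_end, proc


def run_master(tasks, run, num_slaves=2):
    pending = list(tasks)
    results = {}
    slaves = {}  # master end of the pipe -> [process, task in progress]
    for _ in range(num_slaves):
        conn, proc = spawn_slave(run)
        slaves[conn] = [proc, None]

    while pending or any(task is not None for _, task in slaves.values()):
        # Give one task to every idle slave.
        for conn, state in slaves.items():
            if state[1] is None and pending:
                state[1] = pending.pop(0)
                conn.send(state[1])
        # Block until a slave answers or dies (a dead slave reads as EOF).
        for conn in wait(list(slaves)):
            proc, task = slaves.pop(conn)
            try:
                done, status = conn.recv()
            except (EOFError, OSError):
                results[task] = "slave died"
                proc.join()
                conn.close()
                conn, proc = spawn_slave(run)  # replace the dead slave
            else:
                results[done] = status
            slaves[conn] = [proc, None]

    for conn, (proc, _) in slaves.items():
        conn.send(None)
        proc.join()
    return results


def demo_task(n):
    if n == 3:
        raise RuntimeError("bug that only shows up in production")
    return "ok"


results = run_master(range(6), demo_task, num_slaves=3)
print(sorted(results.items()))
```

The traceback of the failing task is printed to the console by the dying process, and the remaining tasks still complete on the other slaves.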
    31. Step 4 : Some Libs : TCP Sockets Pool
      Manages connections to a pool of servers
      Sends to each server in round-robin or priority order
      Detects connection errors
      Retries the connection
      Number of retries is limited => afterwards the server is marked as dead
      Retries again later with exponential backoff
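The behaviour listed above can be sketched as follows. This is not the Terra library: the class name `ServerPool`, its parameters and the injected `connect` and `clock` callables are all assumptions made so the example runs without a network. Only the round-robin case is covered, not priorities.

```python
import itertools
import time


class ServerPool(object):
    """Hypothetical round-robin pool with retry limit and exponential backoff."""

    def __init__(self, servers, connect, max_retries=3, base_delay=1.0,
                 clock=time.monotonic):
        self.servers = list(servers)
        self.connect = connect          # callable: server -> connection
        self.max_retries = max_retries
        self.base_delay = base_delay
        self.clock = clock
        self.failures = {s: 0 for s in self.servers}
        self.retry_at = {s: 0.0 for s in self.servers}
        self._cycle = itertools.cycle(self.servers)

    def get(self):
        """Return (server, connection) for the next usable server."""
        for _ in range(len(self.servers)):
            server = next(self._cycle)
            if self.clock() < self.retry_at[server]:
                continue                # marked dead, still backing off
            try:
                conn = self.connect(server)
            except OSError:
                self._record_failure(server)
                continue
            self.failures[server] = 0   # a success resets the counter
            return server, conn
        raise ConnectionError("no server available")

    def _record_failure(self, server):
        self.failures[server] += 1
        extra = self.failures[server] - self.max_retries
        if extra >= 0:
            # Retry limit reached: mark as dead, doubling the wait each time.
            self.retry_at[server] = self.clock() + self.base_delay * 2 ** extra


# Demo with a fake connect function and a fake clock.
def fake_connect(server):
    if server == "b":
        raise OSError("connection refused")
    return "conn-" + server


now = [0.0]
pool = ServerPool(["a", "b"], fake_connect, max_retries=2, clock=lambda: now[0])
picked = [pool.get()[0] for _ in range(4)]
print(picked, pool.failures, pool.retry_at)
```

Injecting the clock keeps the backoff logic testable without sleeping; a real pool would pass a function that opens a socket as `connect`.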
    32. Step 5 : Prove It Works
    33. Step 5 : Prove It Works
      Prove = collect data … How?
      Write integrated systems using bindings and libs of previous steps.
      Show it works 
      Performance
      Productivity
    34. Step 5 : Prove It Works
      Performance, one obvious thought : C/C++
      PINCS
      Performance is not C, Stupid!
    35. Step 5 : Prove It Works : Performance
      Some of the rewrites run faster than their C/C++ cousins
      Why?
      OS / Systems limits
      Libs (legacy)
      Algorithms
      Software Architecture
      Infrastructure
    36. Step 5 : Prove It Works : Productivity
      BTW, is pure performance so important?
      Time to Market is much more important
      Adopt Lean Thinking and eliminate every possible waste
      • Writing too much code is a big waste in several ways
      Lose time when writing
      Increases # of bugs
      More time to maintain
      More time to learn the code base (think of new employees)
      • Impacts overall productivity
    37. Step 5 : Prove It Works : Productivity
      http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprt2_advances2003.pdf
    38. Step 5 : Prove It Works : Productivity
      http://page.mi.fu-berlin.de/prechelt/Biblio/jccpprt2_advances2003.pdf
    39. Step 5 : Prove It Works : Productivity
      http://www.ohloh.net
    40. Step 5 : Prove It Works : Productivity
      http://www.ohloh.net
    41. Step 5 : Prove It Works : Productivity
      http://www.ohloh.net
    42. Step 5 : Prove It Works : Productivity
    43. Step 5 : Prove It Works : Productivity
      Some existing C/C++ systems re-written in Python
      Original C/C++ versions total ~20,000 LOC
      In Python, 4-6x less code!
      The previous numbers do not seem to lie
    44. Step 5 : Prove It Works : Productivity
      Oh, parsing an email?
      Any idea how to do it in C/C++?
    45. Step 5 : Prove It Works : Productivity
      • parsing an email
      from email import message_from_file
      with open(filename, "r") as fh:
          mail = message_from_file(fh)
    46. Step 5 : Prove It Works : Productivity
      • parsing an email
      content types of parts?
      Any idea how to do it in C/C++?
      from email import message_from_file
      def get_mail(filename):
          with open(filename, "r") as fh:
              return message_from_file(fh)
      mail = get_mail(filename)
    47. Step 5 : Prove It Works : Productivity
      • parsing an email
      • content types of parts
      from email import message_from_file
      def get_mail(filename):
          with open(filename, "r") as fh:
              return message_from_file(fh)
      mail = get_mail(filename)
      for part in mail.walk():
          print(part.get_content_type())
    48. Step 5 : Prove It Works : Productivity
      • parsing an email
      • content types of parts
      • getting headers?
      • Any idea how to do it in C/C++?
      from email import message_from_file
      def get_mail(filename):
          with open(filename, "r") as fh:
              return message_from_file(fh)
      mail = get_mail(filename)
      for part in mail.walk():
          print(part.get_content_type())
    49. Step 5 : Prove It Works : Productivity
      • parsing an email
      • content types of parts
      • getting headers
      • Python libs are just that simple!
      … and there are a lot!
      from email import message_from_file
      def get_mail(filename):
          with open(filename, "r") as fh:
              return message_from_file(fh)
      mail = get_mail(filename)
      for part in mail.walk():
          print(part.get_content_type())
      print(mail["From"])
      print(mail["Subject"])
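To try the same calls without a mail file on disk, a message can be built in memory and parsed back from its text form. This is a small self-contained sketch using only the standard `email` package; the addresses and bodies are made up for the example.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Build a two-part message in memory (made-up content).
msg = MIMEMultipart("alternative")
msg["From"] = "alice@example.com"
msg["Subject"] = "Hello"
msg.attach(MIMEText("plain body", "plain"))
msg.attach(MIMEText("<p>html body</p>", "html"))
raw = msg.as_string()

# Parse it back, exactly as one would parse a message read from a file.
mail = email.message_from_string(raw)

# walk() visits the container first, then each part in order.
content_types = [part.get_content_type() for part in mail.walk()]
print(content_types)
print(mail["From"])
print(mail["Subject"])
```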
    50. Step 5 : Prove It Works : Performance (Again?)
      For equivalent architecture
      (libs, algorithm, infrastructure)
      C is a better performer than Python!
      Python Is Not C, Stupid!
    51. Step 5 : Prove It Works : Performance (Again?)
      Bottleneck discovered!
      PINCS! : think about architecture first!
    52. Step 5 : Prove It Works : Performance (Again?)
      Bottleneck discovered!
      • PINCS! : think about architecture first!
      Ctypes / Swig : python bindings
      Write your bottleneck in C / C++, use it in your python app
    53. Step 5 : Prove It Works : Performance (Again?)
      Bottleneck discovered!
      • PINCS! : think about architecture first!
      • Ctypes : absurdly easy python bindings
      Cython : write python, obtain a gcc compiled lib
    54. Step 5 : Prove It Works : Performance (Again?)
      Bottleneck discovered!
      • PINCS! : think about architecture first!
      • Ctypes : absurdly easy python bindings
      • Cython : write python, obtain a gcc compiled lib
      Psyco : JIT for python
      Just an additional module import in your code
      2-100x faster than normal Python
      Requires a bit more memory
    55. Step 5 : Prove It Works : Performance (Again?)
      Bottleneck discovered!
      • PINCS! : think about architecture first!
      • Ctypes : absurdly easy python bindings
      • Cython : write python, obtain a gcc compiled lib
      • Psyco : JIT for python
      • Unladen Swallow : Google Project
      • Goal : produce a version of Python at least 5x faster
      • Every patch goes to Python (no fork!)
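Before reaching for any of these tools, the bottleneck has to be located. The slides do not say how that was done at Terra; one common way is the standard `cProfile` module, sketched below with made-up functions (`slow_part`, `fast_part`, `handle_request`) standing in for real application code.

```python
import cProfile
import io
import pstats


def slow_part(n):
    # Deliberately heavy: this is the bottleneck we expect to find.
    return sum(i * i for i in range(n))


def fast_part(n):
    return n * 2


def handle_request():
    return slow_part(200000) + fast_part(10)


profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Collect the five most expensive entries by cumulative time into a string.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
```

Only the functions that dominate such a report are candidates for a rewrite in C, Cython or similar; the rest can stay in plain Python.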
    56. Step 6 : Evangelize
    57. Step 6 : Evangelize
      Once you have stopped and looked at what has been accomplished …
      Show it, Evangelize!
    58. Step 6 : Evangelize
      Because introducing a “new technology” is not just about teaching something to users.
      You’ve got to play the role of evangelist!
      Innovators (3.5%)
      New stuff? They’re in!
    59. Step 6 : Evangelize
      Because introducing a “new technology” is not just about teaching something to users.
      You’ve got to play the role of evangelist!
      • Innovators (3.5%)
      Early-adopters (12.5%)
      Open to new ideas but check first
    60. Step 6 : Evangelize
      Because introducing a “new technology” is not just about teaching something to users.
      You’ve got to play the role of evangelist!
      • Innovators (3.5%)
      • Early-adopters (12.5%)
      Early majority (35%)
      First, they must see the idea working
    61. Step 6 : Evangelize
      Because introducing a “new technology” is not just about teaching something to users.
      You’ve got to play the role of evangelist!
      • Innovators (3.5%)
      • Early-adopters (12.5%)
      • Early majority (35%)
      Late majority (35%)
      Accept after a lot of pressure, or when it is imposed
    62. Step 6 : Evangelize
      Because introducing a “new technology” is not just about teaching something to users.
      You’ve got to play the role of evangelist!
      • Innovators (3.5%)
      • Early-adopters (12.5%)
      • Early majority (35%)
      • Late majority (35%)
      Laggards (14%)
      Never accept (why would I want to change?)
    63. Step 6 : Evangelize
      During work, I constantly spoke (a lot) to others
      Presentation on Python made for all
      Present to a large audience what has been done
      Open discussion
      Poster summarizing what has been done
      Wiki page documenting Python stuff
      Specific mailing-list related to Python
    64. Step 6 : Evangelize
      A lot of work and a slow process, but I won some allies
      Some technical people are convinced that Python is useful
      Some managers are convinced that Python could be a good thing for Terra
      Starting evaluation in some specific cases
    65. Step 7 : Next Steps
    66. Step 7 : Next Steps
      Proven that Python could be useful in some cases.
      Don’t forget my Grail!
      The road has not ended …
      I’m lobbying to start using Python for web development.
      And again, I made a prototype
    67. Step 7 : Next Steps
      Django = THE Python MVC web framework :
      Model :
      By describing data, no code written (Django’s built-in ORM, similar in spirit to SQLAlchemy)
      Automatic creation of tables (if needed),
      Data accessed through objects,
      No SQL needed!
      View :
      access models to get the data
      render the output through templates
      • loose coupling interface <-> code!
      Controller :
      REST through url parsing
    68. Step 7 : Next Steps
      Login :
      Module auth already exists.
      Easy to tell django that authentication is required
      @login_required
      def list_abook(request, username):

      login_required is a Python decorator
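For readers new to decorators, the idea behind such a guard can be shown in plain Python. This is not Django's implementation (the real `login_required` works on Django request objects and redirects to a login page); `require_login` and the dict-based request are hypothetical stand-ins.

```python
import functools


def require_login(view):
    """Hypothetical decorator: refuse the call unless request['user'] is set."""
    @functools.wraps(view)  # keep the wrapped function's name and docstring
    def wrapper(request, *args, **kwargs):
        if not request.get("user"):
            return "401 login required"
        return view(request, *args, **kwargs)
    return wrapper


@require_login
def list_abook(request, username):
    return "address book of %s" % username


anonymous = list_abook({}, "standel")
logged_in = list_abook({"user": "standel"}, "standel")
print(anonymous)
print(logged_in)
```

The check lives in one place and every view that needs it opts in with a single line.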
    69. Step 7 : Next Steps
      Caching information (memcache, database, file, …)
      4 levels :
      Per site : one config line
      Per view : one python decorator
      @cache_page(60 * 15)
      def list_abook(request, username):

      In templates : maybe better to leave this one out!
      Low-level cache access :
      cache.get(id)
      cache.set(id, value, timeout)
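What a per-view cache decorator with a timeout does can be sketched in plain Python. This is not Django's `cache_page`; `cache_for` is a hypothetical name, the store is an in-process dict, and the clock is injected so that expiry can be shown without waiting.

```python
import functools
import time


def cache_for(seconds, clock=time.monotonic):
    """Hypothetical per-function cache keyed on the call arguments."""
    def decorator(func):
        store = {}  # args -> (expiry time, cached value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and hit[0] > now:
                return hit[1]           # still fresh: skip the real call
            value = func(*args)
            store[args] = (now + seconds, value)
            return value
        return wrapper
    return decorator


calls = []
fake_now = [0.0]


@cache_for(60 * 15, clock=lambda: fake_now[0])
def list_abook(username):
    calls.append(username)              # record every real execution
    return "address book of %s" % username


list_abook("standel")
list_abook("standel")       # served from the cache
fake_now[0] = 60 * 15 + 1   # jump past the 15 minute timeout
list_abook("standel")       # recomputed
print(len(calls))
```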
    70. Step 7 : Next Steps
      Address book Web Service
      Retrieve address book of one user,
      Add an account,
      Add an entry to the address book of a user,
      View all the address book entries,
      Output in HTML, JSON and CSV
      < 100 LOC
      2 hours (w/o knowing the framework)
      Not one line of SQL
      just useful code
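The prototype's code is not included in the slides. As a rough idea of why several output formats cost so little code in Python, here is a sketch with the standard library only; the entries and the helper names `to_json` and `to_csv` are made up for the example.

```python
import csv
import io
import json

# Made-up address book entries.
entries = [
    {"name": "Ana", "email": "ana@example.com"},
    {"name": "Bruno", "email": "bruno@example.com"},
]


def to_json(rows):
    return json.dumps(rows, indent=2)


def to_csv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["name", "email"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


json_text = to_json(entries)
csv_text = to_csv(entries)
print(json_text)
print(csv_text)
```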
    71. Conclusions
      One year and a half …
      and Evangelization is not done yet!
      Email Team :
      Several systems have been written in Python and work really well … even under Terra’s high load!
      A web project should start right now
      People are starting to use/learn it inside the company
      Some teams are starting to evaluate Python
      Some Terra employees here at this conference!
    72. THANKS!
      Any Questions?

    Companies in the process of adoption of a language more
