Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Python at Facebook

3,188 views

Published on

Facebook is a company that operates at massive scale. In this talk we’ll talk about how we use Python at Facebook.
Be it building back-end services, fast prototyping, automation, scaling operations, or simply gluing together various pieces of our infrastructure, Python is at the heart of it and allows our engineers to quickly deliver working solutions.
We’ll talk about our review process, unit testing, deployment workflow and various open-source framework we use.

Published in: Technology

Python at Facebook

  1. 1. Python @ Facebook Angelo Failla Production Engineer @ Facebook Dublin October 11th, 2014 -- PyCon Ireland 2014
  2. 2. 1 How do we use Python at Facebook? 2 Description of a production system based on Python 3 Questions Agenda
  3. 3. How do we use Python at Facebook?
  4. 4. • Centralized repo for back-end systems • A LOT of code. • All Python code hosted there • 3rd most used language (PHP/Hack => C++ => Python) • Still a LOT of code :P
  5. 5. ONE BUILD TOOL TO RULE THEM ALL Idreamlikecrazy@ https://www.flickr.com/photos/purple-lover/13583362554 (cc 2.0 license)
  6. 6. Everything is built from HEAD C++ binaries are statically linked http://cdn.instapop.com/assets/memes/Skeptical%20Baby/4302/thumb.jpeg?1383838705
  7. 7. How about the same in Python? We use .par files PAR file Short shell script header Zipped content __main__.py .py, .pyc files .so and other files needed by the application
  8. 8. Thrift (https://thrift.apache.org/) in a nutshell Write myservice.thrift Thrift compiler generates language bindings Client/Server app imports generated code
  9. 9. # cat ~/repo/thrift_example/myservice.thrift namespace py thrift_example.myservice service Calculator { i32 add(1: i32 num1, 2: i32 num2) }
  10. 10. $ thrift --gen py myservice.thrift $ tree gen-py/ gen-py/ ├── __init__.py └── myservice ├── Calculator.py ├── Calculator-remote ├── constants.py ├── __init__.py └── ttypes.py
  11. 11. # cat ~/fbcode/thrift_example/server.py # import lines removed, see: http://tinyurl.com/p6a7cuw 1: class CalculatorHandler(Calculator.Iface): 2: def add(self, num1, num2): 3: return num1 + num2 4: 5: handler = CalculatorHandler() 6: socket = TSocket.TServerSocket(port=9090) 7: tfactory = TTransport.TBufferedTransportFactory() 8: pfactory = TBinaryProtocol.TBinaryProtocolFactory() 9: server = TServer.TSimpleServer( 10: handler, socket, tfactory, pfactory) 11: server.serve()
  12. 12. # cat ~/fbcode/thrift_example/client.py # import lines removed, see: http://tinyurl.com/p6zz9ex 1: socket = TSocket.TSocket('localhost', 9090) 2: transport = TTransport.TBufferedTransport(socket) 3: protocol = TBinaryProtocol.TBinaryProtocol(transport) 4: client = Calculator.Client(protocol) 5: transport.open() 6: 7: sum = client.add(1, 2) 8: print(‘1 + 2 = %d’ % sum) 9: 10: transport.close()
  13. 13. # cat ~/repo/thrift_example/TARGETS thrift_library( name = "myservice_thrift", languages = ["ruby", "php", "cpp", "python"], thrift_srcs = {"myservice.thrift": ["Calculator"]}, ) python_binary( name = "myservice_server", main_module = "thrift_example.server", srcs = ["server.py"], deps = [":myservice_thrift-py"], ) python_binary( name = "myservice_client", main_module = "thrift_example.client", srcs = ["client.py"], deps = [":myservice_thrift-py"], )
  14. 14. pallotron@dev:~/repo $ fbconfig thrift_example/ Configuring 20 targets Writing 208/208 rules to makefile [100%] Done pallotron@dev:~/repo $ fbmake opt thrift/compiler/thriftl.ll common/memory/JEMalloc.cpp […] Linking _build/opt/thrift/compiler/thrift... _build/opt/thrift/compiler/thrift thrift_example/myservice.thrift _build/opt/thrift_example/myservice_thrift-Calculator-pyremote.lpar _build/opt/thrift_example/myservice_client.lpar _build/opt/thrift_example/myservice_server.lpar _build/opt/thrift_example/myservice_server.par _build/opt/thrift_example/myservice_thrift-Calculator-pyremote.par _build/opt/thrift_example/myservice_client.par
  15. 15. Python as a DSL
  16. 16. Enter configerator…
  17. 17. JSON Review and Commit Git Repo MAGIC! Proxy disk Application Thrift schema: struct Job { 1: string name, 2: Scheduling scheduling, } struct Scheduling { 1: int priority, } Python config: importThrift(file) myConfig = Job( name = “ajobname”, scheduling = Scheduling( priority = 3) ) export(myConfig) Python validator: def validate_job(job): ensureNotEmpty(job.name) ensurePositive( job.scheduling.priority) addValidator(myConfig, validate_job) Distribution Client lib Composition
  18. 18. 1: def consume(): 2: 3: handle = dict() 4: ctc = ConfigeratorThriftClient() 5: fname = "example/config" 6: 7: def on_update(name): 8: handle['config'] = ctc.getJSONConfigContent(name) 9: 10: ctc.startMonitoringConfig(fname) 11: ctc.setUpdateCallback(on_update) 12: on_update(fname) 13: 14: return handle
  19. 19. Benefits of using configerator
  20. 20. Code review on configuration changes Automated validation http://www.jasonawesome.com/wp-content/uploads/2010/06/sally-code-review-300x256.png http://sounddesignlive.com/wp-content/uploads/2013/08/sound-design-live-automate_ all_the_things.jpeg
  21. 21. Clear representation of complex config http://www.visualthinkingmagic.com/wp-content/uploads/2011/08/complex-simple.gif No service restart required to pick up changes http://social.microsoft.com/Forums/getfile/25914/
  22. 22. Code review at FB
  23. 23. • Code *is* peer reviewed • We use Phabricator • Originally developed @ FB by Evan Priestley • Go check www.phabricator.org • Supports git/svn/hg
  24. 24. • (Some) PEPs enforced and push blocking • Common code dir for accelerated development • Unit tests results are clearly visible in our code review tool http://memegenerator.net/Bill-Lumbergh-Office-Space
  25. 25. Suggestions for reviewers: • Reviewing code is not writing code • Nitpicks are not enough • Stop typing and have a conversation • Build trust • Find people to emulate
  26. 26. Making diffs easier to be reviewed: • One and only one thesis • Do No Harm • Mix and match reviewers • Write good test plans • You are not your code
  27. 27. Production Python at Facebook
  28. 28. PROJECT KEANU • Automated load tests of regions/clusters/hosts • Implements feedback control loop • No simulated traffic • Goal is to find out “breaking point” == real capacity • Runs daily • Peek time • When no incidents • Skips drained clusters
  29. 29. KEANU
  30. 30. Questions?

×