EuroPython 2014 - How we switched our 800+ projects from Apache to uWSGI

During the last 7 years the company I work for has developed more than 800 projects in PHP and Python. All this time we used Apache+nginx to host these projects. In this talk I will explain why we decided to switch all our projects from Apache+nginx to uWSGI+nginx and how we did it.

Published in: Software
  • When we started out, we developed our projects only in PHP, with nginx on the frontend and Apache/mod_php on the backend, which was the best available way to host PHP applications at that time. After several years we realized that PHP itself had a lot of problems and downsides, so we started to look for another language and chose Python for several reasons that are beyond the scope of this talk. Because we already had a working nginx and Apache environment, we had to choose between mod_python, which was completely dead and abandoned at that time, and mod_wsgi, which was also dead but its corpse was still warm; it was more feature-rich and had more documentation. We chose mod_wsgi and served our Python applications in daemon mode, because it allowed us to run applications in separate processes and was the way recommended by the mod_wsgi author himself. Everything worked, but there were several problems with Apache and mod_wsgi.
  • Our main problem was that we had projects written in different versions of PHP and Python, so we had to run 5 Apache instances, one for each version, each listening on a different port, which was a nightmare to support.
  • The other problem was automatic code reloading during development. We didn’t want to use Django’s built-in web server; we wanted to develop with mod_wsgi to be sure that everything would work as expected after we deployed. In embedded mode you need to restart the whole Apache after code changes; in daemon mode, which we were using, you have to touch the WSGI entry point to reload the source. There was also a recipe from the mod_wsgi author, hidden deep in the docs: a custom script of 100+ lines that we had to include and live with in our projects. Maybe it’s not such a big problem, but it’s 2014, and there must be a more appropriate way to do that.

    Development of mod_wsgi has recently become active again, and starting from 4.1.0 there is a Django application plugin that adds a “runmodwsgi” command to Django, so you can run python manage.py runmodwsgi --reload-on-changes. But that’s only for Django, and it’s the same old 100+ line script, which has now become part of the official package instead of being hidden somewhere in the docs.
  • Another problem was that during development we create a lot of branches in Git and want each of them to be available, for example via a URL like branch.project.com. We also wanted each branch to run under a separate daemon process, because if one branch has a problem that crashes its daemon process, developers working on other branches should be able to continue working. The solution we came up with was a Git hook that generates a separate Apache config file for each new branch and then reloads Apache; if a branch gets deleted, the config file is deleted too and Apache has to be reloaded again. With 20 branches we have 20 almost identical config files that differ only in the daemon process name and a few other things, and that is for just one project. If we decide to change something in the config of a project, we have to edit all 20 config files. Really painful. Perhaps there is another way, and I HOPE there is, but we spent almost a month googling and couldn’t find it.
  • Also, when we add a new config file we have to issue a reload, because Apache can’t load configuration files dynamically. There are two problems with that:

    First, there is still a short period when the web server doesn’t respond to requests; it can last from milliseconds to seconds depending on various factors.
    Second, if there is an error in the configuration file, the whole server will crash. Yes, you can check the file with the configtest command, but it only tests for syntax errors and doesn’t guarantee that the server will restart properly.
  • Lastly, I want to mention a few other problems:

    I know a lot of people (myself included) who think that Apache config files are ugly; when you look at one you need some time to understand what’s going on in there, especially in complex configurations.
    You have to be an Apache expert to configure it so that it can compete with other web servers in areas like memory management or CPU utilization, and not every sysadmin knows Apache that well.
    Apache is old, my grandma was using it in the old days. Version 2.2.0 was released in 2005; yes, the 2.2 branch is still maintained, but new releases mostly fix bugs. Version 2.4, released in 2012, fixed a lot of problems and limitations: for example, memory management became much better, and the ability to declare variables in config files made them a bit cleaner.
    mod_wsgi was dead until the beginning of 2014, or as the author of mod_wsgi prefers to say: mod_wsgi was not dead, it’s just been resting.

    All those problems made us start looking for a solution, and it was quickly found: it’s called uWSGI.
  • What is uWSGI:

    1) uWSGI is a modern project; it started somewhere between 2009 and 2010
    2) It has a really fast development cycle, with new features constantly being added
    3) It supports a lot of languages, including Python (CPython, PyPy, and Jython support is coming), PHP, Lua, Perl, Ruby, Erlang, Go and Java; it also supports the V8 engine and Mono for running ASP.NET applications, and more
    4) It works on Linux, Windows, the BSDs and other operating systems such as Mac OS and Solaris
    5) nginx supports the uwsgi protocol, which is the best-performing protocol for uWSGI
    6) It has a ton more features; we’ll discuss some of the most interesting ones later

    Let’s see how we can install it.
  • The vast majority of uWSGI features are available as plugins. If you want maximum flexibility with minimal resource usage, create a modular build, which is the recommended approach. I’ll show how to do it from source, because OS package repositories may not always have the latest version or all available plugins.

    First we have to download uWSGI. To change the install location, add the following two lines to buildconf/core.ini and create the following directories.
  • Second, we need to build the core. The core consists of several packages that most likely everyone will need. If you are not everyone, you can also customize what exactly the core consists of.
    Lastly, we need to build our plugins: 2 plugins for Python 2.7 and 3.3, and 2 plugins for PHP 5.4 and 5.5.

    That’s it. We just completed the install process of uWSGI with support for 2 versions of Python and 2 versions of PHP.
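
The build steps can be collected in a small script. This is only a sketch: the commands and the PHP prefixes are the ones shown on the installation slides, while the helper names (build_commands, as_shell) are hypothetical and exist only for this illustration. It prints the commands rather than running them.

```python
import shlex

# Interpreter -> plugin name, as listed on the installation slide.
PYTHON_PLUGINS = {"python2.7": "python27", "python3.3": "python33"}
# PHP install prefix -> plugin name, as listed on the same slide.
PHP_PLUGINS = {"/usr/local/php5.4": "php54", "/usr/local/php5.5": "php55"}


def build_commands():
    """Return the build steps as (extra_env, argv) pairs, core first."""
    steps = [({}, ["python", "uwsgiconfig.py", "--build", "core"])]
    for interpreter, name in PYTHON_PLUGINS.items():
        argv = [interpreter, "uwsgiconfig.py", "--plugin", "plugins/python", "core", name]
        steps.append(({}, argv))
    for prefix, name in PHP_PLUGINS.items():
        argv = ["python", "uwsgiconfig.py", "--plugin", "plugins/php", "core", name]
        steps.append(({"UWSGICONFIG_PHPDIR": prefix}, argv))
    return steps


def as_shell(env, argv):
    """Render one step as a shell command line with its environment prefix."""
    prefix = "".join(f"{key}={shlex.quote(value)} " for key, value in env.items())
    return prefix + " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    for env, argv in build_commands():
        print(as_shell(env, argv))
```

Adding another language version then means adding one entry to a dictionary instead of another Apache instance.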

    Now let’s see how uWSGI helps us solve the problems we had with Apache/mod_wsgi.
  • The biggest problem for us was supporting projects written in different versions of different languages, for which we had to run 5 Apache instances, one per version, listening on different ports. With uWSGI it’s a breeze: you compile a plugin for each version of the language and use the one the project needs. Really cool.
  • Remember the 100+ line Python script used with mod_wsgi to monitor code changes and reload the daemon process? uWSGI has a simple option, py-auto-reload, which you set to the interval in seconds at which uWSGI should check whether something has changed and reload the sources. Another option is touch-reload, which works the same way as touch reload in mod_wsgi. No more custom scripts that you have to search for and include in your projects.
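
To make the idea of interval-based reloading concrete, here is a minimal sketch of polling file modification times. It illustrates the general technique only and is not uWSGI's implementation; the function names are hypothetical.

```python
import os


def snapshot(paths):
    """Map each existing path to its last-modification time in nanoseconds."""
    return {path: os.stat(path).st_mtime_ns for path in paths if os.path.exists(path)}


def changed(before, after):
    """Return the sorted paths that were added, removed, or modified between two snapshots."""
    return sorted(
        path for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A reloader would take a snapshot every N seconds, compare it with the previous one, and restart the workers whenever changed() returns a non-empty list.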
  • We also had the problem that the name in mod_wsgi’s WSGIDaemonProcess directive can’t be dynamic, while we needed a separate daemon process for each Git branch, so we had to generate nearly identical config files. uWSGI’s emperor mode was the solution. The uWSGI emperor is a special instance of uWSGI that monitors specific events and spawns, stops or reloads instances (known as vassals) on demand. By default the emperor scans specific directories for supported configuration files, but it is extensible through imperial monitor plugins, which means you can store configuration in Postgres or MongoDB, publish it via AMQP or ZeroMQ, and so on.
  • Let’s see an example of the directory monitoring plugin. We start a uWSGI emperor instance and ask it to monitor the following directory pattern. Now if we add a new file matching this pattern, the emperor automatically spawns a new vassal; if we modify a file, the emperor restarts the vassal; and if we delete the file, the vassal is killed. All of this happens automatically, with no need to issue any restart commands. What if there is an error in a vassal’s configuration file? Nothing bad happens to the emperor or the other vassals; the emperor just says that this vassal is cursed and won’t spawn a new instance of it. If the emperor himself dies, all of his vassals die with him.
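
The spawn/restart/kill rules can be expressed as a comparison of two directory scans. The sketch below models only that decision logic, assuming each scan is a mapping from config path to modification time; it is not the emperor's actual code, and emperor_actions is a hypothetical name.

```python
def emperor_actions(previous, current):
    """Compare two {config_path: mtime} scans and return the vassal actions to take."""
    actions = []
    # Files that appeared since the last scan: spawn a vassal for each.
    for path in sorted(current.keys() - previous.keys()):
        actions.append(("spawn", path))
    # Files present in both scans whose mtime changed: restart the vassal.
    for path in sorted(current.keys() & previous.keys()):
        if current[path] != previous[path]:
            actions.append(("restart", path))
    # Files that disappeared: kill the vassal.
    for path in sorted(previous.keys() - current.keys()):
        actions.append(("kill", path))
    return actions
```

For example, comparing a scan that contains only dev.ini with a later scan in which dev.ini was modified and master.ini was added yields one spawn and one restart.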
  • So how does this help with our Apache dynamic WSGIDaemonProcess problem? Do we again have to create the same insane number of configuration files per project, one for each branch? Yes and no. We’ll use template config files and create symlinks.
  • This is an example of a template config file. First we define our own variable called project_dir and set it to the path of our project. Then we declare that we’ll be using Python 2.7, followed by the other Python-related settings, which should be nothing new to anyone who knows how Python apps work. The interesting thing here is the %n magic variable, which is the config file name without the extension. By the way, I don’t know about you, but to me this syntax is much cleaner than Apache config files, and as a bonus my brain doesn’t hurt after working with these config files.
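
To see what the template turns into for a given file name, the substitution can be imitated in a few lines of Python. This is a simplified model that handles only %n and %(name) placeholders as used on the template slide; it is not uWSGI's parser, and expand is a hypothetical helper.

```python
import os
import re


def expand(option_lines, config_path):
    """Expand %n and %(name) in 'key = value' lines for the given config file path."""
    # %n: the config file name without its extension.
    name = os.path.splitext(os.path.basename(config_path))[0]
    values = {}
    expanded = []
    for line in option_lines:
        key, _, value = (part.strip() for part in line.partition("="))
        # %(name): the value of an option defined earlier in the same file.
        value = re.sub(r"%\((\w+)\)", lambda match: values[match.group(1)], value)
        value = value.replace("%n", name)
        values[key] = value
        expanded.append((key, value))
    return expanded


if __name__ == "__main__":
    template = [
        "project_dir = /srv/projects/my_project",
        "socket = %(project_dir)/tmp/%n.sock",
        "logto = %(project_dir)/logs/uwsgi-%n.log",
    ]
    for key, value in expand(template, "/usr/local/uwsgi/apps/app1/dev.ini"):
        print(key, "=", value)
```

The same template yields a different socket and log file for dev.ini and master.ini, which is what makes one template per project sufficient.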
  • That means that instead of generating a separate config file for each Git branch (as we did for Apache with mod_wsgi), we just create a symlink to the project’s template file. Yes, this gives us 20 symlinks for 20 Git branches, but there is a big win over Apache: if we need to change something, we change it in one place, the project’s template config file, instead of changing every generated config file.

    Also, because we are using emperor mode, which monitors our directories for config files, a new vassal is spawned automatically when a new symlink is created and killed when the symlink is deleted, all without any restart commands or configuration reloads.

    Also, starting from 1.9.1 you can tell the emperor to spawn a vassal only after the first request has been made. Combined with the --idle and --die-on-idle options, which kill a process after the specified period of inactivity, you get truly on-demand applications.
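
A Git hook for this scheme only has to create or remove one symlink per branch. The sketch below shows that part in Python; the function names are hypothetical, and it assumes branch names without slashes and a symlink-capable filesystem.

```python
import os


def link_branch(template, apps_dir, branch):
    """Create <apps_dir>/<branch>.ini as a symlink to the project template."""
    link = os.path.join(apps_dir, branch + ".ini")
    if not os.path.islink(link):
        os.symlink(template, link)
    return link


def unlink_branch(apps_dir, branch):
    """Remove the branch symlink if it exists; the template itself is left untouched."""
    link = os.path.join(apps_dir, branch + ".ini")
    if os.path.islink(link):
        os.unlink(link)
```

With the paths from the slides, link_branch("/usr/local/uwsgi/templates/app1_template", "/usr/local/uwsgi/apps/app1", "dev") has the same effect as the ln -s command shown there.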
  • Let’s briefly talk about some of uWSGI’s interesting features.

    We can implement auto-scaling with uWSGI using the broodlord, zerg and emperor modes together with the idle and die-on-idle options. The idea is that when the site is under heavy load and your vassals can’t handle it, they can ask the emperor to enter broodlord mode and send them some help (zergs) so they can win this battle and serve all the requests. After the load returns to normal, the broodlord kills all the zergs.
  • As of 1.3 there is a so-called alarm subsystem. It allows the developer or sysadmin to ‘announce’ special conditions of an app via various channels. For example, you may want to be notified via Jabber when the string TERRIBLE ALARM appears in the log files.
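
Conceptually, a log alarm is a regular expression attached to a named channel. The following sketch models only the matching step, using the pattern from the slide; it is an illustration and not the alarm subsystem's code, and matching_alarms is a hypothetical name.

```python
import re


def matching_alarms(log_lines, rules):
    """Return (alarm_name, line) pairs for every log line that matches a rule's regex."""
    compiled = [(name, re.compile(pattern)) for name, pattern in rules]
    return [
        (name, line)
        for line in log_lines
        for name, regex in compiled
        if regex.search(line)
    ]
```

Because the pattern on the slide starts with ^, only lines that begin with TERRIBLE ALARM would trigger the notification.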
  • Having multiple versions of a Python package or module is very common. Manipulating PYTHONPATH or using virtualenvs are ways to use different versions without changing your code. uWSGI gives us another option, called the aliasing system. Let’s say we import the foo and bar modules in a lot of places in our code, and we want to modify them while keeping the original foo and bar modules intact for whatever reason. We can create experimental_foo and experimental_bar modules, make the needed changes in them and alias them to foo and bar respectively. Now, when an import foo or import bar statement is executed in your code, experimental_foo and experimental_bar are imported instead of the original modules.
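
The effect of an alias can be reproduced in plain Python by registering a module object under another name in sys.modules. This is one way to get a similar result for illustration, not a description of how uWSGI implements pymodule-alias; alias_module is a hypothetical helper and the module below is created in memory.

```python
import importlib
import sys
import types


def alias_module(alias, module):
    """Make 'import <alias>' return the given module object."""
    sys.modules[alias] = module


if __name__ == "__main__":
    # Stand-in for experimental_foo.py; a real project would import the file instead.
    experimental_foo = types.ModuleType("experimental_foo")
    experimental_foo.VERSION = "experimental"

    alias_module("foo", experimental_foo)
    foo = importlib.import_module("foo")
    print(foo.VERSION)
```

Code that says import foo keeps working unchanged, which is the point of aliasing: the switch happens in configuration rather than in the source.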
  • There are a lot of other cool features: a built-in crontab, a load balancer, a clustering subsystem, and an offload subsystem that lets you automatically delegate some heavy tasks to separate threads. uWSGI has a lot of plugins for different tasks and integrates with almost all well-known web servers, such as nginx, Apache, Cherokee, lighttpd, Mongrel2 and more.

    It has a Django admin integration plugin that shows the status of uWSGI and allows you to restart it or clear its cache; more functionality is in development.

    It has a rich configuration system that supports configuration files in ini, xml, yaml or json. There are more than 30 magic variables for all sorts of things, environment variables, and placeholders, which are variables declared inside configuration files; you can even do simple math in placeholders. You can read the contents of other files from your config files, write for loops and if statements, declare variables in your Python scripts and use them in uWSGI, and much more.
  • OK, uWSGI is cool, but the final question is how to switch from Apache to uWSGI. There is no easy way. There is no magic tool that will translate all your Apache config files into uWSGI ones. How did we do it with our 800 projects? We divided them into several groups with identical or almost identical Apache config files, then wrote a script that generates an equivalent uWSGI config for each group. That took us approximately two days and allowed us to switch all the projects to uWSGI on our dev servers. Then we ran our functional tests to see whether all of them passed. Whenever there was a problem, we looked into it and tuned the uWSGI config file. That took a week or a little more. Once we were sure that everything worked on our dev servers, we switched to uWSGI on our production servers. That’s how we did it.
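
The grouping step could look roughly like the sketch below. The talk does not show the actual script, so this is an assumption about one simple approach: replace the project name with a placeholder, ignore blank and comment lines, and group projects whose normalized Apache configs are identical. All names and sample configs are hypothetical.

```python
from collections import defaultdict


def normalize(text, project):
    """Replace the project name with a placeholder and drop blank and comment lines."""
    lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if line and not line.startswith("#"):
            lines.append(line.replace(project, "{project}"))
    return "\n".join(lines)


def group_configs(configs):
    """Group {project: apache_config_text} into lists of projects with the same config shape."""
    groups = defaultdict(list)
    for project, text in sorted(configs.items()):
        groups[normalize(text, project)].append(project)
    return sorted(groups.values())


if __name__ == "__main__":
    configs = {
        "shop": "ServerName shop.example.com\nWSGIDaemonProcess shop",
        "blog": "# blog\nServerName blog.example.com\n\nWSGIDaemonProcess blog",
        "legacy": "ServerName legacy.example.com\nphp_value memory_limit 64M",
    }
    print(group_configs(configs))
```

Each resulting group then needs only one hand-written uWSGI template, and the outliers that end up in groups of one are the configs worth reviewing by hand.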
  • In conclusion, I would like to say that Apache is actually not bad. It’s a very good and stable web server used by a lot of people. It is suitable for a lot of situations, and you can happily use it if it works for you. Do not blindly trust people who say “Apache sucks, go try this or that new super cool web server”. Turn on your brain and think: do YOU really have any problems with your Apache? If not, then live happily with it and don’t listen to anyone.

    In this talk I explained why we switched. We didn’t have any problems with Apache’s performance; we had problems with its features. We could do everything we wanted with Apache, but we weren’t happy with HOW we had to do it. What I want to say is that you should choose a web server by features and not by benchmarks, because benchmarking web servers is pointless: they all perform approximately the same if configured properly. In 99% of cases it’s the application that is written incorrectly and has performance problems, not the web server that just serves the application to the world.
  • EuroPython 2014 - How we switched our 800+ projects from Apache to uWSGI

    1. europython 2014: How we switched our 800+ projects from Apache to uWSGI • Max Tepkeev • 23 July 2014 • Berlin, Germany
    2. Who We ?! • Ailove Group • Ailove • Aitarget • iCom • 120+ people • 800+ projects
    3. Who I ?! Max Tepkeev, Russia, Moscow • python-redmine • architect • https://www.github.com/maxtepkeev
    4. Previous Technology Stack • PHP (5.3, 5.4, 5.5) • Python (2.7, 3.3) • nginx • Apache (mod_php, mod_wsgi)
    5. Problems: separate Apache for each version of PHP and Python, listening on different ports • :8080 – Apache/PHP5.3 • :8081 – Apache/PHP5.4 • :8082 – Apache/PHP5.5 • :8083 – Apache/Python2.7 • :8084 – Apache/Python3.3
    6. Problems: monitoring for code changes during development • embedded-mode – full restart • daemon-mode – 100+ lines script • mod_wsgi >= 4.1.0 (Django only): runmodwsgi --reload-on-changes
    7. Problems: mod_wsgi WSGIDaemonProcess name can’t be dynamic, e.g. separate daemon process per GIT branch • dev.project.com – project-dev.conf • dev2.project.com – project-dev2.conf
    8. Problems: Apache can’t load configuration files dynamically • reload – not so graceful restart • total crash if there are errors
    9. Problems • Apache configuration files are ugly (subjective) • Apache is hard to configure properly (subjective) • Apache is old (2.4 fixed a lot) • mod_wsgi seemed to be abandoned (is actively developed again)
    10. uWSGI • Modern project • Fast development cycle • Multi-language (Python, PHP, Lua, PERL, Ruby, Erlang, Go, Java and more) • Supports > 20 OS • nginx speaks uwsgi protocol • Ton more features
    11. Installation: http://projects.unbit.it/downloads/uwsgi-latest.tar.gz
        To change install location add to buildconf/core.ini:
            bin_name = /usr/local/uwsgi/bin/uwsgi
            plugin_dir = /usr/local/uwsgi/plugins
        Create the directories:
            mkdir /usr/local/uwsgi
            mkdir /usr/local/uwsgi/bin
            mkdir /usr/local/uwsgi/plugins
    12. Installation
        Core:
            python uwsgiconfig.py --build core
        Plugins:
            python2.7 uwsgiconfig.py --plugin plugins/python core python27
            python3.3 uwsgiconfig.py --plugin plugins/python core python33
            UWSGICONFIG_PHPDIR=/usr/local/php5.4 python uwsgiconfig.py --plugin plugins/php core php54
            UWSGICONFIG_PHPDIR=/usr/local/php5.5 python uwsgiconfig.py --plugin plugins/php core php55
    13. Solutions: multi-version plugins • php53_plugin.so – PHP5.3 • php54_plugin.so – PHP5.4 • php55_plugin.so – PHP5.5 • python27_plugin.so – Python2.7 • python33_plugin.so – Python3.3
    14. Solutions: monitoring for code changes during development • --py-auto-reload N (seconds) • --touch-reload=django.wsgi
    15. Solutions: Emperor mode - event based dynamic handling of applications (vassals) • Monitor config files in directories • More plugins available (postgres, mongodb, amqp, zeromq and more)
    16. Solutions: monitor config files in directories
            uwsgi --emperor "/usr/local/uwsgi/apps/*/*.ini"
        • New file "/usr/local/uwsgi/apps/app/app.ini": spawn vassal • File modified: restart vassal • File removed: kill vassal
    17. Solutions: template config files
            ln -s /usr/local/uwsgi/templates/app1_tpl /usr/local/uwsgi/apps/app1/main.ini
    18. Solutions
            [uwsgi]
            project_dir = /srv/projects/my_project
            plugins = python27
            virtualenv = %(project_dir)/python
            pythonpath = %(project_dir)
            socket = %(project_dir)/tmp/%n.sock
            wsgi-file = %(project_dir)/repo/%n/wsgi/django.wsgi
            touch-reload = %(project_dir)/repo/%n/wsgi/django.wsgi
            logto = %(project_dir)/logs/uwsgi-%n.log
    19. Solutions: create symlink for each GIT branch
            ln -s /usr/local/uwsgi/templates/app1_template /usr/local/uwsgi/apps/app1/dev.ini
            ln -s /usr/local/uwsgi/templates/app1_template /usr/local/uwsgi/apps/app1/master.ini
    20. Cool Features: auto-scaling • broodlord mode • zerg mode • emperor • idle/die-on-idle
    21. Cool Features: alarm subsystem
        In configuration:
            [uwsgi]
            alarm = jabber xmpp:foo@jabber.xxx;mypass;admin@jabber.xxx
            log-alarm = jabber ^TERRIBLE ALARM
        In application:
            print "TERRIBLE ALARM!"
    22. Cool Features: aliasing Python modules
            [uwsgi]
            # some configuration here
            pymodule-alias = foo=/opt/proj/experimental_foo.py
            pymodule-alias = bar=/opt/proj/experimental_bar.py
    23. Cool Features • built-in crontab • built-in load-balancer • clustering subsystem • offload subsystem • plugins (geoip etc.) • integration with web-servers • django admin integration
    24. How to switch • no fully automatic way • write scripts that generate basic config files for you • tune them by hand afterwards
    25. Conclusion • apache is not bad • benchmarks are pointless • web servers perform similarly if configured properly • it’s applications that have problems, not web servers • choose a right tool for a right job
    26. Questions • slides: http://slideshare.net/maxtepkeev • github: https://github.com/maxtepkeev • email: tepkeev@gmail.com • skype: max.tepkeev
