How to Design a Great API (using flask) [ploneconf2017]

How do you build an API that developers love building and consumers love using?

There's a lot that goes into creating a great API. This presentation shares some tips & tricks, architectural patterns, and best practices that go into building a great engineering environment around your API.

Talk presented on Oct 18, 2017 at PloneConf2017.

Topics covered by this talk:
Intuitive Practices:
standardization, configuration/environment files, ORMs, SQLAlchemy, database migrations, Alembic, database seeds, requirements.txt, package management, dependency management, setup scripts

Durable Practices:
Unit Tests, virtual environments, flush vs commit, error rollbacks, request lifecycle, session lifecycle

Flexible Practices:
Directory structures, application factories, blueprints, python debugger

Reliable Practices:
Logging, progressive rollouts, slack hooks, cron health checks, api versioning, api analytics

User-Friendly Practices:
Endpoint design, endpoint documentation, debugging tools, postman

Speed Practices:
Python profiling, Bulk SQL Inserts, caching


  1. 1. How to Design a Great API (using Flask) Devon Bernard VP of Engineering @ Enlitic
  2. 2. Developers & Users
  3. 3. Code That Makes Developers Happy: Intuitive, Durable, Flexible
  4. 4. Intuitive
  5. 5. Standardization Makes Debugging Easier [Chart: Time Spent Debugging vs. Uniqueness of Environment, ranging from Standardized to Very Unique]
  6. 6.
  7. 7. Are Hard Coded Variables Bad? Yes, here's why: 1. When runtime variables aren't centralized, they're harder to find, making your code less readable. 2. If the variable is environment-specific, you'll likely need to juggle git to keep your local value out of commits. 3. If the variable is supposed to be private or secret (e.g. passwords), hard-coding it puts the secret in version control. File.py (production): database = 'production_url' File.py (local machine): ~~ database = 'localhost_url' ++ def ban_user(email): ++ ...
  8. 8. Leaking Passwords
  9. 9. Configurations / Environment my_app.yaml.example: database: postgresql://localhost:5432/my_app test_database: postgresql://localhost:5432/testing flask_login_secret: Usage: import yaml yaml_config = yaml.safe_load(open('../my-app.yaml')) print(yaml_config['database'])
  10. 10. Configurations / Environment (2) class Config(object): DEBUG = False CSRF_ENABLED = True SECRET = yaml_config['flask_login_secret'] SQLALCHEMY_DATABASE_URI = yaml_config['database'] class DevelopmentConfig(Config): DEBUG = True class TestingConfig(Config): TESTING = True SQLALCHEMY_DATABASE_URI = yaml_config['test_database'] DEBUG = True class ProductionConfig(Config): DEBUG = False TESTING = False app_config = { 'development': DevelopmentConfig, 'testing': TestingConfig, 'production': ProductionConfig } run.py: app = create_app('development') app.run() conftest.py: @pytest.fixture(scope='session') def mock_app(): mock = create_app('testing') cxt = mock.app_context() cxt.push() yield mock
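The per-environment lookup on this slide can be tried without Flask. The following is a minimal sketch, assuming plain classes and a hypothetical `load_config` helper that is not part of the talk; the database URIs are the placeholder values from the YAML slide.

```python
class Config:
    """Defaults shared by every environment."""
    DEBUG = False
    TESTING = False
    DATABASE_URI = "postgresql://localhost:5432/my_app"


class DevelopmentConfig(Config):
    DEBUG = True


class TestingConfig(Config):
    DEBUG = True
    TESTING = True
    DATABASE_URI = "postgresql://localhost:5432/testing"


class ProductionConfig(Config):
    pass  # inherits the safe defaults


APP_CONFIG = {
    "development": DevelopmentConfig,
    "testing": TestingConfig,
    "production": ProductionConfig,
}


def load_config(name):
    """Hypothetical helper: map an environment name to its config class."""
    try:
        return APP_CONFIG[name]
    except KeyError:
        raise ValueError("unknown environment: %r" % name)
```

Failing loudly on an unknown name is a design choice here: a typo such as `"prod"` stops the app at startup instead of silently running with defaults.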
  11. 11. Object Relational Mappers [Diagram: application objects <-> ORM <-> relational database]
  12. 12. Object Relational Mappers (2) class User(Base): __tablename__ = 'users' id = Column(Integer, primary_key=True) name = Column(Text, nullable=False) email = Column(Text, index=True, unique=True, nullable=False) password = Column(Text, nullable=False)
  13. 13. Object Relational Mappers (3) With ORM (much simpler): def get_user_email(id): return conn.query(User.email).filter(User.id == id).scalar() Without ORM (might have to worry about SQL injection): def get_user_email(id): query = text("""SELECT email FROM users WHERE id = %d LIMIT 1;""" % (id)) row = conn.execute(query).fetchone() if row is not None: return row['email'] return None
  14. 14. How to Standardize Database Schemas? Database Schema + Git History = Database Migrations (Alembic)
  15. 15. Database Migrations models.py: class Report(Base): __tablename__ = 'reports' ++ author = Column(Text) alembic/version/6a9eb6b43d7d_added_report_author.py (automatically generated for you by Alembic): def upgrade(): op.add_column('reports', sa.Column('author', sa.Text())) def downgrade(): op.drop_column('reports', 'author')
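For the hand-written SQL path, the injection worry goes away once values are passed as bound parameters instead of being formatted into the query string. This is a minimal sketch that swaps in the standard-library `sqlite3` module for the talk's SQLAlchemy connection; the table layout and sample row are assumptions for illustration.

```python
import sqlite3


def get_user_email(conn, user_id):
    """Look up an email with a bound parameter (the `?` placeholder)."""
    row = conn.execute(
        "SELECT email FROM users WHERE id = ? LIMIT 1", (user_id,)
    ).fetchone()
    return row[0] if row is not None else None


# In-memory database with one illustrative user.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, email) VALUES (?, ?)", (1, "a@example.com"))
conn.commit()
```

Because the driver sends the value separately from the SQL text, an input such as `"1 OR 1=1"` is compared as a literal value and never parsed as SQL.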
  16. 16. Default Alembic Behavior Terminal: $ alembic revision --autogenerate -m "myrev" Alembic.ini: # file_template = %%(rev)s_%%(slug)s
  17. 17. Default Alembic Behavior (2) Terminal: $ alembic history 69c35d0ef5bb -> 000fd728c260 (head) b4b0fbdebf42 -> 69c35d0ef5bb 9dcf0d65e983 -> b4b0fbdebf42 ac39eb5890f4 -> 9dcf0d65e983 7054e7fae54a -> ac39eb5890f4 6a9eb6b43d7d -> 7054e7fae54a 56a1b15216da -> 6a9eb6b43d7d dbe9911f4c0b -> 56a1b15216da 2d5f5c762653 -> dbe9911f4c0b 24f34e82e6c1 -> 2d5f5c762653 ... [Figure: with hash-based file names, the revision files list in the order 8 12 3 14 5 6 9 4 7 11 13 15 10 1 2 rather than chronologically]
  18. 18. Ideal Alembic Behavior Alembic.ini: # file_template = %%(rev)s_%%(slug)s file_template = %%(year)d_%%(month).2d_%%(day).2d_%%(slug)s [Figure: with date-prefixed file names, the revision files list chronologically: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
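To see why the date-based template sorts well, the same percent-style format string can be expanded by hand. This is a sketch only: `revision_filename` is a hypothetical helper, not Alembic's API, and the doubled `%%` in the ini file becomes a single `%` once the config is read.

```python
import datetime

# The slide's template after the ini file's %% escaping is resolved.
FILE_TEMPLATE = "%(year)d_%(month).2d_%(day).2d_%(slug)s"


def revision_filename(created, slug):
    """Hypothetical helper: expand the template for one revision."""
    fields = {
        "year": created.year,
        "month": created.month,
        "day": created.day,
        "slug": slug,
    }
    return FILE_TEMPLATE % fields + ".py"


names = [
    revision_filename(datetime.date(2017, 10, 2), "added_report_author"),
    revision_filename(datetime.date(2017, 3, 5), "created_reports"),
]
```

Zero-padding the month and day (`.2d`) is what makes plain alphabetical sorting match chronological order.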
  19. 19. How to Standardize Database Data? Revision tools typically manage only schemas, not data, so you'll need to create a database seed. Why • Go from 0 to standard instantly • Developers are free to mess with their local database • It's easy to test your back end with a known database setup How • Dump database data to a SQL file • Create a script (.py, .sh, etc.) to reset your database schema and insert the seed file pg_dump --dbname="my_app" --file="/PATH/TO/seed.sql" --data-only --inserts
  20. 20. Where Are My Packages? $ python run.py ImportError: No module named flask_sqlalchemy $ pip install flask_sqlalchemy $ python run.py ImportError: No module named smtplib $ pip install smtplib $ python run.py ImportError: No module named passlib $ ... #FacePalm
  21. 21. Requirements.txt $ python run.py ImportError: No module named flask_sqlalchemy $ pip install -r requirements.txt Collecting flask_sqlalchemy Collecting smtplib Collecting passlib ... $ python run.py * Running on http://127.0.0.1:5000/ requirements.txt: Flask==0.12 Flask-Cors==3.0.3 Flask-SQLAlchemy==2.1 Flask-Login==0.3.2 psycopg2==2.6.1 PyYAML==3.11 requests==2.10.0 SQLAlchemy==1.0.14 passlib==1.6.5 bcrypt==3.1.1 WTForms==2.1 ...
  22. 22. Requirements.txt (2) git+ssh://git@github.com/MY_USER/MY_REPO.git#egg=MY_REPO Install packages from github git+file:///PATH/TO/MY/PROJECT Install packages from local file system Why use these methods? • Ability to quickly test whether changes to another python project worked without having to push to github • This flow is fully local and keeps your git history cleaner • WARNING: When using the local file system method, your changes must be locally committed otherwise pip will not notice the changes.
  23. 23. Why aren't my files updating? Your project may be running pre-compiled .pyc files and not overriding the old files Common .pyc conflict scenarios: • Moved or renamed python directories • Moved or renamed python files • Added or removed __init__.py files • Created a variable with the same name as a module To clear out all .pyc files and start fresh, run the following: $ find . -name '*.pyc' -delete
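On systems without `find`, or inside a Python setup script, the same cleanup can be done with `pathlib`. This is a minimal sketch, assuming a hypothetical `delete_pyc_files` helper that mirrors the shell command.

```python
from pathlib import Path


def delete_pyc_files(root):
    """Delete every .pyc file under root and return how many were removed."""
    # Collect first so the directory tree is not modified while it is walked.
    stale = list(Path(root).rglob("*.pyc"))
    for path in stale:
        path.unlink()
    return len(stale)
```

Calling `delete_pyc_files(".")` from the project root would have the same effect as the `find` command on the slide.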
  24. 24. Why Isn’t the Setup README Working? $ cd MyApp $ cp config/app.yaml.example config/app.yaml cp: No such file or directory $ #Confused #WhatDo
  25. 25. Setup Scripts def install(dirs, args): env = make_virtualenv(dirs['venv_dir']) path = install_node_and_npm(dirs, env['PATH']) env['PATH'] = path install_bower(dirs, env) check_call(['bower', 'install', '--force-latest'], cwd=dirs['root_dir'], env=env) check_call(['pip', 'install', '-q', '--upgrade', 'pip'], cwd=dirs['root_dir'], env=env) check_call(['pip', 'install', '-q', '-r', 'requirements.txt'], cwd=dirs['root_dir'], env=env) return env setup.py
  26. 26. Durable
  27. 27. Should We UnitTest? Common Concerns • Writing tests takes time away from building features • Slows down development • Writing tests is boring Short Answer • Absolutely!!! • Don’t just test the golden-path, also test failure scenarios like 3xx, 4xx, or 5xx errors
  28. 28. Why UnitTests are Awesome Resolution of Time Concern • Requirement: You must verify your code works before shipping • Would you prefer to do that manually every time or automate that process? • Test coverage is an upfront investment saving time later on extra debugging [Chart: Time Spent Testing vs. Application Complexity, comparing Automated and Manual testing]
  29. 29. Why UnitTests are Awesome (2) Resolution of Boring Concern • Writing real code (tests) vs being a click farm [Chart: How Boring, comparing Writing Features, Writing Tests, and Clicking Thru App]
  30. 30. Global PIP Cache Using the Wrong Dependencies? Project X (Flask==0.12) Project Y (Flask==0.8) Common Package Management Problems • Need to use different versions of the same package on another project or branch • Project broke because it was using existing package versions of other projects • Requirements.txt is missing an entry because previous developer already had the package installed Project Z Project W
  31. 31. ~/.venvs/Z ~/.venvs/Y ~/.venvs/W ~/.venvs/X Virtual Environments Project X (Flask==0.12) Project Y (Flask==0.8) Project Z Project W
  32. 32. Virtual Environments (2) Clean & easy dependency management, especially when working on multiple projects $ virtualenv ~/.venvs/X $ source ~/.venvs/X/bin/activate (X)$ pip install -r requirements.txt Setup issues? Just delete your venv and create a fresh one $ rm -rf ~/.venvs/X Why? • Package version conflicts • v5.0.1 != v5.3.0 • Dependency contamination across multiple projects • Testing dependency upgrades
  33. 33. Flush vs Commit Flush Reserves a placeholder spot in the database (pending rows are sent, but not yet permanently saved) Commit Saves records into the database Tips • Use flush when you need to get an auto-increment id back • Use commit ONLY when you are 100% sure you want to save rows • Every time you use `flush` or `commit`, you are adding another round-trip request Problem with always committing def save_test(student, answers): for answer in answers: conn.add(Answer(answer)) conn.commit() # SLOW! grade = calcGrade(answers) << ERROR >> conn.add(Test(student, grade)) conn.commit() Test will not get saved, but every answer will still be in the database!
  34. 34. Flush vs Commit (2) Only commit once def save_test(student, answers): for answer in answers: conn.add(Answer(answer)) grade = calcGrade(answers) << ? ERROR ? >> conn.add(Test(student, grade)) conn.commit() If error, nothing gets saved If no error, everything gets saved Flush (if auto-increment needed) def save_test(student, answers): for answer in answers: conn.add(Answer(answer)) conn.flush() ### use answers[0].id grade = calcGrade(answers) << ? ERROR ? >> conn.add(Test(student, grade)) conn.commit()
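The "only commit once" behaviour can be reproduced with the standard library. This sketch swaps in `sqlite3` for the talk's SQLAlchemy session, and the schema and grading rule are assumptions for illustration: a blank answer makes grading fail after the answer rows have already been inserted.

```python
import sqlite3


def make_db():
    """Create an in-memory database with the two illustrative tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE answers (student TEXT, answer TEXT)")
    conn.execute("CREATE TABLE tests (student TEXT, grade REAL)")
    conn.commit()
    return conn


def calc_grade(answers):
    """Assumed grading rule: share of 'A' answers; blank answers are an error."""
    if any(answer == "" for answer in answers):
        raise ValueError("blank answer")
    return sum(1 for answer in answers if answer == "A") / len(answers)


def save_test(conn, student, answers):
    """Insert all rows in one transaction and commit exactly once."""
    try:
        for answer in answers:
            conn.execute(
                "INSERT INTO answers (student, answer) VALUES (?, ?)",
                (student, answer),
            )
        grade = calc_grade(answers)  # may raise after the inserts above
        conn.execute(
            "INSERT INTO tests (student, grade) VALUES (?, ?)", (student, grade)
        )
        conn.commit()
    except Exception:
        conn.rollback()  # undo the uncommitted answer rows
        raise
```

With a single commit at the end, a failure in grading leaves both tables untouched, which is the all-or-nothing outcome the slide describes.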
  35. 35. Errors & Rollbacks On app error, rollback session @app.errorhandler(Exception) def handle_error_response(error): conn.rollback() return jsonify({'error': str(error)}), 500 After requests, close sessions @app.teardown_appcontext def shutdown_session(exception=None): conn.remove()
  36. 36. Flexible
  37. 37. Project Directory Structure /app /api /models /dbms /utils /config /setup /tests /api /dbms /utils requirements.txt conftest.py run.py /app - Contains all business logic for run-time usage /config - Contains yaml and other configuration files /setup - Contains scripts and seed files for resetting the database and project dependencies /tests - Contains all unit tests, ideally mirroring the directory structure of /app
  38. 38. App Factories app/__init__.py db = SQLAlchemy() def create_app(config_name): app = Flask(__name__, instance_relative_config=True) app.config.from_object(app_config[config_name]) app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False app.secret_key = app.config['SECRET'] cors = CORS(app, supports_credentials=True) db.init_app(app) login_manager.init_app(app) app.register_blueprint(api) return app
  39. 39. App Factories (2) app/__init__.py from flask import Flask app = Flask(__name__) import api import db app/api/__init__.py from app import app, db import endpoint_group1 import endpoint_group2 app/db/__init__.py from app.utils.config import config from sqlalchemy.orm import scoped_session conn = scoped_session(config...) import query_group1 import query_group2 Issues: • Circular imports (prevents testing) • Lack of separation of concerns between app module and app object
  40. 40. Blueprints admins = Blueprint('admins', __name__) @admins.before_request def before_admin_req(): print("ADMIN ROUTE REQUESTED") @admins.route('/admin/secret', methods=['GET']) def admin_secret(): return jsonify({}) app.register_blueprint(admins) public = Blueprint('public', __name__) @public.before_request def before_public_req(): print("PUBLIC ROUTE REQUESTED") @public.route('/public/not_secret', methods=['GET']) def public_not_secret(): return jsonify({}) app.register_blueprint(public) Blueprints can be used to: • Logically group routes • Selectively apply/add request lifecycle events to a subset of routes
  41. 41. Extend and Expire Sessions Every time a user makes a request, extend their session for 30 minutes @app.before_request def extend_session(): session.permanent = True app.config['PERMANENT_SESSION_LIFETIME'] = timedelta(minutes=30) session.modified = True
  42. 42. Flexible Testing (pdb) Typical Python Debugging def my_func(objects): print("MY_FUNC", objects) if len(objects) > 0: print("NOT EMPTY") new_objects = some_func(objects) << ERROR >> print(new_objects) else: print("EMPTY") new_objects = [] return new_objects • Lots of print statements • Only renders prints run before the app crashes • Can't update values Python Debugger import pdb def my_func(objects): pdb.set_trace() if len(objects) > 0: new_objects = some_func(objects) << ERROR >> else: new_objects = [] return new_objects • Sets a break-point at `.set_trace()` • Functions like JS browser debugging break-points • Can step line-by-line or continue until next break-point • At each line, there is an interactive terminal • Not only can print, but update values
  43. 43. APIs That Make Users Happy: Reliable, User Friendly, Fast
  44. 44. Reliable
  45. 45. How to maximize up-time? Preventative (Before break) • Don’t ship bugs • Unit Tests • Staging servers with user testing • A/B rollouts • Stable deployment server Quick Detection • System logging (Splunk) • Slack hooks • Scheduled health check pings • E.g. cron jobs Quick Diagnosis • Failure/Exception tracking + logging • API playground • Postman
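A scheduled health check ping can be a very small script run from cron or a similar scheduler. This is a minimal sketch using only the standard library; the `is_healthy` name and the injectable `opener` parameter are assumptions added so the function can be exercised without a network.

```python
import urllib.error
import urllib.request


def is_healthy(url, timeout=5, opener=urllib.request.urlopen):
    """Return True when the endpoint answers with a 2xx status in time."""
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers HTTP errors, refused connections, DNS failures, and timeouts.
        return False
```

A scheduler entry would call this against the deployment's own URL and post to a Slack hook, or any other alert channel, whenever it returns `False`.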
  46. 46. Versioning Why • New rollouts won’t break people’s existing builds • Backwards compatibility • Ability to plan obsolescence How Host a deployment for each version • Track version # in config files • Prepend version to routes Host one big deployment for all versions • Duplicate files • Create route version mapper
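One way to read "create route version mapper" for the single-deployment option is a lookup table keyed by version and endpoint name, falling back to the newest earlier version when an endpoint did not change. This is a sketch under that assumption; the `versioned` decorator, `resolve` function, and sample handlers are hypothetical and framework-independent.

```python
ROUTES = {}  # (version, endpoint name) -> handler


def versioned(version, name):
    """Register a handler as the implementation of name for one version."""
    def register(handler):
        ROUTES[(version, name)] = handler
        return handler
    return register


@versioned(1, "get_user")
def get_user_v1(user_id):
    return {"id": user_id, "name": "Ada Lovelace"}


@versioned(2, "get_user")
def get_user_v2(user_id):
    return {"id": user_id, "first_name": "Ada", "last_name": "Lovelace"}


@versioned(1, "ping")
def ping_v1():
    return {"ok": True}


def resolve(version, name):
    """Return the handler for the newest version not above the one requested."""
    for candidate in range(version, 0, -1):
        handler = ROUTES.get((candidate, name))
        if handler is not None:
            return handler
    raise LookupError("no handler for %s in v%d or earlier" % (name, version))
```

With the fallback, unchanged endpoints such as `ping` need no duplicated code in version 2, while `get_user` can change its response shape without breaking version 1 clients.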
  47. 47. Analytics on API Usage Why • Shows what endpoints are most frequently used (most critical) • Optimize popular endpoints • Shows what users are still using deprecated endpoints • Notify users to upgrade How • Could be as simple as an incrementing counter in a database table • Could use deployment tools to wrap your API that provide analytics (DevOps level)
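The "incrementing counter in a database table" idea might look like the following. It is a minimal sketch using `sqlite3` as a stand-in database; the `endpoint_hits` table and both function names are assumptions for illustration.

```python
import sqlite3


def make_counter_db():
    """Create an in-memory database holding one counter row per endpoint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE endpoint_hits (endpoint TEXT PRIMARY KEY, hits INTEGER NOT NULL)"
    )
    conn.commit()
    return conn


def record_hit(conn, endpoint):
    """Create the counter row if needed, then increment it."""
    conn.execute(
        "INSERT OR IGNORE INTO endpoint_hits (endpoint, hits) VALUES (?, 0)",
        (endpoint,),
    )
    conn.execute(
        "UPDATE endpoint_hits SET hits = hits + 1 WHERE endpoint = ?", (endpoint,)
    )
    conn.commit()


def most_used(conn, limit=10):
    """Return (endpoint, hits) pairs, busiest first."""
    return conn.execute(
        "SELECT endpoint, hits FROM endpoint_hits "
        "ORDER BY hits DESC, endpoint LIMIT ?",
        (limit,),
    ).fetchall()
```

In a Flask app, `record_hit` could be called from a request hook; the same table also answers which deprecated endpoints are still in use.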
  48. 48. User Friendly
  49. 49. 101 in Endpoint Design Things to be aware of • #1 rule is being consistent • No matter what you pick, some group is going to be annoyed • Their complaints are likely valid, but only for a small component, not how it affects the whole picture • Common patterns: REST, CRUD • Use more than just GET and POST • PUT, DELETE, OPTIONS, PATCH • Use relevant status codes • 1xx info, 2xx success, 3xx redirection, 4xx client errors, 5xx server errors • Figure out if users are often instantly re-forming the data you send them
  50. 50. 101 in Endpoint Documentation If it's not documented, it's as if it doesn't exist Over-document: the biggest complaint is "lack of documentation" You only have to write it once, but it can be read millions of times Interactive docs & test environments are great
  51. 51. Postman A great tool for • Sharing endpoints with your team and users in an interactive way • Quickly debugging API issues Pro-tip Use environment and global variables to generalize endpoint usage across localhost, staging, and production servers
  52. 52. Fast
  53. 53. Python Profiling What lines of code are costing you the most time?
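A quick way to start answering that question is the standard library's `cProfile`, which reports time per function; a line-by-line breakdown would need a separate line-level profiler. The sketch below is an illustration only: `slow_sum` and `profile_call` are hypothetical names, not from the talk.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately simple workload to profile."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile_call(func, *args):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time and keep the five most expensive entries.
    stats.sort_stats("cumulative").print_stats(5)
    return result, buffer.getvalue()
```

Printing the returned report shows call counts and cumulative time per function, which is usually enough to pick the endpoint or query worth optimizing first.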
  54. 54. Bulk SQL Inserts Three useful methods • bulk_save_objects • bulk_insert_mappings • mogrify (psycopg2) ORMs were not primarily intended for bulk usage Using bulk commands gives a 5-10x speed boost # Without bulk for i in range(100000): conn.add(Employee(name="EMP #" + str(i))) conn.commit() # With bulk conn.bulk_save_objects([ Employee(name="EMP #" + str(i)) for i in range(100000) ]) conn.commit()
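The same batching idea exists below the ORM level. This sketch swaps in the standard-library `sqlite3` driver, where `executemany` sends many rows through one prepared statement; the `employees` table and `insert_bulk` helper are assumptions, and no speed-up figure is claimed for this variant.

```python
import sqlite3


def insert_bulk(conn, names):
    """Insert all names with one executemany call and a single commit."""
    conn.executemany(
        "INSERT INTO employees (name) VALUES (?)",
        ((name,) for name in names),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.commit()
insert_bulk(conn, ("EMP #" + str(i) for i in range(1000)))
```

Passing a generator keeps memory flat for large batches, since rows are produced as the driver consumes them.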
  55. 55. Caching Store common responses in-memory for quick retrieval from flask import Flask, render_template from flask_cache import Cache app = Flask(__name__) cache = Cache() cache.init_app(app, config={'CACHE_TYPE': 'simple'}) @app.route('/', methods=['GET']) @cache.cached(timeout=50) def index(): return render_template('index.html')
  56. 56. Devon Bernard, VP of Engineering @ Enlitic. Email: devon@enlitic.com, Twitter: @devonwbernard. Thank you! Any questions?
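What a timeout-based in-memory cache does can be shown in a few lines of plain Python. This is a sketch of the general idea, not the Flask extension's implementation; the `cached` decorator and its injectable `clock` parameter are assumptions, and it only handles hashable positional arguments.

```python
import functools
import time


def cached(timeout, clock=time.monotonic):
    """Cache results per argument tuple for `timeout` seconds."""
    def decorator(func):
        store = {}  # args -> (time stored, value)

        @functools.wraps(func)
        def wrapper(*args):
            now = clock()
            hit = store.get(args)
            if hit is not None and now - hit[0] < timeout:
                return hit[1]  # still fresh: skip the expensive call
            value = func(*args)
            store[args] = (now, value)
            return value

        return wrapper
    return decorator
```

Decorating an expensive function with `@cached(timeout=50)` means repeated calls with the same arguments are recomputed at most once every 50 seconds.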