The document provides guidance on designing a complex web application by breaking it into multiple microservices or applications. It recommends asking questions about team size, traffic patterns, priorities for speed vs stability, existing APIs or libraries, and programming languages. Based on the answers, it suggests appropriate frameworks, languages, data storage, testing/deployment processes, and server/container management options. The overall goal is to modularize the application, leverage existing tools when possible, and not overengineer parts of the design.
System design for Web Application
1. Your client or supervisor wants you to design how you would build a complex web app.
Step 1: Discovery Stage
Main question I think about: "Is it better to build this as a single
application or break it into multiple applications?"
Sub-questions to influence this decision:
1) How many developers will be working on this? How many front-end and how many back-end? Which languages and frameworks are these developers already familiar with?
2) Which part of this application do you think will get the most traffic?
3) How important is the speed of building the project versus the stability of the project? In other words, if you had to choose between launching features very quickly, where customers would find some bugs, and building features at 1/3 or 1/4 of the speed, where not having any bugs is really important, which would you choose? Which bugs would you find acceptable? Which bugs would you not find acceptable?
4) Are there any APIs or micro-services that our team can already utilize, either ones we've built previously or ones you know are available?
5) Any specific preference on which programming language, framework, or test framework to use?
Step 2: Research stage
Research whether there are libraries or APIs that you can leverage that would reduce the complexity of your app. For example, if someone has already built some modules that you can also leverage, explore those options.
For example, if you're doing machine learning work, look to see if there are Python libraries or modules you could leverage.
If there are APIs that already do some critical part of what you need, see if you can leverage those APIs instead of having your team create all of these functions.
Doing early research now will influence how many modules or micro-services you'll end up creating.
Step 3: Choose the programming languages and frameworks for back-end and front-end
Step 4: Choose how you would store data for each app or micro-service
Step 5: Choose your testing/deployment process for each app or micro-service (using continuous integration, continuous delivery, continuous deployment)
(Optional) Step 6: Manage micro-services/apps. Orchestrate their management using containers or traditional web servers.
System Design Process for Web Applications
By Michael Choi (founder of Coding Dojo, Data Compass, Hacker Hero, Village 88)
Listen before jumping in. Ask lots of questions and don't assume.
Research before jumping in.
2. Need anything real time?
Yes
Only build the real-time component as a micro-service using Node.js, Express and socket.
No
Is this a weekend project?
Yes
Use a light framework such as Flask (for Python) or Sinatra (for Ruby), or just script it out without using any framework.
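As a rough illustration of "just script it out without using any framework", the sketch below serves a single JSON endpoint using only Python's standard-library `wsgiref` module. The `/hello` route, the port number, and the function names `app` and `serve` are assumptions made for this example; they do not come from the slides.

```python
import json
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Tiny WSGI app: one hypothetical route, no framework."""
    if environ.get("PATH_INFO") == "/hello":
        status, payload = "200 OK", {"message": "hello from a weekend project"}
    else:
        status, payload = "404 Not Found", {"error": "not found"}
    body = json.dumps(payload).encode("utf-8")
    start_response(status, [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def serve(port=8000):
    """Run the app locally; the port is an arbitrary choice."""
    with make_server("127.0.0.1", port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts a local development server. Once the project outgrows a weekend, moving the same handler logic into Flask or a larger framework is a small step.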
How many full-time back-end engineers will be working on this project?
No
Less than ~10?
Which is more important? Speed or stability?
Speed
Use MVC frameworks with strong test case support, such as Rails. RSpec is amazing!!!
Stability
Use either Django (Python), Laravel/CodeIgniter/Zend (PHP), Spring (Java), .NET Core (for C#), etc. Choose based on what your team members are already familiar with. "All of these are practically the same and not that different" is what you should think, although be careful about saying this out loud to other developers, as it could cause unnecessary heated arguments about which framework is better and why. Listen to them and have them pick the one that the team is most comfortable with.
Need any machine learning libraries? Or need to do a lot of math computations?
Strongly consider going with Python, as Python has great libraries for machine learning and math/statistics. Use any MVC framework such as Django.
Yes
No
Find ways to modularize the app into multiple services/apps (the fancier term for this is designing "micro-services"). Each service/app can communicate with the others through API calls. As a rule of thumb, try to keep each service/app to fewer than 10 back-end engineers.
Once you have broken the app into several "micro-services" or modules, go through the process for each module.
More than ~10 engineers
Step 3: Choosing programming languages/frameworks for back-end and front-end
Design how your micro-service will communicate with other micro-services using API calls. Figure out how you will authenticate each request (to prevent unauthorized parties from requesting and retrieving sensitive information). Figure out which API methods you will make available and what type of response will be sent back (HTML, JSON, XML, etc.).
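One possible way to authenticate requests between services is a shared-secret signature. The sketch below signs the request body with HMAC-SHA256 and rejects any request whose signature does not match. The slides do not prescribe a scheme, so `SHARED_SECRET`, `sign_request`, `verify_request`, `handle_request`, and the `order_id` payload are all hypothetical names chosen for illustration.

```python
import hashlib
import hmac
import json
from typing import Tuple

# Hypothetical shared secret; in practice it would come from a secrets store.
SHARED_SECRET = b"replace-me"


def sign_request(body: bytes, secret: bytes = SHARED_SECRET) -> str:
    """Return a hex HMAC-SHA256 signature for the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_request(body: bytes, signature: str, secret: bytes = SHARED_SECRET) -> bool:
    """Check the signature using a constant-time comparison."""
    expected = sign_request(body, secret)
    return hmac.compare_digest(expected, signature)


def handle_request(body: bytes, signature: str) -> Tuple[int, str]:
    """Hypothetical API method: returns (HTTP status, JSON response)."""
    if not verify_request(body, signature):
        return 401, json.dumps({"error": "unauthorized"})
    order = json.loads(body)
    return 200, json.dumps({"order_id": order["order_id"], "status": "received"})
```

The calling service would compute `sign_request(body)` and send it alongside the body, for example in a request header. The receiving service recomputes the signature and returns a JSON response in either case.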
For each HTML page that's rendered, how many lines of JavaScript would you expect?
Use frameworks such as React or Angular.
Does your app need to be a single-page application?
No need to use large frameworks such as React or Angular, as this could significantly slow down how fast the app can be built. Use plain JavaScript supplemented by lightweight libraries.
If there are frameworks that allow very quick build-out, listen to your engineers and use those frameworks, but make sure these frameworks are not overkill for organizing a small amount of JavaScript code.
Yes
Not sure
Thousands of lines of JavaScript per HTTP response!
Less than 1,000 lines of JavaScript code per HTML file or HTTP response
Utilize the strengths of each language/framework.
Have a small team of engineers per project to increase efficiency.
Connect multiple apps/micro-services using APIs.
Don't over-engineer the front-end portion of the app.
Don't over-engineer. Test cases reduce the number of bugs and are important when reliability is more important than speed. If speed is more important, it's okay to start out without writing extensive test cases.
3. Any data (large files, long texts/blobs) that could be stored in services like Amazon S3?
Store this large data as files in Amazon S3. Protect it so that other people can't simply browse the files in your S3 bucket.
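A minimal sketch of one way to protect a bucket: block all public access, then hand out short-lived download links for individual files. The helper names `lock_down_bucket` and `temporary_download_url` are hypothetical, and `s3_client` is assumed to be a boto3 S3 client (for example, `boto3.client("s3")`) created elsewhere with suitable credentials.

```python
def lock_down_bucket(s3_client, bucket: str) -> None:
    """Block all public access to the bucket.

    `s3_client` is assumed to be a boto3 S3 client.
    """
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )


def temporary_download_url(s3_client, bucket: str, key: str, seconds: int = 300) -> str:
    """Return a short-lived URL so one user can fetch one private file."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=seconds,
    )
```

With this arrangement the bucket itself is never browsable. The application decides who may download a file and issues a link that expires after a few minutes.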
Do you need a relational or non-relational database?
How big will the database get in, say, the next 3-5 years?
Use MySQL, Postgres, MariaDB, etc. Use the one that the team prefers.
Consider using services like Amazon RDS. Check how much data RDS can hold (16 TB as of Apr 2020).
Use Oracle: needs lots of money. Millions. Usually not the right option when you're starting a brand new project.
Use Hadoop: could be hard to set up and is usually not the right option for any web application unless you're consistently analyzing 16 TB+ of data. Look into tools such as Apache Hive, which adds SQL-style querying on top of Hadoop, if you want something easier to work with.
Don't even consider using SQLite or any other light databases for production.
1 TB or less
16+ TB
Way more than 16+ TB
< 16 TB
Any data that you're okay with not storing permanently (e.g. real-time broadcasting)?
Store it in a memory-based database such as Redis, Memcached, etc.
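To make the idea concrete, here is a minimal in-process sketch of ephemeral storage with expiry, similar in spirit to setting a time-to-live on a key in Redis. `EphemeralStore` and its methods are hypothetical names for this illustration, not the Redis or Memcached API. A real deployment would use one of those services so that several app servers can share the data.

```python
import time


class EphemeralStore:
    """In-memory key/value store whose entries expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be tested
        self._items = {}      # key -> (value, expires_at)

    def set(self, key, value, ttl_seconds):
        """Store a value that should disappear after ttl_seconds."""
        self._items[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        """Return the value, or default if it is missing or expired."""
        entry = self._items.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # drop stale data lazily on read
            return default
        return value
```

Nothing here touches disk, which is why this kind of storage suits data such as live broadcast state that can be thrown away once it has been served.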
Needs to be stored permanently
Okay to delete once the data is served
How big will the database get in, say, the next 3-5 years?
Use a non-relational database such as MongoDB.
Consider using services available in Amazon AWS, such as Apache Hive (offered through Amazon EMR), or other cloud services that can scale your non-relational database for you.
1 TB or less
Bigger
Relational
Non-relational
Step 4: Choose how you would store data for each app or micro-service
Reference urls:
https://aws.amazon.com/products/databases/
Don't crowd your database with things that can be stored elsewhere.
Don't over-engineer. Scale things as the project evolves, but know what the next steps would be if you needed to scale.
Storing things in memory is significantly faster than storing them on disk.
4. Use GitHub and create a repository for each micro-service you're creating.
Your developers fork the repository, work on the specific version they are asked to work on, and submit pull requests for features they've completed, along with test cases for the new features they've built.
Engineers are instructed to run the test cases locally and make sure they pass before they submit a pull request. This can also be done automatically using pre-commit hooks.
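A pre-commit hook can be any executable script. Git runs it before each commit and aborts the commit if the script exits with a non-zero status. The sketch below is one way to write such a hook in Python. `TEST_COMMAND` is an assumed test runner (here the standard-library `unittest` discovery) and should be replaced with whatever the project actually uses.

```python
import subprocess
import sys

# Assumed test command; swap in the project's own runner (pytest, rspec, ...).
TEST_COMMAND = [sys.executable, "-m", "unittest", "discover"]


def run_tests(command=TEST_COMMAND) -> int:
    """Run the test suite and return its exit code (0 means all passed)."""
    return subprocess.run(command).returncode


def main(command=TEST_COMMAND) -> int:
    """Return the exit status the hook should finish with."""
    code = run_tests(command)
    if code != 0:
        print("Tests failed; commit aborted.", file=sys.stderr)
    return code
```

To use it as a hook, save the script as `.git/hooks/pre-commit`, make it executable, add a `#!/usr/bin/env python3` first line, and end the file with `sys.exit(main())`.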
Use GitFlow. For each repo, the engineering lead for that project creates branches, where each branch is labeled with the major version (e.g. 1.0, 1.1, 1.2, 1.2a, 2.0, 2.1, etc.). Other developers are not given the authority to create branches in the main/origin repository.
You have a tool such as Jenkins set up so that it listens for a pull request being submitted. Jenkins automatically tests whether the pull request breaks any existing test cases on the staging server.
Did the pull request pass all test cases in staging?
The pull request doesn't go through and the developer is asked to fix their code.
Engineering lead reviews the code.
Code is good?
Did it come with test cases?
Was it documented well?
Developer is asked to re-do and re-submit.
Engineering lead approves the pull request! And gives the developer a pat on the back.
no
yes
no
yes
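The pull-request gate described above can be summarized as a small decision function. This is only a sketch of the flow in the slides; the `PullRequest` fields and the returned messages are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical summary of a pull request's state."""
    passes_staging_tests: bool
    code_is_good: bool
    has_test_cases: bool
    is_documented: bool


def review(pr: PullRequest) -> str:
    """Return the next step for a pull request, following the flow above."""
    # Automated gate first: the CI tool rejects anything that breaks tests.
    if not pr.passes_staging_tests:
        return "rejected: developer fixes the code"
    # Human gate second: the engineering lead checks quality, tests, docs.
    if not (pr.code_is_good and pr.has_test_cases and pr.is_documented):
        return "changes requested: developer re-does and re-submits"
    return "approved: merge into the appropriate branch"
```

Putting the automated check before the human review means the engineering lead only spends time on pull requests that already pass the existing test cases.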
Continuous Integration
Pull request is merged into the appropriate branch.
Continuous Delivery
Services such as Jenkins listen for changes in the branch.
Branch updated?
Staging site is updated automatically!
QA team tests the staging site thoroughly to identify bugs not caught by test cases. They do this using tools such as Selenium.
QA lead and the engineering lead are happy with how staging is working.
Approve rollout to production?
Engineers update production with the latest code. Good job!
Production site is updated automatically!
No
Continuous deployment set up?
Yes
No
Yes
Set up GitHub
Step 5: Choose your testing/deployment process for each app or micro-service
Continuous Deployment
Need to support multiple versions of your product? Or would this app/micro-service require 5+ full-time back-end engineers to work on it for months?
Use GitHub Flow. Only one main development branch: master. Simpler and great for simple/smaller projects. You can still use pre-commit hooks to make sure all test cases pass locally before code is committed.
No
Yes
Reference urls on Git/GitLab flow:
https://nvie.com/posts/a-successful-git-branching-model/
https://pradeeploganathan.com/git/git-branching-strategies/
https://sigmoidal.io/automatic-code-quality-checks-with-git-hooks/
Use version control as a collaboration tool. Don't over-complicate how you use Git. Tools such as Jenkins, connected with Git hooks, can help you deploy apps much faster and reduce bugs.
5. (Optional) Step 6: Manage servers or containers
Option 1: Servers
These tools/services may be helpful for managing your servers.
Amazon AWS / Microsoft Azure / Google Cloud: places to rent and set up your own web server.
AWS Elastic Beanstalk: orchestration service for deploying applications and orchestrating EC2, S3, Simple Notification Service, Elastic Load Balancers, etc.
Azure App Service: Microsoft's service for managing applications and servers created from Azure.
Option 2: Containers
These tools/services may be helpful for managing your containers.
Docker: used for creating containers.
Kubernetes: open-source container-orchestration system for automating application deployment, scaling, and management. Originally designed by Google.
Amazon Elastic Container Service (ECS): a fully managed container orchestration service.
Amazon Elastic Kubernetes Service (EKS): fully managed Kubernetes service from Amazon AWS.
AWS Fargate: a serverless compute engine for Amazon ECS and EKS that allows you to run containers without having to manage servers or clusters.
Azure Kubernetes Service (AKS): fully managed Kubernetes service from Microsoft Azure.
IBM Cloud Kubernetes Service: fully managed Kubernetes service from IBM.
Rancher: open-source multi-cluster orchestration platform; lets operations teams deploy, manage and secure enterprise Kubernetes.
Tools that can be used with either option
Elasticsearch: search engine based on the Lucene library. Built in Java, it can be used to search all kinds of documents, including log files. Offers near-real-time search and supports multitenancy.
Kibana: lets you visualize your Elasticsearch data and navigate the Elastic Stack so you can do
anything from tracking query load to understanding the way requests flow through your apps.
RabbitMQ: open-source message-broker software (sometimes called message-oriented
middleware)
Containerization is the future. Use containers if you can, unless you're supporting an older architecture that's not using containers.
Keep learning. New tools/services are being introduced all the time! This area of how apps are managed and deployed in the cloud is going through lots of transformations. Don't panic when you come across a new cloud service you haven't heard of. Stay calm: everything boils down to fundamentals, and you can pick up the appropriate services when you need to.