From prototype
to production
The journey of re-designing SmartUp.io
About Me - whoami
Mate Lang
CTO @ SmartUp.io
● Extrovert geek and tech lover
● Joined SmartUp.io in Q2 2016
● Adores Linux and OSS, and is a general minimalist
● Mining the bits in the startup & product
mines, but has been through the valleys of the
outsourcing & consultancy scene as well
About SmartUp.io - Who we are
● Startup company with a mobile-first,
gamified, social micro-learning SaaS
○ Initial goal is to help entrepreneurs launch and
take their startup companies to success
without bothering VCs with the exact same
questions
○ Found out that we have a much wider use case:
reach, engage, train & inspire communities to
facilitate learning
● Got a lot of attention from clients and
investors
SmartUp.io - Motivation
● Deloitte predicts huge
disruption opportunity
in corporate learning
sector
Company’s average
net promoter score
for their LMS?
-8
What is this talk about
● How should I (the geek) attack the problem domain to be successful?
● How did we decompose and redesign a “complex” (a.k.a. messy) monolith into
maintainable microservices?
● How did we cope with NFRs like security and performance?
Fact #1 - In software the only constant thing is change
Fact #2 - Geek is the new sexy ;)
Market driven evolution
● Initial product is simple media consumption
○ Users learn by consuming content and competing with each other
○ SmartUp content team writes content on protected administration webapp
○ Users are on the same platform, without isolation
● ACME Inc comes along with a proposal to use the platform internally for learning
○ Users are isolated for ACME Inc and FooBar Inc (multi-tenancy)
○ Each company needs to be able to write its own content on its “slice” of the platform
● Introduce Communities = isolated instances of our platform, suited to serve the
needs of business consumers
Version 1 (Wondeer)
What we had vs. what we wanted to sell
But most importantly...
This is what we had
under the hood
Plan is to rewrite from scratch
Team’s estimation of backlog = ~ 7 months
Actual delivery time = 10 months (estimated: 7 months)
Version 2 (Brontocorn)
Cool :)
Let’s rewrite...
Where could it go
wrong?
Right from architecting
Understanding your domain
● Clients use a “smart hack” on V1 to obtain a predefined order of their published
learning material
● By default the platform facilitates “feed-like” behaviour
○ no explicit ordering
○ potentially infinite list of items
● The right question to ask as an engineer:
Why would our clients ever
want that?
Understanding your domain
● Clients want a course-like structure
● That is substantially different from our individual publishing model
○ The content is designed from the beginning in an ordered fashion
○ The completion should be in an ordered fashion
○ Analytics should be able to correlate between completion records
○ It deserves its own management
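The ordered-completion requirement above can be sketched in a few lines. This is a minimal illustration only: the class name `CourseProgress`, its methods and the percentage figure are hypothetical and do not come from the SmartUp codebase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: a course is an ordered list of content ids
// that has to be completed in exactly that order.
class CourseProgress {
    private final List<String> orderedItems;
    private int completedCount = 0; // items [0, completedCount) are done

    CourseProgress(List<String> orderedItems) {
        this.orderedItems = new ArrayList<>(orderedItems);
    }

    // Accepts the completion only if itemId is the next item in course order.
    boolean complete(String itemId) {
        if (completedCount < orderedItems.size()
                && orderedItems.get(completedCount).equals(itemId)) {
            completedCount++;
            return true;
        }
        return false;
    }

    boolean isFinished() {
        return completedCount == orderedItems.size();
    }

    // Completed share in whole percent, the kind of figure analytics could use.
    int percentComplete() {
        return orderedItems.isEmpty() ? 100 : 100 * completedCount / orderedItems.size();
    }

    public static void main(String[] args) {
        CourseProgress progress = new CourseProgress(Arrays.asList("intro", "pricing", "pitch"));
        System.out.println(progress.complete("pricing")); // out of order, so it is refused
        System.out.println(progress.complete("intro"));
        System.out.println(progress.percentComplete() + "%");
    }
}
```

A feed has no such rule, which is why a course probably deserves its own model rather than a hack on top of the feed.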
Let’s get geeky
Geek team shopping list for the rewrite
● Scalable and maintainable codebase on all platforms
● Needs to be microservice architecture, because everything is easier with them
(this is a fat lie)
● Automate everything that is possible to automate
● Detailed and helpful documentation
● Needs to ship in a continuous fashion with Docker because Docker is cool (it
actually is)
● Needs to have Infrastructure-as-Code
● Needs to use managed solutions
● Web client and mobile apps usable by your grandma
Tech stack
Microservices - the why
Google Trends - search for term “microservice”
● Haters gonna hate, but there is undeniable interest & adoption in
engineering-led companies
● Microservices take common clean code (SOLID) concerns to system level
○ SRP (Single Responsibility Principle) - a service should be concerned with a single coherent domain
○ OCP (Open/Closed Principle) - behaviour is extended by encapsulating a service in higher-level services.
A change in a remote context should not force a change in a given service.
○ ISP (Interface Segregation Principle) - introduce edge services: specialized backends for clients
● The above seems like common sense, but engineers still fail at designing such
systems
Microservices - the reason we fail
“Often stepping back
you see more, don’t you?”
David Hockney
painter, draughtsman, printmaker, stage designer and photographer
Microservices - my fast service design test
● To find out whether you have (not) designed it correctly (false
positives may appear), take the following test
Test for your service design
1. Imagine you have to open source your service. Are you able to do so without
making code changes, while staying useful to an engineer outside your business
domain? …………………………………
------------------------------ END OF TEST ------------------------------
Let’s see a concrete example
Meet the
“Leaderboard Service*”
*(the tech scene ran out of deity names to name services after,
so no Zeus or Hydra for you)
The leaderboard - proposition
● Service responsible for managing leaderboards
● A leaderboard is an ordered list of players associated with a score
● It needs to allow the near real-time update of such boards, based on individual
score change events
● Allows the player to check the “transaction history” not just the aggregated
state
● It allows the fast retrieval of a certain segment of the board
○ Top X
○ Around X ( X-10, X, X+10)
Something
like this...
But a bit more modern...
Like this
The leaderboard - collecting points
[Diagram: Card, Points, Card]
The leaderboard - under the hood
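public interface LeaderboardService {
    // Creates a new leaderboard
    LeaderboardCreationResponseDto createLeaderboard(LeaderboardCreationRequestDto requestDto);
    // Returns the top `size` entries of the board
    LeaderboardDto retrieveTopNLeaderboard(String leaderboardId, long size);
    // Returns the segment of the board around the given player
    LeaderboardDto retrieveAroundNLeaderboard(String leaderboardId, String playerId, long size);
    // Returns a single player's entry
    PlayerDto retrievePlayer(String leaderboardId, String playerId);
    // Records a score update; the player is created lazily on the first update
    void updatePlayerScore(String leaderboardId, String playerId, long score);
}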
● Note that this interface is agnostic of all SmartUp-related logic
● Could be reused in any situation where you want to represent entities in a
sorted order by score
● No matter what the player entity is, it is created lazily when you first update its
score
● Currently used to incentivize consumption, but can be applied to groups of
players, content creators, etc.
The leaderboard - under the hood
● Thanks to event sourcing we can always reconstruct state in case of failure
● Redis is in-memory. HA or not, should it ever go down, a state update from the event log would
effectively restore our read-optimized model
● The number of Score Processors scales with the data volume
● Correctness of concurrent state updates is guaranteed by DynamoDB Stream shards
● In case of processor failure, the unprocessed events get picked up once the service is
restored, all within a couple of minutes (for up to a couple of weeks of staying
behind)
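The recovery argument above can be shown with a minimal event-sourcing sketch. The names `ScoreProjection` and `ScoreChanged` are hypothetical, a plain list stands in for the durable event log, and a map stands in for the in-memory read model; the real system's event shape is not shown in the deck.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical projection: an append-only log is the source of truth,
// the totals map is a disposable read-optimized view.
class ScoreProjection {
    private final List<ScoreChanged> log = new ArrayList<>();
    private Map<String, Long> totals = new HashMap<>();

    // Appends the event to the log and applies it to the current view.
    void record(ScoreChanged event) {
        log.add(event);
        apply(totals, event);
    }

    private static void apply(Map<String, Long> view, ScoreChanged event) {
        view.merge(event.playerId, event.delta, Long::sum);
    }

    // Simulates losing the in-memory read model.
    void loseReadModel() {
        totals = new HashMap<>();
    }

    // Rebuilds the read model by replaying every event in order.
    void rebuild() {
        Map<String, Long> fresh = new HashMap<>();
        for (ScoreChanged event : log) {
            apply(fresh, event);
        }
        totals = fresh;
    }

    long total(String playerId) {
        return totals.getOrDefault(playerId, 0L);
    }

    // The "transaction history": every individual change for one player.
    List<Long> history(String playerId) {
        List<Long> deltas = new ArrayList<>();
        for (ScoreChanged event : log) {
            if (event.playerId.equals(playerId)) {
                deltas.add(event.delta);
            }
        }
        return deltas;
    }

    public static void main(String[] args) {
        ScoreProjection projection = new ScoreProjection();
        projection.record(new ScoreChanged("ann", 10));
        projection.record(new ScoreChanged("ann", 7));
        projection.loseReadModel();
        projection.rebuild();
        System.out.println(projection.total("ann") + " from " + projection.history("ann"));
    }
}

// Hypothetical event: one immutable score change for one player.
final class ScoreChanged {
    final String playerId;
    final long delta;

    ScoreChanged(String playerId, long delta) {
        this.playerId = playerId;
        this.delta = delta;
    }
}
```

Because the view is derived purely from the log, losing it costs replay time, not data.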
The leaderboard - the gain
The leaderboard
QC: PASS
Another example...
Meet the Content Service
The Content Service - The proposition
A service that handles the creation and modification of versioned learning material.
Also enables versions to be instantiated and completed by consumers.
Encapsulates both structural and behavioural functionalities, like:
- Structural
- Question text
- Limited number of answers
- Solution explanation
- Behavioural
- Single-choice
- Multiple-choice
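One way to read the structural/behavioural split is sketched below. All names (`Question`, `AnswerBehaviour`, `MAX_ANSWERS`) and the answer limit of 6 are assumptions made for illustration; the deck only names the two aspects and their examples.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical model: structure is plain data, behaviour is a pluggable rule.
class Question {
    static final int MAX_ANSWERS = 6; // assumed limit, not from the deck

    // Single-choice: exactly one answer chosen, and it is a correct one.
    static final AnswerBehaviour SINGLE_CHOICE =
            (correct, chosen) -> chosen.size() == 1 && correct.containsAll(chosen);
    // Multiple-choice: the chosen set must match the correct set exactly.
    static final AnswerBehaviour MULTIPLE_CHOICE =
            (correct, chosen) -> chosen.equals(correct);

    final String text;               // structural: question text
    final List<String> answers;      // structural: limited number of answers
    final String explanation;        // structural: solution explanation
    final Set<Integer> correct;      // indexes of the correct answers
    final AnswerBehaviour behaviour; // behavioural: how a submission is judged

    Question(String text, List<String> answers, String explanation,
             Set<Integer> correct, AnswerBehaviour behaviour) {
        if (answers.size() > MAX_ANSWERS) {
            throw new IllegalArgumentException("too many answers: " + answers.size());
        }
        this.text = text;
        this.answers = answers;
        this.explanation = explanation;
        this.correct = correct;
        this.behaviour = behaviour;
    }

    // Judges a submission given as the indexes of the chosen answers.
    boolean submit(Integer... chosen) {
        return behaviour.isCorrect(correct, new HashSet<>(Arrays.asList(chosen)));
    }

    public static void main(String[] args) {
        Question q = new Question("2 + 2 = ?", Arrays.asList("3", "4", "5"),
                "Basic arithmetic.", new HashSet<>(Arrays.asList(1)), SINGLE_CHOICE);
        System.out.println(q.submit(1));
        System.out.println(q.submit(0, 1));
    }
}

// Hypothetical behaviour contract.
interface AnswerBehaviour {
    boolean isCorrect(Set<Integer> correct, Set<Integer> chosen);
}
```

Keeping the rule separate from the data means a new question type is a new behaviour, not a new data model.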
The Content Service - Under the hood
The Content Service - the gain
● Because the Context is an abstract entity, we can support many use cases for
consumption rules and stay resilient to change
○ Sharing of consumption records
○ You can redo a piece of content in certain circumstances (e.g. exam mode has a separate context)
● Feedback from our Head Of Content after going live
“Fast, smooth and easy. And really fast. And damn, this thing's fast...”
● The design enables easy clean-up should we ever need it.
For now we store every change a content creator makes.
“Because I can” - Dr. Bob Kelso, Scrubs
Tuning your
microservices, cause
with high throughput
comes high latency
Tuning your microservices
● Simple operation: check the user's
credentials and request a JWT
● Initial results: not too bad
● P95 response time: 701 ms
Tuning your microservices
● Scale it up
○ 1 OAuth Service
○ 2 User Services
● P95 at ~35,000 ms
● Almost all requests take more than 1
second to respond
● 40% of requests FAILED
THAT DOES NOT MAKE SENSE!
Tuning your microservices
● Found out there was no HTTP connection pooling -> TCP handshake penalty
● Update Spring Cloud to Edgware
● Set correct timeouts for Ribbon and Hystrix
● Reduce (yes, reduce) Tomcat resources
○ Max-Threads
■ The maximum number of request processing threads to be created
○ Max-Connections
■ The maximum number of connections that the server will accept and
process at any given time
○ Accept-Count
■ The maximum queue length for incoming connection requests when all
possible request processing threads are in use
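Why reducing resources can help is easier to see with a bounded worker pool. The sketch below is only an analogy built on the JDK's `ThreadPoolExecutor`, not Tomcat itself: `workers` plays the role of Max-Threads and `queueLength` the role of Accept-Count, and the numbers are made up. Work that fits neither is refused at once instead of piling up behind slow requests.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Analogy only: a fixed pool with a bounded queue that rejects excess work.
class BoundedPoolSketch {

    // Submits `requests` tasks that all stay busy until released and
    // returns how many of them were rejected immediately.
    static int countRejected(int workers, int queueLength, int requests) {
        CountDownLatch release = new CountDownLatch(1);
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueLength));
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow downstream call
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // fail fast: neither a thread nor a queue slot is free
                }
            }
        } finally {
            release.countDown(); // let the blocked tasks finish
            pool.shutdown();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected: " + countRejected(2, 2, 6));
    }
}
```

With a small pool and a short queue, overload shows up as quick rejections that a caller can retry, rather than as multi-second latencies and timeouts.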
Tuning your microservices
● 0 Failed Requests
● Mean Req/s: ~50
● P95 latency: 148 ms
Tuning your microservices
Sometimes
If microservices have
not solved your
engineering problems...
Try hiring a
“DevOps Engineer”
As per Wikipedia
DevOps =
a software engineering practice that aims at
unifying software development (Dev) and
software operation (Ops).
Automate
EVERYTHING!
Bronicorn Release
● Went live October 3, 0600 RO time
● Development environment was used for previous 10 months
● No other environment due to cost reasons
● On the day of release we created
○ Staging (Acceptance Testing) environment
○ Production
45 minutes difference between
deployments (5 minutes active)
How is that possible?
Infrastructure-as-Code
Infrastructure as Code
Definition
Infrastructure as code (IaC) is the process of managing and provisioning computer data
centers through machine-readable definition files, rather than physical hardware
configuration or interactive configuration tools. (as per Wikipedia)
● Just a bunch of shell scripts
● Modern provisioning tools like: Chef, Puppet, Ansible
● Cloud-ready IaC management: AWS CloudFormation, HashiCorp Terraform
Meet Terraform
● DSL based using HCL (HashiCorp Configuration Language)
● Module oriented
● Manages dependencies between resources (e.g. DNS depends on IP)
● Nice interpolation syntax
● Natively manages multiple environments through configuration (e.g. instance
types differ from env to env)
● Workflow = Plan > Review > Apply
● Configurable state backends
● Since v0.10, supports pluggable “providers” (e.g. AWS, GCP)
Meet Terraform
Src: Terraform Homepage
Our way of Terraforming
● 4 AWS VPCs
○ Services VPC (For maintenance, and unified connection to other VPCs)
○ SmartUp VPCs (e.g. Dev, Stg, Prod)
● Each service owns its own module along with dependencies
E.g. Leaderboard:
○ Redis
○ Queues
○ DynamoDB Tables & Streams
○ etc
● Peering module to connect Services <-> SmartUp
● Using encrypted S3 for safe state storage
Managing service configuration
● Services pick up their configuration exclusively at runtime
● No mvn package -Pdev|stg|prod
● Build one artifact (Docker image) and use it everywhere
● Consul for service discovery and configuration storage
● Terraform injects properties into Consul upon execution
● No configuration done in YML
● Using Spring Cloud Config to pick these values up from Consul upon startup
and to check periodically for changes
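The "one artifact, configuration at runtime" idea can be sketched without any framework. `RuntimeConfig` is a hypothetical class, the `Supplier` stands in for a lookup against a key/value store, and the property names are invented; this is not how Spring Cloud Config is implemented, only the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical holder: the same build reads whatever the environment's store contains.
final class RuntimeConfig {
    private final Supplier<Map<String, String>> source; // stand-in for a key/value store
    private volatile Map<String, String> snapshot;

    RuntimeConfig(Supplier<Map<String, String>> source) {
        this.source = source;
        refresh(); // load once at startup
    }

    // Re-reads all values; a real service would call this on a schedule.
    void refresh() {
        snapshot = new HashMap<>(source.get());
    }

    String get(String key, String fallback) {
        return snapshot.getOrDefault(key, fallback);
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();
        store.put("leaderboard.redis.host", "redis.dev.internal"); // invented property
        RuntimeConfig config = new RuntimeConfig(() -> store);
        System.out.println(config.get("leaderboard.redis.host", "localhost"));

        store.put("leaderboard.redis.host", "redis.prod.internal");
        config.refresh(); // periodic check picks up the change
        System.out.println(config.get("leaderboard.redis.host", "localhost"));
    }
}
```

Nothing environment-specific is baked into the build, so promoting an artifact from dev to production needs no rebuild.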
Each team is responsible for
their delivery process
From design to production
Let’s put it to production
● Loads of deploys in a geek’s life, so better make them simple
● A good pipeline will
○ Provide fast feedback before PR integration (build, test & check infra dependencies)
○ Deploy changes to dev ASAP (fail-fast)
○ Streamline production releases to prevent human error
○ Keep a clear separation between steps
● Preferably define the whole pipeline as code
● We decided to use CircleCI with YAML-based Workflows
Kudos to the team
Thank you for your
kind attention!
Mate Lang
CTO @ smartup.io
mate@smartup.io
twitter: @langmate
medium: @matelang

From prototype to production - The journey of re-designing SmartUp.io

  • 1.
  • 2. From prototype to production The journey of re-designing SmartUp.io
  • 3. ● Extrovert geek and tech lover ● Joined SmartUp.io @ Q2 2016 ● Adores Linux, OSS, is a general minimalist ● Mining the bits in the startup & product mines, but been through the valleys of outsourcing & consultancy scene as well About Me - whoami Mate Lang CTO @ SmartUp.io
  • 4. About SmartUp.io - Who we are ● Startup company with a mobile-first, gamified, social micro-learning SaaS ○ Initial goal is to help entrepreneurs launch and take their startup companies to success without bothering VCs with the exact same questions ○ Found out, we have a much wider use case: reach, engage, train & inspire communities to facilitate learning ● Got a lot of attention from clients and investors
  • 5. SmartUp.io - Motivation ● Deloitte predicts huge disruption opportunity in corporate learning sector
  • 6. SmartUp.io - Motivation ● Deloitte predicts huge disruption opportunity in corporate learning sector Company’s average net promoter score for their LMS?
  • 7. SmartUp.io - Motivation ● Deloitte predicts huge disruption opportunity in corporate learning sector Company’s average net promoter score for their LMS? -8
  • 8. What is this talk about ● How should I (the geek) attack the problem domain to be successful? ● How did we decompose and redesign a “complex” (a.k.a. messy) monolith into maintainable microservices? ● How did we cope with NFRs like security and performance?
  • 9. Fact #1 - In software the only constant thing is change
  • 10. Fact #1 - In software the only constant thing is change Fact #2 - Geek is the new sexy ;)
  • 11. Market driven evolution ● Initial product is simple media consumption ○ Users learn by consuming content and competing with each other ○ SmartUp content team writes content on protected administration webapp ○ Users are on the same platform, without isolation ● ACME Inc comes along with proposal to use the platform internally for learning ○ Users are isolated for ACME Inc and FooBar Inc (multi-tenancy) ○ Company needs to be able to write their own content on their “slice” of the platform ● Introduce Communities = isolated instance of our platform suitable to serve the needs for business consumers
  • 13. Market driven evolution What we had What we wanted to sell
  • 15. Market driven evolution This is what we had under the hood
  • 16. Plan is to rewrite from scratch Team’s estimation of backlog = ~ 7 months
  • 17. Plan is to rewrite from scratch Actual delivery time = 10 months (the estimate was 7 months)
  • 20. Where could it go wrong?
  • 21. Where could it go wrong? Right from architecting
  • 23. Understanding your domain ● Clients use a “smart hack” on V1 to obtain a predefined order of their published learning material ● By default the platform facilitates “feed-like” behaviour ○ no explicit ordering ○ potentially infinite list of items ● The right question to ask as an engineer: Why would our clients ever want that?
  • 24. Understanding your domain ● Clients want a course-like structure ● That is substantially different from our individual publishing model ○ The content is designed from the beginning in an ordered fashion ○ The completion should be in an ordered fashion ○ Analytics should be able to correlate between completion records ○ It deserves its own management
  • 25. Understanding your domain ● Clients want a course-like structure ● That is substantially different from our individual publishing model ○ The content is designed from the beginning in an ordered fashion ○ The completion should be in an ordered fashion ○ Analytics should be able to correlate between completion records ○ It deserves its own management VS
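To make the difference from a feed concrete, here is a minimal sketch of a course whose items are authored in a fixed order and can only be completed in that order. The class and method names are hypothetical and it tracks a single learner only; it illustrates the idea and is not SmartUp's actual model.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical course model: items are authored in order and must be completed in order. */
final class Course {
    private final List<String> itemIds;                       // authored order
    private final List<String> completed = new ArrayList<>(); // completion records, in order

    Course(List<String> itemIds) {
        this.itemIds = List.copyOf(itemIds);
    }

    /** The next item the learner may complete, or null once the course is finished. */
    String nextItem() {
        return completed.size() < itemIds.size() ? itemIds.get(completed.size()) : null;
    }

    /** Records a completion and rejects any attempt to skip ahead. */
    void complete(String itemId) {
        if (!itemId.equals(nextItem())) {
            throw new IllegalStateException("Expected " + nextItem() + " but got " + itemId);
        }
        completed.add(itemId);
    }

    /** Fraction of the course completed so far (1.0 for an empty course). */
    double progress() {
        return itemIds.isEmpty() ? 1.0 : (double) completed.size() / itemIds.size();
    }
}
```

A feed has no equivalent of `nextItem()`: there is no "next" when the list is unordered and potentially infinite, which is why the course concept ends up with its own management.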
  • 27. Geek team shopping list for the re-write ● Scalable and maintainable codebase on all platforms ● Needs to be microservice architecture, because everything is easier with them (this is a fat lie) ● Automate everything that is possible to automate ● Detailed and helpful documentation ● Needs to ship in a continuous fashion with Docker because Docker is cool (it actually is) ● Needs to have Infrastructure-as-Code ● Needs to use managed solutions ● Web client and mobile apps usable by your grandma
  • 29. Microservices - the why Google Trends - search for term “microservice”
  • 30. Microservices - the why ● Haters gonna hate, but there is undeniable interest & adoption in engineering-led companies ● Microservices take common clean code (SOLID) concerns to system level ○ SRP - a service should be concerned about a single coherent domain ○ OCP - extending behavior done through encapsulating with higher level services. Change in remote context should not produce change in a given service. ○ ISP - introducing edge services - specialized backends for clients ● The above seems like common sense, but engineers do fail in designing such systems
  • 31. Microservices - the reason we fail “Often stepping back you see more, don’t you?” David Hockney painter, draughtsman, printmaker, stage designer and photographer
  • 32. Microservices - my fast service design test ● If you want to know whether you have (not) designed it correctly (false positives may appear) fill in the following test Test for your service design 1. Imagine you have to open source your service. Are you able to do so without doing code changes, but staying useful to an engineer outside your business domain? ………………………………… ------------------------------ END OF TEST ------------------------------
  • 33. Let’s see a concrete example
  • 34. Meet the “Leaderboard Service*” *(the tech scene ran out of deity names to name services after, so no Zeus or Hydra for you)
  • 35. The leaderboard - proposition ● Service responsible to manage leaderboards ● A leaderboard is an ordered list of players associated with a score ● It needs to allow the near real-time update of such boards, based on individual score change events ● Allows the player to check the “transaction history” not just the aggregated state ● It allows the fast retrieval of a certain segment of the board ○ Top X ○ Around X ( X-10, X, X+10)
  • 37. But a bit more modern...
  • 39. The leaderboard - collecting points Card Points Card
  • 40. The leaderboard - under the hood public interface LeaderboardService { LeaderboardCreationResponseDto createLeaderboard(LeaderboardCreationRequestDto requestDto); LeaderboardDto retrieveTopNLeaderboard(String leaderboardId, long size); LeaderboardDto retrieveAroundNLeaderboard(String leaderboardId, String playerId, long size); PlayerDto retrievePlayer(String leaderboardId, String playerId); void updatePlayerScore(String leaderboardId, String playerId, long score); } ● Note that this interface is agnostic of all SmartUp-related logic ● Could be reused in any situation where you want to represent entities in a sorted order by score ● No matter what the player entity is, it’s created lazily when you first update its score ● Currently used to incentivize consumption, but can be applied to groups of players, content creators, etc.
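To show how little domain knowledge such a service needs, here is a minimal in-memory sketch of the "Top X" and "Around X" reads with lazy player creation. Everything in it is an assumption for illustration: the class name, the simplified signatures (no DTOs, a single board), and treating the score argument as a delta. The real service sits on DynamoDB and Redis rather than a `TreeSet`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical single-board leaderboard kept in memory; not SmartUp's implementation. */
final class InMemoryLeaderboard {

    /** A player and their current aggregated score. */
    record Entry(String playerId, long score) {}

    // Highest score first; ties are broken by player id so the order is deterministic.
    private static final Comparator<Entry> BY_SCORE_DESC = (a, b) -> {
        int byScore = Long.compare(b.score(), a.score());
        return byScore != 0 ? byScore : a.playerId().compareTo(b.playerId());
    };

    private final Map<String, Long> scores = new HashMap<>();
    private final TreeSet<Entry> ranking = new TreeSet<>(BY_SCORE_DESC);

    /** Applies a score change; an unknown player is created lazily, starting from 0. */
    void updatePlayerScore(String playerId, long delta) {
        Long old = scores.get(playerId);
        if (old != null) {
            ranking.remove(new Entry(playerId, old));
        }
        long updated = (old == null ? 0L : old) + delta;
        scores.put(playerId, updated);
        ranking.add(new Entry(playerId, updated));
    }

    /** Returns the first `size` entries of the board (fewer if the board is smaller). */
    List<Entry> topN(int size) {
        List<Entry> top = new ArrayList<>();
        for (Entry entry : ranking) {
            if (top.size() == size) {
                break;
            }
            top.add(entry);
        }
        return top;
    }

    /** Returns the player plus up to `radius` neighbours on each side, in rank order. */
    List<Entry> aroundPlayer(String playerId, int radius) {
        Long score = scores.get(playerId);
        if (score == null) {
            return List.of();
        }
        Entry self = new Entry(playerId, score);
        LinkedList<Entry> window = new LinkedList<>();
        window.add(self);
        Iterator<Entry> above = ranking.headSet(self, false).descendingIterator();
        for (int i = 0; i < radius && above.hasNext(); i++) {
            window.addFirst(above.next());
        }
        Iterator<Entry> below = ranking.tailSet(self, false).iterator();
        for (int i = 0; i < radius && below.hasNext(); i++) {
            window.addLast(below.next());
        }
        return window;
    }
}
```

Nothing here mentions cards, communities or learners, which is the point of the open-source test from the earlier slide.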
  • 41. The leaderboard - under the hood
  • 42. The leaderboard - the gain ● Due to event sourcing we can always reconstruct state in case of failure ● Redis is in-memory. HA or not, should it ever go down, an update of state would effectively restore our read-optimized model ● Our number of Score Processors scale with the data volume ● Concurrent state update correctness guaranteed by DynamoDB Stream shards ● In case of processor failure, upon service restoration the unprocessed events would get picked up, all in a couple minutes (up to a couple weeks of staying behind)
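The recovery argument can be sketched in a few lines: if every score change is kept as an event, the aggregated read model is disposable and can be rebuilt by replaying the log. In this sketch a plain list stands in for the durable event store and a map stands in for the Redis read model; both stand-ins and all names are assumptions, not the production design.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical append-only log of score changes; the list stands in for a durable store. */
final class ScoreEventLog {

    /** One individual score change, kept forever as the source of truth. */
    record ScoreChanged(String playerId, long delta) {}

    private final List<ScoreChanged> events = new ArrayList<>();

    void append(String playerId, long delta) {
        events.add(new ScoreChanged(playerId, delta));
    }

    /** The "transaction history" of one player, in the order the events arrived. */
    List<ScoreChanged> historyOf(String playerId) {
        List<ScoreChanged> history = new ArrayList<>();
        for (ScoreChanged event : events) {
            if (event.playerId().equals(playerId)) {
                history.add(event);
            }
        }
        return history;
    }

    /** Rebuilds the aggregated read model from scratch by folding over every event. */
    Map<String, Long> replay() {
        Map<String, Long> totals = new HashMap<>();
        for (ScoreChanged event : events) {
            totals.merge(event.playerId(), event.delta(), Long::sum);
        }
        return totals;
    }
}
```

Losing the map is harmless here: calling `replay()` again yields the same totals, which mirrors the claim that a lost in-memory read model can be restored from the events.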
  • 46. Meet the Content Service
  • 47. The Content Service - The proposition A service that handles the creation and modification of versioned learning material. Also enables versions to be instantiated and completed by consumers. Encapsulates both structural and behavioural functionalities, like: - Structural - Question text - Limited number of answers - Solution explanation - Behavioural - Single-choice - Multiple-choice
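As a rough illustration of "versioned learning material" with structure and behaviour kept together, the sketch below stores each edit as a new immutable version. The type names, fields and validation rule are assumptions made for this example; the deck does not show the real data model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical versioned question; every edit appends a version instead of overwriting. */
final class VersionedQuestion {

    enum Behaviour { SINGLE_CHOICE, MULTIPLE_CHOICE }

    /** Immutable snapshot: structural parts (text, answers, explanation) plus behaviour. */
    record Version(int number, String text, List<String> answers,
                   Set<Integer> correct, String explanation, Behaviour behaviour) {
        Version {
            answers = List.copyOf(answers);
            correct = Set.copyOf(correct);
            if (behaviour == Behaviour.SINGLE_CHOICE && correct.size() != 1) {
                throw new IllegalArgumentException("Single-choice needs exactly one correct answer");
            }
        }

        /** A submission is correct when it selects exactly the correct answer indexes. */
        boolean isCorrect(Set<Integer> chosen) {
            return correct.equals(chosen);
        }
    }

    private final List<Version> versions = new ArrayList<>();

    /** Stores the edit as the next version; earlier versions stay untouched. */
    Version publish(String text, List<String> answers, Set<Integer> correct,
                    String explanation, Behaviour behaviour) {
        Version version = new Version(versions.size() + 1, text, answers, correct, explanation, behaviour);
        versions.add(version);
        return version;
    }

    /** Looks up a version by its 1-based number. */
    Version version(int number) {
        return versions.get(number - 1);
    }

    Version latest() {
        return versions.get(versions.size() - 1);
    }
}
```

Because a consumer completes a specific version, later edits by the content creator do not change what that consumer was actually asked.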
  • 48. The Content Service - Under the hood
  • 49. The Content Service - the gain ● Because the Context is an abstract entity, we can support lots of use cases for consumption rules and stay resilient to change ○ Sharing of consumption record ○ You can redo a piece of content in certain circumstances (e.g. exam mode has a separate context) ● Feedback from our Head Of Content after going live “Fast, smooth and easy. And really fast. And damn, this thing's fast...” ● Design enables easy clean-up should we ever need it. For now we store every change a content creator made. “Because I can” - Dr. Bob Kelso, Scrubs
  • 51. Tuning your microservices, cause with high throughput comes high latency
  • 52. Tuning your microservices ● Simple operation: check a user’s credentials and request a JWT ● Initial results: not too bad ● P95 responds in 701 ms
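For readers less used to percentile figures: P95 is the latency that 95% of requests stay at or below. A minimal sketch of the nearest-rank calculation follows; the deck does not say which load-testing tool or percentile method produced its numbers, so treat this as one common definition rather than the one used in the talk.

```java
import java.util.Arrays;

/** Hypothetical helper computing latency percentiles with the nearest-rank method. */
final class Percentiles {

    private Percentiles() {}

    /** Smallest sample such that at least p percent of all samples are less than or equal to it. */
    static long percentile(long[] latenciesMs, int p) {
        if (latenciesMs.length == 0 || p <= 0 || p > 100) {
            throw new IllegalArgumentException("Need at least one sample and 0 < p <= 100");
        }
        long[] sorted = latenciesMs.clone();
        Arrays.sort(sorted);
        // Nearest rank: ceil(p/100 * n), computed on integers to avoid rounding surprises.
        int rank = (int) Math.ceil((double) p * sorted.length / 100.0);
        return sorted[rank - 1];
    }
}
```

The mean hides exactly the slow tail that users notice, which is why the slides report P95 rather than an average.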
  • 53. Tuning your microservices ● Scale it up ○ 1 OAuth Service ○ 2 User Services ● P95 in ~35k ms ● Almost all requests respond after 1 second ● 40% requests FAILED
  • 54. THAT DOES NOT MAKE SENSE!
  • 55. Tuning your microservices ● Found out there is no HTTP connection pooling -> TCP handshake penalty ● Update Spring Cloud to Edgware ● Set correct timeouts for Ribbon and Hystrix ● Reduce (yes, reduce) Tomcat resources ○ Max-Threads ■ The maximum number of request processing threads to be created ○ Max-Connections ■ The maximum number of connections that the server will accept and process at any given time ○ Accept-Count ■ The maximum queue length for incoming connection requests when all possible request processing threads are in use
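The first two fixes, reusing connections and setting explicit timeouts, can be illustrated with the JDK's own `java.net.http.HttpClient` (Java 11+). This is a stand-in and not the Ribbon/Hystrix configuration the slide refers to; the class name, the `/users/{id}` path and the timeout values are all assumptions.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

/** Hypothetical caller of a user service, using one shared client and explicit timeouts. */
final class UserServiceClient {

    // Built once and shared: each HttpClient keeps its own connection pool, so a client
    // created per request could never reuse an already established connection.
    private static final HttpClient SHARED = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(1))   // fail fast when the peer is unreachable
            .build();

    private UserServiceClient() {}

    static HttpClient client() {
        return SHARED;
    }

    /** Builds a request that gives up after 2 seconds instead of waiting indefinitely. */
    static HttpRequest userLookup(String baseUrl, String userId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/users/" + userId))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
    }
}
```

A call would then look like `UserServiceClient.client().send(request, HttpResponse.BodyHandlers.ofString())`. Bounded waits matter as much as pooling: without them, slow downstream calls pile up and occupy every request thread.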
  • 56. Tuning your microservices ● 0 Failed Requests ● Mean Req/s: ~50 ● P95 latency: 148 ms
  • 59. If microservices have not solved your engineering problems...
  • 60. Try hiring a “DevOps Engineer”
  • 63. As per Wikipedia DevOps = a software engineering practice that aims at unifying software development (Dev) and software operation (Ops).
  • 65. Bronicorn Release ● Went live October 3, 0600 RO time ● Development environment was used for previous 10 months ● No other environment due to cost reasons ● On the day of release we created ○ Staging (Acceptance Testing) environment ○ Production
  • 66. Bronicorn Release ● Went live October 3, 0600 RO time ● Development environment was used for previous 10 months ● No other environment due to cost reasons ● On the day of release we created ○ Staging (Acceptance Testing) environment ○ Production 45 minutes difference between deployments (5 minutes active)
  • 67. How is that possible?
  • 68. How is that possible? Infrastructure-as-Code
  • 69. Infrastructure as Code Definition Infrastructure as code (IaC) is the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools. (as per Wikipedia) ● Just a bunch of shell scripts ● Modern provisioning tools like: Chef, Puppet, Ansible ● Cloud-ready IaC management: CloudFormation, HashiCorp Terraform
  • 70. Meet Terraform ● DSL based using HCL (HashiCorp Configuration Language) ● Module oriented ● Manages dependencies between resources (e.g. DNS depends on IP) ● Nice interpolation syntax ● Natively manages multiple environments through configuration (e.g. instance types differ from env to env) ● Workflow = Plan > Review > Apply ● Configurable state backends ● From V0.10 supports pluggable “providers” (e.g. AWS, GCP)
  • 72. Our way of Terraforming ● 4 AWS VPCs ○ Services VPC (For maintenance, and unified connection to other VPCs) ○ SmartUp VPCs (e.g. Dev, Stg, Prod) ● Each service owns its own module along with dependencies Eg: Leaderboard: ○ Redis ○ Queues ○ DynamoDB Tables & Streams ○ etc ● Peering module to connect Services <-> SmartUp ● Using encrypted S3 for safe state storage
  • 73. Managing service configuration ● Services pick up their configuration exclusively at runtime ● No mvn package -Pdev|stg|prod ● Build one artifact (Docker image) and use it everywhere ● Consul as service discovery and configuration storage ● Terraform injects properties into Consul upon execution ● No configuration done in YML ● Using Spring Cloud Config to pick these values up from Consul upon startup and checking periodically for changes
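The "one artifact, configuration at runtime" idea can be sketched without any framework: the service asks an external source for its settings at startup and re-reads it periodically. Below, a `Supplier` stands in for the Consul key-value lookup; the class and method names are hypothetical and this is not how Spring Cloud Config is implemented.

```java
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical runtime configuration holder; the Supplier stands in for a Consul lookup. */
final class RuntimeConfig {

    private final Supplier<Map<String, String>> source;
    private volatile Map<String, String> current;

    RuntimeConfig(Supplier<Map<String, String>> source) {
        this.source = source;
        // Loaded when the service starts, so nothing environment-specific is baked into the build.
        this.current = Map.copyOf(source.get());
    }

    /** Re-reads the source; meant to be called periodically. Returns true if anything changed. */
    boolean refresh() {
        Map<String, String> latest = Map.copyOf(source.get());
        boolean changed = !latest.equals(current);
        current = latest;
        return changed;
    }

    /** Returns the value for a key, failing loudly when the environment did not provide it. */
    String get(String key) {
        String value = current.get(key);
        if (value == null) {
            throw new IllegalStateException("Missing configuration key: " + key);
        }
        return value;
    }
}
```

The same image then runs in dev, staging and production; only the answers returned by the configuration source differ.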
  • 76. Each team is responsible for their delivery process
  • 77. Each team is responsible for their delivery process From design to production
  • 78. Let’s put it to production ● Loads of deploys in a geek’s life, better make it simple ● A good pipeline will ○ Provide fast feedback before PR integration (build, test & check infra dependencies) ○ Deploy changes to dev ASAP (fail-fast) ○ Streamline production releases so they prevent human error ○ Keep a clear separation between steps ● Preferably define the whole pipeline using code ● Decided to use CircleCI - YAML based Workflows
  • 79. Kudos to the team
  • 80. Thank you for your kind attention! Mate Lang CTO @ smartup.io mate@smartup.io twitter: @langmate medium: @matelang