Miguel Gubitosi, Project Leader at Mercadolibre.com, talks about SLA vs. agility: the use of microservices and cloud monitoring, at InterCon 2016.
Learn more at http://intercon2016.imasters.com.br/
3. This is our vision
Building the foundation to Build a 3B Company by FY20
Agenda
1. “Old World”: MercadoLivre’s original architecture
2. “Ground Zero”: shifting to microservices on the cloud
3. Monitoring the cloud
4. Alarms: when things go south
5. “Fury”: streamlining DevOps at MercadoLivre
Old world properties
● Monolithic
● Highly coupled code
● Unified SVN repository
● Single DB
● Simple infrastructure with little overhead
● Single QA team
● Closed system
Deployments as ML grew
● Anyone, at any time
● Some people, at any time
● Some people, once a week
● Only all the experts together, at 3 AM, on Thursdays not covered by any “freeze”
Ground zero properties
● Multiple technologies and frameworks (dev’s choice)
● Completely decoupled code in multiple GitHub repositories
● One DB for each app, multiple engines
● Complex infrastructure with possible high overhead
● QA, testing and continuous integration are done by each team
● Independent deployments, environments and policies
● Open platform
Developer responsibilities
● Developer gets ownership of entire dev cycle
● Massive empowerment of dev team -> OWNERSHIP
Manage resources
- VMs: create & scale computing pools for dev/test/prod
- DBs and services: choose the support systems required and create them
- Networking: create rules and load balancers to route traffic to the application
Develop
- Code: choose your technology and keep your GitHub repository
- Test: create tests, regressions or CI as needed
- Deploy: write all routines for automatically deploying your app on any VM
Ensure quality
- Define uptime: define what “up” means for your own app (health.sh)
- Measure: create metrics to analyze performance and downtime
- React: react to critical events that affect your app
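The “define what up means” responsibility can be pictured as a small set of named checks that must all pass. The slides only name a health.sh script and do not show its contents, so the Python sketch below is an illustration of the idea, not MercadoLibre's script. The helper `run_health_checks` and both check functions are hypothetical.

```python
from typing import Callable, Dict, Tuple


def run_health_checks(checks: Dict[str, Callable[[], bool]]) -> Tuple[bool, Dict[str, bool]]:
    """Run every named check; the app counts as "up" only if all of them pass."""
    results: Dict[str, bool] = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that raises is treated as a failed check.
            results[name] = False
    return all(results.values()), results


# Hypothetical checks; a real app would probe its own DB, cache, queues, etc.
def database_reachable() -> bool:
    return True


def cache_reachable() -> bool:
    raise ConnectionError("cache timeout")


up, detail = run_health_checks({"db": database_reachable, "cache": cache_reachable})
print(up, detail)
```

Each team would decide which checks belong in the dictionary, which is the point of letting the app owner define uptime.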
DevTools in ML
Developer
Melicloud API
- Create apps
- Manage pools (test/prod)
- Manage VMs & loadbalancers
- Build & deploy
- Create queues
- Create DBaaS or KVSaaS
- Create caches
Github repo
- Code app
- Write test & deploy strategy
- Write uptime definitions
Nginx
- Write rules to route traffic to your pools
eventRouting & OpsGenie
- Write rules to manage alarms
- Define alarm escalation policies & schedules
- Manage contact channels
New Relic
● Default monitoring in VMs golden image
● No configuration necessary (initially)
HTTP errors
- Unhandled errors: unexpected issues appear in production
- Unsupported params: see if other devs/clients misuse your entry params
- Other services: detect down services affecting you
Stack traces
- Fast debugging: see what’s going on in production
- Unified pool data: all instances’ traces in the same place
Performance metrics
- Transaction traces: see what’s taking so long
- Recognize deviations: graphs to see if traffic or response time vary with respect to another period
- Apdex Score
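The Apdex score mentioned above follows a standard formula: requests at or under a threshold T count as satisfied, requests between T and 4T count as tolerating (half weight), and the rest count as frustrated. The sketch below computes it from a list of response times. The sample values and the 0.5 s threshold are made up for illustration and do not come from the talk.

```python
from typing import Iterable


def apdex(response_times: Iterable[float], threshold: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    satisfied = tolerating = total = 0
    for t in response_times:
        total += 1
        if t <= threshold:
            satisfied += 1          # fast enough
        elif t <= 4 * threshold:
            tolerating += 1         # slow but acceptable, counts half
        # anything slower is "frustrated" and contributes nothing
    if total == 0:
        raise ValueError("at least one sample is required")
    return (satisfied + tolerating / 2) / total


# Illustrative response times in seconds.
samples = [0.2, 0.4, 0.9, 1.5, 3.0]
print(apdex(samples, threshold=0.5))
```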
Datadog
● Easy to use for different frameworks
● Good for business specific metrics
Custom metrics
- Complex metrics: you can send multiple parameters for events
- Online filtering: graphs filtered with different dimensions
- Alarms: flexible alarms based on custom metrics
Infra monitoring
- Full info: more data than NR on disk, memory, network
- Scalable: handles well aggregating information from many different VMs
Real time analysis
- Fast response: almost no latency
- Dashboards: customizable dashboards to show what’s more relevant for each app
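To make “custom metrics with different dimensions” concrete, the sketch below formats a metric in the DogStatsD text format (`name:value|type|#tag:value,...`), where tags play the role of filterable dimensions. It only builds the datagram string and does not send it. The metric name and tags are hypothetical examples, not metrics from the talk.

```python
from typing import Dict, Optional


def format_metric(name: str, value: float, metric_type: str = "c",
                  tags: Optional[Dict[str, str]] = None) -> str:
    """Build a DogStatsD-style datagram; tags become dimensions for filtering."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        # Sort tags so the output is deterministic.
        tag_str = ",".join(f"{key}:{val}" for key, val in sorted(tags.items()))
        datagram += f"|#{tag_str}"
    return datagram


# Hypothetical business metric: one order counted, tagged by site and payment type.
line = format_metric("checkout.orders", 1, "c", {"site": "MLB", "payment": "card"})
print(line)
```

In practice a Datadog agent listening for StatsD traffic would receive such a line over UDP, and the tags would then be available as filters in graphs and alarms.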
Log collection
● Logs are collected by an agent on all VMs
● They are sent to Elasticsearch
● Access via a Kibana frontend
● Developers can use special syntax to create queryable dimensions for all logged events
● All instances’ logs in the same place
● Request tracing through multiple applications/APIs (request_id)
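Request tracing by request_id can be sketched as follows: every app logs structured events that carry the same request_id, and a query over the centralized logs collects them. The slides do not show the “special syntax” used at MercadoLibre, so JSON lines are used here as a stand-in, and the app names, field names and helper functions are all hypothetical.

```python
import json
from typing import Dict, List


def log_event(app: str, request_id: str, message: str, **dimensions) -> str:
    """Serialize one log event; extra keyword arguments become queryable fields."""
    record = {"app": app, "request_id": request_id, "message": message, **dimensions}
    return json.dumps(record, sort_keys=True)


def trace_request(lines: List[str], request_id: str) -> List[Dict]:
    """Return every event, from any app, that belongs to one request."""
    return [rec for rec in map(json.loads, lines) if rec.get("request_id") == request_id]


# Logs from several apps, as they might look once collected in one place.
collected = [
    log_event("items-api", "req-1", "item fetched", status=200),
    log_event("users-api", "req-2", "user fetched", status=200),
    log_event("payments-api", "req-1", "payment rejected", status=402),
]

for rec in trace_request(collected, "req-1"):
    print(rec["app"], rec["message"], rec["status"])
```

In the setup described above, the filtering step would be a Kibana query against Elasticsearch rather than a Python list comprehension.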
Event routing
● Rules added by each team
● Check alarm origin, type and importance
● Check “quiet hours”
● Assign escalation policy and forward to OpsGenie
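The routing steps above can be sketched as a rule lookup: match the alarm's origin and type, compare its importance, check quiet hours, and return the escalation policy to forward. This is a minimal sketch under stated assumptions, not the actual eventRouting service. The rule fields, the importance levels, and the choice that critical alarms bypass quiet hours are all assumptions, and forwarding to OpsGenie is left out.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional

LEVELS = {"low": 0, "medium": 1, "critical": 2}  # assumed importance scale


@dataclass
class Alarm:
    origin: str        # e.g. the monitoring tool that raised it
    kind: str          # alarm type
    importance: str
    raised_at: time


@dataclass
class Rule:
    origin: str
    kind: str
    min_importance: str
    escalation_policy: str
    quiet_start: Optional[time] = None
    quiet_end: Optional[time] = None


def in_quiet_hours(rule: Rule, at: time) -> bool:
    if rule.quiet_start is None or rule.quiet_end is None:
        return False
    if rule.quiet_start <= rule.quiet_end:
        return rule.quiet_start <= at < rule.quiet_end
    # The window wraps around midnight (e.g. 22:00 to 07:00).
    return at >= rule.quiet_start or at < rule.quiet_end


def route(alarm: Alarm, rules: List[Rule]) -> Optional[str]:
    """Return the escalation policy to forward the alarm to, or None to drop it."""
    for rule in rules:
        if (rule.origin, rule.kind) != (alarm.origin, alarm.kind):
            continue
        if LEVELS[alarm.importance] < LEVELS[rule.min_importance]:
            continue
        # Assumption: quiet hours silence everything except critical alarms.
        if in_quiet_hours(rule, alarm.raised_at) and alarm.importance != "critical":
            continue
        return rule.escalation_policy
    return None


rules = [Rule("datadog", "latency", "medium", "payments-oncall", time(22, 0), time(7, 0))]
print(route(Alarm("datadog", "latency", "critical", time(3, 0)), rules))
print(route(Alarm("datadog", "latency", "medium", time(3, 0)), rules))
```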
OpsGenie
● Manage teams to deal with escalation policies
● Set “on call” schedules (with substitutes & manager escalation)
● Everyone manages their own contact methods (SMS, mail, phone call, app)
Evolution
Old world -> Ground zero -> Fury
Fury: DevOps to NoOps
● Still microservices
● Fully service-oriented
● Easier dev cycle and learning curve
● Pre-assembled flavors for popular frameworks
● Fewer bash scripts, more UI-based configuration
● Auto-scaling & auto-healing
● Docker-based (smaller dev/prod environment gap)
● Designed to run on AWS
● Continuous integration already included
Fury dashboard
Dev Cycle in Fury: create app
● Creates repository
● Creates Jenkins CI server
● Creates network infra
Dev Cycle in Fury: create scope
● Creates load balancer (ELB)
● Creates auto scaling group (ASG) for scope instances
● Creates instances
● Initializes logs & metrics services
● Downloads containers to instances
● Starts traffic
Dev Cycle in Fury: deploy
● Creates ASG for new version
● Creates instances for new ASG
● Initializes logs & metrics services
● Downloads containers to instances
● Progressive traffic switch
● If candidate is OK, destroys previous infrastructure
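The progressive traffic switch in the deploy steps above can be sketched as a loop that shifts traffic to the candidate in stages and checks its health after each stage. This is an illustration of the general technique, not Fury's implementation. The step percentages, the `candidate_ok` callback and the rollback behaviour are assumptions, since the slides do not say what happens when the candidate is not OK.

```python
from typing import Callable, List, Tuple


def progressive_switch(steps: List[int],
                       candidate_ok: Callable[[int], bool]) -> Tuple[str, List[int]]:
    """Shift traffic to the new version step by step.

    `steps` lists the percentage of traffic sent to the candidate at each stage
    and is expected to end at 100. Returns the outcome plus the history of
    percentages that were applied.
    """
    applied: List[int] = []
    for pct in steps:
        applied.append(pct)              # route pct% of traffic to the candidate
        if not candidate_ok(pct):
            applied.append(0)            # assumed rollback: all traffic back to the old version
            return "rolled_back", applied
    # Candidate survived full traffic; the previous infrastructure can be destroyed.
    return "promoted", applied


# A healthy candidate, then one that starts failing at 50% of traffic.
print(progressive_switch([10, 50, 100], lambda pct: True))
print(progressive_switch([10, 50, 100], lambda pct: pct < 50))
```

Keeping the previous ASG alive until the last step succeeds is what makes the rollback path cheap in a scheme like this.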