Bringing Agile to IT
SCOTT GOH-DAVIS | SOLUTIONS ENGINEER, APAC | ATLASSIAN
It’s not just the rate of change
IT’S THE MAGNITUDE
DIGITAL TRANSFORMATION
DEV TEAMS · IT TEAMS · ALL TEAMS
Collaborative · Iterative · Customer-centric
Agile
The future of IT is agile.
Unleash the agile 

in every IT team
DEV TEAMS · IT TEAMS
INCIDENTS
Cost of downtime is $4 BILLION per day
DEVELOPERS · IT OPERATIONS
Incident management requires DEV & IT OPS to work together
Incident management requires BEST OF BREED products, DEEPLY integrated
BEST OF BREED PRODUCTS
Build trust with every incident
STATUSPAGE MISSION
Mobile App Unresponsive
Investigating - We’re currently experiencing an issue with users unable to log into our mobile application. We’re actively looking into the issue and will have an update in the next 30 minutes.
Uptime Showcase
Wednesday, April 10 · 8:20
MESSAGES
[Banc.ly Site Status] Investigating: Site instability http://stspg.com/21ak
Powerful alerting & on-call management
OPSGENIE
Escalations
Banc.ly backend weekday
Banc.ly backend weekend
0m: On-call users in Banc.ly backend, if not acknowledged
5m: Sarah Smith, if not acknowledged
10m: Ryan Windows, if not acknowledged
15m: Evgeny Willows, if not acknowledged
20m: Banc.ly MIMs, if not acknowledged
+ Add escalation
Routing rules
+ Add routing rule
for any received alert
IF routing time is between Friday 18:00 AND Monday 06:00
THEN route the alert to Banc.ly backend weekend
ELSE route alerts to Banc.ly backend weekday
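The escalation steps and the time-based routing rule on this slide can be sketched in a few lines of Python. This is only an illustration of the logic shown in the screenshot, not Opsgenie's API: the function names `route_alert` and `notified_targets` are hypothetical, the weekend window is assumed to be half-open (Friday 18:00 inclusive, Monday 06:00 exclusive), and timestamps are assumed to already be in the team's local time.

```python
from datetime import datetime
from typing import List

# Escalation steps as shown on the slide: (minutes unacknowledged, who gets notified).
ESCALATION_STEPS = [
    (0, "On-call users in Banc.ly backend"),
    (5, "Sarah Smith"),
    (10, "Ryan Windows"),
    (15, "Evgeny Willows"),
    (20, "Banc.ly MIMs"),
]


def in_weekend_window(ts: datetime) -> bool:
    """True if ts falls between Friday 18:00 and Monday 06:00 (assumed half-open)."""
    day, hour = ts.weekday(), ts.hour  # Monday == 0 ... Sunday == 6
    if day == 4:        # Friday: weekend rotation starts at 18:00
        return hour >= 18
    if day in (5, 6):   # Saturday and Sunday are fully inside the window
        return True
    if day == 0:        # Monday: weekend rotation ends at 06:00
        return hour < 6
    return False


def route_alert(ts: datetime) -> str:
    """Pick the escalation policy for an alert received at ts."""
    if in_weekend_window(ts):
        return "Banc.ly backend weekend"
    return "Banc.ly backend weekday"


def notified_targets(minutes_unacknowledged: int) -> List[str]:
    """Everyone who has been notified after the alert stayed unacknowledged this long."""
    return [who for delay, who in ESCALATION_STEPS if delay <= minutes_unacknowledged]


if __name__ == "__main__":
    alert_time = datetime(2019, 3, 9, 23, 45)  # a Saturday evening
    print(route_alert(alert_time))
    print(notified_targets(12))
```

Keeping the steps as plain data makes it easy to add or reorder an escalation level without touching the routing logic.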
Timeline · Filter · Add entry
Josie Michaels · 01:14 · Am checking for possums in the Google tracts, as they have infested us before.
Liam Hens · 00:34 · The curse has not yet been lifted from the
Mark Kane · 01:24 · The defragulator is checked and is not the source of the problem. Frag lines are flowing smoothly.
Josie Michaels · 01:16 · We have now fully cleared out the Login blockage. It seems that Google was full of possums again. We reset our API tokens and drained all cisterns of the pestilence but we will remain ever vigilant.
Mary Smith · 01:30 · Incident resolved
16:45 (UTC +8) · Statuspage updated · Mary Smith
Resolved | We have now fully restored service to all of our customers. We will continue to monitor the login services to ensure no further issues.
Reports / Postmortems /
Postmortem: Incident #10 - Banc.ly site down for customers
Executive summary
Login with Google has been unavailable for over 15 mins now. Google services are still running, so it seems to be something on our end.
Leadup
The hydrospanner became stuck in the Google pipeline. Despite heroic efforts to free said spanner this led to a blockage 2 weeks ago.
Fault
The pressure due to this blockage grew until approximately 7pm 21 Feb 2019, when there was an overflow of possums in the Google pipeline. Obviously, this led to an outage of Google logins.
Banc.ly Backend
Software project
Board
Columns: TO DO 29 · IN PROGRESS 4 · DONE 3
BBE-945 Create subscription plans and discount codes in Stripe
BBE-935 Add link to app usage (GA) in email report
BBE-1029 Force SSL on any page that contains account info
BBE-939 Add analytics to pricing page
BBE-1021 Apply a prorated discount to a user when they move from a low to a high priced tier
BBE-973 Allow users to change between two tiers at the same price
BBE-1004 Add NPS feedback to email report
BBE-961 Add NPS feedback to wallboard
BBE-321 Implement feedback collector
BBE-732 Schedule weekly email report for Monday mornings to all staff
BBE-931 Automate collection of feedback for weekly email report
BBE-983 Install SSL certificate
Banc.ly Infras…
Service desk project
Queues
All open: 11
All unassigned: 2
Assigned to me: 5
+ Add queue
INFRA-123 Sandbox environment for testing changes
Sharon Tweed raised this request via Portal · View request in portal
Status: Waiting for support
Assignee: Oleg Jobbs
Reporter: Sharon Tweed
AWS product: EC2 Linux
Created 8 May 2017 5:43 PM · Last updated 4 hours ago
Activity
Sharon Tweed · 17 Dec 2017 · Hi, I need this provisioned for testing my changes in staging.
Oleg Jobbs · 17 Dec 2017 · Hey Sharon, you requested a t3.micro instance type. I would suggest a t3.small type as it will be better suited for what you’re trying to do.
Add internal note / Reply to customer
Banc.ly Public site
Incidents: Open · Incidents · Maintenance · Templates
CRITICAL · Performing maintenance on our file sync systems for the entire weekend
BEGINNING 4 OCT 2018 (01:30 PDT)
SCHEDULED · CRITICAL · Performing maintenance on our file sync systems for the entire weekend
10 MINS AGO (08:30 UTC)
Component group name - component name long lorem Short name
IN PROGRESS 2
Sidebar: Apps · Your page · Upcoming · Components · Subscribers · Incidents · View status page
DEEPLY INTEGRATED
Escalations
Banc.ly backend weekend
0m: On-call users in Banc.ly backend, if not
5m: Sarah Smith, if not ack
10m: Rya
15m
20m
Team menu: On-call · Integrations · Services · Members · Roles · Policies · Conferences · Activity stream
Routing rules
for any received alert
route the alert to
Friday 1
Banc.ly backend w
Banc.ly backend wee
IF routing time is between ... AND ...
THEN
ELSE route alerts to
Teams / Banc.ly Backend
Software project
Unified User Management
Welcome to the Banc.ly Help Cent
Find help and services
Service desks
Human resources: We can help with new employee onboarding and general queries.
IT Support: We can help with any
regarding your comp
Status update: Mobile app users having trouble logging in · View Status page
Status update: Mobile app users having trouble logging in · View in Statuspage
Incidents / Incident #10
Banc.ly site down for customers - 500 errors on /deposit/v2 API
P2 · Open · Mar 9, 2019 11:52 PM · Elapsed time: 4h 4m 38s
Backend API Integration +4
Join command center
Details · Associated alerts · Responders · Stakeholders
Team: Banc.ly backend
Service: bancly-backend-api
Description: Banc.ly site is down for customers. We’re seeing a large number of 500 errors in the CloudWatch logs due to errors on /deposit/v2 API.
> Rate limiting has prevented the flux capacitors from receiving stream notifications.
Priority: P2 - High
Jira issues: Create new issue · Link existing issue
Incident response roles · + Assign role
Incident commander: Josie Michaels
Communications officer: Helena Carter
Timeline · Filter · Add entry
Josie Michaels · 01:16 · We have now reinstated the previous flux rate limit levels to allow sufficient traffic through the nets. Levels of API traffic are returning to normal despite some heavy errors underlying services.
Josie Michaels · 01:14 · It appears flux rate limits were set in error, so team is testing a restore of the previous configuration.
Liam Chaudhury · 00:34 · The elevated error rate appears to be due to incorrect rate limits on the flux capacitor stream.
Saturday 9 March 2019
Mary Smith · 23:54 · Stakeholders updated | We have identified a problem with the deposit API, and are working to determine a fix.
Mary Smith · 23:52 · Incident opened · Website down due to deposit API errors.
Mary Smith · 23:48 · Associated alert acked · #1094 500 error threshold exceeded on /deposit/v2 API
23:45 · Site reliability alerted · #1094 500 error threshold exceeded on /deposit/v2 API
Priority: P2 - High
Incident response roles · + Assign role
Incident commander: Josie Michaels
Communications officer: Helena Carter
Jira issues: Create new issue · Link existing issue
Create new issue in Banc.ly backend
What needs to be fixed? Add error handling to deposit API for invalid tuple length
Create · Cancel
Jira issues
BBE-1227 TO DO Add error handling to deposit API for invalid tuple length
BBE-1228 TO DO Fix alerting rules to notify devs when rate limits exceeded
Reports / Postmortems /
Banc.ly site down for customers - 500 errors on /deposit/v2 API - postmortem report
Executive summary
Login with Google has been unavailable for over 15 mins now. Google services are still running, so it seems to be something on our end.
Leadup
The hydrospanner became stuck in the Google pipeline. Despite heroic efforts to free said spanner this led to a blockage 2 weeks ago.
Fault
The pressure due to this blockage grew until approximately 7pm 21 Feb 2019, when there was an overflow of possums in the Google pipeline. Obviously, this led to an outage of Google logins.
Detection
This outage was first detected by New Relic. Simo Nalakorn was then alerted and acknowledged the alert at 7:21pm.
Root causes
Thresholds were exceeded. We ultimately performed inadequate checks of this pipeline.
Mitigation and resolution
Defragging the pipeline cleared the possums, allowing us to restart it. Login service restored at 7:51pm.
Details · Timeline · Filter · Add entry
Am checking for possums in the
they have infested us before.
Josie Michaels · 01:34
The curse has not yet been lifted
am continuing to search.
Josie Michaels · 01:20
The defragulator is checked and
of the problem. Frag lines are flow
Josie Michaels · 01:47
We have now fully cleared out the
It seems that Google was full of p
reset our API tokens and drained
pestilence but we will remain eve
Josie Michaels · 01:45
Mary · 01:50 · Incident resolved
16:45 (UTC +8) · Statuspage upd
Resolved | We have now fully rest
of our customers. We will continu
login services to ensure no furthe
Jira issues
BBE-1227 TO DO Add error handling to deposit API for invalid tuple length
BBE-1228 TO DO Fix alerting rules to notify devs when rate limits exceeded
Create new issue · Link existing issue
Banc.ly · #inc-23
INC #23: Banc.ly site down for customers - 500 errors on /deposit/v2 API
dana 6:49PM: Have we checked all of the servers yet?
Opsgenie (APP) 6:53PM: Josie closed alert #5684 “ALARM: PROD - backend-api 5xx error threshold exceeded”
Opsgenie (APP) 6:53PM: Josie added a note to incident #23: “Team has identified a problem in event stream error handling.”
xander 6:59PM: This isn’t the first time Kinesis has gone down, right?
scott 7:39PM: Yeah, it went down last week
xander 8:01PM: Ah. Do we know why?
The event stream data had some invalid records. We need to fix the error handling and alerting.
dana 8:02PM
josie 8:02PM
!
INCIDENTS ALERT FATIGUE IMPACTING THE IT TEAM
[Diagram labels: Business systems/infrastructure + Monitoring; ITOps Team; Detect, Respond]
IMPROVE EVENT & INCIDENT COORDINATION
[Diagram labels: Business systems/infrastructure + Monitoring; SMS, Alarms; SecOps Team; Detect, Respond, Recover]
HOW ATLASSIANS MONITOR THEIR ENTERPRISE DEPLOYMENTS
https://confluence.atlassian.com/enterprise/how-atlassians-monitor-their-enterprise-deployments-947849816.html
Find out how we use Data Center ourselves for:
• getsupport.atlassian.com
• jira.atlassian.com
To date, both instances track a total of 1.9 million tickets,
with a combined user base of around 4.8 million users.
Incident management
DIGITAL TRANSFORMATION
IT TEAMS ALL TEAMS
LENGTHY, EXPENSIVE, COMPLICATED
A service desk for every team
Service Desk
Create a project
Start with a service desk and build it the way you want.
Change template · What’s in this?
Name: Legal
Access: Open (recommended)
Create
Review a contract
Create
Workflow editor: Review a contract (Legal)
Toolbar: To-do status · In-progress status · Done status · Transition · Rule · Discard · Save & close
Statuses: START, TO DO, IN PROGRESS, DONE
Transitions: Create request, Start work
Publish
Statuses: START, TO DO, IN PROGRESS, IN REVIEW, APPROVED, DECLINED, CANCELLED, ANY STATUS
Transitions: Create request, Start work, Review contract, Approve, Declined, Request more info
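A workflow like the one in this editor is essentially a small state machine. The sketch below is a hedged illustration rather than Jira Service Desk's actual model: the status and transition names come from the slide, but which transition connects which pair of statuses is not recoverable from the screenshot, so the edges in `TRANSITIONS` are assumptions, as are the `Cancel` transition name and the `apply_transition` helper.

```python
# Assumed edges: (current status, transition name) -> next status.
# The slide shows the names; the wiring between them is a guess for illustration.
TRANSITIONS = {
    ("START", "Create request"): "TO DO",
    ("TO DO", "Start work"): "IN PROGRESS",
    ("IN PROGRESS", "Review contract"): "IN REVIEW",
    ("IN REVIEW", "Approve"): "APPROVED",
    ("IN REVIEW", "Declined"): "DECLINED",
    ("IN REVIEW", "Request more info"): "IN PROGRESS",
}

# The slide shows ANY STATUS leading to CANCELLED; the name "Cancel" is hypothetical.
GLOBAL_TRANSITIONS = {"Cancel": "CANCELLED"}


def apply_transition(status: str, transition: str) -> str:
    """Return the next status, or raise ValueError if the move is not allowed."""
    if transition in GLOBAL_TRANSITIONS:
        return GLOBAL_TRANSITIONS[transition]
    try:
        return TRANSITIONS[(status, transition)]
    except KeyError:
        raise ValueError(f"'{transition}' is not allowed from status '{status}'")


if __name__ == "__main__":
    status = "START"
    for step in ["Create request", "Start work", "Review contract", "Approve"]:
        status = apply_transition(status, step)
        print(step, "->", status)
```

Storing the workflow as a lookup table keeps it editable by non-developers, which mirrors the point of the slide: a team such as Legal can reshape its own process without code changes.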
Bancly support / Legal
Legal
Have a legal request? Raise a request here.
What can we help you with?
Add a statement of Work (SOW) to an Existing Agreement: Add additional vendor services for an existing agreement
Request a Non-Disclosure Agreement: Protect confidential information using Banc.ly’s form NDA
Review a contract: Request a legal review of a contract
Review a contract (request form)
Summary*
What is the purpose of the contract?
Contract value
Attachment: Drag and drop files, paste screenshots, or browse
Send · Cancel
DIGITAL TRANSFORMATION
DEV TEAMS · IT TEAMS · ALL TEAMS
Simplify software at scale so our
customers can not only survive but truly
thrive in the modern economy.
MISSION
STRATEGY & EXECUTION
“[Jira Align] allows us to connect our
teams to strategy, and that has been
critical to our transformation.”
— Candace Kelly, AT&T Center of Excellence
CONNECT WORK DIRECTLY TO COMPANY GOALS
STAY ALIGNED IN REAL TIME
OPTIMIZE RESULTS WHEN THINGS CHANGE
UNDERSTAND HOW FEATURES IMPACT YOUR OKRS
High performing teams
right people + right practices + right products
Download
atlassian.com/itil4-wp
THANK YOU!

Team Tour Seoul: Bringing Agile to IT
