Scaling Indexing and Replication in Jira Data Center Apps

ANDRIY YAKOVLEV | PRINCIPAL PREMIER SUPPORT ENGINEER | ATLASSIAN

Goals of this talk
• DC index replication problems
• DC scale
• Insight

Nodes are showing different
data for the same project and
the same board.
UNHAPPY JIRA DC USER

Agenda
Why is this important?
Overview of Jira Indexing
Index replication in Jira DC
When things break

Data Center expectations
• Availability
• Stability (consistency)
• Performance

Premier Support
• Partners
• Giving a hand

• L1: Application down or major malfunction
• L2: Serious degradation of application performance or functionality

Jira admin perception
• Confusion: multiple end users affected and confused
• Confidence: loss of confidence in the app, or fear of installing it
• Hours: time spent on troubleshooting

Jira index
• Lucene: text search engine which keeps its structures on disk
• JQL Search: converts JQL into a Lucene query request; extendable and pluggable
• Filters, Dashboards, Agile boards: use JQL as a building block
How it works - Jira index
[Diagram labels: SQL DB, Lucene docs, JQL]

How it works - Jira index (2)
[Diagram labels: ISSUE, CF, ISSUE DOCUMENT, COMMENT DOC; CustomField.getValue(), STORE, INDEX]

SLOW
• Global scope
• Recomputing the value
• CF is indexed
• Cascading dependencies
FAST
• Project scope
• Storing values
• No index for ViewOnly CF
• TTL for cached values
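
The recompute-versus-store contrast above can be illustrated with a minimal sketch. This is a plain Python model, not Jira API code: `expensive_compute`, `RecomputingField`, `StoredValueField` and the in-memory dict standing in for a database table are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List


def expensive_compute(issue_id: int) -> str:
    """Hypothetical stand-in for a costly custom field calculation."""
    return f"value-for-{issue_id}"


class RecomputingField:
    """Recomputes the value every time an indexer asks for it (slow path)."""

    def __init__(self, compute: Callable[[int], str]) -> None:
        self.compute = compute
        self.calls = 0

    def get_value(self, issue_id: int) -> str:
        self.calls += 1
        return self.compute(issue_id)


class StoredValueField:
    """Computes once when the issue changes, then serves the stored value (fast path)."""

    def __init__(self, compute: Callable[[int], str]) -> None:
        self.compute = compute
        self.calls = 0
        self.store: Dict[int, str] = {}  # assumption: stands in for a DB table

    def on_issue_changed(self, issue_id: int) -> None:
        self.calls += 1
        self.store[issue_id] = self.compute(issue_id)

    def get_value(self, issue_id: int) -> str:
        return self.store.get(issue_id, "")


def index_on_nodes(field, issue_id: int, node_count: int) -> List[str]:
    """Model of every node asking the field for its value while indexing."""
    return [field.get_value(issue_id) for _ in range(node_count)]


recomputing = RecomputingField(expensive_compute)
recomputed_values = index_on_nodes(recomputing, 42, node_count=3)

stored = StoredValueField(expensive_compute)
stored.on_issue_changed(42)
stored_values = index_on_nodes(stored, 42, node_count=3)
```

In this model the recomputing field runs the calculation once per node, while the stored field runs it once per change, regardless of how many nodes index the issue.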

Jira index replication
• Multiserver: each node has its own Lucene copy
• Replicating issues: issue data is replicated by ID and Action
• Eventually consistent: each node replays the replication at its own tempo

Everything is going to be
alright, maybe not today, but
eventually.
CONVENTIONAL WISDOM

How it works - DC replication
[Diagram: replication steps 1 and 2]

How it works - DC replication (2)
[Diagram: replication steps 1, 2 and 3]

DC index replication
Important tables for DC Index replication
• replicatedindexoperation (RIO) - log entries of Lucene update events
• nodeindexcounter - position of each node in RIO table
• clusternode - all nodes in cluster
• clusternodeheartbeat - cluster heartbeat table
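
The relationship between the RIO log and each node's position in it can be sketched as follows. This is a simplified in-memory model, assuming one counter per node and integer operation IDs; it does not reflect the real table schemas, and `pending_operations` and `replay` are hypothetical helper names.

```python
from bisect import bisect_right
from typing import Dict, List


def pending_operations(rio_ids: List[int], counters: Dict[str, int]) -> Dict[str, int]:
    """For each node, count the log entries whose ID is above the node's counter."""
    ordered = sorted(rio_ids)
    return {
        node: len(ordered) - bisect_right(ordered, position)
        for node, position in counters.items()
    }


def replay(rio_ids: List[int], counters: Dict[str, int], node: str, batch_size: int) -> List[int]:
    """Advance one node by at most batch_size entries and return the IDs it applied."""
    ordered = sorted(rio_ids)
    start = bisect_right(ordered, counters[node])
    applied = ordered[start:start + batch_size]
    if applied:
        counters[node] = applied[-1]  # the node remembers the last entry it replayed
    return applied


# Hypothetical data: five log entries and three nodes at different positions.
rio_log = [101, 102, 103, 104, 105]
node_counters = {"Node1": 105, "Node2": 102, "Node3": 100}

lag_before = pending_operations(rio_log, node_counters)
applied_by_node3 = replay(rio_log, node_counters, "Node3", batch_size=2)
lag_after = pending_operations(rio_log, node_counters)
```

Each node works through the same log at its own pace, which is why nodes can briefly show different data until every counter reaches the end of the log.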

DC index replication
RIO table
ID              37821490
index_time      2019-08-09 03:30:17
node_id         Node1
affected_index  ISSUE
entity_type     NONE
affected_ids    4500367
operation       UPDATE_WITH_RELATED
filename        (empty)

DC index replication
RIO table (2)
ID              37834110
index_time      2019-08-09 03:30:17
node_id         Node2
affected_index  ALL
entity_type     NONE
affected_ids    -
operation       FULL_REINDEX_END
filename        IndexSnapshot_10402.zip

SLOW
• Global scope x Nodes
• Recomputing the value x Nodes
• CF is indexed x Nodes
• Cascading dependencies x Nodes
FAST
• Project scope
• Storing values
• No index for ViewOnly CF
• TTL for cached values
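
The "x Nodes" multiplier can be put into numbers with a small worked example. The figures below (10,000 updates, 50 ms per computed value) are made up for illustration, and `indexing_cost_ms` is a hypothetical helper, assuming every node replays every update and recomputes the value while doing so.

```python
def indexing_cost_ms(updates: int, nodes: int, ms_per_value: int) -> dict:
    """Cost of index-time recomputation when every node replays every update."""
    per_node_ms = updates * ms_per_value  # each node indexes all updates itself
    return {"per_node_ms": per_node_ms, "cluster_ms": per_node_ms * nodes}


two_nodes = indexing_cost_ms(updates=10_000, nodes=2, ms_per_value=50)
four_nodes = indexing_cost_ms(updates=10_000, nodes=4, ms_per_value=50)
```

Under these assumptions the per-node cost is identical for two and four nodes, while the cluster-wide cost doubles, so adding nodes does not reduce the work any single node has to do.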

When things break: Slow indexing
Unnecessary computations
• Global scope for App custom field
• Recomputing values for issues without modifications
Slow computations
• Cascading computations
• External calls to remote 3rd party systems
Best practice
• Project scope for App CF
• Storing values
• No index for ViewOnly CF
• Don’t abuse reindexing API
• Test on large data sets (App Performance Toolkit)

When things break
[Chart: JIRA DC scalability Lucene index test]

When things break: Slow replication
Node can’t keep up with index replication
• Includes slow indexing problems
• Write amplification due to large data sets
• Adding more nodes doesn’t help
Best practice
• Store values
• Test with 3+ nodes
• Measure and test indexing time, report CF indexing slowness to the Jira admin
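
Measuring indexing time per custom field, as suggested above, could look roughly like this. It is a generic sketch rather than Jira code: `IndexingTimer` is a hypothetical helper, the threshold is arbitrary, and the clock is injectable only so the example behaves deterministically.

```python
import time
from collections import defaultdict
from typing import Callable, Dict, List


class IndexingTimer:
    """Records how long each field's value computation takes."""

    def __init__(self, clock: Callable[[], float] = time.perf_counter) -> None:
        self.clock = clock
        self.samples: Dict[str, List[float]] = defaultdict(list)

    def measure(self, field_name: str, compute: Callable[[], object]) -> object:
        start = self.clock()
        try:
            return compute()
        finally:
            # Record the duration even if the computation raised.
            self.samples[field_name].append(self.clock() - start)

    def slow_fields(self, threshold: float) -> Dict[str, float]:
        """Return fields whose worst sample exceeded the threshold."""
        return {
            name: max(durations)
            for name, durations in self.samples.items()
            if max(durations) > threshold
        }


# Deterministic demo: a fake clock returning 0, 5, 10, 11 on successive calls.
fake_ticks = iter([0, 5, 10, 11])
timer = IndexingTimer(clock=lambda: next(fake_ticks))
timer.measure("cascading_field", lambda: "a")  # takes 5 fake time units
timer.measure("stored_field", lambda: "b")     # takes 1 fake time unit
report = timer.slow_fields(threshold=2)
```

The resulting report is the kind of information an app could log or surface to the Jira admin when one of its fields slows indexing down.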

There are only two hard things
in Computer Science: cache
invalidation and naming
things.
PHIL KARLTON

When things break: JQL search not consistent
Lucene data is not consistent
• Nodes collect the value at different times
• Recomputing in a different context
• Stale cache
• Errors in reindexing
Leaking Lucene searcher
• Avoid using ThreadLocalSearcherCache#startSearcherContext without cleaning the ThreadLocals

When things break: Own replication
Creating your own replication
When issue and CF are not enough
• Use cache for short-lived copies of values
• Use DB to store values and pass the reference
• Possibly own Lucene index
• Create your own replication queue
• Don’t use cache if consistency is important
• Avoid using ClusterMessage for heavy traffic
• Monitoring and health check
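
The "store values in the DB and pass the reference" idea can be sketched as a small model. Everything here is hypothetical: `SharedStore` stands in for a database table, `ChangeNotice` for a lightweight message that carries only a key and version, and `Node` for a cluster node keeping a local copy.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class ChangeNotice:
    """Small message sent to other nodes: a reference, not the value itself."""
    key: str
    version: int


class SharedStore:
    """Stand-in for a DB table shared by all nodes."""

    def __init__(self) -> None:
        self.rows: Dict[str, Tuple[int, str]] = {}

    def put(self, key: str, value: str) -> ChangeNotice:
        version = self.rows.get(key, (0, ""))[0] + 1
        self.rows[key] = (version, value)
        return ChangeNotice(key, version)

    def get(self, key: str) -> Tuple[int, str]:
        return self.rows[key]


class Node:
    """Keeps a local copy and refreshes it from the store when notified."""

    def __init__(self, store: SharedStore) -> None:
        self.store = store
        self.local: Dict[str, Tuple[int, str]] = {}

    def apply(self, notice: ChangeNotice) -> None:
        version, value = self.store.get(notice.key)
        # Only move forward, so late or duplicate notices are harmless.
        if self.local.get(notice.key, (0, ""))[0] < version:
            self.local[notice.key] = (version, value)


store = SharedStore()
first = store.put("issue-42", "old")
second = store.put("issue-42", "new")

node_a, node_b = Node(store), Node(store)
node_a.apply(second)
node_a.apply(first)   # arrives out of order
node_b.apply(first)   # arrives late, after the value already changed
```

Since each notice only points at the stored row, a node always reads the current value, and replaying notices out of order or more than once does not leave it with older data.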

When things break
Knowledge
• Jira Data Center Troubleshooting
• Index Replication Jira Data Center Troubleshooting
• Keeping Lucene Index Synchronised in JIRA Data Center
• HealthCheck: Cluster Index Replication

To recap
• DC Index replication problems: App CF can slow down index operations
• DC Scale: use proper config and test for large data sets
• Insight: how Jira DC indexing works

Q & A

Thank you!
ANDRIY YAKOVLEV | PRINCIPAL PREMIER SUPPORT ENGINEER | ATLASSIAN
