Tuning a database for millions of users
Chaowlert Chaisrichalermpol
Senior Solution Architect at IBM
AGENDA
Design, Develop, Monitoring
Set your expectations
DBA, Developer
Today's topics
Scaling up to your first 10 million users
https://www.youtube.com/watch?v=Ma3xWDXTxRg
Why not start with the full architecture?
Should I use NoSQL?
- If you plan for 500,000 users in the first year
- If you need lower-cost storage and no ACID features (e.g. logs)
Should I start with Microservices?
Monolith first if
- Bounded Context is still unclear
- No time for Microservices complexity
- No time for tooling
Understand Db Strengths & Weaknesses
Strengths
- Established technology
- ACID out of the box
- Integrated metrics
- Can be tuned without affecting the system
Weaknesses
- The Db will always be the bottleneck
Scaling Db
> 100 users
- Scaling up
> 500,000 users
- Move some functionalities to NoSQL
- Caching
- Move expensive queries to read replica
> 1,000,000 users
- Async integration
- Apply microservices and split databases
- Db Partitioning
AGENDA
Design, Develop, Monitoring
It’s all about the execution plan
If you are the Db admin, let developers see the execution plan
Index basics
- Lookups are much faster (n vs log(n))
- Indexes increase storage and modification time
- Recommended number of indexes: 10 for read-intensive, 5 for write-intensive
(http://www.youtube.com/watch?v=gOsflkQkHjg)
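The "n vs log(n)" point can be illustrated with a rough sketch. It assumes a sorted Python list as a stand-in for an index (real database indexes are typically B-trees, not binary search over an array), and the function names are hypothetical.

```python
def linear_lookup(rows, key):
    """Check rows one by one, like a scan. Returns (position, comparisons)."""
    comparisons = 0
    for pos, value in enumerate(rows):
        comparisons += 1
        if value == key:
            return pos, comparisons
    return -1, comparisons


def indexed_lookup(sorted_rows, key):
    """Binary search over sorted keys, like a seek. Returns (position, comparisons)."""
    lo, hi = 0, len(sorted_rows) - 1
    comparisons = 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1
        if sorted_rows[mid] == key:
            return mid, comparisons
        if sorted_rows[mid] < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, comparisons


rows = list(range(1_000_000))  # already sorted
scan_pos, scan_cost = linear_lookup(rows, 999_999)
seek_pos, seek_cost = indexed_lookup(rows, 999_999)
print(f"scan: {scan_cost} comparisons, seek: {seek_cost} comparisons")
```

The scan touches every row in the worst case, while the search needs at most about 20 comparisons for a million rows. The price is that the sorted structure has to be stored and kept in order on every modification.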
Index usage
- Avoid type conversion in the WHERE clause
- Avoid using functions on indexed columns
- Avoid OR
- Does column order matter?
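Two of the points above (functions on indexed columns, and column order) can be checked against a query plan. This is a minimal sketch that assumes SQLite through Python's `sqlite3` module rather than the database used in the talk; the `users` table and index names are hypothetical, and other engines may plan these queries differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, city TEXT, age INTEGER);
    CREATE INDEX ix_users_name ON users (name);
    CREATE INDEX ix_users_city_age ON users (city, age);
""")


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the plan text


# Plain comparison on the indexed column: the index can be used for a seek.
direct = plan("SELECT * FROM users WHERE name = 'alice'")
# Function wrapped around the indexed column: the index on name no longer applies.
wrapped = plan("SELECT * FROM users WHERE lower(name) = 'alice'")
# Composite index (city, age): filtering on the leading column can seek...
leading = plan("SELECT * FROM users WHERE city = 'Bangkok'")
# ...but filtering only on the second column falls back to a scan.
trailing = plan("SELECT * FROM users WHERE age = 30")

for label, text in [("direct", direct), ("wrapped", wrapped),
                    ("leading", leading), ("trailing", trailing)]:
    print(f"{label:9s} {text}")
```

In SQLite, plans that start with SEARCH use an index to seek, and plans that start with SCAN read every row. So in this sketch column order does matter: the composite index helps only when the leading column is filtered.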
Workshop 1 - Index
- A table scan or index scan means the db searches the entire table or index
- A foreign key is a good candidate for an index
https://data.stackexchange.com/stackoverflow/query/2469/all-my-badges
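The effect of indexing a foreign key can be sketched as follows. This assumes SQLite via Python's `sqlite3`, with hypothetical `users` and `badges` tables loosely modeled on the workshop query; it is not the workshop's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE badges (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users (id),  -- foreign key, not indexed yet
        name TEXT
    );
""")


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT name FROM badges WHERE user_id = 1"

before = plan(query)  # no index on user_id: every badge row is read
conn.execute("CREATE INDEX ix_badges_user_id ON badges (user_id)")
after = plan(query)   # index on the foreign key: direct seek

print("before:", before)
print("after: ", after)
```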
Workshop 2 – Include columns
- A key lookup means the index needs data from the main table
- You can add columns to the index with INCLUDE to reduce key lookups
https://data.stackexchange.com/stackoverflow/query/7521/how-unsung-am-i
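A rough sketch of the same idea, assuming SQLite via Python's `sqlite3`. SQLite has no INCLUDE clause, so the extra column is appended to the index key instead, which is a swapped-in approximation of an included column. The `posts` table and index names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE posts (id INTEGER PRIMARY KEY, owner_id INTEGER, score INTEGER, body TEXT)"
)


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT score FROM posts WHERE owner_id = 1"

# Narrow index: the seek finds matching rows, but score must still be
# fetched from the table (the equivalent of a key lookup).
conn.execute("CREATE INDEX ix_posts_owner ON posts (owner_id)")
narrow = plan(query)

# Wider index that also carries score: the query is answered from the index alone.
conn.execute("DROP INDEX ix_posts_owner")
conn.execute("CREATE INDEX ix_posts_owner_score ON posts (owner_id, score)")
covering = plan(query)

print("narrow:  ", narrow)
print("covering:", covering)
```

SQLite reports the second plan as using a COVERING INDEX, meaning no trip back to the table is needed.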
Workshop 3 – Composite index
- A sort column is a good candidate for an index
https://data.stackexchange.com/stackoverflow/query/947/my-comment-score-distribution
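One way to see why a sort column belongs in a composite index, sketched with SQLite via Python's `sqlite3`. The `comments` table and index name are hypothetical, and the plan wording is SQLite's own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, user_id INTEGER, score INTEGER, text TEXT)"
)


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


query = "SELECT score, text FROM comments WHERE user_id = 1 ORDER BY score DESC"

# Without an index, SQLite scans the table and sorts the result separately.
unsorted_plan = plan(query)

# Filter column first, sort column second: matching rows come out of the
# index already ordered by score, so the separate sort step disappears.
conn.execute("CREATE INDEX ix_comments_user_score ON comments (user_id, score)")
sorted_plan = plan(query)

print("before:", unsorted_plan)
print("after: ", sorted_plan)
```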
Workshop 4 – Merge duplicate index
- An index on columns A,B,C implies an index on A,B and an index on A only
- Therefore, index A,B,C can be merged with index A,B
https://data.stackexchange.com/stackoverflow/query/3160/jon-skeet-comparison
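The prefix rule above can be turned into a small check. This is a sketch with a hypothetical `redundant_indexes` helper and made-up index names; it compares key columns only and ignores things a real review would need to consider, such as unique constraints, included columns, and sort direction.

```python
def redundant_indexes(indexes):
    """Return names of indexes whose key columns are a leading prefix of another index.

    `indexes` maps index name -> tuple of key columns, in order.
    """
    redundant = []
    for name, cols in indexes.items():
        for other, other_cols in indexes.items():
            if name == other:
                continue
            is_strict_prefix = (
                len(cols) < len(other_cols) and other_cols[:len(cols)] == cols
            )
            if is_strict_prefix:
                redundant.append(name)
                break
    return sorted(redundant)


indexes = {
    "ix_a": ("A",),
    "ix_a_b": ("A", "B"),
    "ix_a_b_c": ("A", "B", "C"),
    "ix_b_a": ("B", "A"),  # different leading column, so not covered by the others
}

print(redundant_indexes(indexes))
```

Here `ix_a` and `ix_a_b` are reported as merge candidates because `ix_a_b_c` starts with the same columns, while `ix_b_a` is kept because its column order differs.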
Workshop 5 – CTE
- A table variable is for a few rows
- Use a CTE if the table is consumed only once
https://data.stackexchange.com/stackoverflow/query/466/most-controversial-posts-on-the-site
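A minimal CTE sketch, assuming SQLite via Python's `sqlite3`. The `votes` table and its sample rows are invented for illustration and are not the workshop's data. The intermediate result `tally` is defined once and consumed once, which is the case the slide recommends a CTE for.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE votes (post_id INTEGER, vote_type TEXT);
    INSERT INTO votes VALUES
        (1, 'up'), (1, 'down'), (1, 'up'),
        (2, 'up'),
        (3, 'down'), (3, 'down');
""")

# The CTE names the intermediate aggregate; the outer query reads it exactly once.
rows = conn.execute("""
    WITH tally AS (
        SELECT post_id,
               SUM(vote_type = 'up')   AS ups,
               SUM(vote_type = 'down') AS downs
        FROM votes
        GROUP BY post_id
    )
    SELECT post_id, ups, downs
    FROM tally
    WHERE ups > 0 AND downs > 0   -- posts that received both kinds of vote
    ORDER BY post_id
""").fetchall()

print(rows)
```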
How to handle complex queries?
Synchronous
- Materialized view
- New table
  - Use trigger to update
  - Update to new table in the same transaction
Asynchronous
- Cache
- Async integration
- Async stream
Consistency vs. latency
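The synchronous "new table updated by a trigger" option can be sketched like this, assuming SQLite via Python's `sqlite3`. The `comments` and `post_stats` tables and the trigger name are hypothetical. The summary row changes in the same transaction as the write, so reads stay consistent at the cost of slower writes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (id INTEGER PRIMARY KEY, post_id INTEGER, score INTEGER);

    -- Precomputed summary that replaces an expensive GROUP BY at read time.
    CREATE TABLE post_stats (
        post_id INTEGER PRIMARY KEY,
        comment_count INTEGER,
        total_score INTEGER
    );

    CREATE TRIGGER trg_comments_insert AFTER INSERT ON comments
    BEGIN
        -- Make sure a summary row exists, then fold the new comment into it.
        INSERT OR IGNORE INTO post_stats (post_id, comment_count, total_score)
        VALUES (NEW.post_id, 0, 0);
        UPDATE post_stats
        SET comment_count = comment_count + 1,
            total_score = total_score + NEW.score
        WHERE post_id = NEW.post_id;
    END;
""")

# The inserts and the trigger's summary updates commit together.
with conn:
    conn.executemany(
        "INSERT INTO comments (post_id, score) VALUES (?, ?)",
        [(1, 5), (1, 3), (2, 7)],
    )

stats = conn.execute(
    "SELECT post_id, comment_count, total_score FROM post_stats ORDER BY post_id"
).fetchall()
print(stats)
```

This sketch handles inserts only; a fuller version would also need triggers for updates and deletes.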
AGENDA
Design, Develop, Monitoring
Monitor your indexes
- Missing indexes
- Unused indexes
- Merge duplicate indexes
Avoid local maxima
Tuning db, Tuning app
Query statistics
- High worker time
- High logical reads
- High total rows
- High elapsed time
- These are good candidates for precomputing or caching
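Picking candidates from query statistics can be sketched as a simple ranking. The `QueryStat` type, the `top_offenders` helper, the query names, and all numbers below are made up for illustration; in practice the figures would come from the database's own statistics views.

```python
from dataclasses import dataclass


@dataclass
class QueryStat:
    name: str
    worker_ms: int       # CPU time spent
    logical_reads: int   # pages read from the buffer cache
    total_rows: int      # rows returned
    elapsed_ms: int      # wall-clock time


def top_offenders(stats, metric, limit=2):
    """Return the names of the `limit` queries with the highest value of `metric`."""
    ranked = sorted(stats, key=lambda s: getattr(s, metric), reverse=True)
    return [s.name for s in ranked[:limit]]


# Made-up sample figures.
stats = [
    QueryStat("user_feed", 900, 120_000, 5_000, 1_500),
    QueryStat("badge_list", 40, 800, 200, 60),
    QueryStat("score_report", 2_400, 950_000, 80_000, 4_200),
]

for metric in ("worker_ms", "logical_reads", "total_rows", "elapsed_ms"):
    print(metric, top_offenders(stats, metric))
```

Queries that rank high on several metrics at once are the first ones to consider for precomputing or caching.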
Still local maxima?
Tuning db, Tuning app, Tuning infra, Tuning biz
Q & A
Send your CV & position to
chaowlert@th.ibm.com
THANK YOU
