
SQL Tuning Crash Course for Rails Engineers

10,710 views


Slides from the "SQL Tuning Crash Course for Rails Engineers" session held at Wantedly on 12/10.

Published in: Engineering


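  1. SQL Tuning Crash Course for Rails Engineers @ Wantedly, 2015-12-10, Nao Minami (@south37)
  2. About me
  3. What we will cover today: (1) what happens inside the RDBMS when a SQL statement is executed; (2) how to read EXPLAIN output and how to add appropriate indexes; (3) what to watch out for when tuning.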
  4. Setup
     $ git clone https://github.com/south37/sql-tuning
     $ cd sql-tuning
     $ git checkout sql-tuning
     $ bin/rake db:create
     $ pg_restore -j 4 --verbose --no-acl --no-owner -d sql-tuning-dev db.dump
  5. Let's try EXPLAIN
  6. ActiveRecord::Relation#explain
     $ Job.joins(:company).group('companies.country').where('companies.id < 1000').select('companies.country', 'COUNT(jobs.id)').explain
     => EXPLAIN for: SELECT companies.country, COUNT(jobs.id) FROM "jobs" INNER JOIN "companies" ON "companies"."id" = "jobs"."company_id" WHERE (companies.id < 1000) GROUP BY companies.country
     QUERY PLAN
     ----------
     HashAggregate  (cost=1213.79..1220.12 rows=634 width=16)
       ->  Hash Join  (cost=54.28..1188.79 rows=5000 width=16)
             Hash Cond: (jobs.company_id = companies.id)
             ->  Seq Scan on jobs  (cost=0.00..897.00 rows=50000 width=8)
             ->  Hash  (cost=41.78..41.78 rows=1000 width=16)
                   ->  Index Scan using companies_pkey on companies  (cost=0.29..41.78 rows=1000 width=16)
                         Index Cond: (id < 1000)
  7. How to read EXPLAIN: the query plan from slide 6 is a tree structure.
  8. The execution plan is a tree. For the query in slide 6: HashAggregate → Hash Join → (Seq Scan, Hash → Index Scan).
  9. How to read costs: each node of the plan from slide 6 carries a cost estimate.
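  10. How to read costs
     Seq Scan on jobs  (cost=0.00..897.00 rows=50000 width=8)
     Index Scan using companies_pkey on companies  (cost=0.29..41.78 rows=1000 width=16)
     In cost=A..B, A is the startup cost and B is the total cost. rows is the estimated number of rows returned, and width is the estimated size of one row in bytes.
     Total cost = startup cost + (rows scanned × cost of fetching one row)
     An index scan has a non-zero startup cost.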
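     As a rough check of the cost formula, the sketch below reproduces the Seq Scan estimates using PostgreSQL's default planner constants (seq_page_cost = 1.0, cpu_tuple_cost = 0.01, cpu_operator_cost = 0.0025). The page count of 397 is an assumption inferred from the plan numbers, not something stated in the slides, and the method name is hypothetical.

```ruby
# Planner cost constants: PostgreSQL's documented default values.
SEQ_PAGE_COST     = 1.0     # reading one page sequentially
CPU_TUPLE_COST    = 0.01    # processing one row
CPU_OPERATOR_COST = 0.0025  # evaluating one operator on one row

# Estimated total cost of a sequential scan, optionally with one WHERE condition.
def seq_scan_cost(pages:, rows:, filter: false)
  cost = pages * SEQ_PAGE_COST + rows * CPU_TUPLE_COST
  cost += rows * CPU_OPERATOR_COST if filter
  cost.round(2)
end

# ASSUMPTION: 397 pages is inferred from the plan output; the slides do not state it.
JOBS_PAGES = 397
JOBS_ROWS  = 50_000

puts seq_scan_cost(pages: JOBS_PAGES, rows: JOBS_ROWS)               # plain Seq Scan on jobs
puts seq_scan_cost(pages: JOBS_PAGES, rows: JOBS_ROWS, filter: true) # Seq Scan with a filter
```

     Under that assumption the two values match the 897.00 shown for the Seq Scan on jobs here and the 1022.00 shown for the filtered Seq Scan in slide 16.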
  11. Adding ANALYSE actually executes the query
     $ ActiveRecord::Base.connection.execute("EXPLAIN ANALYSE #{Job.joins(:company).group('companies.country').where('companies.id < 1000').select('companies.country', 'COUNT(jobs.id)').to_sql}").each { |row| print row['QUERY PLAN'] + "\n" }
     HashAggregate  (cost=1213.79..1220.12 rows=634 width=16) (actual time=20.290..20.465 rows=950 loops=1)
       ->  Hash Join  (cost=54.28..1188.79 rows=5000 width=16) (actual time=1.018..18.102 rows=4983 loops=1)
             Hash Cond: (jobs.company_id = companies.id)
             ->  Seq Scan on jobs  (cost=0.00..897.00 rows=50000 width=8) (actual time=0.009..6.352 rows=50000 loops=1)
             ->  Hash  (cost=41.78..41.78 rows=1000 width=16) (actual time=0.995..0.995 rows=999 loops=1)
                   Buckets: 1024  Batches: 1  Memory Usage: 51kB
                   ->  Index Scan using companies_pkey on companies  (cost=0.29..41.78 rows=1000 width=16) (actual time=0.022..0.527 rows=999 loops=1)
                         Index Cond: (id < 1000)
  12. How to read EXPLAIN: for more detail, see http://www.postgresql.org/docs/current/static/sql-explain.html
  13. The first step is fetching the data: the scan nodes at the leaves of the plan tree, HashAggregate → Hash Join → (Seq Scan, Hash → Index Scan). (Query as in slide 6.)
  14. Understanding indexes. (Plan tree: HashAggregate → Hash Join → (Seq Scan, Hash → Index Scan).)
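  15. How an index works: the B-tree index. Each node holds several hundred entries; at 300 entries per node, three levels can hold 27 million rows (300 × 300 × 300), which is why lookups are fast.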
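     The arithmetic behind the 27 million figure can be sketched as follows. This is a simplified model that assumes every node is completely full; the method names are hypothetical.

```ruby
# Number of keys addressable by a tree with `fanout` entries per node and `levels` levels.
def btree_capacity(fanout, levels)
  fanout**levels
end

# Smallest number of levels whose capacity covers `rows` keys.
def btree_levels_needed(fanout, rows)
  levels = 1
  levels += 1 while btree_capacity(fanout, levels) < rows
  levels
end

puts btree_capacity(300, 3)            # capacity of the three-level tree from the slide
puts btree_levels_needed(300, 50_000)  # levels needed for a 50,000-row table
```

     Because capacity grows exponentially with depth, a lookup touches only a handful of nodes even on very large tables.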
  16. Using an index
     With an index:
     $ Job.where(id: 1).explain
     => EXPLAIN for: SELECT "jobs".* FROM "jobs" WHERE "jobs"."id" = $1 [["id", 1]]
     QUERY PLAN
     ----------
     Index Scan using jobs_pkey on jobs  (cost=0.29..8.31 rows=1 width=28)
       Index Cond: (id = 1)
     Without an index (Seq Scan):
     $ Job.where(id_without_index: 1).explain
     => EXPLAIN for: SELECT "jobs".* FROM "jobs" WHERE "jobs"."id_without_index" = $1 [["id_without_index", 1]]
     QUERY PLAN
     ----------
     Seq Scan on jobs  (cost=0.00..1022.00 rows=1 width=28)
       Filter: (id_without_index = 1)
  17. Index bad pattern 1: applying an operation to an indexed column
     $ Profile.where('lower(email) = ?', 'minami@wantedly.com').limit(1).explain
     => EXPLAIN for: SELECT "profiles".* FROM "profiles" WHERE (lower(email) = 'minami@wantedly.com') LIMIT 1
     QUERY PLAN
     ----------
     Limit  (cost=0.00..5.08 rows=1 width=54)
       ->  Seq Scan on profiles  (cost=0.00..254.00 rows=50 width=54)
             Filter: (lower(email) = 'minami@wantedly.com'::text)
     An index is kept sorted by comparing its keys, so it cannot be used once an operation is applied to the column. Fix: rewrite the query, or use Indexes on Expressions.
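     The reason can be illustrated with a plain-Ruby analogy, where a sorted array stands in for the index and `Array#bsearch` for the index lookup. This is only a sketch of the idea, not how PostgreSQL is implemented; the sample emails and method names are hypothetical.

```ruby
EMAILS = ["Bob@example.com", "alice@example.com", "Carol@example.com"].freeze

# "Index on email": entries sorted by the raw column value.
RAW_INDEX = EMAILS.sort.freeze

# "Index on lower(email)": entries sorted by the lowercased value.
LOWER_INDEX = EMAILS.map(&:downcase).sort.freeze

# True when the entries are in ascending order of lower(email),
# which is what a binary search on lower(email) requires.
def sorted_by_lower?(entries)
  keys = entries.map(&:downcase)
  keys == keys.sort
end

# Binary search; valid only because LOWER_INDEX is sorted by the searched key.
def lookup_lower(email)
  LOWER_INDEX.bsearch { |key| email <=> key }
end

puts sorted_by_lower?(RAW_INDEX)              # the raw index is not ordered by lower(email)
puts lookup_lower("bob@example.com").inspect  # the expression index supports the lookup
```

     The raw index is ordered by the original strings, so a search on the lowercased value has to fall back to checking every row, which corresponds to the Seq Scan in the plan above.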
  18. Index bad pattern 2: a WHERE condition that is not selective
     $ Profile.where(gender: 'female').explain
     => EXPLAIN for: SELECT "profiles".* FROM "profiles" WHERE "profiles"."gender" = $1 [["gender", "female"]]
     QUERY PLAN
     ----------
     Seq Scan on profiles  (cost=0.00..229.00 rows=5038 width=54)
       Filter: (gender = 'female'::text)
     (Figure: distribution of profiles.gender, male vs. female.)
     With default settings, the condition needs to narrow the result down to roughly a quarter of the rows or fewer before the index is used.
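  19. Why is the index not used when the condition is not selective?
  20. The cause is the speed difference between random access and sequential access on an HDD. Fetching rows one at a time is expensive. (Figure: a Seq Scan reads pages in order; an Index Scan reads them by random access.)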
  21. It is fine as long as the condition is selective enough
     $ BoxerProfile.where(gender: 'female').explain
     => EXPLAIN for: SELECT "boxer_profiles".* FROM "boxer_profiles" WHERE "boxer_profiles"."gender" = $1 [["gender", "female"]]
     QUERY PLAN
     ----------
     Bitmap Heap Scan on boxer_profiles  (cost=28.08..114.66 rows=1006 width=25)
       Recheck Cond: (gender = 'female'::text)
       ->  Bitmap Index Scan on index_boxer_profiles_on_gender  (cost=0.00..27.83 rows=1006 width=0)
             Index Cond: (gender = 'female'::text)
     (Figure: distribution of gender, male vs. female.)
     The distribution of the data, that is, the planner statistics, matters.
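  22. Aside: data layout inside PostgreSQL. For details, see http://www.postgresql.org/docs/current/static/storage.html or the book 「内部構造から学ぶPostgreSQL 設計・運用計画の鉄則」.
  23. Downsides of indexes: (1) updates take longer; (2) HOT does not apply.
  24. 1. Updates take longer: the B-tree index has to be updated as well.
  25. 2. HOT does not apply: HOT (Heap-Only Tuples) is a PostgreSQL mechanism that speeds up row updates by updating only what is necessary. It cannot be used when the update changes an indexed column. Details: http://lets.postgresql.jp/documents/tutorial/hot_1/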
  26. Types of indexes: 1. Unique Indexes; 2. Multicolumn Indexes; 3. Indexes on Expressions; 4. Partial Indexes.
  27. Types of indexes (the list from slide 26, repeated).
  28. 2. Multicolumn Indexes: an index over multiple columns
     create_table "tourist_spots", force: :cascade do |t|
       t.text "country"
       t.text "city"
     end
     add_index "tourist_spots", ["country", "city"], name: "index_tourist_spots_on_country_and_city", using: :btree
  29. 2. Multicolumn Indexes
     With a multicolumn index:
     $ TouristSpot.where(country: 'japan', city: 'tokyo').explain
     => EXPLAIN for: SELECT "tourist_spots".* FROM "tourist_spots" WHERE "tourist_spots"."country" = $1 AND "tourist_spots"."city" = $2 [["country", "japan"], ["city", "tokyo"]]
     QUERY PLAN
     ----------
     Index Scan using index_tourist_spots_on_country_and_city on tourist_spots  (cost=0.42..8.44 rows=1 width=52)
       Index Cond: ((country = 'japan'::text) AND (city = 'tokyo'::text))
     Without a multicolumn index:
     $ TouristSpotWithoutMultipleIndex.where(country: 'japan', city: 'tokyo').explain
     => EXPLAIN for: SELECT "tourist_spot_without_multiple_indices".* FROM "tourist_spot_without_multiple_indices" WHERE "tourist_spot_without_multiple_indices"."country" = $1 AND "tourist_spot_without_multiple_indices"."city" = $2 [["country", "japan"], ["city", "tokyo"]]
     QUERY PLAN
     ----------
     Index Scan using index_tourist_spot_without_multiple_indices_on_city on tourist_spot_without_multiple_indices  (cost=0.42..8.44 rows=1 width=52)
       Index Cond: (city = 'tokyo'::text)
       Filter: (country = 'japan'::text)
  30. 30. 2. Multicolumn Indexes
  31. 2. Multicolumn Indexes: the index also works as an index on its leading column. For more detail, see http://www.postgresql.org/docs/current/static/indexes-multicolumn.html
     $ TouristSpot.where(country: 'japan').explain
     => EXPLAIN for: SELECT "tourist_spots".* FROM "tourist_spots" WHERE "tourist_spots"."country" = $1 [["country", "japan"]]
     QUERY PLAN
     ----------
     Bitmap Heap Scan on tourist_spots  (cost=4.50..41.67 rows=10 width=52)
       Recheck Cond: (country = 'japan'::text)
       ->  Bitmap Index Scan on index_tourist_spots_on_country_and_city  (cost=0.00..4.49 rows=10 width=0)
             Index Cond: (country = 'japan'::text)
  32. 3. Indexes on Expressions: an index can be built using the return value of a function or other expression as its key
     # db/migrate/20151210065304_add_indexes_on_~.rb
     def up
       execute <<-SQL
         CREATE INDEX index_profiles_with_indexes_on_expressions_on_lower_email ON profiles_with_indexes_on_expressions(lower(email));
       SQL
     end
     def down
       execute <<-SQL
         DROP INDEX index_profiles_with_indexes_on_expressions_on_lower_email
       SQL
     end
  33. 3. Indexes on Expressions: lower(email) is used as the index key. Details: http://www.postgresql.org/docs/current/static/indexes-expressional.html
     $ ProfilesWithIndexesOnExpression.where("lower(email) = 'minami@wantedly.com'").explain
     => EXPLAIN for: SELECT "profiles_with_indexes_on_expressions".* FROM "profiles_with_indexes_on_expressions" WHERE (lower(email) = 'minami@wantedly.com')
     QUERY PLAN
     ----------
     Index Scan using index_profiles_with_indexes_on_expressions_on_lower_email on profiles_with_indexes_on_expressions  (cost=0.29..8.30 rows=1 width=48)
       Index Cond: (lower(email) = 'minami@wantedly.com'::text)
  34. The next step is joining the data (JOIN): the Hash Join node in the plan tree, HashAggregate → Hash Join → (Seq Scan, Hash → Index Scan). (Query as in slide 6.)
  35. JOIN algorithms: the planner picks the best algorithm based on whether indexes exist and on the statistics (data volume and distribution). 1. Nested Loop Join; 2. Hash Join; 3. Merge Join. The slide orders them from slow to fast.
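  36. 1. Nested Loop: tries every combination of rows from table 1 (N rows) and table 2 (M rows). O(N × M), which is extremely slow for large tables. It is fast when there are few rows, and adding an index on table 2 speeds it up.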
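     A minimal in-memory sketch of the nested loop idea, assuming rows are represented as Ruby hashes; the method name and sample data are hypothetical and only loosely modelled on the jobs and companies tables.

```ruby
# Nested loop join: compare every outer row with every inner row, O(N x M).
def nested_loop_join(jobs, companies)
  result = []
  jobs.each do |job|
    companies.each do |company|
      result << [job[:id], company[:country]] if job[:company_id] == company[:id]
    end
  end
  result
end

jobs = [{ id: 1, company_id: 10 }, { id: 2, company_id: 20 }, { id: 3, company_id: 10 }]
companies = [{ id: 10, country: "japan" }, { id: 20, country: "china" }]
p nested_loop_join(jobs, companies)
```

     With an index on the inner table, the inner loop becomes a lookup instead of a full pass, which is the speed-up the slide mentions.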
  37. 2. Hash Join: full-scans table 2 once to build a hash map. O(N + M): building the hash has a cost, but it is still better than Nested Loop. All rows of table 2 have to fit in memory.
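     The build and probe phases can be sketched in plain Ruby as below. This assumes the join key is unique on the inner side (as with a primary key); the method name and sample data are hypothetical.

```ruby
# Hash join: one pass to build a hash on the inner table, one pass to probe it, O(N + M).
def hash_join(jobs, companies)
  # Build phase: index the inner table by its join key.
  by_id = companies.each_with_object({}) { |company, h| h[company[:id]] = company }

  # Probe phase: look up each outer row's key in the hash.
  jobs.each_with_object([]) do |job, result|
    company = by_id[job[:company_id]]
    result << [job[:id], company[:country]] if company
  end
end

jobs = [{ id: 1, company_id: 10 }, { id: 2, company_id: 20 }, { id: 3, company_id: 10 }]
companies = [{ id: 10, country: "japan" }, { id: 20, country: "china" }]
p hash_join(jobs, companies)
```

     The `by_id` hash is the part that has to be held in memory, which mirrors the memory requirement noted on the slide.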
  38. Cost of a Hash Join: in the plan from slide 6, the node reads "Hash Join (cost=54.28..1188.79 rows=5000 width=16)". Its startup cost (54.28) is the cost of building the hash.
  39. 3. Merge Join: scans tables 1 and 2, both already sorted, just once each. O(N + M), the fastest of the three. Put an index on the columns you JOIN on.
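     A plain-Ruby sketch of the merge step follows. The explicit sorts stand in for the ordering an index would already provide, and the code assumes the join key is unique on the companies side; the method name and sample data are hypothetical.

```ruby
# Merge join: walk two inputs sorted on the join key with one cursor each.
def merge_join(jobs, companies)
  # These sorts stand in for inputs that an index would deliver already ordered.
  left  = jobs.sort_by { |job| [job[:company_id], job[:id]] }
  right = companies.sort_by { |company| company[:id] }

  result = []
  i = 0
  j = 0
  while i < left.size && j < right.size
    left_key  = left[i][:company_id]
    right_key = right[j][:id]
    if left_key < right_key
      i += 1
    elsif left_key > right_key
      j += 1
    else
      result << [left[i][:id], right[j][:country]]
      i += 1 # keys are unique on the right, so only the left cursor advances
    end
  end
  result
end

jobs = [{ id: 1, company_id: 10 }, { id: 2, company_id: 20 }, { id: 3, company_id: 10 }]
companies = [{ id: 10, country: "japan" }, { id: 20, country: "china" }]
p merge_join(jobs, companies)
```

     Once both inputs are ordered, each cursor only moves forward, so the merge itself is a single pass over each table.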
  40. Cases where a JOIN is slow even with an index: no matter how much you optimize, the best you can get is O(N + M), so the JOIN is slow when N is large.
  41. Cases where a JOIN is slow even with an index
     $ User.joins(:profile).select('COUNT(*)').explain
     => EXPLAIN for: SELECT COUNT(*) FROM "users" INNER JOIN "profiles" ON "profiles"."user_id" = "users"."id"
     QUERY PLAN
     ----------
     Aggregate  (cost=23288.72..23288.73 rows=1 width=0)
       ->  Hash Join  (cost=354.30..23261.80 rows=10769 width=0)
             Hash Cond: (users.id = profiles.user_id)
             ->  Seq Scan on users  (cost=0.00..11441.64 rows=698964 width=4)
             ->  Hash  (cost=219.69..219.69 rows=10769 width=4)
                   ->  Seq Scan on profiles  (cost=0.00..219.69 rows=10769 width=4)
  42. Narrow down the left relation of the JOIN beforehand
     $ User.where(registered: true).joins(:profile).select('COUNT(*)').explain
     => EXPLAIN for: SELECT COUNT(*) FROM "users" INNER JOIN "profiles" ON "profiles"."user_id" = "users"."id" WHERE "users"."registered" = $1 [["registered", "t"]]
     QUERY PLAN
     ----------
     Aggregate  (cost=8131.17..8131.18 rows=1 width=0)
       ->  Hash Join  (cost=1850.65..8128.51 rows=1065 width=0)
             Hash Cond: (users.id = profiles.user_id)
             ->  Bitmap Heap Scan on users  (cost=1496.35..6639.86 rows=69151 width=4)
                   Filter: registered
                   ->  Bitmap Index Scan on index_users_on_registered  (cost=0.00..1479.06 rows=69151 width=0)
                         Index Cond: (registered = true)
             ->  Hash  (cost=219.69..219.69 rows=10769 width=4)
                   ->  Seq Scan on profiles  (cost=0.00..219.69 rows=10769 width=4)
  43. The last step is aggregating the data (Aggregate): the HashAggregate node at the root of the plan tree, HashAggregate → Hash Join → (Seq Scan, Hash → Index Scan). (Query as in slide 6.)
  44. The two GROUP BY algorithms: 1. Group Aggregate; 2. Hash Aggregate.
  45. 1. Group Aggregate: sorts the input by the group key, then processes each group in order. (If an index means the input is already sorted, the work can be pipelined.)
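     The sort-then-scan idea can be sketched as a COUNT per group in plain Ruby; the method name and sample keys are hypothetical.

```ruby
# Group aggregate: sort by the group key, then count each run of equal keys in one pass.
def group_aggregate_count(keys)
  result = []
  keys.sort.each do |key|
    if result.last && result.last[0] == key
      result.last[1] += 1   # same group as the previous row
    else
      result << [key, 1]    # a new group starts here
    end
  end
  result
end

p group_aggregate_count(%w[japan china japan britain china japan])
```

     Only the current group has to be tracked at any time, which is why already-sorted input lets results stream out as each group ends.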
  46. 2. Hash Aggregate: builds a temporary hash table keyed by the group key. In the plan from slide 6, this is the top node: "HashAggregate (cost=1213.79..1220.12 rows=634 width=16)".
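     The same COUNT per group, sketched with a hash table instead of a sort; the method name and sample keys are hypothetical.

```ruby
# Hash aggregate: keep one counter per group key in a hash, with no sorting required.
def hash_aggregate_count(keys)
  counts = Hash.new(0)
  keys.each { |key| counts[key] += 1 }
  counts
end

p hash_aggregate_count(%w[japan china japan britain china japan])
```

     The input order does not matter, but every distinct group key has to be held in the hash until the scan finishes.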
  47. When the last step is Sort and Limit: specifying ORDER BY adds a Sort step
     $ PageViewLog.order(:viewed_at).limit(20).explain
     => EXPLAIN for: SELECT "page_view_logs".* FROM "page_view_logs" ORDER BY "page_view_logs"."viewed_at" ASC LIMIT 20
     QUERY PLAN
     ----------
     Limit  (cost=22026.31..22026.36 rows=20 width=28)
       ->  Sort  (cost=22026.31..23278.87 rows=501024 width=28)
             Sort Key: viewed_at
             ->  Seq Scan on page_view_logs  (cost=0.00..8694.24 rows=501024 width=28)
     If the sort spills to disk, it becomes very slow.
  48. Use an index for ORDER BY: with an index the data is already sorted, so no Sort step is needed
     $ PageViewLogWithIndex.order(:viewed_at).limit(20).explain
     => EXPLAIN for: SELECT "page_view_log_with_indices".* FROM "page_view_log_with_indices" ORDER BY "page_view_log_with_indices"."viewed_at" ASC LIMIT 20
     QUERY PLAN
     ----------
     Limit  (cost=0.42..1.09 rows=20 width=28)
       ->  Index Scan using index_page_view_log_with_indices_on_viewed_at on page_view_log_with_indices  (cost=0.42..16698.78 rows=501024 width=28)
  49. Other handy features characteristic of PostgreSQL: 1. Window Functions; 2. Json Type; 3. Hstore; 4. Materialized View; 5. Stored Procedure (PL/pgSQL).
  50. 1. Window Functions: more powerful aggregate functions that compute a value within each partition. http://www.postgresql.org/docs/current/static/tutorial-window.html
     $ Company.select('country, rank() OVER (PARTITION BY country ORDER BY id DESC)').explain
     => EXPLAIN for: SELECT country, rank() OVER (PARTITION BY country ORDER BY id DESC) FROM "companies"
     QUERY PLAN
     ----------
     WindowAgg  (cost=936.35..1155.63 rows=10964 width=16)
       ->  Sort  (cost=936.35..963.76 rows=10964 width=16)
             Sort Key: country, id
             ->  Seq Scan on companies  (cost=0.00..200.64 rows=10964 width=16)
     Sample result:
       country   | rank
     ------------+------
      britain    |    1
      china      |    1
      china      |    2
      china      |    3
      country_0  |    1
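     What rank() OVER (PARTITION BY country ORDER BY id DESC) computes can be sketched in plain Ruby. This assumes ids are unique, so there are no ties to handle; the method name and sample rows are hypothetical.

```ruby
# Per-partition ranking: split rows by country, order each partition by id descending,
# and number the rows within the partition starting from 1.
def rank_by_country(companies)
  companies.group_by { |company| company[:country] }.flat_map do |country, rows|
    rows.sort_by { |company| -company[:id] }.each_with_index.map do |company, index|
      { id: company[:id], country: country, rank: index + 1 }
    end
  end
end

companies = [
  { id: 1, country: "china" },
  { id: 2, country: "china" },
  { id: 3, country: "britain" },
  { id: 4, country: "china" }
]
rank_by_country(companies).each { |row| puts "#{row[:country]} | #{row[:rank]}" }
```

     Unlike GROUP BY, every input row is kept in the output; the partition only decides which rows are ranked against each other.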
  51. 2. Json Type: JSON data can be stored, and ActiveRecord already supports it
     # db/migrate/~.rb
     def change
       create_table :events do |t|
         t.json :payload
       end
     end
     $ Event.create(payload: { kind: "user_renamed", change: ["jack", "john"]})
     (0.1ms) BEGIN
     SQL (1.7ms) INSERT INTO "events" ("payload", "created_at", "updated_at") VALUES ($1, $2, $3) RETURNING "id" [["payload", "{\"kind\":\"user_renamed\",\"change\":[\"jack\",\"john\"]}"], ["created_at", "2015-12-10 09:57:52.294809"], ["updated_at", "2015-12-10 09:57:52.294809"]]
     (0.4ms) COMMIT
  52. 2. Json Type: operators for extracting values from JSON are available. http://www.postgresql.org/docs/current/static/functions-json.html
     $ Event.where("payload->>'name' = ?", "test1").explain
     => EXPLAIN for: SELECT "events".* FROM "events" WHERE (payload->>'name' = 'test1')
     QUERY PLAN
     ----------
     Seq Scan on events  (cost=0.00..24.85 rows=5 width=52)
       Filter: ((payload ->> 'name'::text) = 'test1'::text)
  53. 3. Hstore: stores key/value pairs in a single column, with operators for querying them. http://www.postgresql.org/docs/current/static/hstore.html
  54. 4. Materialized View: a cached view. It can speed queries up, but it has to be refreshed manually (REFRESH MATERIALIZED VIEW). http://www.postgresql.org/docs/current/static/sql-creatematerializedview.html
  55. 5. Stored Procedure (PL/pgSQL): lets you define functions that run inside PostgreSQL. http://www.postgresql.org/docs/current/static/plpgsql.html
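  56. Summary: the execution plan chosen when a SQL statement runs depends on which indexes exist and on the statistics (data volume and distribution). Speed things up by choosing an appropriate schema, indexes, and queries.
     • Index the keys used in WHERE, JOIN, ORDER BY, and GROUP BY
     • Narrow down as much as possible before the JOIN
     • Consider the JSON type and similar features case by case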
