How We Built a Real-Time Access Log Analytics Platform on AWS (JAWS UG BigData Branch)

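BigData-JAWS Study Session #9
We built an access log collection and analytics platform on AWS. This talk covers a design that balances robustness, cost, and real-time performance, along with measures to promote data use.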

Published in: Data & Analytics

  1. AWS Hajime Sano, Marketing & Data Technologist, Data Analytics & CRM Center, B to C Unit, Nikkei Inc.
  2. 1. 2. 3. TIPS
  3. • • AWS DynamoDB DAX • • www.linkedin.com/in/hsano
  4. 1. 2. 3. ≠
  5. 1. 2. 3. TIPS
  6. +
  7. frequency √ volume
  8. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 = LOG2(F√V)
  9. • 10 • • • • FT …
  10. • GROUP BY 50 • • Latency • •
  11. 2014 2015 2016 2017 2018
  12. • • • •
  13. ATLAS … • • • • 10 1 •
  14. 4
  15. RDB HADOOP BI DASHBOARD
  16. RDB HADOOP BI DASHBOARD
  17. 1. 2. 3. TIPS
  18. Endpoint Tracking Enrichment Consumers S3 Azkaban SQS Kinesis S3 Redshift Consumers ES S3 Parser Adobe Analytics Dynamo DB Dynamo DB S3 Kinesis→ES DataFeed Kinesis→S3 S3→RS EB / EC2 Rundeck
  19. Kinesis S3 Redshift Athena Spectrum ES Kibana Analytics Firehose Quick Sight (R) Jupyter (py) OSS BI/DS BI
  20. AWS SQS Kinesis S3 Redshift
  21. SQS • - - -
  22. KINESIS • Kinesis Stream - - 7 - • Firehose Analytics
  23. S3 • • Redshift • S3 - Redshift Athena Spectrum -
  24. REDSHIFT • • postgres SQL - BI • -
  25. ELASTICSEARCH EC2 • Elasticsearch Service EC2 - c4.8xlarge x 3 - r4.xlarge x 1 • - X-Pack Graph ML - ES OS JVM
  26. • Kinesis - 200ms - Consumer • Elasticsearch - - INDEX 1 • Redshift - 15 20
  27. 10 20 10 1
  28. … ATLAS A G
  29. 1. 2. 3. TIPS
  30. • • Lightning Talk • SQL Data Dojo • • • / IF
  31. R Studio Server Shiny Anaconda(JupyterHub) Chartio Re:dash Kibana DOMO DataSquad Maia KPI Screens
  32. • 2 1. 2. • - Slack -
  33. JOIN-LESS • OUTER JOIN - - JOIN • 1 - 1 - SELECT
  34. JOIN-LESS GET URL URL URL
  35. • - time sliced table UNION ALL - timestamptz - WHERE • - 3 30 … -
  36. last 7 days 7
  37. • - Redash … - • - Redash - KILL
  38. TIPS 1. 2. 3. 4.
  39. THANK YOU!
  40. Data Technologist Data Scientist
  41. QUESTIONS?
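
Slides 7 and 8 give a score of the form LOG2(F√V), with F for frequency and V for volume, laid out on a scale from 1 to 14. The arithmetic can be sketched as follows, assuming F and V are plain positive numbers. The function name `workload_score`, the units, and the sample values are hypothetical and do not come from the slides.

```python
import math


def workload_score(frequency: float, volume: float) -> float:
    """Return log2(F * sqrt(V)), the expression shown on slide 8.

    The units of `frequency` and `volume` are assumptions; the slides
    do not state them.
    """
    if frequency <= 0 or volume <= 0:
        raise ValueError("frequency and volume must be positive")
    return math.log2(frequency * math.sqrt(volume))


# Hypothetical inputs: F = 4 and V = 16 give log2(4 * 4).
print(workload_score(4, 16))
```

Because volume enters under a square root, quadrupling V raises the score by 1, the same amount as doubling F.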

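Slide 35 mentions time-sliced tables combined with UNION ALL, a timestamptz column, and a WHERE clause. One possible reading is one table per time slice, exposed through a single view. The sketch below only generates the SQL text for such a view. The table prefix `access_log_`, the view name, and the monthly slice size are hypothetical, since the slides do not show them.

```python
from typing import Sequence


def build_union_view(view_name: str, table_prefix: str, slices: Sequence[str]) -> str:
    """Build a CREATE VIEW statement that UNION ALLs time-sliced tables."""
    if not slices:
        raise ValueError("at least one time slice is required")
    # One SELECT per slice table, e.g. access_log_201801 (hypothetical names).
    selects = [f"SELECT * FROM {table_prefix}{s}" for s in slices]
    body = "\nUNION ALL\n".join(selects)
    return f"CREATE OR REPLACE VIEW {view_name} AS\n{body};"


# Hypothetical monthly slices.
sql = build_union_view("access_log_all", "access_log_", ["201801", "201802", "201803"])
print(sql)
```

A query against such a view would then restrict the timestamptz column in its WHERE clause, for example to the last 7 days as on slide 36.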