BIG DATA @ RIOT GAMES
USING HADOOP TO IMPROVE THE PLAYER EXPERIENCE
BARRY LIVINGSTON & SANDEEP SHRESTHA | JULY 2013
SPEAKERS
CONTEXT
HIGH LEVEL ARCHITECTURE
PLAYER EXPERIENCE USE CASES
SUMMARY
QUICK DATA WAREHOUSE HISTORY
FIRST, A BIT OF CONTEXT…
WHAT IS LEAGUE OF LEGENDS?
2009 LAUNCH
TEAM ORIENTED
100+ CHAMPS
MODERN FANTASY
WHAT IS LEAGUE OF LEGENDS?
LEAGUE OF LEGENDS GAMEPLAY - CHAMPIONS
LEAGUE OF LEGENDS GAMEPLAY - GAMEPLAY
A QUICK HISTORY
INITIAL LAUNCH / SCRAPPY START UP PHASE
‣  Had a single, dedicated MySQL instance for the DW
‣  Data was ETL'd from production slaves into this instance
‣  Queries were run in MySQL
‣  Reporting was done in Excel
▾  All ETLs, queries and reporting were done by one person
HISTORY: START-UP
THIS WORKED GREAT!
THEN – CRAZY GROWTH
HISTORY: START-UP, CRAZY GROWTH
Chart: TOTAL ACTIVE PLAYERS (# unique logins vs. time; marker: June 2012)
THE BREAKING POINT
HISTORY: START-UP, CRAZY GROWTH, BREAKING POINT
‣  Data warehouse reached a breaking point
▾  24 hours of data took 24.5 hours to ETL
‣  We couldn't handle…
▾  Multiple environments in a vertical MySQL instance
▾  A single environment in a vertical MySQL instance
‣  We needed to change
	
  
INTRODUCTION OF HADOOP
HISTORY: START-UP, CRAZY GROWTH, BREAKING POINT, HADOOP
‣  Hadoop has a number of great qualities
▾  Cost effective
▾  Scalable
▾  Open source
▾  We could execute quickly
HIGH LEVEL ARCHITECTURE – JUNE 2012
Diagram labels: Tableau; Hive Data Warehouse; Pentaho + Custom ETL + Sqoop; MySQL; Pentaho; Analysts; Business Analyst; regions EUROPE, KOREA and NORTH AMERICA, each with Audit, Plat and LoL
BUT, THIS WASN’T GOOD ENOUGH
‣  The time to arrive at insight was too long!
‣  Our solution required too much data team involvement
▾  Schema changes
▾  ETL tweaks
▾  Hive metadata updates
‣  Hive is painful for ad-hoc or interactive analysis
▾  Especially for non-technical folks
GOALS
‣  Democratize data access
▾  Enable Self-service Data Collection and Analysis
‣  Create actionable insights
‣  Increase speed to insight
USE CASE: GAME CLIENT PERFORMANCE
CLIENT FOOTPRINT
‣  Significant portion of our software runs directly on players' machines
▾  High performance graphics
▾  Responsiveness
‣  There is logic in these components that's ONLY exercised on the client-side
‣  Understanding the performance, reliability and stability of these features is paramount to improving the player experience
PATCHER
LOBBY CLIENT
GAME CLIENT
ITEM SHOP
CHALLENGE: THE GAME IS ALIVE
The game is a living, breathing service that's always in motion
‣  New champions
‣  New items
‣  New effects/particles
‣  Changes in environment
‣  Changes in design and design balance
UPDATE: 2-3 WEEKS
CHALLENGE: WE’RE GLOBAL
CHALLENGE: PC VARIABILITY
‣  Hardware and OS profiles are significantly different even within regions
▾  OS and patch level
▾  CPU
▾  Memory
▾  Video card
▾  Video card memory
▾  Drivers
CHALLENGE: GRAPHIC SETTINGS
CHALLENGE: CLIENT-SIDE LOGIC
IMPROVING THE PLAYER EXPERIENCE
‣  We need to gather information across all of these dimensions in order to UNDERSTAND the player experience
‣  We use this info to:
▾  React quickly to changes
▾  Optimize performance
▾  Optimize designs
▾  Improve our testing
•  Like creating our compatibility testing lab
REACTING QUICKLY
GAME LOAD SCREEN
IMPROVING LOAD TIME
OPTIMIZING DESIGN AND PERFORMANCE
HOW DID WE SOLVE THIS
WE HAVE AN ARMY OF TEEMOS WATCHING PLAYERS’ MACHINES THROUGH THEIR TELESCOPES?!
(NOT REALLY, BUT WE DID CONSIDER IT)
HONU: GENERATE - COLLECT - ANALYZE
‣  Riot's self-service end-to-end Big Data pipeline
▾  Cloud-ready (AWS compatible)
▾  Internal data-center ready
▾  Persistent storage: HDFS/S3
▾  Batch processing: Apache Hadoop/AWS EMR
▾  Data publish: Apache Hive
	
  
EVENT GENERATION
‣  Honu SDKs: Java, C++, Erlang
‣  Collector discovery
‣  Failover
‣  Load balancing
‣  Buffering/Batching
‣  Dispatching
‣  Thrift transport
HONU CLIENT SDK
Select avg(f['pingAvg']) from game_client_stats group by f['serverId'];
GAME_CLIENT_STATS
timestamp   source         app          pingAvg   serverId       system   …
1234567890  99.123.456.78  game_client  220.9542  12.345.678.90  Intel64  …
EVENT COLLECTION
‣  Honu collector
‣  Online system
‣  High availability – 100% uptime
‣  Horizontally scalable
‣  Elastic
‣  Fault tolerant
‣  Netflix OSS Eureka discovery service
HONU COLLECTOR
‣  Collect events from multiple clients (Thrift/NIO)
‣  Save all events to one compressed file locally
‣  Upload that file every XX minutes to HDFS/S3
‣  Send a message to Queue/SQS for Demux
Diagram labels: Honu Collectors, SQS, S3
EVENT ORGANIZATION
‣  Honu demux
‣  Multi-stage batch processing pipeline
‣  Elastic producer-consumer
‣  Apache Hadoop map reduce
‣  Standalone map reduce mode
‣  Apache Hive integration
HONU DEMUX
‣  Multi-Stage batch processing pipeline
‣  Bucket events to separate tables
‣  Write Hive partition files
‣  Add partitions to Hive metastore
‣  Merge partitions
	
  
Diagram labels: SQS, S3, Demux, Standalone Demux (x4), S3, HIVE, MERGE
HONU PIPELINE
GENERATE: HONU CLIENT SDK
COLLECT: HONU COLLECTORS
ORGANIZE: HONU DEMUX
USE CASE: PLAYER BEHAVIOR
PLAYER BEHAVIOR
PLAYER BEHAVIOR INITIATIVES
TRIBUNAL JUSTICE
‣  Community regulated
‣  In-game chat log
‣  Player stats
‣  Inventory
‣  Game Info
PLAYER BEHAVIOR INITIATIVES
HONOR SYSTEM
‣  Recognize positive experience
‣  Improve sportsmanship
STARTUP TIPS
TEAMS THAT USE SMART PINGS TO ALERT OTHER PLAYERS TO THREATS ARE MORE LIKELY TO WIN GAME
PLAYERS WHO FOLLOW THE SUMMONER'S CODE WIN 27% MORE GAMES
THE TRIBUNAL BANS PLAYERS FOR NEGATIVE BEHAVIOR SUCH AS VERBAL HARASSMENT
PLAYERS WHO COOPERATE WITH THEIR TEAM WIN 31% MORE GAMES
HOW WE SOLVED IT – EXTEND HONU
GENERATE: HONU CLIENT SDK
COLLECT: HONU COLLECTORS
ORGANIZE: HONU DEMUX
HONU TOOLS: DRADIS
‣  HTTP-based data collection
‣  Large volume of data from untrusted source
‣  C10K
‣  Nginx + Netty
‣  4+ billion API calls/day
‣  Peak 100K+ calls/sec
	
  
HONU TOOLS: DRADIS
‣  JSON messages:
▾  curl -d '[{"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"}, {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!", "fiction": "Hello Honu!"}]'
‣  Hive query:
▾  Select * from foo where f['fact'] = 'Hello Dradis!'
Table: Foo
HONU TOOLS: ECHO SERVICE
‣  Web UI to easily and immediately visualize the data that has been sent to Honu collectors
‣  Self-service end-to-end pipeline
HONU TOOLS: METADATA SERVICE
‣  Data discovery
‣  Schema management
‣  Counter, time
HONU TOOLS: REAL-TIME SLICING/DICING
‣  Integration with Platfora
‣  End-user ad-hoc analysis tool
‣  Interactive visual feedback
‣ Realtime exploration/graphing @ 10^9 data points
HONU TOOLS: WORKFLOW MANAGEMENT
ENTERPRISE WORKFLOW MANAGEMENT
MATT GOEKE @ LATER TODAY
Diagram labels: Client, Mobile, WWW
HONU STATS
‣  7+ billion events/day
‣  Tested @ 70+ billion events/day
‣  100+ tables
▾  10+ tables @ 100M – 1B rows/day
‣  7 Petabytes Game Event Dataset
‣  Semi-global deployment
‣  0 downtime
‣ Runs in cloud (AWS) + datacenter
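For a sense of scale, the daily figures above can be converted to average per-second rates. This is only a back-of-the-envelope sketch: "7+" and "70+" are lower bounds, and real traffic peaks well above the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_per_second(per_day: float) -> float:
    """Average rate implied by a daily total, ignoring peaks and troughs."""
    return per_day / SECONDS_PER_DAY


events_per_sec = average_per_second(7e9)   # production: 7+ billion events/day
tested_per_sec = average_per_second(70e9)  # load test: 70+ billion events/day
print(round(events_per_sec), round(tested_per_sec))
```

So 7 billion events/day is roughly 81,000 events per second on average, and the tested volume is ten times that.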
SUMMARY
GOALS
✓ Democratize Data Access
✓ Enable Self-service Data Collection and Analysis
✓ Create Actionable Insights
✓ Increase Speed to Insight
Diagram labels: HONU, HONU CLIENT SDK
FUTURE
‣  Improve self-service workflow & tooling
▾  Metadata management
▾  Discovery of captured data
▾  Workflow management
▾  Platfora to all teams
‣  Realtime event aggregation
‣  Global data infrastructure
‣  Replace legacy audit/event logging services
HANDLE INCREASING DATA VELOCITY
                      JUNE 2012                      JULY 2013
MySQL tables          180                            1200
Pipeline Events/day   0                              7+ Billion
Workflows             Cronjob + Pentaho              Oozie
Environment           Datacenter                     DC + AWS
SLA                   1 day                          2 hours
Event tracking        • 2+ weeks (DB update)         • 10 minutes
                      • Dependencies: DBA teams +    • Self-Service
                        ETL teams + Tools teams
                      • Downtime (3h min.)           • No downtime
DECREASE TEEMO DEATHS?
SHAMELESS HIRING PLUG
Like most everybody else at this conference… we’re hiring!
PLAYER EXPERIENCE FIRST
CHALLENGE CONVENTION
FOCUS ON TALENT AND TEAM
TAKE PLAY SERIOUSLY
STAY HUNGRY, STAY HUMBLE
THE RIOT MANIFESTO
SHAMELESS HIRING PLUG
AND YES, YOU CAN PLAY GAMES AT WORK
IT’S ENCOURAGED!
THANK YOU! QUESTIONS?
BARRY LIVINGSTON
blivingston@riotgames.com
SANDEEP SHRESTHA
sshrestha@riotgames.com

More Related Content

What's hot

Minio ♥ Go
Minio ♥ GoMinio ♥ Go
Minio ♥ Go
Minio
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
Saurav Haloi
 
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a ServiceZeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Databricks
 
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
DataWorks Summit
 
Galera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slidesGalera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slides
Severalnines
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
Yugabyte
 
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
VMware Tanzu
 
Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentation
Rodrigo Missiaggia
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
Jose De La Rosa
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
Isheeta Sanghi
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
Aljoscha Krettek
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
Amazon Web Services
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Amy W. Tang
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data Ingestion
DataWorks Summit
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
Araf Karsh Hamid
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
Evans Ye
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
Alluxio, Inc.
 
Sql Antipatterns Strike Back
Sql Antipatterns Strike BackSql Antipatterns Strike Back
Sql Antipatterns Strike Back
Karwin Software Solutions LLC
 
Strongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache PhoenixStrongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache Phoenix
YugabyteDB
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
Brendan Gregg
 

What's hot (20)

Minio ♥ Go
Minio ♥ GoMinio ♥ Go
Minio ♥ Go
 
Introduction to Apache ZooKeeper
Introduction to Apache ZooKeeperIntroduction to Apache ZooKeeper
Introduction to Apache ZooKeeper
 
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a ServiceZeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
Zeus: Uber’s Highly Scalable and Distributed Shuffle as a Service
 
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ NetflixWhoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
Whoops, The Numbers Are Wrong! Scaling Data Quality @ Netflix
 
Galera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slidesGalera Cluster - Node Recovery - Webinar slides
Galera Cluster - Node Recovery - Webinar slides
 
YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
 
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
Greenplum and Kafka: Real-time Streaming to Greenplum - Greenplum Summit 2019
 
Red hat ceph storage customer presentation
Red hat ceph storage customer presentationRed hat ceph storage customer presentation
Red hat ceph storage customer presentation
 
Ceph Performance and Sizing Guide
Ceph Performance and Sizing GuideCeph Performance and Sizing Guide
Ceph Performance and Sizing Guide
 
Apache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup SlidesApache NiFi- MiNiFi meetup Slides
Apache NiFi- MiNiFi meetup Slides
 
Apache Flink and what it is used for
Apache Flink and what it is used forApache Flink and what it is used for
Apache Flink and what it is used for
 
Amazon Aurora: Under the Hood
Amazon Aurora: Under the HoodAmazon Aurora: Under the Hood
Amazon Aurora: Under the Hood
 
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
Espresso: LinkedIn's Distributed Data Serving Platform (Paper)
 
Sqoop on Spark for Data Ingestion
Sqoop on Spark for Data IngestionSqoop on Spark for Data Ingestion
Sqoop on Spark for Data Ingestion
 
Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics Apache Flink, AWS Kinesis, Analytics
Apache Flink, AWS Kinesis, Analytics
 
Trend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache BigtopTrend Micro Big Data Platform and Apache Bigtop
Trend Micro Big Data Platform and Apache Bigtop
 
Apache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic DatasetsApache Iceberg - A Table Format for Hige Analytic Datasets
Apache Iceberg - A Table Format for Hige Analytic Datasets
 
Sql Antipatterns Strike Back
Sql Antipatterns Strike BackSql Antipatterns Strike Back
Sql Antipatterns Strike Back
 
Strongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache PhoenixStrongly Consistent Global Indexes for Apache Phoenix
Strongly Consistent Global Indexes for Apache Phoenix
 
Linux Profiling at Netflix
Linux Profiling at NetflixLinux Profiling at Netflix
Linux Profiling at Netflix
 

Similar to Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013

(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
Amazon Web Services
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
C4Media
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
sean_seannery
 
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka AggregationKafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
confluent
 
Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)
Kevin Weil
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
Nathan Bijnens
 
MySQL Performance Monitoring
MySQL Performance MonitoringMySQL Performance Monitoring
MySQL Performance Monitoring
spil-engineering
 
Monitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to backMonitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to back
Icinga
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Data Con LA
 
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Amazon Web Services Korea
 
Using Event Streams in Serverless Applications
Using Event Streams in Serverless ApplicationsUsing Event Streams in Serverless Applications
Using Event Streams in Serverless Applications
Jonathan Dee
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
Sean Chittenden
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
C4Media
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
Tomas Cervenka
 
HDInsight for Architects
HDInsight for ArchitectsHDInsight for Architects
HDInsight for Architects
Ashish Thapliyal
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event data
yalisassoon
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in Practice
C4Media
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Peak Hosting
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
DataWorks Summit
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hd
slantsixgames
 

Similar to Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013 (20)

(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
(GAM301) Real-Time Game Analytics with Amazon Kinesis, Amazon Redshift, and A...
 
Elastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @DatadogElastic Data Analytics Platform @Datadog
Elastic Data Analytics Platform @Datadog
 
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLARiot Games Scalable Data Warehouse Lecture at UCSB / UCLA
Riot Games Scalable Data Warehouse Lecture at UCSB / UCLA
 
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka AggregationKafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
Kafka Summit SF 2017 - Riot's Journey to Global Kafka Aggregation
 
Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)Hadoop at Twitter (Hadoop Summit 2010)
Hadoop at Twitter (Hadoop Summit 2010)
 
Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013Microsoft Big Data @ SQLUG 2013
Microsoft Big Data @ SQLUG 2013
 
MySQL Performance Monitoring
MySQL Performance MonitoringMySQL Performance Monitoring
MySQL Performance Monitoring
 
Monitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to backMonitor OpenStack Environments from the bottom up and front to back
Monitor OpenStack Environments from the bottom up and front to back
 
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice MachineSpark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
Spark as part of a Hybrid RDBMS Architecture-John Leach Cofounder Splice Machine
 
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
Gam301 Real-Time Game Analytics with Amazon Redshift, Amazon Kinesis, and Ama...
 
Using Event Streams in Serverless Applications
Using Event Streams in Serverless ApplicationsUsing Event Streams in Serverless Applications
Using Event Streams in Serverless Applications
 
Creating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at ScaleCreating PostgreSQL-as-a-Service at Scale
Creating PostgreSQL-as-a-Service at Scale
 
Data Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFixData Science in the Cloud @StitchFix
Data Science in the Cloud @StitchFix
 
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNAFirst Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
First Hive Meetup London 2012-07-10 - Tomas Cervenka - VisualDNA
 
HDInsight for Architects
HDInsight for ArchitectsHDInsight for Architects
HDInsight for Architects
 
Understanding event data
Understanding event dataUnderstanding event data
Understanding event data
 
Lambda Architectures in Practice
Lambda Architectures in PracticeLambda Architectures in Practice
Lambda Architectures in Practice
 
Webinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration MigraineWebinar - Order out of Chaos: Avoiding the Migration Migraine
Webinar - Order out of Chaos: Avoiding the Migration Migraine
 
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and DruidOpen Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
Open Source Lambda Architecture with Hadoop, Kafka, Samza and Druid
 
Supersize your production pipe enjmin 2013 v1.1 hd
Supersize your production pipe    enjmin 2013 v1.1 hdSupersize your production pipe    enjmin 2013 v1.1 hd
Supersize your production pipe enjmin 2013 v1.1 hd
 

More from StampedeCon

Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
Why Should We Trust You-Interpretability of Deep Neural Networks - StampedeCo...
StampedeCon
 
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
The Search for a New Visual Search Beyond Language - StampedeCon AI Summit 2017
StampedeCon
 
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
Predicting Outcomes When Your Outcomes are Graphs - StampedeCon AI Summit 2017
StampedeCon
 
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
Novel Semi-supervised Probabilistic ML Approach to SNP Variant Calling - Stam...
StampedeCon
 
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
How to Talk about AI to Non-analaysts - Stampedecon AI Summit 2017
StampedeCon
 
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
Getting Started with Keras and TensorFlow - StampedeCon AI Summit 2017
StampedeCon
 
Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017Foundations of Machine Learning - StampedeCon AI Summit 2017
Foundations of Machine Learning - StampedeCon AI Summit 2017
StampedeCon
 
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
Don't Start from Scratch: Transfer Learning for Novel Computer Vision Problem...
StampedeCon
 
Bringing the Whole Elephant Into View Can Cognitive Systems Bring Real Soluti...
Big Data at Riot Games – Using Hadoop to Understand Player Experience - StampedeCon 2013

  • 1. BIG DATA @ RIOT GAMES USING HADOOP TO IMPROVE THE PLAYER EXPERIENCE BARRY LIVINGSTON & SANDEEP SHRESTHA | JULY 2013
  • 3. CONTEXT | HIGH LEVEL ARCHITECTURE | PLAYER EXPERIENCE USE CASES | SUMMARY | QUICK DATA WAREHOUSE HISTORY
  • 4. FIRST, A BIT OF CONTEXT…
  • 5.
  • 6. WHAT IS LEAGUE OF LEGENDS? 2009 LAUNCH | TEAM ORIENTED | 100+ CHAMPS | MODERN FANTASY
  • 7. WHAT IS LEAGUE OF LEGENDS?
  • 8. LEAGUE OF LEGENDS GAMEPLAY - CHAMPIONS
  • 9. LEAGUE OF LEGENDS GAMEPLAY - GAMEPLAY
  • 11. INITIAL LAUNCH / SCRAPPY START UP PHASE
      HISTORY: START-UP
      ‣ Had a single, dedicated MySQL instance for the DW
      ‣ Data was ETL’d from production slaves into this instance
      ‣ Queries were run in MySQL
      ‣ Reporting was done in Excel
        ▾ All ETLs, queries and reporting were done by one person
      THIS WORKED GREAT!
  • 12. THEN – CRAZY GROWTH
      HISTORY: START-UP → CRAZY GROWTH
      [Chart: TOTAL ACTIVE PLAYERS, # unique logins over time, through June 2012]
  • 13. THE BREAKING POINT
      HISTORY: START-UP → CRAZY GROWTH → BREAKING POINT
      ‣ Data warehouse reached a breaking point
        ▾ 24 hours of data took 24.5 hours to ETL
      ‣ We couldn’t handle…
        ▾ Multiple environments in a vertical MySQL instance
        ▾ A single environment in a vertical MySQL instance
      ‣ We needed to change
  • 14. INTRODUCTION OF HADOOP
      HISTORY: START-UP → CRAZY GROWTH → BREAKING POINT → HADOOP
      ‣ Hadoop has a number of great qualities
        ▾ Cost effective
        ▾ Scalable
        ▾ Open source
        ▾ We could execute quickly
  • 15. HIGH LEVEL ARCHITECTURE – JUNE 2012
      [Diagram labels: regional sources EUROPE, KOREA and NORTH AMERICA (each with Audit, Plat, LoL); Pentaho + Custom ETL + Sqoop; Hive Data Warehouse; MySQL; Pentaho; Tableau; Analysts; Business Analyst]
  • 16. BUT, THIS WASN’T GOOD ENOUGH
      ‣ The time to arrive at insight was too long!
      ‣ Our solution required too much data team involvement
        ▾ Schema changes
        ▾ ETL tweaks
        ▾ Hive metadata updates
      ‣ Hive is painful for ad-hoc or interactive analysis
        ▾ Especially for non-technical folks
  • 17. GOALS
      ‣ Democratize data access
        ▾ Enable Self-service Data Collection and Analysis
      ‣ Create actionable insights
      ‣ Increase speed to insight
  • 18. USE CASE: GAME CLIENT PERFORMANCE
  • 19. CLIENT FOOTPRINT
      ‣ Significant portion of our software runs directly on players’ machines
        ▾ High performance graphics
        ▾ Responsiveness
      ‣ There is logic in these components that's ONLY exercised on the client-side
      ‣ Understanding the performance, reliability and stability of these features is paramount to improving the player experience
  • 24. CHALLENGE: THE GAME IS ALIVE
      The game is a living, breathing service that’s always in motion
      ‣ New champions
      ‣ New items
      ‣ New effects/particles
      ‣ Changes in environment
      ‣ Changes in design and design balance
      UPDATE: 2-3 WEEKS
  • 26. CHALLENGE: PC VARIABILITY
      ‣ Hardware and OS profiles are significantly different even within regions
        ▾ OS and patch level
        ▾ CPU
        ▾ Memory
        ▾ Video card
        ▾ Video card memory
        ▾ Drivers
  • 29. IMPROVING THE PLAYER EXPERIENCE
      ‣ We need to gather information across all of these dimensions in order to UNDERSTAND the player experience
      ‣ We use this info to:
        ▾ React quickly to changes
        ▾ Optimize performance
        ▾ Optimize designs
        ▾ Improve our testing
          • Like creating our compatibility testing lab
  • 33. OPTIMIZING DESIGN AND PERFORMANCE
  • 34. OPTIMIZING DESIGN AND PERFORMANCE
  • 35. OPTIMIZING DESIGN AND PERFORMANCE
  • 36. OPTIMIZING DESIGN AND PERFORMANCE
  • 37. HOW DID WE SOLVE THIS? WE HAVE AN ARMY OF TEEMOS WATCHING PLAYERS’ MACHINES THROUGH THEIR TELESCOPES?! (NOT REALLY, BUT WE DID CONSIDER IT)
  • 38. HONU: GENERATE - COLLECT - ANALYZE
      ‣ Riot’s self-service end-to-end Big Data pipeline
        ▾ Cloud-ready (AWS compatible)
        ▾ Internal data-center ready
        ▾ Persistent storage: HDFS/S3
        ▾ Batch processing: Apache Hadoop/AWS EMR
        ▾ Data publish: Apache Hive
  • 39. EVENT GENERATION
      ‣ Honu SDKs: Java, C++, Erlang
      ‣ Collector discovery
      ‣ Failover
      ‣ Load balancing
      ‣ Buffering/Batching
      ‣ Dispatching
      ‣ Thrift transport
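      The buffering/batching and failover duties listed for the SDK can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the class name `BatchingDispatcher`, the batch size, and the plain Python callables standing in for collector connections (the slides state that the real SDKs are Java, C++ and Erlang and use a Thrift transport, which is not modelled).

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
# Hypothetical collector connection: a callable that raises ConnectionError
# when the collector cannot be reached.
Sender = Callable[[List[Event]], None]


class BatchingDispatcher:
    """Buffers events and flushes them in batches, trying collectors in order."""

    def __init__(self, senders: List[Sender], batch_size: int = 3) -> None:
        if not senders:
            raise ValueError("at least one collector sender is required")
        self.senders = senders
        self.batch_size = batch_size
        self.buffer: List[Event] = []

    def emit(self, event: Event) -> None:
        """Queue one event; flush once the buffer reaches the batch size."""
        self.buffer.append(event)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self) -> bool:
        """Send the buffered batch to the first collector that accepts it."""
        if not self.buffer:
            return True
        batch = list(self.buffer)
        for send in self.senders:
            try:
                send(batch)
            except ConnectionError:
                continue  # failover: try the next collector
            self.buffer.clear()
            return True
        return False  # every collector failed; events stay buffered


received: List[List[Event]] = []


def broken_collector(batch: List[Event]) -> None:
    raise ConnectionError("collector down")


def healthy_collector(batch: List[Event]) -> None:
    received.append(batch)


dispatcher = BatchingDispatcher([broken_collector, healthy_collector], batch_size=2)
dispatcher.emit({"pingAvg": 220.9})
dispatcher.emit({"pingAvg": 180.1})  # second event triggers a flush
```

      Keeping events in the buffer when every collector fails is one possible design choice; a production client would also need a bound on buffer size and a retry schedule.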
  • 40. HONU CLIENT SDK
      Select avg(f['pingAVG']) from game_client_stats group by f['serverId'];
      Table GAME_CLIENT_STATS (sample row):
        timestamp: 1234567890
        source: 99.123.456.78
        app: game_client
        pingAvg: 220.9542
        serverId: 12.345.678.90
        system: Intel64 …
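      To make the query on this slide concrete, here is a small sketch of what an average-per-group over a map-typed column computes. It assumes each row's `f` column is a string-to-string map with `pingAvg` and `serverId` keys; the sample rows and server names are invented for illustration and are not data from the talk.

```python
from collections import defaultdict
from typing import Dict, Iterable


def avg_ping_by_server(rows: Iterable[Dict[str, str]]) -> Dict[str, float]:
    """Average f['pingAvg'] per f['serverId'], like avg(...) with group by."""
    totals: Dict[str, float] = defaultdict(float)
    counts: Dict[str, int] = defaultdict(int)
    for f in rows:
        server = f["serverId"]
        totals[server] += float(f["pingAvg"])  # map values arrive as strings
        counts[server] += 1
    return {server: totals[server] / counts[server] for server in totals}


# Hypothetical contents of the map column f for three events.
sample_rows = [
    {"serverId": "server-a", "pingAvg": "200.0"},
    {"serverId": "server-a", "pingAvg": "100.0"},
    {"serverId": "server-b", "pingAvg": "50.0"},
]
result = avg_ping_by_server(sample_rows)
```

      With these rows, `result` maps `server-a` to 150.0 and `server-b` to 50.0.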
  • 41. EVENT COLLECTION
      ‣ Honu collector
      ‣ Online system
      ‣ High availability – 100% uptime
      ‣ Horizontally scalable
      ‣ Elastic
      ‣ Fault tolerant
      ‣ Netflix OSS Eureka discovery service
  • 42. HONU COLLECTOR
      ‣ Collect events from multiple clients (Thrift/NIO)
      ‣ Save all events to one compressed file locally
      ‣ Upload that file every XX minutes to HDFS/S3
      ‣ Send a message to Queue/SQS for Demux
      [Diagram labels: Honu Collectors, SQS, S3]
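      The collector steps above (write events to one compressed local file, upload it, then notify the demux stage) can be sketched as follows. This is only an illustration under stated assumptions: a dictionary stands in for HDFS/S3, a `queue.Queue` stands in for the Queue/SQS, and the gzip-compressed JSON-lines layout, file name and function names are hypothetical choices, not the format Honu actually uses.

```python
import gzip
import json
import queue
import tempfile
from pathlib import Path
from typing import Dict, List

Event = Dict[str, object]


def write_compressed_batch(events: List[Event], directory: Path, name: str) -> Path:
    """Save all events to one gzip-compressed JSON-lines file on local disk."""
    path = directory / f"{name}.jsonl.gz"
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for event in events:
            handle.write(json.dumps(event) + "\n")
    return path


def publish(path: Path, uploaded: Dict[str, bytes], demux_queue: "queue.Queue") -> None:
    """Upload the file (dict stands in for HDFS/S3), then notify demux."""
    key = path.name
    uploaded[key] = path.read_bytes()
    demux_queue.put({"key": key})  # message tells demux which file is ready


events: List[Event] = [
    {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"},
    {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!"},
]
uploaded: Dict[str, bytes] = {}
demux_queue: "queue.Queue" = queue.Queue()

with tempfile.TemporaryDirectory() as tmp:
    batch_path = write_compressed_batch(events, Path(tmp), "batch-0001")
    publish(batch_path, uploaded, demux_queue)

# A consumer reads the message and decodes the uploaded file.
message = demux_queue.get_nowait()
decoded = [
    json.loads(line)
    for line in gzip.decompress(uploaded[message["key"]]).decode("utf-8").splitlines()
]
```

      Sending the queue message only after the upload completes means a consumer never sees a pointer to a file that is not yet available.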
  • 43. EVENT ORGANIZATION
      ‣ Honu demux
      ‣ Multi-stage batch processing pipeline
      ‣ Elastic producer-consumer
      ‣ Apache Hadoop map reduce
      ‣ Standalone map reduce mode
      ‣ Apache Hive integration
  • 44. HONU DEMUX
      ‣ Multi-Stage batch processing pipeline
      ‣ Bucket events to separate tables
      ‣ Write Hive partition files
      ‣ Add partitions to Hive metastore
      ‣ Merge partitions
      [Diagram labels: SQS, S3, Demux, Standalone Demux (×4), S3 (×4), MERGE, HIVE]
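      The "bucket events to separate tables" step can be sketched as a grouping by table and partition. The slides do not say how partitions are keyed, so the scheme below (table name from `messageType`, one partition per UTC day derived from `timestamp`) is an assumption, as are the function names and the sample events.

```python
from collections import defaultdict
from datetime import datetime, timezone
from typing import Dict, List, Tuple

Event = Dict[str, object]
PartitionKey = Tuple[str, str]  # (table name, partition day)


def partition_key(event: Event) -> PartitionKey:
    """Assumed scheme: table from messageType, partition from the UTC date."""
    table = str(event["messageType"]).lower()
    day = datetime.fromtimestamp(int(event["timestamp"]), tz=timezone.utc)
    return table, day.strftime("%Y-%m-%d")


def demux(events: List[Event]) -> Dict[PartitionKey, List[Event]]:
    """Bucket a mixed batch of events into per-table, per-day groups."""
    buckets: Dict[PartitionKey, List[Event]] = defaultdict(list)
    for event in events:
        buckets[partition_key(event)].append(event)
    return dict(buckets)


mixed_batch: List[Event] = [
    {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"},
    {"messageType": "Bar", "timestamp": 1369064555, "fact": "Hello Honu!"},
    {"messageType": "Foo", "timestamp": 1369064555 + 86400, "fact": "Next day"},
]
buckets = demux(mixed_batch)
```

      Each bucket would then be written out as one partition file and registered with the Hive metastore; those two steps are not shown here.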
  • 48. PLAYER BEHAVIOR INITIATIVES: TRIBUNAL JUSTICE
      ‣ Community regulated
      ‣ In-game chat log
      ‣ Player stats
      ‣ Inventory
      ‣ Game Info
  • 49. PLAYER BEHAVIOR INITIATIVES: HONOR SYSTEM
      ‣ Recognize positive experience
      ‣ Improve sportsmanship
  • 50. STARTUP TIPS
      ‣ TEAMS THAT USE SMART PINGS TO ALERT OTHER PLAYERS TO THREATS ARE MORE LIKELY TO WIN GAME
      ‣ PLAYERS WHO FOLLOW THE SUMMONER'S CODE WIN 27% MORE GAMES
      ‣ THE TRIBUNAL BANS PLAYERS FOR NEGATIVE BEHAVIOR SUCH AS VERBAL HARASSMENT
      ‣ PLAYERS WHO COOPERATE WITH THEIR TEAM WIN 31% MORE GAMES
  • 51. HOW WE SOLVED IT – EXTEND HONU
      GENERATE: HONU CLIENT SDK → COLLECT: HONU COLLECTORS → ORGANIZE: HONU DEMUX
  • 52. HONU TOOLS: DRADIS
      ‣ HTTP based data collection
      ‣ Large volume of data from untrusted source
      ‣ C10K
      ‣ Nginx + Netty
      ‣ 4+ billion API calls/day
      ‣ Peak 100K+ calls/sec
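      As a quick back-of-the-envelope check on the two throughput figures above, the daily volume can be converted to an average rate and compared with the quoted peak. The inputs are the slide's lower bounds ("4+ billion", "100K+"), so the results are rough estimates rather than measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

calls_per_day = 4_000_000_000   # "4+ billion API calls/day" (lower bound)
peak_calls_per_sec = 100_000    # "Peak 100K+ calls/sec" (lower bound)

# Average rate if the daily volume were spread evenly over the day.
average_calls_per_sec = calls_per_day / SECONDS_PER_DAY

# How far the quoted peak sits above that even-spread average.
peak_to_average = peak_calls_per_sec / average_calls_per_sec

print(f"average: {average_calls_per_sec:,.0f} calls/sec")
print(f"peak / average: {peak_to_average:.2f}x")
```

      This works out to roughly 46,000 calls/sec on average, with the quoted peak a little over twice that.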
  • 53. HONU TOOLS: DRADIS
      ‣ JSON messages:
        ▾ curl -d '[{"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"}, {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!", "fiction": "Hello Honu!"}]'
      ‣ Hive query:
        ▾ Select * from foo where f['fact'] = 'Hello Dradis!'
      Table: Foo
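      The mapping from the posted JSON messages to rows of table Foo can be sketched as below. The row layout is an assumption inferred from the slide's query: `messageType` picks the table, `timestamp` becomes its own column, and every remaining field lands in the map column `f`. The function name `to_rows` and the `RESERVED` set are hypothetical.

```python
import json
from typing import Dict, List

# Fields assumed to become dedicated columns rather than entries in the map f.
RESERVED = {"messageType", "timestamp"}


def to_rows(payload: str) -> List[Dict[str, object]]:
    """Turn a JSON array of messages into table rows with a map column f."""
    rows = []
    for message in json.loads(payload):
        rows.append({
            "table": message["messageType"].lower(),
            "timestamp": message["timestamp"],
            "f": {k: v for k, v in message.items() if k not in RESERVED},
        })
    return rows


payload = (
    '[{"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello World!"},'
    ' {"messageType": "Foo", "timestamp": 1369064555, "fact": "Hello Dradis!",'
    ' "fiction": "Hello Honu!"}]'
)
rows = to_rows(payload)

# Equivalent of: Select * from foo where f['fact'] = 'Hello Dradis!'
matches = [
    row for row in rows
    if row["table"] == "foo" and row["f"].get("fact") == "Hello Dradis!"
]
```

      Because unknown fields simply become new map keys, the second message can carry an extra `fiction` field without any schema change, which fits the self-service goal described in the talk.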
  • 54. HONU TOOLS: ECHO SERVICE
      ‣ Web UI to easily and immediately visualize the data that has been sent to Honu collectors
      ‣ Self-service end-to-end pipeline
  • 55. HONU TOOLS: ECHO SERVICE
      ‣ Web UI to easily and immediately visualize the data that has been sent to Honu collectors
      ‣ Self-service end-to-end pipeline
  • 56. HONU TOOLS: ECHO SERVICE
      ‣ Web UI to easily and immediately visualize the data that has been sent to Honu collectors
      ‣ Self-service end-to-end pipeline
  • 57. HONU TOOLS: METADATA SERVICE
      ‣ Data discovery
      ‣ Schema management
      ‣ Counter, time
  • 58. HONU TOOLS: REAL-TIME SLICING/DICING
      ‣ Integration with Platfora
      ‣ End-user ad-hoc analysis tool
      ‣ Interactive visual feedback
      ‣ Realtime exploration/graphing @ 10^9 data points
  • 59. HONU TOOLS: REAL-TIME SLICING/DICING
  • 60. HONU TOOLS: WORKFLOW MANAGEMENT
      ENTERPRISE WORKFLOW MANAGEMENT: MATT GOEKE @ LATER TODAY
      [Diagram labels: Client, Mobile, WWW]
  • 61. HONU STATS
      ‣ 7+ billion events/day
      ‣ Tested @ 70+ billion events/day
      ‣ 100+ tables
        ▾ 10+ tables @ 100M – 1B rows/day
      ‣ 7 Petabytes Game Event Dataset
      ‣ Semi-global deployment
      ‣ 0 downtime
      ‣ Runs in cloud (AWS) + datacenter
  • 63. GOALS
      ✓ Democratize Data Access
      ✓ Enable Self-service Data Collection and Analysis
      ✓ Create Actionable Insights
      ✓ Increase Speed to Insight
      [Diagram labels: HONU, HONU CLIENT SDK]
  • 64. FUTURE
      ‣ Improve self-service workflow & tooling
        ▾ Metadata management
        ▾ Discovery of captured data
        ▾ Workflow management
        ▾ Platfora to all teams
      ‣ Realtime event aggregation
      ‣ Global data infrastructure
      ‣ Replace legacy audit/event logging services
  • 65. HANDLE INCREASING DATA VELOCITY
                            | JUNE 2012                  | JULY 2013
      MySQL tables          | 180                        | 1200
      Pipeline Events/day   | 0                          | 7+ Billion
      Workflows             | Cronjob + Pentaho          | Oozie
      Environment           | Datacenter                 | DC + AWS
      SLA                   | 1 day                      | 2 hours
      Event tracking        | • 2+ weeks (DB update)     | • 10 minutes
                            | • Dependencies: DBA teams  | • Self-Service
                            |   + ETL teams + Tools teams| • No downtime
                            | • Downtime (3h min.)       |
  • 67. SHAMELESS HIRING PLUG
      Like most everybody else at this conference… we’re hiring!
      THE RIOT MANIFESTO
      ‣ PLAYER EXPERIENCE FIRST
      ‣ CHALLENGE CONVENTION
      ‣ FOCUS ON TALENT AND TEAM
      ‣ TAKE PLAY SERIOUSLY
      ‣ STAY HUNGRY, STAY HUMBLE
  • 68. SHAMELESS HIRING PLUG
      AND YES, YOU CAN PLAY GAMES AT WORK. IT’S ENCOURAGED!
  • 69. THANK YOU! QUESTIONS?
      BARRY LIVINGSTON: blivingston@riotgames.com
      SANDEEP SHRESTHA: sshrestha@riotgames.com