MongoDB: Queries and Aggregation Framework with NBA Game Data
2.
Introducing an Awesome Data Set
•Scraped basketball-reference.com
•Mad props to NPM module Cheerio
•Box scores for all 31,686 NBA games since 1985
•Download: http://bit.ly/1jlgs9u via S3
•Untar and run mongorestore
*
3.
Data Set Structure
•Contains final score
•Contains box score for teams and players
*
4.
Data Set Structure - High Level
•Contains _id, date
•Info on winning team and losing team
*
5.
Data Set Structure - Box
•Box score contains detailed stats by team
*
6.
Data Set Structure - Box
•And also for individual players:
*
7.
Queries and Aggregation
•MongoDB has a rich query framework
•The aggregation framework is similar to SQL’s GROUP BY
*
8.
Query Basics - findOne()
•When was Kobe Bryant’s 81-point game?
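A minimal sketch of how this lookup might be written with findOne(). The `games` collection name and every field name (`winner`, `loser`, `players`, `player`, `pts`) are assumptions made for illustration; the real documents may be laid out differently.

```javascript
// Assumption: each game document looks roughly like
//   { date, winner: { name, score, players: [{ player, pts }] }, loser: { ...same shape } }
// These field names and the `games` collection are hypothetical.
const kobe81 = { $elemMatch: { player: "Kobe Bryant", pts: 81 } };

// The player could be on either side, so check both teams' player lists.
const query = { $or: [{ "winner.players": kobe81 }, { "loser.players": kobe81 }] };

// Return only the date and the two team names.
const projection = { date: 1, "winner.name": 1, "loser.name": 1 };

// `db` exists only inside the mongo shell; the guard lets this file load elsewhere too.
if (typeof db !== "undefined") {
  printjson(db.games.findOne(query, projection));
}
```

findOne() returns the first matching document (or null), which suits a question with a single expected answer.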
*
9.
Query Basics - find()
•Which teams have lost despite scoring more than
150 points?
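One way this question could be expressed with find(), as a sketch only. It assumes a hypothetical `games` collection whose documents carry `winner` and `loser` subdocuments with `name` and `score` fields; none of these names are confirmed by the slides.

```javascript
// Assumption: documents have the hypothetical shape
//   { date, winner: { name, score }, loser: { name, score } }
// Match games where the losing side still scored more than 150 points.
const query = { "loser.score": { $gt: 150 } };

// Show just the date, team names, and final scores.
const projection = {
  date: 1,
  "winner.name": 1, "winner.score": 1,
  "loser.name": 1, "loser.score": 1
};

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.games.find(query, projection).forEach(printjson);
}
```

Unlike findOne(), find() returns a cursor over every matching game, so the results are printed one by one.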
*
10.
Query Basics - count()
•How many games did the Lakers win in the 1999-2000 season?
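A hedged sketch of the count, assuming a hypothetical `games` collection with a `date` field and a `winner.name` field, and assuming the team is stored as "Los Angeles Lakers". The season is approximated by an August-to-August window, which would also include playoff games.

```javascript
// Assumption: documents look like { date, winner: { name, ... }, loser: { name, ... } }.
// The field names, collection name, and team-name spelling are hypothetical.
const seasonStart = new Date("1999-08-01T00:00:00Z");
const seasonEnd = new Date("2000-08-01T00:00:00Z");

const query = {
  "winner.name": "Los Angeles Lakers",
  date: { $gte: seasonStart, $lt: seasonEnd } // half-open window around the season
};

// Runs only inside the mongo shell, where `db` is defined.
// Newer shells prefer db.games.countDocuments(query) over count().
if (typeof db !== "undefined") {
  print(db.games.count(query));
}
```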
*
11.
Query Basics - distinct()
•Which teams have lost a game despite having a
player make at least 10 three-pointers?
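A possible distinct() call for this question, as a sketch. It assumes a hypothetical `games` collection in which each document has a `loser` subdocument with a `name` and a `players` array, and that each player entry records made three-pointers in a field called `fg3`. All of these names are illustrative guesses.

```javascript
// Assumption: documents look like
//   { loser: { name, players: [{ player, fg3 }] }, winner: { ... } }
// `fg3` (three-pointers made) and the other names are hypothetical.
const field = "loser.name";

// A single condition on an array of subdocuments matches if any element satisfies it.
const query = { "loser.players.fg3": { $gte: 10 } };

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.games.distinct(field, query));
}
```

distinct() takes the field to collect and an optional filter, and returns each value once, so a team that lost several such games appears a single time.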
*
12.
Query Basics - $elemMatch operator
•When did Michael Jordan score 60 points in a losing
effort?
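A sketch of how $elemMatch might be applied here, reading "60 points" as at least 60. The `games` collection and the `loser.players`, `player`, and `pts` field names are assumptions for illustration.

```javascript
// Assumption: documents look like
//   { date, loser: { name, players: [{ player, pts }] }, winner: { ... } }
// All field names and the `games` collection are hypothetical.
const query = {
  "loser.players": {
    // Both conditions must hold for the SAME array element. Without $elemMatch,
    // one player could satisfy the name test and a different player the points test.
    $elemMatch: { player: "Michael Jordan", pts: { $gte: 60 } }
  }
};

const projection = { date: 1, "winner.name": 1, "loser.name": 1 };

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.games.find(query, projection).forEach(printjson);
}
```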
*
14.
Query Basics - .sort() and .limit()
•What are the 5 highest point totals for a losing
team?
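A minimal sketch of chaining .sort() and .limit() for this question. It assumes a hypothetical `games` collection where the losing team's final score is stored in `loser.score`.

```javascript
// Assumption: documents look like { date, winner: { name, score }, loser: { name, score } }.
// Field names and the `games` collection are hypothetical.
const sortSpec = { "loser.score": -1 }; // -1 sorts descending, highest loser score first
const topN = 5;

const projection = { date: 1, "loser.name": 1, "loser.score": 1, "winner.name": 1 };

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  db.games.find({}, projection).sort(sortSpec).limit(topN).forEach(printjson);
}
```

An empty filter selects every game; the cursor is then ordered by the sort specification and cut off after five documents.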
*
16.
Aggregation
•Similar to SQL’s GROUP BY
•Filters and transforms data in pipeline stages
•Stages are chainable
•Accessible via the .aggregate() function in shell
*
17.
Aggregation - Lakers Season PPG
•How many points did the Lakers average in games
they won in the 2008-2009 season?
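A sketch of a two-stage pipeline that could answer this. The `games` collection, the `winner.name` and `winner.score` fields, and the "Los Angeles Lakers" spelling are assumptions; the season is approximated by an August-to-August window, so playoff wins would be included.

```javascript
// Assumption: documents look like { date, winner: { name, score }, loser: { name, score } }.
// Field names, collection name, and team-name spelling are hypothetical.
const seasonStart = new Date("2008-08-01T00:00:00Z");
const seasonEnd = new Date("2009-08-01T00:00:00Z");

const pipeline = [
  // Stage 1: keep only Lakers wins inside the season window.
  { $match: { "winner.name": "Los Angeles Lakers", date: { $gte: seasonStart, $lt: seasonEnd } } },
  // Stage 2: collapse all remaining games into one group and average the score.
  { $group: { _id: null, gamesWon: { $sum: 1 }, avgPoints: { $avg: "$winner.score" } } }
];

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.games.aggregate(pipeline).toArray());
}
```

Grouping with `_id: null` puts every matched game into a single bucket, which is the pipeline equivalent of an aggregate without a GROUP BY column.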
*
19.
Aggregation - $sort and $limit
•Compute the teams with the 5 best records in the
1999-2000 season
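One way such a ranking might be built, as a sketch under assumptions. It supposes a hypothetical `games` collection with `winner.name` and `loser.name` fields, uses an August-to-August window (so playoff games count too), ranks by total wins, and relies on array literals in $project, which require MongoDB 3.2 or later.

```javascript
// Assumption: documents look like { date, winner: { name }, loser: { name } }.
// Field names and the `games` collection are hypothetical.
const seasonStart = new Date("1999-08-01T00:00:00Z");
const seasonEnd = new Date("2000-08-01T00:00:00Z");

const pipeline = [
  { $match: { date: { $gte: seasonStart, $lt: seasonEnd } } },
  // Emit one entry per team per game so wins and games played can be tallied together.
  { $project: { results: [
    { team: "$winner.name", won: 1 },
    { team: "$loser.name", won: 0 }
  ] } },
  { $unwind: "$results" },
  { $group: { _id: "$results.team", wins: { $sum: "$results.won" }, games: { $sum: 1 } } },
  { $sort: { wins: -1 } }, // most wins first
  { $limit: 5 }
];

// Runs only inside the mongo shell, where `db` is defined.
if (typeof db !== "undefined") {
  printjson(db.games.aggregate(pipeline).toArray());
}
```

Losses for each team follow as `games` minus `wins`. Placing $limit directly after $sort keeps only the top five groups.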
*