SlideShare a Scribd company logo
H I T C H H I K ER ’ S GUI D E T O A NO S Q L
D A T A C L O UD – T UT O R I A L H A ND O UT
                                    AWS SETUP
1. What you need :
     a. An AWS account. Signup is at Amazon EC2 page
     b. Create a local directory aws : ~/aws or c:aws.
         It is easier to have a sub directory to keep all aws stuff
     c. (optional) Have downloaded and installed aws tools. I install them in the aws sub
         directory
     d. Setup and store an SSH key. My key is at ~/aws/SSH-Key1.pem
     e. Create a security group (for example Test-1 as shown in Figure 1 – Open port 22,80
         and 8080. Remember this is for experimental purposes and not for production)




                                         FIG UR E 1

2. Redeem AWS credit. Amazon has given us credit voucher for $25. (Thanks to Jeff Barr and
   Jenny Kohr Chynoweth)
       a. Visit the AWS Credit Redemption Page and enter the promotion code above. You
           must be signed in to view this page.
       b. Verify that the credit has been applied to your account by viewing your Account
           Activity page.
       c. These credits will be good for $25 worth of AWS services from June 10, 2010-December
           31, 2010.
3. I have created OSCON-NOSQL AMI that has the stuff needed for this tutorial.
       a. Updates and references at http://doubleclix.wordpress.com/2010/07/13/oscon-
           nosql-training-ami/
       b. Reference Pack
                i. The NOSQL Slide set
               ii. The hand out (This document)
iii. Reference List
                 iv. The AMI
                  v. The NOSQL papers
   4. To launch the AMI
      AWS Management Console-EC2-Launch Instance
      Search for OSCON-NOSQL (Figure 2)




                                              FIG UR E 2

       [Select]

       [Instance Details] (small instance is fine)

       [Advanced Instance Options] (Default is fine)

       [Create Key pair] (Use existing if you already have one or create one and store in the ~/aws
directory

       [Configure Firewall] – choose Test-1 or create 1 with 80,22 and 8080 open

       [Review]-[Launch]-[Close]

        Instance should show up in the management console in a few minutes (Figure 3)




                                              FIG UR E 3

   5. Open up a console and ssh into the instance
      SSH-Key1.pem ubuntu@ec2-184-73-85-67.compute-1.amazonaws.com
      And type-in couple of commands (Figure 4)
FIG UR E 4

Note :

         If it says connection refused, reboot the instance. You can check the startup logs by
typing
      ec2-get-console-output --private-key <private key file
name> --cert <certificate file name> <instancd id>
      Also type-in commands like the following to get a feel for
the instance.
ps –A
vmstat
cat /proc/cpuinfo
uname –a
sudo fdisk –l
sudo lshw
df –h –T


Note : Don’t forget to turn off the instance at the end of the
hands-on session !
HANDS-ON DOCUMENT DATASTORES - MONGODB
Close Encounters of the first kind …

              Start mongo shell : type mongo




      FIG UR E 5

              The CLI is a full Javascript interpreter
              Type any javascript commands
               Function loopTest() {
               For (i=0;i<10;i++) {
               Print(i);
               }
               }
              See how the shell understands editing, multi-line and end of function
              loopTest will show the code
              loopTest() will run the function
              Other commands to try
                   o show dbs
                   o use <dbname>
                   o show collections
                   o db.help()
                   o db.collections.help()
                   o db.stats()
                   o db.version()

Warm up

              create a document & insert it
o person = {“name”,”aname”,”phone”:”555-1212”}
                  o db.persons.insert(person)
                  o db.persons.find()
             <discussion about id>
                  o db.persons.help()
                  o db.persons.count()
             Create 2 more records
             Embedded Nested documents
             address = {“street”:”123,some street”,”City”:”Portland”,”state”:”WA”,
              ”tags”:[“home”,”WA”])
             db.persons.update({“name”:”aname”},address) – will replace
             Need to use update modifier
                  o Db.persons.update({“name”:”aname”},{$set :{address}}) to make add
                      address
             Discussion:
                  o Embedded docs are very versatile
                  o For example the collider could add more data from another source with
                      nested docs and added metadata
                  o As mongo understands json structure, it can index inside embedded docs.
                      For example address.state to update keys in embedded docs
             db.persons.remove({})
             “$unset” to remove a key
             Upsert – interesting
                  o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}}) <- won’t add
                  o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}},{“upsert”:”true”)
                      <-will add
             Add a new field/update schema
                  o db.persons.update({},{“$set”:{“insertDate”:new Date()}})
                  o db.runCommand({getLastError:1})
                  o db.persons.update({},{“$set”:{“insertDate”:new
                      Date()}},{”upsert”:”false”},{“multi”:”true”})
                  o db.runCommand({getLastError:1})


Main Feature – Loc us

             Goal is to develop location based/geospatial capability :
              o       Allow users to 'check in' at places
              o       Based on the location, recommend near-by attractions
              o       Allow users to leave tips about places
              o       Do a few aggregate map-reduce operations like “How many people are
                      checked in at a place?” and “List who is there” …
             But for this demo, we will implement a small slice : Ability to checkin and show
              nearby soccer attractions using a Java program
             A quick DataModel
              Database : locus
              Collections: Places – state, city, stadium, latlong
people – name, when, latlong
   Hands on




                                   FIG UR E 6




           FIG UR E 7



       o use <dbname> (use locus)
       o Show collections
o db.stats()
o db.<collection>.find()
     o db.checkins.find()
     o db.places.find()




                 FIG UR E 8




                 FIG UR E 9
FIG UR E 10

   db.places.find({latlong: {$near : [40,70]})

   Mapreduce
    Mapreduce is best for batch type operations, especially aggregations - while the
    standard queries are well suited for lookup operations/matching operations on
    documents. In our example, we could run mapreduce operation every few hours
    to calculate which locations are 'trending' i.e. have a many check ins in the last
    few hours..

    First we will add one more element place and capture the place where the
    person checksin. Then we could use the following simple map-reduce operation
    which operates on the checkin collection which aggregates counts of all
    historical checkins per location:

    m = function() {
    emit(this.place, 1);}

    r = function(key, values) {
    total = 0;
    for (var i in values) total += values[i];
    return total;
    }

    res = db.checkins.mapReduce(m,r)
    db[res.result].find()

    You could run this (this particular code should run in the javascript shell) and it
    would create a temporary collection with totals for checkins for each location,
    for all historical checkins.

    If you wanted to just run this on a subset of recent data, you could just run the
mapreduce, but filter on checkins over the last 3 hours e.g.
                  res = db.checkins.mapReduce(m,r, {when: {$gt:nowminus3hours}}), where
                  nowminus3hours is a variable containing a datetime corresponding to the
                  current time minus 3 hours

                  This would give you a collection with documents for each location that was
                  checked into for the last 3 hours with the totals number of checkins for each
                  location contained in the document.

                  We could use this to develop a faceted search – popular places - 3 hrs, 1 day, 1
                  week, 1 month and max.

                  (Thanks to Nosh Petigara/10Gen for the scenario and the code)

The wind-up

      *** First terminate the instance ! ***

      Discussions

The Afterhours (NOSQL + Beer)

      Homework – add social graph based inference mechanisms.
      a) MyNearby Friends
         When you check-in show if your friends are here (either add friends to this site or based
         on their facebook page using the OpenGraph API)
      b) We were here
         Show if their friends were here or nearby places the friends visited
      c) Show pictures taken by friends when they were here

More Related Content

What's hot

mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecture
zefhemel
 
The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212
Mahmoud Samir Fayed
 
java experiments and programs
java experiments and programsjava experiments and programs
java experiments and programs
Karuppaiyaa123
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of Things
Alexander Rubin
 
Backbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The BrowserBackbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The Browser
Howard Lewis Ship
 
Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17
Michele Orselli
 
Drupal 8 database api
Drupal 8 database apiDrupal 8 database api
Drupal 8 database api
Viswanath Polaki
 
Writing Clean Code in Swift
Writing Clean Code in SwiftWriting Clean Code in Swift
Writing Clean Code in Swift
Derek Lee Boire
 
Node js mongodriver
Node js mongodriverNode js mongodriver
Node js mongodriver
christkv
 
Pdxpugday2010 pg90
Pdxpugday2010 pg90Pdxpugday2010 pg90
Pdxpugday2010 pg90
Selena Deckelmann
 
Cnam azure 2014 mobile services
Cnam azure 2014   mobile servicesCnam azure 2014   mobile services
Cnam azure 2014 mobile services
Aymeric Weinbach
 
The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185
Mahmoud Samir Fayed
 
Mozilla とブラウザゲーム
Mozilla とブラウザゲームMozilla とブラウザゲーム
Mozilla とブラウザゲーム
Noritada Shimizu
 
Using Scala Slick at FortyTwo
Using Scala Slick at FortyTwoUsing Scala Slick at FortyTwo
Using Scala Slick at FortyTwo
Eishay Smith
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
Mike Dirolf
 
Test driven game development silly, stupid or inspired?
Test driven game development   silly, stupid or inspired?Test driven game development   silly, stupid or inspired?
Test driven game development silly, stupid or inspired?
Eric Smith
 
Test Driven Cocos2d
Test Driven Cocos2dTest Driven Cocos2d
Test Driven Cocos2d
Eric Smith
 
RxSwift 시작하기
RxSwift 시작하기RxSwift 시작하기
RxSwift 시작하기
Suyeol Jeon
 
Data Types and Processing in ES6
Data Types and Processing in ES6Data Types and Processing in ES6
Data Types and Processing in ES6
m0bz
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for Newbies
Dave Stokes
 

What's hot (20)

mobl - model-driven engineering lecture
mobl - model-driven engineering lecturemobl - model-driven engineering lecture
mobl - model-driven engineering lecture
 
The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212The Ring programming language version 1.10 book - Part 56 of 212
The Ring programming language version 1.10 book - Part 56 of 212
 
java experiments and programs
java experiments and programsjava experiments and programs
java experiments and programs
 
MySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of ThingsMySQL flexible schema and JSON for Internet of Things
MySQL flexible schema and JSON for Internet of Things
 
Backbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The BrowserBackbone.js: Run your Application Inside The Browser
Backbone.js: Run your Application Inside The Browser
 
Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17Hopping in clouds - phpuk 17
Hopping in clouds - phpuk 17
 
Drupal 8 database api
Drupal 8 database apiDrupal 8 database api
Drupal 8 database api
 
Writing Clean Code in Swift
Writing Clean Code in SwiftWriting Clean Code in Swift
Writing Clean Code in Swift
 
Node js mongodriver
Node js mongodriverNode js mongodriver
Node js mongodriver
 
Pdxpugday2010 pg90
Pdxpugday2010 pg90Pdxpugday2010 pg90
Pdxpugday2010 pg90
 
Cnam azure 2014 mobile services
Cnam azure 2014   mobile servicesCnam azure 2014   mobile services
Cnam azure 2014 mobile services
 
The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185The Ring programming language version 1.5.4 book - Part 40 of 185
The Ring programming language version 1.5.4 book - Part 40 of 185
 
Mozilla とブラウザゲーム
Mozilla とブラウザゲームMozilla とブラウザゲーム
Mozilla とブラウザゲーム
 
Using Scala Slick at FortyTwo
Using Scala Slick at FortyTwoUsing Scala Slick at FortyTwo
Using Scala Slick at FortyTwo
 
Inside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source DatabaseInside MongoDB: the Internals of an Open-Source Database
Inside MongoDB: the Internals of an Open-Source Database
 
Test driven game development silly, stupid or inspired?
Test driven game development   silly, stupid or inspired?Test driven game development   silly, stupid or inspired?
Test driven game development silly, stupid or inspired?
 
Test Driven Cocos2d
Test Driven Cocos2dTest Driven Cocos2d
Test Driven Cocos2d
 
RxSwift 시작하기
RxSwift 시작하기RxSwift 시작하기
RxSwift 시작하기
 
Data Types and Processing in ES6
Data Types and Processing in ES6Data Types and Processing in ES6
Data Types and Processing in ES6
 
All Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for NewbiesAll Things Open 2016 -- Database Programming for Newbies
All Things Open 2016 -- Database Programming for Newbies
 

Viewers also liked

Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Rouzeh Eghtessadi
 
กราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นกราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นMonsicha Saesiao
 
เทคนิคPhotoshop
เทคนิคPhotoshopเทคนิคPhotoshop
เทคนิคPhotoshopMonsicha Saesiao
 
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from AppceleratorWinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire Technologies Inc
 
Adviesinzakeadolescentenstrafrecht
AdviesinzakeadolescentenstrafrechtAdviesinzakeadolescentenstrafrecht
Adviesinzakeadolescentenstrafrechtjohanperridon
 
Proposed New US fashion design law 2010
Proposed New US fashion design law 2010Proposed New US fashion design law 2010
Proposed New US fashion design law 2010
Darrell Mottley
 
Brochure invest eng
Brochure invest engBrochure invest eng
Brochure invest eng
MRC et CLD de Brome-Missisquoi
 
网络团购调查研究报告
网络团购调查研究报告网络团购调查研究报告
网络团购调查研究报告Enlight Chen
 
Erich von daniken recuerdos del futuro
Erich von daniken   recuerdos del futuroErich von daniken   recuerdos del futuro
Erich von daniken recuerdos del futurofrostiss
 
Bollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch purshBollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch pursh
ChrisPerez
 
Rode Duivels - Red Devils Belgium
Rode Duivels - Red Devils BelgiumRode Duivels - Red Devils Belgium
Rode Duivels - Red Devils Belgium
Nils Demanet
 
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
WinWire Technologies Inc
 
ABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordreABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordre
ABC Softwork
 
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin  celebrated in santarcangelo by cassoli alberto 3 dFeast of saint martin  celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
barbelkarlsruhe
 
Ace of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program WalkthroughAce of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program Walkthrough
Andy Horner
 
나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)
ONE store corp. (원스토어 주식회사)
 
CSA 2010: Carrier Considerations
CSA 2010: Carrier ConsiderationsCSA 2010: Carrier Considerations
CSA 2010: Carrier Considerations
Multi Service
 

Viewers also liked (20)

Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
Mainstreaming HIV into Water, Sanitation and Hygiene (WASH)
 
กราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่นกราฟกับงานพรีเซนเตชั่น
กราฟกับงานพรีเซนเตชั่น
 
เทคนิคPhotoshop
เทคนิคPhotoshopเทคนิคPhotoshop
เทคนิคPhotoshop
 
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from AppceleratorWinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
WinWire slides for Mobile Enterprise SaaS Platform Launch from Appcelerator
 
Adviesinzakeadolescentenstrafrecht
AdviesinzakeadolescentenstrafrechtAdviesinzakeadolescentenstrafrecht
Adviesinzakeadolescentenstrafrecht
 
Proposed New US fashion design law 2010
Proposed New US fashion design law 2010Proposed New US fashion design law 2010
Proposed New US fashion design law 2010
 
Brochure invest eng
Brochure invest engBrochure invest eng
Brochure invest eng
 
网络团购调查研究报告
网络团购调查研究报告网络团购调查研究报告
网络团购调查研究报告
 
Erich von daniken recuerdos del futuro
Erich von daniken   recuerdos del futuroErich von daniken   recuerdos del futuro
Erich von daniken recuerdos del futuro
 
Bollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch purshBollywood purple sarees online collection with maching clutch pursh
Bollywood purple sarees online collection with maching clutch pursh
 
Rode Duivels - Red Devils Belgium
Rode Duivels - Red Devils BelgiumRode Duivels - Red Devils Belgium
Rode Duivels - Red Devils Belgium
 
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
Wired2Win webinar Building Enterprise Social Engine Leveraging SharePoint 2013
 
ABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordreABC Breakfast Club m. Abena: 40% færre restordre
ABC Breakfast Club m. Abena: 40% færre restordre
 
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin  celebrated in santarcangelo by cassoli alberto 3 dFeast of saint martin  celebrated in santarcangelo by cassoli alberto 3 d
Feast of saint martin celebrated in santarcangelo by cassoli alberto 3 d
 
Android
AndroidAndroid
Android
 
Resumenek
ResumenekResumenek
Resumenek
 
Ace of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program WalkthroughAce of Sales Affiliate Program Walkthrough
Ace of Sales Affiliate Program Walkthrough
 
나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)나의 Tstore 공모전 체험수기 (김재철)
나의 Tstore 공모전 체험수기 (김재철)
 
Open
OpenOpen
Open
 
CSA 2010: Carrier Considerations
CSA 2010: Carrier ConsiderationsCSA 2010: Carrier Considerations
CSA 2010: Carrier Considerations
 

Similar to Nosql hands on handout 04

Refactoring to Macros with Clojure
Refactoring to Macros with ClojureRefactoring to Macros with Clojure
Refactoring to Macros with Clojure
Dmitry Buzdin
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New Tricks
MongoDB
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on Android
Sven Haiges
 
An intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECSAn intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECS
Yevgeniy Brikman
 
Angular Schematics
Angular SchematicsAngular Schematics
Angular Schematics
Christoffer Noring
 
AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce
Amazon Web Services
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
Peter Friese
 
Latinoware
LatinowareLatinoware
Latinoware
kchodorow
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
Caserta
 
Introduction tomongodb
Introduction tomongodbIntroduction tomongodb
Introduction tomongodb
Lee Theobald
 
Play á la Rails
Play á la RailsPlay á la Rails
Play á la Rails
Sebastian Nozzi
 
Redis the better NoSQL
Redis the better NoSQLRedis the better NoSQL
Redis the better NoSQL
OpenFest team
 
C++ tutorial
C++ tutorialC++ tutorial
The Art Of Readable Code
The Art Of Readable CodeThe Art Of Readable Code
The Art Of Readable Code
Baidu, Inc.
 
C++totural file
C++totural fileC++totural file
C++totural file
halaisumit
 
Writing Maintainable JavaScript
Writing Maintainable JavaScriptWriting Maintainable JavaScript
Writing Maintainable JavaScript
Andrew Dupont
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot Camp
Troy Miles
 
The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212
Mahmoud Samir Fayed
 
Fullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-endFullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-end
Ezequiel Maraschio
 

Similar to Nosql hands on handout 04 (20)

Refactoring to Macros with Clojure
Refactoring to Macros with ClojureRefactoring to Macros with Clojure
Refactoring to Macros with Clojure
 
Fun Teaching MongoDB New Tricks
Fun Teaching MongoDB New TricksFun Teaching MongoDB New Tricks
Fun Teaching MongoDB New Tricks
 
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your DataMongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
MongoDB .local Toronto 2019: Using Change Streams to Keep Up with Your Data
 
CouchDB on Android
CouchDB on AndroidCouchDB on Android
CouchDB on Android
 
An intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECSAn intro to Docker, Terraform, and Amazon ECS
An intro to Docker, Terraform, and Amazon ECS
 
Angular Schematics
Angular SchematicsAngular Schematics
Angular Schematics
 
AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce AWS Office Hours: Amazon Elastic MapReduce
AWS Office Hours: Amazon Elastic MapReduce
 
CouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 HourCouchDB Mobile - From Couch to 5K in 1 Hour
CouchDB Mobile - From Couch to 5K in 1 Hour
 
Latinoware
LatinowareLatinoware
Latinoware
 
MongoDB Aggregation Framework
MongoDB Aggregation FrameworkMongoDB Aggregation Framework
MongoDB Aggregation Framework
 
Introduction tomongodb
Introduction tomongodbIntroduction tomongodb
Introduction tomongodb
 
Play á la Rails
Play á la RailsPlay á la Rails
Play á la Rails
 
Redis the better NoSQL
Redis the better NoSQLRedis the better NoSQL
Redis the better NoSQL
 
C++ tutorial
C++ tutorialC++ tutorial
C++ tutorial
 
The Art Of Readable Code
The Art Of Readable CodeThe Art Of Readable Code
The Art Of Readable Code
 
C++totural file
C++totural fileC++totural file
C++totural file
 
Writing Maintainable JavaScript
Writing Maintainable JavaScriptWriting Maintainable JavaScript
Writing Maintainable JavaScript
 
Node Boot Camp
Node Boot CampNode Boot Camp
Node Boot Camp
 
The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212The Ring programming language version 1.10 book - Part 22 of 212
The Ring programming language version 1.10 book - Part 22 of 212
 
Fullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-endFullstack conf 2017 - Basic dev pipeline end-to-end
Fullstack conf 2017 - Basic dev pipeline end-to-end
 

More from Krishna Sankar

Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
Krishna Sankar
 
An excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphXAn excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphX
Krishna Sankar
 
An excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache SparkAn excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache Spark
Krishna Sankar
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Krishna Sankar
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
Krishna Sankar
 
Architecture in action 01
Architecture in action 01Architecture in action 01
Architecture in action 01
Krishna Sankar
 
Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)
Krishna Sankar
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
Krishna Sankar
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
Krishna Sankar
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
Krishna Sankar
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk Knowledge
Krishna Sankar
 
Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science Competitions
Krishna Sankar
 
Bayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive BayesBayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive Bayes
Krishna Sankar
 
AWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOpsAWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOps
Krishna Sankar
 
The Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & PythonThe Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & Python
Krishna Sankar
 
Big Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 PragmaticsBig Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 Pragmatics
Krishna Sankar
 
Scrum debrief to team
Scrum debrief to team Scrum debrief to team
Scrum debrief to team
Krishna Sankar
 
The Art of Big Data
The Art of Big DataThe Art of Big Data
The Art of Big Data
Krishna Sankar
 
Precision Time Synchronization
Precision Time SynchronizationPrecision Time Synchronization
Precision Time Synchronization
Krishna Sankar
 
The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to Kaggle
Krishna Sankar
 

More from Krishna Sankar (20)

Pandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data SciencePandas, Data Wrangling & Data Science
Pandas, Data Wrangling & Data Science
 
An excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphXAn excursion into Graph Analytics with Apache Spark GraphX
An excursion into Graph Analytics with Apache Spark GraphX
 
An excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache SparkAn excursion into Text Analytics with Apache Spark
An excursion into Text Analytics with Apache Spark
 
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & AntidotesBig Data Analytics - Best of the Worst : Anti-patterns & Antidotes
Big Data Analytics - Best of the Worst : Anti-patterns & Antidotes
 
Data Science with Spark
Data Science with SparkData Science with Spark
Data Science with Spark
 
Architecture in action 01
Architecture in action 01Architecture in action 01
Architecture in action 01
 
Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)Data Science with Spark - Training at SparkSummit (East)
Data Science with Spark - Training at SparkSummit (East)
 
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
R, Data Wrangling & Predicting NFL with Elo like Nate SIlver & 538
 
R, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science CompetitionsR, Data Wrangling & Kaggle Data Science Competitions
R, Data Wrangling & Kaggle Data Science Competitions
 
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache SparkThe Hitchhiker's Guide to Machine Learning with Python & Apache Spark
The Hitchhiker's Guide to Machine Learning with Python & Apache Spark
 
Data Science Folk Knowledge
Data Science Folk KnowledgeData Science Folk Knowledge
Data Science Folk Knowledge
 
Data Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science CompetitionsData Wrangling For Kaggle Data Science Competitions
Data Wrangling For Kaggle Data Science Competitions
 
Bayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive BayesBayesian Machine Learning - Naive Bayes
Bayesian Machine Learning - Naive Bayes
 
AWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOpsAWS VPC distilled for MongoDB devOps
AWS VPC distilled for MongoDB devOps
 
The Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & PythonThe Art of Social Media Analysis with Twitter & Python
The Art of Social Media Analysis with Twitter & Python
 
Big Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 PragmaticsBig Data Engineering - Top 10 Pragmatics
Big Data Engineering - Top 10 Pragmatics
 
Scrum debrief to team
Scrum debrief to team Scrum debrief to team
Scrum debrief to team
 
The Art of Big Data
The Art of Big DataThe Art of Big Data
The Art of Big Data
 
Precision Time Synchronization
Precision Time SynchronizationPrecision Time Synchronization
Precision Time Synchronization
 
The Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to KaggleThe Hitchhiker’s Guide to Kaggle
The Hitchhiker’s Guide to Kaggle
 

Nosql hands on handout 04

  • 1. H I T C H H I K ER ’ S GUI D E T O A NO S Q L D A T A C L O UD – T UT O R I A L H A ND O UT AWS SETUP 1. What you need : a. An AWS account. Signup is at Amazon EC2 page b. Create a local directory aws : ~/aws or c:aws. It is easier to have a sub directory to keep all aws stuff c. (optional) Have downloaded and installed aws tools. I install them in the aws sub directory d. Setup and store an SSH key. My key is at ~/aws/SSH-Key1.pem e. Create a security group (for example Test-1 as shown in Figure 1 – Open port 22,80 and 8080. Remember this is for experimental purposes and not for production) FIG UR E 1 2. Redeem AWS credit. Amazon has given us credit voucher for $25. (Thanks to Jeff Barr and Jenny Kohr Chynoweth) a. Visit the AWS Credit Redemption Page and enter the promotion code above. You must be signed in to view this page. b. Verify that the credit has been applied to your account by viewing your Account Activity page. c. These credits will be good for $25 worth of AWS services from June 10, 2010-December 31, 2010. 3. I have created OSCON-NOSQL AMI that has the stuff needed for this tutorial. a. Updates and references at http://doubleclix.wordpress.com/2010/07/13/oscon- nosql-training-ami/ b. Reference Pack i. The NOSQL Slide set ii. The hand out (This document)
  • 2. iii. Reference List iv. The AMI v. The NOSQL papers 4. To launch the AMI AWS Management Console-EC2-Launch Instance Search for OSCON-NOSQL (Figure 2) FIG UR E 2 [Select] [Instance Details] (small instance is fine) [Advanced Instance Options] (Default is fine) [Create Key pair] (Use existing if you already have one or create one and store in the ~/aws directory [Configure Firewall] – choose Test-1 or create 1 with 80,22 and 8080 open [Review]-[Launch]-[Close] Instance should show up in the management console in a few minutes (Figure 3) FIG UR E 3 5. Open up a console and ssh into the instance SSH-Key1.pem ubuntu@ec2-184-73-85-67.compute-1.amazonaws.com And type-in couple of commands (Figure 4)
  • 3. FIG UR E 4 Note : If it says connection refused, reboot the instance. You can check the startup logs by typing ec2-get-console-output --private-key <private key file name> --cert <certificate file name> <instancd id> Also type-in commands like the following to get a feel for the instance. ps –A vmstat cat /proc/cpuinfo uname –a sudo fdisk –l sudo lshw df –h –T Note : Don’t forget to turn off the instance at the end of the hands-on session !
  • 4. HANDS-ON DOCUMENT DATASTORES - MONGODB Close Encounters of the first kind …  Start mongo shell : type mongo FIG UR E 5  The CLI is a full Javascript interpreter  Type any javascript commands Function loopTest() { For (i=0;i<10;i++) { Print(i); } }  See how the shell understands editing, multi-line and end of function  loopTest will show the code  loopTest() will run the function  Other commands to try o show dbs o use <dbname> o show collections o db.help() o db.collections.help() o db.stats() o db.version() Warm up  create a document & insert it
  • 5. o person = {“name”,”aname”,”phone”:”555-1212”} o db.persons.insert(person) o db.persons.find()  <discussion about id> o db.persons.help() o db.persons.count()  Create 2 more records  Embedded Nested documents  address = {“street”:”123,some street”,”City”:”Portland”,”state”:”WA”, ”tags”:[“home”,”WA”])  db.persons.update({“name”:”aname”},address) – will replace  Need to use update modifier o Db.persons.update({“name”:”aname”},{$set :{address}}) to make add address  Discussion: o Embedded docs are very versatile o For example the collider could add more data from another source with nested docs and added metadata o As mongo understands json structure, it can index inside embedded docs. For example address.state to update keys in embedded docs  db.persons.remove({})  “$unset” to remove a key  Upsert – interesting o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}}) <- won’t add o db.persons.update({“name”:”xyz”},{“$inc”:{“checkins”:1}},{“upsert”:”true”) <-will add  Add a new field/update schema o db.persons.update({},{“$set”:{“insertDate”:new Date()}}) o db.runCommand({getLastError:1}) o db.persons.update({},{“$set”:{“insertDate”:new Date()}},{”upsert”:”false”},{“multi”:”true”}) o db.runCommand({getLastError:1}) Main Feature – Loc us  Goal is to develop location based/geospatial capability : o Allow users to 'check in' at places o Based on the location, recommend near-by attractions o Allow users to leave tips about places o Do a few aggregate map-reduce operations like “How many people are checked in at a place?” and “List who is there” …  But for this demo, we will implement a small slice : Ability to checkin and show nearby soccer attractions using a Java program  A quick DataModel Database : locus Collections: Places – state, city, stadium, latlong
  • 6. people – name, when, latlong  Hands on FIG UR E 6 FIG UR E 7 o use <dbname> (use locus) o Show collections
  • 7. o db.stats() o db.<collection>.find() o db.checkins.find() o db.places.find() FIG UR E 8 FIG UR E 9
  • 8. FIG UR E 10  db.places.find({latlong: {$near : [40,70]})  Mapreduce Mapreduce is best for batch type operations, especially aggregations - while the standard queries are well suited for lookup operations/matching operations on documents. In our example, we could run mapreduce operation every few hours to calculate which locations are 'trending' i.e. have a many check ins in the last few hours.. First we will add one more element place and capture the place where the person checksin. Then we could use the following simple map-reduce operation which operates on the checkin collection which aggregates counts of all historical checkins per location: m = function() { emit(this.place, 1);} r = function(key, values) { total = 0; for (var i in values) total += values[i]; return total; } res = db.checkins.mapReduce(m,r) db[res.result].find() You could run this (this particular code should run in the javascript shell) and it would create a temporary collection with totals for checkins for each location, for all historical checkins. If you wanted to just run this on a subset of recent data, you could just run the
  • 9. mapreduce, but filter on checkins over the last 3 hours e.g. res = db.checkins.mapReduce(m,r, {when: {$gt:nowminus3hours}}), where nowminus3hours is a variable containing a datetime corresponding to the current time minus 3 hours This would give you a collection with documents for each location that was checked into for the last 3 hours with the totals number of checkins for each location contained in the document. We could use this to develop a faceted search – popular places - 3 hrs, 1 day, 1 week, 1 month and max. (Thanks to Nosh Petigara/10Gen for the scenario and the code) The wind-up *** First terminate the instance ! *** Discussions The Afterhours (NOSQL + Beer) Homework – add social graph based inference mechanisms. a) MyNearby Friends When you check-in show if your friends are here (either add friends to this site or based on their facebook page using the OpenGraph API) b) We were here Show if their friends were here or nearby places the friends visited c) Show pictures taken by friends when they were here