MongoDB Aggregation MongoSF May 2011

Chris Westin's talk from MongoSF (May 2011) on MongoDB's coming aggregation framework.

    Presentation Transcript

    • MongoDB’s New Aggregation Features
      Chris Westin
      © Copyright 2010 10gen Inc.
    • What problem are we solving?
      Map/Reduce can be used for aggregation…
      Currently being used for totaling, averaging, etc.
      Map/Reduce is a big hammer
      Simpler tasks should be easier
      Shouldn’t need to write JavaScript
      Avoid the overhead of JavaScript engine
      We’re seeing requests for help in handling complex documents
      Select only subdocuments or arrays
    • How will we solve the problem?
      Our new aggregation framework
      Declarative framework
      No JavaScript required
      Describe a chain of operations to apply
      Expression evaluation
      Return computed values
      Framework: we can add new operations easily
      C++ implementation
      Higher performance than JavaScript
    • Aggregation - Pipelines
      Aggregation requests specify a pipeline
      A pipeline is a series of operations
      Conceptually, the members of a collection are passed through a pipeline to produce a result
      Similar to a command-line pipe
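
The pipe analogy above can be sketched in plain Python. This is only an illustration of the idea, not the framework's implementation: the stage functions `only_published` and `titles_only` and the sample documents are hypothetical, and each stage is assumed to take a stream of documents and return a new stream.

```python
from functools import reduce

def run_pipeline(documents, stages):
    """Pass documents through each stage in order, like a shell pipe."""
    return list(reduce(lambda stream, stage: stage(stream), stages, iter(documents)))

# Hypothetical stages: each takes an iterable of documents and returns one.
def only_published(stream):
    return (doc for doc in stream if doc.get("published"))

def titles_only(stream):
    return ({"title": doc["title"]} for doc in stream)

articles = [
    {"title": "A", "published": True},
    {"title": "B", "published": False},
    {"title": "C", "published": True},
]

result = run_pipeline(articles, [only_published, titles_only])
print(result)  # [{'title': 'A'}, {'title': 'C'}]
```

The output of one stage is the input of the next, so reordering the list of stages reorders the work.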
    • Pipeline Operations
      $match: uses a query predicate (like .find({…})) as a filter
      $project: uses a sample document to determine the shape of the result (similar to .find()’s optional argument)
      This can include computed values
      $group: aggregates items into buckets defined by a key
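
Assuming the three operations above are the $match, $project, and $group stages named on the later slides, a declarative pipeline could look like the list of stage documents below. The small interpreter is a toy written for this sketch (equality-only filters, field inclusion, and `$sum` only); the collection contents and field names are made up.

```python
from collections import defaultdict

def match(docs, predicate):
    # Equality-only filter; real query predicates are far richer.
    return [d for d in docs if all(d.get(k) == v for k, v in predicate.items())]

def project(docs, shape):
    # Keep only the fields whose value in the shape document is truthy.
    return [{k: d[k] for k in shape if shape[k] and k in d} for d in docs]

def group(docs, spec):
    # spec["_id"] names the key field ("$author"); other entries are {"$sum": "$field"}.
    key_field = spec["_id"].lstrip("$")
    buckets = defaultdict(lambda: defaultdict(int))
    for d in docs:
        bucket = buckets[d[key_field]]
        for out_name, agg in spec.items():
            if out_name != "_id":
                bucket[out_name] += d[agg["$sum"].lstrip("$")]
    return [{"_id": k, **dict(v)} for k, v in buckets.items()]

STAGES = {"$match": match, "$project": project, "$group": group}

def aggregate(docs, pipeline):
    # Each stage document has exactly one key: the operation name.
    for stage in pipeline:
        (name, arg), = stage.items()
        docs = STAGES[name](docs, arg)
    return docs

articles = [
    {"author": "bob", "status": "live", "views": 10},
    {"author": "bob", "status": "live", "views": 5},
    {"author": "ann", "status": "draft", "views": 7},
    {"author": "ann", "status": "live", "views": 3},
]
pipeline = [
    {"$match": {"status": "live"}},
    {"$project": {"author": 1, "views": 1}},
    {"$group": {"_id": "$author", "total_views": {"$sum": "$views"}}},
]
result = aggregate(articles, pipeline)
print(result)  # [{'_id': 'bob', 'total_views': 15}, {'_id': 'ann', 'total_views': 3}]
```

The pipeline itself is plain data, which is what lets a request be described without writing any JavaScript.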
    • Computed Expressions
      Available in $project operations
      Prefix expression language
      Add two fields: { $add: ["$field1", "$field2"] }
      Provide a value for a missing field: { $ifnull: ["$field1", "$field2"] }
      Nesting: { $add: ["$field1", { $ifnull: ["$field2", "$field3"] }] }
      Other functions…
      And we can easily add more as required
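
A prefix expression language like the one above is straightforward to evaluate recursively. The following is a minimal sketch, not the C++ implementation: `evaluate` and the `OPERATORS` table are names invented here, and only the two operators from the slide are covered. The spelling `$ifnull` follows the slide; released MongoDB versions spell it `$ifNull`.

```python
def evaluate(expr, doc):
    """Evaluate a prefix expression against one document."""
    if isinstance(expr, str) and expr.startswith("$"):
        return doc.get(expr[1:])  # field reference; None if the field is missing
    if isinstance(expr, dict):
        (op, args), = expr.items()  # one operator per expression document
        values = [evaluate(arg, doc) for arg in args]
        return OPERATORS[op](values)
    return expr  # anything else is treated as a literal

# Adding an operator means adding one entry to this table.
OPERATORS = {
    "$add": lambda vals: sum(vals),
    "$ifnull": lambda vals: vals[0] if vals[0] is not None else vals[1],
}

doc = {"field1": 2, "field3": 5}  # field2 is deliberately missing
expr = {"$add": ["$field1", {"$ifnull": ["$field2", "$field3"]}]}
total = evaluate(expr, doc)
print(total)  # 7
```

Because every operator is just a table entry, the "easily add more" point on the slide falls out of the design.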
    • Projections
      $project can reshape results
      $unwind expression doles out array values one at a time
      Pull fields from nested documents to the top
      Push fields from the top down into new virtual documents
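
Two of the reshaping steps above can be imitated in a few lines of Python. This is a sketch under assumptions: `unwind` and `pull_up` are hypothetical helper names, the sample documents are made up, and documents with an empty or missing array are assumed to produce no output.

```python
def unwind(docs, field):
    """Emit one copy of each document per element of its array field."""
    for doc in docs:
        for value in doc.get(field, []):
            yield {**doc, field: value}

def pull_up(doc, path, new_name):
    """Copy a nested field (dotted path) to a new top-level field."""
    value = doc
    for part in path.split("."):
        value = value[part]
    return {**doc, new_name: value}

posts = [{"title": "A", "tags": ["db", "nosql"]}, {"title": "B", "tags": []}]
unwound = list(unwind(posts, "tags"))
print(unwound)
# [{'title': 'A', 'tags': 'db'}, {'title': 'A', 'tags': 'nosql'}]

flat = pull_up({"title": "A", "author": {"name": "bob"}}, "author.name", "author_name")
print(flat["author_name"])  # bob
```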
    • Grouping
      $group aggregation expressions
      Total of column values: $sum
      Average of column values: $avg
      Collect column values in an array: $push
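
The three aggregation expressions above can be pictured as follows. This is a plain-Python sketch of what each one computes per bucket, not the framework's code; `group_by`, the output field names, and the sample data are assumptions made for the example.

```python
from collections import defaultdict

def group_by(docs, key, field):
    """Bucket documents by key, then compute a sum, an average, and a pushed array."""
    buckets = defaultdict(list)
    for doc in docs:
        buckets[doc[key]].append(doc[field])
    return [
        {
            "_id": k,
            "total": sum(v),             # what $sum would compute
            "average": sum(v) / len(v),  # what $avg would compute
            "values": v,                 # what $push would collect
        }
        for k, v in buckets.items()
    ]

views = [
    {"author": "bob", "views": 10},
    {"author": "bob", "views": 5},
    {"author": "ann", "views": 3},
]
grouped = group_by(views, "author", "views")
print(grouped[0])  # {'_id': 'bob', 'total': 15, 'average': 7.5, 'values': [10, 5]}
```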
    • Demo
      (See script at https://gist.github.com/993733)
    • Usage Tips
      Use $match in a pipeline as early as possible
      The query optimizer can then be used to choose an index and avoid scanning the entire collection
    • Driver Support
      Initial version is a command
      For any language, build a JSON database object, and execute the command
      { aggregate : <collection>, pipeline : […] }
      Beware of command result size limit
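
Building the command document needs nothing driver-specific. The sketch below assumes the pipeline is passed as an ordered list of stage documents; the collection name `article` and the field names are hypothetical. Sending the document is left out: with a driver such as PyMongo it would presumably go through the generic command call (`Database.command`), but that step depends on the driver and server version.

```python
import json

def build_aggregate_command(collection, pipeline):
    """Return the command document described on the slide."""
    # The command name goes first; Python dicts preserve insertion order.
    return {"aggregate": collection, "pipeline": pipeline}

command = build_aggregate_command(
    "article",  # hypothetical collection name
    [
        {"$match": {"author": "dave"}},
        {"$group": {"_id": "$author", "views": {"$sum": "$pageViews"}}},
    ],
)
print(json.dumps(command, indent=2))
```

Since the whole result comes back in a single command reply, it helps to keep the final stage's output small, for example by grouping or projecting away fields that are not needed.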
    • When is this being released?
      In final development now
      Expect to see this in the near future
    • Sharding support
      Initial release will support sharding
      mongos analyzes the pipeline and forwards the operations up to $group to the shards; it then combines the shard servers’ results and continues
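
The split described above can be illustrated with a toy scatter-gather for a per-key sum. This is a sketch of the general idea only, not how mongos is implemented: `shard_partial_sum` and `merge_partials` are invented names, and the two lists stand in for data held on two shards.

```python
from collections import Counter

def shard_partial_sum(docs, key, field):
    """Runs on each shard: group its local documents and sum one field."""
    partial = Counter()
    for doc in docs:
        partial[doc[key]] += doc[field]
    return partial

def merge_partials(partials):
    """Runs on the router: combine the per-shard results into final totals."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values for matching keys
    return dict(merged)

shard_a = [{"author": "bob", "views": 10}, {"author": "ann", "views": 3}]
shard_b = [{"author": "bob", "views": 5}]

partials = [shard_partial_sum(s, "author", "views") for s in (shard_a, shard_b)]
totals = merge_partials(partials)
print(totals)  # {'bob': 15, 'ann': 3}
```

Sums merge by simple addition. An average cannot be merged from per-shard averages alone, so a sharded version would need each shard to report both a sum and a count.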
    • Pipeline Operations – Future Plans
      $sort: sorts the document stream according to a key
      $out: saves the document stream to a collection
      Similar to M/R $out, but with sharded output
    • Expressions – Future Plans
      Date field extraction
      Get year, month, day, hour, etc., from Date
      Date arithmetic