Deployment Architecture (revised)
Scalability is good
Single-node performance is good, too
[Diagram: Load Balancer / Proxy; Gobble Server; Develop; Master DB Server (MongoDB Master); four Apache mod_wsgi / TG 2.0 web servers]
SF.net Downloads
Allow non-sf.net projects to use SourceForge mirror network
Stats calculated in Hadoop and stored/served from MongoDB
Same deployment architecture as Consume (4 web, 1 db)
Allura (SF.net “beta” devtools)
Rewrite developer tools with new architecture
Wiki, Tracker, Discussions, Git, Hg, SVN, with more to come
Single MongoDB replica set manually sharded by project
Release early & often
What We Liked
Performance, performance, performance – Easily handle 90% of SF.net traffic from 1 DB server, 4 web servers
Schemaless server allows fast schema evolution in development, making many migrations unnecessary
Replication is easy, making scalability and backups easy
Keep a “backup slave” running
Kill backup slave, copy off database, bring back up the slave
Automatic re-sync with master
Query Language
You mean I can have performance without map-reduce?
GridFS
Pitfalls
Too-large documents
Store less per document
Return only a few fields
Ignoring indexing
Watch your server log; bad queries show up there
Ignoring your data’s schema
Using many databases when one will do
Using too many queries
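Two of the pitfalls above (too-large documents and too many queries) come down to how queries are issued. The following is a minimal sketch, not code from SF.net. It assumes `collection` and `users` are pymongo-style collections exposing `find(filter, projection)`; the field names `project_id`, `title`, and `author_id` are hypothetical.

```python
def page_titles(collection, project_id):
    """Return only _id and title for a project's pages.

    The projection keeps large fields (such as page text) off the wire.
    """
    cursor = collection.find({"project_id": project_id}, {"title": 1})
    return list(cursor)


def authors_for_pages(users, pages):
    """Load every author with one $in query instead of one query per page."""
    author_ids = list({page["author_id"] for page in pages})
    cursor = users.find({"_id": {"$in": author_ids}})
    return {user["_id"]: user for user in cursor}
```

An index on the filtered field (here the assumed `project_id`) would address the indexing pitfall; with pymongo this is typically done through `create_index`.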
Ming – an “Object-Document Mapper?”
Your data has a schema
Your database can define and enforce it
It can live in your application (as with MongoDB)
Nice to have the schema defined in one place in the code
Sometimes you need a “migration”
Changing the structure/meaning of fields
Adding indexes
Sometimes lazy, sometimes eager
Queuing up all your updates can be handy
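The lazy/eager distinction can be sketched in plain Python. This is an illustration under stated assumptions, not Ming's migration API: the `version` field, the `body` to `text` rename, and all function names are hypothetical.

```python
CURRENT_VERSION = 2


def migrate(doc):
    """Return a copy of a wiki-page dict upgraded to the current shape."""
    doc = dict(doc)
    if doc.get("version", 1) < CURRENT_VERSION:
        # Hypothetical schema change: the 'body' field was renamed to 'text'.
        doc["text"] = doc.pop("body", "")
        doc["version"] = CURRENT_VERSION
    return doc


def load_lazily(raw_doc):
    """Lazy: upgrade each document when it is read."""
    return migrate(raw_doc)


def migrate_eagerly(docs):
    """Eager: upgrade every stored document in a single pass."""
    return [migrate(doc) for doc in docs]
```

Lazy migration spreads the cost over normal reads and needs no downtime; eager migration is the safer choice when queries or new indexes depend on the new field being present everywhere.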
Python dicts are nice; objects are nicer
Ming Concepts
Inspired by SQLAlchemy
Group of classes to which you map your collections
Each class defines its schema, including indexes
Convenience methods for loading/saving objects and ensuring indexes are created
Migrations
Unit of Work – great for web applications
MIM – “Mongo in Memory” nice for unit tests
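The Unit of Work idea listed above can be sketched as follows. This is a simplified illustration of the pattern and not Ming's implementation; the class name, method names, and the `save` callable are hypothetical.

```python
class UnitOfWork:
    """Collects new or changed documents and writes them in one flush."""

    def __init__(self, save):
        self._save = save    # callable that persists a single document
        self._pending = {}   # keyed by id() so each object is queued once

    def register(self, doc):
        """Queue a document to be written at the next flush."""
        self._pending[id(doc)] = doc

    def flush(self):
        """Persist everything queued, then reset; returns the count written."""
        docs = list(self._pending.values())
        for doc in docs:
            self._save(doc)
        self._pending.clear()
        return len(docs)

    def clear(self):
        """Discard queued changes, e.g. when a web request fails."""
        self._pending.clear()
```

In a web application the flush would typically run once at the end of a successful request and `clear` on an error, so that a failed request writes nothing.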
Ming Example
    from ming import schema
    from ming.orm import MappedClass
    from ming.orm import (FieldProperty, ForeignIdProperty,
                          RelationProperty)

    # `session` is an ORM session created elsewhere (not shown on the slide)

    class WikiPage(MappedClass):
        class __mongometa__:
            session = session
            name = 'wiki_page'

        _id = FieldProperty(schema.ObjectId)
        title = FieldProperty(str)
        text = FieldProperty(str)
        comments = RelationProperty('WikiComment')

    MappedClass.compile_all()  # Lets ming know about the mapping
Open Source
Ming
http://sf.net/projects/merciless/
MIT License
Allura
http://sf.net/p/allura/
Apache License
Future Work
mongos
New Allura Tools
Migrating legacy SF.net projects to Allura
Stats all in MongoDB rather than Hadoop?
Better APIs to access your project data
Questions?
Rick Copeland @rick446