Evolving the nuget.org Architecture

Jeff Handley, Engineering Manager at Microsoft
@JeffHandley
275 000 000 package downloads
265 000 total packages
28 000 unique package IDs
7 500 000 requests per day
1 500 000 searches per day
700 000 package downloads per day
550 package uploads per day
60 user registrations per day
85 requests per second
Uptime (October 2013 – September 2014)
[Chart: uptime per scenario on an axis from 99.80% to 100.00%; data labels 99.86%, 99.88%, and 99.91%]
Scenarios: package installation and restore; feed / search; registration, login, and package uploads
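To put the uptime figures in perspective, the percentages can be converted into hours of downtime. This is a rough sketch that assumes the twelve-month period is 365 days long; the slide does not say which percentage belongs to which scenario, so the values are listed without labels.

```python
HOURS_IN_PERIOD = 365 * 24  # October 2013 through September 2014, assumed 365 days

def downtime_hours(uptime_percent, hours=HOURS_IN_PERIOD):
    """Convert an uptime percentage into hours of downtime over the period."""
    return (100.0 - uptime_percent) / 100.0 * hours

for uptime in (99.86, 99.88, 99.91):
    print(f"{uptime}% uptime is about {downtime_hours(uptime):.1f} hours of downtime")
```

Even the lowest figure on the chart corresponds to roughly half a day of downtime over the year.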
http://lapulpadigital.files.wordpress.com/2013/12/simon-cowell-people-sound-bad-when-recorded.jpg
Line of Business application 
Using familiar technology
API v2 architecture
SQL Azure database
Azure Blob Storage
Gallery / API v2: Azure web role (3 instances)
• EntityFramework code-first
• WCF Data Services and OData
• ASP.NET MVC and Razor
Search and feed requests
Website requests
Content Delivery Network (CDN)
Package downloads
http://www.thecoralgablesstory.com/2010/08/17/sellers-regret-is-the-new-buyers-remorse-in-real-estate/
Basing architecture on usage 
Instead of familiar technology
all other reads: 71%
searches: 20%
package downloads: 9%
package uploads: 0.01%
user registrations: 0.001%
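These shares line up with the daily counts quoted earlier in the deck. The check below is a sketch that assumes each percentage is a fraction of the 7 500 000 daily requests and that "all other reads" is simply the remainder; the slides do not state either point explicitly.

```python
# Daily counts taken from the statistics slide.
REQUESTS_PER_DAY = 7_500_000
daily_counts = {
    "searches": 1_500_000,
    "package downloads": 700_000,
    "package uploads": 550,
    "user registrations": 60,
}

def usage_shares(total, counts):
    """Return each category's share of total requests as a percentage."""
    shares = {name: 100.0 * n / total for name, n in counts.items()}
    # Assumption: everything not listed counts as "all other reads".
    shares["all other reads"] = 100.0 - sum(shares.values())
    return shares

shares = usage_shares(REQUESTS_PER_DAY, daily_counts)
for name, pct in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {pct:.4f}%")
```

Rounded, the results match the slide: reads dominate, and writes (uploads and registrations) are a tiny fraction of traffic.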
Avoiding failures
• Writes should not impact read or query performance
  • Command Query Responsibility Segregation (CQRS)
  • Append-only system for incremental processing
• Reads should not require server compute
  • Presentation models (views) are regenerated (and persisted) asynchronously
  • The materialized views can then be treated as static documents on the CDN
• Queries should not require external resources
  • Lucene search index can be held entirely in memory
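The separation of writes from reads can be sketched in a few lines. Everything here is hypothetical (the class names, the document shape, the version numbers); it only illustrates the idea that writes append to a log while a background step regenerates static documents that reads fetch directly.

```python
import json

class PackageStore:
    """Hypothetical write side: an append-only log of upload events."""
    def __init__(self):
        self.log = []

    def upload(self, package_id, version):
        # Writes only append; no read model is touched here.
        self.log.append({"id": package_id, "version": version})


class ViewBuilder:
    """Hypothetical background job that turns log entries into static documents."""
    def __init__(self, store):
        self.store = store
        self.position = 0    # how much of the log has been processed so far
        self.documents = {}  # stands in for blobs served from a CDN

    def run_once(self):
        for event in self.store.log[self.position:]:
            key = f"{event['id'].lower()}.json"
            doc = json.loads(self.documents.get(key, '{"versions": []}'))
            doc["versions"].append(event["version"])
            self.documents[key] = json.dumps(doc)
        self.position = len(self.store.log)


store = PackageStore()
builder = ViewBuilder(store)
store.upload("Humanizer", "1.0.0")
store.upload("Humanizer", "1.1.0")
builder.run_once()
# A read is now a plain document fetch with no query or server compute.
print(builder.documents["humanizer.json"])
```

The trade-off is that a read may briefly lag behind the latest write until the builder has run again.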
Catalog and materialized views 
Resolver metadata 
Another catalog 
HTML 
Web site 
catalog 
LUCENE Search
Building the catalog
page0.json: A B C
page1.json: D E F
page2.json: G H I
index.json
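A minimal sketch of building such a paged catalog follows. The JSON shapes (`pages`, `items`) and the in-memory `blobs` dictionary are assumptions made for illustration and are not the actual catalog format.

```python
import json

PAGE_SIZE = 3  # matches the three items per page drawn on the slide

def append_to_catalog(blobs, items, page_size=PAGE_SIZE):
    """Append items to a paged catalog held in `blobs` (name -> JSON text)."""
    index = json.loads(blobs.get("index.json", '{"pages": []}'))
    for item in items:
        pages = index["pages"]
        # Start a new page when there is none yet or the last one is full.
        if not pages or len(json.loads(blobs[pages[-1]])["items"]) >= page_size:
            pages.append(f"page{len(pages)}.json")
            blobs[pages[-1]] = json.dumps({"items": []})
        page = json.loads(blobs[pages[-1]])
        page["items"].append(item)
        blobs[pages[-1]] = json.dumps(page)
    # The index is written last so it never points at a page that is not stored yet.
    blobs["index.json"] = json.dumps(index)
    return blobs

blobs = append_to_catalog({}, list("ABCDEFGHI"))
print(sorted(blobs))
```

Existing pages are never rewritten once full, so new items only touch the last page and the index.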
Reading the catalog
index.json
page0.json: A B C
page1.json: D E F
page2.json: G H I
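Reading works in the opposite direction: start at the index, then follow it to each page. The sketch below assumes a hypothetical JSON shape (`pages`, `items`) and uses a dictionary in place of blob storage.

```python
import json

# Hypothetical in-memory stand-in for blob storage, shaped like the slide.
blobs = {
    "index.json": json.dumps({"pages": ["page0.json", "page1.json", "page2.json"]}),
    "page0.json": json.dumps({"items": ["A", "B", "C"]}),
    "page1.json": json.dumps({"items": ["D", "E", "F"]}),
    "page2.json": json.dumps({"items": ["G", "H", "I"]}),
}

def read_catalog(fetch):
    """Yield every catalog item in order, given a fetch(name) -> JSON text function."""
    index = json.loads(fetch("index.json"))
    for page_name in index["pages"]:
        page = json.loads(fetch(page_name))
        yield from page["items"]

print(list(read_catalog(blobs.__getitem__)))
```

Because `fetch` is just a function from name to text, the same reader would work against an HTTP client or a local file cache.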
Commit timestamps
page0.json: 1 | 1 | 2
page1.json: 3 | 3 | 3
page2.json: 4 | 5 | 5
index.json: 2 | 3 | 5
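The numbers on this slide are consistent with the index recording, for each page, the latest commit timestamp found on that page. A sketch under that reading (the page names are borrowed from the previous slides and are an assumption here):

```python
# Commit timestamps per page, as drawn on the slide.
pages = {
    "page0.json": [1, 1, 2],
    "page1.json": [3, 3, 3],
    "page2.json": [4, 5, 5],
}

def build_index(pages):
    """Summarise each page by the latest commit timestamp it contains."""
    return {name: max(stamps) for name, stamps in pages.items()}

index = build_index(pages)
print(index)  # {'page0.json': 2, 'page1.json': 3, 'page2.json': 5}
```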
Incremental updates
page0.json: 1 | 1 | 2
page1.json: 3 | 3 | 3
page2.json: 4 | 5 | 5
index.json: 2 | 3 | 5
Cursor: 45
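A cursor lets a consumer skip everything it has already processed. In the sketch below the consumer compares its cursor with the per-page timestamps in the index and opens only pages that contain something newer. The function name, the data layout, and the starting cursor value of 4 are chosen for illustration only.

```python
pages = {
    "page0.json": [1, 1, 2],
    "page1.json": [3, 3, 3],
    "page2.json": [4, 5, 5],
}
# Assumed index layout: latest commit timestamp per page.
index = {"page0.json": 2, "page1.json": 3, "page2.json": 5}

def read_since(index, pages, cursor):
    """Return (new items, new cursor), opening only pages newer than the cursor."""
    new_items = []
    for name, latest in index.items():
        if latest <= cursor:
            continue  # nothing on this page is newer than the cursor
        new_items.extend((name, stamp) for stamp in pages[name] if stamp > cursor)
    new_cursor = max([cursor] + [stamp for _, stamp in new_items])
    return new_items, new_cursor

items, cursor = read_since(index, pages, 4)
print(items, cursor)  # [('page2.json', 5), ('page2.json', 5)] 5
```

Running it again with the returned cursor yields no items, which is what makes repeated polling cheap.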
Materializing views as blobs
• Read new catalog entries
• Apply view
• Identify blobs to be written
• Read any existing blobs
• Merge data using RDF
• Write the new blob
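The steps above can be sketched as a single function. This is not the NuGet implementation: the entry shape, the `hasVersion` predicate, and the blob naming are hypothetical, and a plain set union of triples stands in for a real RDF merge.

```python
import json

def materialize(entries, blobs):
    """Fold new catalog entries into per-package view blobs (name -> JSON text)."""
    pending = {}
    for entry in entries:
        # Apply the view and identify the blob each entry belongs to.
        name = f"{entry['id'].lower()}.json"
        triples = pending.setdefault(name, set())
        triples.add((entry["id"], "hasVersion", entry["version"]))
    for name, new_triples in pending.items():
        # Read any existing blob, merge, and write the result back.
        existing = {tuple(t) for t in json.loads(blobs.get(name, "[]"))}
        merged = existing | new_triples  # set union stands in for an RDF merge
        blobs[name] = json.dumps(sorted(merged))
    return blobs

blobs = {}
entries = [{"id": "Humanizer", "version": "1.0.0"}]
materialize(entries, blobs)
materialize(entries, blobs)  # replaying the same entries leaves the blob unchanged
print(blobs["humanizer.json"])
```

Because the merge is a union, replaying catalog entries after a failure does not corrupt the view.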
http://jerryfahrni.com/2009/12/biometric-identification-and-facial-recognition/
Database 
Server 
http://jerryfahrni.com/2009/12/biometric-identification-and-facial-recognition/
What we planned 
• Add API v3 support to nuget.org
• Update the NuGet client to use API v3 
• Factor the server into NuGet packages 
• Factor the client into NuGet packages
Design resource formats 
• Search 
• Dependency resolution 
• Website rendering
• Other package metadata reads 
• Gallery mirroring 
• Custom metadata consumption scenarios
Representing a graph in JSON
• Collection+JSON and Collection.Next+JSON
  • Very similar to Atom
  • Includes query templates
• JSON-API and most others
  • Emphasis on RESTful CRUD
  • Client specification of data shape in some
• We didn’t want to expose our raw data model
  • Scenario-specific views instead
http://www.flickr.com/photos/foxypar4/1004464889/
JSON-LD 
HTTP://JSON-LD.ORG/ 
{ 
"@context": "http://json-ld.org/contexts/person.jsonld", 
"@id": "http://dbpedia.org/resource/John_Lennon", 
"name": "John Lennon", 
"born": "1940-10-09", 
"spouse": "http://dbpedia.org/resource/Cynthia_Lennon" 
}
Linked Data 
• @context allows the JSON document to be treated as an RDF data-set 
• RDF = Resource Description Framework
• Fully-qualified names and relationships (URI triples)
• RDF data-set benefits
• Easily merged and queried 
• Idempotent 
• Everything is namespaced 
• Based on W3C standards 
• RDF 1.0: 1999 
• RDF 1.1: February 2014 - http://www.w3.org/RDF/ 
• JSON-LD: January 2014 - http://www.w3.org/TR/json-ld/ (W3C Recommendation)
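To make the "easily merged" and "idempotent" points concrete, here is a deliberately simplified sketch. It is not a JSON-LD processor: it handles only a flat node, and the inline `CONTEXT` mapping is an illustrative assumption rather than the contents of the remote context used in the John Lennon example.

```python
# Assumed term-to-URI mapping, standing in for a resolved @context.
CONTEXT = {
    "name": "http://xmlns.com/foaf/0.1/name",
    "born": "http://schema.org/birthDate",
    "spouse": "http://schema.org/spouse",
}

def to_triples(node, context):
    """Expand a flat JSON-LD-style node into (subject, predicate, object) triples."""
    subject = node["@id"]
    return {
        (subject, context[key], value)
        for key, value in node.items()
        if not key.startswith("@")  # skip keywords such as @id and @context
    }

john = {
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

triples = to_triples(john, CONTEXT)
merged = triples | to_triples(john, CONTEXT)  # merging the same data again
print(len(triples), merged == triples)  # 3 True
```

Since every name is a full URI, triples from different documents can be combined with a set union without key collisions.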
Azure Blob Storage
API v3 search (feed)
Search Service: Azure worker role (3+ instances), Lucene.NET, full index loaded into memory
Backend jobs: Azure VM (1 instance)
SQL Azure warehouse
Gallery / API v2: Azure web role (3+ instances)
Content Delivery Network (CDN): package downloads
SQL Azure database
Website requests
API v3 package metadata
Metrics Service: Azure website (3+ instances)
API v3 clients 
• Get service index from CDN 
• Links are followed to resources: 
• Search service 
• Dependency resolver views 
• Other metadata views 
• Packages 
• Metrics service
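A client that works this way hard-codes only the service index location and discovers everything else from it. The sketch below uses a made-up service index: the `resources`, `@type`, and `@id` field names, the type names, and the example.test URLs are all assumptions for illustration.

```python
import json

# Hypothetical service index document, as it might be fetched from the CDN.
SERVICE_INDEX = json.dumps({
    "resources": [
        {"@type": "SearchService", "@id": "https://example.test/search"},
        {"@type": "ResolverViews", "@id": "https://example.test/resolver/"},
        {"@type": "MetricsService", "@id": "https://example.test/metrics"},
    ]
})

def resolve(service_index_text, resource_type):
    """Return the URL advertised for a resource type in the service index."""
    index = json.loads(service_index_text)
    for resource in index["resources"]:
        if resource["@type"] == resource_type:
            return resource["@id"]
    raise KeyError(resource_type)

print(resolve(SERVICE_INDEX, "SearchService"))
```

Moving a service then only requires updating the index document, not shipping a new client.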
http://preview.nuget.org/ver3-preview/index.json 
Humanizer RavenDB.Client
http://readingafterbedtime.wordpress.com/2012/11/01/october-2012-in-review/
http://freethoughtblogs.com/biodork/2012/03/19/cross-country-connections-tricky/
API v2 client behavior locked 
• Lots of existing clients in the wild 
• All requests are made through a single Gallery pipeline 
• Download requests 
• Record statistics to database 
• Redirect to the CDN to get the nupkg
Azure Blob Storage
API v3 search (feed)
Search Service: Azure worker role (3+ instances), Lucene.NET, full index loaded into memory
Backend jobs: Azure VM (1 instance)
SQL Azure warehouse
Gallery / API v2: Azure web role (3+ instances)
Content Delivery Network (CDN): package downloads
SQL Azure database
Website requests
API v3 package metadata
Metrics Service: Azure website (3+ instances)
API v2 OData
Search service integration
• Gallery receives OData request
• Search queries identified and stopped
• API v3 request to search service
• JSON-LD response
• Transformed into OData XML
• OData XML returned
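The transformation step can be pictured as a small adapter. This sketch uses only the Python standard library, a hypothetical search response shape (`data`, `@id`, `id`), and a bare Atom feed in place of the full OData XML the gallery returns.

```python
import json
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"

def to_atom_feed(search_response_text):
    """Turn a (hypothetical) v3 search response into a minimal Atom feed string."""
    results = json.loads(search_response_text)["data"]
    feed = ET.Element(f"{{{ATOM}}}feed")
    for result in results:
        entry = ET.SubElement(feed, f"{{{ATOM}}}entry")
        ET.SubElement(entry, f"{{{ATOM}}}id").text = result["@id"]
        ET.SubElement(entry, f"{{{ATOM}}}title").text = result["id"]
    return ET.tostring(feed, encoding="unicode")

# Made-up response from the search service.
response = json.dumps({
    "data": [
        {"@id": "https://example.test/packages/humanizer", "id": "Humanizer"},
        {"@id": "https://example.test/packages/ravendb.client", "id": "RavenDB.Client"},
    ]
})
print(to_atom_feed(response))
```

Existing API v2 clients keep receiving XML in the shape they expect while the query itself is answered by the new search service.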
Metrics service integration
• Gallery receives download request
• Call to database is removed
• API v3 request to metrics service
• JSON-LD POST
• Fire and forget
• Redirect to package on CDN
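"Fire and forget" means the download never waits for, or fails because of, the metrics call. A sketch of that behavior with a background thread follows. The function names, the event shape, and the CDN URL layout are hypothetical, and `post_metric` stands in for the HTTP POST to the metrics service.

```python
import threading

def _safe_post(post_metric, event):
    try:
        post_metric(event)
    except Exception:
        pass  # a metrics failure must never break the download

def handle_download(package_id, version, post_metric, cdn_base):
    """Return the redirect URL immediately; report the download in the background."""
    event = {"id": package_id, "version": version}
    # Fire and forget: start the metrics call and do not wait for its result.
    threading.Thread(target=_safe_post, args=(post_metric, event), daemon=True).start()
    return f"{cdn_base}/{package_id.lower()}.{version}.nupkg"

recorded = []
url = handle_download("Humanizer", "1.0.0", recorded.append,
                      "https://cdn.example.test/packages")
print(url)
```

The cost of this design is that a download can go uncounted if the metrics service is unavailable, which is traded for keeping downloads independent of the statistics path.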
Iterative success 
• Usage-based priorities for availability 
• JSON-LD as our API v3 resource format 
• Scenario-focused API v3 subsystems 
• API v2 gallery became the first API v3 client 
• API v3 responses are transformed into API v2 responses 
• Getting API v3 benefits for existing API v2 users
NUGET 3.0 – TRANSITIONING FROM ODATA TO JSON-LD
Friday at 14:20 in KITT
Thanks 
Come get NuGet stickers  
Gallery and Backend Services 
http://github.com/NuGet 
Catalog and Collectors 
http://github.com/NuGet/NuGet.Services.Metadata 
JSON-LD Processor 
http://github.com/NuGet/json-ld.net 
http://www.nuget.org/packages/json-ld.net 
@jeffhandley | jeffhandley.com | jeff.handley@microsoft.com | blog.nuget.org

Editor's Notes

  1. Visual Studio 2012 launch day: 3 GB memory grant inside SQL
  2. We thought JSON would save us!