Scaling Servers and Storage for Film Assets
In the past two years, Pixar has grown from a handful of Perforce servers to over 90 servers. In this session, members of the Pixar team will discuss how they met the challenges in scaling out and being prudent about storage usage, from automating server creation to de-duplicating the repositories.

    Presentation Transcript

    • Scaling Servers and Storage for Film Assets
      Mike Sundy, Digital Asset System Administrator
      David Baraff, Senior Animation Research Scientist
      Pixar Animation Studios
    • Agenda
      •  Environment Overview
      •  Scaling Storage
      •  Scaling Servers
      •  Challenges with Scaling
    • Environment Overview
    • Environment
      As of March 2011:
      •  ~1000 Perforce users (80% of the company).
      •  70 GB db.have.
      •  12 million p4 ops per day (on the busiest server).
      •  30+ VMware server instances.
      •  40 million submitted changelists (across all servers).
      •  On 2009.1, but planning to upgrade to 2010.1 soon.
    • Growth & Types of Data
      Pixar grew from one code server in 2007 to 90+ Perforce servers storing all types of assets:
      •  art – reference and concept art; inspirational art for a film.
      •  tech – show-specific data, e.g. models, textures, pipeline.
      •  studio – company-wide reference libraries, e.g. animation reference, config files, a flickr-like company photo site.
      •  tools – code for our central tools team; software projects.
      •  dept – department-specific files, e.g. Creative Resources has “blessed” marketing images.
      •  exotics – patent data, casting audio, data for live-action shorts, story gags, theme park concepts, intern art show.
    • Scaling Storage
    • Storage Stats
      •  115 million files in Perforce.
      •  20+ TB of versioned files.
    • Techniques to Manage Storage
      •  Use the +S filetype for the majority of generated data. Saved 40% of storage for Toy Story 3 (1.2 TB).
      •  Work with teams to migrate versionless data out of Perforce. Saved 2 TB by moving binary scene data out.
      •  De-dupe files – saved 1 million files and 1 TB.
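    The +S filetype mentioned above can be applied automatically through the Perforce typemap, so generated data picks it up on `p4 add` without users thinking about it. A hypothetical fragment (the depot paths and extensions are illustrative, not Pixar's actual layout; +S keeps only the head revision's archive, +S10 keeps the ten most recent):

    ```
    # p4 typemap -- illustrative entries, not Pixar's real typemap
    TypeMap:
            binary+S //tech/.../renders/....exr
            binary+S10 //tech/.../caches/....bgeo
    ```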
    • De-dupe Trigger Cases
      •  Submitting many files when only one actually changed:
         p4 submit file1 file2 ••• fileN
         p4 submit file1 file2 ••• fileN   # only file2 actually modified
      •  Submitting, then immediately reverting to the prior contents:
         p4 submit file                    # contents: revision n
         # five seconds later: “crap!”
         p4 submit file                    # contents: revision n–1
      •  Deleting a file, then immediately re-adding it unchanged:
         p4 delete file
         p4 submit file                    # user deletes file (revision n)
         # five seconds later: “crap!”
         p4 add file
         p4 submit file                    # contents: revision n
    • De-dupe Trigger Mechanics
      [Diagram: revision chains of archive files (repfile.14/15, repfile.24/25/26, repfile.34/38) in which several revisions store identical contents (AABBCC…) alongside one differing revision (XXYYZZ…).]
    • De-dupe Trigger Mechanics
      •  +F for all files; detect duplicates via checksums.
      •  Safely discard a duplicate (here repfile.26, identical to repfile.24) by hardlinking and then atomically renaming over it:
         $ ln repfile.24 repfile.26.tmp
         $ mv repfile.26.tmp repfile.26    # atomic rename(2)
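    The hardlink-and-rename mechanics above can be sketched in Python. This is a minimal illustration, not Pixar's actual trigger: it assumes +F storage (each revision is a whole, uncompressed archive file), hashes each file, and replaces later duplicates with hardlinks to the first copy seen.

    ```python
    # Sketch only: de-duplicate identical archive revisions by replacing
    # later copies with hardlinks to the first occurrence.
    import hashlib
    import os


    def file_digest(path, chunk=1 << 20):
        """SHA-256 of a file's contents, read in chunks."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while True:
                block = f.read(chunk)
                if not block:
                    break
                h.update(block)
        return h.hexdigest()


    def dedupe(paths):
        """Hardlink later duplicates to the first file with the same digest.

        Returns the number of files replaced by hardlinks.
        """
        first_seen = {}   # digest -> path of first copy
        replaced = 0
        for path in paths:
            digest = file_digest(path)
            original = first_seen.get(digest)
            if original is None:
                first_seen[digest] = path
            elif os.stat(original).st_ino != os.stat(path).st_ino:
                tmp = path + ".tmp"
                os.link(original, tmp)   # ln original path.tmp
                os.replace(tmp, path)    # atomic rename over the duplicate
                replaced += 1
        return replaced
    ```

    The temp-file-then-rename dance matters: `os.replace` is atomic on POSIX, so a reader never sees a missing or half-written archive file, which is why the slide does the same with `ln` followed by a rename.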
    • Scaling Servers
    • Scale Up vs. Scale Out
      Why did we choose to scale out?
      •  Shows are self-contained.
      •  Performance of one depot won’t affect another.*
      •  Easy to browse other depots.
      •  Easier administration/downtime scheduling.
      •  Fits with workflow (e.g. no merging art).
      •  Central code server – share where it matters.
    • Pixar Perforce Server Spec
      •  VMware ESX version 4.
      •  RHEL 5 (Linux 2.6).
      •  4 GB RAM.
      •  50 GB “local” data volume (on EMC SAN).
      •  Versioned files on Netapp GFX.
      •  90 Perforce depots on a 6-node VMware cluster – special 2-node cluster for the “hot” tech show.
      •  For more details, see the 2009 conference paper.
    • Virtualization Benefits
      •  Quick to spin up new servers.
      •  Stable and fault-tolerant.
      •  Easy to administer remotely.
      •  Cost-effective.
      •  Reduces datacenter footprint, cooling, power, etc.
    • Reduce Dependencies
      •  Clone all servers from a VM template.
      •  RHEL vs. Fedora.
      •  Reduce triggers to a minimum.
      •  Default tables, p4d startup options.
      •  Versioned files stored on NFS.
      •  VM on a cluster.
      •  Can build a new VM quickly if one ever dies.
    • Virtualization Gotchas
      •  Had a severe performance problem when one datastore grew to over 90% full.
      •  Requires some jockeying to ensure load stays balanced across multiple nodes – manual vs. auto.
      •  Physical host performance issues can cause cross-depot issues.
    • Speed of Virtual Perforce Servers
      •  Used the Perforce Benchmark Results Database tools.
      •  Virtualized servers reached 95% of physical-server performance on the branchsubmit benchmark.
      •  85% of performance on the browse benchmark (not as critical to us).
      •  VMware flexibility outweighed the minor performance hit.
    • Quick Server Setup
      •  Critical to be able to quickly spin up new servers.
      •  Went from 2-3 days for setup to 1 hour.
      1-hour setup:
      •  Clone a p4 template VM. (30 minutes)
      •  Prep the VM. (15 minutes)
      •  Run the “squire” script to build out the p4 instance. (8 seconds)
      •  Validate and test. (15 minutes)
    • Squire
      Script which automates p4 server setup. Sets up:
      •  p4 binaries
      •  metadata tables (protect/triggers/typemap/counters)
      •  cron jobs (checkpoint/journal/verify)
      •  monitoring
      •  permissions (filesystem and p4)
      •  init.d startup script
      •  linkatron namespace
      •  pipeline integration (for tech depots)
      •  config files
    • Superp4
      Script for managing p4 metadata tables across multiple servers.
      •  Preferable to hand-editing 90 tables.
      •  Database-driven (i.e. a list of depots).
      •  Scopable by depot domain (art, tech, etc.)
      •  Rollback functionality.
    • Superp4 example
      $ cd /usr/anim/ts3
      $ p4 triggers -o
      Triggers:
            noHost form-out client "removeHost.py %formfile%"

      $ cat fix-noHost.py
      def modify(data, depot):
          return [line.replace("noHost form-out",
                               "noHost form-in")
                  for line in data]

      $ superp4 -table triggers -script fix-noHost.py -diff
      •  Copies triggers to a restore dir.
      •  Runs fix-noHost.py to produce new triggers, for each depot.
      •  Shows me a diff of the above.
      •  Asks confirmation; finally, modifies triggers on each depot.
      •  Tells me where the restore dir is.
    • Superp4 options
      $ superp4 -help
        -n                        Don't actually modify data.
        -diff                     Show diffs for each depot using xdiff.
        -category category        Pick depots by category (art, tech, etc.)
        -units unit1 unit2 ...    Specify an explicit depot list (regexp allowed).
        -script script            Python file to be execfile()'d; must define a function named modify().
        -table tableType          Table to operate on (triggers, typemap, …)
        -configFile configFile    Config file to modify (e.g. admin/values-config)
        -outDir outDir            Directory to store working files, and for restoral.
        -restoreDir restoreDir    Directory previously produced by running superp4, for when you screw up.
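    The superp4 flow described above – load each depot's table, run the user's modify() hook, and show a diff before writing anything back – can be sketched as a pure function. This is an illustrative reconstruction, not Pixar's actual code; fetching and writing the tables (via `p4 triggers -o` / `p4 triggers -i`) is left out, and `plan_table_edits` and its arguments are hypothetical names.

    ```python
    # Sketch of superp4's core review loop: apply modify() per depot and
    # produce a unified diff for confirmation, without touching the input.
    import difflib


    def plan_table_edits(tables, modify):
        """tables: {depot_name: [table lines]}; modify(data, depot) -> new lines.

        Returns (new_tables, diffs): the proposed lines per depot and the
        unified-diff text the admin reviews before confirming.
        """
        new_tables, diffs = {}, {}
        for depot, lines in tables.items():
            new_lines = modify(list(lines), depot)  # hand modify() a copy
            new_tables[depot] = new_lines
            diffs[depot] = "".join(
                difflib.unified_diff(
                    [l + "\n" for l in lines],
                    [l + "\n" for l in new_lines],
                    fromfile=depot + "/triggers (old)",
                    tofile=depot + "/triggers (new)",
                )
            )
        return new_tables, diffs
    ```

    With the fix-noHost.py modify() from the previous slide, each depot's diff shows the `form-out` line removed and the `form-in` line added, and nothing is written until the admin confirms – which is also what makes the restore-dir rollback cheap.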
    • Challenges With Scaling
    • Gotchas
      •  //spec/client filled up.
      •  User-written triggers were sub-optimal.
      •  “Shadow files” consumed server space.
      •  Monitoring is difficult – cue templaRX and mayday.
      •  Cap renderfarm ops.
      •  Beware of automated tests and clueless GUIs.
      •  verify can be dangerous to your health (cross-depot).
    • Summary
      •  Perforce scales well for large amounts of binary data.
      •  Virtualization = fast and cost-effective server setup.
      •  Use the +S filetype and de-duping to reduce storage usage.
    • Q&A Questions?