Recent releases of Perforce include improvements targeting performance and scalability. The edge/commit server architecture and lockless reads are two such improvements. This presentation will detail the effect of these improvements as measured in concurrency simulations and production deployments. Some of the server internals implemented to achieve these gains in performance and scalability will also be discussed.
Performance & Scalability Improvements in Perforce
Slide 1
Michael Shields
Server Performance Engineer
Slide 2
Has developed and supported software since 1977, specializing in
server optimization for software products including Perforce,
Sybase, and Ingres.
Slide 10
• db.peeking=2
• Significant concurrency improvements
• Shared locks not taken for some large reads, e.g. integrate
• db.peeking=3
• Lockless db.rev scan
• Instead of db.revhx and db.revdx scans with shared locks
• Can require more resources
• Not all commands and arguments can be lockless
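Lockless reads are controlled by the db.peeking configurable described above. A minimal sketch of enabling it with `p4 configure`, assuming a server release that supports lockless reads (verify the supported levels against your p4d release notes):

```shell
# Enable lockless reads at level 2 (significant concurrency improvements)
p4 configure set db.peeking=2

# Level 3 additionally enables the lockless db.rev scan,
# which can require more resources
p4 configure set db.peeking=3
```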
Slide 11
• btree layer implementation (Patent Pending)
• Structural changes requiring checkpoint replay
• Maximum table size is now 64 zettabytes
• Additional potential invalidation of process-level caches
• Data scans can tolerate writes
• Additional complexities
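The idea that "data scans can tolerate writes" can be illustrated with a generic optimistic-read pattern: the reader notes a version counter before scanning and retries if a writer bumped it mid-scan. This is a toy sketch of the general technique only, not Perforce's patent-pending btree implementation; all names here are hypothetical.

```python
import threading

class VersionedTable:
    """Toy optimistic-read illustration: readers take no lock and instead
    validate a version counter after scanning, retrying on interference."""

    def __init__(self):
        self._version = 0
        self._rows = []
        self._write_lock = threading.Lock()  # writers still serialize

    def write(self, row):
        with self._write_lock:
            self._rows.append(row)
            self._version += 1  # signal readers that data changed

    def lockless_scan(self):
        while True:
            start = self._version
            snapshot = list(self._rows)   # scan without a reader lock
            if self._version == start:    # no write landed mid-scan
                return snapshot
            # a writer interfered; discard the snapshot and retry
```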
Slide 12
• Executes commands “typical” of a “developer”
• sync, fstat, edit, change, submit, integrate, resolve, etc.
• Concurrent execution of many “developer” roles
• Random paths, files per task, and delays
• Shorter average delay simulates many more users
• 256@15sec might approximate 10,000@10min
• 512@15sec ~20,000@10min, YMMV
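The "N@delay" approximations above can be sanity-checked by comparing aggregate command rates: N simulated roles with an average delay of d seconds generate roughly N/d commands per second. A small sketch of that arithmetic (the equivalence heuristic itself is an assumption drawn from the slide's numbers):

```python
def commands_per_second(users: int, delay_seconds: float) -> float:
    """Aggregate rate if each user issues one command per delay interval."""
    return users / delay_seconds

# 256 roles at a 15-second delay vs 10,000 users at a 10-minute delay
sim = commands_per_second(256, 15)            # ~17.1 commands/sec
real = commands_per_second(10_000, 10 * 60)   # ~16.7 commands/sec
print(f"{sim:.1f} vs {real:.1f} commands/sec")
```

The same check holds for the later configurations: 512@5sec is ~102 commands/sec against ~100 for 60,000@10min, and 512@3sec is ~171 against ~167 for 100,000@10min.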
Slide 29
• Goal: Improve remote user experience
• Client commands handled by local edge
• Helps enable larger remote presence
• Network load to Commit Server likely reduced
• Network latency to Commit Server less of an impact
• acb simulation
• 128 “developer” roles, average delay of 15 seconds
• 128@15sec might approximate 5,000@10min
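Standing up an edge server so that client commands are handled locally involves defining server specs on the commit server and pointing the edge p4d at it. A rough sketch only; the serverids, paths, and hostnames below are illustrative, and the exact steps (checkpoint seeding, service users, replication configurables) should be taken from the Helix Core distributed deployment documentation:

```shell
# On the commit server: define the two server specs
# (set Services: commit-server and Services: edge-server in the forms)
p4 server commit-1
p4 server edge-tokyo

# Point the edge at the commit server
p4 configure set edge-tokyo#P4TARGET=commit-host:1666

# On the edge machine: start p4d with the edge serverid
# after seeding its root from a commit-server checkpoint
p4d -r /p4/edge -In edge-tokyo -p tokyo:1666 -d
```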
Slide 42
• acb simulation
• Average delay reduced to five seconds (from 15)
• 512@5sec might approximate 60,000@10min
• Simulation stressed
• 2x workspace servers: 2x more “developers”
• Only 14% longer run time when stressed
Slide 46
• Doubling workspace servers again
• Cheated by deploying two on each machine
• For large deployments, one per machine is best practice
• Average delay further reduced to three seconds
• 512@3sec might approximate 100,000@10min
• 100,000 simulated “developer” roles!
Slide 51
• Lockless Reads
• Get there now if you’re not already
• Edge/Commit Servers
• Deploy edge servers across latency
• Clustering
• Scale to even larger number of users