Hear about the multi-server Perforce architecture used at Remedy Entertainment, a developer of state-of-the-art action games, game franchises, and cutting-edge technology. Get tips on how virtualization, storage technologies, and the new distributed Perforce server features can be used to achieve high availability and quick recovery in different disaster scenarios. Handling large game content files, and the dependencies between game code and content assets, will also be covered.
3. #
• Privately held game developer based in Finland.
• Released games: Death Rally, Max Payne, Max Payne 2: The Fall of Max Payne, Alan Wake, Alan Wake’s American Nightmare, Death Rally Mobile.
• Franchises made into a movie, a TV series & a novel.
• Announced titles: Agents of Storm for iOS and the Xbox One exclusive title Quantum Break.
4. #
• Founded in 1995, currently 120+ employees.
• Over 100 Game of the Year awards.
• Franchises generated over $500M revenue.
• Max Payne IP sold for $43M.
• AAA games sold over 11M units.
• First mobile experiment: over 16M downloads and reached #1 in 70 countries.
7. #
Created by Remedy since 2004                  # of files    Total size     # of files > 100 MB
All projects, all revisions                   10.5 million  12 terabytes
All projects, #head revisions                 5 million     5.5 terabytes
Alan Wake (XBOX 360), #head                   1.1 million   920 gigabytes  1,300
Quantum Break (XBOX One, until today), #head  3 million     4.3 terabytes  7,000
Perforce Database                                           30 gigabytes
8. #
• Large content files
• Dependencies of game engine <-> internal tools <-> game content (in proprietary formats)
9. #
[Diagram of the build/content pipeline. Nodes: Tools source code, Tool binaries, 3rd party tools, Content source, Game source code, Export util source code, Export util, Runtime game binary, Runtime content]
10. #
• Large content files
• Dependencies of game engine <-> internal tools <-> game content (in proprietary formats)
• Everything that comes out comes from the Perforce depot
– Availability of the system is business critical
12. #
• System design approach
• Service implementation
• Principles of HA engineering
1. Elimination of single points of failure
2. Reliable crossover
3. Detection of failures as they occur.
• Source:
http://en.wikipedia.org/wiki/High_availability
13. #
• Client and access network don’t have HA
– Opting for fast manual response
• LAN core w/ act/act redundancy
• Servers with failover
• SAN w/ active/active redundancy
• Storage w/ redundant components
14. #
• HA design principles do not cover the concept of backups
– Even when HA is taken care of, data and availability can be lost by user actions and software failures
– The data still needs to be copied to offline storage for disaster recovery purposes
15. #
• Client and access network don’t have HA
– Opting for fast manual response
• LAN core w/ act/act redundancy
• Servers with failover
• SAN w/ active/active redundancy
• Storage w/ redundant components
17. #
• Used for offloading backups and integrity verification
• Covers application-level failures
• Activation requires manual intervention
[Diagram: primary servers perforce2:1666 and perforce3:1666, each replicated to a p4d instance on perforce1 (ports 1666 and 1667)]
18. #
• Snapshot of Perforce every 4 hours
• Runs a storage-provided snapshot under “p4d -c”
– Ensures database integrity
– Locks database for 30-50 seconds
• Near-instant recovery
• Can be mounted and exported to other hosts
– To run checkpoint, verify, …
– To run a test environment with production data
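The `p4d -c` trick above can be sketched as a small cron-driven wrapper. This is a hedged sketch only: the Perforce root path and the storage CLI name (`storage-snapshot`) are hypothetical placeholders, not the actual Remedy setup.

```shell
# Sketch of a 4-hour snapshot job. "p4d -c CMD" commits pending
# changes to disk and holds the database locked while CMD runs,
# so the storage snapshot sees a consistent database.
# All paths and the "storage-snapshot" CLI are hypothetical.
snapshot_cmd() {
  # $1 = P4ROOT, $2 = storage snapshot command; prints the
  # p4d invocation instead of running it (for cron, eval it)
  printf 'p4d -r %s -c "%s"' "$1" "$2"
}

# In cron, this would be executed, e.g.:
#   eval "$(snapshot_cmd /p4/root 'storage-snapshot p4-volumes')"
snapshot_cmd /p4/root 'storage-snapshot p4-volumes'
```

Building the command in a function keeps the lock-and-snapshot step in one place and makes it easy to dry-run before scheduling.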
20. #
• “A user may never see a failure. But the maintenance activity must.”
• Infrastructure monitored with vendor tools
• Central monitoring with Nagios
– P4D process, TCP connectivity to perforce:1666
– Check “p4 info” output
– Replication: check “changelist” counter on both partners
• P4review.py
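The replication check above (comparing the “changelist” counter on both partners) could look roughly like this Nagios-style sketch. The host names follow the deck’s examples; the lag threshold and the wiring are assumptions, not the actual monitoring script.

```shell
# Sketch of a replication-lag check between a master/replica pair.
# Reads the "change" counter (last changelist number) on each server
# and alerts if the replica trails the master by more than a threshold.
change_counter() {
  p4 -p "$1" counter change   # last changelist number on that server
}

lag_ok() {
  # $1 = master counter, $2 = replica counter, $3 = max allowed lag
  [ $(( $1 - $2 )) -le "$3" ]
}

# Example wiring (commented out -- needs live servers):
#   m=$(change_counter perforce2:1666)
#   r=$(change_counter perforce1:1666)
#   lag_ok "$m" "$r" 10 || echo "CRITICAL: replica lagging behind master"
```

A pure comparison function keeps the alert logic testable separately from the `p4` calls.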
21. #
• Define what HA means for your service
• Build it one step at a time
– Ensure redundancy of each component
– Make sure the component is monitored
• Backups are still needed
23. #
• Introduction to Remedy
• Perforce at Remedy
• High Availability
• Perforce Application Availability
• Monitoring
• Conclusions
24. #
Jouko Markkanen is an IT Manager at Remedy Entertainment
with broad experience in different areas of information and
communications technology including help desk responsibilities,
programming, application design, security systems, information
management, and infrastructure planning and design.
Editor's Notes
E3 Sofia scene
Agents of Storm in beta
Quantum Break to be released in 2015
- 1996 Virtual Reality 3D Mark
- Spin-off company Futuremark in 1997
- Over 120 employees from over 15 different countries
- TOP-50 growth companies in Finland
Remedy has been using Perforce as the sole SCM system for over 10 years. During that time, we have created several AAA console/PC games, as well as mobile games. The biggest production has been Alan Wake, and while still in-production, Quantum Break has already multiple times the number and size of files in our depot.
So far we have created over 10 million files, with a total size of 12 terabytes in 300K+ changelists. While the Perforce database is sized modestly at 30 gigabytes, the average file size is 1.2MB, and Quantum Break has over 7000 files sized over 100MB, and hundreds of files over 1GB in size.
This distribution is not typical for a “regular” software project, and the performance problems lie more in “how to copy the mass of files to/from the client”, instead of “how to manage the complex database metadata of a huge number of files”.
There are over 100 people working on QB, committing program code and content to our Perforce server. Most of them work on content production, using in-house tools to edit the game world. These tools export the content from the proprietary source format to a binary format that the game engine can present in real time.
A lot of the program code is shared between the tools and the game engine, to ensure a matching presentation. This means that when the format is updated, for example due to a new feature in the engine, the whole dependency chain must be rebuilt and redistributed, all of the existing content must be re-exported, and sometimes even the existing content source needs to be upgraded. This raises the whole version control and integration/delivery methodology to a different level of complexity.
The dark-grey boxes in the diagram depict the binaries built and delivered by our automated build system, or built locally for local testing and modifications. The dark-red boxes depict files stored in our Perforce server, and at the same time they are the files almost all the work on the project is done on. Which brings us to…
… the importance of the Perforce service in our company. Everything that is delivered with our final product, the game, comes out of the content stored in our Perforce. This makes it the #1 business-critical IT service at Remedy, and its availability is the top priority.
Before we delve into how we ensure HA of our Perforce service, let’s discuss what the term actually means. As is common in this age, we’ll start with a “common” definition, i.e., what Wikipedia writes about HA.
Wikipedia defines HA as a system design approach, and associated service implementation. Their purpose is to ensure a certain, specified, level of performance, specifically level of availability.
Three principles for practicing this approach and implementation are listed.
The first is to get rid of single points of failure, so that any single component needed to produce the service can fail without disrupting the service. This is usually achieved by means of redundancy, that is by duplicating, or multiplying, all components of the system.
The second is to provide means of transferring service reliably from a failed component to the redundant counterpart.
And to allow this, the third principle is automatic failure monitoring and detection; without it, the tasks of a failed component will never be transferred, until its redundant counterparts have also failed and availability is lost.
Next, we’ll go through the IT infrastructure layers we use to provide the Perforce service, and how their redundancy has been provided.
We’ll start off with storage: we use a shared storage system, which provides storage space with different characteristics, such as high IOPS for database storage and more inexpensive bulk storage for versioned files. This system is shown as a single entity, but in reality it is spread across multiple storage chassis, each having a RAID-style redundant array of disks, redundant power supplies, and redundant controller modules, so that there are no single points of failure.
Access to the shared storage system is provided via an iSCSI storage area network. This has simple redundancy: there are two switches, with active paths carried on both, so if one of them fails, the other continues storage operations.
On the next level are the servers. Our Perforce servers (there are multiple; more about them later) are virtualized and run on a cluster of hypervisors. Under normal conditions they are distributed across different physical hosts, so if one host fails, the other VMs keep running. The cluster also monitors itself, and in case of a host failure the VMs are restarted on the remaining hosts.
LAN connectivity between the servers as well as towards the clients is also redundant, with several technologies providing redundancy on different levels of the OSI network model.
At the final stage of the client-server path are the access network (“floor switch”) and the client computer itself. These do not have HA as such, as a failure there has a very limited area of effect. However, we are prepared for failures here as well: we keep spare access switches in store so they can be swapped manually but quickly in case of a failure, and the same applies to the computers and/or their components.
Even at this level the system is not foolproof and cannot guarantee 100% availability. Many of the crossover paths used to provide HA are built on automatic monitoring and software features, and software, for example, tends to have bugs. Also, mere humans use, administer, and operate the system, and humans tend to make errors. And even a well-designed and well-implemented HA system does not protect you if the datacenter is consumed by fire or flooded with water.
To protect from this, proper backups must be planned, made and tested.
This completes the HA architecture diagram. The backups are created with dedicated hardware and stored on a dedicated storage system, which should preferably reside offsite. We create backups on a dedicated system onsite, for faster recoveries, but replicate the backup content to an offsite datacenter for ultimate disaster recovery.
There is still one level of failures not covered by the generic HA IT infrastructure, but we can prepare for those as well: the case where the Perforce software itself fails for one reason or another (this is rare, but it has happened, to us as well).
We currently have two different primary Perforce servers. We have split projects across the servers to gain some performance scalability, and to allow one project to undergo maintenance without disturbing the other. This is possible because we have projects that use different game engines; projects that share a game engine are located on the same server, in different depots.
Both of the servers are replicated using Perforce pull replication to a third server. That server runs two P4D processes on different ports.
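A pull-replica setup of this kind is typically driven by Perforce configurables. The following is a hypothetical sketch, not Remedy’s actual configuration: the server ID `replica1` and the master address are placeholders, and the pull intervals are arbitrary.

```shell
# Hypothetical pull-replica configuration, set on the master with
# "p4 configure set". The replica p4d reads these at startup.
p4 configure set replica1#P4TARGET=perforce2:1666      # master to pull from
p4 configure set replica1#startup.1="pull -i 1"        # metadata pull thread, every 1s
p4 configure set replica1#startup.2="pull -u -i 1"     # versioned-file pull thread
p4 configure set replica1#db.replication=readonly      # reject metadata writes on replica
p4 configure set replica1#lbr.replication=readonly     # reject archive writes on replica
```

With two masters, the third host would run two p4d instances (e.g. on ports 1666 and 1667), each configured this way against its own master.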
The replicas serve two main purposes. One is to allow checkpoint and verify operations to run without interrupting end-user service. We only run these operations on the primary servers during scheduled maintenance breaks, when needed.
The other purpose is to have a fallback in case the master server fails irrecoverably. In this case, we need to manually change the replica server configuration (to allow write operations) and point the clients to the failover replica. As the clients use an alias name to find the server, we can change that alias to point to the replica, and as the name record has a 5-minute TTL, all clients are good to run within that period.
Naturally, this only helps if the master server has failed in such a way that the failure has not been replicated to the replica’s database or versioned files. If it has, we can always resort to recovering from backups. However, as a full recovery takes some time (currently almost 24 hours to copy all data back from the backups), we have another way…
… the storage snapshots. Many modern storage systems have a feature to create a near-instant point-in-time snapshot within the storage hardware (or rather, by the software running the hardware). To ensure the integrity of the snapshot, we have scheduled a command to run every 4 hours. This command is a “p4d -c …”, which tells the Perforce server to commit any pending changes to disk and lock the database while running a command; in this case the command tells the storage system to snapshot all the volumes assigned to that Perforce server.
The snapshot takes around 30-50 seconds with our system, during which time the Perforce server will hold all write operations. Compared to the almost two hours that creating a checkpoint of our database takes, this is pretty fast.
Mounting this snapshot back to the server also takes less than a minute. In case of a failure, it is possible to mount the previous snapshot in an alternate directory and start the Perforce server from there, while keeping the failed volumes online for further investigation.
But the use of the snapshots does not end there. You can also keep production running while mounting a snapshot on the same or another server, start a P4D on the snapshot folder, and use that, e.g., to test upgrades or configuration changes with real-world data, without touching production.
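Bringing up a second p4d on a mounted snapshot can be sketched as follows. The mount point and port are hypothetical placeholders; the point is only that the test instance uses its own root and port, so production is untouched.

```shell
# Sketch: start a test p4d on a mounted snapshot volume.
# /mnt/p4snap and port 1668 are hypothetical placeholders.
SNAP_ROOT=/mnt/p4snap            # snapshot volume, mounted read-write
p4d -r "$SNAP_ROOT" -p 1668 -d   # daemonize a separate instance on port 1668

# Then point a client at the test instance, e.g.:
#   p4 -p localhost:1668 info
```

The same pattern serves both recovery drills and upgrade rehearsals against real-world data.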
HA allows the system to recover when a single failure occurs. But the whole concept of High Availability is void if the environment is not monitored and single failures are not corrected before another failure occurs and brings the system down.
In Wikipedia, the third principle of HA engineering says that “A user may never see a failure. But the maintenance activity must.”
The HA infrastructure stack is being monitored by tools provided by individual component vendors. They are configured to send an alert email in case a component fails, and many even have a feature that notifies the vendor tech support autonomously.
We have a central monitoring system, utilizing Nagios, that monitors our Perforce servers in addition to the hardware and OS environment and their resources. There are several checks that verify that the P4D process is running, and a custom script that checks the replication status.
The p4review daemon is also a good monitoring tool (although this is not its primary purpose). We run p4review every two minutes, and if the server has any problems, it fails, sending a failure report to the admin contact email.