Solving the Game Content Problem
Pipeline theory and practice
Koray Hagen
Background
Senior Programmer
Tools and Core Technology
SIEA, Santa Monica Studio
Focus:
Animation systems
Content build system
Content compiler tools
Game industry has a big problem
The problem is data
• The time required to develop games is increasing
• The manpower and computational cost of developing games is increasing
• The unit price of a game has remained relatively stable
Meanwhile …
• Player expectations for graphical fidelity are increasing
• Player expectations for the scale of the user experience are increasing
• Developer expectations for our tools to meet this demand are increasing
• This problem has manifested itself in the form of impacted iteration times:
– For Artists
– For Designers
– For Programmers
– For Production
– For Operations
• Questions:
– How long until a content author sees the result of changing ____ ?
– How long until a content author can commit a change to ____ ?
• Answers:
– Desired? Instantly
– Reality? Possibly minutes to hours
This problem is the domain of this lecture
It is also the domain of your game’s pipeline
The game content pipeline
• Purpose:
– Transform unmeaningful data into meaningful data
– Often, that means source data into production data
– Production data is content that ends up in a player’s hands
• Game content:
– Meshes
– Textures
– Animations
– Shaders
– Materials
– Physics
– Scripts
– Narrative
– Audio
– Heterogeneous:
• Has different intrinsic characteristics (formats, disk-space impact)
• Requires different applied transformations
– Hierarchical relationships:
• Dependencies
• Content identity
• Philosophy:
– Create a unified content pipeline system and architecture at our studio
– Properties:
• Correct
• Expressive
• Explicit
• Homogeneous
• Scalable
• Debuggable
This is a lecture on how we think and reason about
achieving these desired properties
The agenda
• Foundations
– Creating the building blocks of a content pipeline
• Theory and architecture
– Considerations for building a pipeline
– High level view of design
• Lessons and current research
– Concrete technology examples
– Questions for the future
Foundations
Each piece represents a conceptual building block
Data transformation functions
• Purpose:
– Core operator of a game’s pipeline
– Transforms a unit of data from one state to another state
• Invariants:
– Serializable
• Both inputs and outputs are representable in different contexts:
– Memory mapped
– File system
– Networked
– Transferrable
• Usable within a game’s pipeline or run-time
– Side-effect free
• Safely usable in a concurrent system
• Knowable state and always debuggable
– Discrete
• Conceptually simple to reason about
• Represents a single unit of transformation
– Atomic
• Transformation either did or did not occur; there is no half-way
• Goals:
– Develop a library of transformations
• Needed to represent the full scope of game data
– Define an appropriate granularity
• Base judgment on the time and resources required for the transform
– Fervently respect invariants
• This will allow powerful architectural opportunities later
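The invariants above can be made concrete with a small sketch. The function below is a hypothetical, toy stand-in for a real transform (the names, the "bc7" format tag, and the size arithmetic are all illustrative): it is pure, its inputs and outputs are plain serializable values, and it is atomic in the sense that it either returns a complete result or raises.

```python
import hashlib
import json

def compress_texture(texture: dict) -> dict:
    """Toy data transformation function (hypothetical example).

    Invariants illustrated:
    - Serializable: input and output are plain JSON-representable dicts.
    - Side-effect free: no file or global state is touched; the same
      input always yields the same output, so it is safe to run
      concurrently and safe to cache.
    - Discrete/atomic: one unit of work that either returns a complete
      result or raises -- there is no partial output.
    """
    payload = json.dumps(texture, sort_keys=True).encode()
    return {
        "name": texture["name"],
        "format": "bc7",                  # assumed target format
        "content_hash": hashlib.sha256(payload).hexdigest(),
        "size_bytes": len(payload) // 4,  # stand-in for real compression
    }
```

Because the function is deterministic, calling it twice with the same input produces identical results, which is exactly what the caching and concurrency machinery later in the deck relies on.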
Dependency graph
• Purpose:
– Track the global state of data and relationships
– Serves as an interface to both data and transformations
• Data dependency graph
– The relationship network of data files
– Example:
• Skybox
• Cubemap
• 1D Texture (six)
• Cubemap Texture
– Properties:
• File identity (as a path or GUID)
• Connectivity to and from other files
• Reachability to and from other files
• Transformation dependency graph
– The relationship network of data transformation functions
– Properties and differences from the data dependency graph:
• Requires critical-path evaluation for a desired result
• Requires deciding if and when a transformation occurs
• Performance critical
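As a sketch of ordering such a graph for evaluation, Python's standard `graphlib` can topologically sort a dict-based dependency map. The node names loosely echo the skybox example above and are illustrative only:

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each derived file maps to the inputs it depends on; a transform
# produces each derived file from its inputs (names illustrative).
deps = {
    "cubemap": ["face_0", "face_1", "face_2", "face_3", "face_4", "face_5"],
    "cubemap_texture": ["cubemap"],
    "skybox": ["cubemap_texture"],
}

# Every file appears after all of its dependencies, so transforms can
# run in this order -- or in parallel within each "ready" set.
order = list(TopologicalSorter(deps).static_order())
```

`static_order()` also raises `CycleError` on circular dependencies, which is one cheap way to validate a content graph as it is constructed.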
Storage
• Purpose:
– Provide permanent or transient contexts for information and data
• Considerations:
– Often a tradeoff between latency, capacity, and longevity
• Example: low latency, in-memory key-value database
• Example: high latency, networked file system
– Needed for many areas in a pipeline:
• Source data
• Intermediate data
• Inter/Intraprocess communication
• Production data
– Should never be directly interacted with by transformation functions:
• Fundamentally breaks invariants
• Should be orchestrated at a higher level
– During evaluation of the transformation dependency graph
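A minimal sketch of that separation, with a plain dict standing in for a real file system or key-value store (an assumption for the sketch): the transform stays pure, and all reads and writes happen one level up in the orchestrator.

```python
def to_upper(text: str) -> str:
    """Pure transform: no storage access, trivially cacheable."""
    return text.upper()

class Orchestrator:
    """Owns all storage interaction; transforms never touch storage.
    `storage` is a dict standing in for a file system or database."""

    def __init__(self, storage: dict):
        self.storage = storage

    def run(self, transform, src_key: str, dst_key: str) -> None:
        data = self.storage[src_key]   # read at the orchestration layer
        result = transform(data)       # pure, side-effect-free call
        self.storage[dst_key] = result # write at the orchestration layer

store = {"hero.txt": "hello"}
Orchestrator(store).run(to_upper, "hero.txt", "hero.baked")
```

Keeping I/O out of the transform preserves its invariants: the same function can later be pointed at memory, disk, or network storage without modification.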
Theory and architecture
Computational work
• Iteration times:
– For content authors, improving iteration is purely a metric of time
• The amount of computation required is not relevant
– Often, more computation results in more time required
• Mitigation strategies:
– Data model
• Reduction of the transformation steps required to see a result
– Transformations are unnecessary or optional
– Direct optimization or removal of transformations
• Ideal to address; the best general optimization is to do less work
– Concurrency
• Leveraging the invariants of our data transformation functions
• Available contexts:
– Threads
– Processes
– Network
– Avoidance
• Hierarchical caching contexts (more detail later):
– In-memory
– File system
– Network
• Pruning action based on criteria:
– Unneeded for the desired result
– Redundant
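Avoidance by pruning can be sketched as a reachability walk from the desired outputs: anything not reached is never computed. The graph and node names below are illustrative:

```python
def needed(targets, deps):
    """Prune the graph to the set of nodes actually required to
    produce `targets`; everything else can be skipped entirely."""
    required, stack = set(), list(targets)
    while stack:
        node = stack.pop()
        if node not in required:
            required.add(node)
            stack.extend(deps.get(node, []))
    return required

deps = {
    "level_1.pak": ["rock_mesh", "rock_texture"],
    "level_2.pak": ["tree_mesh"],
    "rock_texture": ["rock_texture_src"],
}

# Building only level_1 never touches tree_mesh or level_2.
subset = needed(["level_1.pak"], deps)
```

This is the "unneeded for the desired result" criterion; redundancy pruning (skipping work whose inputs have not changed) is what the caching layers later in the deck provide.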
Conceptual architectures
• Live iteration
– Working directly in the game to see changes
– Streaming data to the game through an external tool
– Properties:
• Has the benefit of the user seeing instant results
• Has the problem of data transience
– Edit state could become corrupt
– Edit state could encounter an irrecoverable error (crashes)
• Has the problem of needing to reflect changes into source data
– Other assets may depend on the changes
• Asset baking
– Using an offline toolset to “cook” or “bake” assets for the game
– Properties:
• Has the benefit of permanence
• Has the benefit of a simple directional flow of changes
– Source → intermediate → production
• Has the problem, often, of increased computational cost
• Has the problem, often, of non-uniformity
– Example: different teams having different file formats
– Resulting in a more complex unified pipeline
• Philosophy:
– Live iteration workflows and asset baking can be reconciled
– Both concepts are shades of the same architectural principles
• Desired pipeline architecture:
– Transferrable library of available data transformations
– Full data dependency graph of all game content
– Computable transformation dependency graphs
– Generalized hierarchical caching model
Visualization of a content build
Asset baking pipeline
Visualization of a content build
Live iteration pipeline
Lessons and current research
Data transformation functions
• Lesson: Define the data formats
– Developed an in-house object serialization library
• Similar to Google FlatBuffers and Protocol Buffers
• Designed for:
– Memory-mappability
– Binary format for speed, JSON format for debugging
– Data layout optimization
– Versioning and identification
– Automatic API generation
– Custom allocators
• Lesson: Discipline in respecting invariants
– Allows for concurrency guarantees
– Allows for data, build, and game state guarantees
• Lesson: Transferability is crucial
– The same library of functions can be used at bake time and run time
– Opportunity for the transformation dependency graph to computationally decide where to execute a transform:
• A thread
• A local process
• A networked process
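Because transferable, side-effect-free transforms are safe to ship anywhere, picking an execution context can reduce to picking a pool. A loose sketch, where the thread-vs-process choice stands in for the graph's placement decision (the heuristic and function are illustrative; networked execution would need a job system not shown here):

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def transform(n: int) -> int:
    # Stands in for a side-effect-free data transformation; because it
    # is pure and picklable, it can run on a thread, in a local
    # process, or on a remote machine without changing its code.
    return n * n

def run_all(items, cpu_bound: bool):
    # A simple stand-in for the graph "deciding where to execute":
    # CPU-heavy transforms go to processes (to sidestep the GIL),
    # lightweight or I/O-bound ones to threads.
    pool_cls = ProcessPoolExecutor if cpu_bound else ThreadPoolExecutor
    with pool_cls() as pool:
        return list(pool.map(transform, items))
```

The key point is that the transform itself is untouched by the decision; only the orchestration layer changes.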
Dependency graphs
• Lesson: Most research required here
– Open problem: scaling computation to millions of nodes
– Open problem: cache-coherent spatial data structures and processing
– Open problem: extremely performant transformation dependency graph evaluation and iterative construction
• Lesson: Describing the flow of a network
– Transformation dependency graph
• Required a language (DSL) to describe the transformation network
• Transformations are known as commands:
– Examples of granularity:
» compress_texture
» create_animation_clip
• The DSL results in a vast amount of automatic code generation
• Lesson: Graph operators
– Transformation dependency graph traversal:
• Two operators could encapsulate our most complex operations
• Reductions:
– Compute the subgraph under a node as input to a command
• Maps:
– Direct a single input node to a command
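A toy sketch of the two operators over a dict-based graph (graph shape, node names, and the commands are all hypothetical): a map feeds a single node to a command, while a reduction gathers the whole subgraph under a node and hands it to the command as one input set.

```python
# node -> children; an illustrative slice of a character's content
graph = {
    "character": ["mesh", "skeleton"],
    "skeleton": ["clip_walk", "clip_run"],
}

def map_op(node, command):
    """Map: direct a single input node to a command."""
    return command(node)

def reduce_op(root, command, g=graph):
    """Reduction: collect the subgraph under `root` and pass the
    whole set to a command at once."""
    subgraph, stack = [], [root]
    while stack:
        node = stack.pop()
        subgraph.append(node)
        stack.extend(g.get(node, []))
    return command(sorted(subgraph))
```

For example, `map_op("clip_walk", compress)` would transform one clip, while `reduce_op("skeleton", build_anim_set)` would hand every clip under the skeleton to a single packing command.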
Storage
• Lesson: The file system by itself is very limiting
– Butting heads with Windows file system rules is painful:
• Path lengths (260 characters)
• Path and file name uniqueness
– Often results in insane naming conventions for files
• Lesson: Creating a global asset library
– Many reasons for desirability:
• Freedom for custom identification and file versioning
• Freedom from file system path and naming conventions
• Filenames on physical disk can be mangled, while the asset library can ensure human readability
Caching
• Lesson: Hierarchical caching is a powerful model
– Goal: never re-compute a function with the same parameters
– Achieved by decorating functions with appropriate caching strategies:
• Match a function with the most appropriate caching semantic
• Driven by the nature of the input and output data
• Example:
– File-based parameters cached to the file system or network
– Normal function parameters cached in memory
– Any piece of computation can be decorated with caching
• Found it to be most beneficial with:
– Path and string manipulation
– Data transformation functions
– Hiding the caching structure behind generalized interfaces and decorators allowed for experimentation
• Proved to be invaluable when we switched our file-system-based key-value store from SQLite to LMDB
– Our current implementations:
• In-memory: achieved with function-level memoization
• File system: achieved with LMDB
• Networked: achieved with an in-house technology backed by Ceph (a networked object store)
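A sketch of such a caching decorator, with a plain dict standing in for the LMDB-style file-system tier (the key scheme is illustrative): look in memory first, then the slower store, and only compute on a full miss, writing the result back to both tiers.

```python
import functools
import json

disk_cache = {}  # dict standing in for an LMDB-style key-value store

def cached(fn):
    """Hierarchical caching sketch: in-memory memo, then the
    'file-system' store, then actual computation."""
    memo = {}

    @functools.wraps(fn)
    def wrapper(*args):
        key = fn.__name__ + json.dumps(args)
        if key in memo:               # tier 1: in-memory
            return memo[key]
        if key in disk_cache:         # tier 2: "file system"
            memo[key] = disk_cache[key]
            return memo[key]
        result = fn(*args)            # full miss: compute once
        memo[key] = disk_cache[key] = result
        return result
    return wrapper

calls = []

@cached
def bake(asset: str) -> str:
    calls.append(asset)               # records real computations
    return asset.upper()
```

Because the decorator hides the tiers behind one interface, swapping the backing store (say, SQLite for LMDB, or adding a networked tier) touches no decorated function.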
Takeaways
Hard Problems
• Building the foundation
– Most studios, including ours, have a long history of tools development
– Difficult to modernize and justify changing monolithic, stable tools
– Large upfront cost for changing data formats
• Dependency graph scalability
• Dependency graph decision making
– Performant and meaningful computation against large graphs is difficult but not insurmountable
– Pushing automated decision making to cover:
• Caching strategies
• Appropriate contexts for data transformations
The problem is data
The solution is automation
Thank you
Questions?
Salesforce Certified Field Service ConsultantAxelRicardoTrocheRiq
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024StefanoLambiase
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWave PLM
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 

Recently uploaded (20)

Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
The Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdfThe Evolution of Karaoke From Analog to App.pdf
The Evolution of Karaoke From Analog to App.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Salesforce Certified Field Service Consultant
Salesforce Certified Field Service ConsultantSalesforce Certified Field Service Consultant
Salesforce Certified Field Service Consultant
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 

Solving the Game Content Problem Pipeline theory and practice

  • 1. Solving the Game Content Problem Pipeline theory and practice Koray Hagen
  • 2. Background Senior Programmer Tools and Core Technology SIEA, Santa Monica Studio Focus: Animation systems Content build system Content compiler tools
  • 3. Game industry has a big problem
  • 5. • The time required to develop games is increasing
  • 6. • The time required to develop games is increasing • The manpower and computational cost of developing games is increasing
  • 7. • The time required to develop games is increasing • The manpower and computational cost of developing games is increasing • The unit price of a game has remained relatively stable
  • 9. • Player expectations for graphical fidelity are increasing
  • 10. • Player expectations for graphical fidelity are increasing • Player expectations for the scale of the user experience are increasing
  • 11. • Player expectations for graphical fidelity are increasing • Player expectations for the scale of the user experience are increasing • Developer expectations for our tools to meet this demand are increasing
  • 12. • This problem has manifested itself in the form of impacted iteration times: – For Artists – For Designers – For Programmers – For Production – For Operations
  • 13. • This problem has manifested itself in the form of impacted iteration times: • Questions: – How long until a content author sees the result of changing ____ ? – How long until a content author can commit a change to ____ ?
  • 14. • This problem has manifested itself in the form of impacted iteration times: • Questions: • Answers: – Desired? Instantly – Reality? Possibly minutes to hours
  • 15. This problem is the domain of this lecture
  • 16. It is also the domain of your game’s pipeline
  • 17. The game content pipeline • Purpose: – Transform unmeaningful data into meaningful data
  • 18. The game content pipeline • Purpose: – Transform unmeaningful data into meaningful data – Often, that means source data into production data
  • 19. The game content pipeline • Purpose: – Transform unmeaningful data into meaningful data – Often, that means source data into production data – Production data is content that ends up in a player’s hands
  • 20. The game content pipeline • Purpose: • Game content: – Meshes – Textures – Animations – Shaders – Materials – Physics – Scripts – Narrative – Audio
  • 21. The game content pipeline • Purpose: • Game content: – Heterogeneous: • Has different intrinsic characteristics (formats, disk-space impact) • Requires different applied transformations
  • 22. The game content pipeline • Purpose: • Game content: – Heterogeneous: • Has different intrinsic characteristics (formats, disk-space impact) • Requires different applied transformations – Hierarchical relationships • Dependencies • Content identity
  • 23. The game content pipeline • Philosophy: – Create a unified content pipeline system and architecture at our studio
  • 24. The game content pipeline • Philosophy: – Create a unified content pipeline system and architecture at our studio – Properties: • Correct • Expressive • Explicit • Homogeneous • Scalable • Debuggable
  • 25. This is a lecture on how we think and reason about achieving these desired properties
  • 26. The agenda • Foundations – Creating the building blocks of a content pipeline
  • 27. The agenda • Foundations • Theory and architecture – Considerations for building a pipeline – High level view of design
  • 28. The agenda • Foundations • Theory and architecture • Lessons and current research – Concrete technology examples – Questions for the future
  • 30. Each piece represents a conceptual building block
  • 31. Data transformation functions • Purpose: – Core operator of a game’s pipeline – Transforms a unit of data from one state to another state
  • 32. Data transformation functions • Purpose: • Example:
  • 33. Data transformation functions • Purpose: • Example: • Invariants: – Serializable – Transferrable – Side-effect free – Discrete – Atomic
  • 34. Data transformation functions • Purpose: • Example: • Invariants: – Serializable • Both inputs and outputs are representable in different contexts – Memory mapped – File system – Networked
  • 35. Data transformation functions • Purpose: • Example: • Invariants: – Serializable – Transferrable • Usable within a game’s pipeline or run-time
  • 36. Data transformation functions • Purpose: • Example: • Invariants: – Serializable – Transferrable – Side-effect free • Safely usable in a concurrent system • Knowable state and always debuggable
  • 37. Data transformation functions • Purpose: • Example: • Invariants: – Serializable – Transferrable – Side-effect free – Discrete • Conceptually simple to reason about • Represents a single unit of transformation
  • 38. Data transformation functions • Purpose: • Example: • Invariants: – Serializable – Transferrable – Side-effect free – Discrete – Atomic • Transformation either did or did not occur, there is no half-way
  • 39. Data transformation functions • Goals: – Develop a library of transformations – Define an appropriate granularity – Fervently respect invariants
  • 40. Data transformation functions • Goals: – Develop a library of transformations • Needed to represent the full scope of game data – Define an appropriate granularity – Fervently respect invariants
  • 41. Data transformation functions • Goals: – Develop a library of transformations – Define an appropriate granularity • Base judgment on the time and resources required for the transform – Fervently respect invariants
  • 42. Data transformation functions • Goals: – Develop a library of transformations – Define an appropriate granularity – Fervently respect invariants • This will allow powerful architectural opportunities later
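The invariants listed above (serializable, transferrable, side-effect free, discrete, atomic) can be sketched as a small Python example. This is not the deck's actual API; the `compress_texture` function, its stubbed "compression", and the payload fields are hypothetical illustrations of what a pure, cacheable transformation unit might look like.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class TransformResult:
    """Output of one discrete transformation: bytes plus a content hash."""
    payload: bytes
    content_hash: str


def compress_texture(source: bytes, quality: int) -> TransformResult:
    """Side-effect free and discrete: a pure function of its inputs, with no
    file-system or global-state access, so it is safe to run concurrently and
    trivially cacheable. (Real compression is stubbed with a byte-skip.)"""
    compressed = source[::2]  # stand-in for an actual texture codec
    digest = hashlib.sha256(compressed).hexdigest()
    return TransformResult(payload=compressed, content_hash=digest)


def serialize_call(source: bytes, quality: int) -> str:
    """Serializable and transferrable: the call itself can be represented in
    another context (file, memory map, network) and replayed elsewhere."""
    return json.dumps({"fn": "compress_texture",
                       "source": source.hex(),
                       "quality": quality})
```

Determinism is the practical payoff: the same inputs always yield the same `TransformResult`, which is what makes the caching and concurrency strategies discussed later safe.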
  • 43. Dependency graph • Purpose: – Track the global state of data and relationships – Serves as an interface to both data and transformations
  • 44. Dependency graph • Purpose: • Data dependency graph • Transformation dependency graph
  • 45. Dependency graph • Purpose: • Data dependency graph – The relationship network of data files – Example: o Skybox o Cubemap o 1D Texture (six) o Cubemap Texture • Transformation dependency graph
  • 46. Dependency graph • Purpose: • Data dependency graph – The relationship network of data files – Example: – Properties: • File identity (as a path or GUID) • Connectivity to and from other files • Reachability to and from other files • Transformation dependency graph
  • 47. Dependency graph • Purpose: • Data dependency graph • Transformation dependency graph – The relationship network of data transformation functions – Example: (diagram)
  • 48. Dependency graph • Purpose: • Data dependency graph • Transformation dependency graph – The relationship network of data transformation functions – Example: – Properties and differences from data dependency graph: • Requires critical path evaluation for a desired result • Requires deciding if and when a transformation occurs • Performance critical
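The transformation dependency graph's core operation, deciding a valid execution order before deciding if and when each transform runs, can be sketched with Python's standard topological sorter. The node names below are hypothetical stand-ins echoing the skybox example, not the deck's actual command set.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical transformation dependency graph: each key lists the
# transformations it depends on (predecessors must run first).
deps = {
    "export_texture": [],
    "compress_texture": ["export_texture"],
    "build_cubemap": ["compress_texture"],
    "package_skybox": ["build_cubemap"],
}


def evaluation_order(graph):
    """Return one valid order in which transformations may execute.
    Nodes with no mutual dependency could be dispatched concurrently;
    critical-path evaluation would restrict this to the subgraph needed
    for the desired result."""
    return list(TopologicalSorter(graph).static_order())
```

At production scale this is where the deck's open problems live: the sort itself is cheap, but building and re-evaluating the graph iteratively over millions of nodes is not.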
  • 49. Storage • Purpose: – Provide permanent or transient contexts for information and data
  • 50. Storage • Purpose: • Considerations: – Often a tradeoff between latency, capacity, and longevity • Example: low latency, in-memory key-value database • Example: high latency, networked file system
  • 51. Storage • Purpose: • Considerations: – Often a tradeoff between latency, capacity, and longevity – Needed for many areas in a pipeline: • Source data • Intermediate data • Inter/Intraprocess communication • Production data
  • 52. Storage • Purpose: • Considerations: – Often a tradeoff between latency, capacity, and longevity – Needed for many areas in a pipeline: – Should never be directly interacted with by transformation functions: • Fundamentally breaks invariants • Should be orchestrated at a higher level – During evaluation of the transformation dependency graph
  • 54. Computational work • Iteration times: – For content authors, improving iteration is purely a metric of time • The amount of computation required is not relevant
  • 55. Computational work • Iteration times: – For content authors, improving iteration is purely a metric of time • The amount of computation required is not relevant – Often, more computation results in more time required
  • 56. Computational work • Iteration times: • Mitigation strategies: – Data model – Concurrency – Avoidance
  • 57. Computational work • Iteration times: • Mitigation strategies: – Data model • Reduction of transformation steps required to see result – Transformations are unnecessary or optional – Direct optimization or removal of transformations – Concurrency – Avoidance
  • 58. Computational work • Iteration times: • Mitigation strategies: – Data model • Reduction of transformation steps required to see result • Ideal to address, the best general optimization is to do less work – Concurrency – Avoidance
  • 59. Computational work • Iteration times: • Mitigation strategies: – Data model – Concurrency • Leveraging the invariants of our data transformation functions • Available contexts: – Threads – Processes – Network – Avoidance
  • 60. Computational work • Iteration times: • Mitigation strategies: – Data model – Concurrency – Avoidance • Hierarchical caching contexts (more detail later): – In-memory – File system – Network
  • 61. Computational work • Iteration times: • Mitigation strategies: – Data model – Concurrency – Avoidance • Hierarchical caching contexts: • Pruning action based on criteria: – Unneeded for desired result – Redundant
  • 62. Conceptual architectures • Live iteration • Asset baking
  • 63. Conceptual architectures • Live iteration – Working directly in the game to see changes – Streaming data to the game through an external tool • Asset baking
  • 64. Conceptual architectures • Live iteration – Working directly in the game to see changes – Streaming data to the game through an external tool – Properties: • Has the benefit of user seeing instant results • Has the problem of data transience – Edit state could become corrupt – Edit state could encounter an irrecoverable error (crashes) • Has the problem of needing to reflect changes into source data – Other assets may have dependence on changes • Asset baking
  • 65. Conceptual architectures • Live iteration • Asset baking – Using an offline toolset to “cook” or “bake” assets for the game
  • 66. Conceptual architectures • Live iteration • Asset baking – Using an offline toolset to “cook” or “bake” assets for the game – Properties: • Has the benefit of permanence • Has the benefit of simple directional flow of changes – Source → intermediate → production • Has the problem, often, of increased computational cost • Has the problem, often, of non-uniformity – Example: Different teams having different file formats – Resulting in a more complex unified pipeline
  • 67. Conceptual architectures • Live iteration • Asset baking • Philosophy: – Live iteration workflows and asset baking can be reconciled – Both concepts are shades of the same architectural principles
  • 68. Conceptual architectures • Live iteration • Asset baking • Philosophy: • Desired pipeline architecture: – Transferrable library of available data transformations – Full data dependency graph of all game content – Computable transformation dependency graphs – Generalized hierarchical caching model
  • 69. Visualization of a content build Asset baking pipeline
  • 70. Visualization of a content build Live iteration pipeline
  • 72. Data transformation functions • Lesson: Define the data formats • Lesson: Discipline in respecting invariants • Lesson: Transferability is crucial
  • 73. Data transformation functions • Lesson: Define the data formats – Developed an in-house object serialization library • Similar to Google Flat Buffers and Protocol Buffers • Lesson: Discipline in respecting invariants • Lesson: Transferability is crucial
  • 74. Data transformation functions • Lesson: Define the data formats – Developed an in-house object serialization library • Similar to Google Flat Buffers and Protocol Buffers • Designed for: – Memory-mappability – Binary format for speed, Json format for debugging – Data layout optimization – Versioning and Identification – Automatic API generation – Custom allocators • Lesson: Discipline in respecting invariants • Lesson: Transferability is crucial
  • 75. Data transformation functions • Lesson: Define the data formats • Lesson: Discipline in respecting invariants – Allows for concurrency guarantees – Allows for data, build, and game state guarantees • Lesson: Transferability is crucial
  • 76. Data transformation functions • Lesson: Define the data formats • Lesson: Discipline in respecting invariants • Lesson: Transferability is crucial – Same library of functions can be used at bake time and run time – Opportunity for the transformation dependency graph to computationally decide where to execute transform: • A thread • A local process • A networked process
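The transferability lesson above, letting the graph decide whether a transform runs on a thread, a local process, or a networked process, can be sketched as a toy dispatcher. The size threshold and the stubbed remote path are purely illustrative assumptions, not the studio's actual policy.

```python
from concurrent.futures import ThreadPoolExecutor


def run_transform(fn, payload: bytes, *, remote_threshold: int = 1 << 20):
    """Pick an execution context from the cost of the job: small payloads run
    on a local thread; large ones would be serialized and shipped to a build
    farm (stubbed here). Side-effect freedom is what makes either choice
    safe without the caller knowing which was taken."""
    if len(payload) >= remote_threshold:
        # A real pipeline would serialize the call (see the transferrable
        # invariant) and dispatch it to a networked worker.
        raise NotImplementedError("networked execution not sketched here")
    with ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(fn, payload).result()
```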
  • 77. Dependency Graphs • Lesson: Most research required here • Lesson: Describing the flow of a network • Lesson: Graph operators
  • 78. Dependency Graphs • Lesson: Most research required here – Open problem: Scaling computation to millions of nodes – Open problem: Cache-coherent spatial data structures and processing – Open problem: Extremely performant transformation dependency graph evaluation and iterative construction • Lesson: Describing the flow of a network • Lesson: Graph operators
  • 79. Dependency Graphs • Lesson: Most research required here • Lesson: Describing the flow of a network – Transformation dependency graph • Required a language (DSL) to describe the transformation network • Transformations known as commands: – Examples of granularity: » compress_texture » create_animation_clip • DSL results in vast amount of automatic code generation • Lesson: Graph operators
  • 80. Dependency Graphs • Lesson: Most research required here • Lesson: Describing the flow of a network • Lesson: Graph operators – Transformation dependency graph traversal: • Two operators could encapsulate our most complex operations • Reductions: – Compute a subgraph under a node as input to a command • Maps: – Direct input node to a command
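The two traversal operators above can be sketched in a few lines. A map feeds a single node's value to a command; a reduction gathers the subgraph under a node and hands the collection to one command. The graph representation (a dict of node values plus a child list) is a hypothetical simplification for illustration.

```python
def map_op(command, node_value):
    """Map: direct a single input node's value to a command."""
    return command(node_value)


def reduce_op(command, values, children, root):
    """Reduction: collect the subgraph under `root` (its value plus all
    transitive children's values) as the input to one command."""
    collected, stack = [], [root]
    while stack:
        node = stack.pop()
        collected.append(values[node])
        stack.extend(children.get(node, []))
    return command(collected)
```

The appeal of stopping at two operators is that every complex build step becomes a composition of them, which keeps the DSL and its generated code small.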
  • 81. Storage • Lesson: File system by itself is very limiting • Lesson: Creating a global asset library
  • 82. Storage • Lesson: File system by itself is very limiting – Butting heads with Windows file system rules is painful: • Path lengths (MAX_PATH, 260 characters) • Path and file name uniqueness – Often results in insane naming conventions for files • Lesson: Creating a global asset library
  • 83. Storage • Lesson: File system by itself is very limiting • Lesson: Creating a global asset library – Many reasons for desirability: • Freedom for custom identification and file versioning • Freedom from file system path and naming conventions • Filenames on physical disk can be mangled, while the asset library can ensure human readability
  • 84. Caching • Lesson: Hierarchical caching is a powerful model – Goal: Never re-compute a function with the same parameters. – Achieved by decorating functions with appropriate caching strategies: • Match a function with the most appropriate caching semantic • Driven by the nature of input and output data • Example: – File-based parameters cached to the file system or network – Normal function parameters cached in memory
  • 85. Caching • Lesson: Hierarchical caching is a powerful model – Any piece of computation can be decorated with caching • Found it to be most beneficial with: – Path and string manipulation – Data transformation functions
  • 86. Caching • Lesson: Hierarchical caching is a powerful model – Any piece of computation can be decorated with caching – Hiding caching structure behind generalized interfaces and decorators allowed for experimentation. • Proved to be invaluable when we switched our file-system based key-value store from SQLite to LMDB
  • 87. Caching • Lesson: Hierarchical caching is a powerful model – Any piece of computation can be decorated with caching – Hiding caching structure behind generalized interfaces and decorators allowed for experimentation. – Our current implementations: • In-memory: achieved with function level memoization • File-system: achieved with LMDB • Networked: achieved with an in-house technology backed by CEPH (a networked file system object store)
  • 88. Caching • Lesson: Hierarchical caching is a powerful model – Example:
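The in-memory tier described above, function-level memoization so a pure function is never recomputed for the same parameters, can be sketched as a decorator. The `normalize_path` function is a hypothetical stand-in for the path/string manipulation the deck says benefits most; the file-system (LMDB) and networked (CEPH-backed) tiers would hide behind the same decorator shape, consulted in order of increasing latency.

```python
import functools


def memoized(fn):
    """In-memory caching tier: cache results by argument tuple so a pure
    function is never recomputed for the same parameters. Swapping the dict
    for an LMDB- or network-backed store changes only this decorator, which
    is what made experimentation behind the interface cheap."""
    cache = {}

    @functools.wraps(fn)
    def wrapper(*args):
        if args not in cache:
            wrapper.misses += 1  # count real computations for visibility
            cache[args] = fn(*args)
        return cache[args]

    wrapper.misses = 0
    return wrapper


@memoized
def normalize_path(path: str) -> str:
    # Hypothetical example of cheap-but-hot string work worth caching.
    return path.replace("\\", "/").lower()
```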
  • 90. Hard Problems • Building the foundation • Dependency graph scalability • Dependency graph decision making
  • 91. Hard Problems • Building the foundation – Most studios, including ours, have a long history of tools development – Difficult to modernize and justify changing monolithic, stable tools – Large upfront cost for changing data formats • Dependency graph scalability • Dependency graph decision making
  • 92. Hard Problems • Building the foundation • Dependency graph scalability • Dependency graph decision making – Performant and meaningful computation against large graphs is difficult but not insurmountable. – Pushing automated decision making to cover: • Caching strategies • Appropriate contexts for data transformations
  • 94. The solution is automation