Threading Successes 02: Supreme Commander
 


  • THQ/Gas Powered Games Supreme Commander and Supreme Commander: Forged Alliance Thread for Performance
  • Supreme Commander runs best on 4 cores - let’s see how!
    • Threading in mid-project can be done!
    • Decoupled threads give great performance
    • Memory management extends the gains
    • Lessons learned
  • Threading was a mid-stream change
    • Code was initially single-threaded
      • Game demanded more performance
      • Changed mid-project (6-12 months into development)
      • Separate render/sim threads to run at different rates
      • Support multiple cores
    • Limited architecture choices due to existing code
    • Using Boost thread library
      • Portable, open-source thread library
  • Render split is essential to speed
    • Lots of “little” threads: sound, loading, etc.
    • Sim thread: All simulation
    • Render thread: Full speed, up to 10 frames per sim tick
    • Sync phase: Once frame is ready to render
      • Sync render and sim
      • Fully queued in and out of sim
      • Fast
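The render/sim split above can be sketched in a few lines. This is a minimal illustration, not the game's code: it uses the C++ standard library (`std::thread`, `std::mutex`) in place of the Boost thread library the slides name, and `Snapshot`, `RunStats`, `run_decoupled`, and `kMaxFramesPerTick` are hypothetical names chosen for the example. The sim thread publishes a snapshot once per tick under a short lock (standing in for the sync phase), and the render thread redraws the latest snapshot as often as it can, capped at 10 frames per tick.

```cpp
#include <algorithm>
#include <atomic>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the state the sim hands to the renderer.
struct Snapshot { int tick = 0; };

struct RunStats {
    int sim_ticks = 0;            // written only by the sim thread
    int frames = 0;               // written only by the render thread
    int max_frames_per_tick = 0;  // written only by the render thread
    int last_rendered_tick = 0;   // written only by the render thread
};

constexpr int kMaxFramesPerTick = 10;  // "<=10x per sim tick" from the slide

RunStats run_decoupled(int total_ticks) {
    std::mutex sync_mutex;  // guards only the brief snapshot hand-off
    Snapshot published;
    std::atomic<bool> sim_done{false};
    RunStats stats;

    std::thread sim([&] {
        Snapshot working;
        for (int t = 1; t <= total_ticks; ++t) {
            working.tick = t;  // placeholder for one tick of simulation work
            {
                std::lock_guard<std::mutex> lock(sync_mutex);
                published = working;  // sync phase: publish the finished tick
            }
            ++stats.sim_ticks;
        }
        sim_done.store(true);
    });

    std::thread render([&] {
        int current_tick = -1;
        int frames_this_tick = 0;
        while (!sim_done.load()) {
            Snapshot local;
            {
                std::lock_guard<std::mutex> lock(sync_mutex);
                local = published;  // copy out, then render without the lock
            }
            if (local.tick != current_tick) {
                current_tick = local.tick;
                frames_this_tick = 0;
            }
            if (frames_this_tick >= kMaxFramesPerTick) {
                std::this_thread::yield();  // cap reached, wait for next tick
                continue;
            }
            ++frames_this_tick;
            ++stats.frames;
            stats.max_frames_per_tick =
                std::max(stats.max_frames_per_tick, frames_this_tick);
            stats.last_rendered_tick = local.tick;
        }
    });

    sim.join();
    render.join();
    return stats;
}
```

Calling `run_decoupled(50)` runs 50 sim ticks while the render loop draws whatever it can in between. The only shared state is the snapshot copy, which is why the lock stays short.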
  • Decoupled architecture is built for speed
    • Ready to start a frame and a simulation tick
    [Diagram label: Issue]
  • Decoupled architecture is built for speed
    • Run decoupled sim and render
    • Fully buffered input to sim, call via Sim Thread Interface
    [Diagram labels: Issue; Sim Thread Interface; Simulation; Render]
  • Decoupled architecture is built for speed
    • Render can run repeatedly
    • Depends on sim duration
    [Diagram labels: Issue; Simulation; Render (up to 10x per sim tick)]
  • Decoupled architecture is built for speed
    • Fully decoupled? No.
    • A few low level systems have locks.
    • No major performance impact!
    [Diagram labels: Issue; Simulation; Render (up to 10x per sim tick); Locks]
  • Decoupled architecture is built for speed
    • Sync sim thread out to render thread, via the Sim Thread Interface (STI) again
    [Diagram labels: Issue; Simulation; Render (up to 10x per sim tick); Sim Thread Interface; Issue]
  • Decoupled architecture is built for speed
    • Multiplayer: Record everything going through STI
    • Send over network
    [Diagram labels: Issue; Simulation; Render (up to 10x per sim tick); Sim Thread Interface; Issue; Sim Thread Interface]
  • Decoupled architecture is built for speed
    • And so on…
    [Diagram labels: the Issue, Simulation, Render (up to 10x per sim tick) cycle repeats]
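The slides describe the Sim Thread Interface only at a high level: input is fully queued into the sim, and in multiplayer everything that passes through is recorded and sent over the network. A rough sketch of that idea follows. The class shape, the `Command` struct, and the method names `issue`, `drain`, and `recorded` are assumptions made for illustration, not the engine's actual API, and the network send is left out.

```cpp
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical command type; the real engine's command format is not shown
// in the slides.
struct Command {
    int tick;
    std::string order;
};

// Assumed shape of a queue-based interface between the issuing side and the
// sim thread. Every command that passes through is also kept in a log, which
// is what a multiplayer layer could forward to other machines.
class SimThreadInterface {
public:
    // Called from the issuing (non-sim) side: buffer the command.
    void issue(Command cmd) {
        std::lock_guard<std::mutex> lock(mutex_);
        pending_.push_back(std::move(cmd));
    }

    // Called by the sim thread at the start of a tick: take the whole batch
    // in one short critical section and record it.
    std::vector<Command> drain() {
        std::lock_guard<std::mutex> lock(mutex_);
        recorded_.insert(recorded_.end(), pending_.begin(), pending_.end());
        std::vector<Command> batch;
        batch.swap(pending_);
        return batch;
    }

    // Copy of everything that has gone through the interface so far.
    std::vector<Command> recorded() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return recorded_;
    }

private:
    mutable std::mutex mutex_;
    std::vector<Command> pending_;
    std::vector<Command> recorded_;
};
```

Because the sim only ever sees commands that came through `drain`, replaying the recorded log in the same order would feed another sim the same input, which is the property the multiplayer bullet relies on.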
  • Thread model adapts to varying loads
    • Architecture scales well with loads
      • Render load will often dominate
      • Re-render to keep frame rates up
      • Sim-heavy maps tend to be sim-dominated
  • Displaying frame times – cool!
    • Thread stats in real time
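One simple way to get per-thread frame times for an on-screen display is a rolling average over the most recent samples. The sketch below is an assumption about how such a counter could work, not a description of the game's overlay; `FrameTimeStats` and its methods are hypothetical names.

```cpp
#include <cstddef>
#include <deque>
#include <numeric>

// Rolling average of the last `window` frame (or tick) times, in
// milliseconds. One instance per thread avoids any need for locking.
class FrameTimeStats {
public:
    explicit FrameTimeStats(std::size_t window) : window_(window) {}

    void add_sample(double ms) {
        samples_.push_back(ms);
        if (samples_.size() > window_) {
            samples_.pop_front();  // drop the oldest sample
        }
    }

    double average_ms() const {
        if (samples_.empty()) return 0.0;
        double total = std::accumulate(samples_.begin(), samples_.end(), 0.0);
        return total / static_cast<double>(samples_.size());
    }

private:
    std::size_t window_;
    std::deque<double> samples_;
};
```

Each thread would time its own loop body (for example with `std::chrono::steady_clock`) and call `add_sample` once per iteration. The display code then reads `average_ms()` for the render and sim threads separately.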
  • Sometimes, there’s more to render
    • Render
    • Runs as fast as possible
    • Simulation
    • Sim/render sync
    • Both threads synced, fully queued in and out of sim
  • Other times, there’s more to simulate
    • Sim runs across many rendered frames
  • A little sync doesn’t slow this code down
    • Threads are busy most of the time!
    [Diagram labels: Frame n; Frame n+1; Sync; Busy; Waiting; Mostly waiting]
  • Memory manager gives an additional boost
    • Memory: If you’re not careful in a threaded game…
      • Memory use can thrash cache – but not a problem here!
      • Memory alloc/free can be slow
    • Suspected memory management was a problem
      • Doing lots of small allocations
      • Built code to make it easy to switch mem managers
    • Custom mem manager outperforms default malloc/free
      • Can cause some debugging questions
      • Purchased commercial one for Supreme Commander
      • Wrote new one for Forged Alliance
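The slides mention code that made it easy to switch memory managers, without showing it. One common way to get that flexibility is a small allocator interface with interchangeable implementations. The sketch below assumes that approach; `Allocator`, `MallocAllocator`, and `FixedPoolAllocator` are illustrative names, and the pool is a deliberately simple fixed-size free list, not the commercial or custom managers the team used.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Assumed common interface so call sites do not care which manager is active.
class Allocator {
public:
    virtual ~Allocator() = default;
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* p) = 0;
};

// Default path: forward to malloc/free.
class MallocAllocator : public Allocator {
public:
    void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
    void deallocate(void* p) override { std::free(p); }
};

// Fixed-size block pool aimed at lots of small allocations.
// Not thread-safe: use one pool per thread or add a lock around it.
class FixedPoolAllocator : public Allocator {
public:
    FixedPoolAllocator(std::size_t block_size, std::size_t block_count)
        : units_per_block_((block_size + sizeof(std::max_align_t) - 1) /
                           sizeof(std::max_align_t)),
          storage_(units_per_block_ * block_count) {
        for (std::size_t i = 0; i < block_count; ++i) {
            free_list_.push_back(&storage_[i * units_per_block_]);
        }
    }

    void* allocate(std::size_t bytes) override {
        // Requests that do not fit, or an exhausted pool, return nullptr.
        if (bytes > units_per_block_ * sizeof(std::max_align_t) ||
            free_list_.empty()) {
            return nullptr;
        }
        void* p = free_list_.back();
        free_list_.pop_back();
        return p;
    }

    void deallocate(void* p) override {
        if (p != nullptr) {
            free_list_.push_back(static_cast<std::max_align_t*>(p));
        }
    }

private:
    std::size_t units_per_block_;
    std::vector<std::max_align_t> storage_;  // keeps every block aligned
    std::vector<std::max_align_t*> free_list_;
};
```

Code that allocates through an `Allocator*` can be pointed at either implementation, which makes it cheap to measure whether a custom manager actually beats the default on a given workload.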
  • What are some current bottlenecks?
    • Multiplayer: all sims run concurrently
      • Limited by least-common-denominator machine
      • That’s the RTS way
    • Monolithic render thread
      • Multiple monitors, typically different views
      • Possibly split off top part of render for second monitor?
      • Too expensive/complex for niche feature
  • This was a great learning experience!
    • Good intermediate step
      • Especially for threading mid-project
    • Would do it differently if doing it from scratch
      • Target more processor cores
      • General worker threads w/dispatch system
      • Templates to define an interface to common semantics
      • Directed work graph/node graph (hard to express)
      • Or …?
    • The engine is so good, it’ll be back in Demigod!
      • Demigod team using modified Supreme Commander engine
  • We learned some DOs and DON’Ts
    • Do:
      • Architect for threading from the start, if you can
      • Thread single-threaded code, if you must
      • Decouple threads where possible
    • Don’t:
      • Be afraid to thread single-threaded code
  • Supreme Commander runs best on 4 cores – that’s how!
    • Threading in mid-project can be done!
    • Decoupled threads give great performance
    • Memory management extends the gains
    • Lessons learned
  • So, what do you think?
    • Have you tried something like this?
      • Successes?
      • Failures?
    • Have you rejected trying something like this?
      • Why?