Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser MMOG


Game Developers Conference China (2012), Programming track.

This talk covers how to use the Stage3D APIs to build a 3D web game engine and discusses several ways to optimize it.

Published in: Technology

  1. Using The New Flash Stage3D Web Technology To Build Your Own Next 3D Browser MMOG
     Daosheng Mu, Lead Programmer
     Eric Chang, CTO
     XPEC Entertainment Inc.
  2. Outline
     • About the Speakers
     • Introduction to the Adobe Flash Stage3D API
     • XPEC Flash 3D Engine
     • Optimization for Flash Programs
     • Future Work
     • Conclusion
     • Q & A
  3. About the Speakers
     • Eric Chang
       – 19 years of game industry experience
       – Cross-platform 3D game engine development
       – PC/Console/Web
  4. About the Speakers
     • Daosheng Mu
       – 4.5 years of cross-platform 3D game engine development experience
       – PC/Console/Web
  5. Why Flash? Native C/C++ vs. Unity vs. Flash
                              Native C/C++   Unity   Flash
     Development difficulty   High           Low     Mid
     Ease of cross-platform   Low            High    High
     Performance              High           Mid     Low
     Market popularity        Low            Mid     High (>95%)
  6. Project C4 Demo Video
  7. Introduction to the Adobe Flash Stage3D API
  8. Stage3D
     • Supports all browsers
  9. Stage3D
     • Stage3D includes GPU-accelerated 3D APIs
       – Z-buffering
       – Stencil/Color buffer
       – Vertex shaders
       – Fragment shaders
       – Cube textures
       – More…
  10. Stage3D
     • Pros:
       – GPU-accelerated API
       – Relies on DirectX, OpenGL, OpenGL ES
       – Programmable pipeline
     • Cons:
       – No alpha test support
       – No high-precision texture format support
  11. Stage3D
     Resource         Number allowed   Total memory
     Vertex buffers   4096             256 MB
     Index buffers    4096             128 MB
     Programs         4096             16 MB
     Textures         4096             128 MB*
     Cube textures    4096             256 MB
     Draw call limit: 32,768
     * 350 MB is the absolute limit for textures; 340 MB is the figure we measured.
  12. AGAL
     • Adobe Graphics Assembly Language
       – No support for ‘if-else’ statements
       – No support for ‘constants’
  13. XPEC Flash 3D Engine
  14. Model Pipeline
     • Action Message Format (AMF):
       – Native ByteArray compression
       – Native object serialization
     [Pipeline diagram; labels as extracted: 3DS Max, Engine Loader, Exporter, Collada, Binary Converter, AMF, AMF, Engine Render]
  15. XPEC Flash 3D Engine
     • Application: update/render on the CPU
     • Command buffer: stores graphics API instructions
     [Diagram; labels as extracted: Application, Command buffer, Driver, GPU, CPU]
  16. XPEC Flash 3D Engine: Application
     • Object3D
       – Material
       – Geometry
     • Update
       – UpdateDeltaTime
       – UpdateTransform
     • Scene management
       – Scene partition
       – Frustum culling
     • Update
       – UpdateHierarchy
     • Draw
       – SetMaterial
       – SetGeometry
     • Stage3D
       – Set Stage3D APIs
  17. Scene Management
     • Goal: minimize draw calls as much as possible
     • Indoor scene
       – BSP tree
     • Outdoor scene
       – Octree/Quad tree
       – Cell
       – Grid
  18. Scene Management: Project C4
     • Grid partition
     • Object3D: (MinX, MaxX), (MinY, MaxY)
     [Diagram: grid on x/y axes with points (0, 0), (2, 2), (4, 4)]
  19. Scene Management: Project C4
     • Frustum: (MinX, MaxX), (MinY, MaxY)
     [Diagram: grid on x/y axes with points (0, 0), (2, 2), (4, 4); frustum extent labeled (1,4),(0,4)]
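Slides 18 and 19 describe a grid partition in which both each Object3D and the frustum are reduced to (MinX, MaxX), (MinY, MaxY) extents. A minimal sketch of that idea follows, written in Python rather than ActionScript for brevity. The `Grid` class, the cell size of 2, and the object names are assumptions for illustration, not the XPEC engine's API; the query extent (1, 4), (0, 4) is one reading of the label on slide 19.

```python
from collections import defaultdict
from math import floor


class Grid:
    """Uniform 2D grid that buckets object ids by the cells their bounds overlap."""

    def __init__(self, cell_size):
        self.cell_size = cell_size        # assumed: square cells of equal size
        self.cells = defaultdict(set)     # (cx, cy) -> set of object ids

    def _cell_range(self, min_x, max_x, min_y, max_y):
        s = self.cell_size
        return (floor(min_x / s), floor(max_x / s),
                floor(min_y / s), floor(max_y / s))

    def insert(self, obj_id, min_x, max_x, min_y, max_y):
        cx0, cx1, cy0, cy1 = self._cell_range(min_x, max_x, min_y, max_y)
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                self.cells[(cx, cy)].add(obj_id)

    def query(self, min_x, max_x, min_y, max_y):
        """Return ids stored in every cell overlapped by the given 2D extent.

        This is a coarse, cell-level test: it yields draw candidates,
        not an exact per-object frustum test.
        """
        cx0, cx1, cy0, cy1 = self._cell_range(min_x, max_x, min_y, max_y)
        visible = set()
        for cx in range(cx0, cx1 + 1):
            for cy in range(cy0, cy1 + 1):
                visible |= self.cells.get((cx, cy), set())
        return visible


grid = Grid(cell_size=2)
grid.insert("tree", 0.5, 1.5, 0.5, 1.5)     # falls in cell (0, 0)
grid.insert("house", 2.5, 3.5, 2.5, 3.5)    # falls in cell (1, 1)
grid.insert("rock", 6.5, 7.0, 6.5, 7.0)     # falls in cell (3, 3)
visible = grid.query(1, 4, 0, 4)            # frustum extent as read from slide 19
```

Only the cells touched by the frustum extent are visited, so the cost of culling scales with the visible area rather than with the total object count.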
  20. XPEC Flash 3D Engine: Command Buffer
     • Initialize
       – createVertex/Index Buffer
       – createTexture
       – createProgram
     • Begin
       – clear
       – setRenderToTexture
     • Draw
       – setVertex/Index Buffer
       – setProgram
       – setProgramConstants
       – setRenderState
       – setTextureAt
       – drawTriangles
     • End
       – present
     • Avoid user/kernel mode transitions
     • Decrease shader patching
       – “Material sorting”
     • Reduce draw calls
       – “Shared buffers”
       – “Dynamic batching”
  21. Material Sorting
     • Opaque/Translucent
  22. Material Sorting
     • State management
     • 1047/2598 draw calls
  23. [Chart: elapsed time (ms), axis 0 to 60, split into “CPU waiting GPU” and “Render loop”, on an NVIDIA 8800GT for 1047 and 2598 draw calls, each with and without material sorting]
  24. [Chart: elapsed time (ms), axis 0 to 100, split into “CPU waiting GPU” and “Render loop”, on an NVIDIA 6600GT for 1047 and 2598 draw calls, each with and without material sorting]
  25. Before sorting (ms) vs. after sorting (ms)
     NVIDIA 8800 GT                 Before sorting (ms)   After sorting (ms)
     1047 draw calls
       Render loop elapsed time     16                    16
       Total elapsed time           41                    40
     2598 draw calls
       Render loop elapsed time     36                    36
       Total elapsed time           50                    50

     NVIDIA 6600 GT                 Before sorting (ms)   After sorting (ms)
     1047 draw calls
       Render loop elapsed time     34                    31
       Total elapsed time           53                    48
     2598 draw calls
       Render loop elapsed time     81                    64
       Total elapsed time           89                    89
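The material sorting slides name two ingredients, an opaque/translucent split and state management, without showing the sort itself. The sketch below is one common way to do it, in Python for illustration. The `DrawCall` fields, the sort keys, and the back-to-front ordering of translucent draws are assumptions, not details taken from the deck.

```python
from dataclasses import dataclass


@dataclass
class DrawCall:
    program: str       # shader program id (hypothetical field)
    texture: str       # bound texture id (hypothetical field)
    translucent: bool
    depth: float       # distance from the camera


def sort_draw_calls(calls):
    """Opaque draws first, grouped by state; translucent draws last, back to front."""
    opaque = [c for c in calls if not c.translucent]
    translucent = [c for c in calls if c.translucent]
    opaque.sort(key=lambda c: (c.program, c.texture))
    translucent.sort(key=lambda c: -c.depth)
    return opaque + translucent


def count_state_changes(calls):
    """Count how often the (program, texture) pair differs from the previous draw."""
    changes = 0
    previous = None
    for c in calls:
        state = (c.program, c.texture)
        if state != previous:
            changes += 1
            previous = state
    return changes


calls = [
    DrawCall("lit", "grass", False, 5.0),
    DrawCall("lit", "stone", False, 3.0),
    DrawCall("lit", "grass", False, 8.0),
    DrawCall("glass", "window", True, 2.0),
    DrawCall("lit", "stone", False, 1.0),
    DrawCall("glass", "window", True, 6.0),
]
sorted_calls = sort_draw_calls(calls)
```

In this toy frame the unsorted order switches state on every draw, while the sorted order switches once per distinct material. The draw call count is unchanged; only the number of state changes drops.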
  26. Shared Buffers
     • Problem:
       – The number of buffers is limited
     Resource         Number allowed   Total memory
     Vertex buffers   4096             256 MB
     Index buffers    4096             128 MB
     Programs         4096             16 MB
  27. Shared Buffers
     [Diagram: three Vertex Buffer / Index Buffer pairs]
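One reading of the shared-buffer idea is that several meshes are packed into a single vertex buffer and a single index buffer, with each mesh drawn from its own index range. The sketch below illustrates that packing in Python. The function name and the `(first_index, num_triangles)` tuples are hypothetical, chosen to mirror the `firstIndex` and `numTriangles` parameters of `drawTriangles`; the 65535-vertex cap is an assumption based on 16-bit indices.

```python
MAX_VERTICES = 65535  # assumption: 16-bit indices cap one shared vertex buffer


def pack_meshes(meshes):
    """Pack (vertices, indices) meshes into one shared vertex and index list.

    Returns the shared lists plus a (first_index, num_triangles) range per mesh.
    """
    shared_vertices, shared_indices, draw_ranges = [], [], []
    for vertices, indices in meshes:
        base_vertex = len(shared_vertices)
        if base_vertex + len(vertices) > MAX_VERTICES:
            raise ValueError("shared vertex buffer is full; start a new one")
        first_index = len(shared_indices)
        shared_vertices.extend(vertices)
        # Rebase indices so they point at this mesh's slice of the shared vertices.
        shared_indices.extend(base_vertex + i for i in indices)
        draw_ranges.append((first_index, len(indices) // 3))
    return shared_vertices, shared_indices, draw_ranges


triangle = ([(0, 0), (1, 0), (0, 1)], [0, 1, 2])
quad = ([(2, 0), (3, 0), (3, 1), (2, 1)], [0, 1, 2, 0, 2, 3])
vertices, indices, draw_ranges = pack_meshes([triangle, quad])
```

Two meshes now consume one buffer of each kind instead of two, which is what keeps a large scene under the 4096-buffer limit.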
  28. Particle System
     • Each particle property is computed on the CPU every frame
       – Alpha, Color, LinearForce, Size, Speed, UV
       – Facing
  29. Particle System
     • Index buffer
       – Indices do not change
     • Vertex buffer
       – Problem:
         • The particle count varies from frame to frame
         • Data is uploaded to the vertex buffer frequently
  30. Particle System
     [Diagram: Static Index Buffer, Dynamic Vertex Buffer, Vertex Data]
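The static index buffer works because every particle is a quad with the same index pattern, so the indices can be generated once for the maximum particle count while only the vertex data is rebuilt each frame. A minimal Python sketch under those assumptions follows; the function names, the two-triangle quad layout, and the `(x, y, size)` particle tuple are illustrative, not the engine's data format.

```python
def build_quad_indices(max_particles):
    """Build the index list once: two triangles per quad, four vertices per quad."""
    indices = []
    for i in range(max_particles):
        base = 4 * i
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return indices


def build_vertex_data(particles):
    """Rebuild corner positions for the live particles only (called every frame)."""
    data = []
    for x, y, size in particles:
        h = size / 2
        data += [(x - h, y - h), (x + h, y - h), (x + h, y + h), (x - h, y + h)]
    return data


MAX_PARTICLES = 3
static_indices = build_quad_indices(MAX_PARTICLES)   # uploaded once
live = [(0.0, 0.0, 2.0), (5.0, 5.0, 1.0)]            # this frame's live particles
vertex_data = build_vertex_data(live)                # uploaded every frame
num_triangles = 2 * len(live)                        # draw only the live quads
```

When the live count drops, the draw simply covers fewer triangles of the same index buffer, so no index data has to be re-uploaded.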
  31. Skinned Model
     • Problem:
       – Few vertex constants are allowed
         • 128 constants per vertex program
       – Global vertex constants
         • Lighting, Fog, Const
  32. Skinned Model
     • 4x3 matrix per bone (3 constants each)
     • Bone count per geometry is limited to 29
       – “Split mesh”
     128 constants / 3 = 42.6666 bones
     3 * 29 bones = 87 constants
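The arithmetic on slide 32 leaves 128 - 87 = 41 constants free, presumably for the global constants listed on slide 31. The deck names “Split mesh” as the remedy when a geometry exceeds 29 bones but does not show how the split is done; the greedy grouping below is one simple possibility, sketched in Python. The function and its input format (one set of bone ids per triangle) are assumptions.

```python
MAX_VERTEX_CONSTANTS = 128   # per vertex program (slide 31)
REGISTERS_PER_BONE = 3       # a 4x3 matrix occupies three 4-component constants
MAX_BONES = 29               # per-geometry limit (slide 32)

bone_constants = MAX_BONES * REGISTERS_PER_BONE              # 87
remaining_constants = MAX_VERTEX_CONSTANTS - bone_constants  # 41


def split_mesh(triangles, max_bones=MAX_BONES):
    """Greedily group triangles so each group references at most max_bones bones.

    triangles: list of sets, each holding the bone ids that influence one triangle.
    Returns a list of (bone_set, triangle_indices) groups.
    """
    groups = []
    for tri_index, bones in enumerate(triangles):
        if len(bones) > max_bones:
            raise ValueError("a single triangle exceeds the bone limit")
        for group_bones, tris in groups:
            if len(group_bones | bones) <= max_bones:
                group_bones |= bones      # in-place update of the group's set
                tris.append(tri_index)
                break
        else:
            groups.append((set(bones), [tri_index]))
    return groups


# Tiny example with a limit of 3 bones so the split is easy to follow.
example = [{0, 1}, {1, 2}, {3, 4}, {2, 3}]
groups = split_mesh(example, max_bones=3)
```

Each resulting group can be drawn as its own geometry with its own bone palette. A greedy pass does not guarantee the minimum number of groups.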
  33. Shadow Map
  34. Shadow Map
     [Flow diagram; labels as extracted: present() End frame, setRenderToBackBuffer() Set shadow map, setRenderToTexture() Clear shadow map, Draw to shadow map, clear() Clear back buffer]
  35. Shadow Map
     • Problem:
       – Texture format: RGBA8
       – Artifacts
         • Aliasing
         • Popping while moving
  36. Shadow Map
     • Size: 1024x1024
     • RGBA8 → R32
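Slide 36 gives only “RGBA8 → R32”. One plausible reading is that a single high-precision depth value is spread across the four 8-bit channels of an RGBA8 texel, since Stage3D offers no high-precision texture format. The Python sketch below shows that packing with integer arithmetic purely to illustrate the precision gain. It is an assumption about the technique, and a real AGAL shader would have to do the equivalent with multiplies and fractional parts rather than bit shifts.

```python
def pack_depth(depth):
    """Encode a depth in [0, 1) as four 8-bit channels (r, g, b, a)."""
    value = min(int(depth * 2**32), 2**32 - 1)
    return ((value >> 24) & 0xFF,
            (value >> 16) & 0xFF,
            (value >> 8) & 0xFF,
            value & 0xFF)


def unpack_depth(rgba):
    """Rebuild the depth from four 8-bit channels."""
    r, g, b, a = rgba
    return ((r << 24) | (g << 16) | (b << 8) | a) / 2**32


def unpack_depth_8bit(rgba):
    """For comparison: the depth recovered from the red channel alone."""
    return rgba[0] / 256


depth = 0.123456789
packed = pack_depth(depth)
error_32bit = abs(unpack_depth(packed) - depth)
error_8bit = abs(unpack_depth_8bit(packed) - depth)
```

With all four channels the round-trip error stays within 2^-32, whereas a single 8-bit channel can be off by up to 1/256, which is large enough to produce visible shadow artifacts.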
  37. Shadow Map
     • Percentage Closer Filtering (PCF) solution:
       – Hard shadow
       – Aliasing
       – Popping while moving
  38. Shadow Map
     • PCF
       pw = 1/mapWidth
       ph = 1/mapHeight
     • Result = 0.5   * texel(  0,   0)
              + 0.125 * texel(-pw, +ph)
              + 0.125 * texel(-pw, -ph)
              + 0.125 * texel(+pw, +ph)
              + 0.125 * texel(+pw, -ph)
     [Diagram: center tap (0, 0) with corner taps (-pw, +ph), (+pw, +ph), (-pw, -ph), (+pw, -ph)]
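The five-tap formula on slide 38 can be checked numerically. In the Python sketch below, `shadow_test` is a hypothetical callable standing in for the slide's `texel(...)`: it is assumed to return 1.0 where the shadow-map comparison says lit and 0.0 where it says shadowed. The hard vertical edge at u = 0.5 is test data only.

```python
def pcf(shadow_test, u, v, map_width, map_height):
    """Five-tap PCF: the center tap weighs 0.5, each diagonal tap weighs 0.125."""
    pw = 1.0 / map_width
    ph = 1.0 / map_height
    return (0.5 * shadow_test(u, v)
            + 0.125 * shadow_test(u - pw, v + ph)
            + 0.125 * shadow_test(u - pw, v - ph)
            + 0.125 * shadow_test(u + pw, v + ph)
            + 0.125 * shadow_test(u + pw, v - ph))


def hard_edge(u, v):
    """Test pattern: lit (1.0) on the right half of the map, shadowed on the left."""
    return 1.0 if u >= 0.5 else 0.0


deep_shadow = pcf(hard_edge, 0.25, 0.5, 1024, 1024)
on_edge = pcf(hard_edge, 0.5, 0.5, 1024, 1024)
fully_lit = pcf(hard_edge, 0.75, 0.5, 1024, 1024)
```

The weights sum to 1, so fully lit and fully shadowed regions are unchanged, and only texels next to a shadow boundary receive intermediate values.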
  39. Shadow Map
     • PCF-based solution:
     [Chart: elapsed time (ms), axis 0 to 100, split into “CPU waiting GPU” and “Render loop”, on NVIDIA 6600GT and 8800GT at 1047 draw calls, each with and without PCF]
  40. Toon Shading
     • Single pass
       – Problem: depends on the number of faces
     • Two passes
       – Scale the vertex position along the vertex normal
       – Does not depend on the number of faces
     [Diagram: v: view vector; N: vertex normal; θ: angle marked in the diagram; if θ > threshold, draw toon color]
  41. Toon Shading
     • Toon pass
       – Enable back face
       – Scale vertex position
       – Draw color
     • General pass
       – Enable front face
       – Draw material
     • Result
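The key step of the two-pass approach is scaling each vertex position along its normal before the back faces are drawn in the outline color. A minimal Python sketch of that step is shown here; the function name and the `outline_width` parameter are assumptions, and in the engine this would run in the vertex program rather than on the CPU.

```python
def extrude_along_normal(position, normal, outline_width):
    """Move a vertex outward along its (normalized) normal by outline_width."""
    length = sum(c * c for c in normal) ** 0.5
    return tuple(p + outline_width * n / length
                 for p, n in zip(position, normal))


# The normal need not be unit length on input; it is normalized first.
outline_vertex = extrude_along_normal((1.0, 0.0, 0.0), (2.0, 0.0, 0.0), 0.1)
```

The slightly enlarged back faces show around the silhouette once the normal front-face pass is drawn on top.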
  42. Alpha Test
     • Problem:
       – Stage3D has no alpha test
       – “kil opcode in AGAL”
         • Performance penalty on mobile devices
  43. Alpha Test
     • Solution: replace alpha test with alpha blend
                            Render loop time (ms)   Total time (ms)
     6600GT alpha test      17~19                   47
     6600GT alpha blend     18~19                   65~67
     8800GT alpha test      0.16                    37
     8800GT alpha blend     0.3                     36
     • 304 draw calls
     • Alpha-test performance is better on desktop
  44. Post Effect
     [Images: Origin, Glow, DOF, Color Filter]
  45. Static Lightmap
     • Pros:
       – Pre-computation
       – Global illumination
     • Cons:
       – More textures
  46. Optimization for Flash Programs
  47. Optimization for Flash Programs
     • Problem:
       – “for each” is slow
         • “Use for-loop to replace it”
       – Memory management
         • “Recycle manager”
         • “Strengthen garbage collection”
  48. Optimization for Flash Programs
     • Solution:
       – Recycle manager
         • Reduces garbage collection load
         • Saves object initialization time
         • public function recycleObject3D( obj:IObject3D ):void
         • public function requestObject3D( classType:int , searchKey:*, renderHandle:int = 0 ):*
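The two signatures on slide 48 suggest an object pool: objects are handed back with `recycleObject3D` and reused by `requestObject3D` instead of being allocated again. The Python sketch below shows the general pattern only. It is deliberately simplified, keys the pool on the Python class rather than on the slide's `classType:int`, `searchKey`, and `renderHandle` parameters, and the `created` counter is added purely so the reuse is observable.

```python
class RecycleManager:
    """Minimal object pool: recycled instances are reused before new ones are made."""

    def __init__(self):
        self._pools = {}     # class -> list of idle instances
        self.created = 0     # number of fresh allocations (for illustration)

    def request(self, class_type):
        pool = self._pools.get(class_type)
        if pool:
            return pool.pop()            # reuse an idle instance
        self.created += 1
        return class_type()              # nothing idle: allocate

    def recycle(self, obj):
        self._pools.setdefault(type(obj), []).append(obj)


class Particle:
    """Placeholder pooled type (hypothetical)."""


manager = RecycleManager()
first = manager.request(Particle)
manager.recycle(first)
second = manager.request(Particle)       # comes from the pool, not a new allocation
```

Because the second request is served from the pool, no new object is created and nothing becomes garbage, which is the effect the slide attributes to the recycle manager.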
  49. Optimization for Flash Programs
     • Solution:
       – Strengthen garbage collection
         • Avoid inner functions
         • Explicitly dereference function pointers
         • Dereference attributes in the object destructor
  50. Optimization for Flash Programs
     • Avoid inner functions
     • Explicitly dereference function pointers
     [Figures: without inner function, with inner function]
  51. Optimization for Flash Programs
     • Experiment: before vs. after
       – Switching among levels
     [Figures: before improvement, after improvement]
  52. Rapid loading
  53. Rapid loading
     • Streaming
       – Data compression
         • PNG: SWF compression, 20%~55%
         • Package: zip compression, 25%~30%
       – Batch loading
         • Split resources into several packages
         • Download only what is really needed
  54. Rapid loading
     At 5 Mb/s            Enter avatar stage   Enter game stage   After loading picture finished
     Elapsed time (sec)   15                   6                  12
     • game code
     • ui
     • game scene
     • scene textures
  55. Future Work
     • Adobe Texture Format (ATF)
       – Support for compressed/mipmapped textures on different GPU chipsets
     • FlasCC
       – C++ → AS3 compilation
     • AS3 Workers
       – Multi-thread support
     • MovieClip
       – Replace with a Stage3D UI framework, e.g. Starling
  56. Conclusion
     • Cross-Device/Cross-OS/Cross-Browser
       – Browser + Cloud Computing
       – Write Once, Run Anywhere
     • Flash vs. HTML5
     • Cross-Compiling Technology Trend
       – C/C++ + Flash/ActionScript
       – C/C++ + HTML5/JavaScript
  57. Acknowledgements
     • XPEC - Project C4 Team
     • XPEC - RDO Team
  58. Q & A
     Ellison_Mu@xpec.com
     Eric_Chang@xpec.com
