COSCUP 2013 - ThorScript: Programming Language for GPU Cloud and Beyond

Published in: Technology

Transcript

  • 1. 小迪克, Founder & CEO Programming Language for GPU Cloud and Beyond
  • 2. We have a dream...
  • 3. There are many world #1s from Taiwan
  • 4. ...but not many in open source software...
  • 5. Why THOR? THOR is designed to remove the barriers to heterogeneous computing. THOR is designed to build a new breed of applications that take advantage of the latest accelerator hardware. THOR aims to be the “Next Big Language” (from Taiwan).
  • 6. “5 TFLOPS and 1TB/s to Global Memory”
  • 7. CUDA, OpenCL, OpenACC, Microsoft C++ AMP, DirectX Compute, RenderScript
  • 8. GAP. Scientist, Hippie. CUDA, OpenCL, OpenACC, Microsoft C++ AMP, DirectX Compute, RenderScript. JavaScript, WebRTC, HTML5/CSS3, NoSQL, CoffeeScript, WebSocket
  • 9. Scientist, Hippie. Common Language Runtime. CUDA, OpenCL, OpenACC, Microsoft C++ AMP, DirectX Compute, RenderScript. JavaScript, WebRTC, HTML5/CSS3, NoSQL, CoffeeScript, WebSocket. EASY, FAST and POWERFUL
  • 10. What is THOR (in programmer’s terminology)? THOR is about Parallelism and Concurrency. THOR is Garbage-Collected. THOR is designed with love for C++. THOR implements the Itanium C++ ABI to link with C++. THOR is based on LLVM, and on NVVM for the NVIDIA target. THOR runs on both CPU and GPU.
  • 11. this guy
  • 12. ECMA-Derived Syntax
        class MyClass {
            public function new() { }
            public function delete() { }
            public function getName() { return "MyClass" }
        }
        @entry function main():int32 {
            print("Hello World")
        }
  • 13. ECMA-Derived Syntax; Strong Static Typing
        class MyClass {
            public function new() { }
            public function delete() { }
            public function getName():String { return "MyClass" }
        }
        @entry function main():int32 {
            print("Hello World")
        }
  • 14. ECMA-Derived Syntax; ...with Type Inference
        class MyClass {
            public function new() { }
            public function delete() { }
            public function getName() { return "MyClass" }
        }
        @entry function main():int32 {
            print("Hello World")
        }
  • 15. Function/Class Template Specialization; Default Template Argument
        function hello<T>(v:T) { print(v); }
        class CheckedMars<T> {
            public function set(idx:int32, v:T):void { ... }
            public function get(idx:int32): T {...}
            private data:Array<T>;
        }
        class CheckedArray<T:int32, Builder = Sum<T> > {
            public function set(idx:int32, v:T):void { ... }
            public function get(idx:int32): T {...}
            public function build():T {
                var builder = new Builder();
                builder.build(data);
            }
            private data:Array<T>;
        }
  • 16. Lambda with Auto Capture; Value-capture Semantics (Objects are always in reference form)
        function dummy() {
            var x:int32;
            var y:int32;
            var adder = new Adder<int32>();
            ...
            var f = lambda() : int32 {
                return adder.add(x, y);
            };
        }
  • 17. Lambda with Auto Capture; Recursive Lambda without Using Fix-point Combinator
        @entry function test1() : int32 {
            // ...
            var fib = lambda(x : int32) : int32 {
                if (x < 2) return x;
                return fib(x-1) + fib(x-2);
            };
            // ...
            return 0;
        }
  • 18. Seamlessly Integrate with Existing C++ Code; Instantiate C++ Template Directly in ThorScript
        C++ Code:
        // adder.h
        template<typename T>
        class Adder {
        public:
            T add(T x, T y) { return x+y; }
        };
        ThorScript Code:
        // adder.t
        @native { include="adder.h" }
        class Adder<T> {
            public function add(x:T, y:T):T;
        }
        @entry function main():int32 {
            var a = new Adder<int32>();
            var result = a.add(123, 456);
        }
  • 19. Data Parallelism
  • 20. Data-Parallelism (CUDA): Complicated
        // kernel for adding two arrays in parallel
        __global__ void add(int* a, int* b, int* c, int count) {
            int index = blockIdx.x * blockDim.x + threadIdx.x;
            if (index < count)  // skip threads past the end of the arrays
                c[index] = a[index] + b[index];
        }
        int main() {
            // prepare array a, b, and c
            cudaMalloc(&a, size*sizeof(int));
            ...
            // launch GPU kernel to add
            add<<<256,size/256>>>(a, b, c, size);
            cudaDeviceSynchronize();
            ...
        }
  • 21. CilkPlus
        int fib(int n) {
            if(n<2) return n;
            int x = cilk_spawn fib(n-2);
            int y = fib(n-1);
            cilk_sync;
            return x+y;
        }
        int main() {
            int n = fib(10);
            std::cout << n;
            return 0;
        }
  • 22. Express parallelism by flow, async, and pipeline. Annotations: “Every statement runs in parallel”; “tasks merge and continue here”
        task fib(n:int32):int32 {
            if(n<2) return n;
            var a, b;
            flow -> {
                a = fib(n-1);
                b = fib(n-2);
            }
            return a+b;
        }
        task main() {
            var n:int32;
            pipeline -> {
                async -> n = fib(10);
                print(n);
            }
        }
  • 23. Express parallelism by flow, async, and pipeline. Annotation: “Create an async task”
        task fib(n:int32):int32 {
            if(n<2) return n;
            var a, b;
            flow -> {
                a = fib(n-1);
                b = fib(n-2);
            }
            return a+b;
        }
        task main() {
            var n:int32;
            pipeline -> {
                async -> n = fib(10);
                print(n);
            }
        }
  • 24. Express parallelism by flow, async, and pipeline. Annotation: “pipeline” converts the block into continuation-passing style (CPS)
        task fib(n:int32):int32 {
            if(n<2) return n;
            var a, b;
            flow -> {
                a = fib(n-1);
                b = fib(n-2);
            }
            return a+b;
        }
        task main() {
            var n:int32;
            pipeline -> {
                async -> n = fib(10);
                print(n);
            }
        }
  • 25. You can still use data-parallel kernel...
        @kernel function add(a:Array<int32>, b:Array<int32>, c:Array<int32>) {
            var idx = getGlobalIndex();
            c[idx] = a[idx] + b[idx];
        }
        task main() {
            var a = [0, 1, 2, 3];
            var b = [0, 1, 2, 3];
            var c = <int32>[4];
            async[a.size()] -> add(a, b, c);
        }
  • 26. Hidden DMA Warp for Memory Operation. Annotation: “the actual copy is done by the hidden DMA warp”
        @kernel function compute(a:Array<int32>, b:Array<int32>, c:Array<int32>) {
            var idx = getGlobalIndex();
            if(idx == 0) {
                c.fill(0);
                a.copyFrom(b);
            }
            ...
        }
        task main() {
            ...
            async[a.size()] -> compute(a, b, c);
        }
  • 27. Transactional Memory Block (STM/HTM). All memory accesses within an atomic block are transactional
        var counter:int32 = 0;
        function update():int32 {
            var n;
            atomic -> {
                if(counter % 2 == 0) counter+=2;
            }
            return n;
        }
        @entry function main() {
            pipeline -> {
                async[1024] -> update();
                print(counter);
            }
        }
  • 28. Transactional Memory Block (STM/HTM). A simple transaction is converted into an atomic add/cmp_exchange instruction
        var counter:int32 = 0;
        function update():int32 {
            var n;
            atomic -> {
                ++counter;
            }
            return n;
        }
        @entry function main() {
            pipeline -> {
                async[1024] -> update();
                print(counter);
            }
        }
  • 29. Remote Procedure Call (RPC) with Automatic Object Replication. Annotations: “Different execution domain” (twice); Invoke through “remote”
        @server function compute(n:Request):int32 { ... }
        @client function run_at_client() {
            ...
            pipeline -> {
                remote[Domain.caller()] -> var n = compute(request);
                print(n);
            }
        }
        @server // run: tsc r --server main
        task main() {
            // prepare the network manager
            // setup the network listener...
            Domain.watch(DomainEvent.Connected, lambda(d:Domain):void {
                remote[d] -> run_at_client();
            });
        }
  • 30. Full DWARF Support, Debug THOR in GDB
  • 31. With GPUDirect & NVM Express: Big Data, Real-time Analytic Application; Web App; GPU-Accelerated Database; Filesystem on GPU (still work-in-progress...)
  • 32. THOR is still evolving and changing every day
  • 33. ...and we’d like to organize a small think tank (fewer than 5~10 people). Let us know your idea, and we will implement it! Please share your thoughts on programming languages and send email to sdk@zillians.com
  • 34. And while we're at it, a little advertisement...
  • 35. SINGULARITY(HACKERSPACE) 奇異點
  • 36. SINGULARITY (HACKERSPACE) 奇異點 # 300 ping (roughly 990 m²) # next to Daan MRT Station
  • 37. Computer Science Education is dead and out-dated
  • 38. What Education Needs is NOT Evolution but Revolution, by Ken Robinson
  • 39. to Aggregate Talents
  • 40. by Creative Workshop
  • 41. c-base hackerspace (Berlin)
  • 42. NoiseBridge (Bay Area)
  • 43. HackerDojo (Bay Area)
  • 44. Artisan’s Asylum (Boston)
  • 45. XinCheJian/新車間 (Shanghai)
  • 46. 3D Printers Laser Cutter Linux Boards Components
  • 47. SW/HW Hackers
  • 48. SW/HW Hackers Designer
  • 49. DREAMER
  • 50. DREAMER DOER/MAKER
  • 51. DOER/MAKER COMMUNITY
  • 52. CHANGE
  • 53. IMPACT
  • 54. “where there’s hardship, there’s opportunity” Q&A
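
For readers without a ThorScript toolchain, the `flow` block from slides 22-24 can be approximated in standard C++. The sketch below is only an analogue, not how THOR compiles `flow`: it uses `std::async` for the fork and `future::get()` for the join. The depth cutoff `kMaxSpawnDepth` is a hypothetical parameter, added here so that only a handful of threads are created.

```cpp
#include <cassert>
#include <cstdint>
#include <future>

// Assumption: spawn real threads only near the top of the recursion,
// so the number of live threads stays small.
constexpr int kMaxSpawnDepth = 3;

// Fork-join Fibonacci: the two recursive calls are independent, so one of
// them may run as a separate task while the caller computes the other.
std::int32_t fib(std::int32_t n, int depth = 0) {
    assert(n >= 0);
    if (n < 2) return n;
    if (depth >= kMaxSpawnDepth) {
        // Deep in the recursion: plain sequential evaluation.
        return fib(n - 1, depth + 1) + fib(n - 2, depth + 1);
    }
    // "Fork": run fib(n-1) as an asynchronous task.
    std::future<std::int32_t> a =
        std::async(std::launch::async, fib, n - 1, depth + 1);
    std::int32_t b = fib(n - 2, depth + 1);
    // "Join": wait for the spawned task, then merge both results.
    return a.get() + b;
}
```

Calling `fib(10)` returns 55, the same value the ThorScript and Cilk Plus versions compute. The explicit future and cutoff are the bookkeeping that `flow -> { ... }` is meant to hide.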
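
Slide 28 states that a simple `atomic` block is lowered to an atomic add or compare-exchange instruction. The following C++ sketch shows what such lowered code could look like for the two `atomic` blocks on slides 27 and 28. It is an illustration built on `std::atomic`, not THOR's actual output, and the function names `add_two_if_even` and `increment` are hypothetical.

```cpp
#include <atomic>
#include <cstdint>

// Slide 27 analogue: add 2 to `counter`, but only while its value is even.
// Returns true if the update was applied, false if the value was odd.
bool add_two_if_even(std::atomic<std::int32_t>& counter) {
    std::int32_t seen = counter.load();
    while (seen % 2 == 0) {
        // Commit only if no other thread changed the value since we read it;
        // on failure, `seen` is refreshed with the current value and we retry.
        if (counter.compare_exchange_weak(seen, seen + 2)) {
            return true;
        }
    }
    return false;
}

// Slide 28 analogue: an unconditional increment needs no retry loop,
// a single atomic add is enough. Returns the new value.
std::int32_t increment(std::atomic<std::int32_t>& counter) {
    return counter.fetch_add(1) + 1;
}
```

The conditional update needs a compare-exchange loop because the read of `counter` and the write must appear as one step. The plain increment maps directly onto one atomic add.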