If you don’t get this ref...shame on you
Jarred Nicholls
  @jarrednicholls
jarred@webkit.org
Work @ Sencha
Web Platform Team

   Doing webkitty things...
WebKit Committer
Co-Author
W3C Web Cryptography API
JavaScript on the GPU
What I’ll blabber about today
Why JavaScript on the GPU
Running JavaScript on the GPU

What’s to come...
Why JavaScript on the GPU?

        Better question:
         Why a GPU?

        A: They’re fast!
         (well, at certain things...)
GPUs are fast b/c...
Totally different paradigm from CPUs
Data parallelism vs. Task parallelism
Stream processing vs. Sequential processing
    GPUs can divide-and-conquer
Hardware capable of a large number of “threads”
    e.g. ATI Radeon HD 6770m:
    480 stream processing units == 480 cores
Typically very high memory bandwidth
Many, many GigaFLOPs
GPUs don’t solve all problems
Not all tasks can be accelerated by GPUs
Tasks must be parallelizable, i.e.:
    Side effect free
    Homogeneous and/or streamable
Overall tasks will become limited by Amdahl’s Law
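A rough sketch of what Amdahl's Law means here: if a fraction p of the work can be spread over N workers, the best-case speedup is 1 / ((1 - p) + p / N). The function name and parameters below are made up for illustration.

```c
#include <assert.h>

/* Best-case speedup predicted by Amdahl's Law.
 * parallel_fraction: share of the work that can run in parallel (0..1)
 * workers:           number of parallel workers, e.g. GPU cores */
double amdahl_speedup(double parallel_fraction, double workers) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(workers >= 1.0);
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers);
}
```

Even with 480 workers, a task that is 90% parallel stays below a 10x speedup, because the serial 10% dominates.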
Let’s find out...
Experiment
Code Name “LateralJS”
LateralJS

Our Mission
To make JavaScript a first-class citizen on all GPUs
and take advantage of hardware accelerated
operations & data parallelization.
Our Options
OpenCL                     Nvidia CUDA
AMD, Nvidia, Intel, etc.   Nvidia only
A shitty version of C99    C++ (C for CUDA)
No dynamic memory          Dynamic memory
No recursion               Recursion
No function pointers       Function pointers
Terrible tooling           Great dev. tooling
Immature (arguably)        More mature (arguably)
Why not a Static Compiler?
We want full JavaScript support
    Object / prototype
    Closures
    Recursion
    Functions as objects
    Variable typing
Type Inference limitations
Reasonably limited to size and complexity of “kernel-esque” functions
Not nearly insane enough
Why an Interpreter?
We want it all baby - full JavaScript support!
Most insane approach
Challenging to make it good, but holds a lot of promise
OpenCL Headaches
Oh the agony...
Multiple memory spaces - pointer hell
No recursion - all inlined functions
No standard libc libraries
No dynamic memory
No standard data structures - apart from vector ops
Buggy ass AMD/Nvidia compilers
Multiple Memory Spaces
In the order of fastest to slowest:
space              description
private            very fast
                   stream processor cache (~64KB)
                   scoped to a single work item
local              fast
                   ~= L1 cache on CPUs (~64KB)
                   scoped to a single work group
global, constant   slow, by orders of magnitude
                   ~= system memory over slow bus
                   available to all work groups/items
                   all the VRAM on the card (MBs)
Memory Space Pointer Hell
global uchar* gptr = (global uchar*) 0x1000;
local uchar* lptr = (local uchar*) gptr; // FAIL! cannot cast a global pointer to local
uchar* pptr = (uchar*) gptr; // FAIL! private is implicit, so this is a global-to-private cast


0x1000 points to something different depending on the address space (global, local, or private)!
Memory Space Pointer Hell
           Pointers must always be fully qualified
                Macros to help ease the pain

#define   GPTR(TYPE)   global TYPE*
#define   CPTR(TYPE)   constant TYPE*
#define   LPTR(TYPE)   local TYPE*
#define   PPTR(TYPE)   private TYPE*
No Recursion!?!?!?
  No call stack
  All functions are inlined to the kernel function


uint factorial(uint n) {
    if (n <= 1)
         return 1;
    else
         return n * factorial(n - 1); // compile-time error
}
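One way around the missing call stack, sketched in plain C: rewrite the recursion as a loop. This is only the simplest case; deeper recursion needs an explicit stack kept in memory.

```c
#include <assert.h>

/* Iterative factorial: no self-call, so it can be fully inlined into a kernel. */
unsigned int factorial(unsigned int n) {
    assert(n <= 12);  /* 13! no longer fits in 32 bits */
    unsigned int result = 1;
    for (unsigned int i = 2; i <= n; i++)
        result *= i;
    return result;
}
```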
No standard libc libraries
memcpy?
strcpy?
strcmp?
etc...
No standard libc libraries
                              Implement our own
#define MEMCPY(NAME, DEST_AS, SRC_AS) \
    DEST_AS void* NAME(DEST_AS void*, SRC_AS const void*, uint); \
    DEST_AS void* NAME(DEST_AS void* dest, SRC_AS const void* src, uint size) { \
        DEST_AS uchar* cDest = (DEST_AS uchar*)dest; \
        SRC_AS const uchar* cSrc = (SRC_AS const uchar*)src; \
        for (uint i = 0; i < size; i++) \
            cDest[i] = cSrc[i]; \
        return (DEST_AS void*)cDest; \
    }
PTR_MACRO_DEST_SRC(MEMCPY, memcpy)


                                        Produces
             memcpy_g            memcpy_gc           memcpy_lc           memcpy_pc
             memcpy_l            memcpy_gl           memcpy_lg           memcpy_pg
             memcpy_p            memcpy_gp           memcpy_lp           memcpy_pl
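A host-side sketch of how one macro can stamp out a family of copy functions. Plain C has no address-space qualifiers, so they are defined away here. The slides do not show PTR_MACRO_DEST_SRC itself, so the two expansions are written out by hand, and reading the suffix as destination then source (memcpy_gl copies local to global) is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: on the host the OpenCL qualifiers are empty, so the same
 * macro body compiles as ordinary C. */
#define global
#define local
#define private

typedef unsigned char uchar;

/* One definition per (destination, source) address-space pair. */
#define MEMCPY(NAME, DEST_AS, SRC_AS) \
    DEST_AS void* NAME(DEST_AS void* dest, SRC_AS const void* src, unsigned int size) { \
        DEST_AS uchar* cDest = (DEST_AS uchar*)dest; \
        SRC_AS const uchar* cSrc = (SRC_AS const uchar*)src; \
        for (unsigned int i = 0; i < size; i++) \
            cDest[i] = cSrc[i]; \
        return dest; \
    }

/* Hand-written expansions; the real list is presumably generated. */
MEMCPY(memcpy_gl, global, local)
MEMCPY(memcpy_pg, private, global)
```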
No dynamic memory
No malloc()
No free()
What to do...
Yes! dynamic memory
  Create a large buffer of global memory - our “heap”
  Implement our own malloc() and free()
  Create a handle structure - “virtual memory”
  P(T, hnd) macro to get the current pointer address

GPTR(handle) hnd = malloc(sizeof(uint));
GPTR(uint) ptr = P(uint, hnd);
*ptr = 0xdeadbeef;
free(hnd);
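A minimal host-side sketch of the handle idea, assuming a bump allocator over a fixed buffer. The names heap_alloc and heap_free are hypothetical stand-ins for the talk's malloc() and free(), and this version never reclaims space. Because callers only hold handles, a collector could move a block and update its offset, which is presumably why P() returns the "current" address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE   1024u
#define MAX_HANDLES 16u

/* A handle records where its block currently lives inside the heap. */
typedef struct {
    uint32_t offset;
    uint32_t size;
    int in_use;
} handle;

static uint32_t heap_words[HEAP_SIZE / 4];   /* the "heap": one big buffer */
static handle handles[MAX_HANDLES];
static uint32_t heap_top = 0;                /* next free byte */

/* Resolve a handle to a typed pointer at its current location. */
#define P(TYPE, HND) ((TYPE *)((unsigned char *)heap_words + (HND)->offset))

/* Allocate `size` bytes (rounded up to 4); returns NULL when out of space. */
handle *heap_alloc(uint32_t size) {
    uint32_t rounded = (size + 3u) & ~3u;
    if (rounded > HEAP_SIZE - heap_top)
        return NULL;
    for (uint32_t i = 0; i < MAX_HANDLES; i++) {
        if (!handles[i].in_use) {
            handles[i].offset = heap_top;
            handles[i].size = rounded;
            handles[i].in_use = 1;
            heap_top += rounded;
            return &handles[i];
        }
    }
    return NULL;  /* no free handle slot */
}

/* Release the handle slot; the bytes themselves are not reused in this sketch. */
void heap_free(handle *h) {
    if (h != NULL)
        h->in_use = 0;
}
```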
Ok, we get the point...
        FYL!
High-level Architecture
Components:
    V8
    Esprima Parser
    Host Data Serializer & Marshaller
    Device Mgr
    Data Heap
    Stack-based Interpreter
    Garbage Collector
    Host / GPUs

Flow:
    eval(code);
    Build JSON AST
    Serialize AST (JSON => C Structs)
    Ship to GPU to Interpret
    Fetch Result
AST Generation
JavaScript Source => Esprima in V8 => JSON AST (v8::Object) => Lateral AST (C structs)
Embed esprima.js

              Resource Generator

$ resgen esprima.js resgen_esprima_js.c
Embed esprima.js

                       resgen_esprima_js.c
const unsigned char resgen_esprima_js[] = {
    0x2f, 0x2a, 0x0a, 0x20, 0x20, 0x43, 0x6f, 0x70, 0x79, 0x72,
    0x69, 0x67, 0x68, 0x74, 0x20, 0x28, 0x43, 0x29, 0x20, 0x32,
    ...
    0x20, 0x3a, 0x20, 0x2a, 0x2f, 0x0a, 0x0a, 0
};
Embed esprima.js
                          ASTGenerator.cpp
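// Matches the definition in resgen_esprima_js.c: a NUL-terminated byte array.
extern "C" const unsigned char resgen_esprima_js[];

void ASTGenerator::init()
{
    HandleScope scope;
    s_context = Context::New();
    s_context->Enter();
    // Compile and run the embedded esprima.js so `esprima` exists on the global object.
    Handle<Script> script = Script::Compile(
        String::New(reinterpret_cast<const char*>(resgen_esprima_js)));
    script->Run();
    s_context->Exit();
    s_initialized = true;
}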
Build JSON AST

                    e.g.
ASTGenerator::esprimaParse(
    "var xyz = new Array(10);"
);
Build JSON AST
Handle<Object> ASTGenerator::esprimaParse(const char* javascript)
{
    if (!s_initialized)
        init();

    HandleScope scope;
    s_context->Enter();
    Handle<Object> global = s_context->Global();
    Handle<Object> esprima = Handle<Object>::Cast(global->Get(String::New("esprima")));
    Handle<Function> esprimaParse = Handle<Function>::Cast(esprima->Get(String::New("parse")));
    Handle<String> code = String::New(javascript);
    Handle<Object> ast = Handle<Object>::Cast(esprimaParse->Call(esprima, 1, (Handle<Value>*)&code));

    s_context->Exit();
    return scope.Close(ast);
}
Build JSON AST
{
    "type": "VariableDeclaration",
    "declarations": [
        {
            "type": "VariableDeclarator",
            "id": {
                "type": "Identifier",
                "name": "xyz"
            },
            "init": {
                "type": "NewExpression",
                "callee": {
                    "type": "Identifier",
                    "name": "Array"
                },
                "arguments": [
                    {
                        "type": "Literal",
                        "value": 10
                    }
                ]
            }
        }
    ],
    "kind": "var"
}
Lateral AST structs
#ifdef __OPENCL_VERSION__
#define CL(TYPE) TYPE
#else
#define CL(TYPE) cl_##TYPE
#endif

// Structs shared between Host and OpenCL
typedef struct ast_type_st {
    CL(uint) id;
    CL(uint) size;
} ast_type;

typedef struct ast_program_st {
    ast_type type;
    CL(uint) body;
    CL(uint) numBody;
} ast_program;

typedef struct ast_identifier_st {
    ast_type type;
    CL(uint) name;
} ast_identifier;
Lateral AST structs

                            v8::Object => ast_type
                                  expanded
ast_type* vd1_1_init_id = (ast_type*)astCreateIdentifier("Array");
ast_type* vd1_1_init_args[1];
vd1_1_init_args[0] = (ast_type*)astCreateNumberLiteral(10);
ast_type* vd1_1_init = (ast_type*)astCreateNewExpression(vd1_1_init_id, vd1_1_init_args, 1);
free(vd1_1_init_id);
for (int i = 0; i < 1; i++)
    free(vd1_1_init_args[i]);
ast_type* vd1_1_id = (ast_type*)astCreateIdentifier("xyz");
ast_type* vd1_decls[1];
vd1_decls[0] = (ast_type*)astCreateVariableDeclarator(vd1_1_id, vd1_1_init);
free(vd1_1_id);
free(vd1_1_init);
ast_type* vd1 = (ast_type*)astCreateVariableDeclaration(vd1_decls, 1, "var");
for (int i = 0; i < 1; i++)
    free(vd1_decls[i]);
Lateral AST structs
                          astCreateIdentifier
ast_identifier* astCreateIdentifier(const char* str) {
    CL(uint) size = sizeof(ast_identifier) + rnd(strlen(str) + 1, 4);
    ast_identifier* ast_id = (ast_identifier*)malloc(size);

    // copy the string
    strcpy((char*)(ast_id + 1), str);

    // fill the struct
    ast_id->type.id = AST_IDENTIFIER;
    ast_id->type.size = size;
    ast_id->name = sizeof(ast_identifier); // offset

    return ast_id;
}
Lateral AST structs
         astCreateIdentifier(“xyz”)
offset      field              value
  0        type.id    AST_IDENTIFIER (0x01)
  4       type.size             16
  8        name             12 (offset)
 12        str[0]               ‘x’
 13        str[1]               ‘y’
 14        str[2]               ‘z’
 15        str[3]              ‘\0’
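The layout above can be checked with a small host-only sketch. It uses uint32_t in place of cl_uint and assumes rnd(n, 4) rounds n up to the next multiple of 4; the helper names round_up, ast_create_identifier and ast_identifier_name are illustrative, not the talk's.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct { uint32_t id; uint32_t size; } ast_type;
typedef struct { ast_type type; uint32_t name; } ast_identifier;

enum { AST_IDENTIFIER = 0x01 };

/* Assumed behaviour of rnd(): round n up to a multiple of `multiple`. */
static uint32_t round_up(uint32_t n, uint32_t multiple) {
    return (n + multiple - 1) / multiple * multiple;
}

/* Build the struct and its string as one contiguous, 4-byte-padded blob. */
ast_identifier *ast_create_identifier(const char *str) {
    assert(str != NULL);
    uint32_t text = round_up((uint32_t)strlen(str) + 1, 4);
    uint32_t size = (uint32_t)sizeof(ast_identifier) + text;
    ast_identifier *id = (ast_identifier *)calloc(1, size); /* zeroes padding */
    if (id == NULL)
        return NULL;
    memcpy((char *)(id + 1), str, strlen(str) + 1);  /* string follows struct */
    id->type.id = AST_IDENTIFIER;
    id->type.size = size;
    id->name = (uint32_t)sizeof(ast_identifier);     /* offset, not a pointer */
    return id;
}

/* Turn the stored offset back into a C string. */
const char *ast_identifier_name(const ast_identifier *id) {
    return (const char *)id + id->name;
}
```

Storing an offset instead of a pointer is what keeps the blob valid after it is copied into a different address space.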
Lateral AST structs
                                  astCreateNewExpression
ast_expression_new* astCreateNewExpression(ast_type* callee, ast_type** arguments, int numArgs) {
    CL(uint) size = sizeof(ast_expression_new) + callee->size;
    for (int i = 0; i < numArgs; i++)
        size += arguments[i]->size;

    ast_expression_new* ast_new = (ast_expression_new*)malloc(size);
    ast_new->type.id = AST_NEW_EXPR;
    ast_new->type.size = size;

    CL(uint) offset = sizeof(ast_expression_new);
    char* dest = (char*)ast_new;

    // copy callee
    memcpy(dest + offset, callee, callee->size);
    ast_new->callee = offset;
    offset += callee->size;

    // copy arguments
    if (numArgs) {
        ast_new->arguments = offset;
        for (int i = 0; i < numArgs; i++) {
            ast_type* arg = arguments[i];
            memcpy(dest + offset, arg, arg->size);
            offset += arg->size;
        }
    } else
        ast_new->arguments = 0;
    ast_new->numArguments = numArgs;

    return ast_new;
}
Lateral AST structs
                 new Array(10)
offset       field                 value
  0         type.id     AST_NEW_EXPR (0x308)
  4        type.size               52
  8         callee             20 (offset)
 12       arguments            40 (offset)
 16      numArguments              1
 20       callee node    ast_identifier (“Array”)
 40     arguments node   ast_literal_number (10)
Lateral AST structs
Shared across the Host and the OpenCL runtime
    Host writes, Lateral reads
Constructed on Host as contiguous blobs
    Easy to send to GPU: memcpy(gpu, ast, ast->size);
    Fast to send to GPU, single buffer write
    Simple to traverse w/ pointer arithmetic
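A sketch of what "traverse w/ pointer arithmetic" can look like on such a blob. The field order of ast_expression_new follows the offset table for new Array(10); the helper names are hypothetical, and uint32_t stands in for cl_uint.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { uint32_t id; uint32_t size; } ast_type;

typedef struct {
    ast_type type;
    uint32_t callee;        /* byte offset of the callee node */
    uint32_t arguments;     /* byte offset of the first argument, 0 if none */
    uint32_t numArguments;
} ast_expression_new;

/* Resolve a byte offset relative to `base` to the node stored there. */
static const ast_type *ast_at(const void *base, uint32_t offset) {
    return (const ast_type *)((const char *)base + offset);
}

/* The callee sits at a fixed offset recorded in the parent. */
const ast_type *ast_new_callee(const ast_expression_new *expr) {
    return ast_at(expr, expr->callee);
}

/* Arguments are packed back to back, so argument i is found by
 * skipping over the sizes of the arguments before it. */
const ast_type *ast_new_argument(const ast_expression_new *expr, uint32_t index) {
    assert(index < expr->numArguments);
    const ast_type *arg = ast_at(expr, expr->arguments);
    for (uint32_t i = 0; i < index; i++)
        arg = ast_at(arg, arg->size);
    return arg;
}
```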
Stack-based Interpreter
Building Blocks
JS Type Structs
Lateral State
Heap
Symbol/Ref Table
Scope Stack
AST Traverse Stack
Call/Exec Stack
Return Stack
AST Traverse Loop
Interpret Loop
Kernels
#include "state.h"
#include "jsvm/asttraverse.h"
#include "jsvm/interpreter.h"

// Setup VM structures
kernel void lateral_init(GPTR(uchar) lateral_heap) {
    LATERAL_STATE_INIT
}

// Interpret the AST
kernel void lateral(GPTR(uchar) lateral_heap, GPTR(ast_type) lateral_ast) {
    LATERAL_STATE

    ast_push(lateral_ast);
    while (!Q_EMPTY(lateral_state->ast_stack, ast_q) || !Q_EMPTY(lateral_state->call_stack, call_q)) {
        while (!Q_EMPTY(lateral_state->ast_stack, ast_q))
            traverse();
        if (!Q_EMPTY(lateral_state->call_stack, call_q))
            interpret();
    }
}
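The traverse/interpret loop above, shrunk to a plain C sketch that only knows number literals and "+". The node layout, stack types and function names are all made up for illustration; the real interpreter works on the serialized Lateral AST, not on pointers.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { NODE_LITERAL, NODE_BINARY_ADD } node_kind;

typedef struct node {
    node_kind kind;
    double value;               /* NODE_LITERAL only */
    const struct node *left;    /* NODE_BINARY_ADD only */
    const struct node *right;
} node;

#define STACK_MAX 64
typedef struct { const node *items[STACK_MAX]; size_t count; } node_stack;
typedef struct { double items[STACK_MAX]; size_t count; } value_stack;

static void push_node(node_stack *s, const node *n) {
    assert(s->count < STACK_MAX);
    s->items[s->count++] = n;
}
static const node *pop_node(node_stack *s) {
    assert(s->count > 0);
    return s->items[--s->count];
}
static void push_value(value_stack *s, double v) {
    assert(s->count < STACK_MAX);
    s->items[s->count++] = v;
}
static double pop_value(value_stack *s) {
    assert(s->count > 0);
    return s->items[--s->count];
}

/* Move one node from the AST stack to the call stack and queue its children. */
static void traverse(node_stack *ast, node_stack *call) {
    const node *n = pop_node(ast);
    push_node(call, n);
    if (n->kind == NODE_BINARY_ADD) {
        push_node(ast, n->left);
        push_node(ast, n->right);
    }
}

/* Execute the node on top of the call stack, using the return stack for values. */
static void interpret(node_stack *call, value_stack *ret) {
    const node *n = pop_node(call);
    if (n->kind == NODE_LITERAL) {
        push_value(ret, n->value);
    } else {
        double right = pop_value(ret);
        double left = pop_value(ret);
        push_value(ret, left + right);
    }
}

/* Same loop shape as the kernel: traverse until the AST stack drains, then interpret. */
double evaluate(const node *root) {
    node_stack ast = {0}, call = {0};
    value_stack ret = {0};
    push_node(&ast, root);
    while (ast.count > 0 || call.count > 0) {
        while (ast.count > 0)
            traverse(&ast, &call);
        if (call.count > 0)
            interpret(&call, &ret);
    }
    return pop_value(&ret);
}
```

No function here calls itself: the recursion a tree walk would normally use is replaced by the three explicit stacks.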
Let’s interpret...



 var x = 1 + 2;
var x = 1 + 2;
Stacks: AST (empty), Call (empty), Return (empty)
{
    "type": "VariableDeclaration",
    "declarations": [
        {
            "type": "VariableDeclarator",
            "id": {
                "type": "Identifier",
                "name": "x"
            },
            "init": {
                "type": "BinaryExpression",
                "operator": "+",
                "left": {
                    "type": "Literal",
                    "value": 1
                },
                "right": {
                    "type": "Literal",
                    "value": 2
                }
            }
        }
    ],
    "kind": "var"
}
Stack contents on each following slide while interpreting var x = 1 + 2;
(entries listed in the order the slides show them, top to bottom)
VarDecl = VariableDeclaration, VarDtor = VariableDeclarator, Ident = Identifier, Binary = BinaryExpression

slide   AST                        Call                                       Return
  1     VarDecl
  2     VarDtor
  3     Ident, Binary              VarDtor
  4     Ident, Literal, Literal    VarDtor, Binary
  5     Ident, Literal             VarDtor, Binary, Literal
  6     Ident                      VarDtor, Binary, Literal, Literal
  7                                VarDtor, Binary, Literal, Literal, Ident
  8                                VarDtor, Binary, Literal, Literal          “x”
  9                                VarDtor, Binary, Literal                   “x”, 1
 10                                VarDtor, Binary                            “x”, 1, 2
 11                                VarDtor                                    “x”, 3
 12     (all stacks empty)
Benchmark

                 Small loop of FLOPs
var input = new Array(10);
for (var i = 0; i < input.length; i++) {
    input[i] = Math.pow((i + 1) / 1.23, 3);
}
Execution Time
                    Lateral (GPU CL)    Lateral (CPU CL)          V8
Device              ATI Radeon 6770m    Intel Core i7 4x2.4GHz    Intel Core i7 4x2.4GHz
Execution time      116.571533ms        0.226007ms                0.090664ms
What went wrong?
Everything
Stack-based AST Interpreter, no optimizations
Heavy global memory access, no optimizations
No data or task parallelism
Stack-based Interpreter
Slow as molasses
Memory hog Eclipse style
Heavy memory access
     “var x = 1 + 2;” == 30 stack hits alone!
     Too much dynamic allocation
No inline optimizations, just following the yellow brick AST
Straight up lazy

Replace with something better!
Bytecode compiler on Host
Bytecode register-based interpreter on Device
Too much global access
   Everything is dynamically allocated to global memory
   Register based interpreter & bytecode compiler can
   make better use of local and private memory
Optimizing memory access yields crazy results

// 11.1207 seconds
size_t tid = get_global_id(0);
c[tid] = a[tid];
while(b[tid] > 0) { // touch global memory on each loop
  b[tid]--; // touch global memory on each loop
  c[tid]++; // touch global memory on each loop
}

// 0.0445558 seconds!! HOLY SHIT!
size_t tid = get_global_id(0);
int tmp = a[tid]; // temp private variable
for(int i=b[tid]; i > 0; i--) tmp++; // touch private variables on each loop
c[tid] = tmp; // touch global memory one time
No data or task parallelism
  Everything being interpreted in a single “thread”
  We have hundreds of cores available to us!
  Build in heuristics
         Identify side-effect free statements
         Break into parallel tasks - very magical

var input = new Array(10);
for (var i = 0; i < input.length; i++) {
    input[i] = Math.pow((i + 1) / 1.23, 3);
}

Broken into parallel tasks:
input[0] = Math.pow((0 + 1) / 1.23, 3);
input[1] = Math.pow((1 + 1) / 1.23, 3);
...
input[9] = Math.pow((9 + 1) / 1.23, 3);
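Why this loop splits so cleanly, as a plain C sketch: each element depends only on its own index, so the loop body can become a per-index function, much like a kernel reading get_global_id(0). The function names are hypothetical, the serial loop merely stands in for launching one work item per index, and x * x * x is used in place of Math.pow(x, 3).

```c
#include <assert.h>
#include <stddef.h>

/* One "work item": computes a single element from its index alone.
 * It reads no shared state, so calls can run in any order or in parallel. */
double work_item(size_t i) {
    double x = (double)(i + 1) / 1.23;
    return x * x * x;
}

/* Serial stand-in for dispatching `count` work items. */
void run_all(double *output, size_t count) {
    assert(output != NULL);
    for (size_t i = 0; i < count; i++)
        output[i] = work_item(i);
}
```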
What’s in store
Acceptable performance on all CL devices
V8/Node extension to launch Lateral tasks
High-level API to perform map-reduce, etc.
Lateral-cluster...mmmmm
Thanks!

  Jarred Nicholls
  @jarrednicholls
jarred@webkit.org

More Related Content

What's hot

FalsyValues. Dmitry Soshnikov - ECMAScript 6
FalsyValues. Dmitry Soshnikov - ECMAScript 6FalsyValues. Dmitry Soshnikov - ECMAScript 6
FalsyValues. Dmitry Soshnikov - ECMAScript 6Dmitry Soshnikov
 
Lightweight wrapper for Hive on Amazon EMR
Lightweight wrapper for Hive on Amazon EMRLightweight wrapper for Hive on Amazon EMR
Lightweight wrapper for Hive on Amazon EMRShinji Tanaka
 
Introduction into ES6 JavaScript.
Introduction into ES6 JavaScript.Introduction into ES6 JavaScript.
Introduction into ES6 JavaScript.boyney123
 
Testing Backbone applications with Jasmine
Testing Backbone applications with JasmineTesting Backbone applications with Jasmine
Testing Backbone applications with JasmineLeon van der Grient
 
ES6 - Next Generation Javascript
ES6 - Next Generation JavascriptES6 - Next Generation Javascript
ES6 - Next Generation JavascriptRamesh Nair
 
ES2015 (ES6) Overview
ES2015 (ES6) OverviewES2015 (ES6) Overview
ES2015 (ES6) Overviewhesher
 
Explaining ES6: JavaScript History and What is to Come
Explaining ES6: JavaScript History and What is to ComeExplaining ES6: JavaScript History and What is to Come
Explaining ES6: JavaScript History and What is to ComeCory Forsyth
 
Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6Introduction to Ecmascript - ES6
Introduction to Ecmascript - ES6Nilesh Jayanandana
 
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...
Øredev 2011 - JVM JIT for Dummies (What the JVM Does With Your Bytecode When ...Charles Nutter
 
Start Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New RopeStart Wrap Episode 11: A New Rope
Start Wrap Episode 11: A New RopeYung-Yu Chen
 
Google guava overview
Google guava overviewGoogle guava overview
Google guava overviewSteve Min
 
An Intro To ES6
An Intro To ES6An Intro To ES6
An Intro To ES6FITC
 
Down to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap DumpsDown to Stack Traces, up from Heap Dumps
Down to Stack Traces, up from Heap DumpsAndrei Pangin
 
JavaScript - new features in ECMAScript 6
JavaScript - new features in ECMAScript 6JavaScript - new features in ECMAScript 6
JavaScript - new features in ECMAScript 6Solution4Future
 
Mastering Java Bytecode With ASM - 33rd degree, 2012
Mastering Java Bytecode With ASM - 33rd degree, 2012Mastering Java Bytecode With ASM - 33rd degree, 2012
Mastering Java Bytecode With ASM - 33rd degree, 2012Anton Arhipov
 

JavaScript on the GPU

  • 1. If you don’t get this ref...shame on you
  • 2. Jarred Nicholls @jarrednicholls jarred@webkit.org
  • 3. Work @ Sencha Web Platform Team Doing webkitty things...
  • 7. What I’ll blabber about today
      Why JavaScript on the GPU
      Running JavaScript on the GPU
      What’s to come...
  • 8. Why JavaScript on the GPU?
  • 9. Why JavaScript on the GPU? Better question: Why a GPU?
  • 10. Why JavaScript on the GPU? Better question: Why a GPU? A: They’re fast! (well, at certain things...)
  • 11. GPUs are fast b/c...
      Totally different paradigm from CPUs
          Data parallelism vs. task parallelism
          Stream processing vs. sequential processing
          GPUs can divide-and-conquer
      Hardware capable of a large number of “threads”
          e.g. ATI Radeon HD 6770m: 480 stream processing units == 480 cores
      Typically very high memory bandwidth
      Many, many GigaFLOPs
  • 12. GPUs don’t solve all problems
      Not all tasks can be accelerated by GPUs
      Tasks must be parallelizable, i.e.:
          Side effect free
          Homogeneous and/or streamable
      Overall tasks will become limited by Amdahl’s Law
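Amdahl’s Law, mentioned on slide 12, caps the overall speedup at 1 / ((1 - p) + p / n), where p is the fraction of the work that can run in parallel and n is the number of parallel units. The following is a minimal sketch of that formula in C; the function name `amdahl_speedup` is made up for illustration and does not come from the talk.

```c
#include <assert.h>

/* Hypothetical helper (not from the talk): theoretical speedup under
 * Amdahl's Law when a fraction p (0..1) of the work is spread across
 * n parallel units and the remaining (1 - p) stays serial. */
double amdahl_speedup(double p, unsigned n)
{
    assert(p >= 0.0 && p <= 1.0);
    assert(n >= 1);
    return 1.0 / ((1.0 - p) + p / (double)n);
}
```

With p = 0.9 and the 480 stream processors from the Radeon example, `amdahl_speedup(0.9, 480)` comes out at roughly 9.8, close to the ceiling of 10 that the serial 10% imposes no matter how many cores are added.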
  • 16. LateralJS: Our Mission
      To make JavaScript a first-class citizen on all GPUs and take advantage of hardware-accelerated operations & data parallelization.
  • 17. Our Options
      OpenCL                      Nvidia CUDA
      AMD, Nvidia, Intel, etc.    Nvidia only
      A shitty version of C99     C++ (C for CUDA)
      No dynamic memory           Dynamic memory
      No recursion                Recursion
      No function pointers        Function pointers
      Terrible tooling            Great dev. tooling
      Immature (arguably)         More mature (arguably)
  • 19. Why not a Static Compiler?
      We want full JavaScript support
          Object / prototype
          Closures
          Recursion
          Functions as objects
          Variable typing
      Type inference limitations
      Reasonably limited to size and complexity of “kernel-esque” functions
      Not nearly insane enough
  • 21. Why an Interpreter?
      We want it all baby - full JavaScript support!
      Most insane approach
      Challenging to make it good, but holds a lot of promise
  • 24. Oh the agony...
      Multiple memory spaces - pointer hell
      No recursion - all inlined functions
      No standard libc libraries
      No dynamic memory
      No standard data structures - apart from vector ops
      Buggy ass AMD/Nvidia compilers
  • 26. Multiple Memory Spaces
      In order from fastest to slowest:
      speed                          space             description
      very fast                      private           stream processor cache (~64KB); scoped to a single work item
      fast                           local             ~= L1 cache on CPUs (~64KB); scoped to a single work group
      slow, by orders of magnitude   global, constant  ~= system memory over slow bus; available to all work groups/items; all the VRAM on the card (MBs)
  • 27. Memory Space Pointer Hell
      global uchar* gptr = 0x1000;
      local uchar* lptr = (local uchar*) gptr; // FAIL!
      uchar* pptr = (uchar*) gptr; // FAIL! private is implicit
      [Diagram: address 0x1000 shown separately in the global, local and private spaces]
      0x1000 points to something different depending on the address space!
  • 28. Memory Space Pointer Hell
      Pointers must always be fully qualified
      Macros to help ease the pain
      #define GPTR(TYPE) global TYPE*
      #define CPTR(TYPE) constant TYPE*
      #define LPTR(TYPE) local TYPE*
      #define PPTR(TYPE) private TYPE*
  • 29. No Recursion!?!?!?
      No call stack
      All functions are inlined to the kernel function
      uint factorial(uint n) {
          if (n <= 1)
              return 1;
          else
              return n * factorial(n - 1); // compile-time error
      }
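Since the recursive `factorial` on slide 29 cannot compile where every function is inlined, the usual workaround is to rewrite the recursion as a loop. The sketch below is plain host-side C rather than OpenCL C, and the name `factorial_iter` is an illustrative assumption, not something shown in the talk.

```c
#include <assert.h>
#include <stdint.h>

/* Loop-based replacement for the recursive factorial on the slide.
 * No call stack is needed, so the body can be inlined into a kernel.
 * 12! is the largest factorial that fits in 32 bits, hence the guard. */
uint32_t factorial_iter(uint32_t n)
{
    assert(n <= 12);
    uint32_t result = 1;
    for (uint32_t i = 2; i <= n; i++)
        result *= i;
    return result;
}
```

Recursion that cannot be flattened into a simple loop generally needs an explicit, manually managed stack instead, which fits the "stack-based interpreter" named later in the deck.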
  • 30. No standard libc libraries memcpy? strcpy? strcmp? etc...
  • 31. No standard libc libraries
      Implement our own
      #define MEMCPY(NAME, DEST_AS, SRC_AS) \
          DEST_AS void* NAME(DEST_AS void*, SRC_AS const void*, uint); \
          DEST_AS void* NAME(DEST_AS void* dest, SRC_AS const void* src, uint size) { \
              DEST_AS uchar* cDest = (DEST_AS uchar*)dest; \
              SRC_AS const uchar* cSrc = (SRC_AS const uchar*)src; \
              for (uint i = 0; i < size; i++) \
                  cDest[i] = cSrc[i]; \
              return (DEST_AS void*)cDest; \
          }
      PTR_MACRO_DEST_SRC(MEMCPY, memcpy)
      Produces:
      memcpy_g   memcpy_gc  memcpy_lc  memcpy_pc
      memcpy_l   memcpy_gl  memcpy_lg  memcpy_pg
      memcpy_p   memcpy_gp  memcpy_lp  memcpy_pl
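The macro on slide 31 stamps out one copy routine per pair of address spaces. The sketch below shows the same idea in host-compilable C. The names `DEFINE_MEMCPY`, `AS_GLOBAL`, `AS_LOCAL` and `AS_PRIVATE` are hypothetical stand-ins; the qualifiers are defined away because a host compiler has no address spaces. It also assumes the suffix order is destination first, then source, which the slide does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Host stand-ins for the OpenCL address-space qualifiers (assumption:
 * in a real kernel these would be global / local / private). */
#define AS_GLOBAL
#define AS_LOCAL
#define AS_PRIVATE

/* Generates one byte-wise copy routine per (destination, source) pair. */
#define DEFINE_MEMCPY(NAME, DEST_AS, SRC_AS)                               \
    DEST_AS void *NAME(DEST_AS void *dest, SRC_AS const void *src,         \
                       unsigned size)                                      \
    {                                                                      \
        DEST_AS unsigned char *d = (DEST_AS unsigned char *)dest;          \
        SRC_AS const unsigned char *s = (SRC_AS const unsigned char *)src; \
        assert(size == 0 || (dest != NULL && src != NULL));                \
        for (unsigned i = 0; i < size; i++)                                \
            d[i] = s[i];                                                   \
        return dest;                                                       \
    }

DEFINE_MEMCPY(memcpy_gl, AS_GLOBAL, AS_LOCAL)   /* global  <- local  */
DEFINE_MEMCPY(memcpy_pg, AS_PRIVATE, AS_GLOBAL) /* private <- global */
```

Each generated function has a distinct name because OpenCL C of that era had no way to write a single routine that accepts pointers into arbitrary address spaces.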
  • 32. No dynamic memory No malloc() No free() What to do...
  • 33. Yes! dynamic memory
      Create a large buffer of global memory - our “heap”
      Implement our own malloc() and free()
      Create a handle structure - “virtual memory”
      P(T, hnd) macro to get the current pointer address
      GPTR(handle) hnd = malloc(sizeof(uint));
      GPTR(uint) ptr = P(uint, hnd);
      *ptr = 0xdeadbeef;
      free(hnd);
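To make the handle idea on slide 33 concrete, here is a rough host-side sketch: a fixed buffer plays the heap, allocation hands back a handle instead of a raw pointer, and `P(T, hnd)` resolves the handle to the block's current address. Everything except the `P` macro's name is an assumption for illustration (`heap_handle`, `heap_malloc`, `heap_free`, the bump-pointer strategy, the sizes); the talk does not show the real allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_SIZE   1024u  /* bytes in the pretend "global memory" heap */
#define MAX_HANDLES 16u

typedef struct {
    uint32_t offset;  /* where the block currently lives in the heap */
    uint32_t size;    /* block size in bytes; 0 marks a free slot */
} heap_handle;

static uint32_t    g_heap[HEAP_SIZE / 4];  /* word-aligned backing store */
static heap_handle g_handles[MAX_HANDLES];
static uint32_t    g_top;                  /* bump pointer, in bytes */

/* Resolve a handle to a typed pointer at the block's current location. */
#define P(T, hnd) ((T *)((unsigned char *)g_heap + (hnd)->offset))

/* Allocate size bytes; returns a handle, or NULL when out of space. */
heap_handle *heap_malloc(uint32_t size)
{
    size = (size + 3u) & ~3u;  /* keep blocks 4-byte aligned */
    if (size == 0 || size > HEAP_SIZE - g_top)
        return NULL;
    for (uint32_t i = 0; i < MAX_HANDLES; i++) {
        if (g_handles[i].size == 0) {
            g_handles[i].offset = g_top;
            g_handles[i].size = size;
            g_top += size;
            return &g_handles[i];
        }
    }
    return NULL;
}

/* Release the handle slot; this sketch never reclaims the heap bytes. */
void heap_free(heap_handle *hnd)
{
    assert(hnd != NULL);
    hnd->size = 0;
}
```

The point of the indirection is that code only ever holds handles, so an allocator or garbage collector is free to move a block and update `offset`; the next `P(T, hnd)` simply picks up the new address.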
  • 35. Ok, we get the point... FYL!
  • 36-40. High-level Architecture
      Components: V8, Esprima Parser, Data Serializer & Marshaller, Device Mgr, Host, GPUs, Data Heap, Stack-based Interpreter, Garbage Collector
      Flow (built up step by step across slides 37-40):
          1. eval(code);
          2. Build JSON AST
          3. Serialize AST (JSON => C Structs)
          4. Ship to GPU to Interpret
          5. Fetch Result
  • 42. AST Generation
      JavaScript Source => Esprima in V8 => JSON AST (v8::Object) => Lateral AST (C structs)
  • 43. Embed esprima.js
      Resource Generator
      $ resgen esprima.js resgen_esprima_js.c
  • 44. Embed esprima.js: resgen_esprima_js.c
      const unsigned char resgen_esprima_js[] = {
          0x2f, 0x2a, 0x0a, 0x20, 0x20, 0x43, 0x6f, 0x70, 0x79, 0x72,
          0x69, 0x67, 0x68, 0x74, 0x20, 0x28, 0x43, 0x29, 0x20, 0x32,
          ...
          0x20, 0x3a, 0x20, 0x2a, 0x2f, 0x0a, 0x0a, 0
      };
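The generated file on slide 44 is simply the bytes of esprima.js written out as a C array with a trailing 0 so that it can be used as a string. How `resgen` itself works is not shown; the sketch below is one guess at the core step, using a hypothetical function `emit_c_array` that formats a byte buffer in the same style.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical core of a resource generator: writes `len` bytes of `data`
 * as a NUL-terminated C array named `name` into `out`. Returns the number
 * of characters written, or -1 if the output buffer is too small. */
int emit_c_array(char *out, size_t cap, const char *name,
                 const unsigned char *data, size_t len)
{
    assert(out != NULL && name != NULL && (data != NULL || len == 0));

    int n = snprintf(out, cap, "const unsigned char %s[] = { ", name);
    if (n < 0 || (size_t)n >= cap)
        return -1;
    size_t used = (size_t)n;

    for (size_t i = 0; i < len; i++) {
        n = snprintf(out + used, cap - used, "0x%02x, ", data[i]);
        if (n < 0 || (size_t)n >= cap - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, cap - used, "0 };");  /* trailing terminator */
    if (n < 0 || (size_t)n >= cap - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

Fed the three bytes `/`, `*` and a newline, this yields `0x2f, 0x2a, 0x0a,` followed by the terminator, which matches the opening bytes of the array on the slide (the start of a comment block).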
  • 45. Embed esprima.js: ASTGenerator.cpp
      // Declaration matches the array defined in resgen_esprima_js.c
      extern const unsigned char resgen_esprima_js[];
      void ASTGenerator::init()
      {
          HandleScope scope;
          s_context = Context::New();
          s_context->Enter();
          Handle<Script> script = Script::Compile(String::New((const char*)resgen_esprima_js));
          script->Run();
          s_context->Exit();
          s_initialized = true;
      }
  • 46. Build JSON AST e.g. ASTGenerator::esprimaParse( "var xyz = new Array(10);" );
  • 47. Build JSON AST
      Handle<Object> ASTGenerator::esprimaParse(const char* javascript)
      {
          if (!s_initialized)
              init();
          HandleScope scope;
          s_context->Enter();
          Handle<Object> global = s_context->Global();
          Handle<Object> esprima = Handle<Object>::Cast(global->Get(String::New("esprima")));
          Handle<Function> esprimaParse = Handle<Function>::Cast(esprima->Get(String::New("parse")));
          Handle<String> code = String::New(javascript);
          Handle<Object> ast = Handle<Object>::Cast(esprimaParse->Call(esprima, 1, (Handle<Value>*)&code));
          s_context->Exit();
          return scope.Close(ast);
      }
  • 48. Build JSON AST
      {
          "type": "VariableDeclaration",
          "declarations": [
              {
                  "type": "VariableDeclarator",
                  "id": {
                      "type": "Identifier",
                      "name": "xyz"
                  },
                  "init": {
                      "type": "NewExpression",
                      "callee": {
                          "type": "Identifier",
                          "name": "Array"
                      },
                      "arguments": [
                          {
                              "type": "Literal",
                              "value": 10
                          }
                      ]
                  }
              }
          ],
          "kind": "var"
      }
  • 49. Lateral AST structs
      Structs shared between Host and OpenCL
      #ifdef __OPENCL_VERSION__
      #define CL(TYPE) TYPE
      #else
      #define CL(TYPE) cl_##TYPE
      #endif
      typedef struct ast_type_st {
          CL(uint) id;
          CL(uint) size;
      } ast_type;
      typedef struct ast_program_st {
          ast_type type;
          CL(uint) body;
          CL(uint) numBody;
      } ast_program;
      typedef struct ast_identifier_st {
          ast_type type;
          CL(uint) name;
      } ast_identifier;
  • 50. Lateral AST structs v8::Object => ast_type expanded ast_type* vd1_1_init_id = (ast_type*)astCreateIdentifier("Array"); ast_type* vd1_1_init_args[1]; vd1_1_init_args[0] = (ast_type*)astCreateNumberLiteral(10); ast_type* vd1_1_init = (ast_type*)astCreateNewExpression(vd1_1_init_id, vd1_1_init_args, 1); free(vd1_1_init_id); for (int i = 0; i < 1; i++) free(vd1_1_init_args[i]); ast_type* vd1_1_id = (ast_type*)astCreateIdentifier("xyz"); ast_type* vd1_decls[1]; vd1_decls[0] = (ast_type*)astCreateVariableDeclarator(vd1_1_id, vd1_1_init); free(vd1_1_id); free(vd1_1_init); ast_type* vd1 = (ast_type*)astCreateVariableDeclaration(vd1_decls, 1, "var"); for (int i = 0; i < 1; i++) free(vd1_decls[i]);
  • 51. Lateral AST structs astCreateIdentifier ast_identifier* astCreateIdentifier(const char* str) { CL(uint) size = sizeof(ast_identifier) + rnd(strlen(str) + 1, 4); ast_identifier* ast_id = (ast_identifier*)malloc(size); // copy the string strcpy((char*)(ast_id + 1), str); // fill the struct ast_id->type.id = AST_IDENTIFIER; ast_id->type.size = size; ast_id->name = sizeof(ast_identifier); // offset return ast_id; }
  • 52. Lateral AST structs astCreateIdentifier(“xyz”)
      offset  field      value
      0       type.id    AST_IDENTIFIER (0x01)
      4       type.size  16
      8       name       12 (offset)
      12      str[0]     ‘x’
      13      str[1]     ‘y’
      14      str[2]     ‘z’
      15      str[3]     ‘\0’
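The layout in the table can be reproduced on the host with a small sketch. It assumes `uint32_t` as a stand-in for `CL(uint)` and assumes that `rnd(n, 4)` rounds up to the next multiple of 4 (the slides do not define `rnd`). The names `make_identifier`, `identifier_name` and `round_up` are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption: uint32_t stands in for CL(uint) (cl_uint on the host). */
typedef struct { uint32_t id; uint32_t size; } ast_type;
typedef struct { ast_type type; uint32_t name; } ast_identifier;

enum { AST_IDENTIFIER = 0x01 };

/* Assumed behavior of rnd(): round n up to the next multiple of m. */
static uint32_t round_up(uint32_t n, uint32_t m)
{
    return (n + m - 1) / m * m;
}

/* Same layout as the slide: fixed header first, string bytes right after. */
ast_identifier *make_identifier(const char *str)
{
    uint32_t size = (uint32_t)sizeof(ast_identifier)
                  + round_up((uint32_t)strlen(str) + 1, 4);
    ast_identifier *node = calloc(1, size);   /* also zeroes padding bytes */
    if (!node)
        return NULL;

    memcpy((char *)node + sizeof(ast_identifier), str, strlen(str) + 1);
    node->type.id = AST_IDENTIFIER;
    node->type.size = size;
    node->name = (uint32_t)sizeof(ast_identifier);  /* an offset, not a pointer */
    return node;
}

/* Turn the stored offset back into a C string. */
const char *identifier_name(const ast_identifier *node)
{
    assert(node->name < node->type.size);
    return (const char *)node + node->name;
}
```

For `make_identifier("xyz")` the header is 12 bytes and the string needs 4, which gives the 16-byte node from the table. Storing `name` as an offset rather than a pointer is what keeps the node valid after it is copied into a different address space.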
  • 53. Lateral AST structs astCreateNewExpression ast_expression_new* astCreateNewExpression(ast_type* callee, ast_type** arguments, int numArgs) { CL(uint) size = sizeof(ast_expression_new) + callee->size; for (int i = 0; i < numArgs; i++) size += arguments[i]->size; ast_expression_new* ast_new = (ast_expression_new*)malloc(size); ast_new->type.id = AST_NEW_EXPR; ast_new->type.size = size; CL(uint) offset = sizeof(ast_expression_new); char* dest = (char*)ast_new; // copy callee memcpy(dest + offset, callee, callee->size); ast_new->callee = offset; offset += callee->size; // copy arguments if (numArgs) { ast_new->arguments = offset; for (int i = 0; i < numArgs; i++) { ast_type* arg = arguments[i]; memcpy(dest + offset, arg, arg->size); offset += arg->size; } } else ast_new->arguments = 0; ast_new->numArguments = numArgs; return ast_new; }
  • 54. Lateral AST structs new Array(10)
      offset  field           value
      0       type.id         AST_NEW_EXPR (0x308)
      4       type.size       52
      8       callee          20 (offset)
      12      arguments       40 (offset)
      16      numArguments    1
      20      callee node     ast_identifier (“Array”)
      40      arguments node  ast_literal_number (10)
  • 55. Lateral AST structs: shared across the Host and the OpenCL runtime (Host writes, Lateral reads). Constructed on Host as contiguous blobs. Easy to send to GPU: memcpy(gpu, ast, ast->size); fast to send to GPU (single buffer write). Simple to traverse w/ pointer arithmetic.
  • 57. Building Blocks: JS Type Structs, Lateral State, Heap, AST Traverse Stack, Call/Exec Stack, Return Stack, Symbol/Ref Table, Scope Stack, AST Traverse Loop, Interpret Loop
  • 58. Kernels #include "state.h" #include "jsvm/asttraverse.h" #include "jsvm/interpreter.h" // Setup VM structures kernel void lateral_init(GPTR(uchar) lateral_heap) { LATERAL_STATE_INIT } // Interpret the AST kernel void lateral(GPTR(uchar) lateral_heap, GPTR(ast_type) lateral_ast) { LATERAL_STATE ast_push(lateral_ast); while (!Q_EMPTY(lateral_state->ast_stack, ast_q) || !Q_EMPTY(lateral_state->call_stack, call_q)) { while (!Q_EMPTY(lateral_state->ast_stack, ast_q)) traverse(); if (!Q_EMPTY(lateral_state->call_stack, call_q)) interpret(); } }
  • 60. var x = 1 + 2; { "type": "VariableDeclaration", "declarations": [ { "type": "VariableDeclarator", "id": { "type": "Identifier", "name": "x" }, "init": { "type": "BinaryExpression", "operator": "+", "left": { "type": "Literal", "value": 1 }, "right": { "type": "Literal", "value": 2 } } } ], "kind": "var" } Stacks (listed bottom to top): AST: empty; Call: empty; Return: empty
  • 61. var x = 1 + 2; AST: VarDecl; Call: empty; Return: empty
  • 62. var x = 1 + 2; AST: VarDtor; Call: empty; Return: empty
  • 63. var x = 1 + 2; AST: Ident, Binary; Call: VarDtor; Return: empty
  • 64. var x = 1 + 2; AST: Ident, Literal, Literal; Call: VarDtor, Binary; Return: empty
  • 65. var x = 1 + 2; AST: Ident, Literal, Literal; Call: VarDtor, Binary; Return: empty
  • 66. var x = 1 + 2; AST: Ident; Call: VarDtor, Binary, Literal, Literal; Return: empty
  • 67. var x = 1 + 2; AST: empty; Call: VarDtor, Binary, Literal, Literal, Ident; Return: empty
  • 68. var x = 1 + 2; AST: empty; Call: VarDtor, Binary, Literal, Literal; Return: “x”
  • 69. var x = 1 + 2; AST: empty; Call: VarDtor, Binary, Literal; Return: “x”, 1
  • 70. var x = 1 + 2; AST: empty; Call: VarDtor, Binary; Return: “x”, 1, 2
  • 71. var x = 1 + 2; AST: empty; Call: VarDtor; Return: “x”, 3
  • 72. var x = 1 + 2; AST: empty; Call: empty; Return: empty
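The walkthrough for `var x = 1 + 2;` can be mimicked on the host with three small stacks. This is only a sketch of the idea: it uses ordinary pointer-based nodes instead of Lateral's offset-based blobs, supports just the four node kinds needed here, and every name in it is invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { N_VAR_DECLARATOR, N_IDENT, N_BINARY_ADD, N_LITERAL } node_kind;

typedef struct node {
    node_kind kind;
    double value;               /* used by N_LITERAL */
    const char *name;           /* used by N_IDENT */
    const struct node *left;    /* id of a declarator, lhs of a binary */
    const struct node *right;   /* init of a declarator, rhs of a binary */
} node;

typedef struct { double num; const char *name; } value;

#define STACK_MAX 32
typedef struct { const node *items[STACK_MAX]; int top; } node_stack;
typedef struct { value items[STACK_MAX]; int top; } value_stack;

static void npush(node_stack *s, const node *n)
{ assert(s->top < STACK_MAX); s->items[s->top++] = n; }
static const node *npop(node_stack *s)
{ assert(s->top > 0); return s->items[--s->top]; }
static void vpush(value_stack *s, double num, const char *name)
{
    assert(s->top < STACK_MAX);
    s->items[s->top].num = num;
    s->items[s->top].name = name;
    s->top++;
}
static value vpop(value_stack *s)
{ assert(s->top > 0); return s->items[--s->top]; }

/* Evaluates one declarator; returns the initializer's value and stores the
   declared name in *out_name. */
double run_declarator(const node *root, const char **out_name)
{
    node_stack ast, call;
    value_stack ret;
    double result = 0.0;
    ast.top = call.top = ret.top = 0;

    npush(&ast, root);
    while (ast.top > 0 || call.top > 0) {
        /* traverse: move nodes to the call stack, pushing their children */
        while (ast.top > 0) {
            const node *n = npop(&ast);
            npush(&call, n);
            if (n->kind == N_VAR_DECLARATOR || n->kind == N_BINARY_ADD) {
                npush(&ast, n->left);
                npush(&ast, n->right);
            }
        }
        /* interpret: execute the node on top of the call stack */
        if (call.top > 0) {
            const node *n = npop(&call);
            switch (n->kind) {
            case N_IDENT:   vpush(&ret, 0.0, n->name); break;
            case N_LITERAL: vpush(&ret, n->value, NULL); break;
            case N_BINARY_ADD: {
                value rhs = vpop(&ret);
                value lhs = vpop(&ret);
                vpush(&ret, lhs.num + rhs.num, NULL);
                break;
            }
            case N_VAR_DECLARATOR: {
                value init = vpop(&ret);
                value id = vpop(&ret);
                *out_name = id.name;
                result = init.num;
                break;
            }
            }
        }
    }
    return result;
}
```

Fed a declarator for `x` with initializer `1 + 2`, the stacks pass through the same states as in the slides: the return stack holds “x”, then 1 and 2, then 3, before the declarator consumes the value and the name.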
  • 74. Benchmark Small loop of FLOPs var input = new Array(10); for (var i = 0; i < input.length; i++) { input[i] = Math.pow((i + 1) / 1.23, 3); }
  • 75. Execution Time
      Lateral, GPU CL (ATI Radeon 6770m): 116.571533 ms
      Lateral, CPU CL (Intel Core i7 4x2.4GHz): 0.226007 ms
      V8 (Intel Core i7 4x2.4GHz): 0.090664 ms
  • 78. What went wrong? Everything. Stack-based AST interpreter, no optimizations. Heavy global memory access, no optimizations. No data or task parallelism.
  • 79. Stack-based Interpreter: slow as molasses. Memory hog, Eclipse style. Heavy memory access: “var x = 1 + 2;” == 30 stack hits alone! Too much dynamic allocation. No inline optimizations, just following the yellow brick AST. Straight up lazy. Replace with something better: bytecode compiler on Host, register-based bytecode interpreter on Device.
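The slides propose a register-based bytecode interpreter but do not show one. A minimal sketch of the idea follows, with an invented three-opcode instruction set and a fixed-width instruction format; none of this is Lateral's actual design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcode set, for illustration only. */
typedef enum { OP_LOADK, OP_ADD, OP_HALT } opcode;

/* Fixed-width instruction: opcode, destination register, two operands. */
typedef struct { uint8_t op, dst, a, b; } instr;

#define NUM_REGS 8

/* Runs `code` until OP_HALT and returns the register named by its dst.
   OP_LOADK reads constants[a]; OP_ADD reads registers a and b. */
double run_bytecode(const instr *code, const double *constants)
{
    double regs[NUM_REGS] = { 0 };

    for (size_t pc = 0; ; pc++) {
        instr in = code[pc];
        assert(in.dst < NUM_REGS);
        switch (in.op) {
        case OP_LOADK:
            regs[in.dst] = constants[in.a];
            break;
        case OP_ADD:
            assert(in.a < NUM_REGS && in.b < NUM_REGS);
            regs[in.dst] = regs[in.a] + regs[in.b];
            break;
        case OP_HALT:
            return regs[in.dst];
        default:
            assert(!"unknown opcode");
            return 0.0;
        }
    }
}
```

With constants `{1, 2}`, the expression `1 + 2` compiles to four instructions: `LOADK r0, k0`, `LOADK r1, k1`, `ADD r0, r0, r1`, `HALT r0`. All intermediate values stay in a small register array, whereas the AST interpreter pushes and pops every one of them through stacks in memory.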
  • 81. Too much global access. Everything is dynamically allocated to global memory. Register based interpreter & bytecode compiler can make better use of local and private memory. Optimizing memory access yields crazy results:
      // 11.1207 seconds
      size_t tid = get_global_id(0);
      c[tid] = a[tid];
      while (b[tid] > 0) { // touch global memory on each loop
        b[tid]--; // touch global memory on each loop
        c[tid]++; // touch global memory on each loop
      }
      // 0.0445558 seconds!! HOLY SHIT!
      size_t tid = get_global_id(0);
      int tmp = a[tid]; // temp private variable
      for (int i = b[tid]; i > 0; i--) tmp++; // touch private variables on each loop
      c[tid] = tmp; // touch global memory one time
  • 82. No data or task parallelism. Everything being interpreted in a single “thread”. We have hundreds of cores available to us! Build in heuristics: identify side-effect free statements; break into parallel tasks - very magical.
      var input = new Array(10);
      for (var i = 0; i < input.length; i++) {
        input[i] = Math.pow((i + 1) / 1.23, 3);
      }
      becomes:
      input[0] = Math.pow((0 + 1) / 1.23, 3);
      input[1] = Math.pow((1 + 1) / 1.23, 3);
      ...
      input[9] = Math.pow((9 + 1) / 1.23, 3);
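The benchmark loop can be split this way because each iteration writes only `input[i]` and reads nothing but `i`. A small host-side sketch of that property, with hypothetical function names and a plain cube in place of `Math.pow(x, 3)`:

```c
#include <assert.h>
#include <stddef.h>

/* The loop body as a pure function of the index: no shared state is read
   or written. x * x * x stands in for Math.pow(x, 3). */
double element_for_index(size_t i)
{
    double x = (double)(i + 1) / 1.23;
    return x * x * x;
}

/* Sequential order, as the original JavaScript loop runs. */
void fill_in_order(double *output, size_t n)
{
    for (size_t i = 0; i < n; i++)
        output[i] = element_for_index(i);
}

/* Reverse order, standing in for "any schedule": since each call touches
   only output[i], the result does not depend on the order of execution. */
void fill_reversed(double *output, size_t n)
{
    for (size_t i = n; i-- > 0; )
        output[i] = element_for_index(i);
}
```

On a GPU the index would come from something like `get_global_id(0)` and each work-item would compute one element. The hard part the slide calls "very magical" is proving automatically that an arbitrary JavaScript loop body is side-effect free in this sense.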
  • 83. What’s in store: acceptable performance on all CL devices; V8/Node extension to launch Lateral tasks; high-level API to perform map-reduce, etc.; Lateral-cluster...mmmmm
  • 84. Thanks! Jarred Nicholls @jarrednicholls jarred@webkit.org