Cloud, Distributed, Embedded: Erlang in the Heterogeneous Computing World
Usage Rights: CC Attribution-NonCommercial-ShareAlike License

Presentation Transcript

  • Cloud, Distributed, Embedded. Erlang in the Heterogeneous Computing World Omer Kilic || @OmerK omer@erlang-solutions.com
  • Outline
      • Challenges in modern computing systems
      • Heterogeneous computing
      • Co-processors and accelerators
      • Programming models and tools
      • Alternate architectures
      • Parallella Vision System
      • Erlang Embedded Project
      • Q&A
    10/12/2013 Build Stuff 2013 Slide 2 of 46
  • Challenges: Software
      • Frequency wall
      • Memory bottlenecks
      • Software complexity
  • Amdahl’s Law
      • “…the maximum speed-up through parallel processing is set by the amount of code which has to run serial”
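The quoted statement is usually written as a formula. As a sketch, assume a fraction p of the run time can be parallelised and N processing elements are available (the symbols p, N and S are notation introduced here, not taken from the slides):

```latex
S(N) = \frac{1}{(1 - p) + \dfrac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

For example, with p = 0.95 the speed-up on 64 cores is 1 / (0.05 + 0.95/64) ≈ 15.4, and it can never exceed 1 / 0.05 = 20 however many cores are added.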
  • Challenges: Hardware
      • Yield issues
      • Wiring and interconnect
      • Thermal density
      • Power consumption
    End of Moore’s law imminent…
  • Challenges
    “With nearly 10 billion devices connected to the internet and predictions for exponential growth, we’ve reached a point where the space, power, and cost demands of traditional technology are no longer sustainable.”
    Meg Whitman, President and CEO, HP
  • Internet of Things
  • Device Architectures (I)
  • Device Architectures (II)
  • Heterogeneous Computing (I)
      • Special purpose, highly specialised architectures will outperform general purpose processing devices
          – Possibly by orders of magnitude
          – In terms of energy efficiency as well as raw speed
          – Parallel execution is key
      • Non-programmable/pseudo-programmable accelerators: ASIC, DSP, GPU, …
      • Fully programmable accelerators: FPGAs
  • Open Compute Project
  • Heterogeneous Computing (II)
  • GPUs
  • Anatomy of a GPU
  • Co-processors: NetFPGA 10G
  • Co-processors: Generic COTS devices
  • Landscape of accelerator programming

    | Interface   | CUDA                   | OpenCL                         | DirectCompute      | RenderScript     |
    |-------------|------------------------|--------------------------------|--------------------|------------------|
    | Originator  | NVIDIA                 | Khronos (Apple)                | Microsoft          | Google           |
    | Year        | 2007                   | 2008                           | 2009               | 2011             |
    | Area        | HPC, desktop           | Desktop, mobile, embedded, HPC | Desktop            | Mobile           |
    | OS          | Windows, Linux, Mac OS | Windows, Linux, Mac OS (10.6+) | Windows (Vista+)   | Android (3.0+)   |
    | Devices     | GPUs (NVIDIA)          | CPUs, GPUs, custom             | GPUs (NVIDIA, AMD) | CPUs, GPUs, DSPs |
    | Work unit   | Kernel                 | Kernel                         | Compute shader     | Compute script   |
    | Language    | CUDA C/C++             | OpenCL C                       | HLSL               | Script C         |
    | Distributed | Source, PTX            | Source                         | Source, bytecode   | LLVM bitcode     |

    From: “The landscape of accelerator programming: a view from ARM”, Lokhmotov, A., 3rd UK GPU Computing Conference, London
  • Accelerator types
      • Programmable accelerators
          – CPU vector extensions: x86/SSE/AVX, PowerPC/VMX, ARM/NEON
          – GPUs supporting general-purpose computing (GPGPUs)
          – Sony/Toshiba/IBM Cell (Sony PlayStation 3, HPC)
          – ClearSpeed CSX (HPC, embedded)
          – Adapteva Epiphany (HPC, mobile)
          – Intel MIC (HPC)
  • Programming accelerators
      • Proprietary low-level APIs, typically C-based:
          – Vector intrinsics
          – NVIDIA CUDA
          – ATI Brook+
          – ClearSpeed Cn
      • No software portability, obsolescence risk.
  • OpenCL (I)
    “OpenCL (Open Computing Language) is an open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. OpenCL provides a uniform programming environment for software developers to write efficient, portable code for high-performance compute servers, desktop computer systems and handheld devices using a diverse mix of multi-core CPUs, GPUs, Cell-type architectures and other parallel processors such as DSPs.”
  • OpenCL (II)
      • Allows you to write C-like code which executes on GPUs and many other devices
          – CPUs, FPGAs, various other architectures
      • Key point is data parallelism: applying the same function to a large amount of data
      • Allows us to leverage devices like GPUs from Erlang easily with a minimal wrapper
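To make "applying the same function to a large amount of data" concrete, here is a minimal sketch in plain Erlang. It is not OpenCL and not the wrapper mentioned on the slide: it only shows the shape of a data-parallel map, using one Erlang process per element where OpenCL would use one work-item. The module name `pmap_sketch` and the function `pmap/2` are hypothetical names chosen for this illustration.

```erlang
-module(pmap_sketch).
-export([pmap/2]).

%% Apply F to every element of List, one process per element,
%% and return the results in the original order.
%% (Illustrative only: a real kernel would run on the accelerator.)
pmap(F, List) ->
    Parent = self(),
    Refs = [begin
                Ref = make_ref(),
                %% Each worker computes one result and sends it back,
                %% tagged with a unique reference.
                spawn_link(fun() -> Parent ! {Ref, F(X)} end),
                Ref
            end || X <- List],
    %% Collect results in submission order by matching on each reference.
    [receive {Ref, Result} -> Result end || Ref <- Refs].
```

For instance, `pmap_sketch:pmap(fun(X) -> X * X end, [1, 2, 3])` returns `[1, 4, 9]`. The function applied to each element plays the role of the kernel, and the list plays the role of the data buffer.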
  • The Parallella Board
  • Shiny prototype!
  • The Parallella Board
  • Epiphany Architecture
  • Epiphany-IV 64-core 28nm (E64G401)
      • 64 High Performance RISC CPU Cores
      • 800 MHz Operating Frequency
      • 100 GFLOPS Peak Performance
      • 1.6 TB/s Local Memory Bandwidth
      • 102 GB/s Network-On-Chip Bisection Bandwidth
      • 6.4 GB/s Off-Chip Bandwidth
      • 2 MB On-Chip Distributed Shared Memory
      • 2 Watt Maximum Chip Power Consumption
      • IEEE Floating Point Instruction Set
      • Fully-featured ANSI-C/C++ programmable
      • GNU/Eclipse based tool chain
      • Source synchronous LVDS off chip links for host or direct chip-to-chip interfacing.
      • Chip to chip links for integrating up to 64 chips on a single board
  • Parallella Vision Demo - Overview
  • Parallella Vision Demo - Cameras
  • Parallella Vision Demo - Architecture
  • OpenCL and Erlang
      • Erlang is not that great for crunching image data.
          – This is where OpenCL fits in.
      • Erlang provides an environment around OpenCL. Our server implementation collects frames, offloads processing to the Epiphany and sends the results back.
          – Low latency distributed communications and message passing between processes and nodes
          – Monitoring and supervision facilities
          – “Glue” between heterogeneous nodes
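The collect, offload and reply pattern described on this slide can be sketched with ordinary Erlang processes. This is not the code from the vision demo: the module name `frame_server_sketch`, its functions and the message shapes are all assumptions made for illustration, and `ProcessFun` is a stand-in for whatever call hands a frame to the accelerator.

```erlang
-module(frame_server_sketch).
-export([start/1, submit/2, loop/1]).

%% Start a server that applies ProcessFun to every frame it receives.
%% ProcessFun is a hypothetical stand-in for the accelerator offload call.
start(ProcessFun) ->
    spawn(?MODULE, loop, [ProcessFun]).

%% Send a frame and wait for {ok, Result} or {error, Reason}.
submit(Server, Frame) ->
    Tag = make_ref(),
    Server ! {frame, self(), Tag, Frame},
    receive
        {Tag, Reply} -> Reply
    after 5000 ->
        {error, timeout}
    end.

loop(ProcessFun) ->
    receive
        {frame, From, Tag, Frame} ->
            %% Run each frame in a monitored worker so that a crash is
            %% reported to the client instead of taking the server down.
            {Pid, MRef} =
                spawn_monitor(fun() -> exit({done, ProcessFun(Frame)}) end),
            receive
                {'DOWN', MRef, process, Pid, {done, Result}} ->
                    From ! {Tag, {ok, Result}};
                {'DOWN', MRef, process, Pid, Reason} ->
                    From ! {Tag, {error, Reason}}
            end,
            loop(ProcessFun);
        stop ->
            ok
    end.
```

A client on any connected node could call `frame_server_sketch:submit(Server, Frame)`. The monitor is the simplest form of the "monitoring and supervision facilities" mentioned above; a production system would more likely use OTP behaviours and a supervisor.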
  • OpenCL on the Parallella
      • Parallella is a little different from standard GPUs
          – Work sizes are different (fewer cores than a GPU)
          – Structuring your kernels requires some forethought
  • Parallella and Erlang
      • Ubuntu armhf packages up and running
          – Will be included in the standard distro image
      • Vision Demo code available now
          – https://github.com/esl/parcv
  • Embedded Landscape
  • #include <stats.h>
    Source: http://embedded.com/electronics-blogs/programming-pointers/4372180/Unexpected-trends
  • External Interfaces in Erlang
  • Accessing hardware
      • Peripherals are memory mapped
      • Access via /dev/mem…
          – Faster, needs root, potentially dangerous!
      • …or by kernel modules/sysfs
          – Slower, doesn’t need root, easier, relatively safer
    Generally very messy…
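As an illustration of the sysfs route, the sketch below drives a GPIO pin by writing to the files of the legacy Linux sysfs GPIO interface (`export`, `gpioN/direction`, `gpioN/value`). The module name `sysfs_gpio_sketch` and the function `set_output/3` are hypothetical, and this is not how Erlang/ALE is implemented. The base directory is passed as a parameter, which is an assumption made here so the code can be tried against an ordinary directory; on a real board it would normally be `/sys/class/gpio`.

```erlang
-module(sysfs_gpio_sketch).
-export([set_output/3]).

%% Base  : sysfs GPIO directory, normally "/sys/class/gpio".
%% Pin   : kernel GPIO number (an integer).
%% Value : 0 or 1.
set_output(Base, Pin, Value) when Value =:= 0; Value =:= 1 ->
    PinStr = integer_to_list(Pin),
    PinDir = filename:join(Base, "gpio" ++ PinStr),
    %% Ask the kernel to expose the pin. The result is ignored because
    %% the write returns an error if the pin is already exported.
    _ = file:write_file(filename:join(Base, "export"), PinStr),
    %% Configure the pin as an output, then drive it.
    ok = file:write_file(filename:join(PinDir, "direction"), "out"),
    ok = file:write_file(filename:join(PinDir, "value"),
                         integer_to_list(Value)).
```

Every call goes through the file system and the kernel, which is where the "slower" on the slide comes from; in exchange, no raw memory mapping is needed.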
  • Introducing… Erlang/ALE: Actor Library for Embedded
    http://github.com/esl/erlang-ale
  • Erlang/ALE
      • Brings embedded peripheral interfaces into the Erlang domain
      • Provides easy to use, familiar abstractions for Erlang programmers
      • Uses Raspberry Pi as reference platform, easy to port to other embedded platforms
      • Open source (Apache version 2)
  • Beta release
      • Based on pihwm
          – http://omerk.github.io/pihwm
      • GPIO and GPIO interrupts, SPI, I2C and PWM peripherals supported
      • Documentation, supporting material and educational package under development
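  • ALE Example: Blink!

        %% In the initialisation code: configure the LED pin as an output.
        {ok, _} = gpio:start_link(?LED_PIN, output),

        %% Switch the LED on for one second, then off for one second.
        blink() ->
            gpio:write(?LED_PIN, 1),
            timer:sleep(1000),
            gpio:write(?LED_PIN, 0),
            timer:sleep(1000).

  • ALE Example: Interrupts

        %% In the initialisation code: configure the input pin and
        %% request a message on every rising edge.
        {ok, _} = gpio:start_link(?IN_PIN, input),
        ok = gpio:set_int(?IN_PIN, rising),

        %% Assuming a gen_server: blink on each interrupt, then return
        %% the tuple that handle_info/2 is required to return.
        handle_info({gpio_interrupt, _Pin, _Condition}, State) ->
            blink(),
            {noreply, State}.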
  • Hardware Projects – Demo Board
  • Packages for Embedded Architectures
    https://www.erlang-solutions.com/downloads/download-erlang-otp
  • Erlang
  • Thank you
      • http://erlang-embedded.com
      • embedded@erlang-solutions.com
      • @ErlangEmbedded
    “The world is concurrent. Things in the world don't share data. Things communicate with messages. Things fail.”
    - Joe Armstrong, Father of Erlang