Julia Computing - an alternative to Hadoop

A general presentation on Julia, a language used mainly for numerical analysis and presented here as a more efficient alternative to Hadoop.

Published in: Engineering

  1. 1. Julia Computing Shaurya Shekhar (14BCE0497) Aarushi Thakral (14BCE0499)
  2. 2. • Four main creators, one of whom, Viral B. Shah, is Indian. • He started working on Julia the same day he joined UIDAI, the world’s largest biometric collection and collation initiative: Aadhaar by day, Julia by night. • MIT professors played an instrumental role.
  3. 3. The Need For Julia • MATLAB for matrix calculations and linear algebra • R for statistics • Ruby & Python for web development • All these languages serve different purposes; however, they aren’t as fast as C or Java. • Julia started off as a network simulation tool. It was meant to be ‘good at EVERYTHING’.
  4. 4. What Is Julia? • Data science is a very big deal nowadays. • It involves tonnes of data and the analysis of that data to draw meaningful inferences. • Most of the software that helps with this is proprietary, like MATLAB and Wolfram’s Mathematica. • Julia is a free, open-source alternative.
  5. 5. How Is It Faster? THE PROBLEM • Programmers use tools to translate languages like Ruby & Python into faster languages like C & Java. • That output in turn needs to be compiled into machine code, the language the machine understands. • The extra step makes things slower, adds complexity and leaves more room for error. THE SOLUTION • Julia eliminates the need for the intermediate step. • It uses LLVM, a compiler framework developed at UIUC (University of Illinois at Urbana-Champaign) and enhanced by Apple & Google.
  6. 6. How Does It Compare?
  7. 7. Alternative to Hadoop • Hadoop is a widely used data-crunching system developed largely at Yahoo and used by Facebook. • Hadoop breaks up a large problem into many smaller problems and spreads them across many systems. • Julia not only incorporates this fundamental principle of ‘design parallelism’, but also enhances it.
  8. 8. Features • Multiple Dispatch • Dynamic Programming Language • Good Implementation Speed • Metaprogramming • Built-in Package Manager • Designed for Parallelism & Cloud Computing
  9. 9. Features (Cont.) • User-defined data types as fast as built-in ones • Elegant & extensible conversions • MIT-licensed (free & open-source)
  10. 10. Multiple Dispatch • We all know what polymorphism is about. • The number and types of all arguments are analyzed to decide which implementation of a function is executed. • This differs from the function overloading we learnt in C++: there the choice is made at compile time from the static types, whereas here it is made at run time from the actual types of the arguments.
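
The dispatch rule on this slide can be sketched in a few lines of Julia. The `Shape` types and the `collide` function are hypothetical names chosen for illustration and do not come from the slides. The point is only that the method is selected from the run-time types of both arguments.

```julia
# Hypothetical types for illustration only.
abstract type Shape end
struct Circle <: Shape end
struct Square <: Shape end

# Three methods of one function, distinguished by the types of BOTH arguments.
collide(a::Shape,  b::Shape)  = "generic shape collision"
collide(a::Circle, b::Circle) = "circle-circle collision"
collide(a::Circle, b::Square) = "circle-square collision"

# The container's element type is only Shape, so the concrete types are
# known at run time; Julia still picks the most specific matching method.
shapes = Shape[Circle(), Square()]
for a in shapes, b in shapes
    println(typeof(a), " vs ", typeof(b), ": ", collide(a, b))
end
```

Pairs without a more specific method, such as a `Square` followed by a `Circle`, fall back to the generic `Shape`/`Shape` method.
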
  11. 11. Dynamic Programming Language • A term used in computer science for a class of high-level programming languages that, at run time, execute many common programming behaviors that static programming languages perform during compilation • These behaviors were first native to the Lisp language
  12. 12. Good Implementation Speed • Most of the time, close to the speed of C (the granddaddy of programming languages) • The secret behind Julia’s efficiency is: 1. Just-In-Time (JIT) compilation using the LLVM compiler framework 2. Language design
  13. 13. LLVM (Low Level Virtual Machine) • It is a collection of modular and reusable compiler and toolchain technologies • Written in C++ • Can generate relocatable machine code at compile time or link time, or even binary machine code at run time • It provides a language-independent instruction set & type system • Each instruction is in static single assignment (SSA) form, which means that each variable is assigned exactly once and never changes afterwards
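
To look at the LLVM code Julia generates, a minimal sketch follows. It assumes a current Julia 1.x install, where `code_llvm` lives in the `InteractiveUtils` standard library. The function `addone` is a made-up example, and the exact IR text varies between Julia versions, so no output is shown.

```julia
using InteractiveUtils   # provides code_llvm outside the REPL

# Hypothetical example function, not from the slides.
addone(x::Int64) = x + 1

# Ask Julia to compile addone for an Int64 argument and capture the LLVM IR.
io = IOBuffer()
code_llvm(io, addone, (Int64,))
ir = String(take!(io))

# Each %name in the printed IR is assigned exactly once (SSA form).
println(ir)
```
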
  14. 14. Metaprogramming • It is the ability to write programs that treat other programs as their data. Such a program can read, generate, analyze or transform other programs, and even modify itself while running • Technically, the program operates on code itself, which involves inspecting & modifying code as it runs • The strongest legacy of Lisp in Julia is its metaprogramming support
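
A minimal sketch of code treated as data in Julia. The expression, the variable names and the `@twice` macro are all invented for illustration. It quotes an expression, rewrites it, evaluates both versions, and then defines a small macro.

```julia
# Quote an expression: ex is an Expr object, i.e. code held as data.
ex = :(a + b * 2)

# Inspect and transform it: build the same call with - instead of +.
swapped = Expr(:call, :-, ex.args[2:end]...)

# eval runs an expression in global scope, so a and b are globals here.
a, b = 10, 3
println(eval(ex))        # 10 + 3*2
println(eval(swapped))   # 10 - 3*2

# A hypothetical macro that emits its argument twice at expansion time.
macro twice(body)
    return quote
        $(esc(body))
        $(esc(body))
    end
end

counter = Ref(0)
@twice counter[] += 1
println(counter[])
```
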
  15. 15. Built-in Package Manager • It has a built-in manager for installing add-on functionality. • All package commands live in the Pkg module, which ships with the base install. • Julia also makes it easy to reuse libraries written in other languages and supports a range of text formats. • Examples: 1. ‘ccall’ is used to call functions in C shared libraries 2. Unicode support allows mathematical symbols as operators 3. Strings support UTF-8, UTF-16, UTF-32 & ASCII 4. Markup languages like HTML & XML are also supported
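
As a sketch of the ‘ccall’ item, the snippet below calls `strlen` from the C standard library. It assumes a platform where that symbol is already loaded into the Julia process (typical on Linux and macOS). The package installation line is left as a comment because it needs network access, and `Example` is just a placeholder package name.

```julia
# Installing a package uses the Pkg module (commented out: needs network access).
# using Pkg; Pkg.add("Example")

# ccall(function, return type, tuple of argument types, arguments...)
n = ccall(:strlen, Csize_t, (Cstring,), "hello")
println(Int(n))

# Unicode identifiers and operators are ordinary Julia syntax.
θ = π / 2
println(sin(θ) ≈ 1.0)
```
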
  16. 16. Parallelism • It provides a multiprocessing environment based on message passing to allow programs to run on multiple processors in shared or distributed memory • Message passing is one-sided: the user has to explicitly manage only one processor in a two-processor operation • These operations do not look like message send & receive; instead they resemble high-level function calls • The two key notions are remote calls & remote references
  17. 17. Remote Calls & Remote References • A remote reference is an object that can be used from any processor to refer to an object stored on a particular processor • A remote call is a request by one processor to call a certain function on certain arguments on another (possibly the same) processor. A remote call returns a remote reference. • How remote calls are handled in the program flow: 1. Remote calls return immediately 2. The processor proceeds to its next operation while the remote call happens somewhere else 3. You can wait for it to finish by calling ‘wait’ on its remote reference 4. You can obtain the full value of the result with ‘fetch’
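
The four steps above can be sketched as follows. This assumes current Julia (1.x), where these functions live in the `Distributed` standard library and the remote reference returned by `remotecall` is called a `Future`. To stay self-contained, the call targets the current process (the "possibly the same" case); with extra workers started via `addprocs`, a worker id would be passed instead.

```julia
using Distributed

# Step 1: remotecall(f, process_id, args...) returns immediately
# with a remote reference (a Future), not with the result.
target = myid()                      # the current process, to keep this self-contained
fut = remotecall(+, target, 2, 3)    # request: compute 2 + 3 on process `target`

# Step 2: the caller is free to do other work here.
other_work = sum(1:10)

# Step 3: block until the remote call has finished.
wait(fut)

# Step 4: fetch retrieves the value behind the reference.
result = fetch(fut)
println(result)
```
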
  18. 18. Conversions • Conversion of values to various types is carried out by the ‘convert’ function • It is a function which accepts two arguments: the first is a type object, the second is a value to convert to that type • It is also really easy to define our own conversions
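
A short sketch of both uses of ‘convert’: the built-in form with a type object and a value, and a user-defined conversion. The `Celsius` type is a hypothetical example, not something from the slides.

```julia
# Built-in conversions: first argument is the target type, second the value.
x = convert(Float64, 3)
y = convert(UInt8, 65)
println(x, " ", typeof(x))
println(y, " ", typeof(y))

# Hypothetical user-defined type for illustration.
struct Celsius
    degrees::Float64
end

# Defining our own conversion: add a method to Base.convert.
Base.convert(::Type{Celsius}, value::Real) = Celsius(value)

# Julia calls convert implicitly when storing into a typed container.
temps = Celsius[]
push!(temps, 21.5)
println(temps[1].degrees)
```
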
  19. 19. Licences • The core of the Julia implementation is licensed under the MIT License. • Various libraries used by Julia have their own licenses. • It is an open-source language, which gives people the flexibility to modify the language to better suit their needs.
  20. 20. Plotting Capabilities • Since Julia is used to handle large amounts of data, it is natural for it to visualize data easily. • It uses various libraries to plot graphs, flow charts and pie charts, such as: 1. PyPlot, which calls Python’s matplotlib from Julia with little or no overhead (linspace) 2. Gadfly, an implementation of a different style, the grammar of graphics (draw)
  21. 21. Similar to MATLAB (using PyPlot)
  22. 22. Using Gadfly
  23. 23. Just For Fun
  24. 24. Let’s Take An Example: We Use The N-Queens Problem
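
The code from this slide is not part of the transcript. As a stand-in, here is one possible backtracking sketch that counts N-Queens solutions in Julia; the function name `count_queens` and its arguments are assumptions, not the presenters' implementation. `@time` prints the elapsed time and allocations for the call.

```julia
# Count the ways to place n queens on an n-by-n board so that none attack each other.
# cols[r] holds the column of the queen already placed in row r.
function count_queens(n::Int, row::Int=1, cols::Vector{Int}=Int[])
    row > n && return 1          # all rows filled: one complete solution
    total = 0
    for c in 1:n
        # Safe if no earlier queen shares this column or a diagonal.
        if all(cols[r] != c && abs(cols[r] - c) != row - r for r in 1:row-1)
            push!(cols, c)
            total += count_queens(n, row + 1, cols)
            pop!(cols)           # backtrack
        end
    end
    return total
end

solutions = @time count_queens(8)
println(solutions)
```
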
  25. 25. Understanding Time Notations
  26. 26. Julia In The Future
  27. 27. Thank You
