Intro to Apache Spark:
Fast cluster computing engine for
Hadoop
Intro to Scala:
Object-oriented and functional
language fo...
  • Scala 20140715

    1. Intro to Apache Spark: Fast cluster computing engine for Hadoop
       Intro to Scala: Object-oriented and functional language for the Java Virtual Machine
       ACM SIGKDD, 7/9/2014
       Roger Huang, Lead System Architect
       rohuang@visa.com, rog4096@yahoo.com, @BigDataWrangler
    2. Intro to Spark: Intro to Scala | 7/9/2014
       About me: Roger Huang
       • Visa
         – Digital & Mobile Products Architecture, Strategic Projects & Infrastructure
         – Search infrastructure
         – Customer segmentation
         – Logging framework
         – Splunk on Hadoop (Hunk)
         – Real-time monitoring
         – Data
       • PayPal
         – Java infrastructure
    3. Different perspectives on an elephant Scala
    4. Outline
       • Spark
         – Hadoop ecosystem
       • Scala
         – Background
       • Why Scala?
         – For the computer scientist
         – For the Java / OO programmer
         – For the Spark developer
         – For the Big Data developer
         – For the Big Data scientist / mathematician
         – For the system architect
    5. Spark in the Hadoop ecosystem
    6. Spark Ecosystem of Software Projects
       • Spark [Ognen]
         – APIs: Scala, Python [Robert], Java
       • “SQL”
         – Shark (Hive + Spark) [Roger]
         – SparkSQL (alpha)
       • Machine Learning Library (MLlib) [Omar]
         – Clustering
         – Classification
           • Binary classification
           • Linear regression
         – Recommendations
       • Spark Streaming [Chance]
       • GraphX [Srini]
       • …
    7. Resilient Distributed Dataset
       • Fault-tolerant collection of elements, partitioned across the nodes of the cluster, that can be operated on in parallel
       • Data sources for RDDs
         – Parallelized collections
           • From Scala collections
         – Hadoop datasets
           • From HDFS or any Hadoop-supported storage system (HBase, Amazon S3, …)
           • Text files, SequenceFile, any Hadoop InputFormat
       • Two types of operations
         – Transformation
           • Takes an existing dataset and creates a new one
         – Action
           • Takes a dataset, runs a computation, and returns a value to the driver program
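The transformation/action split can be illustrated without a cluster. The sketch below is an analogy built on a plain Scala collection view, not on Spark itself: as with RDD transformations, which Spark evaluates lazily, `view.map` and `filter` only describe a new dataset, and nothing runs until a result is requested (here `sum`, standing in for an action such as `reduce`). The object and value names are made up for illustration.

```scala
object LazyPipelineSketch {
  // Counts how many times the mapping function actually runs.
  var calls = 0

  // "Transformations": describe a new dataset; nothing is computed yet.
  val doubledMultiplesOfFour = List(1, 2, 3, 4).view
    .map { x => calls += 1; x * 2 }
    .filter(_ % 4 == 0)

  // Snapshot taken before any result has been requested.
  val callsBeforeAction = calls

  // "Action": forces the computation and returns a value to the caller.
  val total = doubledMultiplesOfFour.sum

  def main(args: Array[String]): Unit =
    println("calls before action = " + callsBeforeAction + ", total = " + total)
}
```

In Spark the same shape would be a chain of `map` and `filter` calls on an RDD followed by an action, with the work distributed across partitions rather than run in one JVM.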
    8. (Some) RDD Operations
       • Transformations
         – map(func)
         – filter(func)
         – flatMap(func)
         – mapPartitions(func)
         – mapPartitionsWithIndex(func)
         – sample(withReplacement, fraction, seed)
         – union(otherDataset)
         – distinct()
         – groupByKey()
         – reduceByKey(func)
         – sortByKey()
         – join(otherDataset)
         – cogroup(otherDataset)
         – cartesian(otherDataset)
       • Actions
         – reduce(func)
         – collect()
         – count()
         – first()
         – take(n)
         – takeSample(withReplacement, num, seed)
         – saveAsTextFile(path)
         – saveAsSequenceFile(path)
         – countByKey()
         – foreach(func)
         – …
    9. Scala background
       • Scalable, object-oriented, functional language
         – Version 2.11 (4/2014)
       • Runs on the Java Virtual Machine
       • Martin Odersky
         – javac
         – Java generics
       • http://scala-lang.org/, REPL
       • http://www.scala-lang.org/api/current
       • http://scala-ide.org/
       • http://www.scala-sbt.org/, Simple Build Tool
       • Who’s using Scala?
         – Twitter, LinkedIn, …
       • Powered by Scala
         – Apache Spark, Apache Kafka, Akka, …
    10. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Hadoop/Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    11. Scala for the computer scientist: functional programming (FP)
    12. Scala for the computer scientist: functional programming (FP)
        • Math functions, e.g., f(x) = y
          – A function has a single responsibility
          – A function has no side effects
          – A function is referentially transparent
            • A function outputs the same value for the same inputs.
        • Functional programming
          – Expresses computation as the evaluation and composition of mathematical functions
          – Avoids side effects and mutable state
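To make "no side effects" and "referentially transparent" concrete, here is a minimal sketch contrasting a pure function with an impure one. The names `square`, `nextId`, and `counter` are invented for this illustration.

```scala
object PuritySketch {
  // Pure: the result depends only on the argument and nothing else is changed.
  def square(x: Int): Int = x * x

  // Impure: reads and mutates state outside the function (a side effect),
  // so the same input can produce different outputs.
  var counter = 0
  def nextId(prefix: String): String = {
    counter += 1
    prefix + "-" + counter
  }

  def main(args: Array[String]): Unit = {
    println(square(4) == square(4))             // same input, same output
    println(nextId("order") == nextId("order")) // differs: depends on hidden state
  }
}
```

Because `square(4)` can always be replaced by its value, calls to it can be reordered or run in parallel safely; the same is not true of `nextId`.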
    13. Why functional programming?
        • Multi-core processors
        • Concurrency
          – Computation as a series of independent data transformations
          – Parallel data transformations without side effects
        • Referential transparency
    14. Scala for the computer scientist: functional programming
        • Functions
          – Lambda, closure
        • For-comprehensions
        • Type inference
        • Pattern matching
        • Higher order functions
          – map, flatMap, foldLeft
        • And more …
    15. FP: functions
        • Anonymous function
          – Function without a name
          – Lambda function
        • Example
          scala> List(100, 200, 300) map { _ * 10/100}
          res0: List[Int] = List(10, 20, 30)
        • Closure (Wikipedia)
          – Closure = a function, together with a referencing environment: a table storing a reference to each of the non-local variables of that function.
          – A closure allows a function to access those non-local variables even when invoked outside its immediate lexical scope.
    16. FP: functions
        • applyPercentage is an example of a closure
          scala> var percentage = 10
          percentage: Int = 10
          scala> val applyPercentage = (amount: Int) => amount * percentage / 100
          applyPercentage: Int => Int = <function1>
          scala> percentage = 20
          percentage: Int = 20
          scala> List(100, 200, 300) map applyPercentage
          res1: List[Int] = List(20, 40, 60)
    17. FP: functions
        • Anonymous function
        • Closure
    18. FP: Higher order functions
          scala> :load Person.scala
          Loading Person.scala...
          defined class Person
          scala> val jd = new Person("John", "Doe", 17)
          jd: Person = Person@372a6e85
          scala> val rh = new Person("Roger", "Huang", 34)
          rh: Person = Person@611c4041
          scala> val people = Array(jd, rh)
          people: Array[Person] = Array(Person@372a6e85, Person@611c4041)
          scala> val (minors, adults) = people partition (_.age < 18)
          minors: Array[Person] = Array(Person@372a6e85)
          adults: Array[Person] = Array(Person@611c4041)
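The session above loads `Person.scala` without showing its contents. A minimal sketch of what such a file might contain follows; the field names `firstName` and `lastName` are assumptions (only `age` appears in the session), and a plain class is used rather than a case class because the REPL prints the default `Person@...` form.

```scala
// Hypothetical contents of Person.scala; firstName and lastName are assumed names.
class Person(val firstName: String, val lastName: String, val age: Int)

object PartitionSketch {
  val people = Array(new Person("John", "Doe", 17), new Person("Roger", "Huang", 34))

  // partition is a higher-order function: it takes the predicate (_.age < 18)
  // and returns a pair (elements that satisfy it, elements that do not).
  val (minors, adults) = people.partition(_.age < 18)

  def main(args: Array[String]): Unit = {
    println("minors: " + minors.map(_.firstName).mkString(", "))
    println("adults: " + adults.map(_.firstName).mkString(", "))
  }
}
```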
    19. FP: Higher order functions
        • A higher order function (HOF)
          – Takes a function as an argument, or
          – Returns a function
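Both halves of that definition can be sketched in a few lines. The function names `twice` and `percentageOf` are invented for this example; `percentageOf` echoes the percentage calculation used on the closure slide.

```scala
object HigherOrderSketch {
  // Takes a function as an argument: applies f two times in a row.
  def twice(f: Int => Int)(x: Int): Int = f(f(x))

  // Returns a function: builds a percentage calculator for a fixed rate.
  def percentageOf(rate: Int): Int => Int = amount => amount * rate / 100

  val tenPercent: Int => Int = percentageOf(10)

  def main(args: Array[String]): Unit = {
    println(twice(_ + 3)(10))
    println(List(100, 200, 300).map(tenPercent))
  }
}
```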
    20. FP: Higher order functions: map
        • Creates a new collection from an existing collection by applying a function
        • Anonymous function
          scala> List(1, 2, 3) map { (x: Int) => x + 1 }
          res0: List[Int] = List(2, 3, 4)
        • Function literal
          scala> List(1, 2, 3) map { _ + 1 }
          res1: List[Int] = List(2, 3, 4)
        • Passing an existing function
          scala> def addOne(num: Int) = num + 1
          addOne: (num: Int)Int
          scala> List(1, 2, 3) map addOne
          res2: List[Int] = List(2, 3, 4)
    21. FP: Higher order functions: map
    22. FP: Higher order functions: flatMap
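This slide has only a title in the transcript, so as a stand-in here is a small sketch of how `flatMap` differs from `map` on a Scala collection. The sample data and value names are made up.

```scala
object FlatMapSketch {
  val lines = List("to be", "or not")

  // map keeps one result per input element, so splitting yields a list of lists.
  val nested: List[List[String]] = lines.map(_.split(" ").toList)

  // flatMap applies the same function and then flattens the result by one level.
  val words: List[String] = lines.flatMap(_.split(" ").toList)

  def main(args: Array[String]): Unit = {
    println(nested)
    println(words)
  }
}
```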
    23. FP: for-comprehension
        • Syntax
          – for ( <generator> | <guard> ) <expression> [yield] <expression>
        • Types
          – Imperative form. Does not return a value.
          scala> val aList = List(1, 2, 3)
          aList: List[Int] = List(1, 2, 3)
          scala> val bList = List(4, 5, 6)
          bList: List[Int] = List(4, 5, 6)
          scala> for { a <- aList; if (a < 2); b <- bList; if (b < 7) } println( a + b )
          5
          6
          7
    24. FP: for-comprehension
        • Syntax
          – for ( <generator> | <guard> ) <expression> [yield] <expression>
        • Types
          – Functional form (a.k.a. sequence comprehension). Returns/yields a value.
          scala> for { a <- aList; b <- bList } yield a + b
          res0: List[Int] = List(5, 6, 7, 6, 7, 8, 7, 8, 9)
          scala> res0.take(1)
          res1: List[Int] = List(5)
          scala> for { a <- aList; if (a < 2); b <- bList } yield a + b
          res2: List[Int] = List(5, 6, 7)
    25. FP: for-comprehension
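A for-comprehension with `yield` is syntactic sugar over `withFilter`, `flatMap`, and `map`. The sketch below rewrites the last expression from the previous slide by hand to show the correspondence; the object and value names are illustrative.

```scala
object ForDesugarSketch {
  val aList = List(1, 2, 3)
  val bList = List(4, 5, 6)

  // The functional form from the slide.
  val viaFor: List[Int] = for { a <- aList; if a < 2; b <- bList } yield a + b

  // Roughly what the compiler generates: guard -> withFilter,
  // outer generator -> flatMap, innermost generator with yield -> map.
  val viaMethods: List[Int] =
    aList.withFilter(a => a < 2).flatMap(a => bList.map(b => a + b))

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaMethods)
  }
}
```

This is also why any type that defines these methods, including Spark's RDD for single-generator cases, can appear in a for-comprehension.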
    26. FP: foldLeft
          scala> val numbers = 1.to(10)
          numbers: scala.collection.immutable.Range.Inclusive = Range(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
          scala> def add( a:Int, b:Int ): Int = { a + b }
          add: (a: Int, b: Int)Int
          scala> numbers.foldLeft(0){ add }
          res0: Int = 55
          scala> numbers.foldLeft(0){ (acc, b) => acc + b }
          res1: Int = 55
    27. FP: foldLeft
    28. FP: find the last item in an array
          scala> val ns = Array(20, 40, 60)
          ns: Array[Int] = Array(20, 40, 60)
          scala> ns.foldLeft(ns.head) {(acc, b) => b}
          res0: Int = 60
    29. FP: reverse an array w/ foldLeft
          scala> val ns = Array(20, 40, 60)
          ns: Array[Int] = Array(20, 40, 60)
          scala> ns.foldLeft( Array[Int]() ) { (acc, b) => b +: acc}
          res1: Array[Int] = Array(60, 40, 20)
    30. FP: reverse an array w/ foldLeft
    31. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    32. Scala for the Java / OO developer:
        • Interoperable w/ Java
        • Case classes
        • Mixins with traits
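As a sketch of "mixins with traits", the example below mixes a trait into a single instance at construction time. The trait and class names (`Greeter`, `Shouter`, `Employee`) are invented for illustration.

```scala
object MixinSketch {
  // A trait can carry both abstract members and concrete behavior.
  trait Greeter {
    def name: String
    def greet: String = "Hello, " + name
  }

  // A second trait that refines the behavior of the first.
  trait Shouter extends Greeter {
    override def greet: String = super.greet.toUpperCase
  }

  class Employee(val name: String) extends Greeter

  val plain = new Employee("Ada")
  // Mix Shouter into this one instance only.
  val loud = new Employee("Ada") with Shouter

  def main(args: Array[String]): Unit = {
    println(plain.greet)
    println(loud.greet)
  }
}
```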
    33. Scala for the Java / OO developer:
        • case class
          – Implements equals(), hashCode(), toString()
          – Can be used in pattern matching
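A short sketch of what a case class provides for free, compared with the plain `Person` class used earlier in the deck. The field names and the `describe` function are assumptions made for this example.

```scala
object CaseClassSketch {
  case class Person(firstName: String, lastName: String, age: Int)

  val a = Person("John", "Doe", 17) // no `new` needed: a companion apply is generated
  val b = Person("John", "Doe", 17)

  val sameValue: Boolean = a == b                  // structural equals()
  val sameHash: Boolean = a.hashCode == b.hashCode // consistent hashCode()
  val printed: String = a.toString                 // readable toString()

  // The generated extractor lets a Person be taken apart in a pattern match.
  def describe(p: Person): String = p match {
    case Person(first, _, age) if age < 18 => first + " is a minor"
    case Person(first, _, _)               => first + " is an adult"
  }

  def main(args: Array[String]): Unit = {
    println(printed)
    println(describe(a))
  }
}
```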
    34. Scala for the Java / OO developer:
        • http://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html
        • map
          – <R> Stream<R> map(Function<? super T,? extends R> mapper)
          – Returns a stream consisting of the results of applying the given function to the elements of this stream. This is an intermediate operation.
        • flatMap
          – <R> Stream<R> flatMap(Function<? super T,? extends Stream<? extends R>> mapper)
          – Returns a stream consisting of the results of replacing each element of this stream with the contents of a mapped stream produced by applying the provided mapping function to each element. Each mapped stream is closed after its contents have been placed into this stream. (If a mapped stream is null an empty stream is used, instead.) This is an intermediate operation.
    35. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    36. Scala for the Spark developer
        • ResilientDistributedDataset (RDD)
        • A Resilient Distributed Dataset (RDD), the basic abstraction in Spark. Represents an immutable, partitioned collection of elements that can be operated on in parallel. This class contains the basic operations available on all RDDs, such as map, filter, and persist.
        • http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.rdd.RDD
    37. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    38. Scala for the Big Data developer
        • Spark
          – Programming API in Scala
          – Implemented in Scala
        • Scalding
          – Scala DSL on top of Cascading
          – Cascading: data processing API and query planner used for defining, sharing, and executing data-processing workflows
          – Abstractions: tuples, pipes, source/sink taps
        • Algebird
        • Summingbird
          – Library that lets you write MapReduce programs that look like native Scala or Java collection transformations
          – Executes them on a number of well-known distributed MapReduce platforms, including Storm and Scalding
    39. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Hadoop/Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    40. Scala for the Big Data scientist / mathematician
        • Monoid
          – If you want to “attach” operations such as +, -, *, / or <= to data objects (e.g., Bloom filters), then you want to provide monoid forms of those data objects
          – Consists of
            • A set of objects
            • A binary operation that satisfies the monoid axioms
        • Monad
          – If you want to create a data processing pipeline that transforms the state of a data object
          – Composition
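The monoid axioms are associativity of the binary operation and the existence of an identity element. A minimal sketch follows; the `Monoid` trait here is written for illustration and is not the API of Algebird or any other library.

```scala
object MonoidSketch {
  // Illustrative definition: a set of values A, an identity element, and an
  // associative binary operation.
  trait Monoid[A] {
    def zero: A
    def plus(x: A, y: A): A
  }

  val intAddition: Monoid[Int] = new Monoid[Int] {
    def zero: Int = 0
    def plus(x: Int, y: Int): Int = x + y
  }

  val setUnion: Monoid[Set[String]] = new Monoid[Set[String]] {
    def zero: Set[String] = Set.empty[String]
    def plus(x: Set[String], y: Set[String]): Set[String] = x union y
  }

  // Works for any monoid: fold starting from the identity element.
  def combineAll[A](xs: Seq[A])(m: Monoid[A]): A = xs.foldLeft(m.zero)(m.plus)

  def main(args: Array[String]): Unit = {
    println(combineAll(List(1, 2, 3, 4))(intAddition))
    println(combineAll(List(Set("a"), Set("b"), Set("a")))(setUnion))
  }
}
```

Associativity is what makes this useful for big data: partial results computed on separate partitions can be combined in any grouping and still give the same answer.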
    41. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Hadoop/Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    42. Scala for the system architect
        • Concurrency
        • Problem:
          – Threads
          – Shared mutable state
          – Locks
        • Solution:
          – Message-passing concurrency w/ Actors
          – Future, Promise
        • Abstractions
          – Actor
            • An object that processes a message
            • Encapsulates state (state not shared)
          – ActorRef
          – Message, usually sent asynchronously
          – Mailbox
          – ActorSystem
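Actors need the Akka library, but `Future` ships with the Scala standard library, so the "Future, Promise" half of the solution can be sketched without extra dependencies. The function `priceOf` and its fake computation are made up for this example.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object FutureSketch {
  // Runs on a thread from the global execution context; no shared mutable state.
  def priceOf(item: String): Future[Int] = Future { item.length * 10 }

  def total(): Future[Int] = {
    val book = priceOf("book") // both futures are started here,
    val pen = priceOf("pen")   // before either result is needed
    // Futures compose with for-comprehensions instead of locks.
    for { a <- book; b <- pen } yield a + b
  }

  def main(args: Array[String]): Unit =
    // Blocking with Await is only for the demo; real code would keep composing.
    println(Await.result(total(), 5.seconds))
}
```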
    43. Scala for the system architect: Akka
        • Fault tolerance
          – Supervision
          – Strategies
            • Resume, restart, stop, escalate, …
        • Scale out: remote actors
          – Via configuration
    44. Scala for the system architect
        • Parallel collections
          scala> import scala.collection.parallel.immutable._
          import scala.collection.parallel.immutable._
          scala> ParVector(10, 20, 30, 40, 50, 60, 70, 80, 90).map { x =>
               | println( Thread.currentThread.getName); x / 2 }
          ForkJoinPool-1-worker-13
          ForkJoinPool-1-worker-1
          ForkJoinPool-1-worker-1
          ForkJoinPool-1-worker-9
          ForkJoinPool-1-worker-11
          ForkJoinPool-1-worker-5
          ForkJoinPool-1-worker-3
          ForkJoinPool-1-worker-15
          ForkJoinPool-1-worker-7
          res0: scala.collection.parallel.immutable.ParVector[Int] = ParVector(5, 10, 15, 20, 25, 30, 35, 40, 45)
    45. Sequential collections
    46. Parallel collections
    47. Outline
        • Spark
          – Hadoop ecosystem
        • Scala
          – Background
        • Why Scala?
          – For the computer scientist
          – For the Java / OO programmer
          – For the Spark developer
          – For the Big Data developer
          – For the Big Data scientist / mathematician
          – For the system architect
    48. Different perspectives on an elephant Scala
    49. Spark in the Hadoop ecosystem
    50. References
        • http://scala-lang.org/
        • Scala in Action, Nilanjan Raychaudhuri
        • Grokking Functional Programming, Aslam Khan
        • Michael Noll
    51. Intro to Apache Spark: Fast cluster computing engine for Hadoop
        Intro to Scala: Object-oriented and functional language for the Java Virtual Machine
        ACM SIGKDD, 7/9/2014
        Roger Huang, Lead System Architect, Digital & Mobile Products Architecture
        rohuang@visa.com, rog4096@yahoo.com
