Fast and Simple
Statistics with Scala
@xxxnell
Problems of KDE
- Slow: or more
- Large memory consumption:
- Requires prior knowledge of the dataset
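To make these costs concrete, here is a minimal sketch of a textbook Gaussian kernel density estimator. It is not code from the talk or from Flip; the names `KdeSketch`, `gaussian`, `kde` and the `bandwidth` parameter are assumptions for illustration. Every query visits every stored sample, all samples must be kept in memory, and the bandwidth has to be chosen up front.

```scala
object KdeSketch {
  // Standard Gaussian kernel (zero mean, unit variance).
  def gaussian(u: Double): Double =
    math.exp(-0.5 * u * u) / math.sqrt(2 * math.Pi)

  // Density estimate at x. Each call touches every stored sample,
  // and `bandwidth` must be picked before any data is seen.
  def kde(samples: Vector[Double], bandwidth: Double)(x: Double): Double =
    samples.map(s => gaussian((x - s) / bandwidth)).sum / (samples.size * bandwidth)
}
```

For example, `KdeSketch.kde(Vector(-1.0, 1.0), 0.5)(0.0)` evaluates the estimate at 0.0 from two samples.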
'Adaptive histogram (Sketch)' solves the problems:
- Fast:
- Lightweight:
- Does NOT require prior knowledge of the dataset
Category of probability distributions
Functions of Dist (simply D)
def probability[A](dist: D[A], start: A, end: A): Double
Functions of Histogram (simply H)
def probability[A](hist: H[A], start: A, end: A): Double
def update[A](hist: H[A], as: List[A]): H[A]
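The two histogram operations can be sketched with a plain fixed-bin histogram over doubles. This is only an illustration under stated assumptions: `HistogramSketch`, `Hist`, `lower`, `upper` and `empty` are hypothetical names, not Flip's `H[A]`, and the bin range is fixed in advance, which an adaptive histogram is meant to avoid.

```scala
object HistogramSketch {
  // Hypothetical fixed-bin histogram over [lower, upper); not Flip's H[A].
  final case class Hist(lower: Double, upper: Double, counts: Vector[Long]) {
    private val width: Double = (upper - lower) / counts.size

    // Returns a new histogram; samples outside [lower, upper) are dropped.
    def update(as: List[Double]): Hist =
      as.foldLeft(this) { (h, a) =>
        if (a < lower || a >= upper) h
        else {
          val i = math.min(((a - lower) / width).toInt, counts.size - 1)
          h.copy(counts = h.counts.updated(i, h.counts(i) + 1))
        }
      }

    // Probability of [start, end], assuming mass is uniform inside each bin.
    def probability(start: Double, end: Double): Double = {
      val total = counts.sum
      if (total == 0) 0.0
      else counts.zipWithIndex.map { case (c, i) =>
        val lo = lower + i * width
        val hi = lo + width
        val overlap = math.max(0.0, math.min(end, hi) - math.max(start, lo))
        c * overlap / width
      }.sum / total
    }
  }

  def empty(lower: Double, upper: Double, bins: Int): Hist =
    Hist(lower, upper, Vector.fill(bins)(0L))
}
```

Usage: `HistogramSketch.empty(0.0, 4.0, 4).update(List(0.5, 1.5, 1.6, 3.2)).probability(1.0, 2.0)` asks for the share of samples that fell into the second bin.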
Functions of Sketch (simply S)
def probability[A](sketch: S[A], start: A, end: A): Double
def update[A](sketch: S[A], as: List[A]): H[A]
def narrowUpdate[A](sketch: S[A], as: List[A]): S[A]
def deepUpdate[A](sketch: S[A], as: List[A]): S[A]
Functions of Sketch (simply S)
def probability[A](sketch: S[A], start: A, end: A): Double
def narrowUpdate[A](sketch: S[A], as: List[A]): S[A]
def deepUpdate[A](sketch: S[A], as: List[A]): S[A]
import flip.implicits._
// draw 100 samples from the standard normal distribution
val underlying0 = NumericDist.normal(0.0, 1.0)
val (underlying1, samples) = underlying0.samples(100)
// feed the samples into the sketch one by one
val sketch0 = Sketch.empty[Double]
val sketch1 = samples.foldLeft(sketch0) {
  case (sketch, sample) ⇒ sketch.update(sample)
}
// compare the estimated and exact probability for the interval [0.0, 1.0]
println("result: " + sketch1.probability(0.0, 1.0))
println("expected: " + underlying1.probability(0.0, 1.0))
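For reference, the exact probability that a standard normal variable falls in [0.0, 1.0] is about 0.3413, so the "expected" line should print a value close to that, while the sketch built from only 100 samples will deviate somewhat. The check below does not use Flip; `NormalIntervalSketch`, `pdf` and `integrate` are hypothetical helpers that integrate the normal density with the trapezoidal rule.

```scala
object NormalIntervalSketch {
  // Density of the standard normal distribution.
  def pdf(x: Double): Double =
    math.exp(-0.5 * x * x) / math.sqrt(2 * math.Pi)

  // Composite trapezoidal rule on [a, b] with n equal steps.
  def integrate(f: Double => Double, a: Double, b: Double, n: Int = 1000): Double = {
    val h = (b - a) / n
    val inner = (1 until n).map(i => f(a + i * h)).sum
    h * ((f(a) + f(b)) / 2 + inner)
  }

  // Probability mass of the standard normal on [0.0, 1.0].
  val expected: Double = integrate(pdf, 0.0, 1.0)
}
```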
// probability for interval [0.0, 1.0]
sketch.probability(0.0, 1.0)
// probability density at 0.0
sketch.pdf(0.0)
// median
sketch.median
// 100 random samples
sketch.samples(100)
Rolling one die
Rolling two dice
for {
  n1 ← diceDist
  n2 ← diceDist
} yield n1 + n2
A probability distribution is a monad
// premises
def pure[A](a: A): Dist[A]
def flatMap[A, B](f: Dist[A], g: A ⇒ Dist[B]): Dist[B]
// proposition
def map[A, B](f: Dist[A], g: A ⇒ B): Dist[B] =
  flatMap(f, (a: A) ⇒ pure(g(a)))
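A small self-contained sketch can show both the two-dice comprehension and the derivation of `map` from `flatMap` and `pure`. It uses a finite distribution stored as a map from outcome to probability; this `Dist` is a hypothetical stand-in written for illustration, not Flip's `Dist`, and it uses method syntax so that Scala's `for` comprehension applies.

```scala
object DiceSketch {
  // Hypothetical finite distribution: outcome -> probability. Not Flip's Dist.
  final case class Dist[A](probs: Map[A, Double]) {
    // Weight every outcome of g(a) by the probability of a, then merge equal outcomes.
    def flatMap[B](g: A => Dist[B]): Dist[B] = {
      val weighted = for {
        (a, p) <- probs.toList
        (b, q) <- g(a).probs.toList
      } yield (b, p * q)
      Dist(weighted.groupBy(_._1).map { case (b, ps) => b -> ps.map(_._2).sum })
    }

    // map derived from flatMap and pure, as in the proposition.
    def map[B](g: A => B): Dist[B] = flatMap[B](a => Dist.pure(g(a)))

    def probability(a: A): Double = probs.getOrElse(a, 0.0)
  }

  object Dist {
    // A distribution that puts all mass on a single value.
    def pure[A](a: A): Dist[A] = Dist(Map(a -> 1.0))
  }

  // A fair six-sided die.
  val diceDist: Dist[Int] = Dist((1 to 6).map(n => n -> 1.0 / 6).toMap)

  // Sum of two independent rolls.
  val twoDice: Dist[Int] = for {
    n1 <- diceDist
    n2 <- diceDist
  } yield n1 + n2
}
```

`DiceSketch.twoDice.probability(7)` is the probability of rolling a total of seven, which is 6 of the 36 equally likely pairs.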
pure
def pure[A](a: A): Dist[A]
flatMap
def flatMap[A, B](f: Dist[A], g: A ⇒ Dist[B]): Dist[B]
Experiment result of flatMap
sketch.flatMap(x ⇒ Normal(x, 1.5))
map is a domain transformation
// translation transformation
sketch.map(x ⇒ x + 1)
// scaling transformation
sketch.map(x ⇒ x * 2)
// reflection transformation
sketch.map(x ⇒ x * -1)
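The effect of these three transformations can be checked on a toy empirical distribution that simply keeps its raw samples. `Empirical` below is a hypothetical stand-in for a sketch, written for illustration only: translation shifts the mean, scaling multiplies it, and reflection flips its sign.

```scala
object MapTransformSketch {
  // Hypothetical stand-in for a sketch: an empirical distribution kept as raw samples.
  final case class Empirical(samples: Vector[Double]) {
    // Transforming the domain means transforming every sample.
    def map(f: Double => Double): Empirical = Empirical(samples.map(f))
    def mean: Double = samples.sum / samples.size
  }

  val base = Empirical(Vector(-1.0, 0.0, 1.0, 2.0))

  val translated = base.map(x => x + 1)  // translation
  val scaled     = base.map(x => x * 2)  // scaling
  val reflected  = base.map(x => x * -1) // reflection
}
```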
Experiment result of map
sketch.map(x ⇒ math.exp(x))
Average speed for three different speed models
for {
  speedA ← speedSketchA
  speedB ← speedSketchB
  speedC ← speedSketchC
} yield (speedA + speedB + speedC) / 3
Average height for male and female
for {
  gender ← genderDist
  height ← gender match {
    case Male ⇒ maleHeightDist
    case Female ⇒ femaleHeightDist
  }
} yield height
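The same comprehension can be run against a sampling-based distribution, where a distribution is just a function from a random source to a value. Everything here is an assumption for illustration: this `Dist`, the `normal` helper, and in particular the height parameters, which are arbitrary placeholders and not data from the talk.

```scala
import scala.util.Random

object MixtureSketch {
  // Hypothetical sampling-based distribution; not Flip's Dist.
  final case class Dist[A](sample: Random => A) {
    // Draw a, then draw from the distribution that a selects.
    def flatMap[B](g: A => Dist[B]): Dist[B] = Dist(r => g(sample(r)).sample(r))
    def map[B](g: A => B): Dist[B] = Dist(r => g(sample(r)))
  }

  sealed trait Gender
  case object Male extends Gender
  case object Female extends Gender

  def normal(mean: Double, sd: Double): Dist[Double] =
    Dist(r => mean + sd * r.nextGaussian())

  val genderDist: Dist[Gender] = Dist(r => if (r.nextBoolean()) Male else Female)

  // Placeholder parameters chosen only for this example.
  val maleHeightDist: Dist[Double] = normal(175.0, 7.0)
  val femaleHeightDist: Dist[Double] = normal(162.0, 6.0)

  // Mixture of the two height distributions, weighted by genderDist.
  val heightDist: Dist[Double] = for {
    gender <- genderDist
    height <- gender match {
      case Male => maleHeightDist
      case Female => femaleHeightDist
    }
  } yield height
}
```

Calling `MixtureSketch.heightDist.sample(new Random(42))` draws one height from the mixture; averaging many draws approximates the mean height over both groups.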
Flip: Fast, Lightweight library for Information and Probability
- Fastest and most lightweight
- Purely functional
- Unique and high-level open source
- GitHub: https://github.com/xxxnell/flip
Conclusion
Further Readings
- Kernel Density Estimation in Spark
- A frequentist approach to probability
- Probability Distribution Monad (code)
- Foundations of the Giry Monad
- Platform for statistical modeling
- A library for probabilistic modeling on TF