Computing
Probabilities With R
mining the patterns in lottery
Chia-Chi@MLDM Monday
20160912
Sample Codes
https://github.com/c3h3/mldm20120912
ETL in R
%>% + dplyr + tidyr
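Before the talk begins
I'd like to poll the audience
In the past year ...
Who has bought a lottery ticket?
When you buy a lottery ticket ...
Who doesn't know which numbers came up most often that year?
Who always lets the computer pick the numbers?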
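Do you believe ...
lottery numbers follow a pattern?
Do you believe ...
lottery numbers follow no pattern?
For those who don't believe it ...
Did you check, experiment, find no pattern, and only then stop believing?
Or did you never check, never experiment, and simply not believe?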
For a gambler ...
the two most important things
Risk management
Asymmetric information
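Types of risk?
● Misreading the market (Prediction)
○ Not betting when you should
○ Betting heavily when you shouldn't
● Unbalanced bet sizing (Position Sizing)
○ Betting too much when losing
○ Betting too little when winning
● Falling into negative expected value without realizing it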
The core of risk management
(1) Arbitrage (locked-in gain) versus locked-in loss
(2) Win rate versus payoff odds
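Take stocks as an example
What is the only rule for making money?
The only rule for making money?
Buy low, sell high
The only rule for making money?
● When the market is rising: buy low first, then sell high
● When the market is falling: sell high first, then buy back low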
There are two kinds of traders in the market
Buy-Side / Sell-Side
Trend Follower / Mean Reversion
high p, low WLR / low p, High WLR
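What counts as high? What counts as low?
The standard is ......?
Two kinds of traders:
● Type I (Trend Follower)
○ Trades with the trend: buys into rallies, sells into declines
○ Low win rate, high payoff
● Type II (Mean Reversion)
○ Trades against the trend: buys on down days, sells on up days
○ High win rate, low payoff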
Locked-in loss versus arbitrage (locked-in gain)
E = pW - (1-p)L - T > 0
Locked-in loss versus arbitrage (locked-in gain)
Assume T = 0, WLR = W/L
p > 1 / (1+WLR)
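The step from the expectation condition to the win-rate threshold can be written out as below. This is a sketch that assumes p is the probability of a winning bet, W the amount won, L the amount lost (both positive), and T the transaction cost per bet; the slides use these symbols without defining them.

```latex
\begin{aligned}
E = pW - (1-p)L &> 0 \qquad (T = 0) \\
p\,(W + L) &> L \\
p &> \frac{L}{W + L} = \frac{1}{1 + W/L} = \frac{1}{1 + \mathrm{WLR}}
\end{aligned}
```

For instance, with WLR = 2 a bet has positive expectation only when p > 1/3.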
For a gambler ...
the two most important things
Risk management
Asymmetric information
Take coin tossing as an example ...
Do you believe ...
heads and tails are equally likely?
What do you think ...
the probabilities of heads and tails are?
What affects these probabilities?
What does everyone think ...
P(H) = ? and P(T) = ?
P(H| ?? ) = ? and P(T | ??) = ?
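One way to make these questions concrete is to estimate both quantities from data. The sketch below uses base R and a simulated fair coin; the coin's bias, the sample size, and the choice of "previous toss was heads" as the conditioning event are all assumptions made for illustration, not something the slides specify.

```r
# Simulate coin tosses, then estimate P(H) and P(H | previous toss was H).
# Fair coin and sample size are illustrative assumptions.
set.seed(42)
n <- 10000
tosses <- sample(c("H", "T"), size = n, replace = TRUE, prob = c(0.5, 0.5))

# Unconditional estimate: relative frequency of heads
p_h <- mean(tosses == "H")

# Conditional estimate: among tosses preceded by H, how often is the toss H?
prev <- tosses[-n]   # toss at time t-1
curr <- tosses[-1]   # toss at time t
p_h_given_h <- mean(curr[prev == "H"] == "H")

cat("P(H)     ~", round(p_h, 3), "\n")
cat("P(H | H) ~", round(p_h_given_h, 3), "\n")
```

For independent tosses the two estimates should agree up to sampling noise; a gap that persists as n grows would suggest the conditioning event carries information.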
What does everyone think ...
What is "probability"?
(This is actually one of the most important questions in today's talk!)
Spatial patterns in the lottery
How many times does each number appear?
tidyr::
gather ->
<- spread
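Counting how often each number appears is mostly a reshaping problem: draws usually arrive one row per draw, while counting wants one row per drawn number. A minimal sketch follows, assuming a hypothetical 6-out-of-49 game with simulated draws; the column names (`draw_id`, `n1` to `n6`) are made up and not taken from the talk's sample code.

```r
library(dplyr)
library(tidyr)

set.seed(1)
n_draws <- 200
# Hypothetical data: each row is one draw of 6 distinct numbers out of 1..49
draws <- t(replicate(n_draws, sort(sample(1:49, 6))))
colnames(draws) <- paste0("n", 1:6)
wide <- data.frame(draw_id = seq_len(n_draws), draws)

# Wide -> long: one row per (draw, position, number)
long <- wide %>% gather(key = "position", value = "number", -draw_id)

# Count how many times each number appeared, most frequent first
freq <- long %>%
  group_by(number) %>%
  summarise(times = n()) %>%
  arrange(desc(times))

print(head(freq))
```

`gather()` and `spread()` still work in current tidyr, although `pivot_longer()` and `pivot_wider()` are now the recommended replacements.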
The average ...?
But what if we look at conditional probabilities?
Predicting with conditional probabilities
P(X_t | X_{t-1})
Predicting with conditional probabilities
P(X_t | X_{t-k})
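A lagged conditional probability of this kind can be estimated by simple counting. The sketch below assumes that X_t is the indicator "a chosen number appeared in draw t"; the slides do not define X_t, so this reading, the simulated 6-out-of-49 history, and the helper names `appeared` and `cond_prob` are all hypothetical.

```r
# Estimate P(X_t = 1 | X_{t-k} = 1), where X_t = 1 means a chosen number
# appeared in draw t. The draw history is simulated (hypothetical data).
set.seed(7)
n_draws <- 2000
draws <- t(replicate(n_draws, sample(1:49, 6)))

# Logical vector: did `number` appear in each draw?
appeared <- function(number) apply(draws, 1, function(row) number %in% row)

cond_prob <- function(x, k = 1) {
  n <- length(x)
  past <- x[1:(n - k)]   # X_{t-k}
  now  <- x[(k + 1):n]   # X_t
  mean(now[past])        # frequency of X_t = 1 among cases with X_{t-k} = 1
}

x <- appeared(7)
cat("P(X_t)           ~", round(mean(x), 3), "\n")
cat("P(X_t | X_{t-1}) ~", round(cond_prob(x, 1), 3), "\n")
cat("P(X_t | X_{t-5}) ~", round(cond_prob(x, 5), 3), "\n")
```

With independent draws, all three estimates should hover around 6/49; on real data, the interesting question is whether any lag k shows a gap larger than sampling noise.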
Strategy backtesting
E = W*p - L*(1-p)
First ...
we need to know W = ? and L = ?
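Once W and L are known, the expectation and the break-even win rate are one-liners. This sketch assumes W is the amount won on a winning bet, L the amount lost on a losing one, and p the win probability; the figures are made up for illustration and transaction costs are ignored.

```r
# Expected value per bet, E = W*p - L*(1-p), and the break-even win rate
# p* = 1 / (1 + W/L). All numbers below are hypothetical.
expected_value <- function(p, W, L) W * p - L * (1 - p)
break_even_p   <- function(W, L) 1 / (1 + W / L)

W <- 200   # amount won on a winning bet (hypothetical)
L <- 100   # amount lost on a losing bet (hypothetical)

p_star <- break_even_p(W, L)
cat("break-even p :", p_star, "\n")
cat("E at p = 0.30:", expected_value(0.30, W, L), "\n")
cat("E at p = 0.40:", expected_value(0.40, W, L), "\n")
```

A backtest then amounts to estimating p from historical bets and checking whether it clears `p_star`.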
Avoid Overfitting
Walk Forward Analysis
Walk Forward Analysis
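Walk-forward analysis fits a strategy on a window of past data, evaluates it on the next unseen window, then rolls both windows forward. The sketch below only builds the index windows in base R; the function name `walk_forward_splits` and the window sizes are illustrative assumptions, not the talk's implementation.

```r
# Build rolling train/test index windows for a walk-forward analysis.
# Window sizes are illustrative assumptions.
walk_forward_splits <- function(n, train_size, test_size) {
  starts <- seq(1, n - train_size - test_size + 1, by = test_size)
  lapply(starts, function(s) {
    list(
      train = s:(s + train_size - 1),
      test  = (s + train_size):(s + train_size + test_size - 1)
    )
  })
}

splits <- walk_forward_splits(n = 100, train_size = 60, test_size = 10)
for (sp in splits) {
  cat("train", min(sp$train), "-", max(sp$train),
      "| test", min(sp$test), "-", max(sp$test), "\n")
}
```

Because every test window lies strictly after its training window, parameters are never tuned on the data used to score them, which is the guard against overfitting.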
Temporal patterns in the lottery
The stopping time of each number's appearance?
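One possible reading of "stopping time" here is the waiting time, counted in draws, between successive appearances of a number. Under that assumption, and with a simulated 6-out-of-49 history standing in for real data, the gaps can be computed in base R as follows; the helper name `waiting_times` is hypothetical.

```r
# Waiting times (in draws) between successive appearances of one number.
set.seed(3)
n_draws <- 3000
draws <- t(replicate(n_draws, sample(1:49, 6)))   # hypothetical draw history

waiting_times <- function(number) {
  hit_idx <- which(apply(draws, 1, function(row) number %in% row))
  diff(hit_idx)   # gaps between consecutive draws containing the number
}

gaps <- waiting_times(7)
cat("mean gap:", round(mean(gaps), 2), "(1/p =", round(49 / 6, 2), ")\n")
print(head(table(gaps)))
```

If draws are independent, the gaps follow a geometric distribution with mean 1/p = 49/6, so the empirical gap distribution gives a direct check for temporal structure.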
What is ...
the right period? The right time scale?
Now let's look at the spatial pattern!
Advanced prediction with conditional probabilities
P(DIST_t | DIST_{t-1})
Thank you all!
c3h3.tw@gmail.com
For details, search for
the Learning by Hacking fan page
Course description:
http://goo.gl/CTR7nk
dplyr 101
df %>% group_by(...) %>% summarize(...)
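A concrete instance of this split-then-summarise pattern is sketched below. The data frame and its column names are hypothetical; note that dplyr accepts both the `summarise` and `summarize` spellings.

```r
library(dplyr)

# Hypothetical long-format table: one row per (draw, number)
df <- data.frame(
  draw_id = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
  number  = c(3, 7, 9, 7, 9, 12, 7, 20, 31)
)

# Split by number, then reduce each group to a single summary row
counts <- df %>%
  group_by(number) %>%
  summarise(times = n(), first_draw = min(draw_id))

print(counts)
```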
tidyr 101
df %>% spread(key,value)
df %>% gather(key,value,...)
gather: wide format -> long format
spread: long format -> wide format
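The two verbs are inverses of each other, which a small round trip makes visible. The table below is hypothetical, and the column names `position` and `number` are chosen only for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one column per ball position
wide <- data.frame(
  draw_id = 1:3,
  n1 = c(3, 7, 7),
  n2 = c(7, 9, 20),
  n3 = c(9, 12, 31)
)

# gather: wide -> long (the key column holds the old column names)
long <- wide %>% gather(key = "position", value = "number", n1, n2, n3)

# spread: long -> wide again
wide_again <- long %>% spread(key = position, value = number)

print(long)
print(wide_again)
```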
ETL: dplyr + tidyr
Cheat Sheet
Viz: ggplot2
Cheat Sheet
Before going into ggplot2
Please make sure that your data.frame is
in long format!
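The reason is that ggplot2 maps aesthetics to columns, so anything that should become a colour, facet, or axis needs to be a single column. A minimal sketch with a hypothetical wide table:

```r
library(tidyr)
library(ggplot2)

# Hypothetical wide table: one column per ball position
wide <- data.frame(
  draw_id = 1:3,
  n1 = c(3, 7, 7),
  n2 = c(7, 9, 20),
  n3 = c(9, 12, 31)
)

# Reshape first, so that `position` becomes a column ggplot2 can map to colour
long <- gather(wide, key = "position", value = "number", -draw_id)

p <- ggplot(long, aes(x = draw_id, y = number, colour = position)) +
  geom_point() +
  geom_line()

print(p)
```

In the wide table there is no single column to hand to `colour`, which is why the reshape comes before the plot.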
