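Solving the Right Problem:
How Do We Make Data Have Impact?
Pei-shen Wu, MD
peishen.wu@gmail.com
Outline
• What keeps organizations from harnessing the power of their data?
• How do we fix it? A practitioner's point of view
• Rethinking learning in the digital age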
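Distribution of data analysis needs
[Figure: analysis needs split into two areas: (1) causal validation and (2) data exploration, hypothesis building, and correlation probing. Most analysis needs fall in area (2).]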
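A conversation with the data team
"We recently added more product lines, but customer numbers have not picked up. How active have our users been lately?"
"How do these users compare with the same period last year?"
"Do these users differ by region?"
… "And what about the users in the same class as these users?"
The effort it takes to answer a single question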
SELECT keyID,userID,dot_email,gender,points,user_nickname,username,userEmail,userSendableEmail,userRole,userCity,userSchool,userGrade,joinedTime,userBirthdate,match_score FROM (SELECT pre.keyID AS
keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS username,pre.userEmail AS userEmail,pre.userSendableEmail AS
userSendableEmail,pre.userRole AS userRole,pre.userCity AS userCity,pre.userSchool AS userSchool,pre.userGrade AS userGrade,pre.joinedTime AS joinedTime,pre.userBirthdate AS userBirthdate,post.match_score AS
match_score FROM (SELECT * FROM (SELECT pre.keyID AS keyID,pre.userID AS userID,pre.dot_email AS dot_email,pre.gender AS gender,pre.points AS points,pre.user_nickname AS user_nickname,pre.username AS
username,post.userEmail AS userEmail,post.userSendableEmail AS userSendableEmail,post.userRole AS userRole,post.userCity AS userCity,post.userSchool AS userSchool,post.userGrade AS userGrade,post.joinedTime AS
joinedTime,post.userBirthdate AS userBirthdate FROM (SELECT * FROM (SELECT keyID, userID, dot_email, gender, points, user_nickname, username FROM (SELECT __key__.name AS keyID, user_id AS
userID, user_email AS underline_email, current_user.email AS cu_email, user.email AS dot_email, gender, points, user_nickname, username FROM
junyi_20161212.UserData_20161212)) AS pre INNER JOIN EACH (SELECT userEmail, userSendableEmail, userId AS userID, userRole, userCity, userSchool, userGrade, joinedTime,
userBirthdate FROM FinalTable.UserFinalTmpInfo) AS post ON pre.userID = post.userID)) AS pre INNER JOIN EACH (SELECT keyID,match_score FROM (SELECT keyID, match_score FROM (SELECT
keyID,prob1*prob2 AS match_score FROM (SELECT pre.keyID AS keyID,pre.prob1 AS prob1,post.prob2 AS prob2 FROM (SELECT * FROM (SELECT keyID, prob1 FROM (SELECT keyID, match_score AS prob1 FROM (SELECT
keyID, match_score FROM (SELECT keyID, match_score_A*cdf AS match_score FROM (SELECT keyID, MAX(output) AS cdf, match_score_A FROM (SELECT keyID, IF(metric >= input,1,0) AS compare, output, match_score_A
FROM (SELECT pre.keyID AS keyID,pre.comparekey AS comparekey,pre.metric AS metric,pre.match_score_A AS match_score_A,post.input AS input,post.output AS output FROM (SELECT * FROM (SELECT keyID, 1 AS
comparekey, metric, match_score AS match_score_A FROM (SELECT pre.keyID AS keyID,pre.match_score AS match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT teacher_keyID AS
keyID, MAX(match_score) AS match_score FROM(SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS
teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS
teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email,
student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID,
__key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT classID,
MAX(match_score) AS match_score FROM (SELECT classID,match_score FROM (SELECT classID, MAX(match_score) AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS
classID,pre.student_underline_email AS student_underline_email,pre.teacher_keyID AS teacher_keyID,post.match_score AS match_score FROM (SELECT * FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS
classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT
__key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT
teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM
junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID)) AS pre INNER JOIN EACH (SELECT keyID AS student_keyID, match_score FROM (SELECT student_keyID AS keyID, MAX(match_score) AS
match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS
student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS
student_underline_email, student_lists.path AS classID FROM junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS
teacher_keyID, __key__.path AS classID, code AS classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY
keyID)) AS post ON pre.student_keyID = post.student_keyID)) GROUP BY classID)) GROUP BY classID) AS post ON pre.classID = post.classID)) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS
dot_email,pre.metric AS metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT
dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed
FROM (SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM
junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed,
DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12'))
GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON
pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))) AS pre INNER JOIN EACH (SELECT input, AVG(output) AS output, 1 AS comparekey FROM ( SELECT input, output/101 AS output
FROM ( SELECT input, ROW_NUMBER() OVER() AS output FROM ( FLATTEN(( SELECT QUANTILES(metric,101) AS input FROM (SELECT pre.keyID AS keyID,pre.match_score AS
match_score,post.dot_email AS dot_email,post.metric AS metric FROM (SELECT * FROM (SELECT student_keyID AS keyID, MAX(match_score) AS match_score FROM(SELECT student_keyID, teacher_keyID, 1 AS match_score
FROM (SELECT pre.student_keyID AS student_keyID,pre.classID AS classID,pre.student_underline_email AS student_underline_email,post.teacher_keyID AS teacher_keyID FROM (SELECT * FROM (SELECT
student_keyID, classID, student_underline_email FROM FLATTEN((SELECT __key__.name AS student_keyID, user_email AS student_underline_email, student_lists.path AS classID FROM
junyi_20161212.UserData_20161212), classID)) AS pre INNER JOIN EACH (SELECT teacher_keyID, classID FROM(SELECT coaches.name AS teacher_keyID, __key__.path AS classID, code AS
classcode, name AS classname FROM junyi_20161212.StudentList_20161212)) AS post ON pre.classID = post.classID))) GROUP BY keyID) AS pre INNER JOIN EACH (SELECT pre.dot_email AS dot_email,pre.metric AS
metric,post.keyID AS keyID FROM (SELECT * FROM (SELECT dot_email, SUM(points_earned) AS metric FROM(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT dot_email,
points_earned, timestamp, time_consumed FROM (SELECT dot_email,points_earned,timestamp,time_consumed FROM (SELECT dot_email, points_earned, timestamp, time_consumed FROM
(SELECT user.email AS dot_email, points_earned, seconds_watched AS time_consumed, DATE_ADD(time_watched, 8, 'HOUR') AS timestamp FROM
junyi_20161212.VideoLog_20161212)),(SELECT dot_email, points_earned, timestamp, time_consumed FROM (SELECT user.email AS dot_email, points_earned, time_taken AS time_consumed,
DATE_ADD(time_done, 8, 'HOUR') AS timestamp FROM junyi_20161212.ProblemLog_20161212)))) WHERE timestamp >= TIMESTAMP('2016-09-01') AND timestamp <= TIMESTAMP('2016-12-12'))
GROUP BY dot_email) AS pre INNER JOIN EACH (SELECT keyID, dot_email FROM (SELECT __key__.name AS keyID, user.email AS dot_email FROM junyi_20161212.UserData_20161212)) AS post ON
pre.dot_email = post.dot_email)) AS post ON pre.keyID = post.keyID))), input)))) GROUP BY input) AS post ON pre.comparekey = post.comparekey))) WHERE compare == 1 GROUP BY keyID,
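Three weeks of effort
to satisfy three seconds of curiosity
Answers cannot keep up with
the rate at which new questions arise
Can this be faster?
[Figure: the workflow between frontline staff and data staff: raise the request → confirm the spec → first implementation → validate and debug]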
Scale in complexity,
not size
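IT-centric Business Intelligence
[Figure: the two areas of analysis needs (causal validation; data exploration, hypothesis building, correlation probing), labeled with data staff and frontline staff]
User-centric Business Intelligence
[Figure: the same two areas of analysis needs, again labeled with data staff and frontline staff]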
Tools already exist on the market (e.g. Tableau),
so why work out a solution of our own?
Complexity is both
domain + infrastructure
specific
How do you encode
& implement this part?
This holds even with Google BigQuery
SQL is the assembly language
of business logic
Wouldn't it be wonderful if it could look like this?
# Users who belong to a class (students)
class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")
# The teachers those students belong to
class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher",
                                          student_UserObj = class.students)
# Combine two conditions over the same observation window
Regular_Preparation_Public.teachers <- UserObj_ANDmerge(
  con_probability_AgreaterthanB(dataset,
    UserObj.A = class.teachers,
    UserObj.B = class.students,
    metric = "seconds", video = T, prob = T,
    window.begin = window.begin,
    window.end = window.end),
  con_activity_timing(dataset, prob = T, video = T,
    complete.mission = T, assign.mission = T,
    hour = 9:15,
    window.begin = window.begin,
    window.end = window.end))
# View function: output the resulting users as data
Regular_Preparation_Public.teachers <- view_User_FinalTable(
  dataset, Regular_Preparation_Public.teachers, output = "data")
Diagram labels for the code above:
• Users who belong to a class (students)
• The teachers those students belong to
• Compare A with B on seconds of video and exercises during class time; return A
• View function
Operating pre-built SQL through logic modules
[Figure: the same modules (students who belong to a class, their teachers, compare A with B and return A, View function), drawn with the user sets A and B]
Other benefits of logic modules
• Reusable code
• Focus on the business logic itself
• Enables very complex queries
  = allows rapid hypothesis generation and alignment with data
• Makes validation and debugging easier
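A minimal sketch of why such modules compose, written in Python rather than the R used in the talk. It assumes each module returns a mapping from user ID to a match score between 0 and 1, and that an AND-merge keeps the users present in both results and multiplies their scores (the long query shown earlier combines two scores as prob1*prob2). `UserSet`, `and_merge`, and the sample IDs are hypothetical names for illustration, not the author's implementation.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class UserSet:
    """Hypothetical stand-in for a module result: user ID -> match score in [0, 1]."""
    scores: Dict[str, float]


def and_merge(a: UserSet, b: UserSet) -> UserSet:
    """Keep only users present in both results and multiply their scores."""
    common = a.scores.keys() & b.scores.keys()
    return UserSet({user: a.scores[user] * b.scores[user] for user in common})


# Two independent conditions, each produced by its own (imaginary) module.
more_active_than_students = UserSet({"t1": 0.9, "t2": 0.4, "t3": 0.7})
active_in_class_hours = UserSet({"t1": 0.5, "t3": 1.0, "t4": 0.8})

merged = and_merge(more_active_than_students, active_in_class_hours)
print(sorted(merged.scores))  # users that satisfy both conditions
```

Because every module returns the same shape, each one can be validated on its own and then reused in other combinations.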
Having data
is not enough
Too much data is
counter-productive
We get lost in strange trends (not the big picture)
and in numbers that lead to no action
Data makes no sense
if you cannot, or need not,
do anything about it.
Last visit 7 days ago, NT$3,000 spent per visit on average,
4 visits in total
Recency = 7
Frequency = 4
Monetary Value = 3000
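The three numbers above can be derived from a visit log. The sketch below assumes one customer's visits are available as (date, amount) pairs and takes Monetary Value as the average spend per visit, as in the example. The function name `rfm`, the dates, and the individual amounts are made up for illustration; only the resulting 7 / 4 / 3000 come from the slide.

```python
from datetime import date


def rfm(visits, today):
    """Return (recency in days, frequency, average spend per visit).

    visits: list of (visit_date, amount) tuples for a single customer.
    """
    last_visit = max(visit_date for visit_date, _ in visits)
    recency = (today - last_visit).days
    frequency = len(visits)
    monetary = sum(amount for _, amount in visits) / frequency
    return recency, frequency, monetary


# Hypothetical visit log chosen to reproduce the example on the slide.
visits = [
    (date(2017, 1, 3), 2500),
    (date(2017, 2, 10), 3500),
    (date(2017, 3, 1), 2000),
    (date(2017, 3, 24), 4000),
]
recency, frequency, monetary = rfm(visits, today=date(2017, 3, 31))
print(recency, frequency, monetary)
```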
[Figure: customers plotted by Monetary Value against Frequency, in four groups]
• Comes often + spends a lot each visit
• Comes rarely + brings in little revenue
• Comes occasionally, but spends a lot when they do
• Comes often, but spends little each visit
Visit gift → draws you in more often
Spend 1,000, get 100 back → gives you a reason to top up your basket
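A sketch of how the four groups could be turned into an action lookup. The cut-off values are arbitrary placeholders, and pairing the visit gift with occasional big spenders and the spend-1,000 offer with frequent low spenders is one reading of the slides, not something they state outright. All names here are hypothetical.

```python
def segment(frequency, monetary, freq_cut=3, money_cut=2000):
    """Place a customer in one of the four Frequency x Monetary Value groups.

    freq_cut and money_cut are placeholder thresholds, not values from the talk.
    """
    often = frequency >= freq_cut
    big = monetary >= money_cut
    if often and big:
        return "often, spends a lot"
    if often:
        return "often, spends little"
    if big:
        return "occasional, spends a lot"
    return "rare, little revenue"


# Assumed pairing of promotion to group (an interpretation of the slides).
ACTIONS = {
    "occasional, spends a lot": "visit gift (raise frequency)",
    "often, spends little": "spend 1,000 get 100 back (raise basket size)",
}


def suggested_action(frequency, monetary):
    """Return the promotion for the customer's group, or None if none is defined."""
    return ACTIONS.get(segment(frequency, monetary))


print(segment(4, 3000), "->", suggested_action(4, 3000))
print(segment(1, 3000), "->", suggested_action(1, 3000))
print(segment(5, 500), "->", suggested_action(5, 500))
```

The point of the lookup is that a group only matters if some action is attached to it; groups without an entry return `None`.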
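For the same product, if we can tell whether you are
price-insensitive → sell at a high price
price-sensitive → sell at a discount
= "Actions we can take"
What other data can reveal
"price sensitivity"?
Income? Student status? Assets?
Number of children?
Coupon usage?
Local resident or not?
Brand loyalty?
Familiarity with the market?
Which price tier do they buy within the same product category?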
"Data": ignores coupons
"Action": offer a high price
"Outcome": maximize revenue
Theory:
purchasing behaviour is shaped by price sensitivity
Think backwards: start from what we can do
(Action) to produce the Outcome we want
Data → Action → Outcome
Asking a relevant question
is a domain-specific skill
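Non-professional:
Take Action = P( physiological data )
Professional:
Take Action = P( physiological data | medical history )
Professional:
P( HR<60 | irregular breathing, hypertension ) = rising intracranial pressure
The medical literature calls the combination of these three signs
Cushing's reflex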
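The contrast can be sketched as a rule that reads the same heart rate differently once the accompanying signs are known. This is an illustration of conditioning on context only, under the assumption that the other two signs arrive as yes/no flags. The function name and return strings are hypothetical, and only the HR<60 threshold comes from the slide.

```python
def read_vitals(heart_rate_bpm, irregular_breathing=False, hypertensive=False):
    """Interpret a heart rate in light of two accompanying signs.

    Illustrates P(HR<60 | irregular breathing, hypertension) from the slide;
    it is a teaching sketch, not a clinical rule.
    """
    slow_pulse = heart_rate_bpm < 60
    if slow_pulse and irregular_breathing and hypertensive:
        return "suspect rising intracranial pressure"
    if slow_pulse:
        return "slow pulse alone: no specific conclusion"
    return "no flag"


# Same number, different meaning once the context is supplied.
print(read_vitals(52))
print(read_vitals(52, irregular_breathing=True, hypertensive=True))
```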
Slides: "Using Data Well to Improve Online Education" (https://goo.gl/Yd6ajF)
https://www.junyiacademy.org/
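"Data": usage-behaviour analysis of teachers and their students
"Action": find the front-runners so that others can observe them
"Outcome": flipped teaching that helps children succeed
Theory:
good practice can be replicated and applied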
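Doctor
• History taking and examinations
• Carrying out treatment
• Planning the course of treatment
• Patient education
Teacher
• Assessing prior knowledge
• Arranging the learning context and materials
• Assessing learning outcomes
• Giving guidance, counselling, and support
• Choosing a teaching strategy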
Data → Action → Outcome
What is the theory?
What cause-effect is already known?
We have to go deep into
theories of learning
to open up more possibilities
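"Data": students' learning behaviour
"Action": help choose a suitable teaching strategy
"Outcome": helping children succeed
Theory:
learning theories
Three mainstream perspectives
Behaviorism
Cognitivism
Constructivism
→ The way of thinking to which data is most easily applied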
Behaviorism
Stimulus → Behaviour → Consequence
Classical Conditioning
Operant Conditioning
The behaviorist view
Feedback loops paired with suitable rewards
can influence learning behaviour systematically
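Behaviorism
• Knowledge is received passively
• Forgetting happens through disuse over time
• Learning is
  – a change in behaviour probability: P( stimulus ) = behaviour
• Generalization happens because the stimuli are similar
• Emphasizes practice, cues, hints, rewards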
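Limits of behaviorism
Learning with low cognitive demands = very powerful, very effective
– multiple-choice questions, fill-in-the-blank, recitation, matching
Learning with high cognitive demands = poor explanatory power
– mastering a foreign language well enough to work in it
– doing what no one has done before, ill-defined problems
– creative work, essay questions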
What can we learn from other
theories?
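Cognitivism
– Changes in the internal knowledge state → State & Path analysis
Learners must actively encode knowledge into memory
– Emphasizes how new knowledge relates to old knowledge → rewind behaviour
[Figure: video timeline marking 0:10 – 0:60 (peak at 0:30), 2:00 – 2:20, and 3:50 onward]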
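Script of the video "How to Tell Whether a Number Is Prime"
• Which of the numbers 1 to 10 are prime?
  – [0:10] 1 is neither prime nor composite
  – [0:25] Composite numbers are the numbers other than primes
  – [0:40] 0 is neither prime nor composite
  – [0:50] 2 is the only even prime; all other primes are odd
• [2:00] Once more: 2 is the only even prime
• [2:10] Is 13 prime?
• [3:50] Is 33 prime?
• [4:10] Every number is divisible by itself and by 1
• [4:30] 33 is divisible by 3, so it is not prime
The video is cut into nine segments by topic
• Watched from start to finish, once only
• Watched only the concepts at the start, skipping the 13 and 33 exercises
• Watched the opening concepts twice before doing the 13 and 33 exercises
• Watched from start to finish many times over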
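A sketch of how viewing patterns like these might be read from playback data. It assumes the nine segments begin at the nine script timestamps and that each viewer's log is a list of playback positions in seconds, in the order they were watched. Both are assumptions, as are the names `segment_of` and `summarize`, since the talk does not describe its log format.

```python
from bisect import bisect_right

# Segment start times in seconds, taken from the nine script timestamps
# (0:10, 0:25, 0:40, 0:50, 2:00, 2:10, 3:50, 4:10, 4:30).
SEGMENT_STARTS = [10, 25, 40, 50, 120, 130, 230, 250, 270]


def segment_of(second):
    """Return the 0-based segment containing a playback position, or None before 0:10."""
    index = bisect_right(SEGMENT_STARTS, second) - 1
    return index if index >= 0 else None


def summarize(positions):
    """Count entries into each segment and the number of backward jumps (rewinds)."""
    entries = [0] * len(SEGMENT_STARTS)
    rewinds = 0
    previous = None
    for position in positions:
        current = segment_of(position)
        if current is None:
            continue
        if previous is not None and current < previous:
            rewinds += 1
        if current != previous:
            entries[current] += 1
        previous = current
    return entries, rewinds


# Hypothetical viewer: opening concepts twice, then the 13 and 33 exercises.
log = [12, 27, 42, 55, 12, 27, 42, 55, 125, 135, 235, 255, 275]
entries, rewinds = summarize(log)
print(entries, rewinds)
```

A viewer who watches once straight through would show one entry per segment and no rewinds, which is what separates the first pattern from the third.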
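Being able to rewind, pause, and skip
through the material is one of the few traces
that show the outcome of students
making choices about their own learning
The choices students make
reflect how they construct
their own knowledge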
Issues not addressed by
Behaviorism
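Context
Intrinsic motivation
– Competence = a sense of achievement
– Autonomy = a sense of control, being able to make one's own choices
– Relatedness = community, peers, team
Keys to long-term user engagement
1. A community exists
2. Intrinsic motivation drives it
3. The experience itself evolves over time, rather than relying on short-term funnels or habit loops
Create an experience that unfolds gradually over time
An experience that changes
as the user's investment and skills grow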

  • 7.
  • 8.
  • 9.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
    Complexity is both domain- and infrastructure-specific
  • 16.
    How do you encode & implement this part?
  • 17.
    This is still true even with Google BigQuery: SQL is the assembly language of business logic
  • 18.
    Wouldn't it be wonderful if it could look like this?
    [The slide repeats the full SQL query from the earlier slide, then replaces it with the R code below.]
    class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")
    class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher",
                                              student_UserObj = class.students)
    Regular_Preparation_Public.teachers <- UserObj_ANDmerge(
      con_probability_AgreaterthanB(dataset, UserObj.A = class.teachers, UserObj.B = class.students,
                                    metric = "seconds", video = T, prob = T,
                                    window.begin = window.begin, window.end = window.end),
      con_activity_timing(dataset, prob = T, video = T, complete.mission = T, assign.mission = T,
                          hour = 9:15, window.begin = window.begin, window.end = window.end))
    Regular_Preparation_Public.teachers <- view_User_FinalTable(
      dataset, Regular_Preparation_Public.teachers, output = "data")
  • 19.
    Wouldn't it be wonderful if it could look like this?
    class.students <- con_relationshipByClass(dataset, outputUsers.role = "student")
    class.teachers <- con_relationshipByClass(dataset, outputUsers.role = "teacher",
                                              student_UserObj = class.students)
    Regular_Preparation_Public.teachers <- UserObj_ANDmerge(
      con_probability_AgreaterthanB(dataset, UserObj.A = class.teachers, UserObj.B = class.students,
                                    metric = "seconds", video = T, prob = T,
                                    window.begin = window.begin, window.end = window.end),
      con_activity_timing(dataset, prob = T, video = T, complete.mission = T, assign.mission = T,
                          hour = 9:15, window.begin = window.begin, window.end = window.end))
    Regular_Preparation_Public.teachers <- view_User_FinalTable(
      dataset, Regular_Preparation_Public.teachers, output = "data")
    Annotations on the code:
      – Users who belong to a class (students)
      – The teachers of those students
      – Compare A and B on seconds of video and exercises during class hours; return A
      – View Function
  • 20.
    Operating pre-built SQL through logic modules
      – Users who belong to a class (students)
      – The teachers of those students
      – Compare A and B on seconds of video and exercises during class hours; return A
      – View Function
      – Diagram labels: A, B
  • 21.
    Other benefits of logic modules
      • Reusable code
      • Focus on the business logic itself
      • Enables very complex queries = allows rapid hypothesis generation and alignment with data
      • Makes validation & debugging easier
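The idea of logic modules can be sketched in a few lines. This is a minimal illustration in Python with SQLite, not the speaker's R implementation: the table names (`users`, `logs`), the column names, and the helper functions `users_by_role`, `active_between`, and `and_merge` are all hypothetical. Each module returns a piece of SQL, and modules compose without anyone hand-editing the nested query.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class UserSet:
    """A set of users, represented by the SQL that selects their key_id."""
    sql: str


def users_by_role(role: str) -> UserSet:
    # Hypothetical module: users with a given role ("student", "teacher").
    # String interpolation is for illustration only; real code should
    # use bound parameters to avoid SQL injection.
    return UserSet(f"SELECT key_id FROM users WHERE role = '{role}'")


def active_between(begin: str, end: str) -> UserSet:
    # Hypothetical module: users with at least one log row in the window.
    return UserSet(
        "SELECT DISTINCT key_id FROM logs "
        f"WHERE day >= '{begin}' AND day <= '{end}'"
    )


def and_merge(a: UserSet, b: UserSet) -> UserSet:
    # Keep only the users that appear in both sets.
    return UserSet(
        f"SELECT key_id FROM ({a.sql}) INTERSECT SELECT key_id FROM ({b.sql})"
    )


def run(conn: sqlite3.Connection, users: UserSet) -> list[str]:
    # "View" step: execute the composed SQL and return the sorted keys.
    return sorted(row[0] for row in conn.execute(users.sql))


def demo() -> list[str]:
    # Tiny made-up dataset so the composition can be checked end to end.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (key_id TEXT, role TEXT);
        CREATE TABLE logs (key_id TEXT, day TEXT);
        INSERT INTO users VALUES
            ('u1', 'student'), ('u2', 'student'), ('u3', 'teacher');
        INSERT INTO logs VALUES
            ('u1', '2016-10-01'), ('u2', '2016-08-15'), ('u3', '2016-11-20');
        """
    )
    students = users_by_role("student")
    recent = active_between("2016-09-01", "2016-12-12")
    return run(conn, and_merge(students, recent))
```

Because every module returns the same kind of object, a new question ("teachers active in the window") is a one-line change rather than a rewrite of the nested query, which is where the reuse and easier debugging come from.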
  • 22.
  • 23.
    Too much data is counterproductive: you get lost in strange trends (not the big picture) and in numbers that lead to no action
  • 25.
    It makes no sense if you can't, or don't need to, do anything about it.
  • 26.
    Last visited 7 days ago, spends NT$3,000 per visit on average, 4 visits in total
    Recency = 7, Frequency = 4, Monetary Value = 3000
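As a rough sketch of how these three numbers could be derived from a visit log: the function name `rfm`, the input shape, and the visit records below are assumptions made for illustration. The records are invented so that they reproduce the values on the slide.

```python
from datetime import date


def rfm(visits: list[tuple[date, float]], today: date) -> dict[str, float]:
    """Compute Recency, Frequency, Monetary Value for one customer.

    visits: (visit_date, amount_spent) pairs; must not be empty.
    """
    last_visit = max(day for day, _ in visits)
    return {
        "recency": (today - last_visit).days,      # days since last visit
        "frequency": len(visits),                  # number of visits
        "monetary": sum(amount for _, amount in visits) / len(visits),
    }


# Invented visits chosen to match the slide's example.
visits = [
    (date(2016, 11, 1), 2500.0),
    (date(2016, 11, 15), 3500.0),
    (date(2016, 11, 28), 2000.0),
    (date(2016, 12, 5), 4000.0),
]
result = rfm(visits, today=date(2016, 12, 12))
```

Here Monetary Value is taken as the average spend per visit, as on the slide; other RFM variants use total spend instead.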
  • 27.
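    Quadrant chart, Monetary Value × Frequency:
      – Comes often and spends a lot each visit
      – Comes rarely and brings in little revenue
      – Comes occasionally, but spends a lot when they do
      – Comes often, but spends little each visit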
  • 28.
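    Quadrant chart, Monetary Value × Frequency:
      – Comes often and spends a lot each visit
      – Comes rarely and brings in little revenue
      – Comes occasionally, but spends a lot when they do
      – Comes often, but spends little each visit
    Gift for visiting the store → an incentive to come more often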
  • 29.
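    Quadrant chart, Monetary Value × Frequency:
      – Comes often and spends a lot each visit
      – Comes rarely and brings in little revenue
      – Comes occasionally, but spends a lot when they do
      – Comes often, but spends little each visit
    Spend NT$1,000, get NT$100 back → an incentive to top up your basket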
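The quadrants only matter if each one maps to something you can do. A minimal sketch of that mapping follows. The cut-off values, the function names, and the pairing of each campaign with a specific quadrant are assumptions: the slides name the two campaigns but do not state thresholds or say explicitly which quadrant each one targets.

```python
def segment(frequency: int, monetary: float,
            freq_cut: int = 4, money_cut: float = 1000.0) -> str:
    """Place a customer in one of the four Frequency x Monetary quadrants.

    freq_cut and money_cut are hypothetical thresholds.
    """
    often = frequency >= freq_cut
    big_spender = monetary >= money_cut
    if often and big_spender:
        return "frequent, high spend"
    if often:
        return "frequent, low spend"
    if big_spender:
        return "occasional, high spend"
    return "rare, low spend"


# Assumed pairing: a visit gift raises frequency, a spend-threshold
# rebate raises basket size.
ACTIONS = {
    "occasional, high spend": "visit gift (incentive to come more often)",
    "frequent, low spend": "spend 1000, get 100 (incentive to top up the basket)",
}


def suggest(frequency: int, monetary: float) -> str:
    """Return the campaign for a customer, if one is defined."""
    return ACTIONS.get(segment(frequency, monetary),
                       "no campaign defined on the slides")
```

A segment with no entry in `ACTIONS` is a reminder of the earlier point: a number that leads to no action is not worth reporting.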
  • 30.
  • 31.
  • 32.
  • 33.
    "Data": ignores coupons → "Action": offer a high price → "Outcome": maximize revenue
    Theory: purchasing behavior is shaped by price sensitivity
  • 34.
  • 35.
    Asking a relevant question is a domain-specific skill
  • 37.
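    Non-professional: Take Action = P(physiological data)
  • 38.
    Professional: Take Action = P(physiological data | medical history)
  • 39.
    Professional: P(HR < 60 | irregular breathing, hypertension) = rising intracranial pressure
    The medical literature calls the combined appearance of these three signs Cushing's reflex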
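The difference between reading one number and reading it in context can be made concrete with a toy calculation. The records below are invented for illustration and are not clinical data; the `Patient` type and the `probability` helper are hypothetical names. The sketch estimates how likely raised intracranial pressure is given a slow heart rate alone, and then given all three signs together.

```python
from typing import Callable, NamedTuple


class Patient(NamedTuple):
    slow_heart_rate: bool       # HR < 60
    irregular_breathing: bool
    hypertension: bool
    raised_icp: bool            # the outcome being estimated


# Invented records for illustration only; not clinical data.
PATIENTS = [
    Patient(True, True, True, True),
    Patient(True, True, True, True),
    Patient(True, True, True, True),
    Patient(True, True, True, False),
    Patient(True, False, False, False),
    Patient(True, False, False, False),
    Patient(True, False, True, False),
    Patient(True, True, False, False),
    Patient(False, False, True, False),
    Patient(False, False, False, False),
]


def probability(event: Callable[[Patient], bool],
                given: Callable[[Patient], bool]) -> float:
    """Estimate P(event | given) as a relative frequency in PATIENTS."""
    matching = [p for p in PATIENTS if given(p)]
    if not matching:
        raise ValueError("no records satisfy the condition")
    return sum(event(p) for p in matching) / len(matching)


# One sign on its own versus the three signs together.
one_sign = probability(lambda p: p.raised_icp,
                       lambda p: p.slow_heart_rate)
all_signs = probability(
    lambda p: p.raised_icp,
    lambda p: p.slow_heart_rate and p.irregular_breathing and p.hypertension,
)
```

In this made-up sample the estimate doubles once the other two signs are taken into account. Knowing which conditions to put on the right-hand side of the bar is the domain expertise.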
  • 40.
  • 41.
  • 42.
  • 43.
  • 44.
  • 45.
  • 46.
  • 48.
  • 49.
    Data → Action → Outcome
    What is the theory? What cause-and-effect is already known?
  • 50.
  • 51.
    "Data": students' learning behavior → "Action": help select a suitable teaching strategy → "Outcome": children who succeed
    Theory: learning theories
  • 52.
  • 53.
  • 54.
    Stimulus → Behavior → Consequence
    Classical Conditioning
  • 55.
    Stimulus → Behavior → Consequence
    Operant Conditioning
  • 57.
    The behaviorist view: feedback loops paired with suitable rewards can systematically influence learning behavior
  • 58.
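    Behaviorism
      • Knowledge is received passively
      • Forgetting happens because knowledge goes unused for a long time
      • Learning is a change in the probability of a behavior: P(stimulus) = behavior
      • Generalization happens because the stimuli are similar
      • Emphasizes practice, cues, hints, and rewards
  • 59.
    Limits of behaviorism
    Learning with low cognitive demand = very powerful, very effective
      – Multiple-choice questions, fill-in-the-blank, memorization, matching
    Learning with high cognitive demand = poor explanatory power
      – Learning a foreign language well enough to work in it
      – Doing what no one has done before; ill-defined problems
      – Creative work, essay questions
  • 60.
    What can we learn from other theories?
    Cognitivism
      – Changes in the internal knowledge state → state & path analysis
        Learners must actively encode knowledge into memory
      – Emphasizes the relationship between new and prior knowledge → rewind behavior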
  • 61.
    [Figure: video timeline with highlighted ranges 0:10 to 1:00 (peak at 0:30), 2:00 to 2:20, and 3:50 onward]
  • 62.
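    Script of the video "How to Tell Whether a Number Is Prime"
      • Which of the numbers 1 to 10 are prime?
        – [0:10] 1 is neither prime nor composite
        – [0:25] Composite numbers are the numbers other than primes
        – [0:40] 0 is also neither prime nor composite
        – [0:50] 2 is the only even prime; all other primes are odd
      • [2:00] Emphasized again: 2 is the only even prime
      • [2:10] Is 13 prime?
      • [3:50] Is 33 prime?
      • [4:10] Every number is divisible by itself and by 1
      • [4:30] 33 is divisible by 3, so it is not prime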
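One way to connect the highlighted timeline ranges with this script is to look up which script cues fall inside each range. The sketch below is an illustration, not the speaker's tooling: the cue texts are shortened paraphrases of the script, the ranges are copied from the timeline slide, and the helper names are hypothetical.

```python
from typing import Optional

# (timestamp, shortened cue) pairs taken from the video script.
SCRIPT = [
    ("0:10", "1 is neither prime nor composite"),
    ("0:25", "composite numbers are the numbers other than primes"),
    ("0:40", "0 is neither prime nor composite"),
    ("0:50", "2 is the only even prime"),
    ("2:00", "again: 2 is the only even prime"),
    ("2:10", "is 13 prime?"),
    ("3:50", "is 33 prime?"),
    ("4:10", "every number is divisible by itself and 1"),
    ("4:30", "33 is divisible by 3, so it is not prime"),
]

# Highlighted ranges from the timeline; None marks an open-ended range.
RANGES = [("0:10", "1:00"), ("2:00", "2:20"), ("3:50", None)]


def to_seconds(stamp: str) -> int:
    """Convert an 'm:ss' timestamp to seconds."""
    minutes, seconds = stamp.split(":")
    return int(minutes) * 60 + int(seconds)


def cues_in_range(begin: str, end: Optional[str]) -> list[str]:
    """Return the script cues whose timestamp lies inside [begin, end]."""
    low = to_seconds(begin)
    high = to_seconds(end) if end is not None else float("inf")
    return [text for stamp, text in SCRIPT if low <= to_seconds(stamp) <= high]


cue_counts = [len(cues_in_range(begin, end)) for begin, end in RANGES]
```

Every cue in the script lands in one of the three ranges, which suggests the highlighted stretches line up with the concept definitions and the two worked examples.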
  • 63.
  • 64.
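    Four viewing patterns:
      – Watch from start to finish, once only
      – Watch only the concepts at the beginning; skip the 13 and 33 exercises
      – Watch the opening concepts twice, then do the 13 and 33 exercises
      – Rewatch from start to finish many times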
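These four patterns are paths through the video rather than single numbers, so they can be recognized from the order in which a learner visits its sections. The following is a minimal sketch under stated assumptions: the video is split into just two sections ("concept" and "practice"), a session is already reduced to the sequence of sections visited, and the classification rules are a hypothetical reading of the four patterns.

```python
def classify_path(path: list[str]) -> str:
    """Classify a viewing session by how often each section was visited.

    path: sequence of "concept" / "practice" labels in viewing order.
    The rules are illustrative assumptions, not the speaker's method.
    """
    concept_views = path.count("concept")
    practice_views = path.count("practice")
    if concept_views >= 1 and practice_views == 0:
        return "concepts only, skipped practice"
    if concept_views == 1 and practice_views == 1:
        return "start to finish, once"
    if concept_views >= 2 and practice_views == 1:
        return "reviewed concepts, then practiced"
    if concept_views >= 2 and practice_views >= 2:
        return "rewatched the whole video repeatedly"
    return "other"


# Invented sessions, one per pattern.
sessions = {
    "s1": ["concept", "practice"],
    "s2": ["concept"],
    "s3": ["concept", "concept", "practice"],
    "s4": ["concept", "practice", "concept", "practice", "concept", "practice"],
}
labels = {name: classify_path(path) for name, path in sessions.items()}
```

Total watch time alone cannot separate the third pattern from the fourth; the order of sections can, which is the point of analyzing states and paths.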
  • 65.
  • 66.
  • 67.
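    Issues not addressed by behaviorism
    Context
    Intrinsic motivation
      – Competence = a sense of achievement
      – Autonomy = a sense of control; being able to make your own choices
      – Relatedness = community, peers, team
  • 68.
    Keys to long-term user engagement
      1. The presence of a community
      2. Being driven by intrinsic motivation
      3. The experience itself evolves over time, rather than relying on short-term funnels or habit loops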
  • 69.

Editor's Notes

  • #48 Use the WSQ (Watch-Summary-Question) worksheet to guide students in keeping a record of what they watched and gradually developing their notes. Assign a video to watch as homework and require the submission of a written summary which includes a question based on the material. It is an elegantly simple idea and it works.