1
Summary of Insights Learned from the
Data Science Program Team Training 



Fred Chiang (@fredchiang)
fredchiang@etusolution.com
Lead of Etu and DSP Committee Member
May 19th, 2014
2
Agenda
1. What is DSP?
2. How did DSP come about?
3. What does DSP do?
4. What have we learned?
3
What is DSP?
4
Data Science Program (DSP)
DSP was initiated by Etu and Code for Tomorrow (CfT), with
support from OKFN Taiwan and various other parties.
http://datasci.co
5
From SYSTEX/Etu’s perspective, DSP is a case of
enterprise-run, data-driven CSR in partnership with an NPO
Etu, SYSTEX
Etu is a Big Data pioneer from Taiwan that provides Hadoop-based
solutions, primarily focused on helping customers discover,
unlock, and utilize valuable information embedded in
extremely large data sets through simple steps.
SYSTEX Group is an Asia-Pacific regional IT service provider
and the largest in Taiwan.
Etu is an independent brand incubated by SYSTEX.
Code for Tomorrow
Code for Tomorrow (CfT) Foundation Initiative is a non-profit
organization that actively encourages governments, the
private sector, and civil society organizations to embrace
the power of the internet and of people to improve
governance in the 21st century.
6
codefortomorrow.org
How did it come about?
What does it do?
What have we learned?
7
How did DSP come about?
8
Harvard Business Review, October 2012
http://cromi.org/main/wp-content/uploads/2012/10/Davenport-2012-data-scientist.pdf
But where can we find these
sexy people and make them
work with us?
9
No one person can be the perfect
data scientist, so we need teams
Source: Next-Gen Data Scientist, Dr. Rachel Schutt
Data Science Profiles
10
Data Science Program Goal
Train 300 talented data science team members
for Taiwan within 3 years
11
What does DSP do?
12
DSP Working Group
Committee: CEO (CfT) / Principal Secretary (Etu)
Administration Team: COO (Etu)
Curriculum Team: CCO (CfT)
Marketing Team: CMO (CfT)
13
DSP Courses
(under continuous development)
1. Team Training
2. Data ETL and Analysis with Python
3. Data Journalism (coming soon)
14
Who is interested?
Sign-ups for DSP Team Training #1 and #2: 168 in total
Sign-ups by role (DSP role tag in parentheses):
Data Analyst: 75 (Analyst)
Programmer: 67 (Hygienist)
Story Teller: 52 (Campaigner)
Product/Service Planner: 48 (Campaigner)
Other: 22
UX Designer: 7 (Designer)
Art Designer: 6 (Designer)
UI Designer: 5 (Designer)
Gender: 77% male, 23% female
15
Self-tagging by Role
•  Campaigner
•  Analyst
•  Hygienist
•  Designer
16
17
[DSP’s Motto #1]
“The point of statistics is not to do
myriad rigorous mathematical
calculations; the point is to gain insight
into meaningful social phenomena.”
~ Charles Wheelan
from the book ‘Naked Statistics: Stripping the Dread from the Data’
18
[DSP’s Motto #2]
19
Dataset 1: Real Estate Transaction Data
•  2012.08 ~ 2013.09
•  All 22 counties/cities of Taiwan
•  About 470,000 records
20
Dataset 2: PIXNET’s open data
The largest blog service
provider in Taiwan
Data opened:
1. Metadata of popular photos
2. Photo EXIF
3. Metadata of popular blogs
4. Visitor logs of popular blogs
*Articles and photos can be retrieved via the API
www.pixnet.net
http://developer.pixnet.pro/
21
Data Fiesta: Team Project Showtime
22
LOVE & EASIER LIVING
Infographic download: http://goo.gl/fKdXXi
An Elders’ Happiness Index for every district in Taipei, based on medical treatment resources,
disease and death, education resources, recreation resources, and social participation
23
What have we learned?
24
Insights Learned from DSP Team
Training
1.  Potential data science team members are everywhere.
But this does not matter without the ability to organize them and
train them to reach their potential.
2.  Individual specialized classes are available.
But there is a lack of classes that combine all this knowledge and
integrate it into a complete end-to-end course.
3.  There is a great deal of data out there,
especially within the Government.
But the Government lacks a strong strategic plan for opening
data for the betterment of society.
4.  Insights are all around us.
But these insights need to be turned into actions.
25
More or Less
1.  More Quality of Life, Less Cynicism
2.  More Real Strategy, Less Bluffing
3.  More Data, Less Guessing
4.  More Correlation, Less Summation
5.  More Cross-over, Less Limitation
Do them right, and
let Data Science help make many things good
26
Taipei, Taiwan
Address: 318, Rueiguang Rd., Taipei 114, Taiwan
Tel : +886-2-77201888
Fax : +886-2-87986069
www.etusolution.com
