Era of Data Economy
Jun Miyazaki Ph.D.
CEO OrangeTechLab Inc.
Co-project leader, GMS lab, Komazawa Univ.
Visiting Collaborator, National Institute of
Advanced Industrial Science and Technology (AIST)
1
AI and Fintech
brought us “data” as a new economy
2
• Data have become like “currency” with
the development of AI (deep learning) and
Fintech (blockchain)
• Value of “currency” = a number
ex) 10M USD, 1,00K yen, etc.
• Value of “data” = data structure + context
Deep Learning applications have
been changing programming
• Program = data + logic
3
① Application programs are
variations of pre-defined patterns
② Utilize pre-trained networks
③ Seek best-fit parameters
④ Need huge computational power
① Data quality and
quantity make the
difference
② Annotated data for
training is hard to obtain
Less value / More value
Birth of “Data Economy”
Deep Neural Network
AI paradigm shift
Blockchain has introduced a trusted
contract network
• Bitcoin is an application of blockchain
• Many trusted contract applications will come
• In short: data structure + context
• This also leads us to data as an economy
• Value will be affected by the context
4
Blockchain
Bitcoin
Trusted contract
applications
ICO
Domain Knowledge
• Data Source
• Semantics of Data
• Legal
Data Science
• Data Model
• Analytics Framework
• Statistical model
• Machine Learning
• Deep Learning
Systems Architecture
• Building System
Architecture
• Rapid Dashboard
Consultation for IoT & Data Science
Multi-Disciplinary Business Architecture
The diagram shows the multi-disciplinary roles
in a consulting business for IoT / AI
Ex) Sales,
Plant engineers,
Lawyers
Domain Knowledge
• Data Source
• Semantics of Data
• Legal
Data Science
• Data Model
• Analytics Framework
System Architecture
• Building System Architecture
• Rapid Dashboard
Consultation for IoT & Data Science
Multi-Disciplinary Business Architecture
• Domain Knowledge depends on the application area.
Each application has its own data sources and semantics of data.
• Data Science is necessary to build data models and analytics
frameworks in order to clarify domain knowledge.
• System Architecture specifies the methods of data processing, such as
building the system architecture and rapid dashboards for users.
• Consultation integrates these three disciplines: each
specialty is totally different, and consultation is the only way for them
to communicate with each other.
Value of data is different among the roles
A) For domain knowledge specialists,
1. Data privacy is most important
2. Creating annotated data is expensive and
resource-intensive
3. NDAs take a long time (several months)
7
Domain Knowledge
• Data Source
• Semantics of Data
• Legal
[Figure: data d1, d2, d3, ..., di, ..., dn]
Ex) Many privacy issues:
HR data, healthcare data,
human faces, etc.
Value of data is different among the roles
B) For data scientists,
1. Features of data are important
2. Which kind of analytic model should be
adopted?
3. NDAs are important but, in practice, matter less (?!)
8
[Figure: data d1, d2, d3, ..., di, ..., dn]
Data Science
• Features of data
• Data Model
• Analytics Framework
• Statistical model
• Machine Learning
• Deep Learning
Value of data is different among the roles
C) For systems architects,
1. Data quantity and duration are important
2. Estimating computational power (CPU,
memory, and GPU for deep learning) is important
3. How to visualize the data, together with data scientists, is key
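The sizing concern above (data quantity and duration) can be sketched as simple arithmetic. This is a minimal illustration only: the `SensorStream` class, the sensor counts, the sampling rates, and the payload sizes are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class SensorStream:
    """One kind of sensor feed. All figures used below are illustrative assumptions."""
    count: int               # number of sensors of this kind
    samples_per_sec: float   # sampling rate of each sensor
    bytes_per_sample: int    # payload size of one reading


def bytes_per_day(streams):
    """Total raw bytes produced per day by all streams."""
    seconds_per_day = 24 * 60 * 60
    return sum(
        s.count * s.samples_per_sec * s.bytes_per_sample * seconds_per_day
        for s in streams
    )


def storage_gb(streams, retention_days):
    """Raw storage needed for the retention period, in gigabytes (10^9 bytes)."""
    return bytes_per_day(streams) * retention_days / 1e9


# Hypothetical deployment: two kinds of sensors sampled once per second.
streams = [
    SensorStream(count=100, samples_per_sec=1.0, bytes_per_sample=64),
    SensorStream(count=100, samples_per_sec=1.0, bytes_per_sample=32),
]

print(f"{bytes_per_day(streams) / 1e6:.1f} MB/day")
print(f"{storage_gb(streams, retention_days=90):.1f} GB for 90 days")
```

An estimate like this gives the architect a first figure for storage and traffic before any real data is available; CPU and GPU sizing would need a separate estimate based on the chosen models.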
9
[Figure: data d1, d2, d3, ..., di, ..., dn]
System Architecture
• Building System Architecture
• Rapid Dashboard
Value of data is different among the roles
D) For AI/IoT consultants,
1. Consultants need multi-disciplinary knowledge to
coordinate the other three roles
2. They need to understand how the meaning of
data differs among the other roles
10
[Figure: data d1, d2, d3, ..., di, ..., dn]
Consultation for IoT & Data Science
Analogy = PoC (Proof of Concept) iteration
• A PoC (Proof of Concept) is a tool for
communicating among the different disciplines
• Rapid prototyping and iteration of PoCs
become a way of telling analogies among the roles
11
PoC
Importance of analogy in multidisciplinary: a conversation with Prof. Nagib Callaos
Seeing is Believing
Through
Analogy telling
Interaction between the roles
1. An NDA takes several months
2. A solution
① Provide the data schema (CSV table definition)
② Provide example data
③ Then the domain knowledge specialist
/ manager can start telling analogies
④ New feature models can be guessed
by data scientists
12
Domain Knowledge
• Data Source
• Semantics of Data
• Legal
Data Science
• Data Model
• Analytics Framework
• Statistical model
• Machine Learning
• Deep Learning
Examples
13
Example 1: Aging healthcare
A business intelligence application
for automated IoT healthcare systems
in hospitals for aging seniors
14
Japanese population trends
• The Japanese population will decrease dramatically in
the coming 20 years
• Japan is the most rapidly aging society
15
2015, “Population Projections for Japan”, IPSS(www.stat.go.jp)
Example 1: Aging healthcare
• Aging-care hospitals need an IoT system to capture seniors'
movements with several IoT sensors.
• Bed sensors, heartbeat sensors, etc.
• The hidden purpose is to assign care staff effectively and
reduce their number
16
Sensing lying down or getting up / Wearables / Heart pressure sensor
A) Systems architects and data scientists consider which IoT
sensors can fulfil the needs of the domain knowledge specialists (care
staff, doctors, etc.)
B) There is always a semantic gap between the domain knowledge
specialists’ intention and the raw sensor data
Semantic gap between data and meanings
• Sensor data and domain specialists’ intention have
semantic gaps
17
Meanings: target to solve
IoT sensors, captured data
Semantic
gap
Can sensor data describe senior people’s
real situation or not?
1. Is the senior really on the bed (or is it another person)?
• How can we detect the ID of the senior?
2. Are they in their rooms?
3. Too many sensors are not affordable
4. There may be cost-performance, traffic, and
security / privacy issues vs. precise semantics
18
One way to fill the semantic gap
1. An algebra between sensor data and
semantics
2. We have introduced a
time / space / direction algebra
19
20
Our Approach in Time-Space-Direction Algebra [6]
Design of an algebra of objects for spatial, temporal,
and directional features in a ubiquitous environment.
[Figure: objects A and B meeting whiteboard H]
(A adds B) meets_s H = (A meets_s H) adds (B meets_s H)
• By introducing the algebra, high descriptive
capability and a reduction of computational complexity
are realized.
• By introducing the directions
of objects, high precision
for semantics is realized.
(a) Temporal operators (relationship diagrams not reproduced)
o1 before_t o2
o1 after_t o2
o1 during_t o2
o1 contains_t o2
o1 overlaps_t o2
o1 overlapped_t o2
o1 meets_t o2
o1 metby_t o2
o1 starts_t o2
o1 startedby_t o2
o1 finishes_t o2
o1 finishedby_t o2
o1 equals_t o2
(b) Spatial operators (relationship diagrams not reproduced)
o1 disjoint_s o2
o1 contains_s o2
o1 inside_s o2
o1 equal_s o2
o1 meets_s o2
o1 covers_s o2
o1 coveredby_s o2
o1 overlaps_s o2
Temporal and Spatial Operators in Time-Space-Direction Algebra
Please refer to our past paper
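The thirteen temporal operator names match Allen's interval relations. Assuming they carry the same meaning here (the slides do not define them, so this is an assumption rather than the authors' definition), a minimal sketch of how one relation is chosen for a pair of time intervals could look as follows. The `Interval` type and the function name are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """A closed time interval; start < end is assumed."""
    start: float
    end: float


def temporal_relation(o1: Interval, o2: Interval) -> str:
    """Return the name of the temporal operator that holds between o1 and o2."""
    if o1.end < o2.start:
        return "before"
    if o2.end < o1.start:
        return "after"
    if o1.end == o2.start:
        return "meets"
    if o2.end == o1.start:
        return "metby"
    if o1 == o2:
        return "equals"
    if o1.start == o2.start:
        return "starts" if o1.end < o2.end else "startedby"
    if o1.end == o2.end:
        return "finishes" if o1.start > o2.start else "finishedby"
    if o2.start < o1.start and o1.end < o2.end:
        return "during"
    if o1.start < o2.start and o2.end < o1.end:
        return "contains"
    # The intervals share an inner part, and both their starts and ends differ.
    return "overlaps" if o1.start < o2.start else "overlapped"


# Hypothetical readings: a senior is in bed from minute 0 to 60,
# and a care visit lasts from minute 20 to 30.
in_bed = Interval(0, 60)
visit = Interval(20, 30)
print(temporal_relation(visit, in_bed))
```

Exactly one of the thirteen names is returned for any two valid intervals, which is what lets sensor timestamps be turned into symbolic statements that a domain specialist can read.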
TensorFlow
AWS (EC2 (GPU Instances), S3)
NVIDIA
TK1, TX1
Mac
Dashboard Application
Docker on
Windows
Python (NumPy or PyCUDA)
A sample systems architecture
for an aging healthcare system
• This is an example of a systems architecture for a deep
learning system
• Systems architects have to consider which technologies
are adequate to solve the issues in each project
Systems point: durable architecture
23
[Figure: many IoT sensors (many data with 1 s duration) → edge computer (data unification) → cloud (upload for precise analysis)]
• Too many sensors cause a lot of traffic to the cloud if you just
pass the data through
• For data unification, an edge server on the client side can
minimize the traffic to the cloud
Cloud
for
analyzing
Edge
computer
1. Data filtering & unification
2. Partial deep learning
3. Abnormal state detection by
machine learning
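The edge-side steps above can be sketched in a few lines. This is only an illustration under stated assumptions: the window length, the threshold, and the function names are hypothetical, and a plain z-score test stands in for the "abnormal state detection by machine learning" step, which the slides do not specify.

```python
from statistics import mean, pstdev


def unify(readings, window=60):
    """Collapse per-second readings into one summary record per window."""
    summaries = []
    for i in range(0, len(readings), window):
        chunk = readings[i:i + window]
        summaries.append({
            "start": i,            # index of the first reading in the window
            "n": len(chunk),
            "mean": mean(chunk),
            "min": min(chunk),
            "max": max(chunk),
        })
    return summaries


def abnormal_indices(readings, threshold=3.0):
    """Indices whose value is more than `threshold` standard deviations from the mean."""
    mu, sigma = mean(readings), pstdev(readings)
    if sigma == 0:
        return []
    return [i for i, v in enumerate(readings) if abs(v - mu) / sigma > threshold]


# Hypothetical two minutes of 1 s heart-rate readings with one spike at the end.
readings = [70] * 119 + [150]
summaries = unify(readings)
alerts = abnormal_indices(readings)
print(len(readings), "raw readings ->", len(summaries), "summaries,", len(alerts), "alert(s)")
```

In this sketch only the window summaries and the flagged readings would be uploaded, which is how the edge server reduces traffic to the cloud while still forwarding the events that need precise analysis.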
Example 2: Professional driver support
1. Domain knowledge: comes from taxi or bus
companies (supporting professional drivers)
• Preventing drowsy driving and taxi robbery
2. Data scientists: have methodologies for
analyzing data on objects and the directions of
objects (gaze)
3. System architects: work with car-mounted cameras
and their communication systems.
• We show real-time face recognition for car-mounted
cameras.
4. Consultants: manage the other three
disciplines.
Example 2: PoC for analogy telling
Based on a simple PoC, each role will be able to
tell its analogy stories
(Demo video removed)
Example 2: Gaze analysis PoC
• The system is built on OpenCV, using an ordinary
Macintosh and cameras.
• The red arrows show the direction of faces (gaze).
• Various applications and businesses can be built on
top of it.
Gaze detection
Detecting
drowsy driving
Detecting
taxi robbery
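As an illustration of an application built on top of gaze detection, the sketch below flags possible drowsy driving from a sequence of per-frame face pitch angles. It assumes that an upstream detector (for example the OpenCV-based PoC) already produces one pitch value per frame, with negative values meaning the face points downward and `None` meaning no face was found. The threshold, the frame count, and the function name are hypothetical, not taken from the slides.

```python
def dozing_intervals(pitch_deg, down_threshold=-20.0, min_frames=30):
    """Return (start, end) frame ranges where the face points downward
    for at least `min_frames` consecutive frames (end is exclusive)."""
    intervals = []
    start = None
    for i, pitch in enumerate(pitch_deg):
        if pitch is not None and pitch < down_threshold:
            if start is None:
                start = i          # a downward run begins here
        else:
            if start is not None and i - start >= min_frames:
                intervals.append((start, i))
            start = None           # the run, if any, has ended
    # Close a run that lasts until the final frame.
    if start is not None and len(pitch_deg) - start >= min_frames:
        intervals.append((start, len(pitch_deg)))
    return intervals


# Hypothetical sequence: looking ahead, head down for 40 frames, then ahead again.
frames = [0.0] * 10 + [-30.0] * 40 + [0.0] * 10
print(dozing_intervals(frames))
```

A short glance downward (fewer than `min_frames` frames) is ignored, so the rule separates checking the meter from a sustained head drop. A real system would need thresholds tuned with the drivers' company, which is exactly the kind of discussion a PoC is meant to start.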
Example 3: PoC with Deep Learning
• New types of object recognition are needed
• Use an existing trained network for the detection
(transfer learning)
• Ex) VGG16 can identify 1000 object classes; extend it to
recognize new types of objects
• The program is a few hundred lines, but the data must be huge
27
VGG16
Transfer
Learning
Can be extended
to new types of
recognition
based on domain
issues
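The idea of transfer learning can be shown with a toy sketch: the pre-trained part stays frozen, and only a small new head is trained on the new labels. The code below is not VGG16. The fixed matrix `FROZEN_W` is a hypothetical stand-in for a pre-trained network, and the samples and labels are made up for illustration.

```python
import math

# Stand-in for a pre-trained network: fixed weights that are never updated.
FROZEN_W = [[1.0, -1.0], [0.5, 0.5], [-1.0, 2.0]]


def pretrained_features(x):
    """Frozen feature extractor: ReLU(W x) with the fixed weights above."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]


def train_head(samples, labels, epochs=200, lr=0.1):
    """Train only a new logistic-regression head on top of the frozen features."""
    feats = [pretrained_features(x) for x in samples]
    w, b = [0.0] * len(FROZEN_W), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            z = sum(wi * fi for wi, fi in zip(w, f)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y                                   # gradient of the log loss w.r.t. z
            w = [wi - lr * g * fi for wi, fi in zip(w, f)]
            b -= lr * g
    return w, b


def predict(head, x):
    """Classify x with the frozen features plus the trained head."""
    w, b = head
    z = sum(wi * fi for wi, fi in zip(w, pretrained_features(x))) + b
    return 1 if z > 0 else 0


# Hypothetical new task with only four annotated samples.
samples = [(2, 0), (3, 1), (0, 2), (1, 3)]
labels = [1, 1, 0, 0]
head = train_head(samples, labels)
print([predict(head, x) for x in samples])
```

With a real network the pattern is the same: the convolutional layers of the trained model are kept fixed and a new classification layer is fitted to the domain's annotated images, which is why the program stays short while the annotated data remains the expensive part.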
Conclusion
1. Now is the time of the “data economy”
2. Compared to currency, data have structure and
context
3. The meaning of data differs among domain
knowledge experts, data scientists, systems
architects, and total system consultants
4. There is always a semantic gap between data and
the domain specialist’s intention
5. Iteration of PoCs is like telling analogies
6. PoCs can help interaction among multi-disciplinary
people, and help design and implement
business-oriented AI applications, especially deep learning.
Future Work: difficulty of obtaining an initial dataset
• One of the most urgent problems in big data analysis
is that it is hard to get access to big data in a given domain.
• When we develop information systems for big
data, this problem is critical.
- Privacy issues of company data (HR data, healthcare data)
- Medical data
- Anonymized data may be one answer, but even with such
technology, it takes a long time to get the data
• A data generation method for big data, especially
for prototypes (PoCs), is needed
Initial PoC may need pseudo data
1. An NDA takes several months
2. A solution
① Provide the data schema (CSV table definition)
② Provide example data
③ Generate pseudo initial data
④ Then the domain knowledge specialist
/ manager can start telling analogies
⑤ New feature models can be guessed
by data scientists and domain specialists
30
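The pseudo-data step can be sketched as follows: given only a schema, generate rows that have the right shape so a PoC can start before the NDA is signed. The schema, the column names, the value ranges, and the function names are hypothetical examples and are not taken from any real dataset.

```python
import csv
import io
import random

# Hypothetical schema: column name -> (kind, parameters).
SCHEMA = {
    "patient_id": ("id", None),
    "heart_rate": ("int", (50, 120)),
    "on_bed": ("choice", [0, 1]),
}


def generate_rows(schema, n, seed=0):
    """Generate n pseudo rows that follow the schema (seeded, so repeatable)."""
    rng = random.Random(seed)
    rows = []
    for i in range(n):
        row = {}
        for name, (kind, param) in schema.items():
            if kind == "id":
                row[name] = f"P{i:04d}"
            elif kind == "int":
                row[name] = rng.randint(*param)   # inclusive range
            elif kind == "choice":
                row[name] = rng.choice(param)
            else:
                raise ValueError(f"unknown column kind: {kind}")
        rows.append(row)
    return rows


def to_csv(schema, rows):
    """Serialize the rows as CSV text with a header taken from the schema."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(schema))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = generate_rows(SCHEMA, 5)
print(to_csv(SCHEMA, rows))
```

Uniform random values carry no real correlations, so such data is only good for exercising the pipeline and the dashboard. The feature models guessed from it still have to be checked against real data once it becomes available.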
Domain Knowledge
• Data Source
• Semantics of Data
• Legal
Data Science
• Data Model
• Analytics Framework
• Statistical model
• Machine Learning
• Deep Learning
About GMS laboratory activity
1. Komazawa Univ. is one of the oldest universities
in Japan
2. The Faculty of Global Media Studies (GMS) was established
10 years ago
3. It is a multi-disciplinary department
4. GMS lab is a collaborative research system
among university professors, students,
outside private companies, other researchers,
and domain knowledge specialists.
5. This presentation is the first output of
one of the new GMS lab activities
31
Media and Contents
Creating New Area
Business
Admin.
Economics
Sociology
Law
Communication
Policy
Management
Informatics
Culture
Faculty of Global Media Studies, Komazawa University
Practical English Education
IT Literacy Media Literacy
Practical Collaboration between
Industry and Academism
Global Research/Education and Interdisciplinarity
Biz
Architecture
Lab
GMS lab: biz architecture lab. formation
Komazawa univ.
professors
Komazawa univ.
PhD candidates
Komazawa univ.
undergraduate
students
Domain knowledge
Specialists
Data Scientists
Systems
Architects /
implementers
HR,
Medical/healthcare
AI system planner
Etc.
Machine learning
Deep Learning
Cloud systems
Web applications
Mobile systems
AI/IoT solution
Consultants
34
Thank you very much!
Questions?
35

Editor's Notes

  • #4 Application variations such as CNN (convolutional NN), RNN (recurrent NN), and GAN (generative adversarial networks). DNNs initially gave an “eye” to computers, but these days DNNs also serve time-domain applications, which means natural language processing: from symbols to patterns, and from patterns to semantics.
  • #6 Figure 1: Multi-Disciplinary Business Architecture by Domain Knowledge, Data Science, System Architecture, and Consultation for them
  • #7 Figure 1: Multi-Disciplinary Business Architecture by Domain Knowledge, Data Science, System Architecture, and Consultation for them
  • #15 As you may know, Japan is rapidly becoming an aging society; some say Japan is doing the R&D for aging societies.
  • #21 We may need the time-space of patients and the time-space of care staff, and we need direction to fill the semantic gap.
  • #22 I do not cover this algebra in detail; please refer to our past paper in the references.
  • #23 Figure 5: Example of Machine Learning Architecture (for the TensorFlow case)
  • #26 Rapid prototyping: a PoC gives us insights to start telling analogies, and the PoC itself is an analogy.
  • #27 Ten years ago, detecting gaze was a hard problem, but nowadays open source software like OpenCV makes it easy to prototype this kind of analysis. Gaze detection can give data scientists good insight into how to solve professional drivers' issues.
  • #28 We were challenged by a company to recognize the results of their manual process measurements. We have supplied some real images with annotated tags, which we call teacher answer tags for the data.