This document discusses how companies are exploiting data science in various industries. It gives three examples: using gaming data to create new products, using production line sensor data and machine learning to schedule maintenance, and using a neural network trained on artificially curled pages to remove page curl from scanned books non-destructively. The gaming example increased one company's turnover by €75k to €150k per year. In the production line example, a breakdown costs around £300,000 per day, and the system could scale to serve thousands of production lines worldwide in many different industries.
How Companies Are Leveraging Data Science
1. HOW COMPANIES ARE EXPLOITING DATA SCIENCE
Exploiting the data-driven revolution
27 February 2019
2. Data Analytics – practical examples
Some examples where data analytics has been applied in industry and commerce:
• Games industry
• Production line monitoring
• Digitisation
3. Gaming
• Online gaming involves hundreds of millions of users, with data being collected all the time.
• Complex ecosystem of companies
• Services exist which provide data on game content and usage (Giantbomb, Steam)
• KUMO – a Spanish company whose income comes from value-added products, in this case 3-D printed models.
4. Business Opportunity
Can you drive up revenue by creating new products that will appeal to users, based on user profile and gaming data?
• This needs to be done quickly, as this is a fast-moving industry where trends come and go on a short timescale.
6. Architecture
• Users can observe how a tag's popularity varies over a given time window
• Uses tag popularity to suggest the most popular tags among those related to a set of input tags
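The two views described above (tag popularity within a time window, and suggestion of popular related tags) can be sketched as follows. This is a minimal illustration, not the architecture used in the project: the event data, the function names `tag_popularity` and `suggest_tags`, and the definition of "related" as "co-occurs on the same game" are all assumptions made for the example.

```python
from collections import Counter
from datetime import date

# Hypothetical usage events: (day, tags attached to the game that was played).
EVENTS = [
    (date(2019, 1, 5), {"rpg", "fantasy"}),
    (date(2019, 1, 20), {"rpg", "dragons"}),
    (date(2019, 2, 2), {"rpg", "fantasy", "dragons"}),
    (date(2019, 2, 10), {"racing"}),
    (date(2019, 2, 15), {"fantasy", "dragons"}),
]


def tag_popularity(events, start, end):
    """Count how often each tag occurs in events dated within [start, end]."""
    counts = Counter()
    for day, tags in events:
        if start <= day <= end:
            counts.update(tags)
    return counts


def suggest_tags(events, input_tags, start, end, top_n=3):
    """Rank tags that co-occur with any input tag by their popularity in the window."""
    input_tags = set(input_tags)
    related = set()
    for day, tags in events:
        if start <= day <= end and tags & input_tags:
            related |= tags - input_tags
    popularity = tag_popularity(events, start, end)
    # Most popular first; ties are broken alphabetically for a stable result.
    ranked = sorted(related, key=lambda t: (-popularity[t], t))
    return ranked[:top_n]


feb = (date(2019, 2, 1), date(2019, 2, 28))
print(tag_popularity(EVENTS, *feb))
print(suggest_tags(EVENTS, {"rpg"}, *feb))  # ['dragons', 'fantasy']
```

Changing the window (for example to January) changes both the counts and the suggestions, which is the kind of short-timescale trend the business opportunity depends on.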
7. Benefits
• Smart Retail Recommendation Engine (REC-ENG)
  • Supports marketing campaigns for value-added products aimed at collectors
  • Increase in KUMO turnover of €75k – €150k per year
• 3D DASH
  • Service for game authors. Gives information on which game characteristics are trending.
  • Huge global market
9. Business Opportunity
A breakdown on the production line can be very costly. In
this example from the dairy industry, the product is perishable.
In addition to lost production time, the line must be cleaned
and waste must be disposed of correctly. If maintenance
can be scheduled before a breakdown occurs, a lot of
expense can be saved.
10. Solution
Solution: Use production line condition data and machine learning to schedule maintenance.
The University of Portsmouth is developing ML algorithms, running on EPCC’s HPC systems, to identify conditions that lead to breakdown. Data is supplied by Müller.
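To make the idea of "identifying conditions that lead to breakdown" concrete, the sketch below flags sensor readings that deviate sharply from the recent past. It is a deliberately simple stand-in (a rolling z-score check), not the ML algorithms developed by the University of Portsmouth, which the slides do not describe. The readings, the function name `flag_anomalies`, and the `window` and `threshold` values are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices whose reading deviates from the mean of the preceding
    `window` readings by more than `threshold` sample standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        history = readings[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        # A perfectly flat history has sigma == 0 and gives no basis for a z-score.
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration readings from one machine; the last three values jump upward.
vibration = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.01, 1.45, 1.50, 1.55]
print(flag_anomalies(vibration))  # [7]
```

Only the first jump (index 7) is flagged, because later windows already contain the elevated values. A flagged reading would be a prompt to schedule maintenance before a breakdown occurs; a production system would learn such conditions from historical breakdown data rather than use a fixed threshold.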
11. Architecture
[Architecture diagram. Labelled components: Factory (two Machines), HPC system, Web server + App server, HPC Task Manager, Parallel Machine Learning Algorithm, Database]
12. Benefits
Major savings from fewer production line breakdowns. A
milk-bottling line breaks down around 3 times per year, with
a typical breakdown lasting 6 days at a cost of around
£300,000 per day.
With enough HPC, the system could scale to serve
thousands of production lines across the globe in many
different industries.
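The figures above imply the following cost per line. This is simple arithmetic on the numbers quoted on the slide; it gives the total breakdown cost, which is an upper bound on the saving, since the slides do not say what fraction of breakdowns scheduled maintenance would prevent.

```python
BREAKDOWNS_PER_YEAR = 3      # from the slide: around 3 breakdowns per year
DAYS_PER_BREAKDOWN = 6       # from the slide: typical breakdown lasts 6 days
COST_PER_DAY_GBP = 300_000   # from the slide: around £300,000 per day

cost_per_breakdown = DAYS_PER_BREAKDOWN * COST_PER_DAY_GBP
annual_cost = BREAKDOWNS_PER_YEAR * cost_per_breakdown

print(f"Cost per breakdown: £{cost_per_breakdown:,}")    # £1,800,000
print(f"Annual cost per line: £{annual_cost:,}")         # £5,400,000
```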
13. Scanning of printed material
Scanning of printed material is made complicated by
distortion of the page – page curling.
Machines exist that can automatically turn pages and
scan the book, but non-destructive removal of curling
effects is hard.
14. Solution – Neural Network
• Test data set generated by creating artificially curled pages using a large number of different parameters.
• The NN is trained to classify pages based on the curled shape of the 4 edges (top, bottom, left and right).
• If the classification is correct, the inverse curling function can be applied to the image.
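The pipeline in the bullets above (generate synthetic curls, classify the observed curl, apply the inverse) can be illustrated with a toy model. Everything here is an assumption made for the sketch: the curl is modelled on a single edge as a one-parameter sine-shaped displacement, and a nearest-template match stands in for the trained neural network. The names `curl_offset`, `classify_edge`, `uncurl` and the constants `WIDTH` and `CURL_CLASSES` are hypothetical.

```python
import math

WIDTH = 100                            # hypothetical page width in pixels
CURL_CLASSES = [0.0, 5.0, 10.0, 15.0]  # hypothetical curl amplitudes in pixels


def curl_offset(x, amplitude, width=WIDTH):
    """Vertical displacement at column x under a toy sine-shaped curl."""
    return amplitude * math.sin(math.pi * x / (width - 1))


def curled_edge(amplitude, width=WIDTH):
    """Shape of a nominally straight top edge (y = 0) after curling."""
    return [curl_offset(x, amplitude, width) for x in range(width)]


def classify_edge(edge):
    """Pick the curl class whose synthetic edge is closest in squared error.
    This nearest-template match is a stand-in for the trained neural network."""
    def error(amplitude):
        template = curled_edge(amplitude, len(edge))
        return sum((e - t) ** 2 for e, t in zip(edge, template))
    return min(CURL_CLASSES, key=error)


def uncurl(points, amplitude, width=WIDTH):
    """Apply the inverse curl: subtract the predicted displacement from each point."""
    return [(x, y - curl_offset(x, amplitude, width)) for x, y in points]


# An amplitude-10 edge with a small constant measurement error added.
observed = [y + 0.3 for y in curled_edge(10.0)]
label = classify_edge(observed)
flattened = uncurl(list(enumerate(observed)), label)
print(label, max(abs(y) for _, y in flattened))
```

With the correct class chosen, the flattened edge is straight again up to the 0.3 pixel error that was added. A wrong class would leave a residual curl, which is why the slide makes correct classification the precondition for applying the inverse function.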
15. Benefits
• Non-destructive method of removing curl effects.
• Very large number of old books for which there is
no digital copy
• Many scanning projects are in progress – this is
a very useful technology for curation