Big Data - A view
DBC
14 January 2016
Bjarne Kjær Ersbøll / bker@dtu.dk
DTU Compute, Technical University of Denmark
Acknowledgements
This slide deck is compiled from material from a lot of my colleagues and
people I collaborate with at DTU. The following list is incomplete:
• Jakob Eg Larsen
• Mark Riis
• Mads Odgaard
• Knut Conradsen
• Tage Thyrsted
• Lone Falsig Hansen
• Elena Guarneri
• And many more…
So, what is Big Data anyway?
The 4 V’s
Data
explosion
Crowds, Bluetooth and Rock n’ Roll:
Understanding Music Festival Participant
Behavior
BIG1
3 December 2013
BIG1 purpose
• Identify technological challenges associated with exploiting the
potential of Big Data / data-driven business development to
improve animal health and raise food quality and safety.
BIG1 participants
• DTU Compute
• DTU National Food Institute
• DTU Veterinary Institute
• DTU Management
• DTU Biosys
• DTU Administration
Big Data Value-chain
Data Origins
The Internet, sensors, machines, etc.
Data Collection
Web log, sensor data, images/audio, RFID and videos, etc.
Data Storage
Technologies supporting data storage
Analytics
Predictive analytics, patterns in data, decision making
Consumers
Business processes, humans, and applications
Sense Think Act
Stakeholders in BIG1 value-chain
Value chain: Feed/plants | Animals | Processing | Consumers
Actors: Feed producers, plant producers, equipment producers | Farmers | Abattoir, dairy | Retail sector, export
Data: E.g. feed quality | E.g. growth rate of animals | E.g. efficiency in slaughtering process | Consumer patterns and food quality
Big Data
Generic Big Data problem topics
• Collection of data, e.g. sensors on individuals (e.g. RFID or image analysis)
• Storage, manipulation, real-time data
• Establishing a dynamic Big Data cloud
• Structuring data, distributed data and data sharing
• Merging and integration of databases
• Pattern recognition, machine learning, artificial intelligence, query algorithms
• Multivariate analysis, advanced statistics and data analysis
• Privacy/ethics regarding data
• Visualisation of data for decision support
• Optimisation/speed-up of algorithm functionality and lower cost of computation
Domain / application areas
• Cattle
• Pigs
• Nutritional composition
• … and other applications
Platform project
Targeted projects
BIG1: What can we do?
Sensors and data generation
Hardware and software
Big Data – 1991 – Economic Geology
18.01.2016
Data
• Landsat satellite (common reference) – 4 scenes – 8 tapes
– Geometric rectification, mosaicking, ratios, factor scores,
• Geological – geological maps, topographic maps
– Structural information, lineaments converted to concentrations in 10
directions
• Geochemical – K, Rb, Sr, U, Nb, Y, Ga, Fe in stream sediments.
– Kriging to a 1 km grid, interpolation by bicubic spline to Landsat
pixels
• Radiometric – helicopter-borne gamma-spectrometric measurements,
U, Th, K, and Total concentration.
– Max in 1 km grid interpolated by minimum curvature and further by
bicubic spline
• Aeromagnetic data – 11 map sheets
– Manually digitized and interpolated
• Resulting in 40 variables on a pixel level (50.8m x 50.8m)
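
A minimal sketch of how lineaments could be turned into per-direction counts, assuming each lineament is a straight segment given by two map coordinates and that the 10 directions are equal 18-degree bins over 0 to 180 degrees. The function name `direction_histogram` and the binning rule are illustrative assumptions, not the method used in the 1991 study.

```python
import math


def direction_histogram(segments, n_bins=10):
    """Count line segments per orientation bin over [0, 180) degrees.

    segments: iterable of ((x1, y1), (x2, y2)) coordinate pairs (hypothetical input format).
    """
    width = 180.0 / n_bins
    counts = [0] * n_bins
    for (x1, y1), (x2, y2) in segments:
        # A lineament has no direction of travel, so fold the angle into [0, 180).
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        # min() guards against rounding that lands exactly on 180.0.
        counts[min(int(angle // width), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    example = [((0, 0), (1, 0)), ((0, 0), (0, 1)), ((0, 0), (1, 1))]
    print(direction_histogram(example))
```

Dividing each count by the area of the grid cell would give a concentration per direction; that normalisation is likewise an assumption.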
Data
• Converted to a 5km x 5km grid – trying to preserve information by
taking (when relevant):
– Min, max, 1%, 5%, median, 95%, 99%, mean, stddev, %land-cover
– 240 variables in all in 1084 squares
• Training set of
– 17 mineralized, central
– 21 mineralized, marginal
– 14 barren, central
– 5 barren, marginal
• Discriminant analysis using stepwise selection
– 1084 squares classified
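
A sketch of the per-square summary step, assuming the pixel values of one variable inside a 5 km square are available as a list with `None` for non-land pixels. The nearest-rank percentile rule, the population standard deviation, and the names `percentile` and `summarize_square` are assumptions for illustration; the slides do not say how these statistics were computed.

```python
import statistics


def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already sorted list; p is an integer in 1..100."""
    rank = max(1, (p * len(sorted_vals) + 99) // 100)  # integer ceiling of p*n/100
    return sorted_vals[rank - 1]


def summarize_square(pixels):
    """Reduce the pixels of one grid square to the ten statistics named on the slide.

    pixels: list of numbers, with None marking pixels that are not land (assumed encoding).
    Returns None when the square contains no land pixels.
    """
    vals = sorted(v for v in pixels if v is not None)
    if not vals:
        return None
    return {
        "min": vals[0],
        "max": vals[-1],
        "p1": percentile(vals, 1),
        "p5": percentile(vals, 5),
        "median": statistics.median(vals),
        "p95": percentile(vals, 95),
        "p99": percentile(vals, 99),
        "mean": statistics.fmean(vals),
        "stddev": statistics.pstdev(vals),  # population form, an assumption
        "land_cover_pct": 100.0 * len(vals) / len(pixels),
    }


if __name__ == "__main__":
    square = list(range(1, 101)) + [None] * 100
    print(summarize_square(square))
```

Applying such a summary to each of the 40 pixel-level variables, and keeping only the statistics that are relevant for each one, is how a pixel data set can shrink to a few hundred variables per square.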
Big Data ?
Other Big Data cases
ELIXIR: Data describing human genetic variation. Development of
personalised medical drugs that take variation between patients
into account.
Global Microbial Identifier: Global system for genome-sequence data
from microorganisms to improve national clinical diagnostics and
international surveillance of diseases.
CITIES: IT solutions for analysis, operation and development of
integrated energy systems (electricity, gas, district heating and
biomass) in cities to achieve higher flexibility in, e.g.,
energy storage.
Data Science (Big Data)
Profile at DTU Compute
Data Science – main elements
Ambitious – courses: 45 ECTS (4/6
core) + thesis: A further 30-35 ECTS
Pioneering – across the Big Data
value chain and competences
Application oriented:
o Work with concrete data sets
o Collaboration with companies
Entry via all 3 DTU Compute programs
• Computer Science and Engineering
• Mathematical Modelling and Computation
• Digital Media Engineering
• …and now also: IT & Health (a joint programme between KU and DTU)
• Cross-educational skills
Big Data Value chain
data BIG data model
analysis
Data Origins
The Internet, sensors,
machines, etc.
Data Collection
Web log, sensor data,
images/audio, RFID and
videos, etc.
Data Storage
Technologies
supporting data storage
Analytics:
Predictive analytics,
patterns in data,
decision making
Consumers:
Business processes,
humans, and
applications
Sense Think Act
Courses in Data Science specialization
Origin Collection Storage Analytics Consumers
01227 Graph theory (5) 1 3
01405 Error correcting codes 2 1 1
01617 Dynamical Systems 1 2
02170 Database systems (5) 4
02232 Applied Cryptography (5) 2 3 1 1
Core 02239 Data Security 1 4 1
02249 Computationally hard problems (7.5) 1 1 4
02266 User experience engineering 1 1 5
02281 Data Logic (5) 1 2 1 1
Core 02282 Algorithms for Massive Data Sets (7.5) 2 3 3
Core 02288 Missing a course on “Advanced databases/warehouses”? 2
02407 Stochastic Processes (5) 3
02409 Multivariate Statistics (5) 4
02417 Time Series Analysis (5) 4
02443 Stochastic Simulation (5) 4 1
02450 Introduction to Machine Learning and Data Modeling (5) 3 1
02457 Non-linear signal processing 1 1
02458 Cognitive Modelling (5) 3 2
02460 Advanced Machine Learning (5) 1 3 1
02506 Advanced Image Analysis 3
02515 Health technology 1 2
Core 02582 Computational data analysis 3
02586 Statistical Genetics (5) 2
Core 02806 Social data analysis and visualization (5) 2 3
Core 02819 Data Mining using Python (5) 1 3 1
30530 Geographical information systems 1 1 1
25303 Mathematical Biology 1 1 1 1
27411 Biological data analysis and chemometrics 1
27625 Algorithms in bioinformatics 1 1
42112 Mathematical Programming with Modelling Software 1 1
Big Data
Hackathon
65 students
10 groups
48 hours
DTU's Skylab
Funding
1-2 start up companies
Big Data solutions for Lyngby-Taarbæk
municipality
“Smart City app” to make it a better place to
live
Projects!
Energy utilization in buildings
Optimization of Bus-routes
Smart Traffic-regulation
Smart Energy renovation
Personalized Care for elderly
Smart tests for the Schools
Flexible collection of Waste
Implementation of first recommendation:
Big Data•DTU