Big Data and Small Devices
Katharina Morik, Artificial Intelligence, 
TU Dortmund University, Germany
Overview
- Big data -- small devices
- Graphical Models
- Computation on the batch layer
- Computation on the speed layer
- Integer Random Fields
- Spatio-Temporal Random Fields
- Traffic Scenarios
- Using the streams framework
Big data – small devices
- Machine learning for the enhancement of small devices
- Machine learning on small devices
- Reduce
  - Computational complexity
  - Memory consumption
  - Communication
  - Energy
Graphical Models
- Graph G = (V, E)
- Multivariate random variable X with finite discrete domain $\mathcal{X}$
- Mapping of a joint vertex assignment into a vector space: $\phi : \mathcal{X} \to \mathbb{R}^d$
- Parameter to be learned: $\theta \in \mathbb{R}^d$
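CRF:
$$p(\vec{y} \mid \vec{x}) = \frac{1}{Z(\theta, \vec{x})} \exp\Big(\sum_i \theta_i \phi_i(\vec{y}, \vec{x})\Big)$$
Using $\frac{1}{a} = \exp(-\ln a)$:
$$p(\vec{y} \mid \vec{x}) = \exp\Big(\sum_i \theta_i \phi_i(\vec{y}, \vec{x}) - \ln Z(\theta, \vec{x})\Big) = \exp\big[\langle \theta, \phi(\vec{y}, \vec{x}) \rangle - A(\theta)\big]$$
with $A(\theta) = \ln Z(\theta, \vec{x})$.
MRF:
$$p(\vec{x}) = \frac{1}{Z(\theta)} \exp\Big(\sum_i \theta_i \phi_i(\vec{x})\Big) = \exp\big[\langle \theta, \phi(\vec{x}) \rangle - A(\theta)\big]$$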
Conditional Random Field: Likelihood
Markov Random Field: Likelihood
Markov Random Field: Estimation and Prediction
Belief propagation
- Computes the marginal probabilities, i.e. the expected feature values given θ, by message passing in each direction along every edge.
Note: vectors are sparse
φ(x): ( /* Nodes */
  1. red, /* dom(lights) */
  2. yellow,
  3. green,
  4. wet, /* dom(rain) */
  5. dry,
  6. jam, /* dom(jam) */
  7. free,
  /* Edges */
  1. red, dry, /* edge lights-rain */
  2. yellow, dry,
  3. green, dry,
  4. red, wet,
  5. yellow, wet,
  6. green, wet,
  7. red, jam, /* edge lights-jam */
  8. yellow, jam,
  9. green, jam,
  10. red, free,
  11. yellow, free,
  12. green, free,
  13. dry, free, /* edge rain-jam */
  14. dry, jam,
  15. wet, free,
  16. wet, jam )
- x1: (red, dry, free)
- φ(x1) = (1,0,0,0,1,0,1, 1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0)
- d = 23
[Figure: graph with the nodes Lights {red, yellow, green}, Rain {dry, wet} and Jam {jam, free}]
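The feature map above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the names `DOMAINS`, `phi`, `sparse_phi` and `mrf_probability` are hypothetical, and the partition function is computed by brute-force enumeration of the 12 joint states, which is only feasible for such a tiny model.

```python
import itertools
import math

# Node domains in the order used on the slide.
DOMAINS = {
    "lights": ["red", "yellow", "green"],
    "rain": ["wet", "dry"],
    "jam": ["jam", "free"],
}

# 7 node features followed by 16 edge features, in slide order.
NODE_FEATURES = [(v, s) for v, states in DOMAINS.items() for s in states]
EDGE_FEATURES = (
    [("lights", l, "rain", r) for r in ("dry", "wet") for l in DOMAINS["lights"]]
    + [("lights", l, "jam", j) for j in ("jam", "free") for l in DOMAINS["lights"]]
    + [("rain", r, "jam", j) for r in ("dry", "wet") for j in ("free", "jam")]
)


def phi(x):
    """Dense indicator vector for a joint assignment x (dict: node -> state)."""
    nodes = [int(x[v] == s) for v, s in NODE_FEATURES]
    edges = [int(x[a] == sa and x[b] == sb) for a, sa, b, sb in EDGE_FEATURES]
    return nodes + edges


def sparse_phi(x):
    """Sparse form: only the (0-based) indices of the non-zero components."""
    return [i for i, value in enumerate(phi(x)) if value]


def mrf_probability(x, theta):
    """p(x) = exp(<theta, phi(x)> - A(theta)), with A(theta) by enumeration."""
    def score(state):
        return sum(theta[i] for i in sparse_phi(state))

    states = [dict(zip(DOMAINS, combo))
              for combo in itertools.product(*DOMAINS.values())]
    log_z = math.log(sum(math.exp(score(s)) for s in states))
    return math.exp(score(x) - log_z)


x1 = {"lights": "red", "rain": "dry", "jam": "free"}
vector = phi(x1)
uniform = mrf_probability(x1, [0.0] * 23)
```

Only 6 of the 23 components are non-zero for any assignment (3 node states and 3 edge state pairs), which is why storing indices instead of the dense vector pays off.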
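Belief propagation example
$$m_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} \exp\big(\theta_{vu=xy} + \theta_{v=x}\big) \prod_{w \in N_v \setminus \{u\}} m_{w \to v}(x)$$
Compute for each node u and each of its neighbors and each of the states.
[Figure: graph with the nodes Lights {red, yellow, green}, Rain {dry, wet} and Jam {jam, free}]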
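The message update can be tried out on a small tree, where sum-product belief propagation is exact. The sketch below is an illustration under stated assumptions: the lights-jam edge of the example graph is dropped so that the graph becomes a chain without cycles, all parameter values are made up, and the function names are hypothetical. The result is compared against brute-force marginals.

```python
import itertools
import math


def edge_weight(theta_edge, v, x, u, y):
    """theta_{vu=xy}; each undirected edge is stored once."""
    if (v, u) in theta_edge:
        return theta_edge[(v, u)][x][y]
    return theta_edge[(u, v)][y][x]


def belief_propagation(neighbors, states, theta_node, theta_edge):
    """Sum-product marginals on a tree using the message update m_{v->u}(y)."""
    messages = {}

    def message(v, u):
        if (v, u) not in messages:
            incoming = [message(w, v) for w in neighbors[v] if w != u]
            messages[(v, u)] = [
                sum(
                    math.exp(edge_weight(theta_edge, v, x, u, y) + theta_node[v][x])
                    * math.prod(m[x] for m in incoming)
                    for x in range(states[v])
                )
                for y in range(states[u])
            ]
        return messages[(v, u)]

    marginals = {}
    for u in neighbors:
        belief = [
            math.exp(theta_node[u][y])
            * math.prod(message(v, u)[y] for v in neighbors[u])
            for y in range(states[u])
        ]
        total = sum(belief)
        marginals[u] = [b / total for b in belief]
    return marginals


def brute_force_marginals(neighbors, states, theta_node, theta_edge):
    """Reference marginals by enumerating every joint assignment."""
    nodes = sorted(neighbors)
    sums = {v: [0.0] * states[v] for v in nodes}
    for assignment in itertools.product(*(range(states[v]) for v in nodes)):
        x = dict(zip(nodes, assignment))
        score = sum(theta_node[v][x[v]] for v in nodes)
        score += sum(theta_edge[(a, b)][x[a]][x[b]] for (a, b) in theta_edge)
        for v in nodes:
            sums[v][x[v]] += math.exp(score)
    return {v: [p / sum(m) for p in m] for v, m in sums.items()}


# Hypothetical chain lights - rain - jam with made-up parameters.
states = {"lights": 3, "rain": 2, "jam": 2}
neighbors = {"lights": ["rain"], "rain": ["lights", "jam"], "jam": ["rain"]}
theta_node = {"lights": [0.2, -0.1, 0.4], "rain": [0.0, 0.3], "jam": [-0.5, 0.5]}
theta_edge = {
    ("lights", "rain"): [[0.1, 0.0], [0.0, 0.2], [-0.3, 0.4]],
    ("rain", "jam"): [[0.6, -0.2], [0.0, 0.1]],
}
bp = belief_propagation(neighbors, states, theta_node, theta_edge)
exact = brute_force_marginals(neighbors, states, theta_node, theta_edge)
```

On graphs with cycles, such as the full triangle of the example, the same update is typically iterated until the messages stop changing, and the resulting marginals are then approximate.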
Parallel Belief Propagation (GPU)
- Data parallel: all node and neighbor pairs become blocks with b threads. A thread computes the message v → u.
- Thread-cooperative: all node and instance pairs become blocks. A thread computes partial outgoing messages. These are cooperatively processed by all threads in a block. Partial messages are merged afterwards.
Nico Piatkowski (2011) Parallel Algorithms for GPU Accelerated Probabilistic Inference
http://biglearn.org/2011/files/papers/biglearn2011_submission_28.pdf
File Prefetching for Smartphone Energy Saving
- Accessing files that are stored in the cache consumes less energy than accessing those on the hard disc.
- Prefetching files that will soon be accessed allows requests to be grouped into one batched request. This saves energy.
→ Prediction of file access saves energy!
Given a system call:
1,open,1812,179,178,201,200,firefox,/etc/hosts,524288,438,7
Predict: hosts full
Prefetch: hosts
2,read,1812,179,178,201,200,firefox,/etc/hosts,4096,361
3,read,1812,179,178,201,200,firefox,/etc/hosts,4096,0
4,close,1812,179,178,201,200,firefox,/etc/hosts
Fields: timestamp, syscall, thread-id, process-id, parent, user, group, exec, file, parameters (optional)
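To make the record layout concrete, the following sketch parses one logged system call into its named fields. It is only an illustration: the `SystemCall` class and `parse_system_call` function are hypothetical names, the field order follows the list above, and the code assumes that file paths contain no commas. The meaning of the optional parameters is not given in the source, so they are kept as raw strings.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SystemCall:
    timestamp: int
    syscall: str
    thread_id: int
    process_id: int
    parent: int
    user: int
    group: int
    executable: str
    file: str
    parameters: List[str] = field(default_factory=list)


def parse_system_call(line: str) -> SystemCall:
    """Split a comma-separated log line into the nine fixed fields plus parameters."""
    parts = line.strip().split(",")
    if len(parts) < 9:
        raise ValueError("expected at least 9 comma-separated fields")
    timestamp, syscall, thread_id, process_id, parent, user, group, executable, path = parts[:9]
    return SystemCall(
        int(timestamp), syscall, int(thread_id), int(process_id),
        int(parent), int(user), int(group), executable, path, parts[9:],
    )


open_call = parse_system_call(
    "1,open,1812,179,178,201,200,firefox,/etc/hosts,524288,438,7")
close_call = parse_system_call(
    "4,close,1812,179,178,201,200,firefox,/etc/hosts")
```

A sequence of such records, grouped by process and file, is the kind of input from which a file-access predictor could be trained.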
Results: Memory
- Naive Bayes models use less memory than regular CRF models.
- Implementing CRF on GPGPU reduces memory needs.
- CRF on GPU scales best.
[Figure: memory (axis 0 to 1600) at 67k, 202k, 337k, 472k and 606k for NB memory, CRF memory and GPU CRF memory]
Felix Jungermann, Nico Piatkowski, Katharina Morik (2010) Enhancing Ubiquitous Systems Through System Call Mining, LACIS at ICDM
http://www.computer.org/csdl/proceedings/icdmw/2010/4257/00/4257b338-abs.html
Graphical models on resource-restricted processors
- Floating-point (real) arithmetic costs more clock cycles than integer arithmetic.
- “The most obvious technique to conserve power is to reduce the number of cycles it takes to complete a workload.” (Intel 64, IA-32 architectures optimization reference manual, guidelines for extending battery life).
- Restrict the parameter space of the MRF: $\theta \in \{0, 1, \ldots, K\} \subset \mathbb{N}$
Clock cycles for arithmetic on different processors, real vs. integer:
             Sandy Bridge       ARM 11
             Real    Int        Real    Int
+            3       1          8       1
*            5       3          8       4-5
/            14      13-15      19      -
Bit shift    -       3          -       2
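Parameter space transformation
- The graphical model is tree-structured.
- Transform the parameter space: $\eta_i(\theta) = \theta_i \ln 2$
MRF:
$$p(\vec{x}) = \frac{1}{Z(\theta)} \exp\Big(\sum_i \theta_i \phi_i(\vec{x})\Big) = \exp\big[\langle \theta, \phi(\vec{x}) \rangle - A(\theta)\big]$$
Integer MRF:
$$p(\vec{x}) = \exp\big[\langle \eta(\theta), \phi(\vec{x}) \rangle - A(\eta(\theta))\big] = 2^{\langle \theta, \phi(\vec{x}) \rangle - A(\eta(\theta)) / \ln 2} = \frac{2^{\langle \theta, \phi(\vec{x}) \rangle}}{\sum_{\vec{y} \in \mathcal{X}} 2^{\langle \theta, \phi(\vec{y}) \rangle}}$$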
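With integer parameters, the last form of the Integer MRF needs no exponential function at all: every unnormalized weight is a power of two and can be produced by a bit shift. The sketch below illustrates this on a hypothetical model with two binary nodes and one edge; the parameter values and all names are assumptions made for the example, and `Fraction` is used only to display the exact ratio.

```python
import itertools
import math
from fractions import Fraction


def phi(state):
    """Indicator features of two binary nodes joined by one edge (d = 8)."""
    a, b = state
    node_a = [int(a == s) for s in (0, 1)]
    node_b = [int(b == s) for s in (0, 1)]
    edge = [int((a, b) == pair) for pair in itertools.product((0, 1), repeat=2)]
    return node_a + node_b + edge


def integer_mrf_probability(x, theta, states):
    """p(x) = 2^<theta,phi(x)> / sum_y 2^<theta,phi(y)> with integer arithmetic."""
    def weight(state):
        exponent = sum(t * f for t, f in zip(theta, phi(state)))
        return 1 << exponent  # 2 ** exponent as a bit shift

    return Fraction(weight(x), sum(weight(y) for y in states))


def real_mrf_probability(x, eta, states):
    """Reference: the usual exponential-family form with real parameters eta."""
    def weight(state):
        return math.exp(sum(e * f for e, f in zip(eta, phi(state))))

    return weight(x) / sum(weight(y) for y in states)


states = list(itertools.product((0, 1), repeat=2))
theta = [0, 2, 1, 0, 3, 0, 0, 1]            # integers in {0, ..., K}, here K = 3
eta = [t * math.log(2) for t in theta]       # eta_i(theta) = theta_i * ln 2
p_int = integer_mrf_probability((0, 0), theta, states)
p_real = real_mrf_probability((0, 0), eta, states)
```

Both computations describe the same distribution; the integer version simply reaches it with shifts and additions.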
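Integer belief propagation
- Simply replacing exp(·) by 2^(·) is not sufficient.
- Overflows are normally avoided by normalization.
- Normalization is not possible with integer division.
- The magnitude of a message corresponds to a probability.
- Use the length of each message.
- Bit-length is similar to log.
Belief propagation message:
$$m_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} \exp\big(\theta_{vu=xy} + \theta_{v=x}\big) \prod_{w \in N_v \setminus \{u\}} m_{w \to v}(x)$$
Base-2 message:
$$\tilde{m}_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} 2^{\theta_{vu=xy} + \theta_{v=x}} \prod_{w \in N_v \setminus \{u\}} \tilde{m}_{w \to v}(x)$$
Bit-length message:
$$\beta_{v \to u}(y) = \max_{x \in \mathcal{X}_v} \Big[\theta_{vu=xy} + \theta_{v=x} + \sum_{w \in N_v \setminus \{u\}} \beta_{w \to v}(x)\Big]$$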
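The relation between the base-2 messages and the bit-length messages can be checked numerically. The sketch below passes both kinds of message from left to right along a short chain; the chain, the integer parameters and the function names are assumptions for illustration. Python integers are unbounded, so the exact base-2 messages can be computed here for comparison, whereas fixed-width integers on a small device would overflow.

```python
import math


def exact_messages(theta_node, theta_edge):
    """Base-2 integer messages along a chain; their magnitude grows with every hop."""
    messages, incoming = [], None
    for v, edge in enumerate(theta_edge):
        k_v, k_u = len(theta_node[v]), len(theta_node[v + 1])
        incoming = [
            sum(
                (1 << (edge[x][y] + theta_node[v][x])) * (incoming[x] if incoming else 1)
                for x in range(k_v)
            )
            for y in range(k_u)
        ]
        messages.append(incoming)
    return messages


def bit_length_messages(theta_node, theta_edge):
    """beta messages: the sum over x becomes a max, the product becomes a sum."""
    messages, incoming = [], None
    for v, edge in enumerate(theta_edge):
        k_v, k_u = len(theta_node[v]), len(theta_node[v + 1])
        incoming = [
            max(
                edge[x][y] + theta_node[v][x] + (incoming[x] if incoming else 0)
                for x in range(k_v)
            )
            for y in range(k_u)
        ]
        messages.append(incoming)
    return messages


# Hypothetical chain with 3, 2 and 3 states; theta_edge[v][x][y] belongs to edge (v, v+1).
theta_node = [[0, 2, 1], [1, 0], [3, 1, 0]]
theta_edge = [
    [[1, 0], [0, 2], [3, 1]],
    [[0, 1, 2], [2, 0, 1]],
]
exact = exact_messages(theta_node, theta_edge)
beta = bit_length_messages(theta_node, theta_edge)
max_states = max(len(t) for t in theta_node)
```

On a chain, each beta value is a lower bound on the base-2 logarithm of the exact message, and the gap grows by at most log2 of the number of states per hop, which is the sense in which the bit length behaves like a logarithm.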
Evaluation: accuracy
- Different vertex degrees
- Up to 8000 nodes in the graph
- 100 runs of IntMRF
- The real-valued MRF is taken as 100% accuracy.
⇒ Scalable!
[Figure: accuracy (axis 0.85 to 0.95) over 1000 to 8000 nodes for Int, deg=4; Int, deg=8; Int, deg=16]
Integer models use fewer clock cycles and less memory
- Integer MRF shows
  - high accuracy: 86 – 92% of the full MRF for up to 8 states and 8000 nodes
  - high speed-up: 2.56 on Raspberry Pi and 2.34 on Sandy Bridge
- Graphical models can now be computed on small devices.
- Using fewer clock cycles really saves energy.
Nico Piatkowski, Sangkyun Lee, Katharina Morik (2014) The Integer Approximation of Undirected Graphical Models, ICPRAM
http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/piatkowski_etal_2014a.pdf
Spatio-temporal random fields
- The spatio-temporal model is trained to predict each node's maximum a posteriori state together with the marginal probabilities.
- Generative model predicting all nodes.
- Dimension:
$$d = T \, |V_0| \, |\mathcal{X}| + \big[(T-1)(|V_0| + 3|E_0|) + |E_0|\big] \, |\mathcal{X}|^2$$
- Remember: vectors are sparse, and we have to exploit that!
User queries:
- Given the traffic densities at all nodes at t1, t2, t3, what is the probability of a traffic density at node A at time t5?
- Given state “jam” at place A at time ts, which other places have a higher probability for “jam” in ts < t < te?
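The dimension formula is easy to evaluate for a concrete graph. The helper below is a small sketch with hypothetical names; the example sizes (10 vertices, 12 edges) are made up and are not the Dublin figures. One way to read the edge term, offered here as an interpretation, is T·|E0| spatial edges inside the layers plus (T-1)·(|V0| + 2|E0|) edges between consecutive layers, which sums to the bracket in the formula.

```python
def strf_dimension(n_layers, n_vertices, n_edges, n_states):
    """d = T|V0||X| + [(T-1)(|V0| + 3|E0|) + |E0|] |X|^2."""
    node_parameters = n_layers * n_vertices * n_states
    edge_count = (n_layers - 1) * (n_vertices + 3 * n_edges) + n_edges
    return node_parameters + edge_count * n_states ** 2


def edge_count_by_parts(n_layers, n_vertices, n_edges):
    """Same edge count, split into within-layer and between-layer edges."""
    within_layers = n_layers * n_edges
    between_layers = (n_layers - 1) * (n_vertices + 2 * n_edges)
    return within_layers + between_layers


# Made-up example: 48 time layers, 10 vertices, 12 edges, 6 states per node.
example_dimension = strf_dimension(48, 10, 12, 6)
```

Even this toy graph yields tens of thousands of parameters, which illustrates why the sparse, compressible parametrization matters.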
Training efficiently
- Reparametrization
- Sparseness and smoothness
  - There are not many changes over time
  - Small ratio of non-zero entries
- Regularization
  - L1 and L2 norms now penalize the difference between θ(t-1) and θ(t)
- STRF uses 8 times less memory than the MRF model.
- Its distributed optimization converges about 2 times faster.
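The idea of penalizing changes between consecutive time layers can be written down directly. The function below is a simplified sketch and not the reparametrization of the cited paper: it assumes one parameter vector per time layer, uses the squared L2 norm for the smooth part, and the weights `lambda1` and `lambda2` are hypothetical names.

```python
def temporal_penalty(theta, lambda1, lambda2):
    """L1 and squared-L2 penalty on theta(t) - theta(t-1), summed over all layers.

    theta is a list of T parameter vectors, one per time layer.
    """
    l1 = 0.0
    l2 = 0.0
    for previous, current in zip(theta, theta[1:]):
        differences = [c - p for p, c in zip(previous, current)]
        l1 += sum(abs(d) for d in differences)
        l2 += sum(d * d for d in differences)
    return lambda1 * l1 + lambda2 * l2


# Three layers with two parameters each; only few entries change over time.
layers = [[0.0, 1.0], [0.0, 3.0], [1.0, 3.0]]
penalty = temporal_penalty(layers, lambda1=0.5, lambda2=0.1)
```

The L1 part drives most differences to exactly zero (sparseness), while the L2 part keeps the remaining changes small (smoothness).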
Compressible parametrization
- STRF is 8 times smaller than the MRF model.
- The ratio of non-zero entries is quite small already for quite small L1 regularization.
- STRF's distributed optimization converges faster than MRF:
  - Run MRF for 100 iterations, record the likelihood value;
  - Run STRF until it reaches the same result of the objective function.
  - Compare seconds.
Evaluation: accuracy
- Smaller state spaces result in higher accuracies.
- Again, the MRF is taken as 100% accuracy.
⇒ State space is important!
[Figure: accuracy (axis 0.4 to 1) over 0 to 8000 for Int, |X|=2; Int, |X|=4; Int, |X|=8; Int, |X|=16]
Flood in the City
- The city of Dublin suffers from floods due to rainfall and tide. Wanted:
  - Optimal traffic regulations based on traffic predictions.
- INSIGHT EU project, www.insight-ict.eu
  Intelligent Synthesis and Real-time Response using Massive Streaming of Heterogeneous Data
  Coordinator: Dimitrios Gunopulos, Univ. Athens, Greece
Smart trip modeling for Dublin
- OpenStreetMap → graph topology
- Open Trip Planner: user query (v, w), route planning based on traffic costs.
- Traffic costs learned:
  - Spatio-temporal random field based on the sensor data stream;
  - Gaussian process estimates values for non-sensor locations.
- Framework for real-time processing of data streams, XML configuration of the data flow, connecting data, traffic model and planner.
[Figure: routes from v to w at 7:00 and at 8:00]
Constructing the spatio-temporal graph of Dublin 
OpenStreetMap streets segmented according to junctions. 
966 sensors transmit traffic flow every 6 minutes (www.dublinked.ie). 
Aggregate sensor readings for 30 minutes, aggregate sensor nodes by 7NN. 
Traffic flow discretized into 6 intervals of density. 
48 time layers for each day (48 × 30 = 1440 minutes make a day).
Training for every weekday. 
Predicting density of each node and edge – interpreted as costs.
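The preprocessing steps above (30-minute layers, 6 density intervals) can be sketched as follows. Several details are assumptions, since the slides do not state them: readings within a layer are averaged, the interval bounds are borrowed from the evaluation table (0, 1-5, 6-20, 21-30, 31-60, 61-), and all names are hypothetical. The 7NN aggregation of sensor nodes is left out.

```python
from bisect import bisect_right
from collections import defaultdict

LAYER_MINUTES = 30
# Lower bounds of the six intervals 0, 1-5, 6-20, 21-30, 31-60, 61-.
BIN_LOWER_BOUNDS = [0, 1, 6, 21, 31, 61]


def time_layer(minute_of_day):
    """Index of the 30-minute layer (0 to 47) that a minute of the day falls into."""
    return minute_of_day // LAYER_MINUTES


def discretize(flow):
    """Map a non-negative vehicle count to one of the six interval indices."""
    return bisect_right(BIN_LOWER_BOUNDS, flow) - 1


def aggregate(readings):
    """Average the readings per (sensor, layer), then discretize the mean.

    readings is an iterable of (sensor_id, minute_of_day, flow) tuples.
    """
    sums = defaultdict(lambda: [0, 0])
    for sensor, minute, flow in readings:
        cell = sums[(sensor, time_layer(minute))]
        cell[0] += flow
        cell[1] += 1
    return {key: discretize(round(total / count))
            for key, (total, count) in sums.items()}


# Two readings in the first half hour, one in the second.
layers = aggregate([("s1", 0, 4), ("s1", 6, 8), ("s1", 30, 70)])
```

Each (sensor, layer) pair then carries one of six discrete states, which is the input format a spatio-temporal random field with |X| = 6 expects.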
Using STRF for smart trip modeling -- Evaluation
- Confusion matrix of predicting the number of vehicles (6 intervals) for all sensors and all half hours following 1 p.m. on Fridays, given the traffic at 1 p.m.; tested on March 1, 8, 15, 22 and 29 (diagonal entries, bold on the slide, are the correct predictions).
True \ Predicted   0       1-5     6-20    21-30   31-60   61-     Prec
0                  840     32      10      6       3       0       0.943
1-5                2       632     498     3       0       1       0.556
6-20               91      156     12169   2006    83      25      0.838
21-30              32      0       1223    5637    717     14      0.739
31-60              43      0       60      893     1945    29      0.655
61-                0       0       16      3       12      35      0.530
Recall             0.833   0.771   0.871   0.659   0.705   0.34
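The two summary lines of the table can be recomputed from the counts. The sketch below divides each diagonal entry by its row sum and by its column sum; the row-wise ratios reproduce the column labelled "Prec" and the column-wise ratios reproduce the line labelled "Recall", following the labels used on the slide. The function name is hypothetical.

```python
# Counts from the confusion matrix above, one list per table row.
MATRIX = [
    [840, 32, 10, 6, 3, 0],
    [2, 632, 498, 3, 0, 1],
    [91, 156, 12169, 2006, 83, 25],
    [32, 0, 1223, 5637, 717, 14],
    [43, 0, 60, 893, 1945, 29],
    [0, 0, 16, 3, 12, 35],
]


def diagonal_ratios(matrix):
    """Share of the diagonal entry in each row and in each column."""
    size = len(matrix)
    by_row = [matrix[i][i] / sum(matrix[i]) for i in range(size)]
    by_column = [matrix[i][i] / sum(row[i] for row in matrix) for i in range(size)]
    return by_row, by_column


by_row, by_column = diagonal_ratios(MATRIX)
```

The weakest class is the rarest one (61 or more vehicles), which has only 35 correct predictions.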
Confusion Matrices for Cluster-Based Discretization (all t)
Friday     0-7      8-12     13-21    22-      Precision
0-7        139518   38551    45350    50501    0.5093
8-12       37419    51284    37153    27112    0.3396
13-21      23535    25863    115081   66171    0.4989
22-        22387    18034    58229    205822   0.6760
Recall     0.6260   0.3881   0.4499   0.5887
Saturday   0-7      8-12     13-21    22-      Precision
0-7        167073   57448    34370    16590    0.6065
8-12       46944    102703   68194    26181    0.4209
13-21      31626    45606    140866   51136    0.5232
22-        19365    25737    46949    89301    0.4937
Recall     0.6304   0.4437   0.4859   0.4874
Smart traffic for smart cities
- Spatio-temporal modeling is natural for traffic prediction.
- Several questions can be answered using the same learned model.
- The answers come along with their probabilities. This might be helpful for decision makers.
- Integration of spatio-temporal random fields into the Open Trip Planner and Gaussian information completion resulted in an excellent navigation system.
Piatkowski, Lee, Morik (2013) Spatio-temporal random fields: compressible representation and distributed estimation, Machine Learning Journal 93:1, 115 – 140.
Liebig, Piatkowski, Bockermann, Morik (2014) Predictive Trip Planning – Smart Routing in Smart Cities, Mining Urban Data Workshop at 17th Intern. Conf. on Extending Database Technology.
Streams framework
- Open-source Java environment for the abstract orchestration of streaming functions by Christian Bockermann, GNU AGPL license.
- High-level API for sources and processing nodes, executable on the Storm engine and topology.
- Graph of nodes can be used for distributed processing.
- Easy integration of the MOA library.
- Eases integration of various engines.
Christian Bockermann, Hendrik Blom (2012) The Streams Framework, Tech. Rep. SFB 876
http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/bockermann_blom_2012c.pdf
Stream Processing Engines
- Fault tolerance by resending data that was not acknowledged (streams) vs. sophisticated recovery methods (MillWheel).
- State management by storing state in the process context (streams) vs. links to distributed storage systems (Samza, MillWheel on Apache Cassandra, BigTable).
- Explicit partitioning by flow graph vs. building upon a distributed computing environment managed by a central node.
Comparison of stream processing engines
Christian Bockermann (2014) A Survey of the Stream Processing Landscape, Tech. Rep. SFB 876
http://sfb876.tu-dortmund.de/PublicPublicationFiles/bockermann_2014b.pdf
Streams framework
- Streams to map reduce for an easy execution in Hadoop, reading HDFS files.
Conclusion
- Complex models such as MRF can be computed on small devices such as smartphones.
- Graphical Models
- Parallel BP using GPU
- Approximation to Integer Random Fields
- Spatio-Temporal Random Fields
- Using the streams framework for integrating processes (e.g. MOA for learning) and real-time processing and documentation.
References
- Nico Piatkowski (2011) Parallel Algorithms for GPU Accelerated Probabilistic Inference
  http://biglearn.org/2011/files/papers/biglearn2011_submission_28.pdf
- Felix Jungermann, Nico Piatkowski, Katharina Morik (2010) Enhancing Ubiquitous Systems Through System Call Mining, LACIS at ICDM
  http://www.computer.org/csdl/proceedings/icdmw/2010/4257/00/4257b338-abs.html
- Nico Piatkowski, Sangkyun Lee, Katharina Morik (2014) The Integer Approximation of Undirected Graphical Models, ICPRAM
  http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/piatkowski_etal_2014a.pdf
- Piatkowski, Lee, Morik (2013) Spatio-temporal random fields: compressible representation and distributed estimation, Machine Learning Journal 93:1, 115 – 140.
- Liebig, Piatkowski, Bockermann, Morik (2014) Predictive Trip Planning – Smart Routing in Smart Cities, Mining Urban Data Workshop at 17th Intern. Conf. on Extending Database Technology.
- Christian Bockermann, Hendrik Blom (2012) The Streams Framework, Tech. Rep. SFB 876
  http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/bockermann_blom_2012c.pdf
- Christian Bockermann (2014) A Survey of the Stream Processing Landscape, Tech. Rep. SFB 876
  http://sfb876.tu-dortmund.de/PublicPublicationFiles/bockermann_2014b.pdf
- For streams software (not just scripts!) check:
  http://www-ai.cs.uni-dortmund.de/SOFTWARE/streams/

More Related Content

Viewers also liked

Exact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping YeExact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping Ye
BigMine
 
Are we with-it? - Lucia Schoombee
Are we with-it? - Lucia SchoombeeAre we with-it? - Lucia Schoombee
Are we with-it? - Lucia SchoombeeHELIGLIASA
 
Personalising Customer Experience in the Hospitality Industry June 2016
Personalising Customer Experience in the Hospitality Industry June 2016Personalising Customer Experience in the Hospitality Industry June 2016
Personalising Customer Experience in the Hospitality Industry June 2016
Carly Watling
 
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
stucon
 
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
Capgemini
 
Social Media Strategies for Powerful Communications
Social Media Strategies for Powerful CommunicationsSocial Media Strategies for Powerful Communications
Social Media Strategies for Powerful Communications
courtneymbarnes
 
Semantic web-and-public-data - en
Semantic web-and-public-data - enSemantic web-and-public-data - en
Semantic web-and-public-data - en
Tenforce
 
Valeriia Mozharova and Natalia Loukachevitch - Combining Knowledge and CRF-b...
Valeriia Mozharova and  Natalia Loukachevitch - Combining Knowledge and CRF-b...Valeriia Mozharova and  Natalia Loukachevitch - Combining Knowledge and CRF-b...
Valeriia Mozharova and Natalia Loukachevitch - Combining Knowledge and CRF-b...
AIST
 
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가atelier t*h
 
Skolkovo
SkolkovoSkolkovo
Skolkovo
Kaysheva
 
Trabalho 1
Trabalho 1Trabalho 1
Trabalho 1
EB2 Mira
 
15 Things to Give Up to be Happy
15 Things to Give Up to be Happy 15 Things to Give Up to be Happy
15 Things to Give Up to be Happy
OH TEIK BIN
 
Configuracion de IP windows XP
Configuracion de IP windows XPConfiguracion de IP windows XP
Configuracion de IP windows XP
Roten Skate
 
Cancer in dogs
Cancer in dogsCancer in dogs
Cancer in dogs
MatThomson
 

Viewers also liked (15)

Exact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping YeExact Data Reduction for Big Data by Jieping Ye
Exact Data Reduction for Big Data by Jieping Ye
 
Are we with-it? - Lucia Schoombee
Are we with-it? - Lucia SchoombeeAre we with-it? - Lucia Schoombee
Are we with-it? - Lucia Schoombee
 
Personalising Customer Experience in the Hospitality Industry June 2016
Personalising Customer Experience in the Hospitality Industry June 2016Personalising Customer Experience in the Hospitality Industry June 2016
Personalising Customer Experience in the Hospitality Industry June 2016
 
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
和菓子復興大作戦〜萌えキャラで和菓子ブームを〜
 
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
NCC achieves transparency via IBX Spend Analytics enabling a complete procure...
 
Social Media Strategies for Powerful Communications
Social Media Strategies for Powerful CommunicationsSocial Media Strategies for Powerful Communications
Social Media Strategies for Powerful Communications
 
Semantic web-and-public-data - en
Semantic web-and-public-data - enSemantic web-and-public-data - en
Semantic web-and-public-data - en
 
Valeriia Mozharova and Natalia Loukachevitch - Combining Knowledge and CRF-b...
Valeriia Mozharova and  Natalia Loukachevitch - Combining Knowledge and CRF-b...Valeriia Mozharova and  Natalia Loukachevitch - Combining Knowledge and CRF-b...
Valeriia Mozharova and Natalia Loukachevitch - Combining Knowledge and CRF-b...
 
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가
윈도 Xp 종료, 오픈소스 소프트웨어에 기회가 될 것인가
 
Skolkovo
SkolkovoSkolkovo
Skolkovo
 
Trabalho 1
Trabalho 1Trabalho 1
Trabalho 1
 
14 de Dezembro 2009
14 de Dezembro 200914 de Dezembro 2009
14 de Dezembro 2009
 
15 Things to Give Up to be Happy
15 Things to Give Up to be Happy 15 Things to Give Up to be Happy
15 Things to Give Up to be Happy
 
Configuracion de IP windows XP
Configuracion de IP windows XPConfiguracion de IP windows XP
Configuracion de IP windows XP
 
Cancer in dogs
Cancer in dogsCancer in dogs
Cancer in dogs
 

Similar to Big Data and Small Devices by Katharina Morik

Model-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsModel-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical Constraints
Quoc-Sang Phan
 
Gene's law
Gene's lawGene's law
Gene's law
Hoopeer Hoopeer
 
cis97003
cis97003cis97003
cis97003perfj
 
Applied parallel coordinates for logs and network traffic attack analysis
Applied parallel coordinates for logs and network traffic attack analysisApplied parallel coordinates for logs and network traffic attack analysis
Applied parallel coordinates for logs and network traffic attack analysisUltraUploader
 
Self-Similarity in Complex Networks
Self-Similarity in Complex NetworksSelf-Similarity in Complex Networks
Self-Similarity in Complex Networks
norman_fahrer
 
Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014
ijcsbi
 
Citython presentation
Citython presentationCitython presentation
Citython presentation
Ankit Tewari
 
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGICDESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
VLSICS Design
 
Representing Simplicial Complexes with Mangroves
Representing Simplicial Complexes with MangrovesRepresenting Simplicial Complexes with Mangroves
Representing Simplicial Complexes with Mangroves
David Canino
 
cis97007
cis97007cis97007
cis97007perfj
 
ScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
ScaleGraph - A High-Performance Library for Billion-Scale Graph AnalyticsScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
ScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
Toyotaro Suzumura
 
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
Amazon Web Services Korea
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
Arthur Mensch
 
Embedded system Design introduction _ Karakola
Embedded system Design introduction _ KarakolaEmbedded system Design introduction _ Karakola
Embedded system Design introduction _ Karakola
JohanAspro
 
Real Time Speech Enhancement in the Waveform Domain
Real Time Speech Enhancement in the Waveform DomainReal Time Speech Enhancement in the Waveform Domain
Real Time Speech Enhancement in the Waveform Domain
Willy Marroquin (WillyDevNET)
 
Adaptive blind multiuser detection under impulsive noise using principal comp...
Adaptive blind multiuser detection under impulsive noise using principal comp...Adaptive blind multiuser detection under impulsive noise using principal comp...
Adaptive blind multiuser detection under impulsive noise using principal comp...
csandit
 
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
csandit
 
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
cscpconf
 
Project 2: Baseband Data Communication
Project 2: Baseband Data CommunicationProject 2: Baseband Data Communication
Project 2: Baseband Data Communication
Danish Bangash
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
Oswald Campesato
 

Similar to Big Data and Small Devices by Katharina Morik (20)

Model-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical ConstraintsModel-counting Approaches For Nonlinear Numerical Constraints
Model-counting Approaches For Nonlinear Numerical Constraints
 
Gene's law
Gene's lawGene's law
Gene's law
 
cis97003
cis97003cis97003
cis97003
 
Applied parallel coordinates for logs and network traffic attack analysis
Applied parallel coordinates for logs and network traffic attack analysisApplied parallel coordinates for logs and network traffic attack analysis
Applied parallel coordinates for logs and network traffic attack analysis
 
Self-Similarity in Complex Networks
Self-Similarity in Complex NetworksSelf-Similarity in Complex Networks
Self-Similarity in Complex Networks
 
Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014Vol 9 No 1 - January 2014
Vol 9 No 1 - January 2014
 
Citython presentation
Citython presentationCitython presentation
Citython presentation
 
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGICDESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
DESIGN OF QUATERNARY LOGICAL CIRCUIT USING VOLTAGE AND CURRENT MODE LOGIC
 
Representing Simplicial Complexes with Mangroves
Representing Simplicial Complexes with MangrovesRepresenting Simplicial Complexes with Mangroves
Representing Simplicial Complexes with Mangroves
 
cis97007
cis97007cis97007
cis97007
 
ScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
ScaleGraph - A High-Performance Library for Billion-Scale Graph AnalyticsScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
ScaleGraph - A High-Performance Library for Billion-Scale Graph Analytics
 
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
AWS 클라우드를 통한 쓰나미 연구 사례: 日츄오대 - AWS Summit Seoul 2017
 
Dictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix FactorizationDictionary Learning for Massive Matrix Factorization
Dictionary Learning for Massive Matrix Factorization
 
Embedded system Design introduction _ Karakola
Embedded system Design introduction _ KarakolaEmbedded system Design introduction _ Karakola
Embedded system Design introduction _ Karakola
 
Real Time Speech Enhancement in the Waveform Domain
Real Time Speech Enhancement in the Waveform DomainReal Time Speech Enhancement in the Waveform Domain
Real Time Speech Enhancement in the Waveform Domain
 
Adaptive blind multiuser detection under impulsive noise using principal comp...
Adaptive blind multiuser detection under impulsive noise using principal comp...Adaptive blind multiuser detection under impulsive noise using principal comp...
Adaptive blind multiuser detection under impulsive noise using principal comp...
 
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
 
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
ADAPTIVE BLIND MULTIUSER DETECTION UNDER IMPULSIVE NOISE USING PRINCIPAL COMP...
 
Project 2: Baseband Data Communication
Project 2: Baseband Data CommunicationProject 2: Baseband Data Communication
Project 2: Baseband Data Communication
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
 

More from BigMine

Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...
Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...
Inside the Atoms: Mining a Network of Networks and Beyond by HangHang Tong at...
BigMine
 
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...
From Practice to Theory in Learning from Massive Data by Charles Elkan at Big...
BigMine
 
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16
Foundations for Scaling ML in Apache Spark by Joseph Bradley at BigMine16
BigMine
 
Processing Reachability Queries with Realistic Constraints on Massive Network...
Processing Reachability Queries with Realistic Constraints on Massive Network...Processing Reachability Queries with Realistic Constraints on Massive Network...
Processing Reachability Queries with Realistic Constraints on Massive Network...
BigMine
 
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
Challenging Problems for Scalable Mining of Heterogeneous Social and Informat...
BigMine
 
Big & Personal: the data and the models behind Netflix recommendations by Xa...
 Big & Personal: the data and the models behind Netflix recommendations by Xa... Big & Personal: the data and the models behind Netflix recommendations by Xa...
Big & Personal: the data and the models behind Netflix recommendations by Xa...
BigMine
 
Large Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos FaloutsosLarge Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
Large Graph Mining – Patterns, tools and cascade analysis by Christos Faloutsos
BigMine
 
Unexpected Challenges in Large Scale Machine Learning by Charles Parker
 Unexpected Challenges in Large Scale Machine Learning by Charles Parker Unexpected Challenges in Large Scale Machine Learning by Charles Parker
Unexpected Challenges in Large Scale Machine Learning by Charles Parker
BigMine
 
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
Big Data Analytics: Applications and Opportunities in On-line Predictive Mode...
BigMine
 

Big Data and Small Devices by Katharina Morik

  • 1. Big Data and Small Devices. Katharina Morik, Artificial Intelligence, TU Dortmund University, Germany
  • 2. Overview: big data and small devices; graphical models; computation on the batch layer; computation on the speed layer; integer random fields; spatio-temporal random fields; traffic scenarios; using the streams framework
  • 3. Big data, small devices
      ◦ Machine learning for the enhancement of small devices
      ◦ Machine learning on small devices
      ◦ Reduce computational complexity, memory consumption, communication, and energy
  • 4. Overview: big data and small devices; graphical models; computation on the batch layer; computation on the speed layer; integer random fields; spatio-temporal random fields; traffic scenarios; using the streams framework
  • 5. Graphical Models
      ◦ Graph $G = (V, E)$
      ◦ Multivariate random variable $X$ with finite discrete domain $\mathcal{X}$
      ◦ Mapping of a joint vertex assignment into a vector space: $\phi : \mathcal{X} \to \mathbb{R}^d$
      ◦ Parameter to be learned: $\theta \in \mathbb{R}^d$
      ◦ CRF, using $\frac{1}{a} = \exp(-\ln a)$:
        $p(y \mid x) = \frac{1}{Z(\theta, x)} \exp\left(\sum_i \theta_i \phi_i(y, x)\right) = \exp\left(\sum_i \theta_i \phi_i(y, x) - \ln Z(\theta, x)\right) = \exp\left[\langle \theta, \phi(y, x) \rangle - A(\theta)\right]$
      ◦ MRF:
        $p(x) = \frac{1}{Z(\theta)} \exp\left(\sum_i \theta_i \phi_i(x)\right) = \exp\left[\langle \theta, \phi(x) \rangle - A(\theta)\right]$
  • 6. Note: vectors are sparse
      ◦ Graph: Lights {red, yellow, green}, Rain {dry, wet}, Jam {jam, free}
      ◦ Components of φ(x):
          Nodes: 1. red, 2. yellow, 3. green (dom(lights)); 4. wet, 5. dry (dom(rain)); 6. jam, 7. free (dom(jam))
          Edge lights-rain: 1. red, dry; 2. yellow, dry; 3. green, dry; 4. red, wet; 5. yellow, wet; 6. green, wet
          Edge lights-jam: 7. red, jam; 8. yellow, jam; 9. green, jam; 10. red, free; 11. yellow, free; 12. green, free
          Edge rain-jam: 13. dry, free; 14. dry, jam; 15. wet, free; 16. wet, jam
      ◦ Example: x1 = (red, dry, free)
  • 8. Markov Random Field: Likelihood
  • 9. Markov Random Field: Estimation and Prediction
  • 10. Belief propagation: actually computing the expected probability of parameters given θ by message passing for each direction.
  • 11. Note: vectors are sparse
      ◦ Graph: Lights {red, yellow, green}, Rain {dry, wet}, Jam {jam, free}
      ◦ Components of φ(x):
          Nodes: 1. red, 2. yellow, 3. green (dom(lights)); 4. wet, 5. dry (dom(rain)); 6. jam, 7. free (dom(jam))
          Edge lights-rain: 1. red, dry; 2. yellow, dry; 3. green, dry; 4. red, wet; 5. yellow, wet; 6. green, wet
          Edge lights-jam: 7. red, jam; 8. yellow, jam; 9. green, jam; 10. red, free; 11. yellow, free; 12. green, free
          Edge rain-jam: 13. dry, free; 14. dry, jam; 15. wet, free; 16. wet, jam
      ◦ Example: x1 = (red, dry, free)
      ◦ φ(x1) = (1,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,1,0,0,1,0,0,0)
      ◦ d = 23
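The encoding on this slide can be reproduced with a few lines of Python. This is only a minimal sketch: the names `NODE_FEATURES`, `EDGE_FEATURES`, `phi`, and `phi_sparse` are illustrative and do not come from the slides. The feature order is copied from the listing, so components 1 to 7 are node states and components 8 to 23 are edge state pairs.

```python
# Feature order copied from the slide: 7 node features, then 16 edge features.
NODE_FEATURES = [
    ("lights", "red"), ("lights", "yellow"), ("lights", "green"),
    ("rain", "wet"), ("rain", "dry"),
    ("jam", "jam"), ("jam", "free"),
]
EDGE_FEATURES = (
    # edge lights-rain
    [(("lights", "rain"), (l, r)) for r in ("dry", "wet")
     for l in ("red", "yellow", "green")]
    # edge lights-jam
    + [(("lights", "jam"), (l, j)) for j in ("jam", "free")
       for l in ("red", "yellow", "green")]
    # edge rain-jam
    + [(("rain", "jam"), (r, j)) for r in ("dry", "wet")
       for j in ("free", "jam")]
)


def phi(x):
    """Dense 0/1 feature vector for a joint assignment x (dict: node -> state)."""
    vec = [int(x[node] == state) for node, state in NODE_FEATURES]
    vec += [int((x[a], x[b]) == pair) for (a, b), pair in EDGE_FEATURES]
    return vec


def phi_sparse(x):
    """Sparse form: the 1-based positions of the non-zero components."""
    return [i for i, value in enumerate(phi(x), start=1) if value]


x1 = {"lights": "red", "rain": "dry", "jam": "free"}
print(phi(x1))
print(phi_sparse(x1))
```

Only |V| + |E| = 6 of the 23 components are non-zero for any assignment, which is why storing the index list is cheaper than storing the dense vector.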
  • 12. Belief propagation example
      ◦ Graph: Lights {red, yellow, green}, Rain {dry, wet}, Jam {jam, free}
      ◦ $m_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} \exp\left(\theta_{vu=xy} + \theta_{v=x}\right) \prod_{w \in N_v \setminus \{u\}} m_{w \to v}(x)$
      ◦ Compute this for each node u, for each of its neighbors, and for each of the states.
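The message recursion can be checked on a toy model. The sketch below is an illustration, not the implementation from the talk: it uses a hypothetical chain w - v - u rather than the three-node graph of the slide, because the recursion is exact only on trees, and all parameter values are made up. It compares the marginals obtained from messages with brute-force enumeration.

```python
import math
from itertools import product

# Hypothetical tree-structured model: chain w - v - u with made-up parameters.
states = {"w": [0, 1], "v": [0, 1, 2], "u": [0, 1]}
neighbors = {"w": ["v"], "v": ["w", "u"], "u": ["v"]}
theta_node = {"w": [0.2, -0.1], "v": [0.0, 0.5, -0.3], "u": [0.1, 0.4]}
theta_edge = {
    ("w", "v"): {(a, b): 0.3 * (a == b) - 0.2 * b
                 for a in states["w"] for b in states["v"]},
    ("v", "u"): {(a, b): 0.1 * a - 0.4 * (a == b)
                 for a in states["v"] for b in states["u"]},
}


def edge_param(v, x, u, y):
    """theta_{vu=xy}, looked up independently of the stored edge direction."""
    if (v, u) in theta_edge:
        return theta_edge[(v, u)][(x, y)]
    return theta_edge[(u, v)][(y, x)]


def message(v, u, y):
    """m_{v->u}(y): sum over x of exp(theta_vu=xy + theta_v=x) times incoming messages."""
    total = 0.0
    for x in states[v]:
        incoming = math.prod(message(w, v, x) for w in neighbors[v] if w != u)
        total += math.exp(edge_param(v, x, u, y) + theta_node[v][x]) * incoming
    return total


def marginal(u):
    """Node marginal from the messages arriving at u."""
    scores = [math.exp(theta_node[u][y])
              * math.prod(message(v, u, y) for v in neighbors[u])
              for y in states[u]]
    z = sum(scores)
    return [s / z for s in scores]


def brute_force_marginal(u):
    """Reference value: enumerate every joint assignment."""
    nodes = list(states)
    scores = [0.0] * len(states[u])
    for assignment in product(*(states[n] for n in nodes)):
        x = dict(zip(nodes, assignment))
        energy = sum(theta_node[n][x[n]] for n in nodes)
        energy += sum(t[(x[a], x[b])] for (a, b), t in theta_edge.items())
        scores[x[u]] += math.exp(energy)
    z = sum(scores)
    return [s / z for s in scores]


for node in states:
    print(node, marginal(node), brute_force_marginal(node))
```

The recursive calls recompute messages for clarity; a practical implementation would cache each message once per direction.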
  • 13. Parallel Belief Propagation (GPU)
      ◦ Data parallel: all node and neighbor pairs become blocks with b threads. A thread computes the message v → u.
      ◦ Thread-cooperative: all node and instance pairs become blocks. A thread computes partial outgoing messages. These are cooperatively processed by all threads in a block. Partial messages are merged afterwards.
      ◦ Nico Piatkowski (2011) Parallel Algorithms for GPU Accelerated Probabilistic Inference, http://biglearn.org/2011/files/papers/biglearn2011_submission_28.pdf
  • 14. File Prefetching for Smartphone Energy Saving
      ◦ Accessing files that are stored in the cache consumes less energy than accessing those on the hard disk.
      ◦ Prefetching files that will soon be accessed allows requests to be grouped into one batched request. This saves energy.
      ◦ Hence, predicting file access saves energy.
      ◦ Given a system call:
          1,open,1812,179,178,201,200,firefox,/etc/hosts,524288,438,7
        Predict: hosts full. Prefetch: hosts.
          2,read,1812,179,178,201,200,firefox,/etc/hosts,4096,361
          3,read,1812,179,178,201,200,firefox,/etc/hosts,4096,0
          4,close,1812,179,178,201,200,firefox,/etc/hosts
      ◦ Record format: timestamp, syscall, thread-id, process-id, parent, user, group, exec, file, parameters (optional)
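To make the record format concrete, here is a small parsing sketch. The `SysCall` type and `parse_syscall` function are hypothetical helpers, not part of the cited work, and the field `exec` is named `exe` to avoid shadowing the Python builtin. The sketch assumes that file paths contain no commas.

```python
from typing import NamedTuple, Tuple


class SysCall(NamedTuple):
    timestamp: int
    syscall: str
    thread_id: int
    process_id: int
    parent: int
    user: int
    group: int
    exe: str
    file: str
    parameters: Tuple[str, ...]  # optional trailing fields, kept as strings


def parse_syscall(line: str) -> SysCall:
    """Split one comma-separated record into its nine fixed fields plus parameters."""
    fields = line.strip().split(",")
    if len(fields) < 9:
        raise ValueError(f"expected at least 9 fields, got {len(fields)}")
    ts, call, tid, pid, parent, user, group, exe, path = fields[:9]
    return SysCall(int(ts), call, int(tid), int(pid), int(parent),
                   int(user), int(group), exe, path, tuple(fields[9:]))


record = parse_syscall("1,open,1812,179,178,201,200,firefox,/etc/hosts,524288,438,7")
print(record.syscall, record.file, record.parameters)
```

A sequence of such records per process is the kind of input from which a file-access predictor could be trained.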
  • 15. Results: Memory
      ◦ Naive Bayes models use less memory than regular CRF models.
      ◦ Implementing CRF on GPGPU reduces memory needs.
      ◦ CRF on GPU scales best.
      ◦ [Figure: NB memory, CRF memory, and GPU CRF memory; axis ticks from 0 to 1600 and from 67k to 606k]
      ◦ Felix Jungermann, Nico Piatkowski, Katharina Morik (2010) Enhancing Ubiquitous Systems Through System Call Mining, LACIS at ICDM, http://www.computer.org/csdl/proceedings/icdmw/2010/4257/00/4257b338-abs.html
  • 16. Overview: big data and small devices; graphical models; computation on the batch layer; computation on the speed layer; integer random fields; spatio-temporal random fields; traffic scenarios; using the streams framework
  • 17. Graphical models on resource-restricted processors
      ◦ Floating-point (real) arithmetic costs more clock cycles than integer arithmetic.
      ◦ “The most obvious technique to conserve power is to reduce the number of cycles it takes to complete a workload.” (Intel 64, IA-32 architectures optimization reference manual, guidelines for extending battery life)
      ◦ Restrict the parameter space of the MRF: $\theta \in \{0, 1, \dots, K\} \subset \mathbb{N}$
      ◦ Clock cycles for arithmetic on different processors, real vs. integer:
          Operation | Sandy Bridge, Real | Sandy Bridge, Int | ARM 11, Real | ARM 11, Int
          +         | 3                  | 1                 | 8            | 1
          *         | 5                  | 3                 | 8            | 4-5
          /         | 14                 | 13-15             | 19           | -
          Bit shift | -                  | 3                 | -            | 2
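  • 18. Parameter space transformation
      ◦ Graph model is tree-structured
      ◦ Transform the parameter space: $\eta_i(\theta) = \theta_i \ln 2$
      ◦ MRF: $p(x) = \frac{1}{Z(\theta)} \exp\left(\sum_i \theta_i \phi_i(x)\right) = \exp\left[\langle \theta, \phi(x) \rangle - A(\theta)\right]$
      ◦ Integer MRF: $p(x) = \exp\left[\langle \eta(\theta), \phi(x) \rangle - A(\eta(\theta))\right] = 2^{\langle \theta, \phi(x) \rangle} \exp\left[-A(\eta(\theta))\right] = \frac{2^{\langle \theta, \phi(x) \rangle}}{\sum_{y \in \mathcal{X}} 2^{\langle \theta, \phi(y) \rangle}}$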
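The effect of the transformation can be illustrated on a toy model with two binary variables. This is a sketch under assumptions: the model, the parameter vector `THETA`, and all function names are invented for illustration. With integer parameters, every unnormalized weight is a power of two and can be computed with a bit shift; the result agrees with the exponential-family form evaluated at η(θ) = θ ln 2.

```python
import math
from fractions import Fraction
from itertools import product

# Hypothetical toy model: two binary variables joined by one edge, d = 8.
DOMAIN = list(product((0, 1), repeat=2))
THETA = [0, 1, 2, 0, 3, 0, 0, 1]  # integer parameters in {0, ..., K} with K = 3


def phi(y):
    """Indicator features: 4 node-state features, then 4 edge-state features."""
    a, b = y
    node = [int(a == 0), int(a == 1), int(b == 0), int(b == 1)]
    edge = [int((a, b) == s) for s in product((0, 1), repeat=2)]
    return node + edge


def integer_weight(theta, y):
    """2^<theta, phi(y)>, computed with a shift instead of exp()."""
    return 1 << sum(t * f for t, f in zip(theta, phi(y)))


def integer_mrf(theta, x):
    """p(x) as an exact ratio of integers."""
    z = sum(integer_weight(theta, y) for y in DOMAIN)
    return Fraction(integer_weight(theta, x), z)


def exponential_family(theta, x):
    """Same probability via exp[<eta, phi(x)> - A(eta)] with eta = theta * ln 2."""
    eta = [t * math.log(2) for t in theta]

    def score(y):
        return sum(e * f for e, f in zip(eta, phi(y)))

    log_partition = math.log(sum(math.exp(score(y)) for y in DOMAIN))
    return math.exp(score(x) - log_partition)


for x in DOMAIN:
    print(x, integer_mrf(THETA, x), exponential_family(THETA, x))
```

Brute-force enumeration of the domain is used here only because the toy model is tiny; on real models the normalizer is obtained by message passing.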
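  • 19. Integer belief propagation
      ◦ Simply replacing $\exp(\cdot)$ by $2^{(\cdot)}$ is not sufficient.
      ◦ Overflows are normally avoided by normalization.
      ◦ Normalization is not possible with integer division.
      ◦ The magnitude of a message corresponds to a probability.
      ◦ Use the length of each message: the bit-length is similar to the logarithm.
      ◦ Real-valued message: $m_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} \exp\left(\theta_{vu=xy} + \theta_{v=x}\right) \prod_{w \in N_v \setminus \{u\}} m_{w \to v}(x)$
      ◦ Base-2 message: $\tilde{m}_{v \to u}(y) = \sum_{x \in \mathcal{X}_v} 2^{(\theta_{vu=xy} + \theta_{v=x})} \prod_{w \in N_v \setminus \{u\}} \tilde{m}_{w \to v}(x)$
      ◦ Bit-length message: $\beta_{v \to u}(y) = \max_{x \in \mathcal{X}_v} \left(\theta_{vu=xy} + \theta_{v=x} + \sum_{w \in N_v \setminus \{u\}} \beta_{w \to v}(x)\right)$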
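The relation between the base-2 messages and their bit-length counterparts can be explored numerically. The following sketch is an illustration only: it uses a hypothetical chain w - v - u with made-up integer parameters, and relies on Python's arbitrary-precision integers to evaluate the exact base-2 messages that a fixed-width machine integer could not hold. It shows that β never exceeds the base-2 logarithm of the exact message and stays within a small additive gap of it.

```python
# Hypothetical chain w - v - u with made-up integer parameters.
states = {"w": [0, 1], "v": [0, 1, 2], "u": [0, 1]}
neighbors = {"w": ["v"], "v": ["w", "u"], "u": ["v"]}
theta_node = {"w": [1, 0], "v": [0, 2, 1], "u": [0, 3]}
theta_edge = {
    ("w", "v"): {(a, b): 2 * int(a == b)
                 for a in states["w"] for b in states["v"]},
    ("v", "u"): {(a, b): (a + b) % 3
                 for a in states["v"] for b in states["u"]},
}


def edge_param(v, x, u, y):
    """theta_{vu=xy}, looked up independently of the stored edge direction."""
    if (v, u) in theta_edge:
        return theta_edge[(v, u)][(x, y)]
    return theta_edge[(u, v)][(y, x)]


def exact_message(v, u, y):
    """Base-2 message: sums and products of powers of two (grows quickly)."""
    total = 0
    for x in states[v]:
        term = 1 << (edge_param(v, x, u, y) + theta_node[v][x])
        for w in neighbors[v]:
            if w != u:
                term *= exact_message(w, v, x)
        total += term
    return total


def bit_length_message(v, u, y):
    """beta message: max over x of parameters plus incoming beta values."""
    return max(
        edge_param(v, x, u, y) + theta_node[v][x]
        + sum(bit_length_message(w, v, x) for w in neighbors[v] if w != u)
        for x in states[v]
    )


for y in states["u"]:
    exact = exact_message("v", "u", y)
    beta = bit_length_message("v", "u", y)
    # exact.bit_length() - 1 equals floor(log2(exact))
    print(y, exact, exact.bit_length() - 1, beta)
```

Because β only needs additions and comparisons of small integers, it avoids both the overflow of the exact messages and the division that normalization would require.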
  • 20. Evaluation: accuracy
      ◦ Different vertex degrees
      ◦ Up to 8000 nodes in the graph
      ◦ 100 runs of IntMRF
      ◦ The real-valued MRF counts as 100% accuracy.
      ◦ Result: scalable!
      ◦ [Figure: accuracy (ticks 0.85 to 0.95) over 1000 to 8000 nodes for Int, deg=4; Int, deg=8; Int, deg=16]
  • 21. Integer models use fewer clock cycles and less memory
      ◦ Integer MRF shows
          high accuracy: 86 to 92% of the full MRF for up to 8 states and 8000 nodes;
          high speed-up: 2.56 on Raspberry Pi and 2.34 on Sandy Bridge.
      ◦ Graphical models can now be computed on small devices.
      ◦ Using fewer clock cycles really saves energy.
      ◦ Nico Piatkowski, Sangkyun Lee, Katharina Morik (2014) The Integer Approximation of Undirected Graphical Models, ICPRAM, http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/piatkowski_etal_2014a.pdf
  • 22. Overview: big data and small devices; graphical models; computation on the batch layer; computation on the speed layer; integer random fields; spatio-temporal random fields; traffic scenarios; using the streams framework
  • 23. Spatio-temporal random fields
      ◦ The spatio-temporal graph is trained to predict each node’s maximum a posteriori probability with the marginal probabilities.
      ◦ Generative model predicting all nodes.
      ◦ Dimension: $T \cdot |V_0| \cdot |\mathcal{X}| + \left[(T-1)(|V_0| + 3|E_0|) + |E_0|\right] \cdot |\mathcal{X}|^2$
      ◦ Remember: vectors are sparse, and we have to exploit that!
      ◦ User queries:
          Given traffic densities at all nodes at t1, t2, t3, what is the probability of traffic density at node A at time t5?
          Given state “jam” at place A at time ts, which other places have a higher probability for “jam” in ts < t < te?
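The dimension formula can be made plausible by counting edges explicitly. The sketch below rests on an assumption about the graph layout that the slide does not spell out: every layer repeats the spatial graph (V0, E0), and between consecutive layers each node is linked to its own copy and to the copies of its spatial neighbors. Under that assumption the edge count is T|E0| + (T-1)(|V0| + 2|E0|), which equals the bracketed term of the formula. All names and the example sizes are hypothetical.

```python
def strf_dimension(T, n_nodes, n_edges, n_states):
    """Dimension formula from the slide."""
    st_nodes = T * n_nodes
    st_edges = (T - 1) * (n_nodes + 3 * n_edges) + n_edges
    return st_nodes * n_states + st_edges * n_states ** 2


def build_st_edges(T, nodes, edges):
    """Explicit edge set under the assumed layout (see the lead-in)."""
    st_edges = set()
    for t in range(T):
        for a, b in edges:
            st_edges.add(((a, t), (b, t)))          # spatial edge inside layer t
        if t + 1 < T:
            for v in nodes:
                st_edges.add(((v, t), (v, t + 1)))  # node to its own next copy
            for a, b in edges:
                st_edges.add(((a, t), (b, t + 1)))  # node to neighbor's next copy
                st_edges.add(((b, t), (a, t + 1)))
    return st_edges


# Made-up example: a path of 4 nodes, 48 time layers, 6 states per node.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
T, n_states = 48, 6

n_st_edges = len(build_st_edges(T, nodes, edges))
counted = T * len(nodes) * n_states + n_st_edges * n_states ** 2
print(n_st_edges, counted, strf_dimension(T, len(nodes), len(edges), n_states))
```

Even for this tiny spatial graph the parameter vector has more than twenty thousand components, which illustrates why sparsity matters for the model.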
  • 24. Training efficiently
      ◦ Reparametrization
      ◦ Sparseness and smoothness
          There are not many changes over time.
          Small ratio of non-zero entries.
      ◦ Regularization
          L1 and L2 norms now penalize the difference between θ(t-1) and θ(t).
      ◦ STRF uses 8 times less memory than the MRF model.
      ◦ Its distributed optimization converges about 2 times faster.
  • 25. Compressible parametrization
      ◦ STRF is 8 times smaller than the MRF model.
      ◦ The ratio of non-zero entries is quite small already for quite small L1 regularization.
      ◦ STRF’s distributed optimization converges faster than that of the MRF:
          Run MRF for 100 iterations and record the likelihood value.
          Run STRF until it reaches the same value of the objective function.
          Compare seconds.
  • 26. Evaluation: accuracy
      ◦ Smaller state spaces result in higher accuracies.
      ◦ Again, the MRF counts as 100% accuracy.
      ◦ Result: the state space is important!
      ◦ [Figure: accuracy (ticks 0.4 to 1) over 0 to 8000 nodes for Int, |X|=2; Int, |X|=4; Int, |X|=8; Int, |X|=16]
  • 27. Flood in the City
      ◦ The city of Dublin suffers from flooding due to rainfall and tide.
      ◦ Wanted: optimal traffic regulations based on traffic predictions.
      ◦ Insight EU Project, www.insight-ict.eu: Intelligent Synthesis and Real-time Response using Massive Streaming of Heterogeneous Data. Coordinator: Dimitrios Gunopulos, Univ. Athens, Greece
  • 28. Smart trip modeling for Dublin
      ◦ OpenStreetMap → graph topology
      ◦ Open Trip Planner: user query (v, w), route planning based on traffic costs.
      ◦ Traffic costs learned:
          Spatio-temporal random field based on the sensor data stream;
          Gaussian process estimates values for non-sensor locations.
      ◦ Framework for real-time processing of data streams, XML configuration of data flow, connecting data, traffic model and planner.
      ◦ [Figure: routes from v to w at 7:00 and at 8:00]
  • 29. Constructing the spatio-temporal graph of Dublin
      ◦ OpenStreetMap streets are segmented according to junctions.
      ◦ 966 sensors transmit traffic flow every 6 minutes (www.dublinked.ie).
      ◦ Aggregate sensor readings for 30 minutes, aggregate sensor nodes by 7NN.
      ◦ Traffic flow is discretized into 6 intervals of density.
      ◦ 48 time layers for each day (48 * 30 = 1440 minutes make a day).
      ◦ Training for every weekday.
      ◦ Predicting the density of each node and edge, interpreted as costs.
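The preprocessing steps can be sketched as follows. Several details are assumptions made for illustration: readings are averaged within a 30-minute layer using an integer mean, and the six classes reuse the interval bounds 0, 1-5, 6-20, 21-30, 31-60, 61- that appear in the confusion matrix of the evaluation slide. The function names are hypothetical, and the 7NN aggregation of sensor nodes is not covered.

```python
from bisect import bisect_left
from collections import defaultdict

LAYER_MINUTES = 30                 # 48 layers * 30 minutes = 1440 minutes per day
UPPER_BOUNDS = [0, 5, 20, 30, 60]  # assumed class bounds: 0, 1-5, 6-20, 21-30, 31-60, 61-


def time_layer(minute_of_day):
    """Index of the 30-minute layer (0..47) that a reading falls into."""
    return minute_of_day // LAYER_MINUTES


def density_class(flow):
    """Index (0..5) of the interval that contains the flow value."""
    return bisect_left(UPPER_BOUNDS, flow)


def discretize(readings):
    """Map (minute_of_day, flow) readings of one sensor to {layer: class}.

    Assumption: readings in the same layer are combined by an integer mean.
    """
    buckets = defaultdict(list)
    for minute, flow in readings:
        buckets[time_layer(minute)].append(flow)
    return {layer: density_class(sum(v) // len(v)) for layer, v in buckets.items()}


# Five made-up readings, one every 6 minutes, all inside the first layer.
print(discretize([(0, 4), (6, 6), (12, 5), (18, 5), (24, 5)]))
```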
  • 30. Using STRF for smart trip modeling: Evaluation
      ◦ Confusion matrix of predicting the number of vehicles (6 intervals) for all sensors and all half hours following 1 p.m. on Fridays, given the traffic at 1 p.m., tested on March 1, 8, 15, 22, and 29 (bold is true).
          True \ Predicted | 0     | 1-5   | 6-20  | 21-30 | 31-60 | 61-  | Prec
          0                | 840   | 32    | 10    | 6     | 3     | 0    | 0.943
          1-5              | 2     | 632   | 498   | 3     | 0     | 1    | 0.556
          6-20             | 91    | 156   | 12169 | 2006  | 83    | 25   | 0.838
          21-30            | 32    | 0     | 1223  | 5637  | 717   | 14   | 0.739
          31-60            | 43    | 0     | 60    | 893   | 1945  | 29   | 0.655
          61-              | 0     | 0     | 16    | 3     | 12    | 35   | 0.530
          Recall           | 0.833 | 0.771 | 0.871 | 0.659 | 0.705 | 0.34 |
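The Prec and Recall entries of this matrix can be recomputed from the counts. In the sketch below (function names are illustrative), the values labelled Prec on the slide are the diagonal entries divided by their row sums, and the values labelled Recall are the diagonal entries divided by their column sums.

```python
# Counts copied from the slide; rows and columns follow the six intervals
# 0, 1-5, 6-20, 21-30, 31-60, 61-.
CONFUSION = [
    [840, 32, 10, 6, 3, 0],
    [2, 632, 498, 3, 0, 1],
    [91, 156, 12169, 2006, 83, 25],
    [32, 0, 1223, 5637, 717, 14],
    [43, 0, 60, 893, 1945, 29],
    [0, 0, 16, 3, 12, 35],
]


def row_ratios(matrix):
    """Diagonal entry divided by its row sum (the column labelled Prec)."""
    return [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]


def column_ratios(matrix):
    """Diagonal entry divided by its column sum (the row labelled Recall)."""
    return [matrix[j][j] / sum(row[j] for row in matrix) for j in range(len(matrix))]


def accuracy(matrix):
    """Share of all test cases that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / sum(sum(row) for row in matrix)


print([round(r, 3) for r in row_ratios(CONFUSION)])
print([round(c, 3) for c in column_ratios(CONFUSION)])
print(accuracy(CONFUSION))
```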
  • 31. Confusion Matrices for Cluster-Based Discretization (all t)
          Friday    | 0-7     | 8-12    | 13-21   | 22-     | Precision
          0-7       | 139 518 | 38 551  | 45 350  | 50 501  | 0.5093
          8-12      | 37 419  | 51 284  | 37 153  | 27 112  | 0.3396
          13-21     | 23 535  | 25 863  | 115 081 | 66 171  | 0.4989
          22-       | 22 387  | 18 034  | 58 229  | 205 822 | 0.6760
          Recall    | 0.6260  | 0.3881  | 0.4499  | 0.5887  |

          Saturday  | 0-7     | 8-12    | 13-21   | 22-     | Precision
          0-7       | 167 073 | 57 448  | 34 370  | 16 590  | 0.6065
          8-12      | 46 944  | 102 703 | 68 194  | 26 181  | 0.4209
          13-21     | 31 626  | 45 606  | 140 866 | 51 136  | 0.5232
          22-       | 19 365  | 25 737  | 46 949  | 89 301  | 0.4937
          Recall    | 0.6304  | 0.4437  | 0.4859  | 0.4874  |
  • 32. Smart traffic for smart cities
      ◦ Spatio-temporal modeling is natural for traffic prediction.
      ◦ Several questions can be answered using the same learned model.
      ◦ The answers come along with their probabilities. This might be helpful for decision makers.
      ◦ Integration of spatio-temporal random fields into the Open Trip Planner and Gaussian information completion resulted in an excellent navigation system.
      ◦ Piatkowski, Lee, Morik (2013) Spatio-temporal random fields: compressible representation and distributed estimation, Machine Learning Journal 93:1, 115-140.
      ◦ Liebig, Piatkowski, Bockermann, Morik (2014) Predictive Trip Planning: Smart Routing in Smart Cities, Mining Urban Data Workshop at 17th Intern. Conf. on Extending Database Technology.
  • 33. Overview: introduction (big data, small devices); graphical models; computation on the batch layer; computation on the speed layer; integer random fields; spatio-temporal random fields; traffic scenarios; using the streams framework
  • 34. Streams framework
      ◦ Open-source Java environment for abstract orchestration of streaming functions by Christian Bockermann, GNU AGPL license.
      ◦ High-level API for sources and processing nodes, executable on the STORM engine and topology.
      ◦ The graph of nodes can be used for distributed processing.
      ◦ Easy integration of the MOA library.
      ◦ Eases integration of various engines.
      ◦ Christian Bockermann, Hendrik Blom (2012) The Streams Framework, Tech. Rep. SFB 876, http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/bockermann_blom_2012c.pdf
  • 35. Stream Processing Engines
      ◦ Fault tolerance by resending data that was not acknowledged (streams) vs. sophisticated recovery methods (MillWheel).
      ◦ State management by storing state in the process context (streams) vs. links to distributed storage systems (Samza, MillWheel on Apache Cassandra, BigTable).
      ◦ Explicit partitioning by flow graph vs. built upon a distributed computing environment managed by a central node.
  • 36. Comparison of stream processing engines. Christian Bockermann (2014) A Survey of the Stream Processing Landscape, Tech. Rep. SFB 876, http://sfb876.tu-dortmund.de/PublicPublicationFiles/bockermann_2014b.pdf
  • 37. Streams framework: streams to map reduce for an easy execution in Hadoop reading HDFS files.
  • 38. Conclusion
      ◦ Complex models such as MRF can be computed on small devices such as smartphones.
      ◦ Graphical models
          Parallel BP using GPU
          Approximation to Integer Random Fields
          Spatio-Temporal Random Fields
      ◦ Using the streams framework for integrating processes (e.g. MOA for learning), real-time processing, and documentation.
  • 39. References
      ◦ Nico Piatkowski (2011) Parallel Algorithms for GPU Accelerated Probabilistic Inference, http://biglearn.org/2011/files/papers/biglearn2011_submission_28.pdf
      ◦ Felix Jungermann, Nico Piatkowski, Katharina Morik (2010) Enhancing Ubiquitous Systems Through System Call Mining, LACIS at ICDM, http://www.computer.org/csdl/proceedings/icdmw/2010/4257/00/4257b338-abs.html
      ◦ Nico Piatkowski, Sangkyun Lee, Katharina Morik (2014) The Integer Approximation of Undirected Graphical Models, ICPRAM, http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/piatkowski_etal_2014a.pdf
      ◦ Piatkowski, Lee, Morik (2013) Spatio-temporal random fields: compressible representation and distributed estimation, Machine Learning Journal 93:1, 115-140.
      ◦ Liebig, Piatkowski, Bockermann, Morik (2014) Predictive Trip Planning: Smart Routing in Smart Cities, Mining Urban Data Workshop at 17th Intern. Conf. on Extending Database Technology.
      ◦ Christian Bockermann, Hendrik Blom (2012) The Streams Framework, Tech. Rep. SFB 876, http://www-ai.cs.uni-dortmund.de/PublicPublicationFiles/bockermann_blom_2012c.pdf
      ◦ Christian Bockermann (2014) A Survey of the Stream Processing Landscape, Tech. Rep. SFB 876, http://sfb876.tu-dortmund.de/PublicPublicationFiles/bockermann_2014b.pdf
      ◦ For streams software (not just scripts!) check: http://www-ai.cs.uni-dortmund.de/SOFTWARE/streams/