
Towards Smart Information Systems: Exploitation of Small and Big Data in the Public and Private Sector


Invited Talk given on 13 December 2013 at DFG-Graduiertenkolleg 1564 - Imaging New Modalities at University of Siegen, Germany.

Abstract:
In the Age of Information, the acquisition and analysis of data are considered to be the new oil that drives our economy. Indeed, the business models of an increasing number of companies depend on the exploitation of such data, e.g., for enriching their product portfolio with adaptive information services, also referred to as smart information services, that support us in nearly all areas of life. Although a formal definition of smart information services does not exist, they can be described as information services that adapt based on specific parameters such as user requirements or contextual conditions. Key techniques for the provision of such smart services are personalization, data mining, machine learning, knowledge discovery and information management techniques. With increasing computation power, these techniques enable us to identify patterns, test research hypotheses or create data models, hence shedding light on the potential usage of this data. However, although such techniques have matured over the past few years, there seems to be an increasing gap between current research trends in the analysis of data and the application of data analysis techniques in industry. This talk illustrates a range of real-life use cases that showcase the challenges and potentials of smart information systems.

Published in: Data & Analytics


  1. Competence Center Information Retrieval and Machine Learning
     Talk at DFG Graduiertenkolleg 1564 - Imaging New Modalities
     Towards Smart Information Systems: Exploitation of Small and Big Data in the Public and Private Sector
     Frank Hopfgartner
  2. The World of Data
     Less than one percent of the world’s data is analyzed.
  3. Data is Gold
     Smart Information Systems rely on the analysis of both small and big data to support us in our everyday life. Further, they enable us to access this data.
     Key technologies for the provision of such smart services are personalization, data mining, machine learning, knowledge discovery and information management techniques.
  4. Towards Smart Information Systems
     Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  5. Hype Cycle for Emerging Technologies 2013
     © Gartner Inc., Source: http://na1.www.gartner.com/imagesrv/newsroom/images/hype-cycle-pr.png
  6. Outline
     Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  7. Content Analytics
     Use Case 1: Identification of Affective Music Video Content
     [Acar, 2014] Esra Acar, Frank Hopfgartner, and Sahin Albayrak. Understanding Affective Content of Music Videos Through Learned Representations. In MMM'14: Proceedings of the 20th International Conference on Multimedia Modeling, pages 303-314. Springer Verlag, 01 2014. To appear.
  8. The Video Affective Analysis Method
     ▶ Using deep learning methods (convolutional neural networks, CNNs) to learn mid-level representations from automatically extracted low-level features.
     ▶ Exploit audio and visual modality of videos and employ MFCC and color in RGB space in order to build higher level audio and visual representations.
  9. Dataset & Groundtruth
     ▶ The DEAP dataset: a dataset of YouTube videos for the analysis of human affective states using electroencephalogram, physiological and video signals.
     ▶ Arousal and valence values of music clips are in the range of 1 to 9.
       § arousal ranging from calm/bored to stimulated/excited.
       § valence ranging from unhappy/sad to happy/joyful.
     Source: http://www.eecs.qmul.ac.uk/mmv/datasets/deap
  10. Results
      Classification accuracies on the DEAP dataset (with audio-visual representations)
        Method                                 Accuracy (%)
        Our method (mid-level audio-visual)    52.63
        The low-level audio-visual method      36.84
      Classification accuracies on the DEAP dataset (with unimodal representations)
        Method                                 Accuracy (%)
        Our method (mid-level audio)           47.37
        Our method (mid-level visual)          36.84
        The low-level audio method             36.84
        The low-level visual method            31.58
  11. Content Analytics
      Use Case 2: Violence Detection in Hollywood Movies
      [Acar, 2013b] Esra Acar, Frank Hopfgartner, and Sahin Albayrak. Violence Detection in Hollywood Movies by the Fusion of Visual and Mid-level Audio Cues. In ACM MM'13: Proceedings of the 21st ACM International Conference on Multimedia, pages 717-720. ACM, 10 2013.
      [Acar, 2013a] Esra Acar, Frank Hopfgartner, and Sahin Albayrak. Detecting Violent Content in Hollywood Movies by Mid-Level Audio Representations. In CBMI'13: Proceedings of the Workshop on Content-Based Multimedia Indexing, pages 73-78. Springer Verlag, 6 2013.
  12. Discriminative Mid-Level Audio Features
      ▶ We use mid-level audio features based on MFCCs (i.e., BoAW approach).
      ▶ The BoAW approach with two different coding schemes
        § Vector quantization (by k-means clustering)
          § dividing feature vectors into groups, where each group is represented by its centroid point (e.g., k-means clustering algorithm).
        § Sparse coding (by the LARS algorithm)
          § representing a feature vector as a linear combination of an over-complete set of basis vectors.
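The vector-quantization variant of the bag-of-audio-words (BoAW) idea can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the codebook has already been learned (for example by k-means), and the function names `nearest_centroid` and `boaw_histogram`, the 2-D toy frames and the codebook values are all hypothetical stand-ins for real MFCC vectors.

```python
from math import dist


def nearest_centroid(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (Euclidean)."""
    return min(range(len(codebook)), key=lambda i: dist(vector, codebook[i]))


def boaw_histogram(frames, codebook):
    """Quantize each frame-level feature vector and return a normalized word histogram."""
    counts = [0] * len(codebook)
    for frame in frames:
        counts[nearest_centroid(frame, codebook)] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


# Toy 2-D "MFCC" frames and a 3-word codebook (illustrative values only).
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
frames = [(0.1, -0.1), (0.9, 1.2), (1.1, 0.8), (4.8, 5.1)]
histogram = boaw_histogram(frames, codebook)
```

Each video shot would then be described by one such histogram, which serves as the mid-level representation passed to a classifier. Sparse coding would replace the hard nearest-centroid assignment with a weighted combination of basis vectors.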
  13. Learning Violence Detection Model
      Learning a Violence Model
  14. Dataset & Groundtruth
      ▶ Dataset:
        § 32,708 video shots from 18 Hollywood movies of different genres (ranging from extremely violent movies to movies without violence).
        § Training set: 26,138 video shots from 15 movies.
        § Test set: 6,570 video shots from 3 movies.
      ▶ Ground truth:
        § generated by 7 human assessors. Violent movie segments are annotated at the frame-level.
        § Each video shot is labeled as violent or non-violent.
      The characteristics of training and test datasets
  15. Results & Discussions (1)
      Average Precision at 100 for the Baseline and Our Methods
      Average Precision at 20 & 100 and R-precision for the VQ- and SC-based methods
  16. Results & Discussions (2)
      Average Precision at 20 & 100 and R-precision on Independence Day
      Average Precision at 20 & 100 and R-precision on Dead Poets Society
      Average Precision at 20 & 100 and R-precision on Fight Club
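For readers unfamiliar with the metrics named on the two results slides, the sketch below shows one common way to compute average precision at k and R-precision from a ranked list of shots. The normalization of AP@k by min(k, number of relevant items) is an assumption; the cited papers may normalize differently. The ranked labels are made up for illustration.

```python
def average_precision_at_k(ranked_labels, k, total_relevant):
    """AP@k: sum of precision@i over relevant ranks i <= k, divided by min(k, total_relevant)."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_labels[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    denominator = min(k, total_relevant)
    return precision_sum / denominator if denominator else 0.0


def r_precision(ranked_labels, total_relevant):
    """Precision at rank R, where R is the number of relevant (violent) shots."""
    if total_relevant == 0:
        return 0.0
    return sum(ranked_labels[:total_relevant]) / total_relevant


# 1 = violent shot, 0 = non-violent shot, ordered by classifier confidence (toy data).
ranked = [1, 0, 1, 1, 0, 0, 1, 0]
total_violent = 4
ap_at_5 = average_precision_at_k(ranked, 5, total_violent)
r_prec = r_precision(ranked, total_violent)
```

Both measures reward rankings that place violent shots near the top, which is why they suit a retrieval-style evaluation better than plain accuracy on a heavily imbalanced test set.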
  17. Towards Smart Information Systems
      Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  18. The Big Data Problem
      What is Big Data?
      ▶ Big Data is not only “big”
      ▶ Big Data is data that is too big, too fast, and too hard for existing systems to handle
      ▶ Big Data can be characterized by the four V’s
        § Volume: rapid increase in the amount of data to be stored
        § Variety: heterogeneous data
        § Velocity: processing data streams in near real-time
        § Veracity: missing, incomplete, and noisy data
  19. Big Data
      Use Case 3: Recommending News Articles in Real-Time
      [Kille, 2013] Benjamin Kille, Frank Hopfgartner, Torben Brodt, and Tobias Heintz. The plista Dataset. In NRS'13: Proceedings of the International Workshop and Challenge on News Recommender Systems, pages 14-21. ACM, 10 2013.
      [Tavakolifard, 2013] Mozhgan Tavakolifard, Jon Atle Gulla, Kevin C. Almeroth, Frank Hopfgartner, Benjamin Kille, Till Plumbaum, Andreas Lommatzsch, Torben Brodt, Arthur Bucko, and Tobias Heintz. Workshop and Challenge on News Recommender Systems. In RecSys'13: Proceedings of the International ACM Conference on Recommender Systems, pages 481-482. ACM, 10 2013.
  20. Recommender Systems help users to find items that they might not have found by themselves.
  21. Items?
      Movies, Songs, Images, …
  22. Example: News Articles
      Source: T. Brodt of plista.com
  23. Evaluation in a Living Lab
      ▶ News Recommender Challenge organized at ACM RecSys 2013 together with plista GmbH
      ▶ Leading provider of a recommendation and advertisement network in Central Europe
      ▶ Thousands of content providers use plista’s service to generate recommendations for their customers (i.e., web users)
      ▶ > 10.5 billion requests … per month
  24. Open Recommender Platform
  25. Some results…
      [Figure: Clicks and Impressions in total during the Challenge. Axes: No. of clicks (0 to 50,000) and No. of impressions (0 to 9×10^6), per day from 31 Aug to 20 Sep.]
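One way to relate the two plotted series is the click-through rate (clicks divided by impressions). The slide itself only plots the totals, so treating CTR as the quantity of interest is an assumption here, and the daily figures in the sketch are hypothetical rather than taken from the challenge data.

```python
def click_through_rate(clicks, impressions):
    """Return clicks divided by impressions, or 0.0 when nothing was shown."""
    return clicks / impressions if impressions else 0.0


# Hypothetical daily totals (clicks, impressions); not taken from the challenge data.
daily_totals = {
    "day 1": (30_000, 6_000_000),
    "day 2": (45_000, 9_000_000),
}
daily_ctr = {day: click_through_rate(c, i) for day, (c, i) in daily_totals.items()}
overall_ctr = click_through_rate(
    sum(c for c, _ in daily_totals.values()),
    sum(i for _, i in daily_totals.values()),
)
```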
  26. Advertisement
      ▶ Want to participate in a living lab?
      ▶ Campaign-style evaluation lab CLEF-NewsREEL of CLEF 2014
      ▶ Started in November 2014
      ▶ Two tasks:
        § Predict the articles a user will click on (offline evaluation)
        § Recommend articles in real-time (online evaluation) over several months
      ▶ Website: http://www.clef-newsreel.org/
      ▶ Twitter: @clefnewsreel
  27. Towards Smart Information Systems
      Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  28. Information Aggregation
      Use Case 4: Information Aggregation in the Berlin Administration Offices
      [Gunadi, 2014] Erwin Gunadi, Michael Meder, Till Plumbaum, Christian Scheel, Frank Hopfgartner, and Sahin Albayrak. Distributed Enterprise Search Using Software Agents. In AAMAS'14: Proceedings of the International Conference on Autonomous Agents and Multi-Agent Systems, 05 2014. To appear.
  29. Motivation
      ▶ Information is stored in many heterogeneous sources => search is often very cumbersome and inefficient.
      ▶ Challenge: Efficient search over all sources must be possible with only one search request.
  30. Use Case – Net Infrastructure of the Berlin Administration
  31. Search Engines – Approaches
      Approaches to build an index for search engines:
      1. Central Crawling (Indexing)
      2. Distributed Crawling (Indexing)
  32. 1. Central Crawling (Indexing)
  33. Search Engines – Central Crawler
      ▶ Many Problems
        § all data (Index) has to be saved centrally and connected
        § possible security issues (data silo)
        § central crawler needs read access to all networks (also desktop)
        § high administrative effort
        § privacy issues (confidential data, data superiority)
        § desktop search not possible (central read access should not be allowed)
  34. Distributed Crawling (Indexing)
  35. Search Engines – Distributed Crawler (Spider)
      ▶ Many Problems?
        § all data (Index) has to be saved centrally and connected
        § possible security issues (data silo)
        § central crawler needs read access to all networks (also desktop)
        § high administrative effort
        § privacy issues (confidential data, data superiority)
        § desktop search not possible (central read access should not be allowed)
      …solved by PIA
  36. PIA Enterprise – Features
      ▶ Solution: PIA Enterprise
        § … is a distributed, intelligent search engine (based on JIAC agents).
        § … aggregates and analyzes information from heterogeneous sources.
        § … simplifies search, editing and sharing of information.
        § … is a personalized information system.
        § … alerts proactively on new or changed information.
        § … admin tool to maintain, monitor and add new sources at runtime
        § … considers existing rights management systems (LDAP, firewall).
  37. Research Challenges
      ▶ Research Topics
        § Topic-Based Retrieval
          § search quality improvement (e.g. Query Expansion, Semantics)
        § Aggregated Search
          § aggregation of results from independent distributed indices
          § adaptive aggregation model and methods (for highly dynamic indices)
          § uniform relevance scoring
          § intelligent diversity and ranking
        § Sensor Data Aggregation
          § handle vast amounts of data
        § Gamification
          § gamification design optimization
          § player type recognition/identification/modeling
        § Text Summarization
          § intelligent document summarization
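The aggregated-search topics above (merging results from independent indices under a uniform relevance score) can be illustrated with a small sketch. Min-max normalization per index and keeping each document's best normalized score are assumptions chosen for simplicity; nothing in the slides says PIA works this way. The index names, documents and scores are hypothetical.

```python
def min_max_normalize(results):
    """Rescale (doc_id, score) pairs from one index to the range [0, 1]."""
    if not results:
        return []
    scores = [score for _, score in results]
    low, high = min(scores), max(scores)
    if high == low:
        return [(doc_id, 1.0) for doc_id, _ in results]
    return [(doc_id, (score - low) / (high - low)) for doc_id, score in results]


def aggregate(result_lists, top_n=5):
    """Merge normalized result lists, keeping each document's best score."""
    best = {}
    for results in result_lists:
        for doc_id, score in min_max_normalize(results):
            if score > best.get(doc_id, -1.0):
                best[doc_id] = score
    # Sort by descending score, then by document id for a stable order.
    return sorted(best.items(), key=lambda item: (-item[1], item[0]))[:top_n]


# Two hypothetical indices that score on different scales.
file_server_hits = [("memo.pdf", 12.0), ("plan.doc", 7.0), ("notes.txt", 2.0)]
wiki_hits = [("Budget", 0.75), ("Staffing", 0.5), ("plan.doc", 0.25)]
merged = aggregate([file_server_hits, wiki_hits], top_n=3)
```

Without the normalization step, the file server's raw scores would dominate the merged list simply because they live on a larger scale, which is the problem uniform relevance scoring is meant to address.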
  38. Towards Smart Information Systems
      Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  39. Complex Event-Processing
      Use Case 5: Energy Disaggregation
      [Spiegel, 2014] Stephan Spiegel and Sahin Albayrak. Energy Disaggregation meets Heating Control. In SAC'14: Proceedings of the 29th Symposium On Applied Computing. ACM, 03 2014. To appear.
  40. Motivation
      ▶ Optimization of Heating Schedules based on Energy Profiles
  41. Research Challenges
      ▶ Detection of resident presence in home environment
        § Overall energy consumption is an indicator for presence, but some devices continually consume energy regardless of human interaction
      ▶ Recognition of resident activities
        § Knowledge about the usage of home appliances allows us to draw conclusions on the context of the resident(s)
      ▶ Recommendation of optimized heating schedules
        § Based on the information about presence and activities we are able to gradually learn characteristic behavior, which in turn allows us to create personalized schedules for heating control
  42. Energy Disaggregation
      ▶ Is it possible to recognize individual home appliances (and their operation mode) in an overall energy consumption profile that is corrupted with noise and superimposed with the energy footprint of unrelated devices?
  43. REDD Dataset
      ▶ 6 households, around 15-20 appliances, readings at a rate of 1 Hz
  44. Classification Accuracy per Household
      ▶ NB = Naïve Bayes
      ▶ FHMM = Factorial Hidden Markov Model
      ▶ CT = Classification Tree
      ▶ 1NN = One Nearest Neighbor Classifier
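Of the four classifiers listed, the one-nearest-neighbor (1NN) classifier is the simplest to sketch. The example below is only an illustration of the technique: the two features (mean power and power change, in watts), the appliance labels and all values are hypothetical and are not taken from the REDD dataset or the cited paper.

```python
from math import dist


def one_nearest_neighbor(train, query):
    """Return the label of the training sample closest to `query` (Euclidean distance)."""
    _, label = min(train, key=lambda sample: dist(sample[0], query))
    return label


# Each sample: ((mean power in W, power change in W), appliance label). Illustrative values only.
train = [
    ((120.0, 5.0), "refrigerator"),
    ((1500.0, 900.0), "microwave"),
    ((60.0, 1.0), "lighting"),
]
predicted = one_nearest_neighbor(train, (1400.0, 850.0))
```

In practice the features would need scaling so that no single dimension dominates the distance, and the training set would hold many labeled windows per appliance.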
  45. Classification Accuracy per Appliance
  46. Observed and Estimated Device States
  47. Confusion Matrix
  48. Towards Smart Information Systems
      Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
  49. Gamification (User Engagement)
      Use Case 6: DAIKnow: A Gamified Enterprise Bookmarking System
      [Meder, 2014] Michael Meder, Till Plumbaum, and Frank Hopfgartner. DAIKnow: A Gamified Enterprise Bookmarking System. In ECIR’14: Proceedings of the European Conference on Information Retrieval. Springer Verlag, 04 2014. To appear.
      [Meder, 2013] Michael Meder, Till Plumbaum, and Frank Hopfgartner. Perceived and Actual Role of Gamification Principles. In CGCloud’13: Proceedings of the International Workshop on Crowdsourcing and Gamification in the Cloud. IEEE, 12 2013.
  50. Fun vs. Work
  51. Fun vs. Work
  52. DAIKnow is a social bookmarking platform for company intranets which supports the centralized and ubiquitous access to knowledge and information for all employees.
      How do I motivate employees to use enterprise systems?
  53. “Gamification refers to the use of design elements characteristic for games in a non-game context.”
      Deterding, S., Dixon, D., Khaled, R., and Nacke, L. From game design elements to gamefulness: defining gamification. Proc. Int. Academic MindTrek Conference (2011), 9–15.
  54. Notifications
  55. Badges
  56. Leaderboard
  57. DAI Know Gamified – User Activity
      ▶ Leaderboard is the most visited gamified page (every day)
      ▶ Overall number of requests declines => interest in the system declines as well
  58. DAI Know Gamified – Leaderboard
  59. Advertisement
      Call for Papers
      We invite the submission of position papers as well as novel research papers and demos addressing problems related to gamification in IR. Topics include but are not limited to:
        § Gamification approaches in a variety of information-seeking contexts
        § User engagement and motivational factors of gamification
        § Player types, contests, cooperative games
        § Challenges and opportunities of applying gamification in IR
        § Gamification design and game mechanics
        § Game based work and crowdsourcing
        § Applications and prototypes
        § Evaluation of gamification techniques
      Submissions from outside the core IR community and from industry are actively encouraged.
      http://gamifir2014.dai-labor.de
  60. Towards Smart Information Systems
      Content Analytics; Big Data; Information Aggregation; Complex Event-Processing; Gamification (User Engagement)
      Thank you for your attention
  61. Acknowledgements
      1. Members of CC IRML, in particular:
         1. Esra Acar
         2. Michael Meder
         3. Till Plumbaum
         4. Brijnesh Johannes Jain
         5. Benjamin Kille
         6. Stephan Spiegel
      2. Funding agencies:
         1. EU FP7
         2. ITDZ Berlin
         3. BMBF
      3. Project partners
  62. More use cases
      F. Hopfgartner (Ed.). Smart Information Systems: Computational Intelligence for Real-Life Applications. Advances in Computer Vision and Pattern Recognition, Springer Verlag. To appear.
  63. www.dai-labor.de
      Fon +49 (0) 30 / 314 – 74
      Fax +49 (0) 30 / 314 – 74 003
      DAI-Labor, Technische Universität Berlin, Fakultät IV – Elektrotechnik & Informatik, Sekretariat TEL 14, Ernst Reuter Platz 7, 10587 Berlin
      Competence Center Information Retrieval and Machine Learning
      Frank Hopfgartner, PhD, Director
      frank.hopfgartner@tu-berlin.de
      202
