Big Data analytics and models by Esteban Moro

Transcript of "Big Data analytics and models"

  1. Workshop BBVA – Open Innovation: Analytics & Models. Esteban Moro, Alejandro Llorente. www.iic.uam.es. INNOVA CHALLENGE, BigDataSpain 7/11
  2. INNOVA CHALLENGE, BigDataSpain 7/11
  3. https://www.centrodeinnovacionbbva.com/en/innovachallenge
  4. Analytics and Models: challenge participant “roadmap”. Diagram labels: Data, Maps, Infrastructures/Places, Activity, Mining, Analysis, Development, App, Content, Models, Visualization.
  5. Summary
     • Introduction to geo-tagged data
     • Access to (open) geo-tagged data
     • Example: development of a geolocalized recommender app
  6. Introduction to geo-tagged data
  7. Introduction to geo-tagged data. Information: person, event, infrastructure. Geography: GPS coordinates, zone, city.
  8. Geospatial BigData: Activity (Transport), Maps, Satellite Images, Social Media, Sensors.
  9. Geo-tagged BigData applications
     With geo-tagged data we can:
     • Measure zone/area occupation & activity
     • Identify flows of persons/money between different areas
     • …
     With those data we can build applications in:
     • Geo-social analysis
     • Geomarketing
     • Optimal allocation of resources
     • Fraud detection
     • Event detection
     • …
  10. Geo-social analysis: use of pervasive sensors (mobile phones, social media) to model movement and communication of people in urban areas.
  11. Geo-social analysis: characterization of urban neighborhoods according to their social/commercial use. [Figure: geolocation study in Madrid, location Puerta del Sol. Total check-ins: 2651 (30.5 per day); unique users in the area: 1231. Plots of check-in counts by day of the week, by hour, and over time (April to June 2011) for the categories arts_entertainment, food, nightlife and shops, plus rankings of places and users by number of check-ins.]
  12. Fraud detection: use merchant location and/or IP address in online transactions to detect fraud.
  13. Geomarketing: Bars; Shops; Manage sales risk.
  14. Optimal resource allocation: optimize cash holdings in bank branches, minimizing the associated costs. Identify the best placement for a new shop/branch. Map labels: Bars, Shops.
  15. Event detection: detect unexpected behavior using social/mobile/urban sensors.
  16. Access to (open) geographical data
  17. Geographical data: Map, Infrastructure/places, Activity.
  18. Types of data: Maps; Economic/Demographic data; Activity (Twitter, BBVA API).
  19. Maps :: Google Maps
     Google Maps has a number of different services/APIs, with different restrictions and protocols. They let you define maps, routes, markers, etc.
     Example: get a static map (without authentication).
     Base URL: http://maps.google.com/maps/api/staticmap
     Parameters:
     • center: 40.4153,-3.6875
     • size: 640x640
     • maptype: mobile
     • format: png32
     • sensor: true
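As a minimal sketch of the static-map example on slide 19, the request URL can be assembled from the listed parameters. The base URL and values come from the slide; the function name `static_map_url` is hypothetical, and the 2013 endpoint may no longer respond without an API key, so the code only builds the URL and does not call it.

```python
from urllib.parse import urlencode

# Base URL and parameter values are taken from the slide.
BASE_URL = "http://maps.google.com/maps/api/staticmap"


def static_map_url(lat, lon, size="640x640", maptype="mobile",
                   fmt="png32", sensor="true"):
    """Return the request URL for a static map centred on (lat, lon)."""
    params = {
        "center": f"{lat},{lon}",
        "size": size,
        "maptype": maptype,
        "format": fmt,
        "sensor": sensor,
    }
    # safe="," keeps the comma in "lat,lon" readable instead of %2C.
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


url = static_map_url(40.4153, -3.6875)
print(url)
```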
  20. Maps :: OpenStreetMap
     Open and collaborative project to create and distribute free maps.
     Different APIs to get information about routes, points, maps, etc.
     There are a number of mapping projects (applications) built on top of OSM with very different purposes.
     Example: get the route between two locations with MapQuest.
     Base URL: http://open.mapquestapi.com/guidance/v1/
     Parameters:
     • Key: authentication key
     • From: latitude and longitude of the origin, in JSON
     • To: latitude and longitude of the destination, in JSON
  21. Maps :: shapefiles
     Geospatial vector data format for geographical information.
     • Regions, points, paths defined as points, lines, polygons
     • Each of them usually has attributes that describe it: region codes, names, population, etc.
     pyshp: http://code.google.com/p/pyshp/
     maptools: http://cran.r-project.org/web/packages/maptools
     http://www.naturalearthdata.com/downloads/
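Since shapefile regions are stored as polygons with attributes, a common first task is deciding which region a coordinate falls in. The sketch below uses the standard ray-casting test in pure Python. The rectangle and its `name` attribute are hypothetical stand-ins for a polygon and record that a library such as pyshp would load from a real shapefile.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices; the last vertex joins the first.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from (x, y) towards +x cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical region: a rectangle around central Madrid, as (lon, lat) vertices.
region = {
    "name": "example_zone",
    "polygon": [(-3.72, 40.40), (-3.68, 40.40), (-3.68, 40.43), (-3.72, 40.43)],
}

print(point_in_polygon(-3.703, 40.417, region["polygon"]))  # Puerta del Sol
print(point_in_polygon(-3.600, 40.417, region["polygon"]))  # further east
```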
  22. Maps :: shapefiles. Editing and visualization of shapefiles: http://www.qgis.org
  23. Maps :: Spain cartography. CartoCiudad (Ministerio de Fomento): shapefiles for each province at municipality and postal code levels. They also include data about the urban background. http://www.cartociudad.es/portal/
  24. Maps :: Madrid cartography. Nomecalles (CAM): shapefiles, POIs (museums, theaters, health services), subway (stations), etc. http://www.madrid.org/nomecalles/DescargaBDTCorte.icm Resolution level: municipalities, districts, postal codes, etc.
  25. Maps :: Barcelona province cartography. Plan territorial metropolitano de Barcelona – Generalitat de Catalunya (link).
  26. Maps :: Barcelona city cartography. Open data gencat: Catalonia cartography (link).
  27. Maps :: Barcelona city cartography. Plan territorial metropolitano de Barcelona – Generalitat de Catalunya (link). This website also has data about mobility, economic development, population, etc. at the district level. There is nothing at this level of detail for Madrid. Solution: use other data sources to estimate them (see below).
  28. Demographic/Economic data :: Spain
     Demographic data: Instituto Nacional de Estadística (INE). Census by province / municipality / census section (link).
     Economic data: Servicio Público de Empleo Estatal (SEPE). Unemployment by municipality (link).
  29. Demographic/Economic data :: Madrid
     Madrid city: Madrid City Council database, http://www-2.munimadrid.es/CSE6/jsps/menuBancoDatos.jsp. Population by districts, neighborhoods, etc.
     Madrid region: Comunidad de Madrid database, http://www.madrid.org/desvan/Inicio.icm?enlace=almudena. Population by municipality. Economic data by municipality.
  30. Demographic/Economic data :: Barcelona
     Barcelona city: Departament d’Estadística, http://www.bcn.cat/estadistica/castella/. Population by district. Unemployment by district.
     Catalonia region: Idescat (Institut d’Estadística de Catalunya), http://www.idescat.cat/es/. Population by municipality. Economic data by municipality.
  31-33. Other data sources :: Google Points of Interest. Google API Console.
  34. Other data sources :: Google Points of Interest
     Points of interest around Puerta del Sol (Madrid)
     Service 1: Places Search. Parameters: location: 40.417,-3.703; radius: 1000
     Service 2: Places Details. Parameters: reference: place code
  35. Other data sources :: Weather forecast
     GFS: Global Forecast System. OPeNDAP protocol. Python implementation: pydap.
     Query format:
     SERVER = http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/
     DATE = YYYYMMDD
     HOUR = HH
     VAR = weather metric (tmp2m, ugrd10m, pressfc, …)
     LAT = latitude interval [259:263] (0.5º steps from the South Pole)
     LON = longitude interval [710:714] (0.5º steps from Greenwich)
     QUERY = <SERVER>gfs_hd<DATE>/gfs_hd_<HOUR>z.dods?<VAR>[0:0]<LAT><LON>
     dataset = open_dods(QUERY)
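The index intervals on slide 35 follow from the 0.5º grid: rows count up from the South Pole and columns count east from Greenwich. A minimal sketch of that conversion and of the query string, assuming the server layout shown on the slide is still valid (it may not be). The helper names, the two-cell half-width and the sample date are hypothetical. The resulting string is what the slide passes to pydap's `open_dods`; the code below does not open it.

```python
GRID_STEP = 0.5  # degrees, as stated on the slide
SERVER = "http://nomads.ncep.noaa.gov:9090/dods/gfs_hd/"  # from the slide


def lat_index(lat):
    """Grid row for a latitude, counting 0.5-degree steps from the South Pole."""
    return round((lat + 90.0) / GRID_STEP)


def lon_index(lon):
    """Grid column for a longitude, counting 0.5-degree steps east of Greenwich."""
    return round((lon % 360.0) / GRID_STEP)


def build_query(date, hour, var, lat, lon, halfwidth=2):
    """Build the query for a small window of grid cells around (lat, lon)."""
    i, j = lat_index(lat), lon_index(lon)
    lat_range = f"[{i - halfwidth}:{i + halfwidth}]"
    lon_range = f"[{j - halfwidth}:{j + halfwidth}]"
    return (f"{SERVER}gfs_hd{date}/gfs_hd_{hour}z.dods?"
            f"{var}[0:0]{lat_range}{lon_range}")


# A point just west of Madrid reproduces the intervals on the slide.
# The date is a hypothetical example value.
query = build_query("20131107", "00", "tmp2m", lat=40.5, lon=-4.0)
print(query)
```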
  36-38. Activity :: data from Twitter API. Developers webpage: http://dev.twitter.com
  39. Activity :: data from Twitter API. Developers webpage: http://dev.twitter.com. Consumer Key, Consumer Secret, Access token, Access token secret.
  40. Activity :: data from Twitter API
     OAuth authentication: Consumer Key, Consumer Secret, Access token, Access token secret.
     REST API: several queries with parameters; the number of requests is limited.
     Stream API: only one query (with parameters); requests are not time-limited.
  41. Activity :: data from Twitter API
     Stream API example: geolocalized tweets in the Madrid region.
     API service: POST statuses/filter
     Parameters: locations: -4.59, 39.90, -3.04, 41.17
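The `locations` value on slide 41 is a bounding box given as south-west longitude, south-west latitude, north-east longitude, north-east latitude. A small sketch of how the parameter string can be built and how a received coordinate can be checked against the same box on the client side. The helper names are hypothetical, and no connection to the streaming API is made here.

```python
# Bounding box from the slide: SW longitude, SW latitude, NE longitude, NE latitude.
MADRID_BBOX = (-4.59, 39.90, -3.04, 41.17)


def locations_param(bbox):
    """Format a bounding box as the comma-separated 'locations' value."""
    return ",".join(str(v) for v in bbox)


def in_bbox(lon, lat, bbox):
    """True if the point (lon, lat) falls inside the bounding box."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


print(locations_param(MADRID_BBOX))
print(in_bbox(-3.703, 40.417, MADRID_BBOX))  # Puerta del Sol
print(in_bbox(2.17, 41.39, MADRID_BBOX))     # Barcelona
```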
  42. Activity :: data from Twitter API
     Stream API. As we said before, there are no data for Madrid about administrative zones below the municipality level, but we can estimate some of them with Twitter.
     • Example: population by postal code
     1. Round geographical coordinates to the 3rd decimal place (square cells roughly 100 meters on a side).
     2. Find the postal code each user visits most and define it as his/her residence. Count the number of residents by postal code.
     3. Visualize.
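The first two steps of the population estimate on slide 42 can be sketched as follows. The input shapes are assumptions: tweets are `(user, lat, lon)` tuples, and `cell_to_zip` is a hypothetical lookup from rounded cells to postal codes, which in practice would be derived from the postal-code shapefiles. All sample values are made up.

```python
from collections import Counter, defaultdict


def cell(lat, lon):
    """Step 1: round coordinates to the 3rd decimal place."""
    return (round(lat, 3), round(lon, 3))


def residents_by_postal_code(tweets, cell_to_zip):
    """Step 2: assign each user to their most visited postal code and count."""
    visits = defaultdict(Counter)
    for user, lat, lon in tweets:
        zipcode = cell_to_zip.get(cell(lat, lon))
        if zipcode is not None:  # skip cells with no known postal code
            visits[user][zipcode] += 1

    residents = Counter()
    for user, counts in visits.items():
        home, _ = counts.most_common(1)[0]
        residents[home] += 1
    return residents


# Hypothetical lookup and tweets, for illustration only.
cell_to_zip = {(40.417, -3.703): "28013", (40.415, -3.684): "28009"}
tweets = [
    ("ana", 40.4171, -3.7032), ("ana", 40.4169, -3.7029), ("ana", 40.4152, -3.6841),
    ("luis", 40.4149, -3.6838), ("luis", 40.4151, -3.6842),
]
residents = residents_by_postal_code(tweets, cell_to_zip)
print(dict(residents))
```

Taking the most visited postal code as the residence is the heuristic stated on the slide; restricting the count to night-time tweets would be one possible refinement, but the slides do not mention it.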
  43-44. Activity :: data from Twitter API. Stream API.
  45. Activity :: data from BBVA API. https://www.centrodeinnovacionbbva.com/signup
  46-48. Activity :: data from BBVA API. https://developer.bbva.com/panel
  49. Activity :: data from BBVA API
     Getting the authentication data:
     1. With the APP_ID and APP_KEY, generate the authorization code by concatenating both strings with ":" and encoding the result in base64.
     2. This authorization code is added to the HTTP request header.
     Example:
     APP_ID = "iic_formacion_innovachallenge"
     APP_KEY = "0f1d750a5baea6c7022452d0d2ece01fc5901ad7"
     str_to_encode = "iic_formacion_innovachallenge:0f1d750a5baea6c7022452d0d2ece01fc5901ad7"
     auth = strToBase64(str_to_encode)
     Request = HttpRequest(SERVICE, PARAMETERS, header = {'Authorization': auth})
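The authorization code described on slide 49 can be produced with the Python standard library. This is a sketch: the credentials below are placeholders, the function name is hypothetical, and the header is built exactly as the slide shows it (the bare base64 string under `Authorization`). Whether the service expects any prefix before the code is not stated in the slides.

```python
import base64


def authorization_code(app_id, app_key):
    """Join APP_ID and APP_KEY with ':' and encode the result in base64."""
    raw = f"{app_id}:{app_key}".encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


# Placeholder credentials; use the values from your own developer panel.
APP_ID = "my_app_id"
APP_KEY = "my_app_key"

headers = {"Authorization": authorization_code(APP_ID, APP_KEY)}
print(headers)
```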
  50. Activity :: CUSTOMER_ZIPCODES example. Parameters. (INNOVA CHALLENGE Workshop, BigDataSpain 7/11, 30th October)
  51. Activity :: CUSTOMER_ZIPCODES example. Extracting data.
  52. Activity :: CUSTOMER_ZIPCODES example. Building the adjacency list.
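The transcript does not preserve the code shown on the adjacency-list slide, so the following is only a sketch of the idea. It assumes the extracted data has already been reduced to `(source_zip, target_zip, num_cards)` tuples. That record shape, the function name and the sample figures are all hypothetical.

```python
from collections import defaultdict


def build_adjacency(records):
    """Map each source postal code to {target postal code: total cards}."""
    adjacency = defaultdict(dict)
    for source, target, num_cards in records:
        adjacency[source][target] = adjacency[source].get(target, 0) + num_cards
    return dict(adjacency)


# Made-up flows between postal codes, for illustration only.
records = [
    ("28013", "28045", 120),
    ("28013", "28009", 80),
    ("28001", "28013", 40),
    ("28013", "28045", 5),  # repeated pairs are summed
]
adjacency = build_adjacency(records)
print(adjacency)
```

Each `(source, target, weight)` entry of this structure corresponds to one weighted edge of the graph built on the following slide.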
  53. Activity :: CUSTOMER_ZIPCODES example. Building and plotting the graph.
  54. Activity :: CUSTOMER_ZIPCODES example. Economic flows from Puerta del Sol. API service: customer_zipcodes. Parameters: date_min: 201304; date_max: 201304; zipcode: 28013; by: cards; group_by: month
  55. Example: development of a geolocalized recommender app.
  56. Recommender systems :: Introduction
     Objective: recommend to users which areas to visit according to their profile, residence, preferences, etc., using information about what similar users do.
     Data used:
     1. API Innova Challenge – CARDS_CUBE.
     2. API Innova Challenge – CUSTOMER_ZIPCODES.
  57. Recommender systems :: user language
     Use Twitter data to:
     1. Get what people are talking about in city areas.
     2. Analyze user language on Twitter.
     3. Compare user language with area language and recommend to the user the most similar areas.
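The slides do not say how user language and area language are compared. One simple option, used here purely as an assumption, is cosine similarity between bag-of-words counts. The function names and the sample texts are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(texts):
    """Count lower-cased, whitespace-separated words over a list of texts."""
    words = Counter()
    for text in texts:
        words.update(text.lower().split())
    return words


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def rank_areas(user_texts, area_texts):
    """Return area identifiers ordered from most to least similar to the user."""
    user = bag_of_words(user_texts)
    scores = {area: cosine(user, bag_of_words(texts))
              for area, texts in area_texts.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up tweets per postal code and for one user.
area_texts = {
    "28013": ["tapas and beer in sol", "shopping in sol"],
    "28009": ["running in the park", "park picnic"],
}
user_texts = ["love running", "morning run in the park"]
ranking = rank_areas(user_texts, area_texts)
print(ranking)
```

A real pipeline would at least remove stop words such as "in" and "the", which otherwise inflate the similarity between unrelated texts.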
  58. Recommender systems :: user language. Postal code 28013: Madrid city center.
  59. Recommender systems :: user language. Postal code 28009: Retiro.
  60. Recommender systems :: user demographic profile. Use the CARDS_CUBE service from the BBVA API.
  61. Recommender systems :: user demographic profile
     • Use CARDS_CUBE service data.
     • For each merchant category Z (bars, fashion, health, etc.), build a matrix in which each entry is the number of different credit cards for a given profile X (gender, age) that were used for shopping in postal code Y at a merchant of category Z.
     Where do people like me go shopping?
     Which restaurants are visited by people similar to me?
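The profile-by-postal-code matrix from slide 61 can be sketched as a nested dictionary. The input is assumed to be already aggregated into `(gender, age_band, zipcode, category, num_cards)` records. This shape is hypothetical and does not reflect the actual CARDS_CUBE response format, and the figures are made up.

```python
from collections import defaultdict


def build_profile_matrix(records, category):
    """For one merchant category, return matrix[profile][zipcode] = cards."""
    matrix = defaultdict(dict)
    for gender, age_band, zipcode, cat, num_cards in records:
        if cat != category:
            continue
        profile = (gender, age_band)
        matrix[profile][zipcode] = matrix[profile].get(zipcode, 0) + num_cards
    return dict(matrix)


def top_zipcodes(matrix, profile, k=3):
    """Postal codes where people with this profile shop most, best first."""
    row = matrix.get(profile, {})
    return sorted(row, key=row.get, reverse=True)[:k]


# Made-up aggregated records, for illustration only.
records = [
    ("M", "36-45", "28001", "fashion", 90),
    ("M", "36-45", "28013", "fashion", 150),
    ("M", "36-45", "28013", "bars", 300),
    ("F", "26-35", "28001", "fashion", 200),
    ("M", "36-45", "28004", "fashion", 60),
]
fashion = build_profile_matrix(records, "fashion")
print(top_zipcodes(fashion, ("M", "36-45"), k=2))
```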
  62. Recommender systems :: user demographic profile. Example: male, age 36-45. Fashion; Bars and restaurants.
  63. Recommender systems :: user geographic profile. Use the CUSTOMER_ZIPCODES service in the BBVA API.
  64. Recommender systems :: user geographic profile
     • Use data from the CUSTOMER_ZIPCODES service.
     • For each merchant category Z (bars, fashion, health, etc.), we build a matrix in which each entry is the number of different credit cards from a postal code X that go shopping in postal code Y in merchant category Z.
     Where do people in my district go shopping?
     What restaurants are visited by people living in my district?
  65. Recommender systems :: user geographic profile. Example: postal code 28045. Fashion; Bars and restaurants.
  66. Recommender systems :: combination. Geographical and demographic recommendation system.
  67. Recommender systems :: combination. Example: male, age 36-45, living in postal code 28045. Fashion; Bars and restaurants.
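The slides do not specify how the demographic and geographic recommendations are combined. One possible rule, shown here only as an assumption, is to turn each matrix row into shares and multiply them, so that a postal code scores high only when both "people like me" and "people from my postal code" shop there. The function names and figures are hypothetical.

```python
def normalise(row):
    """Convert card counts per postal code into shares that sum to 1."""
    total = sum(row.values())
    return {k: v / total for k, v in row.items()} if total else {}


def combine(demographic_row, geographic_row):
    """Rank destination postal codes by the product of both shares."""
    d = normalise(demographic_row)
    g = normalise(geographic_row)
    scores = {z: d.get(z, 0.0) * g.get(z, 0.0) for z in d.keys() | g.keys()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up rows for one merchant category:
# cards of males aged 36-45 per destination postal code ...
demographic_row = {"28013": 60, "28001": 30, "28009": 10}
# ... and cards of residents of 28045 per destination postal code.
geographic_row = {"28045": 50, "28013": 30, "28001": 20}

ranking = combine(demographic_row, geographic_row)
print(ranking[:2])
```

A product is strict: any postal code missing from one of the two rows scores zero. A weighted average of the two shares would be a softer alternative.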
  68. From the data to the app
  69. From data to the app
     1. The idea.
     2. What data do I need to carry out this idea? Which services of the Challenge API do I need? Can I improve it with other information sources?
     3. Analysis: distilling the idea and assessing its viability. Extracting the hidden value of analytics and models.
     4. How can the user take advantage of this idea?
     5. Iterate 2, 3 and 4 until the idea and the benefit to the user become clear.
     6. Convert the value of the analysis into an application.
  70. Esteban Moro (esteban.moro@iic.uam.es, @estebanmoro), Alejandro Llorente (alejandro.llorente@iic.uam.es, @llorentealex). www.iic.uam.es