Exploring the Bi-verse
A trip across the digital and physical ecospheres
Marco Brambilla
marco.brambilla@polimi.it
@marcobrambi
Scope and Purpose
measure
analyse
build
value
Taming
Complexity
There are more things
In heaven and earth, Horatio,
Than are dreamt of in your philosophy.
Shakespeare (Hamlet Act 1, scene 5)
The Answer to the Great Question...
Of Life, the Universe and Everything
Data
Information
Knowledge
Wisdom
Context
independence
Understanding
Understanding relations
Understanding patterns
Understanding principles
The Digital “Heaven”
Vs.
The “physical” Earth
[photo: http://hoglundassociates.com/Images/Cloud_Gate.jpg]
The digital reflection
of our life is
sharpening
Twitter sentiment vs. Gallup Poll of
Consumer Confidence
Brendan O'Connor, Ramnath Balasubramanyan, Bryan R. Routledge, and Noah A. Smith. 2010. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. In ICWSM-2010
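A minimal sketch of how a text-sentiment series can be compared with a poll series, assuming a daily ratio of positive to negative messages smoothed with a trailing moving average. All counts and poll values below are hypothetical, and the helper names are illustrative only; this is not the cited authors' code.

```python
from statistics import mean

def moving_average(values, window):
    """Trailing moving average; early points average over what is available."""
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        smoothed.append(mean(values[start:i + 1]))
    return smoothed

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical daily counts of positive / negative messages on one topic.
positive = [120, 130, 90, 80, 150, 160, 170, 100]
negative = [100, 100, 110, 120, 100, 90, 85, 115]
# Hypothetical poll index for the same days.
poll = [50, 52, 47, 44, 51, 55, 58, 49]

ratio = [p / n for p, n in zip(positive, negative)]
smoothed = moving_average(ratio, window=3)
r = pearson(smoothed, poll)
print(f"correlation between smoothed sentiment and poll: {r:.2f}")
```

The window length is a free parameter: a longer window removes more noise but also delays the sentiment signal relative to the poll.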
Stance in time on the Brexit process
Brambilla, Calisir. Analysis of On-line Debate on Long-Running Political Phenomena. The Brexit Case.
Stance drift on Twitter
after the Brexit referendum
Brambilla, Calisir. Analysis of On-line Debate on Long-Running Political Phenomena. The Brexit Case.
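One possible way to quantify stance drift, sketched under the assumption that each user has already been assigned a stance label in two time windows. The user IDs, labels, and function names are hypothetical and do not come from the cited work.

```python
from collections import Counter

# Hypothetical per-user stance labels in two time windows.
before = {"u1": "remain", "u2": "leave", "u3": "leave", "u4": "remain"}
after = {"u1": "remain", "u2": "remain", "u3": "leave", "u5": "leave"}

def stance_transitions(before, after):
    """Count (old, new) stance pairs for users observed in both windows."""
    shared = before.keys() & after.keys()
    return Counter((before[u], after[u]) for u in shared)

def drift_rate(transitions):
    """Fraction of shared users whose stance changed between windows."""
    total = sum(transitions.values())
    changed = sum(n for (old, new), n in transitions.items() if old != new)
    return changed / total if total else 0.0

transitions = stance_transitions(before, after)
rate = drift_rate(transitions)
print(transitions, rate)
```

Only users present in both windows are counted, so the rate says nothing about users who joined or left the debate in between.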
Vaccination vs. Disinformation
http://genomic.elet.polimi.it/vaccinitaly/
https://vaccineu.herokuapp.com/
https://periscopeproject.eu/
TOOL 1. MODELS. Abstraction and decisions
Model Reality
Purposes:
• descriptive
• predictive
• diagnostic
• prescriptive
Cities
A Model
TOOL 2. PERSPECTIVES. The MacroScope
Joël de Rosnay, The Macroscope, 1979
Sentiment, anyone?
• Which locations do people visit from where?
Provenance and visits
Semantic Analysis
• Entity Linking
• Semantics and relations
• Users vs. topics
Travel Lovers
Art Lovers
Internet & Tech Lovers
Users’ Biography Word Clouds
User segmentation and profiling
Demographics
Age
Sex: 50.4% female, 49.6% male
Race
Bias of the medium?
Struggling against
the Obvious
New knowledge is hard
Only high frequency emerges
The long tail challenge
the streetlamp effect
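The long tail problem can be illustrated with a small sketch: a top-k frequency cut covers most mentions while hiding most distinct items. The place names and counts are hypothetical, and `head_tail_split` is an illustrative helper.

```python
from collections import Counter

def head_tail_split(items, top_k):
    """Summarise how much a top-k cut keeps and how much it hides."""
    counts = Counter(items)
    head = counts.most_common(top_k)
    head_mass = sum(n for _, n in head)
    total = sum(counts.values())
    return {
        "head_items": [item for item, _ in head],
        "head_share_of_mentions": head_mass / total,
        "tail_distinct_items": len(counts) - len(head),
    }

# Hypothetical mentions: two very popular places and five rare ones.
mentions = (
    ["duomo"] * 50
    + ["navigli"] * 30
    + ["bar_a", "gallery_b", "park_c", "shop_d", "club_e"]
)
summary = head_tail_split(mentions, top_k=2)
print(summary)
```

In this toy data the two head items account for most mentions, yet five of the seven distinct places sit in the tail, which is where non-obvious knowledge tends to be.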
Improving
Quality
Data Quality Issue
Gartner Report
33% of the largest global companies experienced an information crisis due
to their inability to adequately value, govern and trust their enterprise
information.
If you torture the data long enough,
it will confess to anything
– Darrell Huff
Participation in Public Events
[chart: participation over time, y-axis from 0 to 70, weekends marked on the x-axis]
• The places where the people in the city want to be seen.
(the dynamic of the hotels between Expo and the Furniture Fair)
Data vs. Question
• Are they aligned?
• The usual problem of representativeness of the sample…
• At a different scale
• With much less control
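A common way to check and partly correct for an unrepresentative sample is post-stratification: compare the sample's composition with a reference population and weight each group accordingly. The sketch below assumes hypothetical age groups, sample counts, and population shares.

```python
def poststratification_weights(sample_counts, population_shares):
    """Weight per group = population share / sample share."""
    total = sum(sample_counts.values())
    weights = {}
    for group, share in population_shares.items():
        sample_share = sample_counts.get(group, 0) / total
        # A group absent from the sample cannot be reweighted.
        weights[group] = share / sample_share if sample_share > 0 else None
    return weights

# Hypothetical social media sample, skewed towards younger users.
sample_counts = {"18-34": 600, "35-54": 300, "55+": 100}
# Hypothetical reference population shares (sum to 1).
population_shares = {"18-34": 0.25, "35-54": 0.35, "55+": 0.40}

weights = poststratification_weights(sample_counts, population_shares)
print(weights)
```

Weights far from 1 signal a poor match between data and question. Very large weights also make estimates unstable, because a few observations end up standing in for a large part of the population.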
Foursquare
Checkins
Copyright © Milano-Hub project @Politecnico di Milano
Flickr
Instagram
Adapting
Granularity
Time, Space, and multiple (Web) sources
Milano Fashion Week – Social Media and Events
Spatial and Temporal Data Integration
Space Granularity: the Grid
• Regular squared grid
• Irregular grid with official business-driven meaning
• Irregular grid with data-driven definition
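For the first option, a regular grid, assigning a geo-located point to a cell is simple arithmetic. The sketch below assumes a grid defined in degrees from a south-west origin; the origin, cell size, and sample coordinates are hypothetical.

```python
import math
from collections import Counter

def grid_cell(lat, lon, origin_lat, origin_lon, cell_deg):
    """Return the (row, col) of the regular grid cell containing a point."""
    row = math.floor((lat - origin_lat) / cell_deg)
    col = math.floor((lon - origin_lon) / cell_deg)
    return row, col

# Hypothetical south-west origin and cell size, both in degrees.
ORIGIN_LAT, ORIGIN_LON, CELL_DEG = 45.40, 9.10, 0.01

# Hypothetical geo-located posts as (latitude, longitude).
points = [(45.4642, 9.1903), (45.4649, 9.1905), (45.4781, 9.2270)]

counts = Counter(
    grid_cell(lat, lon, ORIGIN_LAT, ORIGIN_LON, CELL_DEG) for lat, lon in points
)
print(counts)
```

Cells defined in degrees are not square in metres, since the east-west extent shrinks with latitude. The two irregular grid options need a point-in-polygon lookup against the area boundaries instead, which is not shown here.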
12/4
Cities into cities
http://urbanscope.polimi.it
Multi-granularity and Multi-source
City scale: mobile phone data and (coarse-grained geo-located) social media data
Street/square: people-counting and profiling IoT sensors
Point of Interest: people-counting sensors, WiFi log analysis, beacons and (fine-grained geo-located) social media
Measuring …
… and Communicating
IoT Use Cases
Source: GSMA Intelligence - IoT: the next wave of connectivity and services
Assessing
The Consumer
Perspective
HOME
AUTOMATION
BTICINO – LeGrand Home+Control
PHILIPS HUE
Alexa / Google Assistant / Siri
Arlo / Eufy / Nest / Ring …
Actuators
How many apps?
• Data integration
• Compatibility
• Programming and planning
• Inconsistencies - duplications
Granting
Access and Value
“The Data Divide”
The separation between those who have (or can access) data
and those who don't.
Specific: target a specific area for improvement
Measurable: quantify or at least suggest an indicator of progress
Assignable: specify who will do it
Realistic: state what results can realistically be achieved, given available resources
Time-related: specify when the result(s) can be achieved
S.M.A.R.T. Goals!
Data-driven decisions
Action
Data
Descriptive
What happened?
Diagnostic
Why did it happen?
Predictive
What will happen?
Decision
Prescriptive
What should I do?
Decision Support
Decision Automation
Analytics Human-Centered & Machine-Centered
[Gartner]
A new role: the Business Translator
• Understands the business needs
• Is able to discuss with the stakeholders
• Is able to negotiate with the technical roles
TRANSFORMS BUSINESS REQUIREMENTS
INTO ACTIONABLE DATA STRATEGIES
GOAL: SMART Objective!
USER
• Who is the user?
USE CASE
• How is the data product used?
QUESTIONS
• Which questions does the user ask to reach the Goal?
• Which questions does the user ask during that Use Case?
BUSINESS METRICS
• Which dimensions cover this question?
• Are there any targets for that metric?
• What time frame should be considered for that metric?
• Are there any relationships between the metrics?
TECHNICAL & QUALITY METRICS
• Are there quality metrics to achieve in the models?
REACTION
• What kind of reasoning is stimulated by reading the metric?
ACTION
• What actions derive from the reasoning?
TIME
• When? How often?
The Goal Poster
Thanks!
Any Questions?
References
• Marco Di Giovanni, Francesco Pierri, Christopher Torres-Lugo, Marco Brambilla:
VaccinEU: COVID-19 Vaccine Conversations on Twitter in French, German and Italian. ICWSM 2022: 1236-1244
• Alireza Javadian Sabet, Marco Brambilla, Marjan Hosseini: A multi-perspective approach for analyzing long-running live events on social
media. A case study on the "Big Four" international fashion weeks. Online Soc. Networks Media 24: 100140 (2021)
• Marco Di Giovanni, Marco Brambilla:
Content-based Stance Classification of Tweets about the 2020 Italian Constitutional Referendum. SocialNLP@NAACL 2021: 14-23
• Marco Brambilla, Alireza Javadian Sabet, Amin Endah Sulistiawati:
Conversation Graphs in Online Social Media. ICWE 2021: 97-112
• Giorgia Ramponi, Marco Brambilla, Stefano Ceri, Florian Daniel, Marco Di Giovanni:
Content-based characterization of online social communities. Inf. Process. Manag. 57(6): 102133 (2020)
• Emre Calisir, Marco Brambilla:
The Long-Running Debate about Brexit on Social Media. ICWSM 2020: 848-852
• Marco Balduini, Marco Brambilla, Emanuele Della Valle, Christian Marazzi, Tahereh Arabghalizi, Behnam Rahdari, Michele Vescovi:
Models and Practices in Urban Data Science at Scale. Big Data Res. 17: 66-84 (2019)
• …
Marco Brambilla, @marcobrambi, marco.brambilla@polimi.it
http://datascience.deib.polimi.it
http://www.marco-brambilla.com
