1. 1
Intelligent Infrastructure using Human-in-Loop Cyber-physical Systems
- Social Network as a Soft Sensor
Dr. Arpan Pal
Head of Research
Innovation Lab, Kolkata
Tata Consultancy Services Ltd.
2. Outline
Intelligent Infrastructure and Human-centric Cyber-physical Systems
Architecture
Social Media as a Soft Sensor
Application Use Cases
– Public Safety
– Transportation
– Healthcare
Technology for Social Media Soft Sensing
– Access Technologies
– Natural Language Processing and Emotion Mining
RIPSAC – a generic platform for Internet-of-Things
Innovation @TCS
8. 8
Intelligent Infrastructure Architecture
Smart Domain Services: Transportation, Healthcare, Electricity, Water, Public Safety, Tourism, Community, etc.
Smart Integration Platform (Integrated Services), drawing on:
– People Feedback & Emotions (Social Media)
– Sensors & IoT Platform
– Traditional Monitoring & Control Systems
– Citizen Data
Sense: People Activity, Appliances, Vehicles, Road, Home/Bldg, Utility Infrastructure
9. 9
Social Media as a soft sensor?
Incoming data comes not from any physical
measurement but from what people express
on social media
People talk either about themselves or about
what is happening in their surroundings
Both carry rich information content
Posts can be converted into meaningful soft sensor
observations through text processing and
Natural Language Processing (NLP)
10. 10
Source of Data
Text data that gives rise to sensor readings
– Mail chains
– Public discussion forums
– Crowd-sourced wikis
– Blogs / Microblogs / Social network status updates and comments
12. 12
Public Safety
Continuously collect tweets based on location and
keywords
Create a keyword list for each calamity using
synonyms and related terms of the calamity word
(e.g. burglary, theft, arson for burglary)
Detect abnormal activities related to public
safety, such as burglary, fire, gunshots, or earthquakes,
from geo-tagged tweets
Cluster the classified tweets by geolocation to
gauge the intensity and extent of a
calamity, such as an earthquake, in a region
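The keyword-matching and geo-clustering steps above can be sketched roughly as follows. This is a minimal illustration, not the method used in the talk: the keyword lists, the sample tweets, the 5 km radius, and the greedy clustering rule are all assumptions chosen for brevity.

```python
import math

# Hypothetical keyword lists, one per calamity (the burglary list follows the slide's example).
KEYWORDS = {
    "burglary": {"burglary", "theft", "arson"},
    "earthquake": {"earthquake", "quake", "tremor"},
}

def classify(text):
    """Return the first calamity whose keyword list matches the tweet, else None."""
    words = {w.strip(".,!?#").lower() for w in text.split()}
    for calamity, keys in KEYWORDS.items():
        if words & keys:
            return calamity
    return None

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def cluster(points, radius_km=5.0):
    """Greedy clustering: a point joins the first cluster whose seed point is within radius_km."""
    clusters = []
    for p in points:
        for c in clusters:
            if haversine_km(c[0], p) <= radius_km:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters

# Made-up geo-tagged tweets: (text, (lat, lon)).
tweets = [
    ("Strong earthquake just now!", (22.57, 88.36)),
    ("Felt a tremor in the office", (22.58, 88.37)),
    ("Quake shook my house", (22.56, 88.35)),
    ("Theft reported near the market", (22.60, 88.40)),
    ("Earthquake felt here too", (28.61, 77.21)),
]

quake_points = [loc for text, loc in tweets if classify(text) == "earthquake"]
quake_clusters = cluster(quake_points)
for c in quake_clusters:
    print(len(c), "reports near", c[0])
```

The number of reports in a cluster is a crude proxy for intensity, and the number of distinct clusters a crude proxy for how widespread the event is.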
13. 13
Transportation
Run sentiment analysis on public tweets during a
sporting/entertainment event to predict
the time and route of departure of the
large crowd (e.g. for traffic and crowd control around
a sports stadium)
Find occurrences of traffic jams from geo-tagged
tweets and find the best route
from crowd-sourced data
Create a pothole map of the city
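One simple way to build such a pothole map is to bin geo-tagged reports into grid cells and keep the cells with repeated mentions. The sketch below assumes a 0.01 degree grid, a plain substring match on "pothole", and made-up sample tweets; none of these choices come from the talk.

```python
from collections import Counter

def grid_cell(lat, lon, cell_deg=0.01):
    """Snap a coordinate to an integer grid index (0.01 degree is roughly 1.1 km of latitude)."""
    return (round(lat / cell_deg), round(lon / cell_deg))

def pothole_map(tweets, min_reports=2):
    """Count pothole mentions per grid cell and keep cells with at least min_reports."""
    counts = Counter(
        grid_cell(lat, lon)
        for text, lat, lon in tweets
        if "pothole" in text.lower()
    )
    return {cell: n for cell, n in counts.items() if n >= min_reports}

# Made-up geo-tagged tweets: (text, lat, lon).
tweets = [
    ("Huge pothole on this road", 22.571, 88.362),
    ("Another pothole, same spot", 22.572, 88.361),
    ("Pothole near the bridge", 22.650, 88.400),
    ("Traffic is fine today", 22.571, 88.362),
]

hotspots = pothole_map(tweets)
print(hotspots)
```

Requiring at least two reports per cell is a basic guard against one-off or mistaken posts; the threshold would need tuning on real data.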
14. 14
Healthcare
Find the set of people who can form a self-help
group amongst themselves for support and advice
on a given disease.
– Use hidden community detection by applying NLP to
posts to create the social graph and identify the
undeclared community
Monitor health by inference from social media such
as blogs, micro-blogs, posts, and comments, with possible
extension to detecting the onset of diseases such as dementia or
Alzheimer's disease from social network posts
– Search for patterns in posts to detect possible
symptoms of diseases
– E.g. sentiment analysis indicates whether a
given post’s emotion is positive or negative. If the
emotions cycle between positive and negative
extremes with some periodicity, the person
may have bipolar disorder
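The periodicity check in the last example could be approximated with the autocorrelation of a per-day sentiment series. The sketch below is only an illustration of that signal-processing step and is in no way a diagnostic tool; the synthetic 14-day sine series, the 0.5 threshold, and the function names are assumptions.

```python
import math

def autocorrelation(series, lag):
    """Sample autocorrelation of the series at the given lag (0.0 for a constant series)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    if var == 0:
        return 0.0
    cov = sum((series[i] - mean) * (series[i + lag] - mean) for i in range(n - lag))
    return cov / var

def dominant_period(series, min_lag=2, max_lag=None, threshold=0.5):
    """Return the lag with the highest autocorrelation above threshold, or None."""
    max_lag = max_lag or len(series) // 2
    best_lag, best_r = None, threshold
    for lag in range(min_lag, max_lag + 1):
        r = autocorrelation(series, lag)
        if r > best_r:
            best_lag, best_r = lag, r
    return best_lag

# Synthetic daily sentiment scores that swing between extremes every 14 days.
daily_sentiment = [math.sin(2 * math.pi * day / 14) for day in range(56)]
period = dominant_period(daily_sentiment)
print("dominant period (days):", period)
```

A real series would be far noisier and unevenly sampled, so any detected period should at most prompt a closer look by a qualified person.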
16. 16
Access Technologies
Source: Moo Nam Ko et al., IEEE Computer Magazine, Aug 2010
http://www.profsandhu.com/journals/computer/computer1008.pdf
17. 17
Access Technologies (Contd…)
Authentication and Authorization
– OAuth
o Twitter and Facebook currently use OAuth version 2.0
– OpenID
o Used to create a user name and password to be used
across different sources of social network posts
Streams
– Open Stream used by Facebook
– OpenSocial used by Google and MySpace
Application APIs
– REST API
– Streaming API
o Twitter Firehose – real time streaming of all tweets
o Twitter Gardenhose access level
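As a rough sketch of how these pieces fit together, the code below attaches an OAuth 2.0 bearer token to a request and parses a newline-delimited JSON stream. The endpoint URL, the token, and the sample stream are hypothetical placeholders, not any real provider's API; the request is only constructed, never sent.

```python
import json
import urllib.request

# Hypothetical endpoint and token, used only to show the shape of the request.
STREAM_URL = "https://api.example.com/stream?track=earthquake"

def build_request(url, access_token):
    """Attach an OAuth 2.0 bearer token to a request object without sending it."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"}
    )

def iter_posts(lines):
    """Parse a newline-delimited JSON stream, skipping blank keep-alive lines."""
    for raw in lines:
        line = raw.strip()
        if line:
            yield json.loads(line)

req = build_request(STREAM_URL, "EXAMPLE_TOKEN")

# Stand-in for the lines a streaming connection would deliver.
sample_stream = [b'{"text": "quake!"}\n', b"\n", b'{"text": "all fine"}\n']
posts = list(iter_posts(sample_stream))
print([p["text"] for p in posts])
```

Against a real streaming service, the response object returned by `urllib.request.urlopen(req)` can be iterated line by line and passed to `iter_posts` in place of `sample_stream`.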
18. 18
Natural Language Processing
Language identification and translation
Spelling correction
Segmentation of different Parts of Speech
– Named entity recognition – to identify proper
nouns
Classification – To classify text into categories
– Dynamic language model classifier / Naive Bayes
classifier
– Requires manual annotation for training
Word sense disambiguation
– Dictionary based
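The classification step can be illustrated with a small multinomial Naive Bayes classifier with add-one smoothing, trained on manually labelled posts. This is a from-scratch sketch under stated assumptions: the class name, the whitespace tokenizer, and the four training sentences are invented for the example.

```python
import math
from collections import Counter, defaultdict

class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.doc_counts = Counter()              # label -> number of training texts
        self.vocab = set()

    def train(self, labelled_texts):
        for text, label in labelled_texts:
            words = text.lower().split()
            self.doc_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocab.update(words)

    def classify(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for w in text.lower().split():
                score += math.log((counts[w] + 1) / denom)  # smoothed log likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Manually annotated training posts (made up for illustration).
training = [
    ("fire in the building", "fire"),
    ("smoke and fire near station", "fire"),
    ("traffic jam on the bridge", "traffic"),
    ("heavy traffic near station", "traffic"),
]

clf = NaiveBayes()
clf.train(training)
print(clf.classify("fire and smoke"))
print(clf.classify("traffic jam"))
```

The need for labelled training pairs like those in `training` is the manual annotation cost noted above.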
19. 19
Emotion Mining
Classify text as expressing either ‘positive’ or ‘negative’
emotion
– Use AFINN to score how positive or negative the
emotion in each sentence is
Use a dictionary such as WordNet
– Use word sense disambiguation to map
each word in the sentences of a post to its
corresponding synset id
– Map each synset id to the type of emotion that it
represents
Observe the flow of emotions as a function of time
and users – track at individual and group level
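Lexicon-based scoring and per-user tracking over time might look like the sketch below. The word valences are illustrative AFINN-style integers and not the actual AFINN list; the users, dates, and posts are likewise made up.

```python
from collections import defaultdict

# Illustrative AFINN-style valences (AFINN rates words from -5 to +5);
# these entries are assumptions, not copied from the real list.
LEXICON = {"good": 3, "happy": 3, "love": 3, "bad": -3, "sad": -2, "terrible": -3}

def score(text):
    """Sum the valence of every lexicon word found in the text."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return sum(LEXICON.get(w, 0) for w in words)

def emotion_timeline(posts):
    """Map each user to a list of (day, total valence) pairs sorted by day."""
    totals = defaultdict(lambda: defaultdict(int))
    for user, day, text in posts:
        totals[user][day] += score(text)
    return {user: sorted(days.items()) for user, days in totals.items()}

# Made-up posts: (user, day, text).
posts = [
    ("alice", "2013-01-01", "I love this city, so happy!"),
    ("alice", "2013-01-02", "Terrible traffic, bad day."),
    ("bob", "2013-01-01", "Feeling sad today"),
]

timeline = emotion_timeline(posts)
for user, days in timeline.items():
    print(user, days)
```

Group-level tracking follows the same pattern with a group identifier in place of the user key.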
20. 20
TCS IoT Platform - RIPSAC
[Figure: RIPSAC Platform (Sensor Services, Analytics, Storage Services) linked via the Internet with Sensors, Sensor Providers, App Developers, PaaS Provider, and End User]
RIPSAC – Real-time Integrated Platform for Services & AnalytiCs
21. 21
RIPSAC Architecture - mapping to Social Media Analytics
[Figure: RIPSAC architecture]
Users (via Internet): End Users, Administrators
Back-end on Cloud:
– Device Integration & Management Services
– Analytics Services
– Application Services
– Storage
– Messaging & Event Distribution Services
– Presentation Services
– Application Support Services
– Middleware
Edge (via Internet): Edge Gateway, Sensors
Traditional Internet Service Delivery Platform & App Development Platform
Security/Privacy Framework
Lightweight M2M Protocols
Analytics-as-a-Service
Social Network Integration
SDKs and APIs for App developer
Access Technology Library
NLP and Emotion Mining Library
24. 24
Academic Co-Innovation Network (COIN)
Fostering joint research and innovation through a mutually beneficial alliance between TCS and academia
Academic context: thoughts and research towards disruptive innovation; knowledge exchange and people development
Industry-oriented business context: innovation scalability of academia; context of real-world problems
Collaborative research environment
Collaboration Mechanisms
• MoU based Alliances
• Sabbaticals – Academia to TCS Innovation Lab and TCS Innovation Lab to Academia
• TCS Research Scholar Program
• Masters and PhD Internships
Joint publications and IPRs
25. 25
Innovation Lab, Kolkata
Research Areas
• Sensor Signal Processing
• 2D/ 3D Image / Video Processing
• Protocols, Security and Privacy
• Parallel and Distributed Computing
• Stream Processing and Reasoning
• System Modeling and Identification
• Semantic Sensor Web
• Social Media Analytics
Academic Collaborations
• Singapore Management University (iCity Platform)
• Indian Statistical Institute (Protocol/Privacy, Image / Video Processing)
• IIT Kharagpur (Analytics, Personal Context Extraction)
• IIT Bombay (Energy and Utilities)
• Jadavpur University (Signal Processing)
Application Areas
• Wellness and Healthcare
• Energy and Utilities
• Transportation