Big Social Machines: Architecture and Challenges

Srinath Srinivasa
Web Science Lab
IIIT Bangalore
http://cds.iiitb.ac.in/wsl/

More data beats better algorithms...
But...

Without good models, we get insights like these..

Social Machines
Represents a class of environments comprising interplay between humans and technology
Outputs of social machines are a result of both human and algorithmic decisions
Humans as “participants” rather than users
Humans do not just use social machines for problem solving; they are also used as elements of problem solving
“The Web is an engine to create abstract social machines” – Tim Berners-Lee, Weaving the Web
About Social Machines https://youtu.be/8Iz7ZqSOJGU

Social Machines
Social Machines and Interaction Machines (Open-world computing) [Wegner 97]
[Figure: three machine models compared.
(a) Turing Machine: a closed-world computation from Input to Output.
(b) Single-stream Interaction Machine: hidden-variable.
(c) Multi-stream Interaction Machine: hidden-adversary.]

Social Machines
Social Machines are “open-world” systems – the environment is changing even as it is computing responses
They are “multi-stream” interaction machines – inputs on one interaction channel may become part of the response on some other interaction channel
Characteristic building blocks of a multi-stream interaction machine [Srinivasa 2001]:
– Computation
– Persistence of state (across computations)
– Channel Sensitivity (of responses)
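
The three building blocks above can be illustrated with a toy sketch. This is only an illustration under assumptions, not the formal model of [Srinivasa 2001]: the class name `MultiStreamMachine`, the method `interact`, and the channel names are hypothetical.

```python
class MultiStreamMachine:
    """Toy multi-stream interaction machine (hypothetical names throughout)."""

    def __init__(self):
        # Persistence of state: the log survives across computations.
        self.log = []

    def interact(self, channel, message):
        # Computation: collect what has arrived on every *other* channel.
        others = [m for c, m in self.log if c != channel]
        self.log.append((channel, message))
        # Channel sensitivity: the response depends on which channel asked.
        return {"ack": message, "from_other_channels": others}


if __name__ == "__main__":
    machine = MultiStreamMachine()
    machine.interact("alice", "road closed at Hebbal")
    # Input from alice's channel becomes part of the response on bob's channel.
    print(machine.interact("bob", "any news?"))
```

The same message sent on two different channels yields two different responses, which a closed-world function from input to output cannot express.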

Big Social Machines
Technical Challenges
Massive number of users
Wide geographical distribution
Significant amounts of disconnected, mobile and ad hoc operations
Human Issues
Privacy and identity management nightmare
Privileges and access control challenges
Moral dilemmas concerning use of humans as part of the machine

Technical Challenges
Data Management Challenges
The Vs of Big Data: Volume, Variety, Velocity, Veracity
Continuous models of Consistency, Semantics extraction, and
Transactions
Process Management Challenges
Divergent aggregation (no one problem being solved at any time)
LRC (long-running continuous) models of information logistics

Architectural Elements
Content Aggregation and Distribution
Network (CADN)
The interactive nature of social machines requires evolution of present-day CDNs into CADNs
Efficient management of human interactions comprising pull and push elements
Continuous optimization in a multi-user environment

Architectural Elements
CADN Example Based on Geo Hashing
Smart Traffic Social Machine
[Figure: geohashed city areas, each served by a CADN node. Labels: HSR area (Notifier), Hebbal area (Aggregator), BTM area (Notifier).]
Icon made by Freepik from www.flaticon.com is licensed under CC BY 3.0
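
A minimal sketch of how geohashing could partition traffic reports among area nodes. It uses the standard public geohash encoding (interleaved longitude/latitude bits, base-32 alphabet); the function `route_reports`, the report format, and the coordinates used for HSR and Hebbal are assumptions for illustration and are only approximate.

```python
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    bits = []
    use_lon = True  # bits alternate, starting with longitude
    while len(bits) < precision * 5:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits.append(1)
                lon_lo = mid
            else:
                bits.append(0)
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits.append(1)
                lat_lo = mid
            else:
                bits.append(0)
                lat_hi = mid
        use_lon = not use_lon
    chars = []
    for i in range(0, len(bits), 5):
        index = 0
        for bit in bits[i:i + 5]:
            index = (index << 1) | bit
        chars.append(_BASE32[index])
    return "".join(chars)


def route_reports(reports, precision=5):
    """Group (lat, lon, text) reports by geohash cell (hypothetical helper)."""
    cells = defaultdict(list)
    for lat, lon, text in reports:
        cells[geohash(lat, lon, precision)].append(text)
    return dict(cells)


if __name__ == "__main__":
    reports = [
        (12.9116, 77.6389, "slow traffic"),   # approx. HSR area (assumed)
        (12.9116, 77.6389, "signal failure"),
        (13.0358, 77.5970, "accident"),       # approx. Hebbal area (assumed)
    ]
    for cell, texts in route_reports(reports).items():
        print(cell, texts)
```

Each cell key could then be mapped to the aggregator or notifier responsible for that area; a shorter precision gives larger cells.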

Information Logistics
Getting the right information to the right
place at the right time to the right recipient
Challenges:
Uncertainty in information requirements
Churn in information location, relevance and
user requirements

Information Logistics
Strategic infrastructure: Distributed lookup tables [Patil 2010]
Optimality criteria
– Efficiency of lookup
– Robustness against random failures, churn, and targeted attacks
– Cost: latency, bookkeeping cost, infrastructure cost
[Figure: optimal topology classes under different constraints: bookkeeping, lookup efficiency, infra cost]
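
The slides do not say which topology or scheme the lookup table uses. As one familiar example of a distributed lookup structure that tolerates churn, the sketch below uses consistent hashing; this is a swapped-in technique chosen for illustration, not the design studied in [Patil 2010]. The class `HashRing` and its parameters are hypothetical.

```python
import bisect
import hashlib


def _position(key):
    """Map a string to a point on the ring."""
    return int(hashlib.sha1(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hashing lookup table (illustrative sketch)."""

    def __init__(self, nodes=(), replicas=64):
        self.replicas = replicas  # virtual points per node
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> node name
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.replicas):
            point = _position(f"{node}#{i}")
            if point not in self._owner:
                bisect.insort(self._points, point)
                self._owner[point] = node

    def remove(self, node):
        # Churn: drop only this node's points; other points stay in place.
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        if not self._points:
            raise LookupError("ring is empty")
        # The first point clockwise from the key owns it (wrapping around).
        i = bisect.bisect_right(self._points, _position(key)) % len(self._points)
        return self._owner[self._points[i]]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    print(ring.lookup("user42"))
    ring.remove("node-b")
    print(ring.lookup("user42"))
```

Lookup costs one binary search, and when a node leaves only the keys it owned are reassigned, which is one way to trade bookkeeping cost against robustness to churn.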

Social Machines
Business logic for Big Social Machines
Challenges:
Database semantics for long-running distributed processes
Scalable security and access control
Consistency issues

Business Logic Challenges
Data and Consistency Challenges
Conventional relational databases may prove insufficient for the data challenges of Big Social Machines
NoSQL: Useful for workloads with a highly skewed read/write ratio and/or queries that would require a large number of joins (graph queries)
Difficult to enforce ACID semantics on distributed NoSQL data stores
CAP theorem and the Single System Image (SSI)

Semantics Layer
The “brain” behind the social machine
Continuously extracts semantics from operational
details and feeds back configuration and control
options to the lower layers
Need for an underlying data structure to represent
operational knowledge for extracting semantics

Semantics Layer
Document vector model not very attractive:
Curse of dimensionality
Sparse vector space
Spurious features increasing dimensionality
Costly operations involving dimensionality
reduction
Difficult to obtain precise semantic associations by
dimensionality reduction
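
A small sketch of why the document vector space becomes sparse: every distinct word adds a dimension, yet each document touches only a few of them. The helper names and the three sample messages are made up for illustration.

```python
def document_vectors(docs):
    """Build bag-of-words count vectors over a shared vocabulary."""
    vocab = sorted({word for doc in docs for word in doc.split()})
    index = {word: i for i, word in enumerate(vocab)}
    vectors = []
    for doc in docs:
        vector = [0] * len(vocab)
        for word in doc.split():
            vector[index[word]] += 1
        vectors.append(vector)
    return vocab, vectors


def sparsity(vectors):
    """Fraction of vector entries that are zero."""
    cells = sum(len(v) for v in vectors)
    zeros = sum(1 for v in vectors for x in v if x == 0)
    return zeros / cells


if __name__ == "__main__":
    docs = [
        "traffic jam near hebbal",
        "heavy rain in btm",
        "signal failure near hsr",
    ]
    vocab, vectors = document_vectors(docs)
    print(len(vocab), "dimensions, sparsity:", round(sparsity(vectors), 2))
```

Even with three short messages most entries are zero, and the dimension count grows with every new word, including spurious ones.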

Semantics Layer
Proposal for a new model: Co-occurrence graph [Rachakonda et al. 2014]
Founded on Hebbian theory and cognitive models of episodic and semantic memory
Co-occurrence represents the starting point for mining semantics
Reasoning across co-occurrences facilitated by different algorithms for mining different kinds of semantics
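
A minimal sketch of a co-occurrence graph, assuming each "episode" is simply a list of terms observed together (for example, the noun phrases of one message). Edge weights grow each time two terms co-occur, loosely echoing the Hebbian idea; the function name and the data layout are assumptions, not the representation used in [Rachakonda et al. 2014].

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(episodes):
    """Return {term: {neighbour: count}} built from lists of co-occurring terms."""
    graph = defaultdict(lambda: defaultdict(int))
    for episode in episodes:
        # Count each unordered pair once per episode.
        for a, b in combinations(sorted(set(episode)), 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return {term: dict(neighbours) for term, neighbours in graph.items()}


if __name__ == "__main__":
    episodes = [
        ["bat", "cricket", "wicket"],
        ["cricket", "umpire"],
        ["bat", "cave"],
        ["bat", "cricket"],
    ]
    graph = build_cooccurrence_graph(episodes)
    print(graph["bat"])
```

Unlike a document vector, the graph stores only pairs that were actually observed, so unrelated terms cost nothing.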

Semantics Layer
Business Logic Layer
POS tagging
Entity Resolution
Canonicalization

Semantics Layer
Episodic hypotheses
Algorithms running over the co-occurrence graph to extract specific semantic associations
Based on hypothesizing how episodic knowledge can
be generalized into semantic knowledge
Example “topical anchor” hypothesis:
If a conversation/process is about topic t, then the longer
the conversation/process is observed, the greater the
probability of encountering t.
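
One way to make the topical anchor hypothesis concrete is a random walk with restart over a co-occurrence graph: start from the observed terms, keep walking, and see where probability mass accumulates. The sketch below computes the walk's distribution by power iteration instead of sampling. It is an illustration under assumptions, not the published algorithm; the function names, the restart probability, and the toy graph are hypothetical, and every node is assumed to have at least one neighbour.

```python
def make_graph(edges):
    """Build a symmetric weighted graph from (a, b, weight) triples."""
    graph = {}
    for a, b, w in edges:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def topical_anchor(graph, terms, restart=0.15, iterations=50):
    """Return the non-query node with the most walk probability, plus all scores."""
    start = {n: (1.0 / len(terms) if n in terms else 0.0) for n in graph}
    prob = dict(start)
    for _ in range(iterations):
        # With probability `restart`, jump back to one of the observed terms.
        nxt = {n: restart * start[n] for n in graph}
        for node, mass in prob.items():
            total = sum(graph[node].values())
            for neighbour, weight in graph[node].items():
                # Otherwise follow an edge in proportion to its weight.
                nxt[neighbour] += (1.0 - restart) * mass * weight / total
        prob = nxt
    candidates = {n: p for n, p in prob.items() if n not in terms}
    return max(candidates, key=candidates.get), prob


if __name__ == "__main__":
    graph = make_graph([
        ("bat", "cricket", 5), ("wicket", "cricket", 6),
        ("umpire", "cricket", 4), ("bat", "cave", 1),
        ("umpire", "tennis", 2), ("cricket", "sport", 3),
        ("tennis", "sport", 3),
    ])
    anchor, _ = topical_anchor(graph, {"bat", "wicket", "umpire"})
    print(anchor)
```

In the toy graph the walk keeps returning to the node that all three observed terms co-occur with most strongly, which is the behaviour the hypothesis describes.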

Semantics Layer
Topical Anchors: Given a list of noun phrases, identify a semantic topic for these terms.
Powered by a Wikipedia co-occurrence graph hosted by Agama (graphdb developed at WSL)
Web APIs enable use of Topical Anchors in third party applications

Semantics Layer
Topic Expansion: Given a term, expands it into semantically relevant topical clusters with different senses.
Uses co-occurrence datasets from Wikipedia 2006 or 2011.
Web APIs enable use by third party applications

Semantics Layer
Other algorithms on co-occurrence graphs developed at WSL:
[Rachakonda et al. 2014, Kulkarni et al. 2014a, Kulkarni et al. 2014b]
● Topical markers
● Semantic siblings
● Deep matching
● Narrative modeling (work in progress)

Semantics Layer
Some algorithmic techniques for mining semantics from co-occurrence graphs:
Random walks
MCMC
Graph clustering
Centrality and PageRank based models
HITS
Gibbs Sampling
Stochastic graphical models (Markov random fields, Bayesian networks)
Spectral analysis of graph neighborhoods

Conclusions
Proposal:
Abstract architecture for social machines
Challenges:
Integration of systems, data and semantic layers
Continuous, diffusive computation and systemic
optimization
Continuous semantics extraction and semantic
interventions

References
● SOCIAM http://www.sociam.org/
● Shadbolt, Nigel R.; Daniel A. Smith; Elena Simperl; Max Van Kleek; Yang Yang; Wendy Hall (2013). "Towards a Classification Framework for Social Machines". WWW 2013 Companion.
● Berners-Lee, Tim; J. Hendler (2009). "From the Semantic Web to social machines: A research challenge for AI on the World Wide Web". Artificial Intelligence. doi:10.1016/j.artint.2009.11.010.
● Peter Wegner. 1997. Why interaction is more powerful than algorithms. Commun. ACM 40, 5 (May 1997), 80–91.
● Srinivasa, Srinath. "An algebra of fixpoints for characterizing interactive behavior of information systems." PhD diss.,
Universitätsbibliothek, Brandenburgische Technische Universitaet, Cottbus, 2001.
● Sanket Patil. Designing Optimal Network Topologies under Multiple Efficiency and Robustness Constraints. Proceedings of
the PhD Forum at the International Conference on Distributed Computing and Networking (ICDCN 2010), Kolkata, January
2010.
● Aditya Ramana Rachakonda, Srinath Srinivasa, Sumant Kulkarni, M S Srinivasan. A Generic Framework and Methodology for Extracting Semantics from Co-occurrences. Data & Knowledge Engineering, Elsevier, Volume 92, July 2014, Pages 39–59. DOI: 10.1016/j.datak.2014.06.002
● Sumant Kulkarni, Srinath Srinivasa. SortingHat: A Deep Matching Framework to Match Labeled Concepts. Demo Paper in
the 20th International Conference on Management of Data (COMAD 2014), Hyderabad, India, December 2014.
● Sumant Kulkarni, Srinath Srinivasa, Jyotiska Nath Khasnabish, Karthikay Nagal, Sandeep Kurdagi. SortingHat: A Framework for Deep Matching Between Classes of Entities. Proceedings of 10th International Workshop on Information Integration on the Web (IIWeb 2014) co-located with ICDE 2014, Chicago, Illinois, USA, March 2014.
