Near real-time recommendations in enterprise social networks

- how to compute recommendations using a graph with 40m edges and 11m nodes in 0.2s (200ms)
- new perspective on near real-time social recommendations in enterprise social platforms using Linked Data
- recommender system that is easy to integrate with social networks and legacy data
- application of data analytics in enterprise context

Published in: Technology, Education

  1. 1. #AICRECSYS
  2. 2. ADVANsse Advances in social semantic enterprise HTTP://ADVANSSE.DERI.IE/ MACIEJ DABROWSKI BENJAMIN HEITMANN CONOR HAYES KEITH GRIFFIN 10TH JULY 2013
  3. 3. About me MACIEJ DABROWSKI maciej.dabrowski@deri.org lecturerAt co-PI contact co-PI worksWith researcherAt graduated name
  4. 4. Overview THIS TALK RESEARCH INDUSTRY 1. WHY? 2. WHAT? 3. HOW? 4. TECHNICAL DECISIONS 5. LESSONS LEARNED
  5. 5. Why? What? How? technical considerations lessons learned
  6. 6. Various information domains preferences recommendations implicit connections
  7. 7. User profile TRAVEL FOOD SPORTS POLITICS ??
  8. 8. Use Case: Enterprise Social Web
  9. 9. Enterprise social web ENTERPRISE INFORMATION SPACE MARKETING DEVELOPMENT R & D ANDREW BOB CECILIA DANNY
  10. 10. Limited information flow MARKETING DEVELOPMENT R & D "GREAT TOOL!" "MEETING IBM" "TALK BY DERI" ANDREW BOB CECILIA DANNY ENTERPRISE INFORMATION SPACE
  11. 11. Disconnected Social Networks ?   ANDREW BOB CECILIA DANNY MARKETING DEVELOPMENT R & D
  12. 12. Distributed Social Platforms ?   MARKETING DEVELOPMENT R & D
  13. 13. Problem 1: information overload and discovery
  14. 14. Problem 2: data level issues DISTRIBUTION MULTIPLE DOMAINS AND TYPES OF ENTITIES PEOPLE INTERESTS CONTENT
  15. 15. Requirements - personalization USE BACKGROUND KNOWLEDGE ALLOW CROSS-DOMAIN MULTI-SOURCE PERSONALIZATION EXPLOIT SOCIAL GRAPH ALLOW REAL-TIME APPLICATIONS
  16. 16. Requirements - data DATA LEVEL •  FLEXIBLE •  COMPACT •  ENABLE CRUD •  GRAPH? TRANSPORT PROTOCOL: •  RELIABLE •  EFFICIENT •  PUBSUB?
  17. 17. What? A PLATFORM BASED ON OPEN STANDARDS THAT IS EASILY PLUGGABLE TO EXISTING INFRASTRUCTURES AND THAT EXPLOITS LEGACY INFORMATION, SOCIAL GRAPH AND INTEREST GRAPH TO PROVIDE A PERSONALIZED INFORMATION “DASHBOARD” IN NEAR REAL-TIME.
  18. 18. use cases
  19. 19. HOW? A look inside
  20. 20. Step 1: Exploit distributed (social) graphs http://www.insidefacebook.com/wp-content/uploads/2013/06/shutterstock_107108318.jpg
  21. 21. Step 2: Exploit interest graphs BENEFITS OF USING INTEREST GRAPHS: 1. FLEXIBLE SOURCE OF BACKGROUND KNOWLEDGE 2. ANY DATASET CAN BE “PLUGGED-IN” IF NEEDED 3. CROSS-DOMAIN RECOMMENDATIONS 4. VERY GOOD AT DISCOVERING INTERESTING RECOMMENDATIONS OUR APPROACH: SPREADING ACTIVATION
  22. 22. Interest graphs DERI Maciej Blog Post2 Maurice "Emerging Technology" http://dbpedia.org/resource/Data_analytics http://dbpedia.org/resource/Emerging_technologies sioc:creator_of sioc:topic works at interest recommended interest owl:sameAs Expanded User Profile (EUP) Includes both original and recommended interests Social Software Entities Additional Profile Knowledge External Background Knowledge (DBPedia + domain datasets)
  23. 23. Our Approach A PLATFORM FOR SOCIAL NETWORKS: § ENTERPRISE FOCUS: PEOPLE, COMMUNITIES, INFORMATION § EFFICIENCY USING XMPP PUBSUB AND SPARQL 1.1 UPDATE § EXPLOIT INTEREST GRAPH AND VARIOUS DATA SOURCES TO PROVIDE PERSONALIZATION THROUGH SOPHISTICATED NEAR REAL-TIME RECOMMENDATIONS
  24. 24. Demonstrator EASY TO INTEGRATE WITH CISCO INFRASTRUCTURE OPEN STANDARDS (XMPP, SPARQL 1.1 UPDATE) SCALABLE RECOMMENDATIONS BASED ON SOCIAL GRAPH WITH OVER 10M ENTITIES AND 40M EDGES COMPUTED BELOW 1 SECOND (0.2S ON AVERAGE). MORE DETAILS: HTTP://ADVANSSE.DERI.IE/
  25. 25. demonstrator
  26. 26. Prototype stats SOCIAL NETWORK GRAPH: • 100S USERS • 100S POSTS • 500+ TAGS • 2000+ ENTITIES • 15000+ EDGES Saffron.deri.ie BACKGROUND KNOWLEDGE GRAPH: • 11M ENTITIES • 40M EDGES CROSS-DOMAIN GRAPH: • 3956 RESEARCH ARTICLES • LANGUAGE CONFERENCES
  27. 27. Why? What? How? technical considerations lessons learned
  28. 28. Technical considerations ALGORITHM: •  SEMANTIC NETWORK •  LARGE DATASET •  ITERATIVE GRAPH ALGORITHM •  STATEFUL NODES •  EMBEDDING OF DOMAIN LOGIC
  29. 29. Technical considerations NON-NATIVE IMPORT OF RDF STARTUP TIME WITH DBPEDIA • 12 MIN TO LOAD ON 24 CORES, 96GB RAM PARALLEL PROCESSING OF ACTIVATIONS • STATE FOR EACH USER AT EACH NODE SCALABILITY ISSUES LACK OF GLOBAL ALGORITHM CONTROL IMMATURE CODE BASE, LACK OF DOCUMENTATION
  30. 30. Technical considerations NATIVE SUPPORT FOR RDF DBPEDIA (5.46GB) COMPRESSED TO 436MB LOW MEMORY REQUIREMENTS LOW STARTUP TIME (90S) FAST QUERY ACCESS < 1ms
  31. 31. Server design XMPP SPREADING ACTIVATION HDT ADVANSSE connected social platform XMPP client: Ignite Smack Web application: Tomcat + Servlet RDF store: Jena Fuseki ADVANSSE server Personalisation component Recommendation algorithm XMPP R/W RDF store: Jena Fuseki XMPP Java API XMPP server: Ignite OpenFire XMPP client: Ignite Smack Fast, R/O RDF store: HDT SPARQL SPARQL + Java API Java API + SPARQL Java API SPARQL Java API File import Link resolver RDF store: Jena Fuseki
  32. 32. configuration • DISTANCE CONSTRAINT DISABLED • FANOUT CONSTRAINT ENABLED • 10 TARGET ACTIVATIONS • ACTIVATION THRESHOLD 0.5 • INITIAL ACTIVATION 4.0 • MAXIMUM OUT EDGES 500 • A MAXIMUM OF 10 WAVES AND 1 PHASE
  33. 33. stats DATASET: • 371 USERS • 6 INTERESTS ON AVERAGE • DEGREE 2-5, UP TO 51 AVERAGE EXECUTION: 200ms COVERAGE: 85%
  34. 34. The value SOCIAL CAPITAL IN ENTERPRISE SOCIAL NETWORKS IS NOT FULLY EXPLOITED. ENTERPRISE SOCIAL PLATFORMS ARE DISTRIBUTED AND INCLUDE VARIOUS SOURCES OF INFORMATION. VALUABLE INFORMATION IN AN ORGANIZATION IS NOT DISCOVERED BY THE RELEVANT EMPLOYEES. DISCOVER AND CONNECT WITH RELEVANT PEOPLE IN THE ORGANIZATION. AGGREGATE INFORMATION FROM VARIOUS DISTRIBUTED SOCIAL PLATFORMS USING OPEN STANDARDS. PROVIDE NEAR REAL-TIME PERSONALIZATION BASED ON LARGE, DYNAMIC GRAPH DATA.
  35. 35. Why? What? How? technical considerations lessons learned
  36. 36. Lessons learned • GREATER RELEVANCE TO REAL PROBLEMS • CLEARER REQUIREMENTS (AND MORE) • ACCESS TO ACTUAL USAGE DATA (REAL USERS) • PATENTS VS. PUBLISHING • PROTOTYPE INTEGRATION CONSUMES RESOURCES • MORE FOCUS ON FEATURE DEVELOPMENT • LESS EXPLORATION AND HYPOTHESIS TESTING
  37. 37. major considerations ACCESS TO INDUSTRY DATA INTEGRATION WITH THE PRODUCT? https://www.keytrac.net/assets/industry-social-networks.jpg http://www.autointhenews.com/wp-content/uploads/2010/05/volvo-s60-crash-video-image.jpg
  38. 38. Summary PROBLEM § INFORMATION OVERLOAD AND INEFFICIENT INFORMATION DISCOVERY IN DISTRIBUTED ENTERPRISE SOCIAL NETWORKS SOLUTION § RECOMMENDER SYSTEM THAT EXPLOITS SOCIAL GRAPH § UTILIZE INTEREST GRAPH AND LEGACY INFORMATION § NEAR REAL-TIME PERSONALIZATION TECHNOLOGY § OPEN SOURCE COMPONENT FOR RDF DATA AGGREGATION USING XMPP AND SPARQL 1.1 UPDATE § PERSONALIZATION COMPONENT BASED ON SPREADING ACTIVATION APPLICABLE TO MULTI-SOURCE, CROSS DOMAIN DATA
  39. 39. ENORMOUS VALUE IN INDUSTRY-ACADEMIA COLLABORATIONS CONTACT: MACIEJ.DABROWSKI@DERI.ORG @MACDAB
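
Slides 21 and 32 name spreading activation and list its configuration, but the transcript does not show the algorithm itself. The following is a minimal Python sketch of one common form of spreading activation, not the ADVANSSE implementation. The function name `spread_activation`, the adjacency-dict graph format, the equal split of activation across out-edges, and the `is_target` callback are all assumptions made for illustration. Only the default parameter values come from slide 32. The disabled distance constraint is reflected by having no hop limit other than the wave count, and the single phase by running the loop once.

```python
from collections import defaultdict


def spread_activation(graph, seeds, is_target, *, initial_activation=4.0,
                      threshold=0.5, max_out_edges=500, max_waves=10,
                      target_activations=10):
    """Illustrative single-phase spreading activation (hypothetical helper).

    graph     -- dict mapping a node to a list of its out-neighbours
    seeds     -- nodes from the user's profile that start with activation
    is_target -- callable deciding whether a node may be recommended
    """
    activation = defaultdict(float)
    for seed in seeds:
        activation[seed] = initial_activation

    frontier = set(seeds)   # nodes that fire in the current wave
    fired = set()           # each node fires at most once
    targets = set()         # activated nodes that may be recommended

    for _ in range(max_waves):
        incoming = defaultdict(float)
        for node in frontier:
            fired.add(node)
            neighbours = graph.get(node, ())
            degree = len(neighbours)
            # Fanout constraint: hubs with too many out-edges do not spread.
            if degree == 0 or degree > max_out_edges:
                continue
            share = activation[node] / degree  # assumed: equal split
            for neighbour in neighbours:
                incoming[neighbour] += share

        frontier = set()
        for node, value in incoming.items():
            activation[node] += value
            # Only unfired nodes above the threshold fire in the next wave.
            if activation[node] >= threshold and node not in fired:
                frontier.add(node)
                if is_target(node):
                    targets.add(node)

        # Stop once enough targets are activated or nothing is left to fire.
        if len(targets) >= target_activations or not frontier:
            break

    ranked = sorted(targets, key=lambda n: (-activation[n], n))
    return [(n, activation[n]) for n in ranked[:target_activations]]


if __name__ == "__main__":
    # Toy interest graph loosely modelled on the labels of slide 22.
    toy_graph = {
        "maciej": ["data_analytics", "deri"],
        "data_analytics": ["emerging_technologies", "maciej"],
        "deri": ["maciej", "maurice"],
        "emerging_technologies": ["data_analytics"],
        "maurice": ["deri"],
    }
    candidates = {"emerging_technologies"}
    print(spread_activation(toy_graph, ["maciej"], candidates.__contains__))
```

In this toy graph the user's existing interest passes part of its activation on to a related topic, which is the kind of "recommended interest" the Expanded User Profile on slide 22 describes. A production system would read neighbours from the RDF store rather than an in-memory dict.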
