The slides of our talk at the ECMLPKDD 2017 conference.
Abstract. In this paper we propose a survival factorization framework that models information cascades by tying together social influence patterns, topical structure and temporal dynamics. This is achieved through the introduction of a latent space which encodes: (a) the relevance of an information cascade on a topic; (b) the topical authoritativeness and the susceptibility of each individual involved in the information cascade, and (c) temporal topical patterns. By exploiting the cumulative properties of the survival function and of the likelihood of the model on a given adoption log, which records the observed activation times of users and side-information for each cascade, we show that the inference phase is linear in the number of users and in the number of adoptions. The evaluation on both synthetic and real-world data shows the effectiveness of the model in detecting the interplay between topics and social influence patterns, which ultimately provides high accuracy in predicting users' activation times.
The paper is available at
http://ecmlpkdd2017.ijs.si/papers/paperID392.pdf
or alternatively at
https://www.dropbox.com/s/hd3r3z3gwqlx9gp/pkdd17.pdf?dl=0
Code is also available at
https://github.com/gmanco/SurvivalFactorization
1. Survival Factorization on Diffusion Networks
Nicola Barbieri, Giuseppe Manco and Ettore Ritacco
Tumblr, 35 E 21st St, 10010, New York, USA - nicola@tumblr.com
ICAR-CNR, via Bucci 7/11C, 87036 Arcavacata di Rende (CS), Italy - giuseppe.manco@icar.cnr.it,
2. Context
• Users can create contents
• Contents can be shared within a diffusion network
• The diffusion takes place within cascades:
  • trees of timed word-of-mouth chains
8. Formally
• 𝒕_c = (t_1(c), …, t_N(c)) is a cascade with content c
• N is the number of users
• t_u(c) ∈ [0, T_c] ∪ {∞} is the timestamp at which node u becomes active on the cascade 𝒕_c
• T_c is the time horizon
• The probability of user u being infected by user v at time t_u(c) is given by:
  f(t_u(c) | t_v(c), λ_{u,v}) ∝ exp(−λ_{u,v} (t_u(c) − t_v(c)))
9. The Infection model
• v is the influencer
• λ_{u,v} represents the influence exerted by v on u: the transmission rate
• S(t_u(c) | t_v(c), λ_{u,v}) = exp(−λ_{u,v} (t_u(c) − t_v(c))) is the survival function,
  • the probability of resisting the contagion, p(T ≥ t_u(c) | t_v(c))
• t_u(c) − t_v(c) represents the exposure time
• The longer the delay, the lower the probability of infection

f(t_u(c) | t_v(c), λ_{u,v}) = λ_{u,v} · exp(−λ_{u,v} (t_u(c) − t_v(c)))
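The exponential infection model on this slide can be sketched in a few lines. This is a toy illustration, not the paper's code: `lam` plays the role of λ_{u,v} and `dt` of the exposure time t_u(c) − t_v(c).

```python
import math
import random

def survival(dt, lam):
    """Probability of resisting the contagion after an exposure time dt: S = exp(-lam * dt)."""
    return math.exp(-lam * dt)

def density(dt, lam):
    """Infection density f(dt) = lam * exp(-lam * dt): the rate times the survival function."""
    return lam * math.exp(-lam * dt)

def sample_delay(lam, rng=random):
    """Sample an infection delay by inverse-transform sampling of the exponential."""
    return -math.log(1.0 - rng.random()) / lam

# The density factors as hazard (lam) times survival, as the slide states.
assert abs(density(2.0, 0.5) - 0.5 * survival(2.0, 0.5)) < 1e-12
```

Note how a larger `dt` lowers both `survival` and `density`, matching the bullet "the longer the delay, the lower the probability of infection".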
11. Building the Survival Model – Issues
• The nature of the transmission rate λ determines how well the model adapts to personalized contagion patterns
• A very fine-grained approach:
  • a single value λ_{u,v} for each pair of users within each cascade
• This approach is intractable in real scenarios:
  • the matrix Λ, containing all the λ_{u,v}, has size N²
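The tractability issue motivates a low-rank factorization: instead of N² pairwise rates, each user gets a K-dimensional authoritativeness vector A[v] and susceptibility vector S[u] (the quantities that reappear on slide 21), for 2·N·K parameters in total. A minimal sketch of the reconstruction, assuming the rate is the inner product λ_{u,v} = Σ_k A[v,k]·S[u,k]:

```python
import numpy as np

def rate(A, S, u, v):
    """Reconstruct lambda_{u,v} from authority A[v] and susceptibility S[u] (rank-K factorization)."""
    return float(A[v] @ S[u])

N, K = 5, 3
rng = np.random.default_rng(0)
A = rng.random((N, K))  # authoritativeness of each user per topic
S = rng.random((N, K))  # susceptibility of each user per topic

# The full N x N matrix is only implicit; materialize it here just to check consistency.
Lam = S @ A.T           # Lam[u, v] = sum_k S[u, k] * A[v, k]
assert abs(Lam[2, 4] - rate(A, S, 2, 4)) < 1e-12
# Storage: N*N rates for the full matrix vs 2*N*K parameters for the factorization.
```

For N = 10^6 users and K = 16 topics this is 10^12 rates versus 3.2 × 10^7 parameters, which is what makes inference feasible.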
18. The complete model
• Content can be modeled jointly
  • e.g., textual content modeled by a mixture of Poisson distributions expressing topic dependency
• For each cascade c ∈ {1, …, M}:
  o Sample the topical diffusion pattern, z_c ∼ Multinomial(Θ)
  o For each word w in c:
    § Sample the occurrences of w in c, n_{w,c} ∼ Poisson(Φ)
  o For each user u in c:
    § Sample the user who generated the contagion, y_u^c ∼ Multinomial(Ξ)
    § Sample her activation time, t_u(c) ∼ Weibull(z_c, y_u^c, A, S)
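The generative process above can be sketched directly. This is a toy sketch, not the paper's code: sizes, priors, and the rate construction λ = A[y_u, z]·S[u, z] are illustrative assumptions, and the activation times use the exponential special case (Weibull with shape 1, which the conclusions note the model generalizes).

```python
import numpy as np

rng = np.random.default_rng(42)

K, N, V = 3, 10, 20           # topics, users, vocabulary size (toy sizes)
Theta = np.full(K, 1.0 / K)   # prior over topical diffusion patterns
Phi = rng.gamma(2.0, 1.0, size=(V, K))  # Poisson rate of each word under each topic
Xi = np.full(N, 1.0 / N)      # prior over influencers (uniform here for simplicity)
A = rng.random((N, K)) + 0.1  # authoritativeness of each user per topic
S = rng.random((N, K)) + 0.1  # susceptibility of each user per topic

def sample_cascade():
    """One pass of the generative process: topic, word counts, influencers, activation times."""
    z = rng.choice(K, p=Theta)             # topical diffusion pattern z_c
    n_words = rng.poisson(Phi[:, z])       # word occurrences n_{w,c}
    y = rng.choice(N, size=N, p=Xi)        # influencer y_u^c for each user
    rates = np.array([A[y[u], z] * S[u, z] for u in range(N)])
    t = rng.weibull(1.0, size=N) / rates   # activation times; shape 1 = exponential delays
    return z, n_words, y, t
```

A real implementation would additionally order activations so that each influencer activates before the users it infects; this sketch only shows how the four sampling steps fit together.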
19. Model Learning
• EM approach
  • E-step: update the latent variables
  • M-step: given the status of the latent variables 𝒁 and 𝒀, update the parameters
• Linear complexity!
  • The update equations in the EM algorithm can be optimized by exploiting the factorization of λ_{u,v}
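The E/M alternation can be illustrated on a stripped-down version of the model: one cascade, exponential delays with rate λ_{u,v} = a_v · s_u, responsibilities over candidate influencers in the E-step, and closed-form coordinate updates of the rates in the M-step. This is a simplified sketch under those assumptions (scalar per-user a and s, no topics, no censoring), not the paper's actual update equations:

```python
import numpy as np

def em_step(t, a, s):
    """One EM iteration for exponential delays with rate a[v] * s[u] (toy sketch)."""
    N = len(t)
    order = np.argsort(t)
    R = np.zeros((N, N))                    # R[u, v] = P(y_u = v | activation times)
    for i, u in enumerate(order[1:], start=1):
        for v in order[:i]:                 # candidate influencers: users activated earlier
            lam = a[v] * s[u]
            R[u, v] = lam * np.exp(-lam * (t[u] - t[v]))
        R[u] /= R[u].sum()                  # E-step: normalize over candidates
    # M-step: maximize sum_{u,v} R[u,v] * (log(a_v s_u) - a_v s_u dt_{uv}) coordinate-wise.
    dt = np.maximum(t[:, None] - t[None, :], 0.0)  # exposure times t_u - t_v
    a_new = R.sum(axis=0) / np.maximum((R * dt * s[:, None]).sum(axis=0), 1e-12)
    s_new = R.sum(axis=1) / np.maximum((R * dt * a_new[None, :]).sum(axis=1), 1e-12)
    return R, a_new, s_new

t = np.array([0.0, 0.4, 1.1, 2.5])  # activation times in one cascade; user 0 is the seed
a = np.ones(4)
s = np.ones(4)
for _ in range(5):
    R, a, s = em_step(t, a, s)
```

The linear-complexity claim on the slide comes from the factorized λ letting the per-user sums be accumulated incrementally instead of touching every (u, v) pair, which this quadratic toy loop does not attempt to reproduce.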
21. Exploiting the model
• We started with four questions:
  • Q: Who will share a content?
    A: the users infected within a given time horizon
  • Q: When will someone share a content?
    A: a sample from p(𝒕_c | 𝒁, 𝒀, 𝚲)
  • Q: Who is an expert in a topic characterizing a set of contents?
    A: influential users, see A_{v,k}
  • Q: Who is interested in a topic?
    A: susceptible users, see S_{u,k}
23. Evaluation
• Activation prediction:
• Two samples of Twitter (filtered/noisy draws)
• Testing protocol:
• Given an incomplete cascade (50%, 80%), fill the missing activations
• Predict activation times
• Influencers and topics:
• MemeTracker dataset
• Testing protocol:
• A manual semantic analysis of the top topics and most influential users
26. Conclusions
• Robust, efficient and accurate modeling of information cascades
• Factorizing the infection rate uncovers highly relevant
information concerning the underlying diffusion process
• Works with general Weibull distribution, not just the exponential
• Future work
• Bayesian learning: The underlying probability distributions allow conjugate priors
• Exploit mutually exciting processes (e.g., Hawkes processes) within the same model
• Deep architectures for combining heterogeneous content
• Content dynamics within a cascade