We propose a neural embedding approach to identify temporally
like-minded user communities, i.e., communities of users who have similar temporal alignment in their topics of interest. Like-minded user communities in social networks are usually identified by considering either explicit structural connections between users (link analysis), users’ topics of interest expressed in their posted content (content analysis), or both in tandem. In such communities, however, the users’ rich temporal behavior towards their topics of interest is overlooked. Only a few recent research efforts consider the time dimension and define like-minded user communities as groups of users who share not only similar topical interests but also similar temporal behavior. Temporally like-minded user communities find application in areas such as recommender systems, where relevant items are recommended to users at the right time. In this paper, we tackle the problem of identifying temporally like-minded user communities by leveraging unsupervised feature learning (embeddings). Specifically, we learn a mapping from the user space to a low-dimensional vector space of features that incorporate both topics of interest and their temporal nature. We demonstrate the efficacy of our proposed approach on a Twitter dataset in the context of three applications: news recommendation, user prediction, and community selection, where our work outperforms the state-of-the-art on important information retrieval metrics.
ECIR23: A Streaming Approach to Neural Team Formation Training (Hossein Fani)
Predicting future successful teams of experts who can effectively collaborate is challenging due to the temporality of experts’ skill sets, levels of expertise, and collaboration ties, which is overlooked by prior work. Specifically, state-of-the-art neural methods learn vector representations of experts and skills in a static latent space, falling short of incorporating the possible drift and variability of experts’ skills and collaboration ties over time. In this paper, we propose (1) a streaming-based training strategy for neural models to capture the evolution of experts’ skills and collaboration ties over time and (2) consuming time information as an additional signal to the model for predicting future successful teams. We empirically benchmark our proposed method against state-of-the-art neural team formation methods and a strong temporal recommender system on datasets from varying domains with distinct distributions of skills and experts in teams. The results demonstrate that neural models trained with our proposed strategy excel in terms of both classification and information retrieval metrics. The codebase is available at https://github.com/fani-lab/OpeNTF/tree/ecir24.
SEKE15: An ontology for describing security events (Hossein Fani)
Mining security events helps with better precautionary planning for community safety. However, incident records are expressed in diverse, application-dependent formats, which impedes common comprehension for automatic knowledge extraction and reasoning. In this paper, we present the Security Incident Ontology (SIO), a novel lightweight domain ontology for security incidents. We use Timeline to annotate the temporal facts of incidents and adopt Event to represent any security issue, from indecent behavior to assault to more adverse crime, that raises a security alarm in a community. SIO offers security incident detectors, whether a police officer, a Robocop, or an intelligent CCTV camera, a uniform way to report security events. We evaluate SIO’s competency by populating it with security incident notifications from Integrated Risk Management (IRM) at Ryerson University, whose campus has both business and housing areas in its vicinity and encompasses not only a high rate but also a wide variety of security issues. SIO is developed in OWL 2 with Protégé.
ECIR20: Temporal Latent Space Modeling for Community Prediction (Hossein Fani)
We propose a temporal latent space model for user community prediction in social networks, whose goal is to predict future emerging user communities based on the past history of users’ topics of interest. Our model assumes that each user lies within an unobserved latent space, and that similar users in the latent space representation are more likely to be members of the same user community. The model allows each user to adjust her location in the latent space as her topics of interest evolve over time. Empirically, we demonstrate that our model, when evaluated on a Twitter dataset, outperforms existing approaches under two application scenarios, namely news recommendation and user prediction, on a host of metrics such as MRR and NDCG as well as precision and F-measure.
CIKM AnalytiCup 2017: Bagging Model for Product Title Quality with Noise (Hossein Fani)
To stand out from the crowd, sellers employ creative, sometimes disruptive titles for their products in online stores to improve their search relevancy or attract the attention of customers. As part of the CIKM AnalytiCup 2017, the challenge is to build a product title quality model that can automatically grade the clarity and conciseness of a product title. Our proposed “Bagging Model for Product Title Quality with Noise” outperformed the other entries and won the CIKM AnalytiCup 2017 competition.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables the ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method, where all vertices are processed in each iteration. It does, however, come with the precondition that the input graph contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Slide 3
motivation:
‘War in Afghanistan’: item recommendation with the correct timing
hypothesis: like-minded users exhibit similar temporal behavior towards similar topics due to ‘something’
Slide 4
Hu et al., AAAI'14: Group-Specific Topics-over-Time (GrosToT); thank you so much for your clean implementation, much appreciated!
Fani et al., CI'15: user-topic time series, 2-d cross-correlation, graph clustering
Slide 6
Gold Standard
approach
1. regions of like-mindedness (RoL): identify the co-occurrence context of users in the topic and time spaces (user-topic-time cuboids)
Slide 7
approach
1. regions of like-mindedness (RoL)
   1. for each time t: 2-d RoLs in the user and topic spaces
      1. build a multigraph Gt = (V, E), where V = topics and E = {U^t_{zi,zj}(c): the maximal set of users whose interest towards zi and zj satisfies the condition of homogeneity c}
      2. dfs
   2. for each 2-d RoL: 3-d RoLs in the (user, topic) and time spaces
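The multigraph-building step above can be sketched in a few lines of Python. The interest map, the 0.5 threshold, and the function name below are hypothetical stand-ins: a simple per-user interest cutoff replaces the paper's homogeneity condition c, and each edge (zi, zj) collects the maximal set of users interested in both topics.

```python
from collections import defaultdict

def build_multigraph(interest, threshold=0.5):
    """Build the edge map of one interval's multigraph Gt.

    `interest` is a hypothetical {user: {topic: weight}} map for a
    single time interval; an edge (zi, zj) holds the maximal set of
    users whose interest in both zi and zj clears a simple threshold
    (a stand-in for the paper's condition c).
    """
    edges = defaultdict(set)  # (zi, zj) -> U^t_{zi,zj}
    for user, topics in interest.items():
        hot = [z for z, w in topics.items() if w >= threshold]
        for zi in hot:
            for zj in hot:  # includes zi == zj, i.e. a loop on each node
                edges[(zi, zj)].add(user)
    return dict(edges)

# toy interval mirroring the worked example later in the deck
interest = {
    "u1": {"z40": 0.9},
    "u2": {"z40": 0.8, "z41": 0.7},
    "u3": {"z41": 0.2},
}
g = build_multigraph(interest)
```

With this toy data the loop edge (z40, z40) carries {u1, u2} and the edge (z40, z41) carries only {u2}, matching the shape of the example multigraph discussed later.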
Slide 8
regions of like-mindedness (RoL)
Zhao et al. (TriCluster) on 3-d gene microarray data
equality is [0, 0.1)
equality is [0.1, 1.0]
Slide 9
r = {u1, u2, u3} × {}, C = [z40, z40, z41, z41, ..., z45, z45]
r = {u1, u2, u3} × {z40}, C = [z40, z41, z41, ..., z45, z45]
r = {u1, u2} × {z40}, C = [z41, z41, ..., z45, z45]
Slide 10
approach
1. regions of like-mindedness (RoL): identify the co-occurrence context of users in the topic and time spaces (user-topic-time cuboids)
2. embeddings: input the user space of the RoLs to w2v (CBOW) and build u2v
3. graph clustering
Slide 12
approach
1. regions of like-mindedness (RoL): identify the co-occurrence context of users in the topic and time spaces (user-topic-time cuboids)
2. embeddings: input PoTI to w2v and build u2v
3. graph clustering: Louvain method on a weighted graph based on u2v cosine similarity
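The graph-clustering step can be sketched with stdlib Python. The u2v vectors below are hypothetical placeholders for word2vec (CBOW) output; only the weighted-graph construction is shown, since the Louvain method itself would then be run on these edge weights (e.g. via the python-louvain package).

```python
from math import sqrt

def cosine(a, b):
    # cosine similarity between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# hypothetical u2v vectors; in the deck these come from training
# word2vec (CBOW) on the user space of the RoLs
u2v = {
    "u1": [0.9, 0.1],
    "u2": [0.8, 0.2],
    "u3": [0.1, 0.9],
}

# weighted user graph: edge weight = cosine similarity of embeddings;
# the Louvain method is then run on this weighted graph
weights = {
    (u, v): cosine(u2v[u], u2v[v])
    for u in u2v for v in u2v if u < v
}
```

Here u1 and u2 end up strongly connected while u3 sits apart, so a community detector on this graph would group u1 and u2 together.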
Slide 13
gold standard
assumption: users are interested in the topics of the news articles about which they have posted
golden set: news articles to which a user has explicitly linked in her tweets
mentions = {(user, news article, timestamp)}
Abel et al.: Twitter, 3M tweets posted by 135K users between Nov. 1 and Dec. 31, 2010
25,756 triples extracted from 3,468 distinct news articles posted by 1,922 users
Slide 14
evaluation
1. news recommendation: at time t, recommend news article a to all communities
   recommendation task: (user, ?, timestamp)
   prediction task: (?, news article, timestamp)
2. community selection: given a news article a at time t (the input query), find the communities of those users (analogous to finding documents relevant to an input query) who have mentioned the news article at that time
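One plausible reading of the community-selection task can be sketched as ranking communities by the similarity of their centroid to the query article. This is a minimal sketch under stated assumptions: the deck does not specify how article vectors are obtained, so the vectors, names, and ranking rule below are all hypothetical.

```python
from math import sqrt

def cosine(a, b):
    # cosine similarity between two vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def rank_communities(article_vec, communities, u2v):
    """Rank communities for a query article: each community is
    represented by the mean (centroid) of its members' u2v vectors,
    and communities are sorted by centroid-to-article similarity.

    Assumes (hypothetically) that the article vector lives in the
    same space as the user embeddings.
    """
    def centroid(members):
        vecs = [u2v[u] for u in members]
        return [sum(c) / len(vecs) for c in zip(*vecs)]
    return sorted(communities,
                  key=lambda m: cosine(article_vec, centroid(m)),
                  reverse=True)

# toy embeddings and two toy communities
u2v = {"u1": [1.0, 0.0], "u2": [0.9, 0.1], "u3": [0.0, 1.0]}
ranked = rank_communities([1.0, 0.0], [["u3"], ["u1", "u2"]], u2v)
```

With this toy data, the community {u1, u2} ranks above {u3} for an article vector pointing along the first axis.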
To find the final 2-d RoLs for time t, we apply depth-first search (DFS) on the multigraph Gt based on the pseudocode described in Algorithm 1. We start with the 2-d RoL r = U × {}: all users U but no topics, since no node (topic) has been processed yet, and C = [z1, z1, z2, z2, ..., z|Z|, z|Z|] as the list of all initial nodes (topics) to be processed. Here, C includes duplicated initial topics to support directed loops on each node. At each intermediate recursive call, we have a current candidate 2-d RoL r = A × B and a list C of not-yet-processed topics. We add r to an initially empty set Rt if it satisfies c and is not already contained in some RoL r′ ∈ Rt. Then, we remove any 2-d RoL r″ ∈ Rt that has already been subsumed by r (lines 2-6). We expand the current candidate r from each of its old topics zi to a new topic zj if there is a directed edge (zi → zj) in Gt. Then, the function is called on the new candidate {r.A ∩ U^t_{zi,zj}} × {r.B ∪ {zj}} (lines 7-15).
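The recursive expansion described above can be sketched in Python. This is a minimal reconstruction, not the paper's Algorithm 1 verbatim: `edges` maps (zi, zj) to U^t_{zi,zj}, `C` lists the initial topics (each duplicated so loop edges can fire), and `satisfies` is a caller-supplied stand-in for the homogeneity condition c.

```python
def find_2d_rols(users, C, edges, satisfies):
    """DFS sketch of the 2-d RoL expansion (reconstruction, not the
    paper's Algorithm 1 verbatim)."""
    Rt = []  # maximal 2-d RoLs found so far

    def add(A, B):
        # skip r if an existing RoL already contains it; otherwise
        # drop every RoL that r subsumes before inserting r
        if any(A <= A2 and B <= B2 for A2, B2 in Rt):
            return
        Rt[:] = [(A2, B2) for A2, B2 in Rt if not (A2 <= A and B2 <= B)]
        Rt.append((A, B))

    def dfs(A, B, C):
        if B and satisfies(A, B):
            add(A, B)
        for i, zj in enumerate(C):
            rest = C[:i] + C[i + 1:]
            if not B:                      # first topic of a candidate
                dfs(A, {zj}, rest)
            else:                          # expand along an edge zi -> zj
                for zi in B:
                    U = edges.get((zi, zj))
                    if U:
                        dfs(A & U, B | {zj}, rest)

    dfs(set(users), set(), list(C))
    return Rt

# toy multigraph shaped like the worked example below; the (z41, z41)
# loop edge is an assumption added to keep the toy graph consistent
edges = {
    ("z40", "z40"): {"u1", "u2"},
    ("z40", "z41"): {"u2"},
    ("z41", "z41"): {"u2"},
}

def satisfies(A, B):
    # toy condition c: every user in A lies on every existing edge
    # among B's topics
    return bool(A) and all(A <= edges[(zi, zj)]
                           for zi in B for zj in B if (zi, zj) in edges)

rols = find_2d_rols({"u1", "u2", "u3"}, ["z40", "z40", "z41", "z41"],
                    edges, satisfies)
```

On this toy graph the search returns the two maximal RoLs {u1, u2} × {z40} and {u2} × {z40, z41}, echoing the worked example.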
For example, let us consider how the 2-d RoLs are identified from the multigraph G22 shown in Figure 4a. Initially, the algorithm starts with the candidate 2-d RoL r = {u1, u2, u3} × {}, C = [z40, z40, z41, z41, ..., z45, z45]. We pop node z40 and recursively call the function on r = {u1, u2, u3} × {z40}, C = [z40, z41, z41, ..., z45, z45] (line 10). Since {u1, u2, u3} × {z40} does not satisfy condition c, we continue by popping a new node (topic), which is again z40. There is only one directed edge (a loop) from z40 → z40, so we obtain a new candidate (line 14) and call the function on r = {u1, u2} × {z40}, C = [z41, z41, ..., z45, z45] (line 15). Now, the input r satisfies c and we add it to the thus-far empty R22 (line 6). Next, we pop z41; there is a directed edge from z40 → z41 with U^22_{z40,z41} = {u2}, so we call the function on r = {u2} × {z40, z41}, C = [z41, ..., z45, z45], which leads to a new element in R22.