Core–periphery detection in networks with nonlinear Perron eigenvectors - Francesco Tudisco
Core–periphery detection is a highly relevant task in exploratory network analysis. Given a network of nodes and edges, one is interested in revealing the presence and measuring the consistency of a core–periphery structure using only the network topology. This mesoscale network structure consists of two sets: the core, a set of nodes that is highly connected across the whole network, and the periphery, a set of nodes that is well connected only to the nodes that are in the core. Networks with such a core–periphery structure have been observed in several applications, including economic, social, communication and citation networks.
In this talk we discuss a new core–periphery detection model based on the optimization of a class of core–periphery quality functions. While the quality measures are highly nonconvex in general and thus hard to treat, we show that the global solution coincides with the nonlinear Perron eigenvector of a suitably defined parameter-dependent matrix M(x), i.e. the positive solution to the nonlinear eigenvector problem M(x)x=λx. Using recent advances in nonlinear Perron–Frobenius theory, we discuss uniqueness of the global solution and we propose a nonlinear power method-type scheme that (a) allows us to solve the optimization problem with global convergence guarantees and (b) effectively scales to very large and sparse networks. Finally, we present several numerical experiments showing that the new method largely outperforms state-of-the-art techniques for core–periphery detection.
This presentation on pseudo-random number generators lists the different generators, their mechanisms, and the various applications of random and pseudo-random numbers in different areas.
Weight enumerators of block codes and the mc williams - Madhumita Tamhane
The best possible error control codes of a given rate and block length can be judged against bounds: no codes can exist beyond the bounds, and codes are sure to exist within them. This presentation gives the composition structure of block codes and the probabilities of decoding error and of decoding failure. The MacWilliams identities are a relationship between the weight distribution of a linear code and the weight distribution of its dual code. They hold for any linear code and are based on the vector space structure of linear codes and on the fact that the dual code of a code is the orthogonal complement of the code...
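The relationship between a code's weight distribution and that of its dual can be checked numerically on a tiny example. The sketch below is illustrative only and assumes the binary case: the function names are hypothetical, the dual is found by brute force, and the identity is applied in its Krawtchouk-polynomial form, B_j = (1/|C|) Σ_i A_i K_j(i).

```python
from itertools import product
from math import comb

def span(generators, n):
    """All codewords of the binary linear code spanned by the generator rows."""
    words = set()
    for coeffs in product((0, 1), repeat=len(generators)):
        words.add(tuple(
            sum(c * g[i] for c, g in zip(coeffs, generators)) % 2
            for i in range(n)
        ))
    return words

def weight_distribution(words, n):
    """dist[i] = number of codewords of Hamming weight i."""
    dist = [0] * (n + 1)
    for w in words:
        dist[sum(w)] += 1
    return dist

def dual_code(words, n):
    """Brute-force dual: every vector orthogonal (mod 2) to all codewords."""
    return {
        v for v in product((0, 1), repeat=n)
        if all(sum(a * b for a, b in zip(v, w)) % 2 == 0 for w in words)
    }

def macwilliams(dist, n):
    """Weight distribution of the dual code via the binary MacWilliams identity."""
    size = sum(dist)  # number of codewords |C|
    out = []
    for j in range(n + 1):
        total = 0
        for i, a_i in enumerate(dist):
            # Krawtchouk polynomial K_j(i) for the binary alphabet
            k_ji = sum((-1) ** s * comb(i, s) * comb(n - i, j - s)
                       for s in range(j + 1))
            total += a_i * k_ji
        out.append(total // size)
    return out

n = 3
code = span([(1, 1, 1)], n)                        # length-3 repetition code
A = weight_distribution(code, n)                   # weights of the code
B_direct = weight_distribution(dual_code(code, n), n)
B_identity = macwilliams(A, n)                     # same result without listing the dual
```

For the length-3 repetition code the dual is the even-weight code, so both routes give one word of weight 0 and three of weight 2.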
It presents various approximation schemes including absolute approximation, epsilon approximation and also presents some polynomial time approximation schemes. It also presents some probabilistically good algorithms.
This is a detailed review of the ACM International Collegiate Programming Contest (ICPC) Northeastern European Regional Contest (NEERC) 2015 problems. It includes a summary of each problem, the names of the problem authors, and detailed run statistics for each problem. Video of the actual presentation that was recorded during NEERC is here https://www.youtube.com/watch?v=vn7v1MuWXdU (in Russian)
Note: only preliminary stats were available at the time, because the problem review took place before the closing ceremony. This published presentation has full stats.
Computation of electromagnetic fields scattered from dielectric objects of un... - Alexander Litvinenko
Tools for electromagnetic scattering from objects with uncertain shapes are needed in various applications.
We develop numerical methods for predicting radar and scattering cross sections (RCS and SCS) of complex targets.
To reduce the cost of Monte Carlo (MC) sampling, we offer a modified multilevel MC method, the continuation multilevel Monte Carlo (CMLMC) method.
Computation of electromagnetic fields scattered from dielectric objects of un... - Alexander Litvinenko
We develop fast and efficient stochastic methods for characterizing scattering
from objects of uncertain shapes. This is highly needed in the
fields of electromagnetics, optics, and photonics.
The continuation multilevel Monte Carlo (CMLMC) method is
used together with a surface integral equation solver. The
CMLMC method optimally balances statistical errors due to
sampling of the parametric space, and numerical errors due
to the discretization of the geometry using a hierarchy of
discretizations, from coarse to fine. The number of realizations
of finer discretizations can be kept low, with most samples
computed on coarser discretizations to minimize computational
work. Consequently, the total execution time is significantly
reduced, in comparison to the standard MC scheme.
Here we look at how to create a projection matrix for orthogonal projection, and for projection onto a line, as part of numerical linear algebra.
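As a small illustration of projection onto a line, the sketch below builds the standard projector P = a aᵀ / (aᵀ a) onto the line spanned by a vector a. The function names are hypothetical, and integer input is assumed so that exact fractions can be used.

```python
from fractions import Fraction

def projection_matrix(a):
    """Return P = a a^T / (a^T a), the orthogonal projector onto span{a}.

    Assumes `a` is a nonzero list of integers so the entries are exact fractions.
    """
    norm_sq = sum(x * x for x in a)
    if norm_sq == 0:
        raise ValueError("direction vector must be nonzero")
    return [[Fraction(x * y, norm_sq) for y in a] for x in a]

def matvec(M, v):
    """Matrix-vector product."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def matmul(A, B):
    """Matrix-matrix product."""
    return [[sum(A[r][k] * B[k][c] for k in range(len(B)))
             for c in range(len(B[0]))] for r in range(len(A))]

a = [1, 2]
P = projection_matrix(a)        # [[1/5, 2/5], [2/5, 4/5]]
projected = matvec(P, [3, 1])   # component of (3, 1) along the line through a
```

A projector is idempotent, so applying P twice gives the same result as applying it once; here (3, 1) projects onto (1, 2).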
Application of the random walk theory for simulation of flood hazzards jeddah... - Amro Elfeki
Simulation of Urban Flooding by Diffusive Wave Model: Jeddah Flood 2009 Case Study. Proceeding of 1st International Geomatics Symposium in Saudi Arabia in collaboration with University of Florida and INSA Strasbourg: Geomatics Technologies in the City, 10-11 May 2011, A. Mohamed (Ed.), pp. 119-123.
Random Walks, Efficient Markets & Stock Prices - NEO Empresarial
The famous financial theory of Efficient Markets is associated with the idea of a Random Walk. If the theory holds true, that makes prices unpredictable, and therefore it'd be impossible to consistently beat the market.
The seminar discusses the mathematical idea of a random walk, then moves on to understand what makes a market efficient.
Finally, we conduct a Monte Carlo Simulation on Wolfram Mathematica, to forecast the behaviour of Google's stock price one year from now.
WIC2006 - Research Paper Recommender Systems: A Random-Walk Based Approach - Augusto Pucci, PhD
Every day researchers from all over the world have to filter the huge mass of existing research papers in order to find useful publications related to their current work. In this paper we propose a research paper recommending algorithm based on the Citation Graph and random-walker properties. The PaperRank algorithm is able to assign a preference score to a set of documents contained in a digital library and linked to one another by bibliographic references. A data set of papers extracted from the ACM Portal has been used for testing, and very promising performance has been measured.
Presented "Random Walk on Graphs" in the reading group for Knoesis. Specifically for Recommendation Context.
Referred: Purnamrita Sarkar, Random Walks on Graphs: An Overview
Spam Detection with a Content-based Random-walk Algorithm (SMUC'2010) - Javier Ortega
Presentation of PolaritySpam, a graph-based ranking algorithm intended to demote the spam web pages in the ranking provided by a web search engine.
Cite as:
F. Javier Ortega; Craig Macdonald; José A. Troyano; Fermín L. Cruz. “Spam Detection with a Content-based Random-Walk Algorithm”. Proceedings of the Second International Workshop on Search and Mining User-Generated Contents, International Conference on Information and Knowledge Management. 2010. Toronto, Canada
Episode 50 : Simulation Problem Solution Approaches Convergence Techniques S... - SAJJAD KHUDHUR ABBAS
Episode 50 : Simulation Problem Solution Approaches Convergence Techniques Simulation Strategies
3.2.3.3. Quasi-Newton (QN) Methods
These methods represent a very important class of techniques because of their extensive use in practical algorithms. They attempt to use an approximation to the Jacobian and then update this at each step, thus reducing the overall computational work.
The QN method uses an approximation Hk to the true Jacobian i and computes the step via a Newton-like iteration. That is,
SAJJAD KHUDHUR ABBAS
Ceo , Founder & Head of SHacademy
Chemical Engineering , Al-Muthanna University, Iraq
Oil & Gas Safety and Health Professional – OSHACADEMY
Trainer of Trainers (TOT) - Canadian Center of Human
Development
Implicit schemes are needed in order to have fast runtimes in wave models. Parallelization using the Message Passing Interface is needed in order to run on computers with thousands of processors. Implicit schemes rely on a preconditioner for the iterative solver to converge quickly. Thus we need fast preconditioners, and we present those here.
We examine the effectiveness of randomized quasi Monte Carlo (RQMC) to improve the convergence rate of the mean integrated square error, compared with crude Monte Carlo (MC), when estimating the density of a random variable X defined as a function over the s-dimensional unit cube (0,1)^s. We consider histograms and kernel density estimators. We show both theoretically and empirically that RQMC estimators can achieve faster convergence rates in
some situations.
This is joint work with Amal Ben Abdellah, Art B. Owen, and Florian Puchhammer.
Stratified Monte Carlo and bootstrapping for approximate Bayesian computation - Umberto Picchini
Presented on 7 May 2020 at "One World Approximate Bayesian Computation (ABC) Seminar". A video is available at https://youtu.be/IOPnRfAJ_W8
Approximate Bayesian computation (ABC) is computationally intensive for complex model simulators. To exploit expensive simulations, data-resampling via bootstrapping was used with success in [1] to obtain many artificial datasets at little cost and construct a synthetic likelihood. When using the same approach within ABC to produce a pseudo-marginal ABC-MCMC algorithm, the posterior variance is inflated, thus producing biased posterior inference. Here we use stratified Monte Carlo to considerably reduce the bias induced by data resampling. We also show that it is possible to obtain reliable inference using a larger than usual ABC threshold, by employing stratified Monte Carlo. Finally, we show that with stratified sampling we obtain a less variable ABC likelihood. In our paper [2] we consider simulation studies for static (Gaussian, g-and-k distribution, Ising model) and dynamic models (Lotka-Volterra). For the Lotka-Volterra case study, we compare our results against a standard pseudo-Marginal ABC and find that our approach is four times more efficient and, given limited computational budget, it explores the posterior surface more thoroughly. A comparison against state-of-art sequential Monte Carlo ABC is also reported.
References
[1] R. G. Everitt (2017). Bootstrapped synthetic likelihood. arXiv:1711.05825.
[2] U. Picchini, R. G. Everitt (2019). Stratified sampling and resampling for approximate Bayesian computation. arXiv:1905.07976.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
Presentation of our work "Local discrepancy maximization on graphs" at the International IEEE Conference on Data Engineering (ICDE) 2015 in Seoul, Korea.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt... - Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Seminar of U.V. Spectroscopy by SAMIR PANDA - SAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep... - University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Nutraceutical market, scope and growth: Herbal drug technology - Lokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Richard's aventures in two entangled wonderlands - Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
Cancer cell metabolism: special Reference to Lactate Pathway - AADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules of a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Krebs cycle - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Introduction to the WARBURG PHENOMENON:
WARBURG EFFECT: Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme".
WARBURG EFFECT: The tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing by which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes, including worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans acting
Binds with target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis acting
Binds with target mRNA in perfect complementary sequence
piwi-RNA (piRNA)
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is a large (>500 kD) RNA multi-protein binding complex which triggers mRNA degradation in response to mRNA
Unwinding of double-stranded siRNA by ATP-independent helicase
The active components of RISC are Ago proteins (endonucleases), which cleave target mRNA.
DICER: endonuclease (RNase Family III)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of mRNA (RNase H activity)
MiRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
1. Absorbing Random Walk Centrality: Theory and Algorithms. Harry Mavroforakis, Boston University; Michael Mathioudakis, Aalto University; Aristides Gionis, Aalto University. Helsinki - September 15th 2015
2. submit query to twitter, e.g. ‘#ferguson’. want to see messages from few, central users. many and different groups of people might be posting about it
3. one approach... represent activity with graph: users in the results (query nodes), other users, connections
4. select k central query nodes. what is ‘central’? many measures have been studied. here, we use random walks for robustness: absorbing random walk centrality
5. absorbing random walk centrality
model: start from query node q w.p. s(q); re-start with probability α at each step
transitions: follow edges at random from one node to another
absorption: we can designate set of absorbing nodes; no escape from absorbing nodes
centrality of nodes S: expected time (number of steps) until absorbed by S
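The model on this slide can be turned into a small computation. The following is a minimal sketch, not the authors' implementation: it assumes that a restart jumps to a query node drawn from s, that a walk landing on an absorbing node stops immediately, and it solves (I - P) t = 1 over the non-absorbing nodes with exact fractions. The function names and the dictionary-based graph format are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (exact when entries are Fractions)."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

def absorbing_centrality(adj, s, alpha, absorbing):
    """Expected number of steps until the restart walk is absorbed.

    adj: dict node -> list of neighbours (assumed: no isolated nodes)
    s: dict query node -> start/restart probability (assumed to sum to 1)
    alpha: restart probability at each step
    absorbing: set of absorbing nodes
    """
    alpha = Fraction(alpha)
    transient = [u for u in adj if u not in absorbing]
    index = {u: k for k, u in enumerate(transient)}
    m = len(transient)
    # A = I - P, restricted to the non-absorbing (transient) nodes.
    A = [[Fraction(int(r == c)) for c in range(m)] for r in range(m)]
    for u in transient:
        for v in adj[u]:                       # follow a random edge
            if v in index:
                A[index[u]][index[v]] -= (1 - alpha) / len(adj[u])
        for q, prob in s.items():              # restart at a query node
            if q in index:
                A[index[u]][index[q]] -= alpha * Fraction(prob)
    t = solve(A, [Fraction(1)] * m)            # t = (I - P)^{-1} 1
    # Average over the start distribution; absorbing start nodes contribute 0.
    return sum(Fraction(p) * t[index[q]] for q, p in s.items() if q in index)

path = {0: [1], 1: [0, 2], 2: [1]}             # path graph 0 - 1 - 2
start = {0: Fraction(1, 2), 2: Fraction(1, 2)}
middle = absorbing_centrality(path, start, Fraction(1, 2), {1})
```

On this three-node path, making the middle node absorbing gives a smaller expected absorption time than making an end node absorbing, which matches the intuition that the middle node is more central with respect to the two query nodes.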
6. problem
input: graph, query nodes, α & candidate nodes (e.g. query nodes or all nodes)
select k candidate nodes with minimum absorption time
7. select nodes that are central w.r.t. the query nodes, k = 1
8. select nodes that are central w.r.t. the query nodes, k = 3
9. select nodes that are central w.r.t. the query nodes, k ≥ |Q|
12. approximation
centrality measure
monotonicity: absorption time decreases with more absorbing nodes
supermodularity: diminishing returns
13. approximation
centrality gain
mc: min centrality for k=1
gain = mc - centrality, k>1
non-negative, non-decreasing, submodular
(diagram labels: S, S ∪ {u}, S ∪ {u,v})
greedy algorithm: (1-1/e)-approximation guarantee for gain
14. greedy
S = empty
for i = 1..k
  for u in V - S
    calculate centrality of S ∪ {u} (*)
  update S := S ∪ {best u}
bottleneck is in line (*): one matrix inversion (super-quadratic) for step (*)
use sherman-morrison to perform (*) in O(n^2) with one (1) inversion for first node
however... still O(kn^3)
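The greedy loop on this slide can be sketched as follows. This is a plain illustration, not the authors' code: instead of a matrix inversion or the Sherman-Morrison update, it estimates each absorption time by the fixed-point iteration t ← 1 + P t, which is a swapped-in technique chosen only to keep the example short. All names, the graph format and the iteration count are assumptions.

```python
def absorption_time(adj, s, alpha, absorbing, iters=2000):
    """Expected steps to absorption, approximated by iterating t <- 1 + P t.

    Assumes `absorbing` is non-empty and reachable, otherwise the iteration
    does not converge. Restarts are assumed to jump to a query node drawn from s.
    """
    t = {u: 0.0 for u in adj}
    for _ in range(iters):
        restart = sum(p * t[q] for q, p in s.items())
        t = {
            u: 0.0 if u in absorbing else
            1.0 + alpha * restart
            + (1 - alpha) * sum(t[v] for v in adj[u]) / len(adj[u])
            for u in adj
        }
    return sum(p * t[q] for q, p in s.items())

def greedy(adj, s, alpha, k, candidates=None):
    """Pick k nodes one at a time, each time minimising the absorption time."""
    candidates = set(adj) if candidates is None else set(candidates)
    chosen = set()
    for _ in range(k):
        best = min(
            sorted(candidates - chosen),   # sorted for deterministic tie-breaking
            # line (*) of the slide: one centrality evaluation per candidate
            key=lambda u: absorption_time(adj, s, alpha, chosen | {u}),
        )
        chosen.add(best)
    return chosen

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # path 0-1-2-3-4
s = {0: 0.5, 4: 0.5}                                        # query nodes 0 and 4
picked = greedy(path, s, 0.15, 2)
```

Each greedy round evaluates every remaining candidate, which is why the slide identifies line (*) as the bottleneck.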
15. heuristics & baselines
• personalized pagerank with same α
• spectralQ & spectralD
– spectral embedding of nodes
– k-means on embedding of query nodes
– select candidates close to centers
– spectralQ selects more nodes from larger clusters
• spectralC
– similar to spectralQ but clustering on candidates
• degree & distance centrality
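As an illustration of the first baseline, personalized PageRank with restart probability α can be computed by power iteration. The sketch below is a generic version under stated assumptions: restarts follow the query distribution s, and the baseline is taken to select the k highest-scoring nodes, which the slide does not spell out. Function names and the graph format are hypothetical.

```python
def personalized_pagerank(adj, s, alpha, iters=200):
    """Power iteration for PageRank with restart distribution s.

    adj: dict node -> list of neighbours (assumed: no isolated nodes)
    s: dict node -> restart probability (assumed to sum to 1)
    """
    rank = {u: s.get(u, 0.0) for u in adj}
    for _ in range(iters):
        nxt = {u: alpha * s.get(u, 0.0) for u in adj}   # restart mass
        for u, mass in rank.items():
            share = (1 - alpha) * mass / len(adj[u])    # spread along edges
            for v in adj[u]:
                nxt[v] += share
        rank = nxt
    return rank

def top_k(rank, k):
    """The k highest-ranked nodes, ties broken by node label."""
    return sorted(rank, key=lambda u: (-rank[u], u))[:k]

path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}   # path 0-1-2-3-4
s = {0: 0.5, 4: 0.5}                                        # query nodes 0 and 4
rank = personalized_pagerank(path, s, 0.15)
selected = top_k(rank, 2)
```

Each iteration costs time proportional to the number of edges, which is what makes this baseline far cheaper than the greedy algorithm.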
16. evaluation
TABLE I: Dataset statistics
small
Dataset         |V|       |E|
karate          34        78
dolphins        62        159
lesmis          77        254
adjnoun         112       425
football        115       613
large
Dataset         |V|       |E|
kddCoauthors    2 891     2 891
livejournal     3 645     4 141
ca-GrQc         5 242     14 496
ca-HepTh        9 877     25 998
roadnet         10 199    13 932
oregon-1        11 174    23 409
Degree and distance centrality. Finally, we consider the standard degree and distance centrality measures. Degree returns the k highest-degree nodes. Note that this baseline is oblivious to query nodes Q.
data: cannot run greedy on these
17. input
graphs from previous datasets
query nodes: planted spheres
k spheres (k = 1), radius ρ (ρ = 2 or 3), s special nodes inside spheres (s = 10 or 20)
α = 0.15
24. conclusions
complex problem, greedy algorithm is expensive
personalized pagerank is good alternative
future work: expansion strategies, comparison with more alternatives, parallel implementation
26. where the inequality comes from the fact that a path in $G_X$ passing from $Z$ and being absorbed by $X$ corresponds to a shorter path in $G_Y$ being absorbed by $Y$.
B. Proposition 5
Proposition. Let $C_{i-1}$ be a set of $i-1$ absorbing nodes, $P_{i-1}$ the corresponding transition matrix, and $F_{i-1} = (I - P_{i-1})^{-1}$. Let $C_i = C_{i-1} \cup \{u\}$. Given $F_{i-1}$, the centrality score $ac_Q(C_i)$ can be computed in time $O(n^2)$.
The proof makes use of the following lemma.
Lemma 1 (Sherman-Morrison Formula [7]). Let $M$ be a square $n \times n$ invertible matrix and $M^{-1}$ its inverse. Moreover, let $a$ and $b$ be any two column vectors of size $n$. Then, the following equation holds:
$$(M + ab^T)^{-1} = M^{-1} - \frac{M^{-1} a b^T M^{-1}}{1 + b^T M^{-1} a}.$$
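Lemma 1 can be checked numerically. The sketch below compares the rank-one update with a direct inversion, using exact fractions; the helper names are hypothetical and the matrices are arbitrary small examples, not taken from the paper.

```python
from fractions import Fraction

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination (exact arithmetic)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [x / aug[col][col] for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def sherman_morrison(M_inv, a, b):
    """Inverse of (M + a b^T) computed from M^{-1} in O(n^2) operations."""
    n = len(a)
    Ma = [sum(M_inv[r][c] * a[c] for c in range(n)) for r in range(n)]  # M^{-1} a
    bM = [sum(b[r] * M_inv[r][c] for r in range(n)) for c in range(n)]  # b^T M^{-1}
    denom = 1 + sum(b[r] * Ma[r] for r in range(n))                     # 1 + b^T M^{-1} a
    if denom == 0:
        raise ZeroDivisionError("M + a b^T is singular")
    return [[M_inv[r][c] - Ma[r] * bM[c] / denom for c in range(n)]
            for r in range(n)]

M = [[4, 1, 0], [1, 3, 1], [0, 1, 2]]
a = [1, 0, 2]
b = [0, 1, 1]
updated = sherman_morrison(inverse(M), a, b)
# Direct inversion of M + a b^T for comparison.
direct = inverse([[M[r][c] + a[r] * b[c] for c in range(3)] for r in range(3)])
```

The update needs only matrix-vector products and one outer product, which is where the O(n^2) cost comes from, whereas the direct inversion is cubic in this implementation.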
By a direct application of Lemma 1, it is easy to see that we can compute $F_i$ from $F_{i-1}$ with the following formula, at a cost of $O(n^2)$ operations:
$$F_i = F_{i-1} - \frac{(F_{i-1} a)(b^T F_{i-1})}{1 + b^T (F_{i-1} a)}.$$
We have thus shown that, given $F_{i-1}$, we can compute $F_i$, and therefore $ac_Q(C_i)$ as well, in $O(n^2)$.
C. Proposition 6
Proposition. Let $C$ be a set of absorbing nodes, $P$ the corresponding transition matrix, and $F = (I - P)^{-1}$. Let $C' = C - \{v\} \cup \{u\}$, for $u, v \in C$. Given $F$, the centrality score $ac_Q(C')$ can be computed in time $O(n^2)$.
Proof: The proof is similar to the proof of Proposition 5.
Proof: (Proposition 5) Without loss of generality, let the set of absorbing nodes be $C_{i-1} = \{1, 2, \ldots, i-1\}$. For simplicity, assume no self-loops for non-absorbing nodes, i.e., $P_{u,u} = 0$ for $u \in V - C_{i-1}$. As in Section VI, the expected number of steps before absorption is given by the formulas
$$ac_Q(C_{i-1}) = s_Q^T F_{i-1} \mathbf{1},$$
with $F_{i-1} = A_{i-1}^{-1}$ and $A_{i-1} = I - P_{i-1}$.
We proceed to show how to increase the set of absorbing nodes by one and calculate the new absorption time by updating $F_{i-1}$ in $O(n^2)$. Without loss of generality, suppose we add node $i$ to the absorbing nodes $C_{i-1}$, so that
$$C_i = C_{i-1} \cup \{i\} = \{1, 2, \ldots, i-1, i\}.$$
Let $P_i$ be the transition matrix over $G$ with absorbing nodes
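Proof (Proposition 6; the source text is cut off at the right margin, gaps are marked [...]): Again we assume no self-loops [...], i.e., $P_{uu} = 0$ for $u \in V$ [...] two sets of absorbing nodes $C = \{$ [...] and $C' = \{$ [...]. Let $P'$ be the transition [...] absorbing centrality for [...] $C'$ is expressed as a function [...] $F = A^{-1}$, $F' = A'$ [...]. Notice that $A' - A = (I$ [...], a matrix whose nonzero rows contain the entries $p_{i,1}, \ldots$ and $p_{i+1,0}, \ldots$ [...], where $p_{i,j}$ denotes the [...] node $j$ in a transition matrix [...]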