My slides from my 3-hour tutorial on mesoscale structures in networks from the 2016 Lake Como School on Complex Networks (http://ntmb.lakecomoschool.org/).
After my talk, Tiago Peixoto gave a talk on statistical inference of large-scale mesoscale structures in networks. His presentation, which takes a perspective complementary to mine, is available at the following website: https://speakerdeck.com/count0/statisical-inference-of-generative-network-models
In these slides, I explain an important design pattern: the singleton pattern.
The slides include everything required about it, from definition to implementation, as well as different ways to achieve it depending on the situation and requirements.
This is a presentation I gave in a workshop on "Language, concepts, history" organized by historian Joanna Innes. It took place on Friday 4/22/16 in Somerville College, Oxford.
I was one of the only people present who was not from the humanities, so it was a rather different-than-usual audience and set of participants for me.
I drew some of these slides from other presentations to rather different audiences. I emphasized rather different parts of some of those slides, so I am not sure if the slides on their own give an accurate reflection of the difference between this presentation and some of my other ones.
I thought the presentation went rather well.
Networks in Space: Granular Force Networks and Beyond, by Mason Porter
This is my talk for the Network Geometry Workshop (http://ginestra-bianconi-6flt.squarespace.com) at QMUL on 16 July 2015.
(A few of the slides are adapted from slides by my coauthors Dani Bassett and Karen Daniels.)
Enhancing Our Capacity for Large Health Dataset Analysis, by CTSI at UCSF
Overview of UCSF-CTSI Comparative Effectiveness Large Dataset Analysis Core, which offers resources for the analysis of large, public data sets on health and health care.
JALA Editor-in-Chief Edward Kai-Hua Chow, Ph.D., of National University of Singapore shared step-by-step advice on how to design and write scientific research papers more clearly and effectively to improve their chances for successful publication at the recently held conference in Washington, DC. Learn what editors want, what they don't want and how reviewers evaluate manuscripts by reviewing slides from the session.
JALA Deputy Editor-in-Chief Edward Chow, Ph.D., University of Singapore, offers instruction for central message design, journal selection and proper manuscript composition. Originality, citations and the peer review process also are covered. This presentation is from the popular “JALA & JBS Author Workshop: How to Get Your Work Published,” SLAS2014 in San Diego.
David A. Weil, Ph.D, senior applications scientist with Agilent Technologies, presented "Identification of Potential Bioactive Leachables and Extractables from Plastic Lab Ware by using GC and LC Separation Methods linked with MS Detection."
Dr. Praveen Balimane, senior staff fellow, Division of Clinical Pharmacology-1 at OCP/OTS/CDER/FDA, spoke during the Society for Laboratory Automation and Screening ADMET Special Interest Group Meeting on “Transporter Evaluation in Drug Development.”
Transporters, like CYPs, are being recognized as proteins that can play a pivotal role in dictating the ADME properties of drugs. A thorough understanding of potential roles of transporters in drug interactions and toxicity is important in drug development. The talk provided a high level overview of various transporter evaluation initiatives at the agency. Some of the topics discussed:
• On-going efforts on decision trees within the DDI guidance
• Novel emerging transporters impacting ADME
• Inter-play of hepatic transporters and liver-toxicity
• Inter-play of renal transporters and renal function
A presentation on mashing up Twitter Annotations with the Semantic Web. June 24, 2010 at the Semantic Technology Conference, San Francisco (SemTech 2010).
Deploying Automated Workstreams and Computational Approaches for Generation of Toxicity Data Used for Hazard Identification, by Robert T. Dunn, II, Ph.D., DABT
Quick introduction to community detection.
Structural properties of real world networks, definition of "communities", fundamental techniques and evaluation measures.
This is my attempt at an introduction to data ethics for mathematicians. Mathematicians increasingly need to deal with these kinds of issues, but we don't have the tradition of ethics training from other disciplines.
I welcome comments on how to improve these slides. Did I miss any salient points? Do you want to offer a different perspective on any of these? Do you want to offer any counterpoints? (Please e-mail me directly with comments and suggestions.)
Eventually, I hope to develop these slides further into an article for a venue aimed at mathematical scientists, and of course I would love to have knowledgeable coauthors who can offer a different perspective from mine.
By giving digital proximity to organizations with a potential common purpose, companies can leapfrog the natural limitations of physical industry clusters.
Centrality in Time-Dependent Networks, by Mason Porter
My slides for my keynote talk at the NetSci 2018 (#NetSci2018) conference in Paris, France (June 2018). This talk will take place on Thursday 13 June in the morning.
The Complexity of Data: Computer Simulation and “Everyday” Social Science, by Edmund Chattoe-Brown
Although the existence of various forms of complexity in social systems is now widely recognised, this approach to explanation faces two major challenges that turn out to be intimately connected. The first is the existing conflict in social science between “micro” and “macro” styles of social explanation. The second is the relationship of complexity to the kind of data routinely collected in social science. In order to be accepted, complexity approaches need simultaneously to dodge the first conflict while making much better use of existing forms of data.
The first part of the talk will provide an introduction to the simulation approach and a discussion of various concepts in complexity with reference to simulation as a distinctive theory-building tool and methodology. The second part of the talk will develop these ideas in more depth using simulations by the author as case studies.
2010 june - personal democracy forum - marc smith - mapping political socia..., by Marc Smith
Marc Smith's presentation to the Personal Democracy Forum 2010 in New York City on June 4th, 2010 about the use of NodeXL, a social media network analysis tool, to map political topics in services like Twitter.
NodeXL is available from http://nodexl.codeplex.com
1. Basics of Social Networks
2. Real-world problem
3. How to construct a graph from a real-world problem?
4. What graph-theory problem do we get from the real-world problem?
5. Graph type of Social Networks
6. Special properties in social graph
7. How to find communities and groups in social networks? (Algorithms)
8. How to interpret graph solution back to real-world problem?
These slides are for my talk for the Somerville College Mathematics Reunion ("Somerville Maths Reunion", 6/24/17): http://www.some.ox.ac.uk/event/somerville-maths-reunion/
[DSC Adria 23] Marija Mitrovic Dankulov Complex networks and data science eff..., by DataScienceConferenc1
Complex systems are everywhere. While they differ in type, dynamics, and function, a common feature is that they consist of many interacting units with a non-trivial interaction network structure. The abundance of various data has driven a field focusing on a quantitative study of the structure of interaction networks in complex systems, complex networks theory. In this talk, we will show how we can use a combined approach from complex networks theory and data science to tackle complex phenomena at a large scale in real socio-economic systems.
Opinion Dynamics on Generalized Networks, by Mason Porter
This is a talk on opinion dynamics (especially bounded-confidence models) on generalized networks.
It is part of the MIX-NEXT III (Multiscale & Integrative compleX Networks: EXperiments & Theories) satellite at NetSci 2022.
(Thursday 14 July 2022)
Mathematical Models of the Spread of Diseases, Opinions, Information, and Mis..., by Mason Porter
This is my general-audience talk at DisCon III (the 2021 WorldCon).
My talk overlapped with the Hugo Award ceremony, but the video will be posted later on the DisCon website for attendees who want to see it.
This is a colloquium that I presented on 4/22/21: Stockholm University, Nordic Institute for Theoretical Physics (NORDITA), WINQ–AlbaNova Colloquium
Here is a video of my talk: http://video.albanova.se/ALBANOVA20210422/video.mp4
Introduction to Topological Data Analysis, by Mason Porter
Here are slides for my 3/14/21 talk on an introduction to topological data analysis.
This is the first talk in our Short Course on topological data analysis at the 2021 American Physical Society (APS) March Meeting: https://march.aps.org/program/dsoft/gsnp-short-course-introduction-to-topological-data-analysis/
Topological Data Analysis of Complex Spatial Systems, by Mason Porter
These are slides from a seminar I gave in "Cardiff" (for the mathematics department at University of Cardiff) on 4/15/20.
You can also find a recording of a similar talk that I gave in March 2020 for MBI (Mathematical Biosciences Institute): https://mbi.osu.edu/events/online-colloquium-mason-porter-spatial-systems-and-topological-data-analysis
Here are my slides (though the animated gifs on a couple of them are stills in this version) of my talk on an introduction to the science of "chaos" at WorldCon 77 in Dublin, Ireland.
This is my attempt to give a gentle introduction to the notion of chaos to a science-fiction audience.
Paper Writing in Applied Mathematics (slightly updated slides), by Mason Porter
Here are my slides (which I have updated very slightly) on writing papers in applied mathematics.
There will be an accompanying oral presentation and discussion on Friday 20 April. I am recording the video for that and plan to post it along with these (or a further updated version of these) slides.
Tutorial on Paper-Writing in Applied Mathematics (Preliminary Draft of Slides), by Mason Porter
These are preliminary slides for a tutorial and discussion on "Writing Papers in Applied Mathematics" that I'll be giving at UCLA, first for a few of my own PhD students on 4/6 and later (on 4/20 ?) in a recorded session to a larger UCLA group.
Several people have expressed interest, so I will post the recorded session online and circulate it.
My talk at the 2017 SIAM "Snowbird" conference on applications of dynamical systems (#SIAMDS17).
I spoke in a session on topological data analysis (TDA). My talk concerned persistent homology and its application to Brexit data (including voting data) and "functional networks" from coupled time series from both experiments and output of dynamical systems.
Eventually, a version of these slides that is synchronized with the audio of my talk is supposed to be posted online.
These are slides for my tutorial talk on network dynamics. (The colors are fine in the downloaded version, though there seem to be color issues if you view the slides directly in slideshare.)
Slides from my talk on a systems-level investigation of long-term human migration in Korea. Our paper is available at the following page: http://journals.aps.org/prx/abstract/10.1103/PhysRevX.4.041009
I adapted these slides from the ones created by my coauthor Sang Hoon Lee.
These are the slides for a tutorial talk about "multilayer networks" that I gave at NetSci 2014.
I walk people through a review article that I wrote with my PLEXMATH collaborators: http://comnet.oxfordjournals.org/content/2/3/203
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt..., by Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
Multi-source connectivity as the driver of solar wind variability in the heli..., by Sérgio Sacani
The ambient solar wind that fills the heliosphere originates from multiple
sources in the solar corona and is highly structured. It is often described
as high-speed, relatively homogeneous, plasma streams from coronal
holes and slow-speed, highly variable, streams whose source regions are
under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify
solar wind sources and understand what drives the complexity seen in the
heliosphere. By combining magnetic field modelling and spectroscopic
techniques with high-resolution observations and measurements, we show
that the solar wind variability detected in situ by Solar Orbiter in March
2022 is driven by spatio-temporal changes in the magnetic connectivity to
multiple sources in the solar atmosphere. The magnetic field footpoints
connected to the spacecraft moved from the boundaries of a coronal hole
to one active region (12961) and then across to another region (12957). This
is reflected in the in situ measurements, which show the transition from fast
to highly Alfvénic then to slow solar wind that is disrupted by the arrival of
a coronal mass ejection. Our results describe solar wind variability at 0.5 au
but are applicable to near-Earth observatories.
The increased availability of biomedical data, particularly in the public domain, offers the opportunity to better understand human health and to develop effective therapeutics for a wide range of unmet medical needs. However, data scientists remain stymied by the fact that data remain hard to find and to productively reuse because data and their metadata i) are wholly inaccessible, ii) are in non-standard or incompatible representations, iii) do not conform to community standards, and iv) have unclear or highly restricted terms and conditions that preclude legitimate reuse. These limitations require a rethink of how data can be made machine- and AI-ready, which is the key motivation behind the FAIR Guiding Principles. Concurrently, while recent efforts have explored the use of deep learning to fuse disparate data into predictive models for a wide range of biomedical applications, these models often fail even when the correct answer is already known, and fail to explain individual predictions in terms that data scientists can appreciate. These limitations suggest that new methods to produce practical artificial intelligence are still needed.
In this talk, I will discuss our work in (1) building an integrative knowledge infrastructure to prepare FAIR and "AI-ready" data and services along with (2) neurosymbolic AI methods to improve the quality of predictions and to generate plausible explanations. Attention is given to standards, platforms, and methods to wrangle knowledge into simple, but effective semantic and latent representations, and to make these available through standards-compliant and discoverable interfaces that can be used in model building, validation, and explanation. Our work, and that of others in the field, creates a baseline for building trustworthy and easy-to-deploy AI models in biomedicine.
Bio
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research explores socio-technological approaches for responsible discovery science, which includes collaborative multi-modal knowledge graphs, privacy-preserving distributed data mining, and AI methods for drug discovery and personalized medicine. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon Europe, the European Open Science Cloud, the US National Institutes of Health, and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ..., by Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest
imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters
spanning 0.4−0.9µm) and novel JWST images with 14 filters spanning 0.8−5µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data
at > 2.3µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and
30.3-31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric
redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts
z = 11.5 − 15. These objects show compact half-light radii of $R_{1/2} \sim 50$–$200$ pc, stellar masses of
$M_\star \sim 10^{7}$–$10^{8}\,M_\odot$, and star-formation rates of SFR $\sim 0.1$–$1\,M_\odot\,\mathrm{yr}^{-1}$. Our search finds no candidates
at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to
infer the properties of the evolving luminosity function without binning in redshift or luminosity that
marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the
impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results,
and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5
from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical
models for evolution of the dark matter halo mass function.
Cancer cell metabolism: special Reference to Lactate Pathway, by AADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells use energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks each glucose molecule into two smaller molecules of a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to “burn” the pyruvate made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis, Krebs cycle, oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
In Cancer Cells:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the last step, oxidative phosphorylation.
This results in only 2 molecules of ATP per glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
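As a rough worked figure, using only the approximate yields quoted above (2 ATP per glucose from glycolysis alone, about 36 from complete respiration), the number of glucose molecules a glycolysis-only cell would need to match the ATP from one fully respired glucose is:

```latex
\[
\frac{36\ \text{ATP per glucose (complete respiration)}}
     {2\ \text{ATP per glucose (glycolysis only)}} = 18
\]
```

Under these figures, roughly 18 times as much glucose is needed for the same ATP output. This is an idealized ratio that ignores differences in reaction rates.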
Introduction to the Warburg Phenomenon:
Warburg effect: Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme."
Warburg effect: The tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing by which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes, including worms, insects, mammals, and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993, Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans acting
Binds with target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis acting
Binds with target mRNA in a perfectly complementary sequence
Piwi-RNA
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is a large (>500 kD) multi-protein RNA-binding complex which triggers mRNA degradation in response to mRNA
Unwinding of double-stranded siRNA by ATP independent helicase
The active components of RISC are Ago proteins (endonucleases), which cleave the target mRNA.
DICER: endonuclease (RNase Family III)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of mRNA (RNase H activity).
MiRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
Seminar of U.V. Spectroscopy by SAMIR PANDA
Spectroscopy is a branch of science dealing with the study of the interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflectance spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light absorbed by the analyte.
2. – I do not expect to have time to cover all of my slides!
3. – Introduction and Overview
– Community Structure
– Core–Periphery Structure
– Roles and Positions
– Summary and Conclusions
– Note: I’ll occasionally mention other ideas from the advertised blurb
along the way.
5. – Microscale structures: information centered on nodes, edges, or other
substructures
– Examples: degree of node i, centrality (various types) of node i, centrality (various
types) of edge (i,j), clustering coefficient of node i, etc.
– Macroscale structures: properties of distributions of microscale properties
across all nodes
– Examples: Is the degree distribution a power law? What is the relationship
between degree and local clustering coefficient?
– Mesoscale structures: middle-scale properties
– Examples: cohesive social groups, core versus peripheral banks, functional roles of
nodes in a network, etc.
– Note: Useful to examine distributions of microscale quantities separately within
mesoscale structures
7. – The paradigm, on which many methods have been developed, is that one
finds densely-connected sets of nodes (called “communities”) with
sparse connections between them.
– Important note: Most of these methods will return a community
structure whether or not it is present.
– Exercise: Try methods on Erdős–Rényi random graphs, which have no
inherent structure, and see what results you get.
– My view: We make an assumption when doing this, so there is an “if”
statement in these calculations: If we view a network in this way (or, for
that matter, in another way), what do we see? What, if anything, do we
learn in an application by doing this?
– “We must be cautious.” (Obi Wan Kenobi, Star Wars)
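The exercise above can be tried with a small sketch like the following. It is only an illustration under stated assumptions: the graph size, edge probability, and the simple greedy node-moving heuristic are choices made here, not something taken from the slides, and the Newman–Girvan form of modularity is assumed. The point is that the heuristic still returns a partition with positive modularity on a graph that has no planted structure.

```python
import random

def erdos_renyi(n, p, rng):
    """Sample G(n, p): each possible edge appears independently with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def modularity(edges, labels):
    """Newman-Girvan modularity: sum over communities of L_c/m - (d_c/(2m))^2."""
    m = len(edges)
    if m == 0:
        return 0.0
    internal, degree_sum = {}, {}
    for i, j in edges:
        degree_sum[labels[i]] = degree_sum.get(labels[i], 0) + 1
        degree_sum[labels[j]] = degree_sum.get(labels[j], 0) + 1
        if labels[i] == labels[j]:
            internal[labels[i]] = internal.get(labels[i], 0) + 1
    return sum(internal.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree_sum.items())

def greedy_local_moves(n, edges, max_passes=20):
    """Move one node at a time to a neighbor's community if that strictly raises Q."""
    neighbors = {i: set() for i in range(n)}
    for i, j in edges:
        neighbors[i].add(j)
        neighbors[j].add(i)
    labels = list(range(n))  # start with every node in its own community
    best_q = modularity(edges, labels)
    for _ in range(max_passes):
        improved = False
        for i in range(n):
            original = labels[i]
            best_label = original
            for c in sorted({labels[j] for j in neighbors[i]} - {original}):
                labels[i] = c
                q = modularity(edges, labels)
                if q > best_q + 1e-12:
                    best_q, best_label = q, c
            labels[i] = best_label
            if best_label != original:
                improved = True
        if not improved:
            break
    return labels, best_q

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 40
edges = erdos_renyi(n, 0.1, rng)
q_start = modularity(edges, list(range(n)))
labels, q_final = greedy_local_moves(n, edges)
print("communities found:", len(set(labels)), "modularity:", round(q_final, 3))
```

Any communities reported here are artifacts of the optimization, which is the reason for the caution on this slide.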
8. – Sometimes it can be, but that
intuition is extremely naïve, and
good low-dimensional structures
are often (typically?) too much to
expect.
– “Community detection: You will
never find a more wretched hive of
scum and villainy. We must be
cautious.”
– Inspired by the full quote from Obi
Wan Kenobi
– Figure: Jeub et al., Phys. Rev. E,
2015
9. – Other structures besides assortative structures: different types of block structures
– Bipartite, core–periphery structure, etc.
– Block Models
– Roles and Positions
– Nodes that are “similar” (or, more strongly, the same) in some way, but they don’t have to be
part of the same densely-connected set
– Example: Given only the network structure at a university, who is a professor, who is a postdoc,
who is a grad student, who is an undergrad, and who is staff? Perhaps the network structure
near a mathematics graduate student looks similar to that near a physics graduate student?
– A different type of block model
– Stochastic Block Models
– Statistically principled approach
– See the presentation by Tiago Peixoto
10. This is the “traditional” (assortative) type of mesoscale
structure to study in networks (in the network-science
community). There is a very large body of work on it.
11. – Survey article: MAP, J.-P. Onnela & P. J. Mucha [2009], “Communities in
Networks”, Notices of the American Mathematical Society 56:1082–
1097, 1164–1166
– Review article: S. Fortunato [2010], “Community Detection in Graphs”,
Physics Reports 486:75–174
– Important: These articles are out of date in several respects. There have
been significant developments since they were published. We need new
reviews.
12. – 1. Much more emphasis on statistical inference and statistically principled
methods. Significant development of these methods.
– Tiago’s presentation
– 2. Development of some methods for generalized situations (e.g., spatial
networks, temporal networks, multilayer networks)
– Introduction to a few of the available ideas towards the end of my presentation
– 3. Validation of results (e.g., with “ground truth”) of methods applied to
empirical studies?
– More than there used to be, but there is still much more to do here. It will happen.
– Note: not just development of benchmarks
– Use of results of clustering method to do something
– Still much less focus here than on methods to cluster data in the first place
14. – “Hard/rigid” versus “soft/fuzzy/overlapping”
clustering
– A community should describe a “cohesive group”
of nodes
– Tons of methods available
– Usual notion: more intra-community edges than
one would expect at random
– But what does “at random” mean?
15. – Has a “low-dimensional” assortative (block diagonal) structure that has
unduly influenced our intuition of what we should see. (Real life is
usually more complicated.)
– We’re making a big assumption.
– Assortative structure
Puck Rombach
CENSORED!
17. • Popular approach: Use a “modularity” quality
function
• GOAL: Assign nodes to communities to maximize Q. (Use
some computational heuristic.)
18. • Cannot guarantee optimal quality without full
enumeration of possible partitions
– NP-hard problem
– Many algorithms available (spectral, Louvain, etc.)
– Need to pick null model appropriate to problem
– Extreme near-degeneracies in “good” local optima of Q
• (B. H. Good, Y.-A. de Montjoye, & A. Clauset, PRE 81:046106, 2010)
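The formula for Q was on the slide as an image and is not in this transcript. As a minimal sketch, assuming the standard Newman–Girvan null model with a resolution parameter γ, one can evaluate Q = (1/2m) Σ_ij (A_ij − γ k_i k_j / 2m) δ(c_i, c_j) for a given assignment. The function name and the toy graph (two triangles joined by one edge) are illustrative choices; maximizing Q over assignments is the hard part and is not attempted here.

```python
def modularity(adjacency, labels, gamma=1.0):
    """Evaluate modularity for an undirected graph given as a symmetric matrix.

    Assumes the Newman-Girvan null model k_i * k_j / (2m), scaled by gamma.
    """
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    two_m = sum(degrees)  # twice the total edge weight
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                q += adjacency[i][j] - gamma * degrees[i] * degrees[j] / two_m
    return q / two_m

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in toy_edges:
    A[i][j] = A[j][i] = 1

q_two_triangles = modularity(A, [0, 0, 0, 1, 1, 1])
q_one_group = modularity(A, [0, 0, 0, 0, 0, 0])
print(q_two_triangles, q_one_group)
```

Raising `gamma` penalizes large communities more strongly, which is how the resolution parameter mentioned on the null-model slide enters the calculation.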
19.
20. • Erdős–Rényi (Bernoulli)
• Newman–Girvan*
• Arenas et al.*, Leicht–Newman* (directed)
• Barber* (bipartite)
• With additional resolution parameter γ
• To try to take “resolution limit” into account, although there are still some issues
• Examine multiple resolutions of assortative structure
21. – Directly from consideration of assortative structure: counting edges within
communities versus edges between them
– Potts Hamiltonian with a particular choice of interaction energy
– From random walks (Laplacian dynamics) on networks
– For some null models
– R. Lambiotte, J.-C. Delvenne, & M. Barahona, arXiv:0812.1770 (now published,
with updates, in TNSE, 2015)
– I like this derivation, because it provides a direct connection between community
structure and dynamical systems on networks. It suggests that one can think
about community structure based not only on network adjacencies per se but also
on a dynamical process of interest, such that one seeks bottlenecks in a network
to such dynamics starting from an initial (seed) set of nodes.
– This idea provides a way to get to local community structure and overlapping communities.
It also leads to direct connections with spectral and expander properties of graphs.
22. – Nodes = individuals
– Edges = self-identified friendships (1 or 0)
– The data (“Facebook100”)
– 100 different universities (full networks)
– Single-time snapshot: September 2005 (Facebook was university-only)
– Self-reported demographics: Gender, class year, high school, major,
dormitory/“House”
– Provided by Adam D’Angelo and Facebook
– We consider 4 types of networks for each school.
– Largest connected component (LCC); “Full”
– Student-only subset of LCC; “Student”
– Female-only subset of LCC; “Female”
– Male-only subset of LCC; “Male”
23. – Full networks (single university, largest connected component)
24.
25.
26.
27. – Related to other set distances, but applied to node pairs
– $w_{11}$ = # node pairs put in the same group in the 1st partition and also in the same
group in the 2nd partition
– $w_{10}$ = # node pairs put in the same group in the 1st partition but in different
groups in the 2nd partition
– $w_{01}$ and $w_{00}$ defined analogously
– $M$ = total number of node pairs = $\sum_{i,j} w_{ij}$
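The pair counts defined above can be computed directly by looping over node pairs. The sketch below is illustrative: the function name and the two example partitions are made up here, and the Rand and Jaccard indices are shown in their standard pair-counting forms as examples of what one builds from these counts.

```python
from itertools import combinations

def pair_counts(first, second):
    """Return (w11, w10, w01, w00) for two partitions given as label lists."""
    w11 = w10 = w01 = w00 = 0
    for i, j in combinations(range(len(first)), 2):
        same_first = first[i] == first[j]
        same_second = second[i] == second[j]
        if same_first and same_second:
            w11 += 1
        elif same_first:
            w10 += 1
        elif same_second:
            w01 += 1
        else:
            w00 += 1
    return w11, w10, w01, w00

# Two hypothetical partitions of six nodes.
partition_a = [0, 0, 0, 1, 1, 1]
partition_b = [0, 0, 1, 1, 2, 2]

w11, w10, w01, w00 = pair_counts(partition_a, partition_b)
M = w11 + w10 + w01 + w00            # total number of node pairs
rand_index = (w11 + w00) / M         # fraction of pairs on which the partitions agree
jaccard_index = w11 / (w11 + w10 + w01)
print(w11, w10, w01, w00, rand_index, jaccard_index)
```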
28. 1. Z-scores for the Rand, adjusted Rand, Fowlkes–Mallows, and gamma indices are provably
identical
2. Analytical formulas exist for the above
indices (one needs permutation tests for Jaccard
and Minkowski)
29.
30. The legend gives disk size as a function of the
maximum distance d between the 6 different
partitions
Full
– We visualize social organization using barycentric
coordinates.
– Center around the Year vertex because of the importance
of that category.
– Compute coordinates for each of the 6 partition
methods, and for each institution plot a disk
whose radius is proportional to the maximum
difference between the 6 coordinates.
– Dormitory residence dominates organization at
Rice (31), Caltech (36), and UC Santa Cruz (68).
“Angel, it's not like this is the first time I've
had sex under a mystical influence. I went to
U.C. Santa Cruz.”
Full networks
33. – Greater importance of High School vertex in many Female networks
versus corresponding Full networks
– Residence vertex very important for Males at Michigan and Notre Dame,
in contrast to Full, Student, and Female networks at those institutions.
– Male networks seem to have a larger variation among second-most
important factor (after Year) than the Female networks.
– Suggestive of possibly interesting differences in friendship patterns between
the two genders?
– Relative ordering of Major at a given institution is sometimes gender-
dependent
34. – L. G. S. Jeub, P. Balachandran, MAP, P. J.
Mucha, & M. W. Mahoney [2015], Phys.
Rev. E 91(1):012821
– L. G. S. Jeub, MWM, PJM, & MAP [2015],
arXiv:1510.05185
– Code available at
http://github.com/LJeub/LocalCommunities
THINK LOCALLY, ACT LOCALLY: DETECTION OF . . . PHYSICAL REVIEW E 91, 012821 (2015)
[Figure 6 panels: (a) NCP (conductance versus size) and (b) CRP (conductance ratio versus size), both on logarithmic axes, for CA-GrQc, FB-Johns55, and US-Senate; (c) CA-GrQc; (d) FB-Johns55; (e) US-Senate, with a color scale from 0 to 1.]
FIG. 6. (Color online) NCP plots [in panel (a)] and conductance ratio profile (CRP) plots [in panel (b)] for CA-GRQC, FB-JOHNS55, and
US-SENATE (i.e., the smaller network from each of the three pairs of networks from Table I) generated using the ACLCUT method. In panels
(c)–(e), we show modified Kamada-Kawai [86] spring-embedding visualizations that emphasize community structure [87] of corresponding
(color-coded) communities and their neighborhoods (a 2-neighborhood for CA-GRQC, a 1-neighborhood for FB-JOHNS55, and all Senates
that have at least one senator in common with those in the communities for US-SENATE). We find good small communities but no good large
communities in CA-GRQC, some weak large-scale structure in FB-JOHNS55 that does not create substantial bottlenecks for the random-walk
dynamics, and signatures of low-dimensional structure (i.e., good large communities but no good small communities) for US-SENATE. The
low-dimensional structure in US-SENATE results from the multilayer structure that encapsulates the network’s temporal properties. [The dashed
line in panel (b) indicates a conductance ratio of 1.]
35. – Upper left plot of previous slide: lowest-conductance
community for each
community size (isoperimetric structure)
– Smaller conductance ⇒ better
communities (i.e., more “community-like”)
JEUB, BALACHANDRAN, PORTER, MUCHA, AND MAHONEY PHYSICAL REVIEW E 91, 012821 (2015)
we then discuss our extensions of such ideas. For more details on conductance and NCPs, see Refs. [25,37,67,68]. If $G = (V,E,w)$ is a graph with weighted adjacency matrix $A$, then the “volume” between two sets $S_1$ and $S_2$ of nodes (i.e., $S_i \subset V$) equals the total weight of edges with one end in $S_1$ and one end in $S_2$. That is,

$$\mathrm{vol}(S_1,S_2) = \sum_{i \in S_1} \sum_{j \in S_2} A_{ij} . \quad (1)$$

In this case, the “volume” of a set $S \subset V$ of nodes is

$$\mathrm{vol}(S) = \mathrm{vol}(S,V) = \sum_{i \in S} \sum_{j \in V} A_{ij} . \quad (2)$$

In other words, the set volume equals the total weight of edges that are attached to nodes in the set. The volume $\mathrm{vol}(S,\bar{S})$ between a set $S$ and its complement $\bar{S}$ has a natural interpretation as the “surface area” of the “boundary” between $S$ and $\bar{S}$. In this study, a set $S$ is a hypothesized community. Informally, the conductance of a set $S$ of nodes is the surface area of that hypothesized community divided by “volume” (i.e., size) of that community. From this perspective, studying community structure amounts to an exploration of the isoperimetric structure of $G$.
Somewhat more formally, the conductance of a set of nodes $S \subset V$ is

$$\phi(S) = \frac{\mathrm{vol}(S,\bar{S})}{\min\left(\mathrm{vol}(S),\mathrm{vol}(\bar{S})\right)} . \quad (3)$$

Thus, smaller values of conductance correspond to better communities. The conductance of a graph $G$ is the minimum conductance of any subset of nodes:

$$\phi(G) = \min_{S \subset V} \phi(S) . \quad (4)$$

To gain insight into how to understand an NCP and what it reveals about network structure, consider Fig. 2. In Fig. 2(a), we illustrate three possible ways that an NCP can behave. In each case, we use conductance as a measure of community quality. The three cases are the following ones.
(1) Upward-sloping NCP. In this case, small communities are “better” than large communities.
(2) Flat NCP. In this case, community quality is independent of size. (As illustrated in this figure, the quality tends to be comparably poor for all sizes.)
(3) Downward-sloping NCP. In this case, large communities are better than small communities.
For ease of visualization and computational considerations, we only show NCPs for communities up to half of the size of a network. An NCP for very large communities, which we do not show in figures as a result of this choice, roughly mirrors that for small communities, as the complement of a good small community is a good large community because of the inherent symmetry in conductance [see Eq. (3)].
In Fig. 2(b), we show an NCP of a LIVEJOURNAL network from Ref. [25]. It demonstrates an empirical fact about a large variety of large social and information networks: There exist good small conductance-based communities, but there do not exist any good large conductance-based communities in many such networks. (See Refs. [24–26,37,67,68] for more empirical evidence that large social and information networks tend not to have large communities with low conductances.)
Network Community Profile (NCP)
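To make the definitions of volume, conductance, and the NCP concrete, here is a brute-force sketch on a toy graph. It is an assumption-laden illustration: it enumerates every node subset, which is only feasible for tiny graphs, whereas the paper uses local methods; the function names and the toy graph (two triangles joined by one edge) are invented here; and it assumes no isolated nodes so that volumes are nonzero.

```python
from itertools import combinations

def volume(adjacency, set_a, set_b):
    """Total weight of entries A_ij with i in set_a and j in set_b."""
    return sum(adjacency[i][j] for i in set_a for j in set_b)

def conductance(adjacency, nodes):
    """Cut weight divided by the smaller of the volumes of S and its complement."""
    everyone = set(range(len(adjacency)))
    inside = set(nodes)
    outside = everyone - inside
    cut = volume(adjacency, inside, outside)
    smaller_volume = min(volume(adjacency, inside, everyone),
                         volume(adjacency, outside, everyone))
    return cut / smaller_volume

def brute_force_ncp(adjacency):
    """Lowest conductance over all node sets of each size, up to half the network."""
    n = len(adjacency)
    return {size: min(conductance(adjacency, s)
                      for s in combinations(range(n), size))
            for size in range(1, n // 2 + 1)}

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in toy_edges:
    A[i][j] = A[j][i] = 1

ncp = brute_force_ncp(A)
print(ncp)
```

For this toy graph the profile slopes downward, since the best set of size 3 (one triangle) has much lower conductance than any single node or pair.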
36. – Upper right plot from two slides ago:
ratio of conductance to internal
conductance
– Smaller ratio ⇒ better communities
behave in a roughly similar manner to conductance-based NCPs, whereas measures that capture only one of the two criteria exhibit qualitatively different behavior (typically for rather trivial reasons) [26].
Although the basic NCP that we have been discussing yields numerous insights about both small-scale and large-scale network structure, it also has important limitations. For example, an NCP gives no information on the number or density of communities with different community quality scores. (This contributes to the robustness properties of NCPs with respect to perturbations of a network.) Accordingly, the communities that are revealed by an NCP need not be representative of the majority of communities in a network. However, the extremal features that are revealed by an NCP have important system-level implications for the behavior of dynamical processes on a network: They are responsible for the most severe bottlenecks for associated dynamical processes on networks [72].
Another property that is not revealed by an NCP is the internal structure of communities. Recall from Eq. (3) that the conductance of a community measures how well (relative to its size) it is separated from the remainder of a network, but it does not consider the internal structure of a community (except for size and edge density). In an extreme case, a community with good conductance might even consist of several disjoint pieces. Recent work has addressed how spectral-based approximations to optimizing conductance also approximately optimize measures of internal connectivity [73].
We augment the information from basic NCPs with some additional computations. To obtain an indication of a community’s internal structure, we compute the internal conductance of the communities that form an NCP. The internal conductance $\phi_{\mathrm{in}}(S)$ of a community $S$ is

$$\phi_{\mathrm{in}}(S) = \phi(G|_S) , \quad (6)$$

where $G|_S$ is the subgraph of $G$ induced by the nodes in the community $S$. The internal conductance is equal to the conductance of the best partition into two communities of the network $G|_S$ viewed as a graph in isolation. Because a good community should be well separated from the remainder of a network and also relatively well connected internally, we expect good communities to have low conductance but high internal conductance. We thus compute the conductance ratio

$$\frac{\phi(S)}{\phi_{\mathrm{in}}(S)} \quad (7)$$
Computing the conductance $\phi(G)$ of an arbitrary graph is an intractable problem (in the sense that the associated decision problem is NP-hard [69]), but this quantity can be approximated by the second-smallest eigenvalue $\lambda_2$ of the normalized Laplacian [67,68].
If the “surface area to volume” (i.e., isoperimetric) interpretation captures the notion of a good community as a set of nodes that is connected more densely internally than with the remainder of a network, then computing the solution to Eq. (4) leads to the “best” (in this sense) community of any size in the network.
On the contrary, Fig. 2(c) illustrates a small toy network (a so-called “caveman network”) formed from several small cliques connected by rewiring one edge from each clique to create a ring [70]. As illustrated by the downward-sloping NCP in Fig. 2(d), this network possesses good conductance-based communities, and large communities are better than small ones. One obtains a similar downward-sloping NCP for the Zachary Karate Club network [59] as well as for many other networks for which there exist meaningful visualizations [25]. The wide use of networks that have interpretable visualizations (such as the Zachary Karate Club and planted-partition models [71] with balanced communities) to help develop and evaluate methods for community detection and other procedures can lead to a strong selection bias when evaluating the quality of those methods.
Conductance Ratio Profile (CRP)
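The conductance ratio compares how well a set is separated from the rest of a network with how well it holds together internally. The sketch below computes it by brute force for a toy graph, under the same caveats as any exhaustive approach: it is exponential in the set size, the names and the example graph are illustrative assumptions, and it assumes that the induced subgraph has no isolated nodes.

```python
from itertools import combinations

def conductance(adjacency, nodes):
    """Cut weight divided by the smaller of the volumes of S and its complement."""
    n = len(adjacency)
    inside = set(nodes)
    outside = set(range(n)) - inside
    cut = sum(adjacency[i][j] for i in inside for j in outside)
    vol_in = sum(sum(adjacency[i]) for i in inside)
    vol_out = sum(sum(adjacency[i]) for i in outside)
    return cut / min(vol_in, vol_out)

def graph_conductance(adjacency):
    """Minimum conductance over all nonempty proper subsets (sizes up to n // 2 suffice)."""
    n = len(adjacency)
    return min(conductance(adjacency, s)
               for size in range(1, n // 2 + 1)
               for s in combinations(range(n), size))

def conductance_ratio(adjacency, nodes):
    """Conductance of S in the full graph divided by the conductance of the induced subgraph."""
    nodes = list(nodes)
    induced = [[adjacency[i][j] for j in nodes] for i in nodes]
    return conductance(adjacency, nodes) / graph_conductance(induced)

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in toy_edges:
    A[i][j] = A[j][i] = 1

ratio_triangle = conductance_ratio(A, [0, 1, 2])         # tight, well-separated set
ratio_with_extra = conductance_ratio(A, [0, 1, 2, 3])    # triangle plus a dangling node
print(ratio_triangle, ratio_with_extra)
```

The triangle has a small ratio (low external, high internal conductance), whereas adding node 3 produces a set that splits internally about as easily as it separates from the rest.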
37. low-dimensional space. Spectral clustering or other clustering methods often find meaningful communities in such networks, and one can often readily construct meaningful and interpretable visualizations of network structure.
Core-periphery structure. In Fig. 1(b), we illustrate the case in which α11 ≫ α12 ≫ α22. This is an example of a network with a density-based “core-periphery” structure [24,25,62–64]. There is a core set of nodes that are relatively well connected both among themselves and to a set of peripheral nodes that interact very little among themselves.
Expander or complete graph. In Fig. 1(c), we illustrate the case in which α11 ≈ α12 ≈ α22. This corresponds to a network with little or no discernible structure. For example, if α11 = α12 = α22 = 1, then the graph is a clique (i.e., the complete graph). Alternatively, if the graph is a constant-degree expander, then α11 ≈ α12 ≈ α22 ≪ 1. As discussed in Appendix A, constant-degree expanders yield the metric spaces that embed least well in low-dimensional Euclidean spaces. In terms of the idealized block model in Fig. 1, they look like complete graphs, and partitioning them would not yield network structure that one should expect to construe as meaningful. Informally, they are largely unstructured when viewed at large size scales.
Bipartite structure. In Fig. 1(d), we illustrate the case in which α12 ≫ α11 ≈ α22. This corresponds to a bipartite or nearly bipartite graph. Such networks arise, e.g., when there
[Figure 2 panels: (a) Three possible NCPs; (b) Realistic NCP from [25]; (c) A caveman network; (d) NCP of caveman network (conductance versus size, logarithmic axes).]
FIG. 2. (Color online) Illustration of network community profiles (NCPs) of conductance versus community size. (a) Stylized versions of possible shapes for an NCP: downward-sloping (black, solid curve), upward-sloping (red, dotted curve), and flat (blue, dashed curve). (b) NCP of a LIVEJOURNAL network that illustrates the characteristic upward-sloping NCP that is typical for many large empirical social and information networks [25]. (c) A toy “caveman
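The idealized two-block model described in the excerpt on this slide, with within-block densities α11 and α22 and between-block density α12, can be sampled with a short sketch. Everything specific here is an assumption for illustration: the function names, block sizes, and numeric densities are not from the paper, and edges are drawn independently with the stated probabilities.

```python
import random

def sample_two_block(n1, n2, a11, a12, a22, seed=0):
    """Draw an undirected graph in which the edge probability depends only on block membership."""
    rng = random.Random(seed)
    block = [0] * n1 + [1] * n2
    prob = {(0, 0): a11, (0, 1): a12, (1, 1): a22}
    edges = []
    for i in range(n1 + n2):
        for j in range(i + 1, n1 + n2):
            # block labels are nondecreasing, so (block[i], block[j]) is always a key of prob
            if rng.random() < prob[(block[i], block[j])]:
                edges.append((i, j))
    return edges, block

def block_densities(edges, block, n1, n2):
    """Empirical fraction of possible edges present within and between the two blocks."""
    counts = {(0, 0): 0, (0, 1): 0, (1, 1): 0}
    for i, j in edges:
        counts[(block[i], block[j])] += 1
    possible = {(0, 0): n1 * (n1 - 1) / 2, (0, 1): n1 * n2, (1, 1): n2 * (n2 - 1) / 2}
    return {key: counts[key] / possible[key] for key in counts}

# Hypothetical parameter choices for three of the regimes named in the excerpt.
regimes = {
    "core-periphery": (0.8, 0.3, 0.02),   # a11 >> a12 >> a22
    "expander-like": (0.1, 0.1, 0.1),     # a11 ~ a12 ~ a22
    "bipartite": (0.0, 0.5, 0.0),         # a12 >> a11 ~ a22
}
for name, (a11, a12, a22) in regimes.items():
    edges, block = sample_two_block(30, 30, a11, a12, a22, seed=1)
    print(name, block_densities(edges, block, 30, 30))

complete_edges, _ = sample_two_block(3, 4, 1.0, 1.0, 1.0)
bipartite_edges, bipartite_block = sample_two_block(3, 4, 0.0, 1.0, 0.0)
```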
38. – We examine a few different processes (a community S reflects a
roadblock to the dynamics of a given process).
– Example: Personalized PageRank
– The dynamics is a random walk with teleportation. Look at which nodes
get visited as it unfolds. Sample over different seed nodes. Use
approximate PPR vector in estimation of conductance.
densely connected nodes) [9,32,53,115–118] as well as for finding sets of nodes that are related to each other in other ways [48,54,115,119,120].
In this paper, we build on the idea that random walks and related diffusion-based dynamics, as well as other types of local dynamics (e.g., ones, like geodesic hops, that depend on ideas based on egocentric networks), should get “trapped” in good communities. We examine three dynamical methods for community identification.
1. Dynamics type 1: Local diffusions (the “ACLCUT” method)
In this procedure, we consider a random walk that starts at a given seed node $s$ and runs for some small number of steps. We take advantage of the idea that if a random walk starts inside a good community and takes only a small number of steps, then it should become trapped inside that community. To do this, we use the locally biased PPR procedure of Refs. [121,122]. Recall that a PPR vector is defined implicitly as the solution $\overrightarrow{\mathrm{pr}}(\alpha,\vec{s})$ of the equation

$$\overrightarrow{\mathrm{pr}}(\alpha,\vec{s}) = \alpha D^{-1} A \, \overrightarrow{\mathrm{pr}}(\alpha,\vec{s}) + (1-\alpha)\,\vec{s} , \quad (B1)$$

where $1-\alpha$ is a “teleportation” probability and $\vec{s}$ is a seed vector. From the perspective of random walks, evolution occurs either by the walker moving to a neighbor of the current node or by the walker “teleporting” to a random node (e.g., determined uniformly at random as in the usual PageRank procedure, or to a random node that is biased towards $\vec{s}$ in the PPR procedure). The PPR vector $\overrightarrow{\mathrm{pr}}(\alpha,\vec{s})$ represents the stationary distribution of this random walk. In general, teleportation results in a bias to the random walk, and one usually tries to minimize such a bias when detecting communities. (See Ref. [123] for clever ways to choose $\vec{s}$ with this goal in mind.)
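The idea of a seeded random walk with teleportation, followed by a conductance-based cut, can be sketched as follows. This is not the ACLCUT implementation: it swaps in plain power iteration for the approximate PPR computation, it writes the walk with the column-stochastic matrix $A D^{-1}$ acting on a column vector (one common convention, chosen here so that probability is conserved), and the function names, α = 0.85, and the toy graph are all illustrative assumptions.

```python
def personalized_pagerank(adjacency, seed, alpha=0.85, iterations=200):
    """Iterate p <- alpha * A D^{-1} p + (1 - alpha) * s with all teleportation to the seed."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    s = [0.0] * n
    s[seed] = 1.0
    p = s[:]
    for _ in range(iterations):
        p = [alpha * sum(adjacency[i][j] * p[j] / degrees[j] for j in range(n))
             + (1 - alpha) * s[i]
             for i in range(n)]
    return p

def conductance(adjacency, nodes):
    """Cut weight divided by the smaller of the volumes of S and its complement."""
    inside = set(nodes)
    outside = set(range(len(adjacency))) - inside
    cut = sum(adjacency[i][j] for i in inside for j in outside)
    vol_in = sum(sum(adjacency[i]) for i in inside)
    vol_out = sum(sum(adjacency[i]) for i in outside)
    return cut / min(vol_in, vol_out)

def sweep_cut(adjacency, p):
    """Rank nodes by p_i / degree_i and return the prefix with the lowest conductance."""
    n = len(adjacency)
    degrees = [sum(row) for row in adjacency]
    order = sorted(range(n), key=lambda i: p[i] / degrees[i], reverse=True)
    best_set, best_phi = None, float("inf")
    for size in range(1, n):  # proper subsets only
        phi = conductance(adjacency, order[:size])
        if phi < best_phi:
            best_set, best_phi = set(order[:size]), phi
    return best_set, best_phi

# Toy graph: triangles {0, 1, 2} and {3, 4, 5} joined by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for i, j in toy_edges:
    A[i][j] = A[j][i] = 1

p = personalized_pagerank(A, seed=0)
best_set, best_phi = sweep_cut(A, p)
print(sorted(best_set), best_phi)
```

Starting from node 0, the walk concentrates on the seed's triangle, and the sweep recovers that triangle as the lowest-conductance prefix. Repeating this over many seed nodes and values of α is what produces enough candidate sets to estimate an NCP.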
39. – L. G. S. Jeub, P. Balachandran, MAP, P. J.
Mucha, & M. W. Mahoney [2015], Phys.
Rev. E 91(1):012821
– L. G. S. Jeub, MWM, PJM, & MAP [2015],
arXiv:1510.05185
– Code available at
http://github.com/LJeub/LocalCommunities
40. – You’re making an assumption by saying you are looking for assortative
structures.
– Other structures may be more informative and/or more appropriate.
– Modularity maximization has well-studied issues. They include:
– Numerous near-degeneracies in the optimization landscape (Good et al., 2010)
– Resolution limit (Fortunato & Barthelemy, 2007)
– Statistical inconsistency (Bickel & Chen, 2009)
– Always gives you an answer as output, but is it meaningful?
– Other methods have unknown issues. They haven’t been as well-studied, so
their problems are less well appreciated. Don’t assume that they don’t have
problems.
🤔
41. – Studying community structure can be very insightful—I
spend time doing this, after all!—but one has to use such
tools carefully.
😉
42. Another important type of mesoscale
structure, which is becoming increasingly
popular to study.
43. – P. Csermely, A. London, L.-Y. Wu, & B. Uzzi [2013], “Structure and
Dynamics of Core–Periphery Networks”, Journal of Complex Networks
1:93–123.
– Note: We also included extensive discussion of the background to
studying core–periphery structure in the following article:
– M. P. Rombach, MAP, J. H. Fowler, & P. J. Mucha [2014], “Core–Periphery
Structure in Networks”, SIAM J. App. Math. 74(1):167–190.
45. – Note: It is intuitive that many networks have such structure, but how should one examine it?
– Core versus peripheral countries in international relations (seems to be the origin of the notion),
social networks, core versus peripheral banks, transportation networks, etc.
– Borgatti–Everett (1999):
– Discrete notions: the simpler one is to compare a network to an ideal block model consisting of a fully
connected core and a periphery with no internal connections but fully connected to the core
– Continuous notion: start with the above idea and determine a “core value” for each node
– A subset of other notions of core–periphery structure
– Holme (2005): Defined a core–periphery coefficient in terms of the k-core of a graph
– Da Silva et al. (2008): Defined a core coefficient using closeness centrality and a measure of
shortest paths
– Leskovec et al. (2009): Onions and whiskers
– Leskovec and collaborators (2013): Core regions from overlap of communities
46. – Origin in international relations (political, economic, etc.)
– First-world countries = “core” countries
– Second-world countries = “semi-peripheral” countries
– Third-world countries = “peripheral” countries
– Discrete versus continuous core–periphery structure
– Debates and discussions date back to the early qualitative work several decades ago
– Continuous method gives a centrality measure. One can then obtain a discrete
classification starting from a continuous spread of values.
– Intuition: Peeling an onion
– Remark: “nestedness” in ecology is a bipartite analog of core–periphery
structure (see, e.g., discussion in S. H. Lee, PRE 93:022306, 2016)
47. New York & Erie Railroad, diagram from about 160 years ago
The London
Underground
(“The Tube”)
48.
49. – Given k, remove all nodes of degree k-1 or less. After this, some nodes that previously
had degree k now have degree k-1 or less, so remove those too. Iterate until all nodes
have at least degree k. That is the k-core.
– Good points:
– Very fast algorithm, captures intuition of onion peeling, mathematically tractable (e.g.,
analysis of k-core percolation), probably does a reasonable job of getting high-degree nodes
in the core
– Bad points:
– Low-degree nodes can be core nodes, so there are false negatives (in the most interesting
situation), so it’s not really solving the problem. It’s also “too coarse” in other respects.
– Example: Should all nodes in k-shell be in the same level of the core?
– Example: How deep is the core? The largest k for a given network may not be satisfactorily
deep to study a problem in this way.
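The pruning procedure on this slide can be sketched in a few lines. This is a minimal illustration, not code from the talk: the graph representation (a dict that maps each node to its set of neighbors) and the example graph are assumptions, and the function returns every node that survives the pruning without splitting the result into connected components.

```python
from collections import deque


def k_core(adjacency, k):
    """Return the nodes that survive repeated removal of nodes with degree < k.

    `adjacency` maps each node to the set of its neighbors (undirected graph).
    """
    degree = {node: len(nbrs) for node, nbrs in adjacency.items()}
    removed = set()
    queue = deque(node for node, d in degree.items() if d < k)
    while queue:
        node = queue.popleft()
        if node in removed:
            continue
        removed.add(node)
        # Removing a node lowers the degree of each surviving neighbor.
        for nbr in adjacency[node]:
            if nbr not in removed:
                degree[nbr] -= 1
                if degree[nbr] < k:
                    queue.append(nbr)
    return set(adjacency) - removed


# Hypothetical example: a triangle (0, 1, 2) with a pendant path 2-3-4.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
core_2 = k_core(example, 2)
```

In this example, node 4 is removed first; node 3 then drops to degree 1 and is removed as well, which leaves the triangle as the 2-core.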
50. – Core–periphery coefficient:
– Average over all undirected, unweighted graphs with the same degree sequence
(configuration model)
– P(i,j) = number of edges in shortest path between i and j
– A k-core is a maximal connected subgraph in which all nodes have degree at least k
51. – Aij = element of weighted, undirected adjacency matrix
– Seek a value of ρC that is large compared to the expected
value of ρC obtained if the entries of the vector C are shuffled
– Output = core vector C giving core and periphery nodes
– Continuous notion: node i is assigned a ‘coreness’ value
and Cij = Ci × Cj = a
52. Maximize ρC = Σ_{i,j} Aij Cij,
where Cij = 1 if i or j is in the
core and Cij = 0 otherwise.
Find the best fit to a core–periphery block model.
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
53. Find the best fit to a core–periphery block model.
1 1 1 1 a a a a
1 1 1 1 a a a a
1 1 1 1 a a a a
1 1 1 1 a a a a
a a a a 0 0 0 0
a a a a 0 0 0 0
a a a a 0 0 0 0
a a a a 0 0 0 0
Maximize ρC = Σ_{i,j} Aij Cij,
where Cij = 1 if i and j are in
the core, Cij = a if i xor j is
in the core, and Cij = 0
otherwise.
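The two discrete block-model fits on slides 52 and 53 can be sketched as follows. This is an illustrative sketch under stated assumptions: the quantity being maximized is taken to be ρC = Σ Aij Cij (the sum over the product of the adjacency matrix and the ideal pattern), diagonal entries are skipped, and the brute-force search over cores of a fixed size is only for tiny examples. The function names and the star-graph example are hypothetical.

```python
from itertools import combinations


def ideal_block(i, j, core, a=1.0):
    """Ideal pattern: 1 if both nodes are in the core, a if exactly one is, 0 otherwise."""
    in_core = (i in core) + (j in core)
    if in_core == 2:
        return 1.0
    return a if in_core == 1 else 0.0


def core_quality(A, core, a=1.0):
    """rho_C = sum over i != j of A[i][j] * C_ij for a candidate core set."""
    n = len(A)
    return sum(A[i][j] * ideal_block(i, j, core, a)
               for i in range(n) for j in range(n) if i != j)


def best_core(A, size, a=1.0):
    """Brute-force search over all core sets of a fixed size (tiny graphs only)."""
    n = len(A)
    return max((set(c) for c in combinations(range(n), size)),
               key=lambda core: core_quality(A, core, a))


# Hypothetical example: a star with center 0 and leaves 1, 2, 3.
star = [[0, 1, 1, 1],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0]]
```

With a = 1, this reproduces the pattern on slide 52 (Cij = 1 if i or j is in the core). The core size is fixed in the search because the raw sum can otherwise be inflated by simply enlarging the core; comparing against shuffled core vectors, as on slide 51, is another way to handle this.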
54. Let C be some vector of values between 0 and 1, and
maximize ρC = Σ_{i,j} Aij Ci Cj.
This method does not assume any “shape” of the core–
periphery structure beforehand.
Approach by Rombach et al. [2014] builds on this idea.
55. – Interpolates between continuous and discrete notions of core–periphery structure
– We consider weighted, undirected networks
– Entries of core vector C can take non-negative values (e.g., Cij = Ci × Cj)
– Seek C that is normalized and is a shuffle of the vector C*
whose components specify local core values [N =
total number of nodes] via a transition function.
– Example transition function:
– C is chosen to maximize the core quality R:
– Parameters: α [where 0≤α≤1] sets sharpness of boundary between core and periphery; and β [where 0≤β≤1]
sets the size of the core
– Another transition function:
56. – We obtain a core score for each node by averaging results over
different values of α and β:
– Z = normalization constant; ensures that max[CS(i)] = 1
– It would be interesting to develop more sophisticated procedures
for sampling values of α and β.
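The formulas on slides 55 and 56 are not reproduced in this transcript. The sketch below shows one piecewise-linear transition function of the kind described (sharpness α, core-size parameter β) and a core quality of the form R = Σ Aij Ci Cj. Both forms are written from memory of Rombach et al. [2014] and should be treated as assumptions to check against the paper; the function names are hypothetical, and the optimization over shuffles of C* is not shown.

```python
import math


def transition_values(n, alpha, beta):
    """Assumed local core values C*_1, ..., C*_n for sharpness alpha and size parameter beta.

    The first floor(beta * n) entries rise linearly to (1 - alpha) / 2; the
    remaining entries rise linearly from just above (1 + alpha) / 2 to 1.
    """
    boundary = math.floor(beta * n)
    values = []
    for m in range(1, n + 1):
        if m <= boundary:
            values.append(m * (1 - alpha) / (2 * boundary))
        else:
            values.append((m - boundary) * (1 - alpha) / (2 * (n - boundary))
                          + (1 + alpha) / 2)
    return values


def core_quality_R(A, C):
    """Assumed core quality: sum over i, j of A[i][j] * C[i] * C[j]."""
    n = len(A)
    return sum(A[i][j] * C[i] * C[j] for i in range(n) for j in range(n))


# Hypothetical example: four nodes, alpha = 0.5, beta = 0.5.
example_values = transition_values(4, 0.5, 0.5)
```

For this choice of parameters, the values are 0.125, 0.25, 0.875, and 1.0, so the gap between the two halves grows with α.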
60. – The desire to be able to have both a continuous core-periphery spectrum and
discrete core/periphery (or core/semi-periphery/periphery) partition was
already recognized in old work in international relations and sociology.
61.
62. 2006 Network
Important note: The lists depend strongly on
which papers are in the data sets, are based on
coauthorship only, etc. (Therefore: Don’t take
them too seriously!)
Richardson, Porter,
Mucha (PRE, 2009)
63. – For applications such as transportation networks, perhaps we should
look directly at path-based notions to determine core junctions (nodes)
and core edges?
– For example, one can use modified notions of centrality measures like
betweenness, and now one can easily define notions for directed networks
(which is difficult for the density-based approaches discussed earlier).
64. – Theory: M. Cucuringu, P. Rombach, S. H. Lee, & MAP [2016], “Detection
of Core–Periphery Structure in Networks Using Spectral Methods and
Geodesic Paths”, Eur. J. App. Math., in press (arXiv:1410.6572)
– Applications: S. H. Lee, M. Cucuringu, & MAP [2014], “Density-Based and
Transport-Based Core–Periphery Structures in Networks”, Phys. Rev. E
89:032810
65. – Rank nodes by a “participation score”, which is computed as
follows: For each edge (i,j) in a graph G, compute the
shortest path between i and j in G with that edge removed. All nodes
participating in such a path have a +1 added to their
participation score.
– This method (“edge-removed betweenness centrality”)
rewards nodes for being part of cycles.
– Similar for alternative measures of short paths (need not
consider only geodesic paths)
– Similar definition for path-based core values for edges
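The participation score can be sketched with a breadth-first search. This is a simplified illustration with assumptions that go beyond the slide: the graph is unweighted, only interior nodes of the detour path receive +1 (the endpoints i and j are not counted), and a single shortest path is used when several exist rather than averaging over ties. The function names and the example graph are hypothetical.

```python
from collections import deque


def shortest_path(adjacency, source, target, skip_edge):
    """One BFS shortest path from source to target that avoids skip_edge (None if unreachable)."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nbr in sorted(adjacency[node]):
            if {node, nbr} == skip_edge or nbr in parent:
                continue
            parent[nbr] = node
            queue.append(nbr)
    return None


def participation_scores(adjacency):
    """+1 for each interior node of the shortest detour around each edge."""
    scores = {node: 0 for node in adjacency}
    for i in adjacency:
        for j in adjacency[i]:
            if i < j:  # visit each undirected edge once
                path = shortest_path(adjacency, i, j, frozenset((i, j)))
                if path is not None:
                    for node in path[1:-1]:
                        scores[node] += 1
    return scores


# Hypothetical example: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
scores = participation_scores(example)
```

Each triangle node lies on the detour around the opposite triangle edge and scores 1; the pendant node is on no cycle and scores 0, which matches the remark that the method rewards nodes for being part of cycles.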
68. S. H. Lee & P. Holme, Phys. Rev. Lett. 108, 128701 (2012)
Data (100 cities) available at https://sites.google.com/site/lshlj82/
69. S. H. Lee, M. D. Fricker, and MAP [2016], J.
Cplx. Networks, advance access
[http://comnet.oxfordjournals.org/content/early/2016/04/29/comnet.cnv034.abstract];
includes release of large fungal
data set
Uses variant of taxonomy method from
Onnela et al., PRE, Vol. 86, 036104 (2012).
(Mesoscopic response functions based on
community structure at multiple scales.)
• Fungi = living networks
• Edges are fundamental
(nodes are placeholders)
70. Given a network, can we assign “roles” (i.e.,
colors) to nodes to identify their type?
(Not based on density of connections!)
71. – S. Wasserman & K. Faust [1994], Social Network Analysis: Methods and
Applications, Cambridge University Press
– P. Doreian, V. Batagelj, and A. Ferligoj [2004], Generalized
Blockmodeling, Cambridge University Press
– R. A. Rossi and N. K. Ahmed, “Role discovery in Networks” [2015], IEEE
Transactions on Knowledge and Data Engineering 27(4):1112–1131
– M. G. Everett and S. B. Borgatti [1994], “Regular equivalence: General
theory”, Journal of Mathematical Sociology 19(1):29–52
72. – One can examine roles in networks by looking at types of block
structure that are based on things other than density
– Role equivalence/assignment/coloring
– Define an equivalence relation between nodes, such that two nodes are in
the same equivalence class (i.e., colored in the same way) if they are the
same in some respect.
– Loosely speaking, “role equivalence” is trying to find nodes that are playing
similar roles (e.g., social roles, etc.) in a network. These nodes are supposed
to have the same network environment (or, more generally, similar ones),
such as a social environment, as measured in some way.
– Rearrange the nodes so that each color indicates a set of successive nodes.
Then the adjacency matrix shows a block structure.
Some parts (and snapshots!)
of my presentation on roles
and positions are taken or
adapted from slides by Tom
Snijders (5/2/2012):
http://www.stats.ox.ac.uk/~snijders/Equivalences.pdf
73. Each type of coloring is a member
of the class specified above it.
(Each type corresponds to a
different notion of what it means for
a pair of nodes to be “equivalent”.)
74. – F. Lorrain and H. C. White [1971], “Structural equivalence of individuals
in social networks” Journal of Mathematical Sociology 1:49–80
– Written in language of “category theory”
– Nodes i and j are structurally equivalent if they relate to other nodes in
the same way.
– Consider the following example from Borgatti and Everett:
Tom Snijders
77. – L. D. Sailer [1978], “Structural equivalence: Meaning and Definition,
Computation and Application”, Social Networks 1:73–90
– D. R. White & K. P. Reitz [1983], “Graph and Semigroup Homomorphisms
on Networks of Relations”, Social Networks 5:193–234
A coloring is a regular equivalence if two nodes of the same color also have neighbors of the same color.
Tom Snijders
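The two exact notions of equivalence above can be checked mechanically. This is a minimal sketch for undirected, unweighted graphs; the representation (a dict of neighbor sets, a dict of colors) and the function names are assumptions made for illustration. Structural equivalence is tested as "same neighbors, ignoring the pair itself", and a coloring is treated as a regular equivalence if all nodes of a given color see the same set of colors among their neighbors.

```python
def structurally_equivalent(adjacency, i, j):
    """True if i and j have the same neighbors (ignoring each other)."""
    return adjacency[i] - {j} == adjacency[j] - {i}


def is_regular_equivalence(adjacency, color):
    """True if nodes of the same color have the same set of neighbor colors."""
    seen = {}
    for node, nbrs in adjacency.items():
        nbr_colors = frozenset(color[n] for n in nbrs)
        # The first node of each color fixes the expected set of neighbor colors.
        if seen.setdefault(color[node], nbr_colors) != nbr_colors:
            return False
    return True


# Hypothetical example: the path 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
good_coloring = {0: "end", 1: "middle", 2: "end"}
bad_coloring = {0: "x", 1: "x", 2: "y"}
```

In the path, the two end nodes are structurally equivalent, and coloring them alike gives a regular equivalence; grouping an end node with the middle node does not.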
78. – For empirical data, asking for exact equivalence is too stringent a
demand. It is necessary to relax this idea.
– One way to do this is to examine stochastic equivalence between nodes.
– For a probability distribution of edges in a graph, a coloring is a
stochastic equivalence if nodes with the same color have the same
probability distribution of edges with other nodes.
– That is, the probability distribution of the network has to remain the same
when (stochastically-)equivalent nodes are exchanged. This probability
distribution is a stochastic block model.
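To make the idea concrete, here is a small sketch that samples an undirected network in which the probability of an edge depends only on the colors of its two endpoints, so that exchanging nodes of the same color leaves the distribution unchanged. The function name, the argument layout, and the example probabilities are assumptions made for illustration; see Tiago Peixoto’s presentation for inference with such models.

```python
import random


def sample_sbm(colors, block_prob, seed=None):
    """Sample an undirected graph: edge (i, j) appears with probability
    block_prob[colors[i]][colors[j]], independently for each pair."""
    rng = random.Random(seed)
    n = len(colors)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < block_prob[colors[i]][colors[j]]:
                edges.add((i, j))
    return edges


# Hypothetical example: an idealized core-periphery block structure
# (core-core and core-periphery edges always present, periphery-periphery never).
colors = ["core", "core", "periphery", "periphery"]
block_prob = {"core": {"core": 1.0, "periphery": 1.0},
              "periphery": {"core": 1.0, "periphery": 0.0}}
edges = sample_sbm(colors, block_prob, seed=0)
```

With probabilities of 0 and 1, the sample is deterministic; intermediate values give the noisy block structure that one expects in empirical data.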
79. ➞
– Another way to loosen notions of exact equivalence is to compute
similarities between nodes that play similar roles in a network.
– One can then study community structure (i.e., assortative cohesive
groups) of a network, and an associated adjacency matrix, that encodes
these similarities.
80. – Example similarity from the following paper:
– E. A. Leicht, P. Holme, and M. E. J. Newman [2006], “Vertex Similarity in
Networks” Physical Review E 73:026120
– α is a parameter
– λ1 is the largest eigenvalue of A
– Then you can detect communities (i.e., assortative structures) in the
similarity matrix S
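The formula for the similarity is not reproduced in this transcript. A sketch of its form, written from memory of the cited paper and therefore an assumption to check against it, with A the adjacency matrix, D the diagonal matrix of node degrees, m the number of edges, and I the identity matrix:

```latex
\[
  S = 2 m \lambda_1 \, D^{-1} \left( I - \frac{\alpha}{\lambda_1} A \right)^{-1} D^{-1},
  \qquad 0 < \alpha < 1 .
\]
```

The matrix inverse sums contributions from paths of all lengths between two nodes (with longer paths damped by powers of α/λ1), and the factors of D^{-1} discount pairs of high-degree nodes.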
81. – M. Beguerisse-Díaz, G. Garduño-Hernández,
B. Vangelov, S. N. Yaliraki, & M. Barahona
[2014], “Interest Communities and Flow
Roles in Directed Networks: The Twitter
Network of the UK Riots”, Journal of the
Royal Society Interface 11:20140940
82. Some illustrative examples and basic ideas for
examining community structure in more
general types of networks.
83. – Multilayer Networks
– M. Kivelä, A. Arenas, M. Barthelemy, J. P. Gleeson, Y. Moreno, & MAP [2014],
“Multilayer Networks”, Journal of Complex Networks, 2(3):203–271
– S. Boccaletti et al. [2014], “Structure and Function of Multilayer Networks”,
Physics Reports, 544(1):1–122
– Temporal Networks
– P. Holme & J. Saramäki [2012], “Temporal Networks”, Phys. Rep. 519:97–125
– P. Holme [2015], “Modern Temporal Network Theory: A Colloquium”, Eur. Phys. J.
B 88(9):234
– Spatial Networks
– M. Barthelemy [2011], “Spatial Networks”, Phys. Rep. 499:1–101
84. – We’ll discuss extending community structure to these situations, but of
course one also wants to extend other ways of examining mesoscale
structures in these networks.
– Example: Using stochastic block models (see Tiago’s presentation)
– I am only giving examples and will focus mostly on the context of
modularity optimization (though I’ll also show an example with
extending the Jeub et al. local approach). One can also generalize other
approaches, and there is a lot more work to do.
85. – Many networks are either explicitly embedded in space (e.g., road
networks, granular materials) or have structures that are affected by
space (e.g., due to mobility).
– This has a large effect on network structure (e.g., see Marc Barthelemy’s
lecture and review article).
– Useful to develop and consider null models that incorporate spatial
information.
86.
87. – 2D, vertical, 1 layer aggregate of photoelastic disks
– Internal stress pattern in compressed packing manifests as network
of force chains (panel B)
– Force network is a weighted graph in which an edge between 2
particles (nodes) exists if the two particles are in contact with each
other; the forces give the weights
88. – D. S. Bassett, E. T. Owens, K. E. Daniels, and MAP [2012],
Physical Review E 86:041306
– 2D granular medium of photoelastic disks
– Two networks
– Underlying topology (unweighted)
– Forces (weighted)
– Both types of networks are needed for characterizing sound
propagation
89. – Use a null model that includes more
information
– Fix topology (i.e., connectivity) but scramble
geometry (i.e., edge weights)
– Wij = weighted adjacency-matrix element =
force network
– Aij = binary adjacency-matrix element =
contact network
– Communities obtained from optimization of
modularity (with “physical null model”) match
well with empirical granular force networks in
both laboratory and computational experiments
[Excerpt from a journal page shown on the slide (published on 23 February 2015). The right-hand edge of the text is cut off; missing text is marked with […].]
matrix W is often called a “weight matrix.”
To obtain force chains from W, we want to determ[…]
particles for which strong inter-particle forces occ[…]
densely connected sets of particles. We can obtain a […]
this problem via “community detection” [34,35,44] in whi[…]
sets of densely connected nodes called “modules” o[…]
[…]nities.” A popular way to identify communities in a […]
by maximizing a quality function known as modu[…]
respect to the assignment of particles to sets called […]
[…]nities.” Modularity Q is defined as
\[ Q = \sum_{i,j} \left[ W_{ij} - \gamma P_{ij} \right] \delta(c_i, c_j), \]
where node i is assigned to community ci, node j is a[…]
community cj, the Kronecker delta δ(ci, cj) = 1 if and […]
cj, the quantity γ is a resolution parameter, and […]
expected weight of the edge that connects node i […]
j under a specified null model.
One can use the maximum value of modularity […]
the quality of a partition of a force network into sets […]
that are more densely interconnected by strong f[…]
expected under a given null model. The resolution […]
γ provides a means of probing the organization of in[…]
forces across a range of spatial resolutions. To pro[…]
intuition, we note that a perfectly hexagonal packing […]
uniform forces should still possess a single comm[…]
small values of γ and should consist of a collection […]
particle (i.e., singleton) communities for large valu[…]
intermediate values of γ, we expect maximizing mo[…]
yield a roughly homogeneous assignment of par[…]
communities of some size (i.e., number of particles) […]
and the total number of particles. (The exact size d[…]
the value of γ.) The strongly inhomogeneous c[…]
assignments that we observe in the laboratory and […]
packings (see Section IV) are a direct consequen[…]
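The modularity in the excerpt, with a null model that fixes the contact topology, can be evaluated directly for a given partition. This is a sketch under stated assumptions: the null model is taken to be Pij = ρ Aij, where ρ is the mean weight of the existing edges and Aij is the binary contact matrix (the exact form should be checked against the cited papers), and the function names and the four-particle chain are hypothetical. The sum is left unnormalized, as in the excerpt.

```python
def physical_null_model(W):
    """Assumed null model: P_ij = rho * A_ij, with rho the mean weight of existing edges."""
    n = len(W)
    weights = [W[i][j] for i in range(n) for j in range(n) if W[i][j] > 0]
    rho = sum(weights) / len(weights)
    return [[rho if W[i][j] > 0 else 0.0 for j in range(n)] for i in range(n)]


def modularity(W, P, communities, gamma=1.0):
    """Q = sum over i, j of (W_ij - gamma * P_ij) * delta(c_i, c_j)."""
    n = len(W)
    return sum(W[i][j] - gamma * P[i][j]
               for i in range(n) for j in range(n)
               if communities[i] == communities[j])


# Hypothetical example: a chain of four particles with contact forces 3, 1, 3.
W = [[0, 3, 0, 0],
     [3, 0, 1, 0],
     [0, 1, 0, 3],
     [0, 0, 3, 0]]
P = physical_null_model(W)
q_strong_pairs = modularity(W, P, [0, 0, 1, 1])
q_single = modularity(W, P, [0, 0, 0, 0])
```

Grouping the two strongly loaded contacts gives Q = 8/3, whereas putting all four particles in one community gives Q = 0, so the partition rewards contacts whose forces exceed the mean.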
90. [Excerpt from a paper in Soft Matter shown on the slide; text that is cut off is marked with […].]
[…] we observe a maximum of g_uniform at γ = 0.9 (for γ ∈ {0.1, 0.3, …, 2.1}) in high-pressure
packings (5.9 × 10^−3 E) and at γ = 1.5 for low-pressure packings (2.7 × 10^−4 E). In the numerical
packings, we observe a maximum of g_uniform at γ = 1.1 for all pressures. In comparison to our
observations in the main text from employing the size-weighted systemic gap factor g, we find that
the optimal value of γ is larger when we instead employ g_uniform (compare Fig. 10 to Fig. 5 and 7).
We also observe that the curves of the systemic gap factor versus resolution parameter exhibit
larger variation for the uniformly-weighted gap factor than for the size-weighted gap factor.
Optimal value of the resolution parameter
The large variation in the maximum of g_uniform over packings and pressures makes it difficult to
choose an optimal resolution-parameter value. We choose to take γ_opt = 1.1 because (1) it
corresponds to the maximum of g_uniform in the numerical packings and (2) it corresponds to the
mean of the maximum of g_uniform in the laboratory packings. To facilitate the comparison of
optimal values of γ from the two weighting schemes, we denote γ_opt for g as γ̂ and we denote
γ_opt for g_uniform as γ̂_uniform. Note that γ̂_uniform = 1.1 differs from (and is larger than) γ̂ = 0.9.
Force-chain structure at the optimal value of the resolution parameter
The force chains that we identify for the optimal value for the uniformly-weighted gap factor
(at γ̂_uniform = 1.1) differ from those that we identified in the main text for the optimal value of
the size-weighted gap factor (at γ̂ = 0.9). We show our comparison in Fig. 11. For both laboratory
and numerical […]
Fig. 11 In both (A) (frictional) laboratory and (B) (frictionless) numerical packings, we identify
larger and more branched force chains at the optimal resolution determined by (left; γ = 0.9) the
size-weighted gap factor g, and we identify smaller and less branched force chains at the optimal
resolution determined by (right; γ = 1.1) the uniformly-weighted gap factor g_uniform. These
observations are consistent across all pressure values, but they are especially evident at high
pressures in […]
– For larger pressures, we
obtain larger and more
branched force chains in
both the (frictional)
laboratory packings
described earlier and in
(frictionless) numerical
packings
91. – One can also incorporate mobility models into the construction of null models Pij
– P. Expert, T. Evans, V. Blondel, R. Lambiotte [2011], “Uncovering Space-Independent
Communities in Spatial Networks”, PNAS 108:7663–7668
– Introduced null model based on gravity model
– Found French vs. Flemish communities in mobile phone network in Belgium
– M. Sarzynska, E. A. Leicht, G. Chowell, & MAP [2016], “Null Models for Community
Detection in Spatially-Embedded, Temporal Networks”, J. Cplx. Networks, advance
access (doi:10.1093/comnet/cnv027)
– Comparison of results using Newman–Girvan, gravity, and (newly introduced in null-model
form) radiation null models
– The situation is much more complicated than in the example studied by Expert et al.
– Introduction of new generative benchmarks (e.g., one based on distance, one based on flux)
and also empirical example from weekly cases of dengue fever in Peru over 15 years
93. – You can construct a radiation null model in a similar way, and both the gravity
and radiation null models can be generalized to temporal networks using a
multilayer representation (see later).
94. LN 2000 Spa 2000
Fig. 10. Circular plots of migration community structures in 2000. The size of the ribbons
corresponds to the amount of migration stock that remains in a community or is directed
to other communities. The color of the ribbons indicates the source communities. We
create the plots using Circos Table Viewer (Krzywinski et al., 2009), which is available at
http://mkweb.bcgsc.ca/tableviewer/visualize/.
7. Continuity and Change in Migration Communities
Migration communities are involved in complex processes of emerging, splitting,
merging, and dissolving. In Fig. 11, we map continuity and change in migration
communities using alluvial diagrams (Rosvall and Bergstrom, 2010). Instead of
95. – M. Kivelä et al., “Multilayer Networks”, JCN, 2014
96. – Use multilayer representations of temporal (e.g., with ordinary
coupling) and multiplex networks (with categorical coupling).
• P. J. Mucha, T. Richardson, K. Macon, MAP, &
J.-P. Onnela [2010], “Community Structure in
Time-Dependent, Multiscale, and Multiplex
Networks”, Science 328(5980):876–878
• Code available at
http://netwiki.amath.unc.edu/GenLouvain/GenLouvain
97. • Schematic from M. Bazzi, MAP, S.
Williams, M. McDonald, D. J. Fenn, & S.
D. Howison [2016], Multiscale Modeling
and Simulation: A SIAM Interdisciplinary
Journal, 14(1):1–41
[Schematic: three layers (Layer 1, Layer 2, Layer 3) with three nodes each; node i in layer s is labeled i_s (1_1, 2_1, 3_1 in Layer 1; 1_2, 2_2, 3_2 in Layer 2; 1_3, 2_3, 3_3 in Layer 3). The corresponding adjacency matrix is]
\[
\begin{bmatrix}
0 & 1 & 1 & \omega & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & \omega & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & \omega & 0 & 0 & 0 \\
\omega & 0 & 0 & 0 & 1 & 1 & \omega & 0 & 0 \\
0 & \omega & 0 & 1 & 0 & 1 & 0 & \omega & 0 \\
0 & 0 & \omega & 1 & 1 & 0 & 0 & 0 & \omega \\
0 & 0 & 0 & \omega & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & \omega & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & \omega & 0 & 1 & 0
\end{bmatrix}
\]
Fig. 3.1. Example of (left) a multilayer network with unweighted intra-layer connections (solid
lines) and uniformly weighted inter-layer connections (dashed curves) and (right) its corresponding
adjacency matrix. (The adjacency matrix that corresponds to a multilayer network is sometimes
called a “supra-adjacency matrix” in the network-science literature [39].)
[…] or an adjacency matrix to represent a multilayer network.) The generalization in [49]
consists of applying the function in (2.16) to the N|T|-node multilayer network:
\[
\hat{r}(C, t) = \sum_{i,j=1}^{N|T|} \left( \pi_i \left[ \delta_{ij} + t \Lambda_{ii} (M_{ij} - \delta_{ij}) \right] - \pi_i \rho_{i|j} \right) \delta(c_i, c_j) , \qquad (3.1)
\]
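The supra-adjacency matrix in the schematic can be assembled from the individual layers. This is a minimal sketch, assuming that every node appears in every layer, that layers are ordered, and that interlayer edges of uniform weight ω connect each node only to its copies in adjacent layers; the function name and the list-of-matrices representation are hypothetical.

```python
def supra_adjacency(layers, omega):
    """Build the supra-adjacency matrix for ordered layers with uniform
    interlayer weight omega between copies of a node in adjacent layers.

    `layers` is a list of n x n intralayer adjacency matrices.
    """
    n = len(layers[0])
    size = n * len(layers)
    supra = [[0] * size for _ in range(size)]
    for s, layer in enumerate(layers):
        for i in range(n):
            # Diagonal block: intralayer edges of layer s.
            for j in range(n):
                supra[s * n + i][s * n + j] = layer[i][j]
            # Off-diagonal blocks: node i coupled to itself in the next layer.
            if s + 1 < len(layers):
                supra[s * n + i][(s + 1) * n + i] = omega
                supra[(s + 1) * n + i][s * n + i] = omega
    return supra


# The three layers from the schematic (Layer 1, Layer 2, Layer 3).
layer_1 = [[0, 1, 1], [1, 0, 0], [1, 0, 0]]
layer_2 = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
layer_3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
supra = supra_adjacency([layer_1, layer_2, layer_3], omega=0.5)
```

The diagonal blocks are the layers themselves, and the off-diagonal blocks are ω times the identity for adjacent layers, which reproduces the 9 × 9 matrix in the figure.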
98. • Find communities algorithmically by optimizing “multislice
modularity”
– We derived this function in Mucha et al. [2010]
• Laplacian dynamics: find communities based on how long random walkers are
trapped there. Exponentiate and then linearize to derive modularity.
• Generalizes derivation of ordinary modularity from R. Lambiotte, J.-C. Delvenne,
& M. Barahona, arXiv:0812.1770 (now published, with updates, in TNSE, 2015)
• Different spreading weights on different types of edges
– Recall: Node x in layer r is a different node-layer from node x in layer s
Remark: One can
generalize the null
model to incorporate
space (as discussed
previously). See, e.g.,
M. Sarzynska et al.
[2016].
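The multislice modularity function itself is not reproduced in this transcript. The sketch below evaluates it for a given assignment, assuming the form from Mucha et al. [2010] as I recall it: Q = (1/2μ) Σ [(Aijs − γ kis kjs / 2ms) δ(s,r) + δ(i,j) Cjsr] δ(gis, gjr), with a single resolution parameter γ and uniform interlayer coupling ω between copies of a node in consecutive layers. The function name and data layout are hypothetical, each layer is assumed to have at least one edge, and this only scores a partition (it does not optimize).

```python
def multislice_modularity(layers, assignment, gamma=1.0, omega=1.0):
    """Score a multilayer partition.

    `layers` is a list of n x n symmetric adjacency matrices, and
    `assignment[s][i]` is the community of node i in layer s.
    """
    n = len(layers[0])
    total = 0.0
    two_mu = 0.0  # total edge weight (intralayer plus interlayer), counted twice
    for s, A in enumerate(layers):
        k = [sum(row) for row in A]
        two_m = sum(k)
        two_mu += two_m
        for i in range(n):
            for j in range(n):
                if assignment[s][i] == assignment[s][j]:
                    total += A[i][j] - gamma * k[i] * k[j] / two_m
    for s in range(len(layers) - 1):
        for i in range(n):
            two_mu += 2 * omega  # each interlayer edge counted in both directions
            if assignment[s][i] == assignment[s + 1][i]:
                total += 2 * omega
    return total / two_mu


# Hypothetical example: two triangles joined by one edge, repeated in two layers.
A = [[0, 1, 1, 0, 0, 0],
     [1, 0, 1, 0, 0, 0],
     [1, 1, 0, 1, 0, 0],
     [0, 0, 1, 0, 1, 1],
     [0, 0, 0, 1, 0, 1],
     [0, 0, 0, 1, 1, 0]]
groups = [0, 0, 0, 1, 1, 1]
q_one_layer = multislice_modularity([A], [groups])
q_two_layers = multislice_modularity([A, A], [groups, groups], omega=1.0)
```

With a single layer, the function reduces to ordinary modularity with the Newman–Girvan null model (5/14 for this example); with two layers, the interlayer terms reward nodes that keep the same community across layers.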
99.
100. • A. S. Waugh, L. Pei, J. H. Fowler, P. J. Mucha, & MAP, “Party Polarization in Congress: A Network Science
Approach”, arXiv:0907.3509 (processed data available via figshare; original data from Voteview)
• One network layer for each two-year Congress
• Intralayer edges given by number of bills in which two legislators voted the same way divided by the total
number of bills on which they both voted
• Interlayer edges of weight ω = constant between a legislator and him/herself in consecutive Congresses if
a member for both (all other interlayer edges are 0)
• Each node-layer (i,s) assigned to a community by maximizing multislice modularity
101. [Excerpt and figure from a journal page shown on the slide. Most of the text is cut off; missing text is marked with […].]
[…]munities under Laplacian dynamics (13), which we have generalized to recover the null models for
bipartite, directed, and signed networks (14). First, we obtained the resolution-parameter
generaliza[…] standard null model for directed networks (16, 17) (again with a resolution parameter)
by generalizing the Laplacian dynamics to include motion along different kinds of connections […]
[Figure, panels A and B: community assignments of senators over time. Horizontal axis: Year (1800 to 2000). Vertical axis: Senator. Labeled states: NM, UT, WY, CA, OR, WA, AK, HI. Community composition labels: 40PA, 24F, 8AA; 151DR, 30AA, 14PA, 5F; 141F, 43DR; 44D, 2R; 1784R, 276D, 149DR, 162J, 53W, 84other; 176W, 97AJ, 61DR, 49A, 24D, 19F, 13J, 37other; 3168D, 252R, 73other; 222D, 6W, 11other; 1490R, 247D, 19other.]
102. P. J. Mucha & M. A. Porter [2010], Chaos 20(4):041108
108. – fMRI data: network from correlated time
series
– Examine role of modularity in human
learning by identifying dynamic changes in
modular organization over multiple time
scales
– Main result: “flexibility”, as measured by
allegiance of nodes to communities over
temporal layers, in one session predicts
amount of learning in subsequent session
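A node’s flexibility can be computed directly from a multilayer community assignment. This is a minimal sketch, assuming that flexibility is the fraction of consecutive-layer pairs in which a node changes its community and that community labels are consistent across layers (as they are in the output of multilayer modularity maximization); the function name and the example assignment are hypothetical.

```python
def flexibility(assignment):
    """Fraction of consecutive layer pairs in which each node changes community.

    `assignment[s][i]` is the community of node i in layer s (at least two layers).
    """
    num_layers = len(assignment)
    n = len(assignment[0])
    return [
        sum(assignment[s][i] != assignment[s + 1][i]
            for s in range(num_layers - 1)) / (num_layers - 1)
        for i in range(n)
    ]


# Hypothetical example: three nodes over three temporal layers.
example = [[0, 0, 1],
           [0, 1, 1],
           [0, 0, 1]]
flex = flexibility(example)
```

In this example, the middle node switches community at every step (flexibility 1), and the other two nodes never switch (flexibility 0).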
109. • D. S. Bassett, N. F. Wymbs, M. P.
Rombach, MAP, P. J. Mucha, & S. T.
Grafton [2013], PLoS Comput. Bio.
9(9):1003171
• Flexible nodes are consistently in a
“periphery” as computed for static
networks encompassing given time
windows
• Nodes that are not flexible (call them
“stiff”) are consistently in a structural
core in these static networks
• Uses our methodology for computing
core–periphery structure.
– M. P. Rombach, MAP, J. H. Fowler, & P. J.
Mucha [2014], SIAM J. App. Math.
74(1):167–190.
110. Temporal core–periphery organization ≈ Geometrical core–periphery organization!
(the latter is a density-based core using network structure in individual layers)
111. – L. G. S. Jeub, M. W. Mahoney, P. J. Mucha, & MAP [2015],
arXiv:1510.05185
– Extension to multilayer networks: allow spreading dynamics along
both intralayer and interlayer edges
[Excerpt from “A local perspective on community structure in multilayer networks” (1 September 2015), p. 3, shown on the slide; text that is cut off is marked with […].]
[…] et al., 2015; Kuncheva & Montana, 2015). For our purposes, we define
\[ p_{i\alpha}(t+1) = \sum_{j,\beta} P^{j\beta}_{i\alpha} \, p_{j\beta}(t) , \qquad (1) \]
where p_{jβ}(t) is the probability for a random walker to be at node j in layer β at time t and
P^{jβ}_{iα} is the probability for a random walker to transition from node j in layer β to node i in
layer α in a time step. We also want the random walk to be ergodic, so that it has a
well-defined stationary distribution p_{iα}(∞). We then use the stationary distribution to define the
conductance (Jerrum & Sinclair, 1988) of a set of state nodes S as
\[ \phi(S) = \frac{\sum_{(i,\alpha) \in S} \sum_{(j,\beta) \notin S} P^{i\alpha}_{j\beta} \, p_{i\alpha}(\infty)}{\sum_{(i,\alpha) \in S} p_{i\alpha}(\infty)} , \qquad (2) \]
which we use as a quality measure for local communities. Once we select an appropriate
random walk (or other Markov process), we can define the associated personalized
PageRank (PPR) score of state node (i,α) as the solution to the equation
\[ \mathrm{PPR}(\gamma)_{i\alpha} = \gamma \sum_{j,\beta} P^{j\beta}_{i\alpha} \, \mathrm{PPR}(\gamma)_{j\beta} + (1 - \gamma) s_{i\alpha} , \qquad (3) \]
where s is a probability distribution that determines the seed nodes for the method […]
2014). We then approximate the solution to equation (3) locally (Andersen et al., […]
find local communities.
Given a random walk on a multilayer network, one can analyze communities […]
layer networks using the same methods as for single-layer networks. See Jeub et a[…]
for a detailed discussion of a few different methods and their application to severa[…]
of networks (which exhibit rather different types of behavior with respect to the […]
[…]ics of diffusion processes). Our code for identifying local communities and vi[…]
networks is available from https://github.com/LJeub.
In the present article, we illustrate some features that one can encounter as a con[…]
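The conductance in equation (2) and the personalized PageRank in equation (3) can be sketched for a small multiplex network. This is an illustration under stated assumptions rather than the authors’ code (which is at the GitHub address above): the random walk follows edges of an undirected, weighted network of state nodes, so the stationary distribution is proportional to node strength; PPR is computed by a global fixed-point iteration instead of the local approximation mentioned in the excerpt; and all names and the example network are hypothetical.

```python
def transition_probabilities(weights):
    """P[x][y] = probability of stepping from state node x to state node y."""
    return {x: {y: w / sum(nbrs.values()) for y, w in nbrs.items()}
            for x, nbrs in weights.items()}


def stationary_distribution(weights):
    """For an undirected, connected weighted network: proportional to node strength."""
    strength = {x: sum(nbrs.values()) for x, nbrs in weights.items()}
    total = sum(strength.values())
    return {x: s / total for x, s in strength.items()}


def conductance(P, p, S):
    """Equation (2): stationary probability flow out of S divided by the mass of S."""
    escape = sum(p[x] * prob for x in S for y, prob in P[x].items() if y not in S)
    return escape / sum(p[x] for x in S)


def personalized_pagerank(P, seed, gamma=0.85, iterations=200):
    """Fixed-point iteration for equation (3) with seed distribution `seed`."""
    ppr = {x: seed.get(x, 0.0) for x in P}
    for _ in range(iterations):
        new = {x: (1 - gamma) * seed.get(x, 0.0) for x in P}
        for y, out in P.items():
            for x, prob in out.items():
                new[x] += gamma * prob * ppr[y]
        ppr = new
    return ppr


# Hypothetical example: a triangle in each of two layers; state nodes are (node, layer),
# and each node is coupled to its copy in the other layer with weight omega.
omega = 0.5
weights = {(i, a): {} for i in range(3) for a in range(2)}
for a in range(2):
    for i in range(3):
        for j in range(3):
            if i != j:
                weights[(i, a)][(j, a)] = 1.0
for i in range(3):
    weights[(i, 0)][(i, 1)] = omega
    weights[(i, 1)][(i, 0)] = omega

P = transition_probabilities(weights)
p = stationary_distribution(weights)
layer_0 = {(i, 0) for i in range(3)}
phi = conductance(P, p, layer_0)
ppr = personalized_pagerank(P, {(0, 0): 1.0})
```

For this example, the conductance of a whole layer is ω/(2 + ω), so weaker interlayer coupling makes each layer a better community, and the PPR mass stays concentrated in the layer of the seed node.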
112. [Figure from “A local perspective on community structure in multilayer networks”, p. 5. Panel (a), classical random walk: conductance versus number of state nodes for ω = 0.1, ω = 1, and ω = 10. Panel (b), relaxed random walk: conductance versus number of state nodes for r = 0.01, r = 0.1, and r = 1. Panel (c): best community with 173 state nodes for ω = 0.1 (physical node as seed); labeled layers: Wizz Air, Ryanair. Panel (d): best community with 169 state nodes for r = 0.1 (state node as seed); labeled layers: SunExpress, Panagra Airways, Turkish Airlines.]
Fig. 2: European Airline Network: Multiplex transportation network with 37 layers. Each
layer includes the flights for a single airline (Cardillo et al., 2013). Panels (a) and (b)
show (mostly downward-sloping) network community profiles (NCPs) for this network,
where we plot the quality (as measured by conductance) of the best community of each
size (as measured by the number of state nodes that are a member of the community).
Observe that sampling using physical nodes (the thin curves) and sampling using state
nodes leads to similar results. Panels (c) and (d) illustrate some of the communities that
we obtain. We shade the state nodes in a community from dark red to light grey based
on their rank within the community. The large arrows point to the seed nodes. For small
layer-jumping probability r in the relaxed random walk and small interlayer edge weight ω […]
113. Mesoscale structures can give fascinating
insights about networks, but be careful about
how you apply these tools and ideas.
114. – Numerous different types: communities, core–periphery structures,
roles and positions, block models (arbitrary block structures), etc.
– Not just communities!
– Community structure (to examine assortative structures) is the most popular
and best-studied type of mesoscale structure, but it’s far from the only one,
and there is no reason to think it is the most important one.
– Our focal question: If we examine a given type of structure (or given
types of structures), what can we learn about a network?
– A different question: How does one infer the statistically most likely
block structure? If we want to study “large-scale” structure in networks
broadly, what should we be looking for?
– Statistical inference and model selection (see presentation by Tiago Peixoto)
115. – Multilayer modularity maximization, community-detection
method of Jeub et al., code for visualization and analysis of
multilayer networks, and other methods available at
http://www.plexmath.eu/?page_id=327