Slides for paper
Reconstructing Textual Documents from n-grams
KDD 2015 (Knowledge Discovery and Data Mining)
http://dl.acm.org/citation.cfm?id=2783361
4. Motivation: Privacy-preserving data mining
Share textual data for mutual benefit, general good or contractual reasons
But not all of it:
text analytics on private documents
marketplace scenarios [Cancedda ACL 2012]
copyright concerns
5. Problem
1. Given n-gram information of a document d, how well can we reconstruct d?
2. If I want/have to share n-gram statistics, what is a good strategy to avoid reconstruction while preserving the utility of the data?
9. Example
s = $ a rose rose is a rose is a rose #
2-grams:
$ a 1
a rose 3
rose rose 1
rose is 2
is a 2
rose # 1
Note that the same 2-grams are obtained starting from:
s = $ a rose is a rose rose is a rose #
s = $ a rose is a rose is a rose rose #
⇒ Find large chunks of text of whose presence we are certain
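The ambiguity on this slide can be checked mechanically. The following is a minimal sketch, assuming whitespace tokenization; the function name `ngram_counts` is illustrative and not taken from the paper.

```python
from collections import Counter

def ngram_counts(text, n=2):
    """Count the n-grams of a whitespace-tokenized text."""
    tokens = text.split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

original = "$ a rose rose is a rose is a rose #"
variants = [
    "$ a rose is a rose rose is a rose #",
    "$ a rose is a rose is a rose rose #",
]

counts = ngram_counts(original)
for gram, freq in counts.items():
    print(" ".join(gram), freq)

# The three strings cannot be told apart from their 2-gram counts alone.
same_counts = all(ngram_counts(v) == counts for v in variants)
print(same_counts)
```

Any reconstruction method therefore has to settle for the chunks that all three candidates share.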
10. Problem Encoding
An n-gram corpus is encoded as a graph, a subgraph of the de Bruijn graph, where edges correspond to n-grams.
[Figure: graph on nodes 0 to 4; each edge is labeled with a 2-gram and its count: "$ a", 1; "a rose", 3; "rose rose", 1; "rose is", 2; "is a", 2; "rose #", 1]
11. Problem Encoding
[2, 2, 3, 1] → rose rose is a
[Figure: the graph from slide 10, repeated]
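The encoding can be sketched as follows. This is an illustration under assumptions: nodes are taken to be the (n-1)-gram prefixes and suffixes of each n-gram (the usual de Bruijn construction), and they are numbered in order of first appearance, which happens to reproduce the numbering 0 to 4 used on the slides. `build_graph` and `decode_path` are hypothetical helper names.

```python
def build_graph(counts):
    """Turn each n-gram into an edge from its (n-1)-gram prefix to its suffix."""
    node_id = {}
    edges = []
    for gram, mult in counts.items():
        src, dst = gram[:-1], gram[1:]
        for node in (src, dst):
            node_id.setdefault(node, len(node_id))  # number nodes on first sight
        edges.append((node_id[src], node_id[dst], " ".join(gram), mult))
    return node_id, edges

def decode_path(path, node_id):
    """Spell out a node path: the first node in full, then the last token of each later node."""
    by_id = {i: node for node, i in node_id.items()}
    tokens = list(by_id[path[0]])
    tokens.extend(by_id[i][-1] for i in path[1:])
    return " ".join(tokens)

counts = {
    ("$", "a"): 1,
    ("a", "rose"): 3,
    ("rose", "rose"): 1,
    ("rose", "is"): 2,
    ("is", "a"): 2,
    ("rose", "#"): 1,
}
node_id, edges = build_graph(counts)
for src, dst, label, mult in edges:
    print(src, "->", dst, ":", label, ",", mult)

print(decode_path([2, 2, 3, 1], node_id))
print(decode_path([0, 1, 2], node_id))
```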
14. Problem encoding
Given such a graph, each Eulerian path gives a plausible reconstruction
Problem: Find those parts that are common in all of them
BEST Theorem, 1951
Given an Eulerian graph G = (V, E), the number of different Eulerian cycles is
$T_w(G) \prod_{v \in V} (d(v) - 1)!$
where $T_w(G)$ is the number of trees directed towards the root at a fixed node w.
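As a worked instance of the formula, the sketch below evaluates it on the running example. Several things here are assumptions rather than content of the slides: an extra edge from # back to $ is added so that the path graph becomes Eulerian, d(v) is read as the out-degree, $T_w(G)$ is computed with the matrix-tree theorem (a determinant of the Laplacian with the root's row and column removed), and the final division by the factorials of the edge multiplicities, which turns cycles over distinguishable parallel edges into distinct texts, is added for illustration.

```python
from fractions import Fraction
from math import factorial

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    det = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return det

def count_eulerian_cycles(nodes, edges, root):
    """T_w(G) * prod_v (d(v) - 1)!, counting parallel edges as distinct."""
    index = {v: i for i, v in enumerate(nodes)}
    size = len(nodes)
    laplacian = [[0] * size for _ in range(size)]
    out_degree = dict.fromkeys(nodes, 0)
    for (u, v), mult in edges.items():
        out_degree[u] += mult
        laplacian[index[u]][index[u]] += mult   # out-degree on the diagonal
        laplacian[index[u]][index[v]] -= mult   # minus the adjacency count
    keep = [i for i in range(size) if i != index[root]]
    minor = [[laplacian[r][c] for c in keep] for r in keep]
    trees = int(determinant(minor))             # trees directed towards the root
    cycles = trees
    for v in nodes:
        cycles *= factorial(out_degree[v] - 1)
    return trees, cycles

nodes = ["$", "a", "rose", "is", "#"]
edges = {
    ("$", "a"): 1, ("a", "rose"): 3, ("rose", "rose"): 1,
    ("rose", "is"): 2, ("is", "a"): 2, ("rose", "#"): 1,
    ("#", "$"): 1,  # assumption: closing edge that turns the path into a cycle
}
trees, cycles = count_eulerian_cycles(nodes, edges, root="$")

distinct_texts = cycles
for mult in edges.values():
    distinct_texts //= factorial(mult)  # parallel copies are interchangeable in the text
print(trees, cycles, distinct_texts)
```

On this graph the sketch gives 6 trees and 72 cycles, which collapse to 3 distinct texts, in line with the three strings listed on the example slide.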
15. Problem Encoding
[0, 1, 2] → $ a rose
[Figure: the graph from slide 10, repeated]
18. Definitions
ec(G): the set of all Eulerian paths of G
Given a path $c = e_1, \dots, e_n$: $\ell(c) = [\mathrm{label}(e_1), \dots, \mathrm{label}(e_n)]$
$s(c) = \mathrm{label}(e_1).\mathrm{label}(e_2).\cdots.\mathrm{label}(e_n)$ (overlapping concatenation)
Given G, we want G* such that:
1. G* is equivalent to G: $\{s(c) : c \in ec(G)\} = \{s(c) : c \in ec(G^*)\}$
2. G* is irreducible: $\nexists\, e_1, e_2 \in E^*$ such that $[\mathrm{label}(e_1), \mathrm{label}(e_2)]$ appears in every $\ell(c)$, $c \in ec(G^*)$
Given G* we can just read maximal blocks from the labels.
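The overlapping concatenation s(c) can be illustrated with a short sketch. It assumes labels are whitespace-separated token strings, that consecutive labels share exactly n-1 tokens, and that n is at least 2; `overlap_concat` is a hypothetical name.

```python
def overlap_concat(labels, n=2):
    """s(c): join edge labels, dropping the (n-1)-token overlap between neighbours (n >= 2)."""
    tokens = labels[0].split()
    for label in labels[1:]:
        nxt = label.split()
        if tokens[-(n - 1):] != nxt[:n - 1]:
            raise ValueError("consecutive labels do not overlap")
        tokens.extend(nxt[n - 1:])
    return " ".join(tokens)

# Labels along one Eulerian path of the 2-gram graph.
fine = ["$ a", "a rose", "rose rose", "rose is", "is a",
        "a rose", "rose is", "is a", "a rose", "rose #"]
# The same path written with longer, merged labels.
coarse = ["$ a rose", "rose rose", "rose is a rose", "rose is a rose", "rose #"]

print(overlap_concat(fine))
print(overlap_concat(coarse))
```

Both label sequences spell the same string, which is the sense in which a graph with merged edges can stay equivalent to the original one.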
19. Example
s = $ a rose rose is a rose is a rose #
[Figure: reduced graph on nodes 0, 2 and 4; edges labeled "$ a rose", 1; "rose rose", 1; "rose is a rose", 2; "rose #", 1]
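For a graph this small, both conditions on G* can be checked by brute force. The sketch below is not the paper's algorithm: it enumerates every Eulerian path, compares the resulting strings for G and for the reduced graph, and tests irreducibility in a literal reading, namely that no pair of labels occurs consecutively in every label sequence. The edge endpoints of the reduced graph are an assumption read off the figure.

```python
from collections import Counter

def label_sequences(edges, start, end):
    """All distinct label sequences of Eulerian paths from start to end (brute force)."""
    remaining = Counter(edges)
    total = sum(remaining.values())
    found = set()

    def walk(node, labels):
        if len(labels) == total:
            if node == end:
                found.add(tuple(labels))
            return
        for edge in [e for e, m in remaining.items() if m > 0 and e[0] == node]:
            remaining[edge] -= 1
            walk(edge[1], labels + [edge[2]])
            remaining[edge] += 1

    walk(start, [])
    return found

def text(labels, n=2):
    """s(c): overlapping concatenation of the labels along a path."""
    tokens = labels[0].split()
    for label in labels[1:]:
        tokens.extend(label.split()[n - 1:])
    return " ".join(tokens)

def always_adjacent(sequences):
    """Label pairs that occur consecutively in every sequence."""
    return set.intersection(*[set(zip(seq, seq[1:])) for seq in sequences])

G = {("$", "a", "$ a"): 1, ("a", "rose", "a rose"): 3,
     ("rose", "rose", "rose rose"): 1, ("rose", "is", "rose is"): 2,
     ("is", "a", "is a"): 2, ("rose", "#", "rose #"): 1}
G_star = {("$", "rose", "$ a rose"): 1, ("rose", "rose", "rose rose"): 1,
          ("rose", "rose", "rose is a rose"): 2, ("rose", "#", "rose #"): 1}

seqs_G = label_sequences(G, "$", "#")
seqs_star = label_sequences(G_star, "$", "#")
texts_G = {text(s) for s in seqs_G}
texts_star = {text(s) for s in seqs_star}

print(texts_G == texts_star)        # equivalence
print(always_adjacent(seqs_star))   # empty set: nothing left to merge
print(sorted(texts_star))
```

The enumeration is exponential in general, so this only serves as a check on toy inputs.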
39. Conclusions
How well can textual documents be reconstructed from their list of n-grams?
Resilience to the standard noisifying approach
Better noisifying by adding (instead of removing) n-grams
42. Rule 1 (Pigeonhole rule)
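Incoming edges of x: $(\langle v_1, x, \ell_1 \rangle, p_1), \dots, (\langle v_n, x, \ell_n \rangle, p_n)$
Outgoing edges of x: $(\langle x, w_1, t_1 \rangle, k_1), \dots, (\langle x, w_m, t_m \rangle, k_m)$
If $\exists\, i, j$ such that $p_i > d(x) - k_j$, then
$E' = (E \setminus \{(\langle v_i, x, \ell_i \rangle, a), (\langle x, w_j, t_j \rangle, a)\}) \cup \{(\langle v_i, w_j, \ell_i.t_j \rangle, a)\}$, where $a = p_i - (d(x) - k_j)$.
If $a = d(x)$ then $V' = V \setminus \{x\}$, else $V' = V$.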
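A direct transcription of the pigeonhole rule, applied until it no longer fires, might look like the sketch below. It rests on assumptions: d(x) is taken as the total multiplicity of the edges leaving x, unbalanced nodes (the start and end markers) are skipped, a self-loop is never paired with itself, and a node is considered removed once no edge touches it. The function names are illustrative.

```python
from collections import Counter

def find_rule1_match(edges):
    """Return (incoming edge, outgoing edge, a) for the first pair with p_i > d(x) - k_j."""
    nodes = sorted({e[0] for e in edges} | {e[1] for e in edges})
    for x in nodes:
        incoming = [(e, p) for e, p in edges.items() if e[1] == x]
        outgoing = [(e, k) for e, k in edges.items() if e[0] == x]
        d = sum(k for _, k in outgoing)
        if d == 0 or d != sum(p for _, p in incoming):
            continue  # assumption: skip the unbalanced start and end nodes
        for e_in, p in incoming:
            for e_out, k in outgoing:
                a = p - (d - k)
                if e_in != e_out and a > 0:
                    return e_in, e_out, a
    return None

def reduce_with_rule1(edges, n=2):
    """Apply the rule repeatedly.
    edges maps (source, target, label) to a multiplicity; labels are token tuples."""
    edges = Counter(edges)
    match = find_rule1_match(edges)
    while match is not None:
        e_in, e_out, a = match
        for e in (e_in, e_out):
            edges[e] -= a
            if edges[e] == 0:
                del edges[e]  # a node vanishes once its last edge is gone
        # Overlapping concatenation: drop the n-1 tokens shared by the two labels.
        merged = (e_in[0], e_out[1], e_in[2] + e_out[2][n - 1:])
        edges[merged] += a
        match = find_rule1_match(edges)
    return edges

G = {
    ("$", "a", ("$", "a")): 1,
    ("a", "rose", ("a", "rose")): 3,
    ("rose", "rose", ("rose", "rose")): 1,
    ("rose", "is", ("rose", "is")): 2,
    ("is", "a", ("is", "a")): 2,
    ("rose", "#", ("rose", "#")): 1,
}
reduced = reduce_with_rule1(G)
for (src, dst, label), mult in reduced.items():
    print(src, "->", dst, ":", " ".join(label), ",", mult)
```

On the running example this leaves the four edges "$ a rose", "rose rose", "rose is a rose" (twice) and "rose #", the same labels as in the reduced graph of slide 19.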
43. Rule 2: non-local information
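x is a division point dividing G into components $G_1$, $G_2$. If $\hat{d}^{in}_{G_1}(x) = 1$ and $\hat{d}^{out}_{G_2}(x) = 1$ (with edges $(\langle v, x, \ell \rangle, p)$ and $(\langle x, w, t \rangle, k)$), then
$E' = (E \setminus \{(\langle v, x, \ell \rangle, 1), (\langle x, w, t \rangle, 1)\}) \cup \{(\langle v, w, \ell.t \rangle, 1)\}$
$V' = V$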