SlideShare a Scribd company logo
Introduction
Crossing theory
Evaluation
The scarcity of crossing dependencies: a
direct outcome of a specic constraint?
C. Gómez-Rodríguez  R. Ferrer-i-Cancho
1 LyS Research Group
Departamento de Computación
Universidade da Coruña
2Complexity  Quantitatitve Linguistics Lab
LARCA Research Group
Universitat Politècnica de Catalunya
Grapth-TA 4 (Barcelona), 4 March 2016
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
The scarcity of crossing syntactic dependencies
Indeed , the government is taking a calculated risk .
5
4
1
2
1
4
3
2
1
We keep wondering what Mr. Gates wanted to say .
1 1
5
1 1
4
1
2
8
Dependencies tend to not cross when drawn above the sentence
(Hays and Lecerf in the 1960s)
The average C does not exceed 1 for most of the languages (sample
of 30 treebanks) [Gómez-Rodríguez and Ferrer-i-Cancho, 2016].
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Why? Two major hypotheses
The scarcity of crossings dependencies arises,
Directly from an underlying rule or principle of human
languages that is responsible for this fact (including the
possibility of some cognitive cost associated directly to
crossings). Held by the overwhelming majority of
researchers. Serious problems
[Ferrer-i-Cancho and Gómez-Rodríguez, 2016]:
Require heavy assumptions that compromise the parsimony of
linguistic theory as a whole.
Involve explanations based on internal constraints of obscure
nature.
Indirectly, from the actual length of dependencies, which are
constrained by a well-known psychological principle:
dependency length minimization
[Gómez-Rodríguez and Ferrer-i-Cancho, 2016].
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Crossing theory
Q, set of pairs of edges of a graph that can potentially
cross when their vertices are arranged linearly in some
arbitrary order (edges sharing a vertex cannot cross).
|Q|, the cardinality of Q, is the potential number of
crossings.
In a tree, one has
|Q| = n(n − 1 − k
2 )/2, (1)
|Q| = 0 if and only if the tree is a star tree
[Ferrer-i-Cancho, 2013].
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
1st predictor of C
The number of edge crossings
C =
(e1,e2)∈Q
C (e1, e2), (2)
C (e1, e2) is an indicator variable (C (e1, e2) = 1 if the edges e1
and e2 cross and C (e1, e2) = 0 otherwise).
Null hypothesis that the vertices are arranged linearly at random
(all possible orderings are equally likely). p(C (e1, e2) = 1) = 1/3
yields E0[C ] = |Q|/3 [Ferrer-i-Cancho, 2013].
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
2nd predictor of C
Introducing knowledge about the length of the dependencies
(edges of length 1 or n − 1 are not crossable).
p(C (e1, e2) = 1) is replaced by p(C (e1, e2) = 1|d (e1), d (e2)),
obtaining [Ferrer-i-Cancho, 2014]
E2[C ] =
(e1,e2)∈Q
p(C (e1, e2) = 1|d (e1), d (e2)), (3)
p(C (e1, e2) = 1|d (e1), d (e2)) depends only on n, d (e1) and
d (e2) [Ferrer-i-Cancho, 2014].
E0[C ] is a true expectation while E2[C ] is not!
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Evaluation of the predictors: materials
Corpora in version 2.0 of the HamleDT collection of treebanks
[Zeman et al., 2014, Rosa et al., 2014]: a harmonization of
existing treebanks for 30 dierent languages into two
well-known annotation styles: Prague dependencies
[Haji£ et al., 2006] and Universal Stanford dependencies
[de Marnee et al., 2014].
Preprocessing: nodes corresponding to punctuation tokens or
null elements were removed (non-punctuation nodes that had
a punctuation node as their head were attached as dependents
of their nearest non-punctuation ancestor). Same treatment
for null elements.
A syntactic dependency structure was included in our analyses
if (1) it dened a tree and (2) the tree was not a star tree.
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Evaluation of the predictors: methods
C , number of crossings of the linear arrangement of a graph in
general.
Ctrue, the number of crossings of the syntactic dependencies of
a real sentence.
relative number of crossings, i.e. ¯C = C /|Q| or
¯Ctrue = Ctrue/|Q| [Ferrer-i-Cancho, 2014].
relative error of a predictor [Ferrer-i-Cancho, 2014]
∆x = Ex ¯C − ¯Ctrue = (Ex[C ] − Ctrue)/|Q|. (4)
∆0 will be used as a baseline for ∆2.
∆0 converges to 1/3 for suciently long sentences when Ctrue
is small [Ferrer-i-Cancho, 2014].
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Results
Mixing sentences of dierent lengths:
The average ∆2, the relative error of the predictor E2[C ], is
small: it does not exceed 5%.
The average ∆2 is at least 6 times smaller than the baseline
∆0 ≈ 30%.
Controlling for sentence length (grouping sentences by length):
Average over group averages of ∆2: it does not exceed 4.3%.
The average ∆2 is at least 7 times smaller than the baseline
error, again ∆0 ≈ 30%.
The minimum size of a group is one sentence; the qualitative
results are very similar if the minimum size is set to 2.
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Concluding remarks
principle of dependency length minimization
[Ferrer-i-Cancho, 2015]
↓
dependency lengths
↓
the actual number of crossings in sentences
To explain the low frequency of crossings in world languages, it
many not be necessary to recur to recur to
A ban of crossings by grammar (e.g.,
[Hudson, 2007, Tanaka, 1997]).
A principle of minimization of crossings [Liu, 2008]
A competence-plus [Hurford, 2012] limiting the number of
crossings.
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
de Marnee, M.-C., Dozat, T., Silveira, N., Haverinen, K.,
Ginter, F., Nivre, J., and Manning, C. D. (2014).
Universal Stanford dependencies: a cross-linguistic typology.
In Chair), N. C. C., Choukri, K., Declerck, T., Loftsson, H.,
Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis,
S., editors, Proceedings of the Ninth International Conference
on Language Resources and Evaluation (LREC'14), Reykjavik,
Iceland. European Language Resources Association (ELRA).
Ferrer-i-Cancho, R. (2013).
Random crossings in dependency trees.
http://arxiv.org/abs/1305.4561.
Ferrer-i-Cancho, R. (2014).
A stronger null hypothesis for crossing dependencies.
Europhysics Letters, 108:58003.
Ferrer-i-Cancho, R. (2015).
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
The placement of the head that minimizes online memory. A
complex systems approach.
Language Dynamics and Change, 5:141164.
Ferrer-i-Cancho, R. and Gómez-Rodríguez, C. (2016).
Liberating language research from dogmas of the 20th century.
Glottometrics, 33:3334.
Gómez-Rodríguez, C. and Ferrer-i-Cancho, R. (2016).
The scarcity of crossing dependencies: a direct outcome of a
specic constraint?
http://arxiv.org/abs/1601.03210.
Haji£, J., Panevová, J., Haji£ová, E., Panevová, J., Sgall, P.,
Pajas, P., ’t¥pánek, J., Havelka, J., and Mikulová, M. (2006).
Prague Dependency Treebank 2.0.
CDROM CAT: LDC2006T01, ISBN 1-58563-370-4. Linguistic
Data Consortium.
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
Hudson, R. A. (2007).
Language Networks: The New Word Grammar.
Oxford University Press.
Hurford, J. R. (2012).
Chapter 3. Syntax in the Light of Evolution, pages 175258.
Oxford University Press, Oxford.
Liu, H. (2008).
Dependency distance as a metric of language comprehension
diculty.
Journal of Cognitive Science, 9:159191.
Rosa, R., Mašek, J., Marecek, D., Popel, M., Zeman, D., and
Žabokrtský, Z. (2014).
HamleDT 2.0: Thirty dependency treebanks stanfordized.
In Chair), N. C. C., Choukri, K., Declerck, T., Loftsson, H.,
Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis,
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
Introduction
Crossing theory
Evaluation
S., editors, Proceedings of the Ninth International Conference
on Language Resources and Evaluation (LREC'14), Reykjavik,
Iceland. European Language Resources Association (ELRA).
Tanaka, H. (1997).
Invisible movement in sika-nai and the linear crossing
constraint.
Journal of East Asian Linguistics, 6:143188.
Zeman, D., Du²ek, O., Mare£ek, D., Popel, M., Ramasamy, L.,
’t¥pánek, J., šabokrtský, Z., and Haji£, J. (2014).
HamleDT: Harmonized multi-language dependency treebank.
Language Resources and Evaluation, 48(4):601637.
C. Gómez-Rodríguez  R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a

More Related Content

Similar to The scarcity of crossing dependencies: a direct outcome of a specific constraint?

Inductive learning of long-distance dissimilation as a problem for phonology
Inductive learning of long-distance dissimilation as a problem for phonologyInductive learning of long-distance dissimilation as a problem for phonology
Inductive learning of long-distance dissimilation as a problem for phonology
Kevin McMullin
 
A survey on parallel corpora alignment
A survey on parallel corpora alignment A survey on parallel corpora alignment
A survey on parallel corpora alignment
andrefsantos
 
AGU 2012 Bayesian analysis of non Gaussian LRD processes
AGU 2012 Bayesian analysis of non Gaussian LRD processesAGU 2012 Bayesian analysis of non Gaussian LRD processes
AGU 2012 Bayesian analysis of non Gaussian LRD processes
Nick Watkins
 
As pi re2015_abstracts
As pi re2015_abstractsAs pi re2015_abstracts
As pi re2015_abstracts
Joseph Park
 
Cunha CILAMCE 2016
Cunha CILAMCE 2016Cunha CILAMCE 2016
Cunha CILAMCE 2016
LucasHildebrand3
 

Similar to The scarcity of crossing dependencies: a direct outcome of a specific constraint? (6)

Inductive learning of long-distance dissimilation as a problem for phonology
Inductive learning of long-distance dissimilation as a problem for phonologyInductive learning of long-distance dissimilation as a problem for phonology
Inductive learning of long-distance dissimilation as a problem for phonology
 
A survey on parallel corpora alignment
A survey on parallel corpora alignment A survey on parallel corpora alignment
A survey on parallel corpora alignment
 
AGU 2012 Bayesian analysis of non Gaussian LRD processes
AGU 2012 Bayesian analysis of non Gaussian LRD processesAGU 2012 Bayesian analysis of non Gaussian LRD processes
AGU 2012 Bayesian analysis of non Gaussian LRD processes
 
poster
posterposter
poster
 
As pi re2015_abstracts
As pi re2015_abstractsAs pi re2015_abstracts
As pi re2015_abstracts
 
Cunha CILAMCE 2016
Cunha CILAMCE 2016Cunha CILAMCE 2016
Cunha CILAMCE 2016
 

More from Graph-TA

Reactive Databases for Big Data applications
Reactive Databases for Big Data applicationsReactive Databases for Big Data applications
Reactive Databases for Big Data applications
Graph-TA
 
Benchmarking Versioning for Big Linked Data
Benchmarking Versioning for Big Linked DataBenchmarking Versioning for Big Linked Data
Benchmarking Versioning for Big Linked Data
Graph-TA
 
Use of Graphs for Cloud Service Selection in Multi-Cloud Environments
Use of Graphs for Cloud Service Selection in Multi-Cloud EnvironmentsUse of Graphs for Cloud Service Selection in Multi-Cloud Environments
Use of Graphs for Cloud Service Selection in Multi-Cloud Environments
Graph-TA
 
Graphalytics: A big data benchmark for graph-processing platforms
Graphalytics: A big data benchmark for graph-processing platformsGraphalytics: A big data benchmark for graph-processing platforms
Graphalytics: A big data benchmark for graph-processing platforms
Graph-TA
 
RDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL PlatformsRDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL Platforms
Graph-TA
 
GRAPHITE — An Extensible Graph Traversal Framework for RDBMS
GRAPHITE — An Extensible Graph Traversal Framework for RDBMSGRAPHITE — An Extensible Graph Traversal Framework for RDBMS
GRAPHITE — An Extensible Graph Traversal Framework for RDBMS
Graph-TA
 
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphsOn the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
Graph-TA
 
Graphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platformsGraphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platforms
Graph-TA
 
Autograph: an evolving lightweight graph tool
Autograph: an evolving lightweight graph toolAutograph: an evolving lightweight graph tool
Autograph: an evolving lightweight graph tool
Graph-TA
 
Understanding Graph Structure in Knowledge Bases
Understanding Graph Structure in Knowledge BasesUnderstanding Graph Structure in Knowledge Bases
Understanding Graph Structure in Knowledge Bases
Graph-TA
 
Finding patterns of chronic disease and medication prescriptions from a large...
Finding patterns of chronic disease and medication prescriptions from a large...Finding patterns of chronic disease and medication prescriptions from a large...
Finding patterns of chronic disease and medication prescriptions from a large...
Graph-TA
 
Recent Updates on IBM System G — GraphBIG and Temporal Data
Recent Updates on IBM System G — GraphBIG and Temporal DataRecent Updates on IBM System G — GraphBIG and Temporal Data
Recent Updates on IBM System G — GraphBIG and Temporal Data
Graph-TA
 
Analysing the degree distribution of real graphs by means of several probabil...
Analysing the degree distribution of real graphs by means of several probabil...Analysing the degree distribution of real graphs by means of several probabil...
Analysing the degree distribution of real graphs by means of several probabil...
Graph-TA
 
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
Graph-TA
 
Generating synthetic online social network graph data and topologies
Generating synthetic online social network graph data and topologiesGenerating synthetic online social network graph data and topologies
Generating synthetic online social network graph data and topologies
Graph-TA
 
Deriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataDeriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF Data
Graph-TA
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databases
Graph-TA
 
Graph Based Word Spotting Approach for Large Document Collections
Graph Based Word Spotting Approach for Large Document CollectionsGraph Based Word Spotting Approach for Large Document Collections
Graph Based Word Spotting Approach for Large Document Collections
Graph-TA
 
Use of graphs for political analysis
Use of graphs for political analysisUse of graphs for political analysis
Use of graphs for political analysis
Graph-TA
 
Graphium Chrysalis: Exploiting Graph Database
Graphium Chrysalis: Exploiting Graph DatabaseGraphium Chrysalis: Exploiting Graph Database
Graphium Chrysalis: Exploiting Graph Database
Graph-TA
 

More from Graph-TA (20)

Reactive Databases for Big Data applications
Reactive Databases for Big Data applicationsReactive Databases for Big Data applications
Reactive Databases for Big Data applications
 
Benchmarking Versioning for Big Linked Data
Benchmarking Versioning for Big Linked DataBenchmarking Versioning for Big Linked Data
Benchmarking Versioning for Big Linked Data
 
Use of Graphs for Cloud Service Selection in Multi-Cloud Environments
Use of Graphs for Cloud Service Selection in Multi-Cloud EnvironmentsUse of Graphs for Cloud Service Selection in Multi-Cloud Environments
Use of Graphs for Cloud Service Selection in Multi-Cloud Environments
 
Graphalytics: A big data benchmark for graph-processing platforms
Graphalytics: A big data benchmark for graph-processing platformsGraphalytics: A big data benchmark for graph-processing platforms
Graphalytics: A big data benchmark for graph-processing platforms
 
RDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL PlatformsRDF Graph Data Management in Oracle Database and NoSQL Platforms
RDF Graph Data Management in Oracle Database and NoSQL Platforms
 
GRAPHITE — An Extensible Graph Traversal Framework for RDBMS
GRAPHITE — An Extensible Graph Traversal Framework for RDBMSGRAPHITE — An Extensible Graph Traversal Framework for RDBMS
GRAPHITE — An Extensible Graph Traversal Framework for RDBMS
 
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphsOn the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
On the Discovery of Novel Drug-Target Interactions from Dense SubGraphs
 
Graphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platformsGraphalytics: A big data benchmark for graph processing platforms
Graphalytics: A big data benchmark for graph processing platforms
 
Autograph: an evolving lightweight graph tool
Autograph: an evolving lightweight graph toolAutograph: an evolving lightweight graph tool
Autograph: an evolving lightweight graph tool
 
Understanding Graph Structure in Knowledge Bases
Understanding Graph Structure in Knowledge BasesUnderstanding Graph Structure in Knowledge Bases
Understanding Graph Structure in Knowledge Bases
 
Finding patterns of chronic disease and medication prescriptions from a large...
Finding patterns of chronic disease and medication prescriptions from a large...Finding patterns of chronic disease and medication prescriptions from a large...
Finding patterns of chronic disease and medication prescriptions from a large...
 
Recent Updates on IBM System G — GraphBIG and Temporal Data
Recent Updates on IBM System G — GraphBIG and Temporal DataRecent Updates on IBM System G — GraphBIG and Temporal Data
Recent Updates on IBM System G — GraphBIG and Temporal Data
 
Analysing the degree distribution of real graphs by means of several probabil...
Analysing the degree distribution of real graphs by means of several probabil...Analysing the degree distribution of real graphs by means of several probabil...
Analysing the degree distribution of real graphs by means of several probabil...
 
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
SPIMBENCH: A Scalable, Schema-Aware Instance Matching Benchmark for the Seman...
 
Generating synthetic online social network graph data and topologies
Generating synthetic online social network graph data and topologiesGenerating synthetic online social network graph data and topologies
Generating synthetic online social network graph data and topologies
 
Deriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF DataDeriving an Emergent Relational Schema from RDF Data
Deriving an Emergent Relational Schema from RDF Data
 
Managing RDF data with graph databases
Managing RDF data with graph databasesManaging RDF data with graph databases
Managing RDF data with graph databases
 
Graph Based Word Spotting Approach for Large Document Collections
Graph Based Word Spotting Approach for Large Document CollectionsGraph Based Word Spotting Approach for Large Document Collections
Graph Based Word Spotting Approach for Large Document Collections
 
Use of graphs for political analysis
Use of graphs for political analysisUse of graphs for political analysis
Use of graphs for political analysis
 
Graphium Chrysalis: Exploiting Graph Database
Graphium Chrysalis: Exploiting Graph DatabaseGraphium Chrysalis: Exploiting Graph Database
Graphium Chrysalis: Exploiting Graph Database
 

Recently uploaded

ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
AhmedHussein950959
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
WENKENLI1
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
Jayaprasanna4
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
MLILAB
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
TeeVichai
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
obonagu
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
Divya Somashekar
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
Amil Baba Dawood bangali
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
Robbie Edward Sayers
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Sreedhar Chowdam
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
fxintegritypublishin
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
AafreenAbuthahir2
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
Kamal Acharya
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
SamSarthak3
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
gdsczhcet
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation & Control
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
Kamal Acharya
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
ongomchris
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
gerogepatton
 

Recently uploaded (20)

ASME IX(9) 2007 Full Version .pdf
ASME IX(9)  2007 Full Version       .pdfASME IX(9)  2007 Full Version       .pdf
ASME IX(9) 2007 Full Version .pdf
 
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdfGoverning Equations for Fundamental Aerodynamics_Anderson2010.pdf
Governing Equations for Fundamental Aerodynamics_Anderson2010.pdf
 
ethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.pptethical hacking-mobile hacking methods.ppt
ethical hacking-mobile hacking methods.ppt
 
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
H.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdfH.Seo,  ICLR 2024, MLILAB,  KAIST AI.pdf
H.Seo, ICLR 2024, MLILAB, KAIST AI.pdf
 
Railway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdfRailway Signalling Principles Edition 3.pdf
Railway Signalling Principles Edition 3.pdf
 
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
在线办理(ANU毕业证书)澳洲国立大学毕业证录取通知书一模一样
 
block diagram and signal flow graph representation
block diagram and signal flow graph representationblock diagram and signal flow graph representation
block diagram and signal flow graph representation
 
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
NO1 Uk best vashikaran specialist in delhi vashikaran baba near me online vas...
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
 
HYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generationHYDROPOWER - Hydroelectric power generation
HYDROPOWER - Hydroelectric power generation
 
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&BDesign and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
Design and Analysis of Algorithms-DP,Backtracking,Graphs,B&B
 
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdfHybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdf
 
WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234WATER CRISIS and its solutions-pptx 1234
WATER CRISIS and its solutions-pptx 1234
 
Student information management system project report ii.pdf
Student information management system project report ii.pdfStudent information management system project report ii.pdf
Student information management system project report ii.pdf
 
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdfAKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
AKS UNIVERSITY Satna Final Year Project By OM Hardaha.pdf
 
Gen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdfGen AI Study Jams _ For the GDSC Leads in India.pdf
Gen AI Study Jams _ For the GDSC Leads in India.pdf
 
Water Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdfWater Industry Process Automation and Control Monthly - May 2024.pdf
Water Industry Process Automation and Control Monthly - May 2024.pdf
 
Final project report on grocery store management system..pdf
Final project report on grocery store management system..pdfFinal project report on grocery store management system..pdf
Final project report on grocery store management system..pdf
 
space technology lecture notes on satellite
space technology lecture notes on satellitespace technology lecture notes on satellite
space technology lecture notes on satellite
 
Immunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary AttacksImmunizing Image Classifiers Against Localized Adversary Attacks
Immunizing Image Classifiers Against Localized Adversary Attacks
 

The scarcity of crossing dependencies: a direct outcome of a specific constraint?

  • 1. Introduction Crossing theory Evaluation The scarcity of crossing dependencies: a direct outcome of a specic constraint? C. Gómez-Rodríguez R. Ferrer-i-Cancho 1 LyS Research Group Departamento de Computación Universidade da Coruña 2Complexity Quantitatitve Linguistics Lab LARCA Research Group Universitat Politècnica de Catalunya Grapth-TA 4 (Barcelona), 4 March 2016 C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 2. Introduction Crossing theory Evaluation The scarcity of crossing syntactic dependencies Indeed , the government is taking a calculated risk . 5 4 1 2 1 4 3 2 1 We keep wondering what Mr. Gates wanted to say . 1 1 5 1 1 4 1 2 8 Dependencies tend to not cross when drawn above the sentence (Hays and Lecerf in the 1960s) The average C does not exceed 1 for most of the languages (sample of 30 treebanks) [Gómez-Rodríguez and Ferrer-i-Cancho, 2016]. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 3. Introduction Crossing theory Evaluation Why? Two major hypotheses The scarcity of crossings dependencies arises, Directly from an underlying rule or principle of human languages that is responsible for this fact (including the possibility of some cognitive cost associated directly to crossings). Held by the overwhelming majority of researchers. Serious problems [Ferrer-i-Cancho and Gómez-Rodríguez, 2016]: Require heavy assumptions that compromise the parsimony of linguistic theory as a whole. Involve explanations based on internal constraints of obscure nature. Indirectly, from the actual length of dependencies, which are constrained by a well-known psychological principle: dependency length minimization [Gómez-Rodríguez and Ferrer-i-Cancho, 2016]. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 4. Introduction Crossing theory Evaluation Crossing theory Q, set of pairs of edges of a graph that can potentially cross when their vertices are arranged linearly in some arbitrary order (edges sharing a vertex cannot cross). |Q|, the cardinality of Q, is the potential number of crossings. In a tree, one has |Q| = n(n − 1 − k 2 )/2, (1) |Q| = 0 if and only if the tree is a star tree [Ferrer-i-Cancho, 2013]. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 5. Introduction Crossing theory Evaluation 1st predictor of C The number of edge crossings C = (e1,e2)∈Q C (e1, e2), (2) C (e1, e2) is an indicator variable (C (e1, e2) = 1 if the edges e1 and e2 cross and C (e1, e2) = 0 otherwise). Null hypothesis that the vertices are arranged linearly at random (all possible orderings are equally likely). p(C (e1, e2) = 1) = 1/3 yields E0[C ] = |Q|/3 [Ferrer-i-Cancho, 2013]. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 6. Introduction Crossing theory Evaluation 2nd predictor of C Introducing knowledge about the length of the dependencies (edges of length 1 or n − 1 are not crossable). p(C (e1, e2) = 1) is replaced by p(C (e1, e2) = 1|d (e1), d (e2)), obtaining [Ferrer-i-Cancho, 2014] E2[C ] = (e1,e2)∈Q p(C (e1, e2) = 1|d (e1), d (e2)), (3) p(C (e1, e2) = 1|d (e1), d (e2)) depends only on n, d (e1) and d (e2) [Ferrer-i-Cancho, 2014]. E0[C ] is a true expectation while E2[C ] is not! C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 7. Introduction Crossing theory Evaluation Evaluation of the predictors: materials Corpora in version 2.0 of the HamleDT collection of treebanks [Zeman et al., 2014, Rosa et al., 2014]: a harmonization of existing treebanks for 30 dierent languages into two well-known annotation styles: Prague dependencies [Haji£ et al., 2006] and Universal Stanford dependencies [de Marnee et al., 2014]. Preprocessing: nodes corresponding to punctuation tokens or null elements were removed (non-punctuation nodes that had a punctuation node as their head were attached as dependents of their nearest non-punctuation ancestor). Same treatment for null elements. A syntactic dependency structure was included in our analyses if (1) it dened a tree and (2) the tree was not a star tree. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 8. Introduction Crossing theory Evaluation Evaluation of the predictors: methods C , number of crossings of the linear arrangement of a graph in general. Ctrue, the number of crossings of the syntactic dependencies of a real sentence. relative number of crossings, i.e. ¯C = C /|Q| or ¯Ctrue = Ctrue/|Q| [Ferrer-i-Cancho, 2014]. relative error of a predictor [Ferrer-i-Cancho, 2014] ∆x = Ex ¯C − ¯Ctrue = (Ex[C ] − Ctrue)/|Q|. (4) ∆0 will be used as a baseline for ∆2. ∆0 converges to 1/3 for suciently long sentences when Ctrue is small [Ferrer-i-Cancho, 2014]. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 9. Introduction Crossing theory Evaluation Results Mixing sentences of dierent lengths: The average ∆2, the relative error of the predictor E2[C ], is small: it does not exceed 5%. The average ∆2 is at least 6 times smaller than the baseline ∆0 ≈ 30%. Controlling for sentence length (grouping sentences by length): Average over group averages of ∆2: it does not exceed 4.3%. The average ∆2 is at least 7 times smaller than the baseline error, again ∆0 ≈ 30%. The minimum size of a group is one sentence; the qualitative results are very similar if the minimum size is set to 2. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 10. Introduction Crossing theory Evaluation Concluding remarks principle of dependency length minimization [Ferrer-i-Cancho, 2015] ↓ dependency lengths ↓ the actual number of crossings in sentences To explain the low frequency of crossings in world languages, it many not be necessary to recur to recur to A ban of crossings by grammar (e.g., [Hudson, 2007, Tanaka, 1997]). A principle of minimization of crossings [Liu, 2008] A competence-plus [Hurford, 2012] limiting the number of crossings. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 11. Introduction Crossing theory Evaluation de Marnee, M.-C., Dozat, T., Silveira, N., Haverinen, K., Ginter, F., Nivre, J., and Manning, C. D. (2014). Universal Stanford dependencies: a cross-linguistic typology. In Chair), N. C. C., Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis, S., editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland. European Language Resources Association (ELRA). Ferrer-i-Cancho, R. (2013). Random crossings in dependency trees. http://arxiv.org/abs/1305.4561. Ferrer-i-Cancho, R. (2014). A stronger null hypothesis for crossing dependencies. Europhysics Letters, 108:58003. Ferrer-i-Cancho, R. (2015). C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 12. Introduction Crossing theory Evaluation The placement of the head that minimizes online memory. A complex systems approach. Language Dynamics and Change, 5:141164. Ferrer-i-Cancho, R. and Gómez-Rodríguez, C. (2016). Liberating language research from dogmas of the 20th century. Glottometrics, 33:3334. Gómez-Rodríguez, C. and Ferrer-i-Cancho, R. (2016). The scarcity of crossing dependencies: a direct outcome of a specic constraint? http://arxiv.org/abs/1601.03210. Haji£, J., Panevová, J., Haji£ová, E., Panevová, J., Sgall, P., Pajas, P., ’t¥pánek, J., Havelka, J., and Mikulová, M. (2006). Prague Dependency Treebank 2.0. CDROM CAT: LDC2006T01, ISBN 1-58563-370-4. Linguistic Data Consortium. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 13. Introduction Crossing theory Evaluation Hudson, R. A. (2007). Language Networks: The New Word Grammar. Oxford University Press. Hurford, J. R. (2012). Chapter 3. Syntax in the Light of Evolution, pages 175258. Oxford University Press, Oxford. Liu, H. (2008). Dependency distance as a metric of language comprehension diculty. Journal of Cognitive Science, 9:159191. Rosa, R., Mašek, J., Marecek, D., Popel, M., Zeman, D., and Žabokrtský, Z. (2014). HamleDT 2.0: Thirty dependency treebanks stanfordized. In Chair), N. C. C., Choukri, K., Declerck, T., Loftsson, H., Maegaard, B., Mariani, J., Moreno, A., Odijk, J., and Piperidis, C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a
  • 14. Introduction Crossing theory Evaluation S., editors, Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), Reykjavik, Iceland. European Language Resources Association (ELRA). Tanaka, H. (1997). Invisible movement in sika-nai and the linear crossing constraint. Journal of East Asian Linguistics, 6:143188. Zeman, D., Du²ek, O., Mare£ek, D., Popel, M., Ramasamy, L., ’t¥pánek, J., šabokrtský, Z., and Haji£, J. (2014). HamleDT: Harmonized multi-language dependency treebank. Language Resources and Evaluation, 48(4):601637. C. Gómez-Rodríguez R. Ferrer-i-Cancho The scarcity of crossing dependencies: a direct outcome of a