These are the presentation slides for the paper "Attentional Parallel RNNs for Generating Punctuation in Transcribed Speech", presented at the 5th International Conference on Statistical Language and Speech Processing (SLSP 2017).
Abstract
Until very recently, the generation of punctuation marks for automatic speech recognition (ASR) output has been mostly done by looking at the syntactic structure of the recognized utterances. Prosodic cues such as breaks, speech rate, pitch intonation that influence placing of punctuation marks on speech transcripts have been seldom used. We propose a method that uses recurrent neural networks, taking prosodic and lexical information into account in order to predict punctuation marks for raw ASR output. Our experiments show that an attention mechanism over parallel sequences of prosodic cues aligned with transcribed speech improves accuracy of punctuation generation.
SLSP 2017 presentation - Attentional Parallel RNNs for Generating Punctuation in Transcribed Speech
1. Attentional Parallel RNNs for
Generating Punctuation in
Transcribed Speech
Alp Öktem, Mireia Farrús, Leo Wanner
E-mail: alp.oktem@upf.edu
Other works: https://www.researchgate.net/profile/Alp_Oktem
Github: https://github.com/alpoktem
2. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
3. Motivation
...
so under that basis we put it out and said
look we're skeptical about this thing we
don't know but what can we do the
material looks good it feels right but we
just can't verify it and we then got a letter
just this week from the company who
wrote it wanting to track down the source
saying hey we want to track down the
source and we were like oh tell us more
what document is it precisely you're
talking about can you show that you had
legal authority over that document is it
really yours
...
ASR
4. Motivation
ASR
...
So under that basis, we put it out and
said, "Look, we're skeptical about this
thing. We don't know, but what can we
do? The material looks good, it feels
right, but we just can't verify it." And we
then got a letter just this week from the
company who wrote it, wanting to track
down the source saying, "Hey, we want
to track down the source." And we were
like, "Oh, tell us more. What document is
it, precisely, you're talking about? Can
you show that you had legal authority
over that document? Is it really yours?
...
5. Why punctuation?
Punctuation serves for:
● For human readability,
● To aid interpretation,
● For machine processing:
○ Parsing
○ Machine translation
6. Motivation
RESEARCH QUESTIONS
1. How to approach the problem of unpunctuated ASR output?
2. Which linguistic phenomena affect the placement of
punctuation marks in spoken text?
7. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
8. Punctuating Spoken Text
What signals punctuation in speech?
1) Syntax/Orthography:
Usage of commas, which are required e.g. in separating clauses, depends a lot on
syntax.
Today, I am giving a talk.
10. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
11. Related Work
❖ Data-driven models → Trainable on any language
❖ Recurrent Neural Networks (RNN) employed on two kinds of data:
Written data: lexical and POS features (Ballesteros et al., 2016)
Written + spoken data: lexical features and pause durations, training in two stages (Tilk et al., 2016)
Many prosodic features contributing to punctuation usage are neglected!
12. Our Approach
❖ Process lexical and prosodic information in parallel.
❖ Train a model solely from spoken data
❖ Test various acoustic features contributing to prosody:
➢ Pause durations
➢ Fundamental frequency (f0)
➢ Intensity
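One of these features, pause duration, can be derived directly from word-level time alignments. The sketch below is illustrative only: the input format and function name are assumptions, not the paper's actual pipeline (punkProse).

```python
# Hedged sketch: computing the pause-duration feature named above from
# word-level time alignments. Input format and function name are
# illustrative assumptions, not the actual punkProse pipeline.

def pause_durations(words):
    """Pause before each word = its start time minus the previous word's end."""
    pauses = [0.0]  # no pause before the first word
    for prev, cur in zip(words, words[1:]):
        pauses.append(round(max(0.0, cur["start"] - prev["end"]), 3))
    return pauses

# Toy alignment: the long pause before "i" is the kind of cue that can
# signal a comma or sentence boundary.
words = [
    {"w": "today", "start": 0.00, "end": 0.40},
    {"w": "i",     "start": 0.75, "end": 0.85},
    {"w": "am",    "start": 0.90, "end": 1.05},
]
print(pause_durations(words))  # [0.0, 0.35, 0.05]
```

f0 and intensity would instead be averaged per word from frame-level acoustic analysis; pause duration is shown here because it needs only the alignment itself.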
13. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
17. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
18. Data
❖ 1046 TED Talks
❖ 884 English speakers
❖ 156034 sentences
❖ Manual transcription available
https://www.ted.com/talks
20. Experimental Setup
❖ Reduced punctuation set
❖ 50 words per training sample
❖ 59811 samples
❖ 70%-15%-15%: training, testing, validation
❖ Word vocabulary: 13830
❖ Implementation using Theano
[Figure: reduced punctuation set, including a "no punctuation" class]
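The setup above can be sketched as fixed-length sampling plus a random split. Non-overlapping 50-token windows and a seeded shuffle are illustrative assumptions; the paper's exact segmentation may differ.

```python
import random

# Hedged sketch of the experimental setup: cut the corpus into 50-token
# training samples and split them 70%/15%/15%. Function names, the
# non-overlapping windowing, and the shuffle are assumptions.

SAMPLE_LEN = 50

def make_samples(tokens, sample_len=SAMPLE_LEN):
    """Non-overlapping fixed-length windows; a short tail is dropped."""
    return [tokens[i:i + sample_len]
            for i in range(0, len(tokens) - sample_len + 1, sample_len)]

def split_samples(samples, seed=0):
    """70% training, 15% testing, 15% validation, as on the slide."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    a, b = round(0.70 * n), round(0.85 * n)
    return shuffled[:a], shuffled[a:b], shuffled[b:]

tokens = [f"w{i}" for i in range(5000)]   # stand-in for the TED transcripts
samples = make_samples(tokens)            # 100 samples of 50 tokens each
train, test, dev = split_samples(samples) # 70 / 15 / 15
```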
21. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
24. Results from Testing Set
julian _ welcome . it's _ been _ reported _ that _ wikileaks _ your _ baby _ has _ in _
the _ last _ few _ years _ has _ released _ more _ classified _ documents _ than _ the
_ rest _ of _ the _ world's _ media _ combined . can _ that _ possibly _ be _ true ?
yeah , can _ it _ possibly _ be _ true ? it's _ a _ worry . isn't _ it _ that _ the _ rest _ of _
the _ world's _ media _ is _ doing _ such _ a _ bad _ job _ that _ a _ little _ group _ of
_ activists _ is _ able _ to _ release _ more _ of _ that _ type _ of _ information _ than _
the _ rest _ of _ the _ world _ press _ combined . how _ does _ it _ work ? how _ do _
people _ release _ the _ documents ?
who _ was _ the _ richest _ man ? still _ is _ the _ richest _ man _ in _ kenya .
when _ we _ released _ that _ report , we _ did _ so _ three _ days _ after _ the _ new
_ president _ kibaki _ had _ decided _ to _ pal _ up _ with _ the _ man _ that _ he _
was _ going _ to _ clean _ out , daniel _ arap _ moi .
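Outputs like the ones above are scored per punctuation class with precision, recall, and F1. A minimal, self-contained sketch on toy labels (not the paper's evaluation code) shows the computation:

```python
# Hedged sketch: per-class precision/recall/F1 over predicted inter-word
# punctuation labels. The labels and sequences below are toy examples,
# not the paper's test set or evaluation script.

def f1_per_class(gold, pred, label):
    tp = sum(1 for g, p in zip(gold, pred) if g == p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    prec = tp / (tp + fp) if tp + fp else 0.0
    rec = tp / (tp + fn) if tp + fn else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

# "_" marks the "no punctuation" class between words.
gold = ["_", ",", "_", ".", "_", "?", "_", "."]
pred = ["_", ",", "_", "_", "_", "?", ",", "."]
print(round(f1_per_class(gold, pred, "."), 2))  # 0.67
```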
25. Contents
1) Motivation
2) Punctuating spoken text
3) Approaches
a) Related Work
b) Our approach
4) Proposed model
5) Data and experimental setup
6) Results
7) Contributions
26. Contributions
❖ A study on the effect of various acoustic features on
punctuating spoken text.
❖ A model that is able to...
➢ process lexical/prosodic features in parallel
➢ integrate any aligned feature
❖ Training solely on spoken data
❖ Improvement compared to baseline (+9.1% in terms of F1-score)
Source code available at:
https://github.com/alpoktem/punkProse