The objective of this work is to provide a complete analysis of a piece of conversation, covering the following aspects:
- Phonological features of the dialogue and a brief statistical analysis;
Alto is a game audio localization tool that compares thousands of dialogue lines in different languages and generates reports. It checks their audio format, loudness, duration, and more. It can also interface with game audio middleware (ADX2, Fabric, FMOD, Wwise...). It includes special tools such as an interactive dialogue tester, a speech synthesizer for placeholder dialogue, and more!
http://www.tsugi-studio.com/?page_id=1923
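The kind of format and duration checks such a tool performs can be sketched with Python's standard `wave` module. This is an independent illustration, not Alto's implementation, and `make_test_wav`/`audio_report` are hypothetical helper names:

```python
import io
import math
import struct
import wave

def make_test_wav(seconds=1.0, rate=16000):
    """Generate an in-memory mono 16-bit WAV containing a 440 Hz tone."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)       # 16-bit samples
        w.setframerate(rate)
        n = int(seconds * rate)
        frames = b"".join(
            struct.pack("<h", int(32767 * 0.5 * math.sin(2 * math.pi * 440 * i / rate)))
            for i in range(n)
        )
        w.writeframes(frames)
    buf.seek(0)
    return buf

def audio_report(fileobj):
    """Collect the basic format/duration facts a localization QA check might compare."""
    with wave.open(fileobj, "rb") as w:
        frames = w.getnframes()
        rate = w.getframerate()
        return {
            "channels": w.getnchannels(),
            "sample_width_bytes": w.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }

report = audio_report(make_test_wav(seconds=1.0))
```

A real tool would run such a report over every localized variant of a line and flag mismatches in duration or format between languages.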
An Extensible Multilingual Open Source Lemmatizer (COMRADES project)
Ahmet Aker, Johann Petrak, and Firas Sabbah
Department of Computer Science, University of Sheffield
Department of Information Engineering, University of Duisburg-Essen
a.aker@is.inf.uni-due.de, johann.petrak@sheffield.ac.uk
firas.sabbah@stud.uni-due.de
Dialogue Act Modeling for Automatic Tagging and Recognition (Vipul Munot)
Aims to present a comprehensive framework for modelling and automatic classification of DAs, founded on well-known statistical methods, and presents results obtained with this approach on a large, widely available corpus of spontaneous conversational speech.
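As an illustration only (not the framework's actual model), a minimal statistical dialogue-act classifier can be sketched as add-one-smoothed Naive Bayes over utterance unigrams:

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """samples: list of (utterance, dialogue_act). Returns a Naive Bayes model."""
    act_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, act in samples:
        act_counts[act] += 1
        for w in text.lower().split():
            word_counts[act][w] += 1
            vocab.add(w)
    return act_counts, word_counts, vocab

def classify(model, text):
    """Pick the dialogue act maximizing log prior + smoothed log likelihood."""
    act_counts, word_counts, vocab = model
    total = sum(act_counts.values())
    best_act, best_score = None, float("-inf")
    for act in act_counts:
        score = math.log(act_counts[act] / total)
        denom = sum(word_counts[act].values()) + len(vocab)  # add-one smoothing
        for w in text.lower().split():
            score += math.log((word_counts[act][w] + 1) / denom)
        if score > best_score:
            best_act, best_score = act, score
    return best_act

model = train([
    ("what time is it", "QUESTION"),
    ("where are you going", "QUESTION"),
    ("i am going home", "STATEMENT"),
    ("it is five o'clock", "STATEMENT"),
])
```

Real systems additionally exploit prosodic features and the sequential structure of dialogue (e.g., with hidden Markov models), which a per-utterance unigram model ignores.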
Introductory lecture of the seminar course on Computational Pragmatics (Timo Honkela)
T-61.6020 Computational Pragmatics
Timo Honkela, Aalto University School of Science
Spring 2012
Pragmatics is a subfield of linguistics which studies the ways in
which context contributes to meaning. It studies how the transmission
of meaning depends not only on the linguistic knowledge of the speaker
and listener, but also on the context of the utterance, knowledge
about the status of those involved, the intent of the speaker, etc.
Even though pragmatics is traditionally considered an area of
linguistics, similar considerations related to meaning in context are
also relevant for information systems design and especially
interactive systems development. An interesting issue within computer
science is the interface between pragmatics and semantics. Ontologies
are used in the Semantic Web to define prototypical meanings, but in
real-world contexts pragmatics deals with the subjective and
contextual variation around prototypical meanings. In human-to-machine
communication, information systems may have practical uses in new
contexts beyond the ones defined originally by the designer of the
system. In machine-to-machine communication, formal semantics may fall
short in solving interoperability issues and thus issues related to
pragmatics need to be considered. Overall, the focus is on how
understanding takes place, not on how meanings are defined.
During the course, the participants are introduced to the main
linguistic theories related to pragmatics including but not limited to
the theories about the functions of languages, the speech act theory,
and the theory of conversational maxims. The participants will
familiarize themselves with computational models in the area of
pragmatics with specific focus on dynamic and adaptive systems and
statistical machine learning. They will also conduct a small empirical
study related to the subjectivity and contextuality of meaning using
the grounded intersubjective concept analysis (GICA). The collected
data will be analyzed using statistical methods.
Are you responsible for developing satellite on-board software? Are you the Dutch government and you have to efficiently implement the public benefits law? Are you a healthcare startup, developing companion apps that help patients through a treatment? Are you an insurance company struggling to create new products, and evolve existing ones, quickly enough to keep up with the market? These are all examples of organisations that have built their own domain-specific programming language to streamline the development of applications that have a non-trivial algorithmic core. All have built their languages with JetBrains MPS, an open source language development tool optimized for ecosystems of collaborating languages with mixed graphical, textual, tabular and mathematical notations. This talk has four parts. I start by motivating the need for DSLs based on real-world examples, including the ones above. I will then present a few high-level design practices that guide our language development work. Third, I will develop a simple language extension to give you a feel for how MPS works. And finally, I will point you to things you can read to get started with your own language development practice.
PSEUDOCODE TO SOURCE PROGRAMMING LANGUAGE TRANSLATOR (ijistjournal)
Pseudocode is an artificial and informal language that helps developers to create algorithms. In this paper a software tool is described for translating the pseudocode into a particular source programming language. This tool compiles the pseudocode given by the user and translates it to a source programming language. The scope of the tool is wide, as it can be extended to a universal programming tool which produces any specified programming language from a given pseudocode. Here we present the solution for translating the pseudocode to a programming language by using the different stages of a compiler.
SOFTWARE TOOL FOR TRANSLATING PSEUDOCODE TO A PROGRAMMING LANGUAGE (IJCI JOURNAL)
Pseudocode is an artificial and informal language that helps programmers to develop algorithms. In this paper a software tool is described for translating the pseudocode into a particular programming language. This tool takes the pseudocode as input, compiles it, and translates it to a concrete programming language. The scope of the tool is wide, as it can be extended to a universal programming tool which produces any specified programming language from a given pseudocode. Here we present the solution for translating the pseudocode to a programming language by implementing the stages of a compiler.
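The translation idea can be illustrated with a toy line-by-line translator that emits Python. The pseudocode dialect below (`SET ... TO`, `IF ... THEN`, `PRINT`, `ENDIF`) is made up for this sketch and is not the papers' actual grammar:

```python
import re

def translate(pseudocode):
    """Translate a tiny made-up pseudocode dialect into Python source,
    tracking indentation to reproduce block structure."""
    lines = []
    indent = 0
    for raw in pseudocode.strip().splitlines():
        line = raw.strip()
        pad = "    " * indent
        if m := re.match(r"SET (\w+) TO (.+)", line):
            lines.append(f"{pad}{m[1]} = {m[2]}")          # assignment
        elif m := re.match(r"IF (.+) THEN", line):
            lines.append(f"{pad}if {m[1]}:")               # open a block
            indent += 1
        elif line == "ENDIF":
            indent -= 1                                    # close the block
        elif m := re.match(r"PRINT (.+)", line):
            lines.append(f"{pad}print({m[1]})")
        else:
            raise ValueError(f"unrecognized line: {line!r}")
    return "\n".join(lines)

source = translate("""
SET x TO 3
IF x > 2 THEN
    PRINT x
ENDIF
""")
```

A full tool, as the abstracts describe, would go through proper compiler stages (lexing, parsing to a syntax tree, then code generation per target language) rather than per-line pattern matching.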
By speech synthesis we can, in theory, mean any kind of artificial generation of speech. For example, it can be the process in which a speech decoder generates the speech signal based on the parameters it has received through the transmission line, or it can be a procedure performed by a computer to estimate some kind of representation of the speech signal given a text input. Since there is a special course about codecs (Puheen koodaus, Speech Coding), this chapter will concentrate on text-to-speech synthesis, or TTS for short, which will often be referred to simply as speech synthesis. In any case, it is good to keep in mind that irrespective of what kind of synthesis we are dealing with, similar criteria apply with regard to speech quality. We will return to this topic after a brief motivation for TTS, and the rest of this chapter will be dedicated to the implementation point of view of TTS systems. Text-to-speech synthesis is a research field that has received a lot of attention and resources during the last couple of decades, for excellent reasons. One of the most interesting ideas (rather futuristic, though) is that a workable TTS system, combined with a workable speech recognition device, would actually be an extremely efficient method for speech coding. It would provide an incomparable compression ratio and flexible possibilities to choose the type of speech (e.g., breathless or hoarse), the fundamental frequency along with its range, the rhythm of speech, and several other effects. Furthermore, if the content of a message needs to be changed, it is much easier to retype the text than to record the signal again. Unfortunately, such a system does not yet exist for large vocabularies. Of course, there are also numerous speech synthesis applications that are closer to being available than the one discussed above.
For instance, a telephone inquiry system, where the information is frequently updated, can use TTS to deliver answers to customers. Speech synthesizers are also important to the visually impaired and to those who have lost their ability to speak. Several other examples can be found in everyday life, such as listening to messages and news instead of reading them, and using hands-free functions through a voice interface in a car.
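The compression-ratio claim above can be made concrete with back-of-the-envelope arithmetic. The figures assumed here (16 kHz, 16-bit mono PCM audio, roughly 150 spoken words per minute, about 6 bytes per word of text) are illustrative, not taken from the source:

```python
def tts_compression_ratio(n_words, avg_word_bytes=6, words_per_min=150,
                          sample_rate=16000, bytes_per_sample=2):
    """Compare the size of a text message with the raw PCM audio of it spoken aloud."""
    text_bytes = n_words * avg_word_bytes
    duration_s = n_words / words_per_min * 60          # speaking time in seconds
    audio_bytes = duration_s * sample_rate * bytes_per_sample
    return audio_bytes / text_bytes

# A 100-word message: ~600 bytes of text vs. ~1.28 MB of uncompressed audio.
ratio = tts_compression_ratio(100)
```

Even against a modern speech codec rather than raw PCM, transmitting text plus a synthesis model on the receiving end would still win by orders of magnitude, which is the point the passage makes.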
Envisioning the Future of Language Workbenches (Markus Voelter)
Over the last couple of years, I have used MPS successfully to build interesting (modeling and programming) languages in a wide variety of domains, targeting both business users and engineers. I've used MPS because it is currently the most powerful language workbench; lots of things are good about it, in particular its support for a multitude of notations and language modularity. But it is also obvious that MPS is not going to be viable for the medium- to long-term future; the most obvious reason for this statement is that it is not web/cloud-based. In this keynote, I will quickly recap why and how we have been successful with MPS, and point out what language workbenches could look like in the future; I will outline challenges, opportunities and research problems. I hope to spawn discussions for the remainder of the workshop.
Essentials of Automations: The Art of Triggers and Actions in FME (Safe Software)
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speed up fuzzing campaigns by pinpointing and eliminating uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR not only discovers new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are the slides of the talk given at the IEEE International Conference on Software Testing, Verification and Validation Workshops, ICSTW 2022.
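The core idea of pruning seed bytes whose mutation never changes observed behaviour can be sketched generically. This is an illustration of the concept, not DIAR's actual algorithm, and `toy_behaviour` is a contrived stand-in for a real coverage fingerprint:

```python
def trim_seed(seed, behaviour, mutations=(0x00, 0xFF, 0xAA)):
    """Drop bytes whose mutation never changes the program's observed behaviour.

    `behaviour` is a callable mapping an input to some observable (in real
    fuzzing, a coverage fingerprint). A byte is kept only if at least one
    tried mutation of it changes the observable.
    """
    baseline = behaviour(seed)
    keep = bytearray()
    for i, b in enumerate(seed):
        interesting = False
        for m in mutations:
            if m == b:
                continue  # not actually a mutation
            mutated = seed[:i] + bytes([m]) + seed[i + 1:]
            if behaviour(mutated) != baseline:
                interesting = True
                break
        if interesting:
            keep.append(b)
    return bytes(keep)

# Toy target: behaviour depends only on the first 4 bytes (a magic header),
# so the trailing padding bytes are "uninteresting" and get removed.
def toy_behaviour(data):
    return data[:4] == b"ELF!"

trimmed = trim_seed(b"ELF!" + b"\x00" * 12, toy_behaviour)
```

Note this sketch runs the target once per byte per mutation, which is only worthwhile if the time saved across a long campaign outweighs that upfront cost, the trade-off the abstract is about.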
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -... (DanBrown980551)
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs (Alex Pruden)
This paper presents Reef, a system for generating publicly verifiable succinct non-interactive zero-knowledge proofs that a committed document matches or does not match a regular expression. We describe applications such as proving the strength of passwords, the provenance of email despite redactions, the validity of oblivious DNS queries, and the existence of mutations in DNA. Reef supports the Perl Compatible Regular Expression syntax, including wildcards, alternation, ranges, capture groups, Kleene star, negations, and lookarounds. Reef introduces a new type of automata, Skipping Alternating Finite Automata (SAFA), that skips irrelevant parts of a document when producing proofs without undermining soundness, and instantiates SAFA with a lookup argument. Our experimental evaluation confirms that Reef can generate proofs for documents with 32M characters; the proofs are small and cheap to verify (under a second).
Paper: https://eprint.iacr.org/2023/1886
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf (Paige Cruz)
Monitoring and observability aren't traditionally found in software curriculums, and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is part of our current company's observability stack.
While the dev and ops silo continues to crumble, many organizations still relegate monitoring and observability to the purview of ops, infra, and SRE teams. This is a mistake: achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
How to Get CNIC Information System with Paksim Ga.pptx (danishmna97)
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
DevOps and Testing slides at DASA Connect (Kari Kakkonen)
Slides by me and Rik Marselis at the DASA Connect conference on 30.5.2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps is. We ended with a lovely workshop in which the participants tried to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with a passion for making things work, along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations on CI/CD and application security integrated into the software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Dr. Sean Tan, Head of Data Science, Changi Airport Group
Discover how Changi Airport Group (CAG) leverages graph technologies and generative AI to revolutionize their search capabilities. This session delves into the unique search needs of CAG’s diverse passengers and customers, showcasing how graph data structures enhance the accuracy and relevance of AI-generated search results, mitigating the risk of “hallucinations” and improving the overall customer journey.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
A tale of scale & speed: How the US Navy is enabling software delivery from l... (sonjaschweigert1)
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATO’s (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Table of Contents
1. Introduction: Goals of the Assignment and used tools
2. Choice of the dialogue and text to speech alignment with SPPAS
3. Editing the dialogue tiers in Praat and writing a Script for Processing
4. POS Tagging
5. Semantic Analysis with JWNL
6. Results and main statistics
7. Conclusions
8. Appendix: Lines of Code
1. Introduction: Goals of the Assignment and used tools
The objective of this work is to provide a complete analysis of a piece of conversation, covering the following aspects:
• phonological features of the dialogue and a brief statistical analysis;
• a subdivision into dialogue acts using the DAMSL model;
• the POS tagging of the dialogue;
• a brief semantic analysis;
• a graphical representation of the results.
Given these goals, the first step was the choice of the right dialogue for the purpose of the analysis. The audio file of the dialogue, together with its written transcription, was given as input to SPPAS (Automatic Phonetic Annotation of Speech), a tool for aligning audio with text that also provides tokenization and phonetization features.
The SPPAS analysis produced the text aligned with the audio file, which was then used as input to Praat, a tool to capture audio features of speech such as pitch, intensity and formants. The alignment was manually edited in Praat to provide the best match between transcription and audio, and a Praat script was then created to append some audio features and further annotations to the words in the .txt file.
The POS tagging part of the project was carried out using the Stanford University POS tagger. After this phase the .txt file with the data looked like a table with audio, dialogue and syntactic features associated with each word of the conversation.
The last part of the project involved the semantic analysis of the dialogue, leveraging the JWNL Java library to query the WordNet lexical database.
The graphical results were produced by importing the final .txt file into Microsoft Excel.
2. Choice of the dialogue and text to speech alignment with SPPAS
The choice of a suitable dialogue for the analysis was probably the hardest step in the assignment, due to the constraints imposed by SPPAS's limited processing capabilities. My first idea was to use an artistically relevant dialogue, so I started with an excerpt from the film Eyes Wide Shut by Stanley Kubrick and tried to get the best results in terms of alignment.
SPPAS (version 1.4.8) does not perform well with:
• audio files longer than 2 minutes;
• excerpts from films, which usually contain significant background noise;
• realistic and natural dialogues, due to overlapping voices, non-word phonemes and other imperfections.
The Bill and Victor dialogue had all three of these characteristics, so it was almost impossible to obtain a sufficient alignment, even with subsequent editing in Praat. I tried to remove some noise and keep only the speech parts of the audio file using a simple MATLAB script (see the appendix for the code), but it didn't work.
The second attempt was the dialogue from the Italian film Il Divo by Paolo Sorrentino, in which the speech seemed clearer and more fluid than in the previous one. SPPAS also supports processing of Italian-language dialogues. Unfortunately this audio file showed the same drawbacks as the previous one, even though I also tried to split the processing into shorter fragments of the audio file, as can be seen in the folder.
The last attempt was a plain English educational dialogue between two girls, which worked really well with SPPAS processing. Despite its simplicity and linear dialogue interaction, it had a good level of emotional speech and was expressive enough for the purpose of the assignment.
To enable a correct alignment with SPPAS I also put hashes in the .txt file to mark the pauses in the dialogue. This is another limitation of SPPAS: without the silences being traced in the .txt file it cannot provide a precise alignment. The resulting files are shown in the project folder “SPPAS Processing”.
3. Editing the dialogue tiers in Praat and writing a Script for Processing
Since the process of alignment in SPPAS was not precise, further editing in Praat was needed, moving boundaries and tokens to the correct positions where necessary. The results of this editing were saved in the TextGrid file “dialogue-flat-phon_palign”, in the folder “Editing in Praat”.
Two more tiers were added to the TextGrid file, indicating the class of dialogue act (following the dialogue act classification proposed in the DAMSL model) and the speaker.
The final TextGrid file featured the following tiers:
• PhonAlign Tier;
• PhnTokAlign Tier;
• TokensAlign Tier;
• DialogueAct Tier;
• Speaker.
In the next phase I moved from the Praat editor view to the Praat scripting language, to extract the required audio features associated with each word token in the dialogue. The Praat script “features.praat” takes the Wave file and the TextGrid file as input and produces a .txt file which shows:
• Word token;
• Mean Pitch of token;
• Mean Intensity of token;
• DialogueAct;
• Speaker.
The results were saved in the .txt file “conversation-audio” in the folder “Editing in Praat”.
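The table produced by the script can then be post-processed outside Praat. As a minimal sketch (in Python for brevity, with invented sample rows, assuming the whitespace-separated column layout Token, MeanPitch, Intensity, DialogueAct, Speaker), the per-speaker pitch averages used later in the Excel sheets could be computed like this:

```python
# Hedged sketch, not the assignment's actual code: average the MeanPitch
# column of the features.praat output, grouped by the Speaker column.
from collections import defaultdict

def mean_pitch_by_speaker(lines):
    """Return {speaker: mean pitch in Hz} from whitespace-separated rows."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        fields = line.split()
        if len(fields) < 5:
            continue  # skip malformed or empty lines
        token, pitch, intensity, act, speaker = fields[:5]
        totals[speaker] += float(pitch)
        counts[speaker] += 1
    return {s: totals[s] / counts[s] for s in totals}

# Invented sample rows in the assumed format
sample = [
    "hello      210.50 72.10 Conventional-opening Amanda",
    "hi         245.00 70.30 Conventional-opening Karen",
    "friday     180.00 68.00 Answer               Amanda",
]
print(mean_pitch_by_speaker(sample))  # -> {'Amanda': 195.25, 'Karen': 245.0}
```

The same grouping pattern extends to mean intensity per speaker or per dialogue act.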
4. POS Tagging
The part-of-speech tagging of each word in the dialogue was obtained with the Stanford POS tagger (version 3.2.0). The result of the tagging operation was stored in the file “conversation-tagged.txt”. A pretrained model was used to assign part-of-speech tags to the unlabeled text; the adopted model was “wsj-0-18-left3words-distsim”, included in the Stanford POS tagger package.
After the POS tagging I noticed some mistakes by the tagger, e.g. some noun terms were recognized as verbs and vice versa, but the majority of words had the right tag.
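The tagger writes its output as word_TAG pairs joined by an underscore, which is the format the JWNL code in the appendix later splits on "_". A small Python sketch (the sentence is an invented example) of parsing that format back into token/tag pairs:

```python
# Hedged sketch: split Stanford-tagger "word_TAG" output into (word, tag)
# pairs. rpartition is used so an underscore inside a word is tolerated.
def parse_tagged(text):
    pairs = []
    for chunk in text.split():
        word, _, tag = chunk.rpartition("_")
        pairs.append((word, tag))
    return pairs

print(parse_tagged("What_WP are_VBP you_PRP doing_VBG on_IN Friday_NNP"))
```

Each resulting tag can then be mapped to a coarse WordNet word type, as the getWordNetPOS function in the appendix does.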
5. Semantic Analysis with JWNL
JWNL is a Java API (Application Programming Interface) to access and query the WordNet database. In this context JWNL was used to find the domains of each word token. I used version 2.0 of WordNet, version 1.4 of JWNL and Eclipse as the IDE, with the Java 1.7 SDK and JRE 7 (Java Runtime Environment).
To find the domains of each token I leveraged the CATEGORY pointer type, and when no related domains were found I used a function I wrote that recursively searches for the root hypernym. The Java project reads the .txt input file “conversation-tagged” in the folder “POS tagging”, and writes the .txt file “dialogue-audio-pos-domains” as output.
One issue in this operation was that the CATEGORY pointer failed for many tokens, and the recursive search for hypernyms returned base classes like “entity” or “abstraction”, which are too general for the purpose of a semantic domain search.
The final results of all processing are stored in the Excel file “Dialogue Data” and in the flat .txt file “dialogue-audio-pos-domains-def”.
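The recursive climb to a root hypernym can be illustrated with a toy sketch (in Python; the miniature hierarchy below is invented for the example, not real WordNet data). It also shows why the fallback is of limited use: every chain ends at an overly general root.

```python
# Toy illustration of the recursive hypernym search: follow parent links
# until a node with no hypernym is reached. Hypothetical mini-hierarchy.
HYPERNYM = {
    "dog": "canine",
    "canine": "carnivore",
    "carnivore": "animal",
    "animal": "entity",  # "entity" has no hypernym: it is a root
}

def root_hypernym(word):
    """Walk the hypernym chain until no parent exists."""
    while word in HYPERNYM:
        word = HYPERNYM[word]
    return word

print(root_hypernym("dog"))  # reaches the overly general root "entity"
```

In WordNet the same walk (via the HYPERNYM pointer type) almost always terminates at "entity" or "abstraction", which is exactly the problem noted above.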
6. Results and main statistics
Data from the dialogue analysis were all imported into the Excel file “Dialogue Data”, which includes four sheets:
– General Data: table with all fields and values;
– Speaker Pitch-Intensity: pitch & intensity data and graphics;
– Dialogue Acts: analysis of dialogue acts;
– Domains: analysis of domains.
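The Dialogue Acts sheet essentially reduces to a frequency count of DAMSL labels over the tokens. A trivial Python sketch with invented labels:

```python
# Hedged sketch of the tally behind the "Dialogue Acts" sheet: count how
# often each DAMSL act label occurs. The label list here is made up.
from collections import Counter

acts = ["Info-Request", "Answer", "Answer", "Info-Request", "Answer", "Offer"]
distribution = Counter(acts)
print(distribution.most_common())  # -> [('Answer', 3), ('Info-Request', 2), ('Offer', 1)]
```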
In the analysis, non-word utterances were not taken into account, since there is only one non-word token in the conversation.
[Figure: Pitch Trend by Speaker, showing mean pitch (Hz) per token number for speakers Amanda and Karen.]
[Figure: Intensity Trend by Speaker, showing mean intensity (dB) per token number for speakers Amanda and Karen.]
7. Conclusions
Due to the difficulties in SPPAS processing, the chosen dialogue is a very simple type of conversation, so the DAMSL analysis and the domain analysis did not show significant results. The topic of the conversation is general, so there is no particular trend in the semantic domains of the word tokens. The conversation is evenly distributed, with the two speakers having almost the same number of tokens. The conversation shows slight variations in pitch, and the fundamental frequency of Amanda's voice is quite different from Karen's, showing the different timbre of the two speakers, though always remaining in the range of common female values. Among the average pitch results there is a significant outlier associated with Amanda's expression “on friday”: the values of 97 and 107 Hz sound a little unrealistic for a female voice. The average intensity of the tokens shows that the volume of the dialogue remains constant during the conversation: there is no soft speaking and the two speakers talk at the same volume (only a 2 dB difference).
The Praat analysis is probably the most reliable, together with the POS tagging, whereas the analysis carried out with JWNL shows evident limits in recognizing the correct domains of speech. Most of the domains found are clearly wrong for this kind of dialogue, and the reason is that knowledge of the context in which a word token occurs would be necessary to reach the right semantic domain.
The conversation between Amanda and Karen is a Q & A conversation, so it is no surprise that a high percentage of dialogue acts falls into the Answer and Info-Request types. More pleasant expressions seem to have higher pitch and intensity, whereas action-directives, open-options and offers show lower pitch and sometimes lower intensity, suggesting that when a speaker makes a proposal, they probably want to convey a feeling of modesty and avoid the impression of an imposition.
8. Appendix: Lines of Code
MATLAB CODE

function [y_n] = remove_noise(y, win_len, mean_val, atten)
% This function performs a background noise attenuation, provided that the
% loudness difference between noise and original signal is high enough.
%   y        = signal with noise
%   win_len  = frame length used to calculate the noise impact
%   mean_val = threshold which discriminates between noise and signal
%   atten    = attenuation value used to cut the noise
for n = 1:(length(y)-win_len)
    if (sum(abs(y(n:(n+win_len-1)))) < mean_val*win_len && ...
            max(abs(y(n:n+win_len-1))) < mean_val)
        for m = n:n+win_len-1
            y(m) = y(m)*atten;
        end
    end
end
y_n = y;
end
PRAAT CODE

##### Script to extract features for each token #####
## print the columns of the table ##
echo Token          MeanPitch Intens.  DialogueAct          Speaker
select all
# sound file & TextGrid file to be analyzed #
s = selected("Sound")
tg = selected("TextGrid")
select tg
numIntervals = Get number of intervals... 3
### calculate Pitch and Intensity of Speech ###
select s
To Pitch... 0.0 75 600
select s
To Intensity... 75 0.0
plus Pitch dialogue-flat
pitch = selected("Pitch")
intensity = selected("Intensity")
space$ = " "
for cont from 1 to numIntervals
    select TextGrid dialogue-flat-phon_palign
    token$ = Get label of interval... 3 cont
    tstart = Get starting point... 3 cont
    tend = Get end point... 3 cont
    dialogueActNum = Get interval at time... 4 tstart+0.01
    dialogueAct$ = Get label of interval... 4 dialogueActNum
    speakerNum = Get interval at time... 5 tstart+0.01
    speaker$ = Get label of interval... 5 speakerNum
    # for each non-silence token extract mean pitch & mean intensity #
    if !startsWith (token$, "#")
        select pitch
        pitchMean = Get mean... tstart tend Hertz
        select intensity
        intensityMean = Get mean... tstart tend dB
        ### configure layout ###
        lenStr = length(token$)
        spaceNum = 15 - lenStr
        print 'token$'
        for lung from 1 to spaceNum
            print 'space$'
        endfor
        print 'pitchMean:2' 'intensityMean:2'
        lenStr2 = length(dialogueAct$)
        spaceNum2 = 20 - lenStr2
        ### configure layout ###
        print 'dialogueAct$'
        for lung from 1 to spaceNum2
            print 'space$'
        endfor
        print 'speaker$'
        printline
    endif
endfor
### Save data in txt file ###
appendFile ("conversation-audio.txt", info$ ())
JWNL CODE

package wordnet;

import java.io.*;
import net.didion.jwnl.JWNL;
import net.didion.jwnl.JWNLException;
import net.didion.jwnl.JWNLRuntimeException;
import net.didion.jwnl.data.*;
import net.didion.jwnl.dictionary.Dictionary;

public class WordSem {

    public static void main(String[] args) throws JWNLException, IOException, JWNLRuntimeException {
        // Initialize JWNL with the properties file that points to the dictionary files
        JWNL.initialize(new FileInputStream("file_properties.xml"));
        // After initialization create a Dictionary object that can be queried
        Dictionary wordnet = Dictionary.getInstance();
        // Read the text file and extract the words to be searched in WordNet
        String read_path = "D:\\Ultimo semestre\\Natural Language Processing\\ASSIGNMENT\\conversation\\POS tagging\\conversation-tagged.txt";
        // Open the file reader stream (reads the file with the POS tagging)
        FileReader fr = new FileReader(read_path);
        BufferedReader br = new BufferedReader(fr);
        // Open the file writer stream (writes a txt file with a "Token POS Domain" line for each token)
        String write_path = "D:\\Ultimo semestre\\Natural Language Processing\\ASSIGNMENT\\conversation\\dialogue-audio-pos-domains.txt";
        File file = new File(write_path);
        FileWriter file_write = new FileWriter(file);
        String read_linea;   // line read from the source file
        String wordn;        // word token taken from the source file
        String word_POS;     // POS tag taken from the source file
        POS wnPOS;           // POS tag in WordNet format
        String strdomain;    // domain string related to the word token
        // While there are lines in the source file, take the word token and the POS tag
        while (true) {
            read_linea = br.readLine();
            if (read_linea == null)
                break;
            String[] splits = read_linea.split("_"); // "_" separates word and tag in the source file
            wordn = splits[0];
            System.out.println(wordn);
            word_POS = splits[1];
            System.out.println(word_POS);
            // Begin writing the line for the output txt file
            StringBuilder write_appnd = new StringBuilder();
            write_appnd.append(wordn).append(" ").append(word_POS).append(" ");
            // Translate the POS tag to the WordNet word type
            wnPOS = getWordNetPOS(word_POS);
            // WordNet analysis: check for the word domain, and for hypernyms
            if (wnPOS != null && wordn != null) {
                // An IndexWord is a single word and part of speech; look up its Synset objects
                IndexWord w = wordnet.lookupIndexWord(wnPOS, wordn);
                if (w != null) {
                    Synset[] senses = w.getSenses();
                    Pointer[] domain;
                    for (int i = 0; i < senses.length; i++) {
                        // CATEGORY is the pointer type for the domains
                        domain = senses[i].getPointers(PointerType.CATEGORY);
                        Synset[] syndomain = new Synset[domain.length];
                        for (int l = 0; l < domain.length; l++) {
                            // Obtain the synset from the domain pointer, then an associated word string
                            syndomain[l] = domain[l].getTargetSynset();
                            Word rootWord = syndomain[l].getWord(0);
                            strdomain = rootWord.getLemma();
                            // Add it to the output txt file
                            write_appnd.append(strdomain);
                        }
                    }
                    // Fall back to the root hypernym
                    if (wnPOS == POS.NOUN) {
                        strdomain = getRootHypernym(w);
                        write_appnd.append(strdomain);
                    }
                }
            }
            // Finish the line, then move on to the next one
            write_appnd.append("\r\n");
            String write_linea = write_appnd.toString();
            file_write.write(write_linea);
        }
        file_write.close();
        br.close();
    }

    // Translate a Penn Treebank POS tag to the WordNet word type
    public static POS getWordNetPOS(String wPOS) {
        POS wordNetPos;
        switch (wPOS) {
            case "NN": case "NNS": case "NNP": wordNetPos = POS.NOUN; break;
            case "VB": case "VBD": case "VBG": case "VBN": case "VBP": case "VBZ": wordNetPos = POS.VERB; break;
            case "JJ": case "JJR": case "JJS": wordNetPos = POS.ADJECTIVE; break;
            case "RB": case "RBR": case "RBS": wordNetPos = POS.ADVERB; break;
            default: wordNetPos = null;
        }
        return wordNetPos;
    }

    // Search for the root hypernym
    public static String getRootHypernym(IndexWord synsetw) throws JWNLException {
        String stringdomain = "";
        Synset syndomain = null;
        Synset[] senses = synsetw.getSenses();
        Pointer[] domain;
        for (int i = 0; i < senses.length; i++) {
            domain = senses[i].getPointers(PointerType.HYPERNYM);
            if (domain.length > 0) {
                syndomain = domain[0].getTargetSynset();
                while (syndomain.toString() != null) {
                    domain = syndomain.getPointers(PointerType.HYPERNYM);
                    if (domain.length > 0) syndomain =