KIELI ANALYTICS
SPEECH & TEXT ANALYTICS
Chakir Mahjoubi
TECH REQUIREMENTS
Speech Text Metadata
DATA ANALYTICS
Speech Analytics
Text Analytics
Lexicon Database
I. Audio Transcription
   a. Verbatim Transcription
   b. Clean Transcription
II. Speaker Identification
   a. Speakers’ background
   b. Speakers’ identity
III. Timestamp codes
   a. Define the number of speakers.
   b. Define the length of their utterances.
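As a minimal sketch of what timestamp codes make possible, the snippet below counts the speakers and measures utterance lengths from a list of timestamped segments. The `Segment` structure, the speaker labels, and the sample values are assumptions made for illustration; they are not taken from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped utterance (hypothetical structure for illustration)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def summarise(segments):
    """Return (number of speakers, total speaking time per speaker in seconds)."""
    talk_time = defaultdict(float)
    for seg in segments:
        talk_time[seg.speaker] += seg.end - seg.start
    return len(talk_time), dict(talk_time)


def utterance_lengths(segments):
    """Return the length in seconds of each utterance, in transcript order."""
    return [seg.end - seg.start for seg in segments]


# Made-up sample data
segments = [
    Segment("S1", 0.0, 4.5, "Good morning, how can I help?"),
    Segment("S2", 4.5, 9.0, "I have a question about my invoice."),
    Segment("S1", 9.0, 10.5, "Of course."),
]

count, talk_time = summarise(segments)
print(count, talk_time, utterance_lengths(segments))
```

Keeping the start and end time on every segment means both figures can be derived later without going back to the audio.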
I. Speaker Identification
a. Identification via voice characteristics
Enrolment: the speaker's voice is recorded
and a number of features are extracted to
form a voice print, template, or model.
b. Authentication via verification
Verification: a speech sample, or "utterance", is
compared against a previously created voice
print.
II. Speaker Diarisation
Diarisation is the process of partitioning an input audio
stream into homogeneous segments according to
speaker identity.
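The enrolment and verification steps can be sketched as follows. This is an illustration under stated assumptions, not a production method: the feature vectors are made-up numbers standing in for real acoustic features, the voice print is assumed to be a simple average, and the cosine-similarity comparison and the 0.9 threshold are arbitrary choices for the example.

```python
import math


def enrol(feature_vectors):
    """Average several feature vectors into one voice print (assumed model)."""
    n = len(feature_vectors)
    dims = len(feature_vectors[0])
    return [sum(vec[i] for vec in feature_vectors) / n for i in range(dims)]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def verify(voice_print, utterance_features, threshold=0.9):
    """Accept the utterance if it is close enough to the enrolled voice print."""
    return cosine_similarity(voice_print, utterance_features) >= threshold


# Enrolment: two recordings of the same (hypothetical) speaker
voice_print = enrol([[1.0, 2.0, 3.0], [1.2, 1.8, 3.0]])

# Verification: one utterance close to the voice print, one far from it
same_speaker = verify(voice_print, [1.1, 2.0, 2.9])
other_speaker = verify(voice_print, [3.0, 0.5, 0.1])
print(same_speaker, other_speaker)
```

The threshold trades false accepts against false rejects, so in practice it would be tuned on recorded data rather than fixed in advance.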
Corpus Annotation
Corpus annotation is the practice of adding interpretative
linguistic information to a corpus. For example, one common
type of annotation is the addition of tags, or labels, indicating
the word class to which words in a text belong.
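A small sketch of word-class annotation is shown below. The lexicon, the tag names, and the `word_TAG` output format are assumptions for illustration; a real annotation project would normally rely on a trained tagger and an agreed tag set.

```python
# Toy word-class lexicon (hypothetical); unknown words receive the tag "X".
WORD_CLASSES = {
    "the": "DET",
    "speaker": "NOUN",
    "reads": "VERB",
    "a": "DET",
    "short": "ADJ",
    "text": "NOUN",
}


def annotate(sentence):
    """Attach a word-class tag to every whitespace-separated token."""
    tokens = sentence.lower().split()
    return [(token, WORD_CLASSES.get(token, "X")) for token in tokens]


def to_tagged_text(pairs):
    """Render (word, tag) pairs in a simple word_TAG format."""
    return " ".join(f"{word}_{tag}" for word, tag in pairs)


tagged = annotate("The speaker reads a short text")
print(to_tagged_text(tagged))
# prints: the_DET speaker_NOUN reads_VERB a_DET short_ADJ text_NOUN
```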
Phonology: the rules for combining and using phonemes.
Morphology: how morphemes are used in a language.
Syntax: how words can be combined to form a sentence.
Semantics: meaning of words or combinations of words.
Pragmatics: social aspects of spoken language, including
conversational exchanges.
TEXT ANALYTICS
Sentiment Analysis
Sentiment analysis identifies the subjective information in an expression,
that is, the opinions, appraisals, emotions, or attitudes towards a topic,
person or entity. Expressions can be classified and categorised as positive,
negative, or neutral, or as valid or non-valid, relative to the context.
“I really like the new design of your website!” → Positive
“I’m not sure if I like the new design” → Neutral
“The new design is awful!” → Negative
Sentiment analysis can be used across different industries to
understand market trends and gain insights, for example in:
a. Social media monitoring,
b. Brand monitoring,
c. Customer support analysis.
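The three example sentences above can be reproduced with a very small lexicon-based classifier. This is only a sketch: the word lists and the rule that hedging phrases such as "not sure" yield a neutral label are assumptions chosen for this example, and real systems typically use trained models.

```python
import re

# Hypothetical word lists, chosen only to cover the example sentences
POSITIVE = {"like", "love", "great", "good"}
NEGATIVE = {"awful", "bad", "hate", "terrible"}
# Phrases signalling uncertainty; treated as neutral whatever else is said
HEDGES = ("not sure", "don't know", "do not know")


def classify(expression):
    """Label an expression Positive, Negative, or Neutral."""
    text = expression.lower().replace("\u2019", "'")  # normalise curly apostrophes
    if any(hedge in text for hedge in HEDGES):
        return "Neutral"
    words = re.findall(r"[a-z']+", text)
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


for sentence in (
    "I really like the new design of your website!",
    "I'm not sure if I like the new design",
    "The new design is awful!",
):
    print(sentence, "->", classify(sentence))
```

A word-counting approach like this ignores negation and sarcasm, which is one reason the classification has to be considered relative to the context.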
LEXICON DATABASE
A lexicon database is a centralized compilation of industry-specific
terms. It serves as a guide for translation and localization
experts on how to manage key terminology.
A good terminology database will have source language key
terms and approved target language translations, along with a
clear definition and proper context for their usage. Key terms may
include acronyms, names, titles or subject-specific terminology or
information. It may also include specific terms that are to be kept
in their source language.
A lexicon database is part of a suite of translation tools a
language service provider should offer, and is best used when
there is an abundance of specialised content.
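One possible shape for a lexicon entry is sketched below: a source term, its definition and usage context, approved translations per target language, and a flag for terms that must stay in the source language. The class names, field names, and sample entries are hypothetical and only illustrate the description above.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One key term in the lexicon (hypothetical schema)."""
    source_term: str
    definition: str
    context: str
    translations: dict = field(default_factory=dict)  # language code -> approved term
    keep_in_source: bool = False  # True for terms that must not be translated


class Lexicon:
    """In-memory term store with case-insensitive lookup."""

    def __init__(self):
        self._entries = {}

    def add(self, entry):
        self._entries[entry.source_term.lower()] = entry

    def translate(self, term, language):
        """Return the approved translation, the source term, or None if unknown."""
        entry = self._entries.get(term.lower())
        if entry is None:
            return None
        if entry.keep_in_source:
            return entry.source_term
        return entry.translations.get(language)


lexicon = Lexicon()
lexicon.add(TermEntry(
    source_term="invoice",
    definition="A document requesting payment for goods or services.",
    context="Billing correspondence",
    translations={"fr": "facture", "fi": "lasku"},
))
lexicon.add(TermEntry(
    source_term="Kieli Analytics",
    definition="Company name.",
    context="Branding",
    keep_in_source=True,
))

print(lexicon.translate("Invoice", "fr"))
print(lexicon.translate("Kieli Analytics", "fr"))
```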
TAXONOMY/ONTOLOGY
A taxonomy is a hierarchical classification in which things are
organized into groups or types. Taxonomy can be used to
organize and index knowledge (stored as documents, articles,
videos, etc.) such as in the form of a library classification
system, or a search engine taxonomy, so that users can more
easily find the information they are searching for.
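A hierarchical classification can be sketched as a nested mapping in which every category holds its sub-categories. The category names below are an illustrative arrangement of topics from this deck, and `find_path` is a hypothetical helper that returns the chain of groups leading to a category, much like a breadcrumb in a search interface.

```python
# Each key is a category; each value holds its sub-categories (illustrative only).
TAXONOMY = {
    "Analytics": {
        "Speech Analytics": {
            "Audio Transcription": {},
            "Speaker Identification": {},
        },
        "Text Analytics": {
            "Sentiment Analysis": {},
        },
    }
}


def find_path(tree, target, path=()):
    """Depth-first search; return the list of categories from root to target."""
    for name, children in tree.items():
        current = path + (name,)
        if name == target:
            return list(current)
        found = find_path(children, target, current)
        if found:
            return found
    return None


print(find_path(TAXONOMY, "Sentiment Analysis"))
# prints: ['Analytics', 'Text Analytics', 'Sentiment Analysis']
```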
Taxonomies in various disciplines
a. Plant taxonomy (Taxonomy for Natural Science)
b. Corporate taxonomy (Business and economics)
c. Software engineering (ACM Computing Classification)
d. Bloom's taxonomy (Education and academia)
e. Safety (set of terminologies used within the safety field)
THANK YOU
Chakir Mahjoubi
Mail: cmahjoubi@kieli.co.uk
Web: https://kieli.co.uk
Tel: 07377508604
