Introduction to Sketch Engine
http://www.sketchengine.co.uk/
Outline
• Basic Terminology
• Introduction
• How to Use Sketch Engine?
• Research Issues
Basic Terminology
English Term
• Corpus (plural: corpora); ≠ blog
• Parallel corpora
• Comparable corpus
• Written corpora
• Spoken corpora
• Collocation
• Concordances
• Lemma
• Part-of-Speech (PoS) tagging
• Thesaurus
What is Sketch Engine?
• It is a corpus query tool that takes as input a corpus of any language and the corresponding grammar patterns, and generates, among other things, word sketches for the words of that language.
• The Sketch Engine is designed for anyone who wants to research how words behave.
What is Sketch Engine?
Corpus → SkE → Word Sketches
• Upload your own corpus
• Access to public corpora
• Advanced search options
Sketch Engine Features
1. Web-based tool: no installation
2. Supports Arabic corpora
3. The Concordancer, with advanced options
4. The Word Sketches
5. The Thesaurus (finds similar words)
6. Support for parallel corpora and virtual sub- and super-corpora
7. Full regular-expression searching using CQL
8. Corpus Architect: user corpora, uploaded by users or created with WebBootCaT
Who Uses Sketch Engine?
• Language learners
• Writers
• Linguists
• Researchers
Sketch Engine usage:
• Common words/collocations
• Synonyms
• Grammar
• Word behavior
Available corpora
200+ corpora in 60+ languages
Available Arabic corpora
How to create your corpus using SKE?
Steps to create a corpus in SKE
Raw text → Tokenization → Lemmatization → POS tagging → Sketch Grammar → SKE features (Word Sketches, Sketch Diff, Thesaurus)
1- Upload your text:
- Sketch Engine accepts file types such as .xml, .doc, .docx, .htm, .html, .pdf, .txt, …
2- Tokenization:
- The process of splitting the text into tokens (words) and adding structure tags (<s>, <doc>, <p>).
- The output is a vertical file (one token per line).
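The tokenization step can be sketched in a few lines of Python. This is only an illustration of the idea: the function name `to_vertical`, the naive sentence splitter and the `id` attribute on `<doc>` are assumptions made for the example, not Sketch Engine's own tokenizer.

```python
import re

def to_vertical(text, doc_id="doc1"):
    """Convert raw text into a vertical file: one token per line, plus structure tags."""
    lines = [f'<doc id="{doc_id}">', "<p>"]
    # Naive sentence split after ., ! or ? (an assumption of this sketch).
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        if not sentence:
            continue
        lines.append("<s>")
        # Words and single punctuation marks become separate tokens.
        lines.extend(re.findall(r"\w+|[^\w\s]", sentence))
        lines.append("</s>")
    lines += ["</p>", "</doc>"]
    return "\n".join(lines)

print(to_vertical("Corpora are useful. Words behave!"))
```

Each token ends up on its own line, so later steps (lemmatization, POS tagging) can add further columns next to the word.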
3- Lemmatization (optional):
- The process of assigning each word its lemma.
4- POS tagging (mandatory for word sketches):
- The process of assigning each word its part-of-speech tag.
- An SKE Arabic tagger is not available.
- Example tags: V, PN, N
5- Uploading a Sketch Grammar:
- A file describing the grammatical relations in a language.
Example: 1:"V" "(DET|NUM|ADJ|ADV|N)"* 2:"N"
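To see what the example rule captures, here is a rough Python sketch that applies the same pattern to a POS-tagged sentence: a verb (slot 1), any number of DET/NUM/ADJ/ADV/N tokens, then a noun (slot 2). The function name, the sample sentence and the PRON tag are assumptions for illustration; how Sketch Engine itself resolves overlapping matches is not covered here.

```python
FILLER_TAGS = {"DET", "NUM", "ADJ", "ADV", "N"}

def verb_noun_pairs(tagged):
    """Return (verb, noun) pairs matching 1:"V" "(DET|NUM|ADJ|ADV|N)"* 2:"N"."""
    pairs = []
    for i, (word, tag) in enumerate(tagged):
        if tag != "V":
            continue
        j = i + 1
        # Walk right while tokens are allowed fillers; every noun met can fill slot 2.
        while j < len(tagged) and tagged[j][1] in FILLER_TAGS:
            if tagged[j][1] == "N":
                pairs.append((word, tagged[j][0]))
            j += 1
    return pairs

sentence = [("She", "PRON"), ("reads", "V"), ("the", "DET"),
            ("old", "ADJ"), ("book", "N"), ("quickly", "ADV")]
print(verb_noun_pairs(sentence))
```

The pairs collected this way are the raw material for a word sketch: each one records that the noun occurred in this relation with the verb.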
Vertical file with annotations
Adding data to the corpus by uploading a file
Adding data to the corpus using WebBootCaT
Seeds/URLs → WebBootCaT → Your corpus
How to Use Sketch Engine?
• As a corpus user (querying corpora): Concordance, Word Lists, Word Sketches, Sketch Diff, Thesaurus
Concordance
What is a concordancer?
A concordancer looks through the whole corpus and finds every example of a particular word or phrase, then displays it with its immediate context.
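The idea of a concordancer can be illustrated with a minimal keyword-in-context (KWIC) sketch in Python. The function name `concordance`, the `width` parameter and the sample sentence are assumptions for this example; Sketch Engine's concordancer works on indexed corpora and offers far more options.

```python
def concordance(tokens, keyword, width=3):
    """Return keyword-in-context hits as (left context, node, right context)."""
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

text = "the engine reads a corpus and the corpus shows how words behave"
for left, node, right in concordance(text.split(), "corpus"):
    # Right-align the left context so the node words line up in a column.
    print(f"{left:>20} | {node} | {right}")
```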
Query Types
Context
Text Types
Concordance
Query Types
• Simple: matches the word form as well as its lemma (the base form). Also works for phrases.
• Lemma: matches any form of the lemma. You can also select the PoS (not for Arabic corpora). This option does not work for phrases.
• Phrase: matches a phrase and any capitalized variant (not for Arabic corpora), but does not match the lemma.
• Word: matches a word form exactly. You can also select the PoS and "match case" (neither for Arabic corpora).
• Character: matches a character string.
• CQL: for entering complex queries in Corpus Query Language.
Concordance
Corpus Query Language (Basics)
• The general form is: [attr="value"]
• "Match any characters" pattern: .* (values are regular expressions)
• Or and And operators: | and &
• "Match any token" operator: []
• Specifying the number of tokens: {} (for example, []{0,3} matches 0 to 3 tokens)
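The CQL basics above can be made concrete with a toy matcher in Python. This is a sketch under assumptions, not Sketch Engine's query engine: tokens are dictionaries with `word`, `lemma` and `tag` attributes, and each query element is written as a tuple `(constraints, min, max)`. The example query stands for `[lemma="make"] []{0,3} [tag="N.*"]`.

```python
import re

def token_matches(token, constraints):
    """True if every attribute value fully matches its regular expression."""
    # Several attributes in one element behave like the & operator;
    # the | operator is simply regex alternation inside a value.
    return all(re.fullmatch(rx, token.get(attr, "")) for attr, rx in constraints.items())

def match_at(tokens, pos, query):
    """Return the end index of a match of `query` starting at `pos`, or None."""
    if not query:
        return pos
    constraints, lo, hi = query[0]
    count = 0
    while True:
        if count >= lo:
            end = match_at(tokens, pos + count, query[1:])
            if end is not None:
                return end
        if count == hi or pos + count >= len(tokens):
            return None
        if not token_matches(tokens[pos + count], constraints):
            return None
        count += 1

def find_all(tokens, query):
    """Return (start, end) spans for every position where `query` matches."""
    spans = []
    for start in range(len(tokens)):
        end = match_at(tokens, start, query)
        if end is not None and end > start:
            spans.append((start, end))
    return spans

tokens = [
    {"word": "She", "lemma": "she", "tag": "PRON"},
    {"word": "made", "lemma": "make", "tag": "V"},
    {"word": "a", "lemma": "a", "tag": "DET"},
    {"word": "very", "lemma": "very", "tag": "ADV"},
    {"word": "good", "lemma": "good", "tag": "ADJ"},
    {"word": "decision", "lemma": "decision", "tag": "N"},
]
# [lemma="make"] []{0,3} [tag="N.*"]  (an empty dict plays the role of [])
query = [({"lemma": "make"}, 1, 1), ({}, 0, 3), ({"tag": "N.*"}, 1, 1)]
for start, end in find_all(tokens, query):
    print(" ".join(t["word"] for t in tokens[start:end]))
```

The gap element `[]{0,3}` is tried shortest first, so the match stops at the nearest noun within three tokens of the verb.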
Concordance
Exercises (CQL)
• Ex1: "" [] ""
• Ex2: "" []{0,3} "|"
Context
• Here you can specify criteria on the context for your query.
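A context criterion can be pictured as a filter over concordance hits: keep a hit only if a given word occurs within a window of tokens around it. The Python sketch below is a simplified illustration; the function name, the `window` parameter and the sample sentence are assumptions, and the real Context form offers more choices than this.

```python
def filter_by_context(tokens, node, context_word, window=5):
    """Return positions of `node` that have `context_word` within `window` tokens on either side."""
    positions = []
    for i, token in enumerate(tokens):
        if token != node:
            continue
        nearby = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context_word in nearby:
            positions.append(i)
    return positions

tokens = "strong tea is nice but strong wind is cold".split()
print(filter_by_context(tokens, "strong", "tea", window=2))
```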
Text Types
• Here you can:
- Select a sub-corpus, or
- Create a new sub-corpus from a subset of the current corpus
• You can also set constraints on the text types of the documents that will be searched for your query.
Concordance
Concordance Menu Options
• Save
• View Options
• Sort
• Sample
• Filter
• Frequency
• Collocations
• ConcDesc
• Visualize
Concordance
Exercises
• Ex1 (Filter): Concordance → Make Concordance → Filter → select negative, Simple query
• Ex2 (Collocation): Concordance → Make Concordance → Collocation → Attribute: word → Make Candidate List
• Ex3 (Frequency, Node Tags): Concordance → Make Concordance → click Node Tags
• Ex4 (CQL, Frequency, Node Forms): Concordance → CQL
Word List
What is the Word List?
• Word List: produces word lists ranked by frequency for an entire corpus or a specified sub-corpus.
• It can be useful for investigating, for instance, whether a word is used more frequently in its verb or its noun form.
Word List input: RE pattern or any attribute (word, tag, lemma, …)
Word List output: filtered list of lemmas and/or words with frequencies
Word List
Exercises
Ex1:
• Choose "lemma" as the Search attribute.
• Type the lemma into the RE pattern box.
• Tick the box labelled "change output attribute(s)".
• In the first two levels, select "lemma" and "Tag".
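What these Word List steps compute can be sketched in Python: filter tokens by a regular expression on the search attribute, then count combinations of the output attributes. The function name, parameter names and toy tokens are assumptions for the example, not part of the Sketch Engine interface.

```python
import re
from collections import Counter

def word_list(tokens, pattern, search_attr="lemma", output_attrs=("lemma", "tag")):
    """Count output-attribute combinations for tokens whose search attribute matches `pattern`."""
    counts = Counter(
        tuple(tok[a] for a in output_attrs)
        for tok in tokens
        if re.fullmatch(pattern, tok[search_attr])
    )
    # Ranked by frequency, highest first.
    return counts.most_common()

tokens = [
    {"word": "works", "lemma": "work", "tag": "V"},
    {"word": "work", "lemma": "work", "tag": "N"},
    {"word": "working", "lemma": "work", "tag": "V"},
    {"word": "worker", "lemma": "worker", "tag": "N"},
]
print(word_list(tokens, "work"))
```

With "lemma" and "tag" as output attributes, the list shows directly whether the lemma is more frequent as a verb or as a noun in the sample.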
Word Sketch
What is a Word Sketch?
• Word Sketch: lets you explore the grammatical and collocational behaviour of a word.
• The Word Sketch function doesn't just tell you which words are commonly found in the company of your search word; it also tells you their grammatical relationship to the search word.
Word Sketch input: a lemma
Word Sketch output: collocations grouped by grammatical relation
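As a rough picture of this input and output, the Python sketch below groups the collocates of a headword by grammatical relation. It assumes the corpus has already been reduced to (headword, relation, collocate) triples, and it ranks by raw frequency purely for simplicity; the relation names, the function name and the ranking are assumptions of this example.

```python
from collections import Counter, defaultdict

def word_sketch(triples, headword, top=3):
    """Group the most frequent collocates of `headword` by grammatical relation."""
    counts = Counter(t for t in triples if t[0] == headword)
    sketch = defaultdict(list)
    for (_, relation, collocate), freq in counts.most_common():
        if len(sketch[relation]) < top:
            sketch[relation].append((collocate, freq))
    return dict(sketch)

triples = [
    ("tea", "modifier", "strong"),
    ("tea", "modifier", "strong"),
    ("tea", "modifier", "green"),
    ("tea", "object_of", "drink"),
    ("coffee", "modifier", "strong"),
]
print(word_sketch(triples, "tea"))
```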
Thesaurus
What is the Thesaurus?
• Thesaurus: lets you find other words that have similar grammatical and collocational behaviour to a given word.
• Note that this thesaurus is produced automatically from statistics on word co-occurrences.
• It is not a manually constructed thesaurus, and for each entry it lists words that are distributionally related but not necessarily synonyms.
Thesaurus input: lemma + POS tag
Thesaurus output: similar lemmas
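The notion of "distributionally related" words can be illustrated with a small Python sketch: two words are similar when they share many (relation, collocate) contexts. The Jaccard overlap used here is an assumption chosen for simplicity; the slides only say the thesaurus is built from co-occurrence statistics, and the function name and sample data are likewise made up.

```python
from collections import defaultdict

def thesaurus(triples, headword):
    """Rank other words by the overlap of their (relation, collocate) contexts with `headword`."""
    contexts = defaultdict(set)
    for word, relation, collocate in triples:
        contexts[word].add((relation, collocate))
    target = contexts[headword]
    scores = []
    for word, ctx in contexts.items():
        if word == headword:
            continue
        union = target | ctx
        # Jaccard similarity: shared contexts divided by all contexts of both words.
        score = len(target & ctx) / len(union) if union else 0.0
        scores.append((word, round(score, 2)))
    return sorted(scores, key=lambda item: item[1], reverse=True)

triples = [
    ("tea", "modifier", "strong"), ("tea", "modifier", "hot"), ("tea", "object_of", "drink"),
    ("coffee", "modifier", "strong"), ("coffee", "object_of", "drink"), ("coffee", "modifier", "black"),
    ("wind", "modifier", "strong"), ("wind", "subject_of", "blow"),
]
print(thesaurus(triples, "tea"))
```

Note that "wind" still receives a score because it shares one context with "tea", which mirrors the point above: related words need not be synonyms.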
Word Sketch Difference
Sketch-Diff
What is Word Sketch Difference?
• Sketch-Diff: lets you compare the behavior of two words.
• This function is also very useful for comparing, or deciding between, two possible translations of an item.
Sketch-Diff input: two words or lemmas
Sketch-Diff output: the differing and the common collocations of the two lemmas
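At its simplest, this comparison is a set operation over the collocates of the two lemmas. The Python sketch below assumes the collocate lists are already available; the function name, the result keys and the sample collocates are illustrative only.

```python
def sketch_diff(collocates_a, collocates_b):
    """Split the collocates of two lemmas into those unique to each and those they share."""
    a, b = set(collocates_a), set(collocates_b)
    return {
        "only_first": sorted(a - b),
        "common": sorted(a & b),
        "only_second": sorted(b - a),
    }

strong = ["tea", "wind", "coffee", "argument"]
powerful = ["engine", "argument", "computer"]
print(sketch_diff(strong, powerful))
```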
Compare corpora
Research Issues
Please visit: http://goo.gl/HqhUir
• Limitations
• Usage
References
• http://www.sketchengine.co.uk/
• http://lisan1.com/wordpress/?p=146
• Kilgarriff, A., Rychly, P., Smrz, P., & Tugwell, D. (2004). ITRI-04-08: The Sketch Engine. Information Technology, 105, 116.
Thank You
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
Ashokrao Mane college of Pharmacy Peth-Vadgaon
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
TechSoup
 
Fresher’s Quiz 2023 at GMC Nizamabad.pptx
Fresher’s Quiz 2023 at GMC Nizamabad.pptxFresher’s Quiz 2023 at GMC Nizamabad.pptx
Fresher’s Quiz 2023 at GMC Nizamabad.pptx
SriSurya50
 
Delivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and TrainingDelivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and Training
AG2 Design
 
What is the purpose of studying mathematics.pptx
What is the purpose of studying mathematics.pptxWhat is the purpose of studying mathematics.pptx
What is the purpose of studying mathematics.pptx
christianmathematics
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
chanes7
 
Reflective and Evaluative Practice PowerPoint
Reflective and Evaluative Practice PowerPointReflective and Evaluative Practice PowerPoint
Reflective and Evaluative Practice PowerPoint
amberjdewit93
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
camakaiclarkmusic
 
kitab khulasah nurul yaqin jilid 1 - 2.pptx
kitab khulasah nurul yaqin jilid 1 - 2.pptxkitab khulasah nurul yaqin jilid 1 - 2.pptx
kitab khulasah nurul yaqin jilid 1 - 2.pptx
datarid22
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
thanhdowork
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
Scholarhat
 
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Ashish Kohli
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
Israel Genealogy Research Association
 
World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
ak6969907
 

Recently uploaded (20)

Advantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO PerspectiveAdvantages and Disadvantages of CMS from an SEO Perspective
Advantages and Disadvantages of CMS from an SEO Perspective
 
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
RPMS TEMPLATE FOR SCHOOL YEAR 2023-2024 FOR TEACHER 1 TO TEACHER 3
 
MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...
MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...
MATATAG CURRICULUM: ASSESSING THE READINESS OF ELEM. PUBLIC SCHOOL TEACHERS I...
 
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdfANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
ANATOMY AND BIOMECHANICS OF HIP JOINT.pdf
 
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdfবাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
বাংলাদেশ অর্থনৈতিক সমীক্ষা (Economic Review) ২০২৪ UJS App.pdf
 
Assignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docxAssignment_4_ArianaBusciglio Marvel(1).docx
Assignment_4_ArianaBusciglio Marvel(1).docx
 
Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.Biological Screening of Herbal Drugs in detailed.
Biological Screening of Herbal Drugs in detailed.
 
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat  Leveraging AI for Diversity, Equity, and InclusionExecutive Directors Chat  Leveraging AI for Diversity, Equity, and Inclusion
Executive Directors Chat Leveraging AI for Diversity, Equity, and Inclusion
 
Fresher’s Quiz 2023 at GMC Nizamabad.pptx
Fresher’s Quiz 2023 at GMC Nizamabad.pptxFresher’s Quiz 2023 at GMC Nizamabad.pptx
Fresher’s Quiz 2023 at GMC Nizamabad.pptx
 
Delivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and TrainingDelivering Micro-Credentials in Technical and Vocational Education and Training
Delivering Micro-Credentials in Technical and Vocational Education and Training
 
What is the purpose of studying mathematics.pptx
What is the purpose of studying mathematics.pptxWhat is the purpose of studying mathematics.pptx
What is the purpose of studying mathematics.pptx
 
Digital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments UnitDigital Artifact 1 - 10VCD Environments Unit
Digital Artifact 1 - 10VCD Environments Unit
 
Reflective and Evaluative Practice PowerPoint
Reflective and Evaluative Practice PowerPointReflective and Evaluative Practice PowerPoint
Reflective and Evaluative Practice PowerPoint
 
CACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdfCACJapan - GROUP Presentation 1- Wk 4.pdf
CACJapan - GROUP Presentation 1- Wk 4.pdf
 
kitab khulasah nurul yaqin jilid 1 - 2.pptx
kitab khulasah nurul yaqin jilid 1 - 2.pptxkitab khulasah nurul yaqin jilid 1 - 2.pptx
kitab khulasah nurul yaqin jilid 1 - 2.pptx
 
A Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptxA Survey of Techniques for Maximizing LLM Performance.pptx
A Survey of Techniques for Maximizing LLM Performance.pptx
 
Azure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHatAzure Interview Questions and Answers PDF By ScholarHat
Azure Interview Questions and Answers PDF By ScholarHat
 
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
Aficamten in HCM (SEQUOIA HCM TRIAL 2024)
 
The Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collectionThe Diamonds of 2023-2024 in the IGRA collection
The Diamonds of 2023-2024 in the IGRA collection
 
World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024World environment day ppt For 5 June 2024
World environment day ppt For 5 June 2024
 

Sketch engine presentation

Editor's Notes

  1. It is a corpus query tool which takes as input a corpus of any language (with an appropriate level of linguistic mark-up) and corresponding grammar patterns, and which generates, amongst other things, word sketches for the words of that language. Those other things include a corpus-based thesaurus and ‘sketch differences’, which specify, for two semantically related words, what behaviour they share and how they differ. We anticipate that sketch differences will be particularly useful for lexicographers interested in near-synonym differentiation. Word sketches were first used in the production of the Macmillan English Dictionary (Rundell 2002) and were presented at Euralex 2002 (Kilgarriff and Rundell 2002). Following that presentation, the most-asked question was “can I have them for my language?” In response, we have now developed the Sketch Engine.
  3. The Sketch Engine has a number of language-analysis functions, the core ones being:
     - the Concordancer: a program which displays all occurrences from the corpus for a given query. The program is very powerful, with a wide variety of query types and many different ways of displaying and organising the results (concordancing, sorting, sampling, wordlists, collocation lists).
     - the Word Sketch program: this program provides a corpus-based summary of a word's grammatical and collocational behaviour.
  4. With Corpus Architect, you can build your own corpora from documents in various formats: TXT, PDF, PS, DOC, HTML, VERT. Once they are processed, you can search and query them within Sketch Engine.
  7. The main functions are:
     - Concordance: for querying a corpus and obtaining concordances which you can then further refine, filter and use for generating frequency information and collocation lists.
     - Word List: for obtaining word lists for an entire corpus, or a specified subcorpus.
     - Word Sketch: this allows you to explore the grammatical and collocational behaviour of a word.
     - Thesaurus: this allows you to find other words that have similar grammatical and collocational behaviour to a given word. Note that this thesaurus is produced automatically from statistics on word co-occurrences. It is not a manually constructed thesaurus, and for each entry it will list words which are distributionally related but not necessarily synonyms.
     - Sketch-Diff: this allows you to compare the behaviour of two words.
  8. Main Sketch Engine Links: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/MainLinkHelp
  9. Concordance Query: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/PageSpecificHelp/ConcordanceQuery
     - Query Types: using Query Type, you can refine the type of query you wish to make in the main panel.
     - Context: if Context is selected in the left-hand side (LHS) menu, you can specify criteria on the context for your query in the main panel. You can choose to specify the context in terms of surrounding lemma(s) and/or PoS tag(s).
     - Text Types: here you can select a subcorpus or create a new subcorpus from a subset of the current corpus. You can also stipulate constraints on the text types of the documents that will be searched for your query.
  10. CQL: https://www.sketchengine.co.uk/documentation/wiki/SkE/CorpusQuerying#1.
  11. Ex1: Lemma filter. Window: right, 1 tokens. Lemma(s): عن none
  12. Concordance Menu options: https://www.sketchengine.co.uk/documentation/wiki/SkE/Help/PageSpecificHelp/Concordance
      Menu options: Note that the options in the left-hand side panel are all available when you are viewing the concordance. Some of the options will not be shown if you have already selected from this menu. If so, you can click view concordance to get back to the concordance.
      - View Options: clicking on View Options will allow you to alter how the concordance looks. With this you can select which attributes of the words in the concordance you see.
      - KWIC/Sentence: toggle between the KWIC mode, where the queried text (node) is in a central column and context is displayed on either side, and the Sentence mode, where the queried text (node) is provided in the context of the sentence in which it occurs.
      - Save: click on this to see options for saving the concordance in the main panel (or the frequency list or collocation candidates).
      - Sort: click on this to see complex sorting options. If the concordance is sorted based on the context, an option to "Jump to" a page with context starting with a certain letter appears. Alternatively, you can click on:
        - Left (Right): to sort by the text left (right) of the node
        - Node: to sort by the text in the central column (referred to as the node or KWIC)
        - References: to sort by the document references at the left-hand side of the concordance
        - Shuffle: the concordance will be jumbled to avoid bias from a user only looking at the first portion
      - Sample: click this to select a random sample of the concordance lines.
      - Filter: click this to further specify contextual features to filter the concordance, for example by words to the left or right of the node word, or by text type.
      - Frequency: click on this to see a variety of complex methods for obtaining frequency lists. Alternatively, you can click on:
        - Node tags: to get a frequency list over the part-of-speech tags of the node word/s in the central column
        - Node forms: to get a frequency list over the node word forms in the central column
        - Doc IDs: to get a frequency list over the Doc IDs for the node word/s in the central column
        - Text Types: to get a frequency list over all the text types of the node word/s in the central column
      - Collocations: click on this to specify criteria and build collocation lists for the node word/s in the central column.
      - ConcDesc: you can see the query in detail (for technical people) and you can go back in the history if the query consists of several subsequent actions.
      - Visualize: this link will show you the distributional graph of the concordance within the corpus. On the x-axis there are concordance positions (by default 100 columns for 100 slices of the corpus; you may change the granularity with the slider and a click on the Redraw button); on the y-axis there is the relative frequency of the query hits within a concordance part (= column). Columns are clickable: by clicking on a column, you will filter the concordance and see only the corresponding concordance part.
  13. Word List Options:
      Left-hand side options:
      - Select All words to generate a list of words in the corpus ranked by frequency.
      - Select All lemmas to generate a list of lemmas in the corpus ranked by frequency. Lemma is the base (stem) form of a word.
      In the main panel of the interface you have further options:
      - Subcorpus: where you can specify a subcorpus for the source data, or create a new one.
      - Search Attribute: you can specify word, lemma, tag (part-of-speech tag) etc., depending on the attributes defined for the corpus, or you can specify one of the text types defined for the corpus. The default attribute is word.
      - Filter Options: you can either do this for all words (or lemmas, or whichever attribute you specify) or you can filter the list.
      - Output Options: you can select different types of the produced list.
  14. Choose a corpus and click on Word List in the left-hand side menu. Choose lemma at Search attribute. Type the lemma (e.g. حار) into the RE pattern box. Tick the box that says change output attribute(s). In the first two levels, select "lemma" and "Tag". Click on Make Word List.
  15. Wordlist  search Attr: lemma, Change Attr: gender
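
The concordance, right-context lemma filter and word list operations described in the notes above can be illustrated with a small, self-contained Python sketch. It is not Sketch Engine's implementation or API. The toy English corpus, the three-column word/tag/lemma layout, and the names `Token`, `parse_vertical`, `concordance`, `filter_right` and `word_list` are all assumptions made for illustration only.

```python
import re
from collections import Counter
from typing import NamedTuple


class Token(NamedTuple):
    word: str
    tag: str
    lemma: str


# Hypothetical toy corpus: one token per line, tab-separated word, tag, lemma.
VERTICAL = """\
<s>
The\tDT\tthe
cats\tNNS\tcat
sat\tVBD\tsit
on\tIN\ton
the\tDT\tthe
mat\tNN\tmat
.\t.\t.
</s>
<s>
A\tDT\ta
cat\tNN\tcat
sits\tVBZ\tsit
by\tIN\tby
the\tDT\tthe
door\tNN\tdoor
.\t.\t.
</s>
"""


def parse_vertical(text):
    """Read token lines, skipping structure tags such as <s> and </s>."""
    return [Token(*line.split("\t"))
            for line in text.splitlines()
            if line and not line.startswith("<")]


def concordance(tokens, lemma, width=3):
    """Return KWIC hits (left context, node, right context) for a lemma."""
    hits = []
    for i, tok in enumerate(tokens):
        if tok.lemma == lemma:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            hits.append((left, tok, right))
    return hits


def filter_right(hits, lemma, window=1):
    """Keep hits that have `lemma` within `window` tokens right of the node."""
    return [h for h in hits if any(t.lemma == lemma for t in h[2][:window])]


def word_list(tokens, attr="word", pattern=None):
    """Frequency list over one attribute, optionally filtered by a regex."""
    counts = Counter(getattr(t, attr) for t in tokens)
    if pattern is not None:
        rx = re.compile(pattern)
        counts = Counter({k: v for k, v in counts.items() if rx.fullmatch(k)})
    return counts.most_common()


tokens = parse_vertical(VERTICAL)
hits = concordance(tokens, "sit")

# Sort hits by the text to the right of the node, then print them KWIC-style.
for left, node, right in sorted(hits, key=lambda h: [t.word.lower() for t in h[2]]):
    print(" ".join(t.word for t in left).rjust(20),
          f"[{node.word}]",
          " ".join(t.word for t in right))

print(filter_right(hits, "on", window=1))
print(word_list(tokens, attr="lemma"))
```

Here `filter_right(hits, "on", window=1)` plays the role of a lemma filter with a window of one token to the right, and `word_list(tokens, attr="lemma", pattern="c.*")` mimics choosing lemma as the search attribute and typing a regular expression into a pattern box.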