AntConc ~Design & Development of a Freeware Corpus Analysis
Background
 AntConc was first released in 2002. At the time, it was a simple KWIC (Key Word in Context)
concordancer program designed for use by over 700 students in a scientific and technical
writing course at the Osaka University Graduate School of Engineering.
 AntConc was developed in a Windows environment using the Perl 5.8 programming
language, and the graphical user interface (GUI) was developed using the Perl/Tk 8.0
toolkit.
 This enabled the program to be easily ported to Linux/Unix, which was necessary
as the course was initially taught in a Linux-based CALL (Computer Assisted
Language Learning) environment.
 The program generated wide interest, and many users reported on successes,
problems, and features they would like to see added, resulting in new, improved versions of
the software. The latest version, AntConc 3.0, was released in December 2004.
Concordancer Tool
 The central tool used in most corpus analysis software, including AntConc, is the
concordancer.
 Concordancers have been shown to be an effective aid in the acquisition of a second or
foreign language, facilitating the learning of vocabulary, collocations, grammar and writing
styles.
 A concordance program can find and display a huge number of examples in varied contexts
and situations quickly and efficiently.
 The Concordancer Tool is designed so that the most common operations are accessible
directly on the main screen.
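To make the idea of a KWIC concordance concrete, here is a minimal Python sketch. It is not AntConc's implementation (which the slides describe as written in Perl); the function name `kwic`, the `width` parameter, and the sample sentence are assumptions made for illustration only.

```python
import re

def kwic(text, pattern, width=30, ignore_case=True):
    """Return (left, hit, right) context triples for every regex match in text."""
    flags = re.IGNORECASE if ignore_case else 0
    lines = []
    for m in re.finditer(pattern, text, flags):
        left = text[max(0, m.start() - width):m.start()]   # context before the hit
        right = text[m.end():m.end() + width]              # context after the hit
        lines.append((left, m.group(0), right))
    return lines

# Hypothetical sample text, not taken from the course corpus.
sample = "In this paper we describe a tool. We show that the tool is simple."
for left, hit, right in kwic(sample, r"\bwe\b", width=20):
    # Right-align the left context so the hits line up in one column.
    print(f"{left:>20} {hit} {right}")
```

Aligning the hit in a fixed column is what lets a learner scan many examples of the same term at a glance.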
KWIC Concordancer Tool
Range of features
 Search terms can be either substrings, words, or phrases, and can be either case sensitive
or insensitive. They can be embedded with a wide range of wildcards that the user can
assign to any particular character or string of characters via a menu option.
 Search terms can be defined as full regular expressions (REGEX), offering the user
access to extremely powerful and complex searches.
 Three levels of sorting of KWIC (Key Word in Context) lines are possible, with
user-definable highlight colors at each level.
 If a user clicks on any search term in the KWIC results display, the program will
automatically open the View Files tool (described later) and show the search term hit
embedded in the original file.
 The KWIC results display is divided into columns, in which the hit number, KWIC line, and
file name are shown separately. As in all other tools, each column can be either displayed
or hidden, and standard selection methods can be used to save data in the columns or
rows to the clipboard or a text file.
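The three-level sorting of KWIC lines can be sketched as a sort on a tuple of context words. This is only an illustration under assumed conventions: each line is a `(left, hit, right)` triple, and the offsets `(1, 2, -1)` (first word right, second word right, first word left) are a hypothetical default, not a setting taken from AntConc.

```python
def context_word(line, offset):
    """Lowercased word at a signed offset from the hit (+1 = first word right, -1 = first word left)."""
    left, _hit, right = line
    # For negative offsets, reverse the left context so index 0 is the word nearest the hit.
    words = right.split() if offset > 0 else left.split()[::-1]
    index = abs(offset) - 1
    return words[index].lower() if index < len(words) else ""

def sort_kwic(lines, levels=(1, 2, -1)):
    """Sort (left, hit, right) triples on up to three context positions."""
    return sorted(lines, key=lambda line: tuple(context_word(line, o) for o in levels))

# Hypothetical KWIC lines for the search term "the".
lines = [
    ("results show that", "the", "tool is simple"),
    ("we describe", "the", "method in detail"),
    ("users of", "the", "tool are students"),
]
for left, hit, right in sort_kwic(lines):
    print(f"{left:>20} | {hit} | {right}")
```

Sorting on neighboring words groups recurring patterns (here, "the tool ...") on adjacent lines, which is what makes collocations visible.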
Concordance Search Term Plot Tool
 The main purpose of the Concordancer Tool is to show how a search term is used in
a target corpus.
 The Concordance Search Term Plot Tool offers the same search term options as the
Concordancer Tool. However, the results are displayed in a quite different way.
 In the plot display, each box represents a file, and the lines within the box mark
the relative positions at which search term hits are found. From this display, it is
easy to see where and in what distribution a search term appears in the file. This can
be an effective aid, for example, in determining where words or phrases such as
“we” or “in this paper” are used in research articles, or determining which research
articles use a particular keyword or phrase.
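The plot idea, one box per file with a mark at each relative hit position, can be approximated in plain text. The sketch below is an assumption-laden illustration: the file names, texts, and the 50-character box width are invented, and character offsets are used as the position measure.

```python
import re

def hit_positions(text, pattern):
    """Relative positions (0.0 up to, but not including, 1.0) of each regex match in one file."""
    length = max(len(text), 1)
    return [m.start() / length for m in re.finditer(pattern, text, re.IGNORECASE)]

def plot_bar(positions, width=50):
    """Render one file as a fixed-width box, with '|' marking each hit position."""
    cells = ["."] * width
    for p in positions:
        cells[min(int(p * width), width - 1)] = "|"
    return "[" + "".join(cells) + "]"

# Hypothetical two-file corpus.
files = {
    "article1.txt": "In this paper we present a tool. "
                    + "Method details follow. " * 5
                    + "Finally, we conclude.",
    "article2.txt": "Results are reported. " * 6,
}
PATTERN = r"\bwe\b"
for name, text in files.items():
    print(f"{name:14} {plot_bar(hit_positions(text, PATTERN))}")
```

A file with no hits prints as an empty box, so the same display also answers which files use the term at all.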
CONCORDANCE SEARCH TERM PLOT TOOL
View Files Tool
 When a user clicks on a search term in the results display of the Concordancer Tool,
the View Files tool is used to display the search term in the original file.
 However, the View Files Tool can be used independently to search for any
substring, word, phrase or regular expression in a target file, offering the user a
very powerful text search engine.
 All resulting hits are displayed in a user-definable highlight color, and buttons and
keyboard shortcuts can be used to jump to a specified hit anywhere in the file. If
the user clicks on one of the highlighted search terms, all KWIC lines based on the
term are automatically shown using the Concordancer Tool.
VIEW FILES TOOL
Word List / Keyword List Tools
 Word lists are useful as they suggest interesting areas for investigation and
highlight problem areas in a corpus.
 A word list generation program should be able to sort words into alphabetical or
frequency order, with the added features of reverse ordering and the ability to count
words based on their ‘stem’ forms.
 Experienced users of corpus analysis tools will know that word lists usually tell us
little about how important a word is in a corpus. Therefore, AntConc offers a
Keyword List Tool, which identifies words that appear unusually frequently in a corpus
compared with the same words in a reference corpus that must also be specified
by the user.
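The difference between a raw word list and a keyword list can be sketched as follows. The slides do not say which statistic the Keyword List Tool uses; the log-likelihood measure below is one commonly used choice and is an assumption here, as are the simple tokenizer and the two tiny sample corpora.

```python
import math
import re
from collections import Counter

def word_list(text):
    """Count lowercase word tokens (a deliberately simple tokenizer)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def log_likelihood(a, b, total_a, total_b):
    """Log-likelihood score for a word seen a times in the target and b times in the reference."""
    expected_a = total_a * (a + b) / (total_a + total_b)
    expected_b = total_b * (a + b) / (total_a + total_b)
    ll = 0.0
    if a > 0:
        ll += a * math.log(a / expected_a)
    if b > 0:
        ll += b * math.log(b / expected_b)
    return 2 * ll

def keywords(target, reference, top=5):
    """Rank words that are relatively more frequent in the target than in the reference."""
    t, r = word_list(target), word_list(reference)
    nt, nr = sum(t.values()), sum(r.values())
    scored = [
        (w, log_likelihood(c, r[w], nt, nr))
        for w, c in t.items()
        if c / nt > r[w] / nr  # keep only over-represented words
    ]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:top]

# Hypothetical target and reference corpora.
target = "the corpus tool searches the corpus and lists corpus words"
reference = "the cat sat on the mat and the dog sat on the rug"
print(word_list(target).most_common(3))   # frequency-ordered word list
print(keywords(target, reference, top=3)) # words unusually frequent in the target
```

In this toy data "the" is frequent in the target word list, yet it does not appear in the keyword ranking because it is just as common in the reference corpus.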
KEYWORD LIST TOOLS
Word Clusters
 In AntConc, multi-word units can be investigated using the Word Clusters Tool.
 It displays clusters of words centered on a search term and orders them
alphabetically or by frequency.
 The search terms can be specified as a substring, word, phrase, or regular
expression, as in the Concordancer, Plot, and View Files tools, and the number of
additional words to the left and right of the search term can also be specified. It is
also possible to set a minimum frequency threshold for the clusters generated.
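A rough sketch of cluster extraction, assuming a single-word search term for simplicity (the tool itself is described as also accepting substrings, phrases, and regular expressions). The function name `clusters` and its parameters are hypothetical.

```python
import re
from collections import Counter

def clusters(text, term, left=0, right=2, min_freq=1):
    """Count word clusters spanning `left` words before and `right` words after each hit of `term`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == term.lower():
            start, end = i - left, i + right + 1
            # Skip hits too close to the edges to form a full-size cluster.
            if start >= 0 and end <= len(tokens):
                counts[" ".join(tokens[start:end])] += 1
    kept = [(c, n) for c, n in counts.items() if n >= min_freq]
    # Order by descending frequency, then alphabetically.
    return sorted(kept, key=lambda x: (-x[1], x[0]))

# Hypothetical sample text.
text = ("In this paper we show the tool. In this paper we test the tool. "
        "In this study we stop.")
print(clusters(text, "this", left=1, right=1))
print(clusters(text, "this", left=1, right=1, min_freq=2))
```

Raising `min_freq` filters out one-off combinations and leaves the recurring multi-word units.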
WORD CLUSTERS / BUNDLES TOOL
Limitations of AntConc
 Concordancers can be divided into two types:
1) those that first build an index, which is used for subsequent search operations;
2) those that act directly on the raw text.
 Programs of the first type tend to be less flexible than those of the second, especially if
the user is often switching or modifying the target corpus for a particular need.
AntConc fits into the second category, performing all processing on the raw data
files and storing results in active memory.
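The two designs can be contrasted in a few lines. This is a simplified sketch, not a description of any particular concordancer: `build_index` and `scan_raw` are hypothetical names, and the index here is a plain in-memory dictionary.

```python
import re
from collections import defaultdict

def build_index(files):
    """Type 1: map each word to the (file name, token position) pairs where it occurs."""
    index = defaultdict(list)
    for name, text in files.items():
        for pos, tok in enumerate(re.findall(r"\w+", text.lower())):
            index[tok].append((name, pos))
    return index

def scan_raw(files, word):
    """Type 2: rescan the raw text of every file on each search."""
    hits = []
    for name, text in files.items():
        for pos, tok in enumerate(re.findall(r"\w+", text.lower())):
            if tok == word.lower():
                hits.append((name, pos))
    return hits

# Hypothetical two-file corpus.
files = {"a.txt": "We test the tool", "b.txt": "The tool works"}
index = build_index(files)       # must be rebuilt whenever the corpus changes
print(index["tool"])             # lookup without rescanning the files
print(scan_raw(files, "tool"))   # always reflects the current files
```

The index has to be regenerated after every change to the corpus, which is the flexibility cost noted above; the raw-text scan needs no preparation but repeats the full pass for each search.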
 One of the weakest areas of AntConc is in the handling of annotated data such as
data encoded in HTML/XML format. Although AntConc offers a simple way to view
or hide embedded tags used in HTML/XML and other annotation methods, much
more sophisticated methods need to be implemented if the full power of
annotated data is to be realized.
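Hiding embedded tags can be approximated with a single regular expression. The sketch below is an assumption about how such a feature might work, not AntConc's actual method, and the part-of-speech markup in the sample is invented.

```python
import re

# Matches anything that looks like an HTML/XML tag: '<', then non-'>' characters, then '>'.
TAG = re.compile(r"<[^>]+>")

def hide_tags(text):
    """Remove tag-like spans, keeping the text between them."""
    return TAG.sub("", text)

# Hypothetical annotated sentence.
annotated = ("<s><w pos='PRP'>We</w> <w pos='VBP'>test</w> "
             "<w pos='DT'>the</w> <w pos='NN'>tool</w>.</s>")
print(hide_tags(annotated))
```

Stripping tags this way discards the annotations entirely (for example, the `pos` values), which illustrates why more sophisticated handling is needed before annotated data can be searched on its own terms.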
Conclusions and Future Developments
 AntConc is a lightweight, simple and easy to use corpus analysis toolkit that has
been shown to be extremely effective in the technical writing classroom.
 In a later release, it is hoped that AntConc will be improved to handle annotated
data, in particular XML, in a much more powerful and intuitive way. XML data
includes header definitions that, if extracted, can be used as part of the search
criteria.
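One way header information could feed into a search is sketched below using Python's standard `xml.etree.ElementTree` module. Everything about the document layout (`doc`, `header`, `field`, `year`, `body`) is a hypothetical schema chosen for illustration; the slides do not describe a planned design.

```python
import re
import xml.etree.ElementTree as ET

def search_with_header(xml_docs, pattern, **criteria):
    """Return body texts matching `pattern` whose header fields equal all given criteria."""
    results = []
    for raw in xml_docs:
        root = ET.fromstring(raw)
        header_el = root.find("header")
        # Collect header fields as a tag -> text dictionary (empty if there is no header).
        header = {c.tag: (c.text or "") for c in header_el} if header_el is not None else {}
        if all(header.get(k) == v for k, v in criteria.items()):
            body = root.findtext("body", default="")
            if re.search(pattern, body, re.IGNORECASE):
                results.append(body)
    return results

# Hypothetical XML documents with a small header.
docs = [
    "<doc><header><field>engineering</field><year>2003</year></header>"
    "<body>We test the tool.</body></doc>",
    "<doc><header><field>biology</field><year>2004</year></header>"
    "<body>We grow the cells.</body></doc>",
]
print(search_with_header(docs, r"\bwe\b", field="engineering"))
```

Here the header acts as a filter applied before the text search, so the same search term can be restricted to, say, one discipline or year.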
