CORPUS LINGUISTICS
FARAH DIYANA BINTI AHMAD JEFIRUDDIN
DEFINITION
The study of language as expressed in samples (corpora) of "real
world" text.
HISTORY
HENRY KUČERA AND W. NELSON FRANCIS
- published Computational Analysis of Present-Day American
English (1967)
- the book contains a variety of computational analyses, combining
elements of linguistics, language teaching, psychology,
statistics, and sociology
RANDOLPH QUIRK
- published "Towards a Description of English Usage" (1960), in
which he introduced the Survey of English Usage
HOUGHTON-MIFFLIN
- published the American Heritage Dictionary (AHD), the first
dictionary to be compiled using corpus linguistics
- approached Kučera to supply a million-word, three-line citation
base for the dictionary
- the AHD combines prescriptive elements with
descriptive information
COLLINS
- published the COBUILD monolingual learner's dictionary
- designed for users learning English as a foreign
language, and compiled using the Bank of English
SURVEY OF ENGLISH USAGE CORPUS
- used in the development of A Comprehensive Grammar of the
English Language
MONTREAL FRENCH PROJECT
- the first computerized corpus of transcribed spoken
language
- contains one million words
ANDERSEN-FORBES
- a computerized corpus
- a database of the Hebrew Bible
- every clause is parsed using graphs representing
seven levels of syntax, and each segment is tagged
with seven fields of information
THE QURANIC ARABIC CORPUS
- an annotated corpus for the Classical Arabic
language of the Quran
- a recent project with multiple layers of annotation,
including morphological segmentation, part-of-speech
tagging, and syntactic analysis using
dependency grammar
METHODS
1) Annotation
2) Abstraction
3) Analysis
1) ANNOTATION
Annotation consists of the application of a scheme to
texts.
Annotations may include structural mark-up, part-of-speech
tagging, parsing, and numerous other
representations.
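As a rough illustration of annotation, the sketch below applies a tiny part-of-speech scheme to a sentence. The lexicon, the tag names and the `annotate` function are assumptions made up for this example; they are not taken from the slides or from any real tagger, and real schemes are far larger and context-sensitive.

```python
import re

# Hypothetical toy scheme: word -> part-of-speech tag.
TOY_LEXICON = {
    "the": "DET", "a": "DET",
    "cat": "NOUN", "mat": "NOUN", "dog": "NOUN",
    "sat": "VERB", "chased": "VERB",
    "on": "PREP",
}

def tokenize(text):
    """Split text into lowercase alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def annotate(text):
    """Return (token, tag) pairs; words missing from the lexicon get 'UNK'."""
    return [(tok, TOY_LEXICON.get(tok, "UNK")) for tok in tokenize(text)]

print(annotate("The cat sat on the mat."))
```

A lookup table is used here only because it keeps the idea visible: annotation attaches labels from a fixed scheme to the raw text.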
2) ABSTRACTION
Abstraction consists of the translation (mapping) of
terms in the scheme to terms in a theoretically
motivated model or dataset.
It typically involves linguist-directed search, but may
also include, for example, rule-learning for parsers.
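One way to picture abstraction is as a mapping from the annotation scheme's tags to the categories of a model, followed by a search the linguist directs. The sketch below assumes a hypothetical fine-grained tag set, a hypothetical mapping to coarser classes, and a hand-tagged sentence; none of these come from the slides.

```python
# Hypothetical mapping from fine-grained scheme tags to coarser model classes.
TAG_TO_CLASS = {
    "DT": "DET", "NN": "NOUN", "NNS": "NOUN",
    "VBD": "VERB", "VBZ": "VERB", "IN": "PREP", "JJ": "ADJ",
}

# Hand-tagged example sentence (illustrative data only).
annotated_sentence = [
    ("the", "DT"), ("dogs", "NNS"), ("chased", "VBD"), ("a", "DT"), ("cat", "NN"),
]

def abstract(tagged):
    """Map each (token, tag) pair to (token, class); unmapped tags become 'OTHER'."""
    return [(tok, TAG_TO_CLASS.get(tag, "OTHER")) for tok, tag in tagged]

def find_pattern(abstracted, pattern):
    """Return the token sequences whose classes match the given class sequence."""
    n = len(pattern)
    hits = []
    for i in range(len(abstracted) - n + 1):
        window = abstracted[i:i + n]
        if [cls for _, cls in window] == pattern:
            hits.append([tok for tok, _ in window])
    return hits

dataset = abstract(annotated_sentence)
print(find_pattern(dataset, ["DET", "NOUN"]))
```

The search pattern (`DET` followed by `NOUN`) stands in for the linguist-directed query; the resulting hits form the dataset that later analysis works on.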
3) ANALYSIS
Analysis consists of statistically probing, manipulating
and generalising from the dataset.
It may include statistical evaluations, optimisation of
rule bases, or knowledge discovery methods.
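A very simple form of analysis is comparing how often each category occurs in two samples. The sketch below is only an assumption about what such a step could look like: the two class sequences are invented, and relative frequency is used in place of any particular statistical test.

```python
from collections import Counter

def relative_frequencies(classes):
    """Return each class's share of the sample (an empty sample gives an empty dict)."""
    counts = Counter(classes)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}

# Two invented samples of word classes, standing in for abstracted corpus data.
sample_a = ["DET", "NOUN", "VERB", "DET", "NOUN"]
sample_b = ["NOUN", "VERB", "VERB", "ADJ", "NOUN", "VERB"]

freq_a = relative_frequencies(sample_a)
freq_b = relative_frequencies(sample_b)

# Print the classes side by side so differences between samples are visible.
for cls in sorted(set(freq_a) | set(freq_b)):
    print(cls, round(freq_a.get(cls, 0.0), 2), round(freq_b.get(cls, 0.0), 2))
```

Relative rather than raw counts are used so that samples of different sizes can be compared; a real study would normally follow this with a significance test.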