An Approach to the Automatic Extraction of Complex Predicates in Bengali
by MEGHADITYA ROY CHAUDHURY (BCSE-III), Jadavpur University
What are Complex Predicates?
Complex Predicates are defined as predicates that are composed of more than one grammatical element (either morphemes or words), each of which contributes a non-trivial part of the information of the complex predicate (Alex Alsina, 1996).
Complex Predicates contain (verb + verb) or (noun/adjective + verb) combinations in South Asian languages (Hook, 1974).
Identifying Complex Predicates in Bengali
Bengali has fewer computational resources than English, partly because of its rich morphology.
Because identifying Complex Predicates requires knowledge of morphology, extracting them automatically is a challenge.
Benefits of Identification of Complex Predicates
Detection and interpretation of complex predicates are important for tasks such as machine translation, information retrieval, and summarization.
Even a simple listing of complex predicates is a valuable linguistic resource for lexicographers, wordnet designers, and other NLP system designers.
Approach to the Identification of Complex Predicates
A Rule-Based Approach.
In this project, I follow an algorithm for the automatic extraction of Complex Predicates from an untagged corpus using only a morphological analyzer and a root lexicon.
Approach to the Extraction of Complex Predicates in Bengali Language
Complex Predicates in Bengali are of two types: Compound Verbs and Conjunct Verbs.
Compound Verbs: Verb + Light Verb
Conjunct Verbs: Noun/Adj + Verb
In a Compound Verb, the second verb is called the Light Verb.
16 Light Verbs in Bengali
• aSa ‘come’
• rakha ‘keep’
• deoya ‘give’
• paTha ‘send’
• neoya ‘take’
• bOSa ‘sit’
• jaoya ‘go’
• phEla ‘drop’
• dãRa ‘stand’
• ana ‘bring’
• pOra ‘fall’
• bERano ‘roam’
• tola ‘lift’
• oTha ‘rise’
• chaRa ‘leave’
• mOra ‘die’
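The two patterns and the light verb list can be combined into a simple rule-based check. The following is a minimal sketch, not the project's actual code: the function name `find_complex_predicates`, the `(surface, root, pos)` tuple layout, and the coarse tags `V`, `N`, and `ADJ` are assumptions made for illustration, and the sample tokens are hand-written with approximate romanization. It assumes a morphological analyzer has already supplied the root and category of each word.

```python
# Light verb roots, romanized as on the slide.
LIGHT_VERBS = {
    "aSa", "rakha", "deoya", "paTha", "neoya", "bOSa", "jaoya", "phEla",
    "dãRa", "ana", "pOra", "bERano", "tola", "oTha", "chaRa", "mOra",
}


def find_complex_predicates(tokens):
    """Scan adjacent word pairs for the two patterns.

    tokens: list of (surface, root, pos) tuples. The tuple layout and the
    tags 'V', 'N', 'ADJ' are hypothetical, chosen only for this sketch.
    """
    found = []
    for (s1, _r1, p1), (s2, r2, p2) in zip(tokens, tokens[1:]):
        if p2 != "V":
            continue  # both patterns end in a verb
        if p1 == "V" and r2 in LIGHT_VERBS:
            found.append(("compound", s1, s2))   # Verb + Light Verb
        elif p1 in ("N", "ADJ"):
            found.append(("conjunct", s1, s2))   # Noun/Adj + Verb
    return found


# Hand-written sample, not real analyzer output.
sample = [
    ("kaj", "kaj", "N"), ("kOre", "kOra", "V"),
    ("mere", "mara", "V"), ("phello", "phEla", "V"),
]
for kind, first, second in find_complex_predicates(sample):
    print(kind, first, second)
```

Note that treating every Noun/Adj + Verb pair as a conjunct verb over-generates, since an ordinary object followed by its verb matches the same pattern; a practical system would need additional filtering.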
Bengali Shallow Parser
The analysis begins at the morphological level and accumulates the results of the POS tagger and the chunker.
The final output combines the results of all these levels and shows them in a single representation (called Shakti Standard Format).
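To use the parser's output, the combined representation has to be read back into tokens with their roots and tags. The sketch below assumes the common Shakti Standard Format text layout: tab-separated columns (address, token, tag, feature structure), chunks opened by `((` and closed by `))`, and an `af='...'` attribute whose first comma-separated field is the root. The function name `parse_ssf` and the sample lines are hypothetical and hand-written for illustration; they are not actual output of the Bengali Shallow Parser.

```python
import re

# Matches the af='root,category,...' attribute inside a feature structure.
AF_RE = re.compile(r"af='([^']*)'")


def parse_ssf(text):
    """Return a list of chunks, each {'tag': str, 'tokens': [(surface, root, pos)]}."""
    chunks = []
    current = None
    for line in text.splitlines():
        cols = line.strip().split("\t")
        if len(cols) >= 3 and cols[1] == "((":
            current = {"tag": cols[2], "tokens": []}   # chunk opens
            chunks.append(current)
        elif cols[0] == "))":
            current = None                             # chunk closes
        elif current is not None and len(cols) >= 3:
            match = AF_RE.search(cols[3]) if len(cols) > 3 else None
            # Fall back to the surface form when no root is available.
            root = match.group(1).split(",")[0] if match else cols[1]
            current["tokens"].append((cols[1], root, cols[2]))
    return chunks


# Hand-written sample in the assumed layout.
SAMPLE = "\n".join([
    "1\t((\tNP",
    "1.1\tkaj\tNN\t<fs af='kaj,n,,sg,,d,0,0'>",
    "\t))",
    "2\t((\tVGF",
    "2.1\tkOre\tVM\t<fs af='kOra,v,,,,,,'>",
    "\t))",
])

for chunk in parse_ssf(SAMPLE):
    print(chunk["tag"], chunk["tokens"])
```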
The Console Output of the Bengali Shallow Parser
Functions That Work in the Background
• Load_resource()
• morph_file_creating()
• Find_complex_predicate()
• prepareOutput()
• deleteFile()
Conclusion
The algorithm depends heavily on the Bengali Shallow Parser, so it inherits the errors that have crept into the parser tool.
This can be improved by reducing that dependence and developing a more self-sufficient algorithm.
Doing so calls for a large amount of work in the future.