This document discusses approaches to analyzing lexis and grammar in texts. It examines both discourse-based, top-down approaches, which identify characteristic lexical and grammatical traits across an entire text, and sentence-level, bottom-up approaches, which discover linguistic traits through empirical observation of individual sentences. Specific aims include analyzing language features for language learning or for writing in a particular discipline (ESP/EAP). Examples cover academic registers and comparisons of lexical usage among non-native speakers, native speakers in different disciplines, and corpora in different languages and contexts. The document emphasizes examining both relative word frequencies and collocations/colligations to understand appropriate usage and identify potential gaps for language learners.
1. The vocabulary / grammar component?
Look at these examples and determine what components in language can be observed (e.g., vocabulary, grammar, text, register…). Explain why.
a) To relate to + something vs. to be related to + someone
b) Had he + past participle… / he would + past participle
c) Could I please have …? (in a restaurant)
d) To make + someone + adjective vs. to make + something + preposition (up, into…)
e) The data are examined vs. The data is run
4. Sentence level (bottom-up approach)
e.g., Johns, 1991; Aston, 1997; Bernardini, 2000; Curado, 2002; De Cock, 2003; Yeung, 2009; Boulton, 2010
AIMS: To discover / assess linguistic traits for language learning via empirical observation
5. Specific aims (e.g., writing in a discipline): ESP / EAP / EPP …
-- EFL countries (Brazil, France, Spain...)
-- Working with specialized corpora (academic, professional...) to both identify and propose language / teaching solutions (key phraseology, rhetorical items, etc.) within or across disciplines
6. Academic register (Example 1 of lexico-grammatical approach)
Start = “the good old semi-technical lexis” with hugely different frequencies, collocations, and meanings across disciplines (cf. Hyland, 2009; Durrant, 2009 …)
e.g., applied linguistics (on the other hand + textual act) vs. electrical engineering (as shown in figure + research oriented)
In a discipline, e.g.:
1) Computer Science (+ empirical / experimental, + research...)
2) Analysis and DDL for academic discourse competence (Spanish faculty / graduate students reporting on their research)
7. EXAMPLE: C.S. (NNS) vs. Humanities (NS) vs. C.S. (NS)
[Bar chart. Series: NNS Corpus; BNC selection; NS Computer Science. Categories: tokens (in 10,000s); types (in 1,000s); STTR; # of texts; # of disciplines; # of genres / text types.]
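The STTR (standardized type/token ratio) listed among the corpus statistics above can be sketched as follows. This is a minimal illustration, not the tool used for the slides: the tokenizer, the function names, and the toy sample are assumptions, and the chunk size is set to 9 only so the toy text yields full chunks (a chunk of 1,000 words is a common choice for real corpora).

```python
import re

def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"[a-z']+", text.lower())

def sttr(tokens, chunk_size=1000):
    """Mean type/token ratio (in %) over consecutive full chunks of tokens."""
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size * 100)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)

# Toy sample (hypothetical): one 9-token sentence repeated three times.
sample = "the data are examined and the data is run " * 3
tokens = tokenize(sample)
print(len(tokens), "tokens,", len(set(tokens)), "types")
print("STTR:", round(sttr(tokens, chunk_size=9), 1))
```

Averaging over equal-sized chunks is what makes the ratio comparable across corpora of different sizes, since a raw type/token ratio falls as a text gets longer.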
8. Relative word frequencies
WORD: NNS Corpus for Case Study vs. BNC selection
IN >
TO <
FOR >
AS >
THAT >
IS <
WE >>
HAVE >
CAN >>
AT >
USE >>
FROM <
WHICH =
BUT >
C.S. NS (40,180 tokens)
Is
For
That
Be
Are
As
With
This
By
On
It >
From
Was >
Can
Not >
Which
Have
We <
Within
These
At
Were
Also
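A comparison of relative word frequencies like the one above can be sketched as below. The two toy texts, the function names, and the 15% tolerance used to decide between ">", "<" and "=" are assumptions for illustration; the slides do not state how the symbols were derived.

```python
import re
from collections import Counter

def relative_frequencies(text, per=10_000):
    """Map each word to its frequency per `per` tokens."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return {word: count / len(tokens) * per for word, count in counts.items()}

def compare(word, freqs_a, freqs_b, tolerance=0.15):
    """Return '>', '<' or '=' for corpus A relative to corpus B.

    Frequencies within `tolerance` of the larger value count as similar
    (an assumed threshold).
    """
    a = freqs_a.get(word, 0.0)
    b = freqs_b.get(word, 0.0)
    if abs(a - b) <= tolerance * max(a, b):
        return "="
    return ">" if a > b else "<"

# Hypothetical miniature corpora standing in for the NNS and NS samples.
nns_text = "we can use the model and we can use the data"
ns_text = "the model is used and the data is examined in the study"
nns_freqs = relative_frequencies(nns_text)
ns_freqs = relative_frequencies(ns_text)

for word in ("we", "is", "and"):
    print(word.upper(), compare(word, nns_freqs, ns_freqs))
```

Normalizing to a common base (here per 10,000 tokens) is what allows corpora of different sizes to be compared directly.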
10. T-scores = +2.0; M.I. scores = +3.0 (Collocational strength; Clear, 1999)
Freq. of node ‘new’ f(n): 221
Freq. of collocate ‘technologies’ f(c): 123
Freq. of node and collocate within span: 16
Size of corpus: 500,120
We observe that (2.6 / 11.4) (CS NNS)
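The frequencies given for ‘new’ + ‘technologies’ can be turned into collocation scores with the textbook formulas for mutual information and t-score. This is a sketch under assumptions: it uses the simplest versions of both measures, with no adjustment for span width, so the figures may differ from those produced by the software behind the slides.

```python
import math

def expected_cooccurrence(f_node, f_collocate, corpus_size):
    """Co-occurrences expected by chance if the two words were independent."""
    return f_node * f_collocate / corpus_size

def mi_score(observed, f_node, f_collocate, corpus_size):
    """Mutual information: log2 of observed over expected co-occurrence."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size)
    return math.log2(observed / expected)

def t_score(observed, f_node, f_collocate, corpus_size):
    """T-score: (observed - expected) / sqrt(observed)."""
    expected = expected_cooccurrence(f_node, f_collocate, corpus_size)
    return (observed - expected) / math.sqrt(observed)

# Values from the slide: node 'new', collocate 'technologies'.
f_n, f_c, f_nc, size = 221, 123, 16, 500_120
mi = mi_score(f_nc, f_n, f_c, size)
t = t_score(f_nc, f_n, f_c, size)
print(f"MI = {mi:.2f}, t = {t:.2f}")
print("passes both thresholds:", mi >= 3.0 and t >= 2.0)
```

MI rewards exclusivity (pairs that rarely occur apart), while the t-score rewards sheer frequency of co-occurrence, which is why the two thresholds are usually applied together.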
11. Lexical / grammatical patterns
Specialized collocations (Topic / area)
E.g., record + file
E.g., New information technologies
E.g., The use of [+ technology]
E.g., information + available on + digital media
Lexical-rhetorical (Genre / text type)
E.g., With respect to (+ concept)
E.g., In this paper we
E.g., As far as [+ subject] is concerned
E.g., This is found to be (passive)
12. Contrastive information: collocations, colligations, semantic associations, textual colligations (Hoey, 2005)
*Similar use
*C.S. NNS gap = 15% more in BNC
*C.S. NNS use = 15% more in NNS
e.g., appear* to be (similar) / to ensure that (NS) / in this sense (NNS)
*[and NS and NNS field-driven?]
13. Word use: similar use / NS only / NNS only
Collocation
Similar use: Appear* + to be (20 / 20.4 / 19%)
NS only: It is possible to (28 / 8 / 28.4%)
NNS only: We observe that (0 / 14.7 / 0%)
Colligation
Similar use: The basis for (Direct Object) (26.3 / 17.6 / 21%)
NS only: Noun + to (no purpose / no reported speech) (26.5 / 1.2 / 17%)
NNS only: Be + asked to (present tense) (0 / 61.5 / 13%)
Semantic Association
Similar use: In the field of + area (20 / 11.5 / 16%)
NS only: To be seeking + functionality (28 / 0 / 18%)
NNS only: Related to + concept (26 / 76.9 / 36%)
Textual Colligation
Similar use: As a result of (beg. paragraphs) (20 / 31.5 / 26%)
NS only: One of the most + adj. (beg. sentences) (23.2 / 4.3 / 19%)
NNS only: For this reason, (beg. sentences) (2.9 / 20 / 3%)
14. Genre and subject / field
Lexical use by genre and by subject
Collocation
Genre: Such as + examples (52 / 56% -- C.S. papers)
Subject: If and only if (71.4% -- BNC: Logic)
Colligation
Genre: I had + past participle (47% -- BNC reports)
Subject: is + to be + past participle (22 / 17.8% -- C.S.: IT and networking)
Semantic Association
Genre: Be + applied to + area (17 / 25.6% -- C.S. paper Introductions & Method)
Subject: Be / appear + on the right + side (19 / 26.6% -- C.S.: graphical design)
Textual Colligation
Genre: There is no + noun (beg. paragraphs) (34.8% -- BNC articles)
Subject: This form + be completed (beg. paragraphs) (16.4% -- BNC: survey reports)
15. Correlating frequency and use
e.g., “we + observe” vs. “subj + has been / was observed” (also CS)
1) There is a more open use of words in patterns by NS authors (e.g., observe > this is observed to be / we observe / this has been observed to …)
2) The NNS limitation often reflects the rigid influence of formulaic items & fossilization (cf. K. Hyland’s claim that the semi-technical items should follow the research-oriented stylistic inclination more in engineering = many choices for patterns)
16. Word use and context
[Bar chart: Lexical use according to variables. Categories: collocations; colligations; semantic associations; textual colligations. Axis: number of items (0 to 20). Series: NNS and NS; Subject; Genre; NS; NNS.]
17. Discipline versus NNS (Spanish) writing interference: How much?
• L1 transfer problems with collocates and also fossilized structures
18. Data management on-line (e.g., Sketch Engine)
Double objective: distinguish the most appropriate use & work with more phraseological possibilities = enrich writing (genre & field)
19. Key points so far
-- Relative frequencies as key references of use
-- The lexico-grammatical component in specific text (top-down)
-- Statistical information on word behavior (bottom-up)
-- Exploring content + content / function + content elements: overuse, underuse, and misuse by non-native speakers
-- L1 transfer problem and fossilized items
20. Text type-focus (more examples)
In our organisation, we are just in the process of finalising our new 3-year rolling Strategic Plan. Crucial to achieving the objectives set in the Plan will be the implementation of a large number of new projects/initiatives that will have an impact on every part of the organisation.
A computer technical report?
An academic lecture?
The Pet Rock, the White Power Rangers, the Beanie Babies and the Furbies were toys that achieved success in different years without coming from the rule-book or the experience database of any single company. Despite their yesterday's success, the producers of such toys are not guaranteed a place in the future that doesn't compute based on yesterday's historical data.
A business review?
A piece of news? (Give reasons for your choice)
21. Activities in class and collocations / patterns
Another type of encryption designed primarily for business-to-business information exchange
involves both a public and a private key. The company that plans to exchange data with another
company provides it with a public key. This public key is used to encrypt data for transmission
between the companies but is not used for decryption. The receiving company uses its private
key to decrypt the data upon receipt.
Data sent over the Internet runs the risk of being changed by a hacker during transmission. Data
alteration includes deleting data, adding a virus to destroy data or report data back to the hacker,
and altering a business transaction. Using digital signatures can reduce these risks. A digital
signature contains a hash code derived from the data per se. Any data modification will cause a
different hash code that will not match the digital signature. After the digital signature is
encrypted within the message, the message is sent to the recipient, first with the sender’s private
key and then with the receiver’s public key. Furthermore, the recipient must decrypt the message
first with its private keys and then with the sender’s public key. This method ensures that the
message can come only from the sender.
Unregistered transactions: A business transaction may run the risk of being sent but not
received. This risk can be costly if the transaction is in response to a limited-time offer, such as a
bid on a government contract. The receipt of an important transaction should be confirmed by
sending an acknowledgment message back to the sender. Corporations find themselves at the
mercy of Internet hackers and vandals. They are looking for different ways to protect their own
networks against intrusion from hackers. Companies must not only prevent unauthorized users
from accessing private and sensitive data and resources but must also prevent unauthorized
22. • NEED FOR TECHNICAL COMPOUNDS (Collin et al., 2004; Kaplan, 2000)
• Management control system -- management control
• bit array -- number of bits
• online tax preparation software -- tax software
• resource-based view of the firm -- view of the firm
• Discussing ‘solutions’ (L.A. Robb, 1996):
sistema de gestión controlada; un string de bits; un bit array*; software de tasas**; preparación de tasas online**; visión de la firma basada en riqueza??***
GENRE / TEXT TYPE VARIATIONS:
In this paper we… -- En este papel*** -- RESEARCH PAPER
It was argued that… -- Se argumentó que* -- PROCEEDINGS
(company +) Sales analysis reported that… -- El análisis de ventas reportó* -- TECHNICAL REPORT
Get your company started -- Coge a tu compañía empezada*** -- WEB SITE
Our paper -- Nuestro papel*** -- ABSTRACT
In the current example -- En el corriente ejemplo*** -- TEXTBOOK
Rhetorical-discursive markers
1. Nominal wages increase because of a demand impulse in
2. experienced tremendous growth because of the demand
3. for when the market for enforcement is tighter, either because of high demand or because of low supply
4. service sectors are picking up because of strengthening demand.
¿debido a …?
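Concordance lines like the ‘because of’ examples above can be produced with a small keyword-in-context (KWIC) routine. The sketch below is an assumption about how such lines might be generated, not the tool used for the slides; the function name, the context width, and the two-sentence sample (adapted from the lines above) are illustrative.

```python
import re

def kwic(text, phrase, width=30):
    """Return one aligned concordance line per occurrence of `phrase`."""
    lines = []
    for match in re.finditer(re.escape(phrase), text, flags=re.IGNORECASE):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the node phrase lines up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines

sample = (
    "Nominal wages increase because of a demand impulse. "
    "Service sectors are picking up because of strengthening demand."
)
for line in kwic(sample, "because of"):
    print(line)
```

Aligning the node phrase in a column lets learners scan what typically precedes and follows it, which is the basis for comparing it with an L1 equivalent such as ‘debido a’. The match is a plain substring search, so it does not check word boundaries.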
COLLOCATIONS
corporate + LAW
+ IMAGE
+ GOVERNANCE
+ CONTROL
+ REPORT
+ PERFORMANCE
+ FINANCE
+ SECTOR
CHECK for instance:
E.g., INFORME TÉCNICO / RENDIMIENTO DE LA EMPRESA…
Observing language / L1 / L2
23. Text type & field / topic focus (can one guess?)
According to our historical data, …
The paper describes our research findings…
For the results above, a similar phenomenon has been found in a different site…
Maybe I should emphasize the importance of that concept…
If and only if X > Y can we then assume…
Sorry, I couldn’t hear your questions…
24. Enhancing tools for the relation between lexico-grammatical items and text / discourse
36. Lexis and grammar in the conversation register (Example with children)
• Speakers use parallel forms, e.g., patterned question-and-answer replies (Carter, 2004)
• Language-in-action collaborative tasks among speakers (McCarthy, 1998)
• Categories: age, nationality, situation / topic…
• Example: Children in USA: English / Spanish
Child Age
Aimee 5;4.0
Justin 4;6.0
Melissa 3;4.0
Trevor 4;3.0
Willie 6;1.0
38. Frequency + dispersion (DCLs)
• Overall similarities and differences:
1. + inter-personal statements
2. + everyday words / worlds (coche / boy)
3. + markers, references (esto / aquí / then)
4. 2nd vs. 3rd persons
5. Concise / short sentences vs. longer ones; less vs. more opinion (me parece)
6. Age levels
Corpora (in the order of the three word lists that follow): American English (monolingual); Spain’s Spanish (monolingual); Spanish / English (bilingual Latin American in USA)
Word TOTAL
You 30921
I 27118
A 23615
Be 23388
The 20701
It 20222
What 16925
To 15343
Do 14944
That 14056
Dem 10622
Not 9415
And 8774
Go 8507
This 7871
In 7848
No 7597
On 7351
One 7227
Have 7128
Word TOTAL
A 25204
No 23096
Que 19932
La 16372
El 16303
Es 13580
Se 12636
Qué 12477
De 10391
Sí 10365
Éh 8511
Lo 7069
En 6673
O 6071
Me 5999
Aquí 5951
Está 5317
Mira 5298
Los 5201
Mí 4610
Word TOTAL
No 3485
A 3468
Y 3209
Que 2843
El 2162
La 2010
Sí 1723
Es 1609
Eh 1482
Aquí 1386
Lo 1272
Un 1261
De 1226
Se 1191
Me 1111
Cómo 1078
Te 1076
Ya 1047
Está 946
Yo 889
American English (monolingual):
I don’t know
I’m goin(g) to (5 & 4 years)
Mommy, you… (all)
I’m not gonna (5 years)
I want ta go (4 years)
You want to…? (4 & 3 years)
I’m gonna (6 years)
You have to
You open it
I not going to (3 years)
Spain’s Spanish (monolingual):
A ver si
A lo mejor
No sé qué es (6 & 5 years)
Es que como no… (6 years)
Porque no + verb (6 & 5 years)
A mí no me gusta (6, 5 & 4 years)
Mira lo que + verb (4 years)
Pues creo que
Lo tienes que
Y luego (5, 4 & 3 years)
Spanish / English (Bilingual Latin American in USA):
Y ya está
Y lo pone en
Y luego (6 & 5 years)
Me voy a + verb (all)
No me acuerdo (all except 6 years)
No se puede
Me parece que (4 years)
Sí es eso
Y yo también (all)
Mamita, el de… (3 years)
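Recurrent word sequences such as the ones listed above can be extracted by counting n-grams within each utterance. The sketch below is only an illustration under stated assumptions: the cluster length (three words), the minimum frequency, and the sample utterances are hypothetical choices, not values taken from the study.

```python
from collections import Counter


def clusters(utterances, n=3, min_freq=2):
    """Count n-word clusters inside each utterance and keep the recurrent ones."""
    counts = Counter()
    for utterance in utterances:
        tokens = utterance.lower().split()
        # Slide a window of n tokens; never cross an utterance boundary.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return [(cluster, freq) for cluster, freq in counts.most_common() if freq >= min_freq]


# Hypothetical child utterances (invented for illustration).
utterances = [
    "i don't know",
    "mommy i don't know",
    "you have to open it",
    "you have to",
    "i don't want to",
]
result = clusters(utterances)
for cluster, freq in result:
    print(freq, cluster)
```

Counting inside utterance boundaries keeps clusters from spanning two speakers' turns, which matters in conversational data where turns are short.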
39. Examples:
• Questions asked by adults vs. Children (3 / 4)
• Structures (e.g., Be + going to / gonna) (3 / 4)
Age level-related development
3 and 4
Freq. | Field – Year 3 | Field – Year 4
1 | Do you have... / would you like (adult) / where did you ... (adult) / what else did you... (adult) / why don't you... (adult) / what do you call... (adult) | I don't (want) / I don't see (no birds) / I'm finished
2 | I don't know / I don't think you (adult) / I want to (go) / I going to / I don't want to / I want some (more) / mommy, I want (a) | you have to / mommy, you... / how you do it / how do you do it / where you going
3 | Chug a chug a chug / make a (dog) (adult) / make a (plane) (child) | it looks like a / dis is a / I never heard of a / it's gonna be a
4 | Oh yeah? Oh look it | what does it say / you turn it
5 | what kind of... (adult) | I like to / would you like to (mother)
6 | play with (toy) | what is dis / what is that (mother)
40. • Comprehension / production according to age:
keyness (vs. other ages and other directories)
e.g., + likes and dislikes / commands (all since 3)
+ declaratives / questions (since 4)
+ numbers (5) / + colours (6)
Negative: specific words (e.g., “suitcase” – age 3 / “dem” – age 5, etc.)
Possible applications for pedagogy
Keyword type: POSITIVE
Year 3: I, [NAMES], A, HE, GOIN(G), [WHY], D(O), [YA], UH, DIS, MONKEY, BUGS, IS, [THIS], DOSE, THA, DEY, DE, TIRE, KNOCK
Year 4: [NAMES], I, DE, DIS, MOMMY, I’M, [COULD], GRAIN, DAT’S, GONNA, [INFINITIVES], NOW, DESE, NEED, DERE, CAN, MOM, [AUXILIARIES], HEY, PAINT
Year 5: ALLIGATOR, OLD, LADY, [BALANCE], FIVE, YUP, OK, HOW, [MHMH], [NUMBERS], ASK, GIRAFFE, [SOMETIMES], THINK, GOP, THIS, SCHOOL, GUTCHET, NINE, WE
Year 6: [PICK CARDS], PENGUIN, BLUE, WIN, BACKWARDS, PENGUINS, GREEN, CARDS, I, ROBBIE’S, CANDYLAND, GAME, MOVED, HATE, STAY, PURPLE, PICKED, THEY, HEARTS, [ORDER]
Keyword type: NEGATIVE
HMM, BOOM, YUM, SUITCASE, GAIN, DOLLY, BOOKS, KNOWS, TV, FIT, [MOTHER’S], ICE, BATH, BOOM, WORD, HAPPY, TOY, HOUSE, SHAPE, CHAIR, WHO, WHAT’S, OKAY, TAKES, DEM, FO, BRIDGE, REINDEER, DAT, MARBLE, BREAK, ‘T
41. • Broader Contrastive View:
Overall key items in English (vs. BNC sampler)
and Spanish (vs. Written material—news,
essays, ads—on the web)
keyness :
e.g., + questions (what / qué)
+ personal inclinations (I want / quiero...)
+ negation / dislikes / commands (don’t /
no / ...)
EFL content for Spanish learners
American English | Spain’s Spanish
Word   Keyness   | Word   Keyness
You    65,208.5  | Qué    7,722.3
Dem    60,718.0  | No     6,660.0
What   53,113.5  | Sí     6,572.3
Do     38,399.9  | Te     4,604.5
Go     28,287.0  | Mira   3,733.8
I      24,541.8  | Aquí   3,411.6
A      21,785.5  | Está   3,213.1
It     21,218.3  | Mí     2,828.3
Zero   20,639.2  | Me     2,240.5
Not    18,937.5  | Ver    2,202.0
Mommy  18,612.8  | Di     2,197.7
Want   16,661.1  | Eh     2,183.7
No     16,515.1  | Ti     2,136.9
That   16,035.6  | Ah     2,044.7
Don't  15,150.2  | Ay     1,590.0
Oh     14,576.8  | Yo     1,575.6
Here   14,421.3  | Ahí    1,555.9
Huh    14,326.5  | Esto   1,292.1
Put    14,182.9  | Así    1,271.5
See    13,663.4  | Ahora  1,244.9
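A keyness score compares a word's frequency in the study corpus with its frequency in a reference corpus. The slides do not say which statistic produced the scores, so the sketch below assumes the log-likelihood measure in its common two-term form; the function name, the sign convention, and all counts are hypothetical and chosen only for illustration.

```python
import math


def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word: study corpus vs. reference corpus.

    The sign is an added convention: positive means the word is relatively
    more frequent in the study corpus, negative means relatively less frequent.
    """
    total_size = size_study + size_ref
    combined = freq_study + freq_ref
    # Expected counts if the word were equally likely in both corpora.
    expected_study = size_study * combined / total_size
    expected_ref = size_ref * combined / total_size

    score = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    score *= 2

    if freq_study / size_study < freq_ref / size_ref:
        score = -score
    return score


# Hypothetical counts: a word seen 100 times in 1,000 tokens of child speech
# and 10 times in 1,000 tokens of a reference corpus.
positive = log_likelihood(100, 1000, 10, 1000)
negative = log_likelihood(10, 1000, 100, 1000)
neutral = log_likelihood(10, 1000, 10, 1000)
print(round(positive, 2), round(negative, 2), neutral)
```

Words with large positive scores correspond to the positive keywords on the slides, and words with large negative scores to the negative ones.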
43. • COLLABORATIVE PLAY / TASKS
• DYNAMIC AND VISUAL
• RECEIVE AND
PRODUCE INFORMATION
Resources for pedagogical aims in the
children’s lessons
44. 1. Interpersonal (i.e., use of first and second person
pronouns, vocative words, commands);
2. Declarative (demonstrative pronouns and
adjectives, third person statements, expression of
preferences and dislikes);
3. Markers (discourse connectors, interjections,
gambits)
4. Nouns (30 % English / 26 % Spanish); 14.6 %
verbs / 7 % adjectives
Linguistic-discursive priorities
• 60 keywords at each age level > t-scores
1. Interpersonal = years 3 and 4 (E = S)
2. Declarative = years 4 and 3 (E); 3 (S)
3. Markers = years 4 and 5 (E); 5 and 6 (S)
4. Nouns = years 5 and 4 (E); 5 (S)
*MOT: want to take it apart first ? [interpersonal question]
*CHI: right here +... [marker / metadiscourse / production]
*MOT: how do you get it out ? [interpersonal question]
*MOT: how do you get the pieces out ? [interpersonal question / repetition]
*MOT: like this ? [question / metadiscourse / repetition]
*CHI: yeah . [answer / production]
*MOT: ok . [answer / marker]
*CHI: are ya gonna talk to it without the puzzles out of it ? [interpersonal question / production]
*MOT: yeah . [answer]
*MOT: <you can just put> [//] why don't you put a piece and then I'll put
a piece . [command / question]
*CHI: ok . [answer / marker / production]
*MOT: this looks like Mickey's head . [declarative / naming]
*MOT: is that his head ? [question / repetition]
*CHI: yep . [answer / production]
*MOT: ok . [answer / marker]
*CHI: there . [metadiscourse / production]
*MOT: now it's your turn . [marker / interpersonal prompt]
*CHI: um . [pause / marker / production]
*MOT: ok . [answer / marker]
*CHI: there . [metadiscourse / production]
*OBS: a ver # me dices como te llamas . [interpersonal question]
*CRI: Cristina Perez Perez . [answer / production]
*OBS: Cristina Perez Perez ? [question / repetition]
*OBS: oye que estabas haciendo ahora en clase ? [marker / interpersonal question]
*CRI: estaba escribiendo y pintando . [answer / declarative / production]
*OBS: y que estabas escribiendo y pintando ? [interpersonal question / repetition]
*CRI: escribiendo en el cuaderno azul . [answer / declarative / production]
*OBS: si # oye y que es el cuaderno azul ? [marker / interpersonal question / repetition]
*CRI: uno que tiene cuadrados rojos y lo voy a terminar . [answer / declarative / production]
*OBS: si y que te ha dicho la sor # que lo haces bien ? [marker / interpersonal question]
*CRI: si . [answer / production]
*OBS: y tambien pintas en ese ? [marker / metadiscourse / interpersonal question]
*CRI: &=afirma . [answer / production]
*OBS: y que pintas ? [marker / interpersonal question]
*CRI: pin [/] pinto cuadros . [answer / production]
45. Lessons: concepts and linguistic content, marked (X) for years 3, 4 and 5
Concepts:
Colours: X
Greetings / introductions: X X X
Numbers: X
Sizes and shapes: X X X
The weather: X
Feelings (love, hate …) and likes (I like / I don’t like): X X X
Specific vocabulary: X
Simple descriptions of objects, people ...: X X X
Vocabulary: X X X X
Naming of objects, people (simple definitions): X X X
Space / time orientation (up, down, near ...): X X X
Actions (read, jump, run): X X
Family: X X
Sensations, states of mind (happy, bored, I am cold…): X X
Daily routines (wash one’s hands, have breakfast…) and parts of the day: X X X
Linguistic content:
Like / Dislike: X X X
Prepositions: X X
Commands (Imperative): X X X
To be: X X X
It is …: X X X
Are you …?: X X
To have: X X X
Personal and possessive pronouns: X X X
Can / Could, Would you like …: X
Adjectives, comparative and superlative: X X
These is/are: X X
Do / does, Yes/no questions: X X
Wh / open questions, interrogative pronouns: X X
46. • Self-access and group interactivity with key language
at age (EFL and L1):
– Adaptive for age / knowledge level (e.g., focus on common
words, common structures, simple naming, defining ...)
– Assessment by teachers + other professionals (child
pedagogy / psychology / sociology counsellors...)
• Animations / graphics / visual aspects > motivation in
MULTIMODALITY (e.g., audiovisual references in
metadiscourse, interpersonal addresses, etc)
• Interaction via Computer & networking: learning and
playing too
Applications / Implications
47. • Think about the fields / topics that are
important for 12-15 year old teenagers:
> What words are more important and why?
Also think about how best to have
students access and exploit them.
48. The vocabulary / grammar
component in speech?
Determine what components can be observed (e.g.,
vocabulary, grammar, text, register…? Explain why)
a) Dunno about that, maybe, I ain’t sure, maybe
b) Had I known back then, then that’d’ve made some
difference!
c) Could you just shut up once and for all!
d) Whatever he’s thinking, he sure chews it up
e) How’re you doing? Fine, thanks
f) Needless to say, need I say more!
g) Just going for a stroll…!
Editor's Notes
--Learning how to draw information from lexical items in specific context --(Dp) MORE GENERAL = “guy” in which nationality of English? Why? GETTING MORE SPECIFIC = what type of English (register) = formal? Informal? Conversational? Written? / fixed phrases? (idioms?) / etc --(Dp) Main meaning of guy? In which nationality now? What registers? Press? Fiction? Newspaper article? (genre) MORE SPECIFIC --(Dp) MUCH MORE SPECIFIC = REGISTER + GENRE + SUBJECT IDENTIFICATION: WHICH IS WHICH? (Dp)
--(Dp) home-made electronic glossaries w/ activities to exploit lexical items (BIT LANGUAGE TO BE EXPLOITED ___ C.S. students design electronic resources (DATABASES -- END OF MAJOR PROJECT) (DP) --(DP) BABYLON FOR WRITING -- COMING HANDY WITH SPECIFIC COLLOCATIONS AND STRUCTURES (DECODING & ENCODING) = GOOD THING IS ALSO TO USE VISUALS
Home page once the WordSmith program has been registered.
Always start using a tool from the green button on the left.
To choose texts, use “Choose texts” and always look for the texts in the drives at the bottom left (wherever the texts are stored). Then select them when they appear in the small panel on the right (at the top, in green, choose “ALL” to select all of them, or pick them one by one with the cursor), and click “store” to save the selection and continue choosing from other folders on the left. Once all the texts have been chosen, click “store” and then “ok”.
When a given word is selected from a list and the C (concord) button is clicked, the search runs automatically. Searches can also be refined by specifying a context word for the search term (see image), by adding an asterisk * to a word to find its derived forms (e.g., compr* in Spanish returns all forms of comprar), or by separating several words with a slash ( / ) to search for all of them at once (e.g., compr*/ vend*).
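The wildcard and slash syntax described in the note above can be imitated with regular expressions. The sketch below is not WordSmith's implementation; the function names, the context width, and the sample sentence are hypothetical, and it only assumes that * stands for any run of word characters and that / separates alternatives.

```python
import re


def pattern_to_regex(query):
    """Compile a query such as 'compr*/ vend*' into a regular expression."""
    alternatives = []
    for term in query.split("/"):
        term = term.strip()
        if term:
            # Escape the term, then let '*' match any run of word characters.
            alternatives.append(re.escape(term).replace(r"\*", r"\w*"))
    if not alternatives:
        raise ValueError("empty query")
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def concordance(text, query, width=20):
    """Return (left context, match, right context) for every hit in the text."""
    lines = []
    for match in pattern_to_regex(query).finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append((left, match.group(0), right))
    return lines


# Hypothetical sample sentence (invented for illustration).
text = "Quiero comprar pan y vender fruta; ella compró leche."
hits = concordance(text, "compr*/ vend*")
for left, node, right in hits:
    print(f"{left:>20} [{node}] {right}")
```

Each hit is printed with its left and right context, which is the keyword-in-context layout that a concordancer displays.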
After clicking the clusters icon (the pink bars at the top right in concord), any expressions can be copied to the clipboard with “copy” and then pasted into a Word document, etc. The same applies to the word list (wordlist): select the desired words with the cursor, then copy and paste.
An example of pasting a set of expressions on the topic of accounting.
Example of collocations after clicking the collocations icon in concord (a drawing resembling a butterfly net): the highest frequencies in the positions to the left or right of the term under study appear in red. In this case the term is the noun accounting in accounting texts: for example, “creative accounting” occurs 96 times, “accounting data” 49 times, and Journal, followed by one other word and then accounting, 21 times.
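A collocation display of this kind counts which words occur at each position to the left and right of a node word. The following sketch shows the idea under stated assumptions: the span of two positions, the function name, and the token sequence are hypothetical, and the counts it produces are unrelated to the figures quoted in the note.

```python
from collections import Counter, defaultdict


def collocates_by_position(tokens, node, span=2):
    """Count the collocates of `node` at each offset from -span to +span."""
    table = defaultdict(Counter)
    node = node.lower()
    tokens = [token.lower() for token in tokens]
    for i, token in enumerate(tokens):
        if token != node:
            continue
        for offset in range(-span, span + 1):
            j = i + offset
            # Skip the node itself and positions outside the text.
            if offset != 0 and 0 <= j < len(tokens):
                table[offset][tokens[j]] += 1
    return table


# Hypothetical token sequence (invented for illustration).
tokens = ("creative accounting is common ; accounting data show that "
          "creative accounting persists").split()
table = collocates_by_position(tokens, "accounting")

for offset in sorted(table):
    print(offset, table[offset].most_common(2))
```

Offset -1 holds words immediately before the node (as in “creative accounting”), +1 holds words immediately after it (as in “accounting data”), and -2 holds words two positions to the left, the slot that Journal occupies in the note.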
Example of a frequency word list from this accounting corpus (in written English texts of this type, i.e. academic register, the first word is always the article “the”). As you can see, clicking the C icon runs a concordance for a given word once it is selected. Remember that, to run a keyword list, this kind of frequency list must first be saved in the program (File, save as) under a short name.
Example of a detailed consistency list. It will not be covered in this course, but it can be used to see not only frequency but also the distribution of words across more than one list.
Example of the “aligner”. It will not be covered in this course either, but it is used to build corpora of parallel texts (for example, for translation analysis).