Material of the Natural Language Processing (NLP) Workshop with STIC-Asia representatives and the Nepal team.
August 30-31, 2007.
Institution: Institut de Recherche en Informatique de Toulouse (IRIT)
Patan Dhoka, Lalitpur, Nepal.
Applicative evaluation of bilingual terminologies
Estelle Delpech
Material presented at the 18th Nordic Conference of Computational Linguistics (NODALIDA 2011), Riga, Latvia.
Download paper: http://hal.archives-ouvertes.fr/hal-00585187
Institutions: Laboratoire d'Informatique de Nantes Atlantique (LINA), Lingua et Machina
Corpus comparables et traduction assistée par ordinateur, contributions à la ...
Estelle Delpech
PhD thesis defense in Computer Science, specializing in Natural Language Processing.
Defended on July 2, 2013 at the Université de Nantes.
Thesis manuscript available here: http://tel.archives-ouvertes.fr/tel-00905930
Identification de compatibilités sémantiques entre descripteurs de lieux
Estelle Delpech
Presentation given at the 13th Conférence Francophone sur l'Extraction et la Gestion des Connaissances, December 31, 2013, Toulouse, France.
Video: http://www.canalc2.tv/video.asp?idVideo=11682
Associated paper: http://hal.archives-ouvertes.fr/hal-00912332
Découverte du Traitement Automatique des Langues
Estelle Delpech
Talk given at the "Toulouse Data Science" meet-up.
The talk is an introduction to the field of natural language processing (also known by its French acronym TAL, or as NLP, text mining, semantic analysis...). It is aimed at a general audience (computer scientists, statisticians, linguists, managers, the curious).
How to take notes and write paragraphs effectively
MissConnell
This document provides guidance on how to take effective notes and write paragraphs for the AS media exam. It recommends:
1. Structuring notes with the POINT-EVIDENCE-LINK format to organize ideas before writing the essay. Notes should be taken over four viewings of the clip for maximum information.
2. Writing paragraphs using the PEAL structure - Point, Evidence, Analysis, Link - to provide a clear argument. Analysis should use descriptive verbs and link back to the overarching question.
3. Potentially comparing elements like technical codes, stereotypes, or characters across multiple PEAL paragraphs for higher-level answers. Connectives help structure comparisons of similarities and differences.
The document discusses pretraining models for natural language processing tasks. It outlines several ways to pretrain models, including pretraining decoders as language models, pretraining encoders using a masked language modeling objective, and pretraining encoder-decoder architectures. The document also discusses how pretrained models can be finetuned on downstream tasks to improve performance.
This document provides a guide for creating preparation outlines for speeches. It outlines the key elements an outline should contain, including the title, specific purpose statement, central idea, introduction, body with main points and subpoints, conclusion, and bibliography. The body is meant to have three main points as an example, each with subpoints, though the actual number and organization may vary by topic. The introduction should gain attention, reveal the topic, establish credibility, and preview the body. The conclusion should let the audience know the speech is ending and reinforce the central idea.
The document outlines the 5 key steps in the writing process: 1) prewriting and planning, 2) drafting, 3) revising, 4) proofreading and editing, and 5) publishing. It describes various prewriting techniques and explains that drafting involves getting ideas down on paper without worrying about mistakes. The revision process involves refining the writing using the ARMS method of adding, removing, moving, and substituting content. Proofreading involves checking for errors in areas like spelling, grammar, and punctuation. Finally, publishing is sharing the final polished work.
The document discusses the Objective-C preprocessor and underlying C language features. It covers preprocessor directives like #define, #import, #include, and #undef. It also discusses arrays, structures, pointers, functions, and how Objective-C works with C at a low level. Blocks in Objective-C are described as being similar to functions but having some differences like needing to be declared in .m files instead of .h headers.
This document discusses analyzing writing prompts and structuring paragraphs. It provides guidance on identifying the components of a paragraph, including the topic sentence, supporting sentences, and clincher sentence. Readers are instructed to practice analyzing prompts, identifying sentence types within paragraphs, and writing their own structured paragraph in response to a writing prompt about how the lives of two individuals were impacted.
An essay is a group of paragraphs that discusses a single topic and central main idea. It typically contains at least three paragraphs, with five paragraphs being a common academic length. The structure of an essay includes an introduction with a general statement and thesis, body paragraphs that explain and support the thesis with evidence, and a conclusion that restates the main points. An outline is used to organize the information and structure of an essay in an ordered format using Roman numerals, capital letters, and Arabic numerals to denote the introduction, main ideas, supporting points, and details.
This document provides guidance on developing effective paragraphs with topic sentences and supporting details. It defines a paragraph as dealing with one main idea and recommends including a topic sentence that introduces the main idea. A topic sentence contains a topic and controlling idea. Supporting details are then used to elaborate on the topic sentence. Coherence and unity are important to ensure all sentences in a paragraph relate to the main idea. Signal devices like transitions and pronouns can also help achieve coherence by connecting ideas. Examples are provided to illustrate how to write topic sentences and supporting details.
This document provides guidance on correctly submitting assignments and common issues to avoid. It emphasizes including a completed top sheet with key information stapled to the essay, appending a word count, using correct formatting and checking for errors. Recurrent problems identified are lack of depth, overuse of description rather than analysis, and failing to apply appropriate concepts. The document then defines important semiotic terms like signifier, signified, denoted and connoted needed to analyze texts. It provides an example of deconstructing an advertisement by identifying signifiers and their signified meanings and associations. Finally, it outlines the proper structure for essays with introduction, main body, conclusion, and references.
This document provides instruction on developing ideas for writing paragraphs. It discusses brainstorming ideas, crafting an effective topic sentence to guide the paragraph, using supporting sentences to explain and expand on the topic sentence, and concluding the paragraph with a sentence that restates or predicts based on the main idea. Examples are given for different types of topic, supporting and concluding sentences. Students are assigned to brainstorm ideas on a given topic, write a paragraph using an appropriate topic sentence, and complete exercises on pronouns and punctuation from the textbook and online.
This document provides an introduction to pointers for programmers with basic experience. It covers the basic concepts of pointers, including pointers and pointees, pointer assignment, dereferencing, NULL pointers, and bad pointers. It explains pointers through examples in C syntax and memory drawings. The document is intended to give readers a complete understanding of how pointers work in memory.
Business Writing Style Guide: Your Writing Companion
englishwriting
A comprehensive, easy-to-use reference book that answers the frequently asked questions about business writing style.
Compact and user-friendly, covering all the right topics - with just the right amount of information to be helpful. This is precisely what you want, at your elbow, at the ready, when you can't for the life of you remember which bit of punctuation goes where. (Ruth Wajnryb, Author and Columnist)
Business today is more complex and changes at a faster pace than ever before. This trend will only accelerate in the future. In this environment, one of a CEO's most important tasks is to provide clarity to the company and its stakeholders. In doing so, the ability to communicate clearly is a critical skill. In the Business Writing Style Guide, you will find great insights into improving your written communication skills. I strongly recommend it to any serious business person. (Julian Segal, ex CEO of Incitec Pivot)
I have been searching A LOT of bookstores to find something that is an easy reference yet comprehensive enough to use for my writing and your book is just that - clear, simple, precise and relevant. So thank you for putting your book out there. (Anna Fowler, Lawyer)
The document discusses coding standards and best practices for C# programming. It recommends naming conventions, formatting guidelines, and code review processes to develop reliable, maintainable code. Key points include using PascalCase for classes and methods, camelCase for variables, meaningful names without abbreviations, consistent indentation, and code reviews to ensure standards compliance.
The document provides instructions for students taking an AP Literature class. It outlines a 5-day plan for students to draft essay responses for 3 different AP Literature essay prompts. Each day focuses on a different step of the essay writing process, building upon the previous day. By day 5, students are expected to write a full essay responding to one of the released poetry prompts, incorporating a thesis statement and topic sentences supported by evidence from the text.
This document introduces a Python workbook for beginners. It covers 7 chapters that teach Python fundamentals like data types, variables, operators, conditional statements, and loops. The chapters also include quizzes to test the reader's understanding. The conclusion encourages the reader to take a paid Python course for more practical learning experiences to advance their skills.
Making an outline before writing an essay helps organize one's thoughts and ideas in a logical manner. An outline presents material in a hierarchical structure and establishes relationships between ideas. It involves determining the purpose, audience, and thesis statement. The outline then lists the main topics and subtopics to support the thesis through body paragraphs. Topic sentences for each paragraph should directly relate to and support the thesis. Outlining helps ensure paragraphs stay focused and saves time when writing the rough draft.
This document provides a tutorial on pointers and arrays in C. It begins by explaining that a pointer is a variable that holds the address of another variable. This allows a pointer variable to indirectly "point to" another variable in memory. The document covers various uses of pointers, including with arrays, strings, structures, dynamic memory allocation, and functions. It provides many code examples to demonstrate how pointers work in practice.
This document provides an introduction to pointers for programmers with basic experience. It covers fundamental topics like what pointers are, how they store references to other values (pointees), dereferencing pointers to access pointees, the NULL pointer value, and pointer assignment, which makes multiple pointers refer to the same pointee. The document is intended to build a complete understanding of pointers and memory through examples and diagrams.
The document provides an introduction to pointers for programmers with basic experience. It covers topics such as pointers, dereferencing pointers, the NULL pointer, pointer assignment, shallow vs deep copying, reference parameters, and memory allocation and leaks. Sample C code and memory drawings are provided to demonstrate pointer concepts.
The document discusses processing Boolean queries in an information retrieval system using an inverted index. It describes the steps to process a simple conjunctive query by locating terms in the dictionary, retrieving their postings lists, and intersecting the lists. More complex queries involving OR and NOT operators are also processed in a similar way. The document also discusses optimizing query processing by considering the order of accessing postings lists.
The document is a walkthrough of using test-driven development (TDD) to solve an encryption problem. It begins by setting up test and source code files. Tests are written based on examples from the problem. Minimal code is written to pass each test. Refactoring improves the code while maintaining passing tests. The implementation is generalized through many small iterative changes. Edge cases are addressed, like when the encryption grid is too small. The completed solution passes all tests.
Informative Speech Outline Template
Speech Title
Name
The comments in blue are for explanation purposes only for the outline. They explain the different sections and should not be included in your own outline.
Introduction
I. Attention getter:
(Start all formal presentations with an attention getter. Avoid starting with “hi, my name is….” You can ask the audience a question, offer a quote or a statistic that is relevant to the topic that will get the audience’s attention.)
II. State the topic:
(Tell the audience your topic.)
III. Speaker credibility:
(Tell the audience why you are credible to speak on this topic. Tell them if you have experience with it or if you have conducted research on the topic.)
IV. Thesis Statement:
(The thesis statement is a one-sentence summary of what you plan to cover for the presentation.)
V. Preview:
(State the key ideas in the order you plan to cover them. “Today we will cover Saturn’s composition, the makeup of its rings, and the planet’s moons.”)
Body
I. First key idea
(The main points you want to discuss are called key ideas. Key ideas should be labeled with I, II, III. For a speech four to seven minutes long, you should have at least two key ideas and no more than five key ideas.)
   A.
   (Supporting details for each of the key ideas are called subpoints. Subpoints should be indented underneath the main key ideas. You can decide how many supporting details you provide for each key idea. You should have at least an A and B. Label them A, B, C, then 1, 2, 3 and then a, b, c. In order for the outline to be balanced, if you have an A, you also need a B. If you have an a, you need a b. If you have a 1, you need a 2.)
   B.
<Transition sentence>
(Add a transition sentence between the key ideas. The transition sentence should summarize the previous key idea and introduce the next key idea. This lets the audience know where you are within the speech. Example: Now that we’ve gathered the materials for a blood draw, let’s discuss how to prepare the patient for a blood draw.)
II. Second key idea
   A.
      1.
      2.
         a.
         (Some of your key ideas may have one or two layers of sub-ideas, particularly where you have incorporated information from your sources.)
         b.
   B.
      1.
      2.
   C.
      1.
      2.
<Transition sentence>
(Add a transition sentence between the key ideas. The transition sentence should summarize the previous key idea and introduce the next key idea. This lets the audience know where you are within the speech. Example: Now that we’ve gathered the materials for a blood draw, let’s discuss how to prepare the patient for a blood draw.)
III. Third key idea
   A.
      1.
      2.
   B.
      1.
      2.
<Transition sentence>
(Add a transition sentence between the key ideas. The transition sentence should summarize the previous key idea and introduce the next key idea. This lets the audience know ...
Similar to Text Processing for Procedural Question Answering (20)
Usage du TAL dans des applications industrielles : gestion des contenus multi...Estelle Delpech
Talk given as part of the Master Ergonomie Cognitive et Ingénierie Linguistique (ECIL 2012), course unit UE 352 - "Production, gestion et exploitation de documents textuels", Université de Toulouse Le Mirail, Toulouse, France.
Institution: Nomao
Nomao: local search and recommendation engineEstelle Delpech
Nomao is a local search engine that uses social data and personalized search results to recommend places to users. It aggregates information from multiple sources, processes the content using natural language processing and data mining, and generates summaries of places. Current features include collaborative filtering to recommend places liked by similar users, user profiling to suggest places based on interests, and place merging, term classification, and summary generation from content. The company aims to expand its user base through better integration with Facebook and early adopter targeting.
Extraction of domain-specific bilingual lexicon from comparable corpora: comp...Estelle Delpech
Material presented at the 24th International Conference on Computational Linguistics (COLING 2012), Mumbai, India.
Paper download at http://hal.archives-ouvertes.fr/hal-00743807.
Institutions: Laboratoire d'Informatique de Nantes Atlantique (LINA), Lingua et Machina, Gremuts.
Identification of Fertile Translations in Comparable Corpora: a Morpho-Compos...Estelle Delpech
Material presented at the Tenth Biennial Conference of the Association for Machine Translation in the Americas (AMTA 2012), San Diego, CA.
Download paper at http://hal.archives-ouvertes.fr/hal-00730325.
Institutions: Laboratoire d'Informatique de Nantes Atlantique (LINA), Lingua et Machina, Gremuts
Évaluation applicative des terminologies destinées à la traduction spécialiséeEstelle Delpech
Presentation given at the 7th workshop "Qualité des données et des connaissances, évaluation des méthodes d'extraction de données" (2011), Brest, France.
Associated articles:
- http://hal.archives-ouvertes.fr/hal-00912320 (workshop proceedings)
- http://hal.archives-ouvertes.fr/hal-00605304 (RNTI journal)
Institutions: Laboratoire d'Informatique de Nantes Atlantique, Lingua et Machina
Dealing with Lexicon Acquired from Comparable Corpora: post-edition and exchangeEstelle Delpech
Material presented at the TKE (Terminology and Knowledge Engineering) Conference 2010, Dublin, Ireland.
Download paper at http://hal.archives-ouvertes.fr/hal-00544403
Institutions: Laboratoire d'Informatique de Nantes Atlantique (LINA), Lingua et Machina.
Material of the 4th Intensive Summer school and collaborative workshop on Natural Language Processing (NAIST Franco-Thai Workshop 2010).
Bangkok, Thailand.
Institution: Institut de Recherche en Informatique de Toulouse (IRIT), Lingua et Machina
Material of the Natural Language Processing (NLP) Workshop with STIC-Asia representatives and the Nepal team.
August 30-31, 2007.
Patan Dhoka, Lalitpur, Nepal.
Text Processing for Procedural Question Answering
1. Text Processing for Procedural Question Answering
Ongoing work for the TextCoop project
ILPL group, presentation by Estelle Delpech
2. Text Processing for Procedural Question Answering
I. INTRODUCTION: GLOBAL ARCHITECTURE
II. CLUES TO IDENTIFY TITLES / INSTRUCTIONAL COMPOUNDS
III. THE WHOLE PROCESS
IV. MAIN ISSUES
V. DEMO
5. TEXT PROCESSING for Procedural QA: Identification of task structure
[Diagram: .html -> PRE-PROCESSING (HTML cleaning, MS tagging) -> SEGMENTER (identification of terminal symbols) -> TEXT GRAMMAR (X-bar analysis of task structure) -> TASK DATABASE]
[Diagram: X-bar tree with node labels spec, G', Pre-requisite, Goal, Title, complement, Instructional Compound]
6. II. CORPUS OBSERVATION: WHAT CLUES TO IDENTIFY
- INSTRUCTIONAL COMPOUNDS?
- TITLES?
7. 1. Clues for Instructional Compounds Identification
Definition: kernel instructions linked to various clauses by rhetorical or logical relations.
Identification in two steps:
- Detect the presence of instructions: expression of obligation
- Find instructional compound boundaries, e.g. connectors…
Example text:
Fixing the first wall plate (or shelf bracket)
We are going to mark the first wall plate (or bracket) for drilling.
First, position the face plate so one screw lines up with the mark on the wall you made in the last step and place the level on top of the face plate to ensure it is level.
Second, you should mark the wall in the next screw hole, again by turning the screw until it bites into the wall (see fig 1.3).
It is advised that you mark any remaining screw holes while keeping the wall plate firmly in position.
Now you have to choose a suitable drill bit (masonry or the right type for the surface). It should be the same width as the wall plug to be used.
Get to hand one of the wall plugs, and place it against the tip of the drill bit (see fig 1.4).
Finally, place a piece of masking tape on the drill bit to use as a guide, this will ensure you don't drill too deep.
8. 1. Clues for Instructional Compounds Identification
Presence of instructions: morpho-lexical patterns
- shall Adv* base form verb, e.g. "You should pre-heat the oven"
- Have to Adv* base form verb, e.g. "You have to pre-heat the oven"
- ## Op? adv* base form verb, e.g. "Do not pre-heat the oven"
- it be adv* (necessary|compulsory) that, e.g. "It is better that you pre-heat the oven"
Compound boundaries: morpho-lexical patterns
- ## to Adv* base form verb .* ,
- (##|Conj) (if|then|after)
Examples:
[To cook the cake, pre-heat the oven] [and then start peeling …
[If you want to cook the cake, pre-heat the oven.] [If you don't want to cook …
Compound boundaries: HTML tags (typo-disposition)
- <p> </p> <li> </li>
Example:
<li> [ Pre-heat the oven … ]</li>
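The slide's patterns are written over morphosyntactically tagged text. As a rough illustration only, the "presence of instructions" patterns could be approximated on raw text with plain regular expressions. Everything in this sketch is an assumption rather than the TextCoop implementation: the function name `instruction_clues`, the pattern names, the reading of `##` as "start of sentence", and the addition of "better" to the impersonal pattern (taken from the slide's own example). Without a tagger, the "Adv* base form verb" part of each pattern is not checked.

```python
import re

# Hypothetical surface approximations of the slide's morpho-lexical patterns.
# Assumption: "##" on the slide means "start of sentence".
INSTRUCTION_PATTERNS = [
    ("shall", re.compile(r"\b(?:shall|should)\b", re.IGNORECASE)),
    ("have to", re.compile(r"\b(?:have|has)\s+to\b", re.IGNORECASE)),
    ("negative imperative", re.compile(r"^\s*do\s+not\b", re.IGNORECASE)),
    (
        "impersonal",
        re.compile(
            r"\bit\s+is\s+(?:\w+\s+)?(?:necessary|compulsory|better)\s+that\b",
            re.IGNORECASE,
        ),
    ),
]


def instruction_clues(sentence):
    """Return the names of the patterns that match the sentence."""
    return [name for name, pattern in INSTRUCTION_PATTERNS if pattern.search(sentence)]


for example in (
    "You should pre-heat the oven",
    "You have to pre-heat the oven",
    "Do not pre-heat the oven",
    "It is better that you pre-heat the oven",
):
    print(example, "->", instruction_clues(example))
```

A real system would run these patterns over tagger output, so that "base form verb" and "Adv*" are actually verified instead of being ignored.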
9. 2. Titles identification: about the HTML encoding of titles
The <hn> tag cannot be used as a single clue for title identification:
- HTML encoding is free, and the code can be underspecified (css)
Corpus observation:
- 80 % of titles are encoded with <b>
- 57 % of <b> tags encode titles
- 64 % of <h> tags encode titles
- the coding varies from one web site to another
We had to find some other clues…
10. 2. Clues for Title Identification
Some helpful visual clues:
- Short sequence of words
- Emphasized
- Spaced from the rest of the text
[Figure: annotated web page examples, with labels such as "emphasized", "not short" and "not a title"]
11. 2. Clues for Title Identification
Linguistic clues:
- Rarely contains a tensed verb
- Can be a single question
Textual environment clues:
- Occurs between two paragraphs of text
- Occurs between a title and a paragraph of text
No single clue, but a bundle of clues
12. III. THE WHOLE PROCESS
[Diagram: PRE-PROCESSING (HTML cleaning, MS tagging) -> SEGMENTER (identification of terminal symbols: Title, Instructional Compound)]
13. 1. HTML Cleaning module
Raw HTML code -> HTML Cleaning -> main typo-dispositional information
The output of the HTML Cleaning module is:
- a list of text chunks, corresponding more or less to paragraph breaks
- their corresponding typo-dispositional structure
Text chunk tags: <p> <div> <ol> <ul>
Subdivision tags: <br> <li>
Emphasis tags: <h> <b> <u> <i>
[Figure: example output, a sequence of chunks annotated with tags such as <p>, <b>, <li> and <br>]
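To make the HTML cleaning step concrete, here is a minimal sketch using Python's standard `html.parser`. It is not the module presented in the talk: the class name `ChunkCollector`, the function `clean_html`, the output format (a list of dicts) and the reading of `<h>` as `<h1>` to `<h6>` are all assumptions made for illustration. It opens a new chunk at each text chunk tag and records the subdivision and emphasis tags seen inside it.

```python
from html.parser import HTMLParser

# Tag classes taken from the slide; "<h>" is assumed to mean <h1>..<h6>.
CHUNK_TAGS = {"p", "div", "ol", "ul"}
SUBDIVISION_TAGS = {"br", "li"}
EMPHASIS_TAGS = {"b", "u", "i", "h1", "h2", "h3", "h4", "h5", "h6"}


class ChunkCollector(HTMLParser):
    """Collect text chunks and the layout tags seen inside each one."""

    def __init__(self):
        super().__init__()
        self.chunks = []   # finished chunks: {"text": str, "tags": [str]}
        self._text = []    # text pieces of the chunk being built
        self._tags = []    # layout tags of the chunk being built

    def flush(self):
        """Close the current chunk, keeping it only if it contains text."""
        text = " ".join("".join(self._text).split())
        if text:
            self.chunks.append({"text": text, "tags": self._tags})
        self._text, self._tags = [], []

    def handle_starttag(self, tag, attrs):
        if tag in CHUNK_TAGS:
            self.flush()                 # a chunk tag starts a new chunk
            self._tags.append(tag)
        elif tag in SUBDIVISION_TAGS:
            self._tags.append(tag)
            self._text.append(" ")       # keep words on both sides apart
        elif tag in EMPHASIS_TAGS:
            self._tags.append(tag)

    def handle_endtag(self, tag):
        if tag in CHUNK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._text.append(data)


def clean_html(html):
    """Return the list of text chunks with their typo-dispositional tags."""
    collector = ChunkCollector()
    collector.feed(html)
    collector.close()
    collector.flush()
    return collector.chunks


print(clean_html("<div><p><b>Title</b></p><p>Text<br>more</p></div>"))
```

Real web pages are far messier than this example (unclosed tags, layout tables, CSS-only emphasis), which is consistent with the slide's remark that chunks correspond only "more or less" to paragraph breaks.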
14. 2. Clues Collection module
[Diagram: STRUCTURE (tags) and TEXT -> MS Tagging (TreeTagger) -> TAGS -> Clues collection -> CLUES]
The output of the Clues Collection module is the list of text chunks with:
- Their corresponding typo-dispositional structure
- Text with tagged instructions, goals, connectors
- Linguistic information:
  - Nb of instructions
  - Instruction types
  - Nb of goals
  - Nb of words
  - Nb of sentences
  - Nb of questions
  - Nb of tensed verbs
This information is used for:
- Titles identification
- Instructional compounds identification
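One possible way to hold the clues listed on this slide is a small record per chunk. The sketch below is an assumption for illustration: the names `ChunkClues` and `collect_surface_clues` are hypothetical, and only the counts that can be computed from raw text (words, sentences, questions) are filled in. The counts that depend on tagging (instructions, goals, tensed verbs) are left at zero, since they would come from the MS tagger (TreeTagger on the slide) and the instruction patterns.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ChunkClues:
    """Clues attached to one text chunk (field names are hypothetical)."""
    text: str
    tags: list = field(default_factory=list)  # typo-dispositional structure
    nb_words: int = 0
    nb_sentences: int = 0
    nb_questions: int = 0
    nb_instructions: int = 0   # would come from the instruction patterns
    nb_goals: int = 0          # would come from tagged goals
    nb_tensed_verbs: int = 0   # would come from the morphosyntactic tagger


def collect_surface_clues(text, tags):
    """Fill in the clues that need no tagger: words, sentences, questions."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return ChunkClues(
        text=text,
        tags=list(tags),
        nb_words=len(re.findall(r"\w+(?:[-']\w+)*", text)),
        nb_sentences=len(sentences),
        nb_questions=sum(1 for s in sentences if s.endswith("?")),
    )


clues = collect_surface_clues("How do I fix it? Pre-heat the oven.", ["p", "b"])
print(clues.nb_words, clues.nb_sentences, clues.nb_questions)
```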
15. 3. Processing each chunk: text or title?
Identification of unambiguous titles:
- Short chunk
- Spaced from the rest of the text
- With emphasis
- A single question
Identification of unambiguous paragraphs of text:
- Long chunk
- No emphasis
- Subdivided
- More than 1 instruction
- Presence of tensed verbs
[Figure: list of text chunks whose type starts as "unknown"; after the two identification passes each chunk is typed "title", "text" or "ambiguous"]
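The two identification passes can be read as a three-way rule-based classifier. The slide lists the clues but does not say how they are combined, so the logic below is an assumption, as are the names `Chunk` and `classify` and the `SHORT_MAX_WORDS` threshold. It is a sketch of the idea, not the presented system.

```python
from dataclasses import dataclass

SHORT_MAX_WORDS = 10  # hypothetical threshold; the slide gives no figure


@dataclass
class Chunk:
    nb_words: int
    emphasized: bool
    spaced: bool                 # spaced from the rest of the text
    subdivided: bool = False     # contains <br> or <li>
    nb_instructions: int = 0
    nb_tensed_verbs: int = 0
    is_single_question: bool = False


def classify(chunk):
    """Return "title", "text" or "ambiguous" (the combination is assumed)."""
    short = chunk.nb_words <= SHORT_MAX_WORDS
    # Assumed rule for unambiguous titles: a single question, or a short,
    # emphasized chunk that is spaced from the rest of the text.
    if chunk.is_single_question or (short and chunk.emphasized and chunk.spaced):
        return "title"
    # Assumed rule for unambiguous text: long and unemphasized, with at
    # least one further "paragraph-like" clue.
    paragraph_like = (
        chunk.subdivided or chunk.nb_instructions > 1 or chunk.nb_tensed_verbs > 0
    )
    if not short and not chunk.emphasized and paragraph_like:
        return "text"
    return "ambiguous"


print(classify(Chunk(nb_words=4, emphasized=True, spaced=True)))
print(classify(Chunk(nb_words=40, emphasized=False, spaced=False, nb_tensed_verbs=2)))
print(classify(Chunk(nb_words=4, emphasized=False, spaced=True)))
```

Under these assumed rules a short chunk without emphasis stays ambiguous and is left for a later pass that looks at the neighbouring chunks.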
16. 3. Ambiguous chunks: text or title?
Ambiguous chunks are:
- Short chunks with no emphasis
- Instruction-like short chunks
Use of textual environment clues:
1. Identify unambiguous titles / paragraphs of text
2. Disambiguate the remaining chunks
17. 3. Ambiguous chunks: text or title?
Disambiguation using textual environment clues:
- a series of ambiguous paragraphs become text
- an ambiguous paragraph between two paragraphs of text becomes a title
[Figure: list of text chunks before and after disambiguation; the chunks typed "ambiguous" end up typed "title" or "text"]
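The two disambiguation rules on this slide can be sketched as a single pass over the sequence of chunk labels. The function name `disambiguate` is hypothetical, and one design choice is an assumption: cases the slide does not cover (a lone ambiguous chunk that is not between two paragraphs of text) are left as "ambiguous" instead of being guessed.

```python
def disambiguate(labels):
    """Apply the slide's two rules to a list of chunk labels.

    - a run of two or more "ambiguous" chunks becomes "text"
    - a single "ambiguous" chunk between two "text" chunks becomes "title"
    Anything else is left untouched (assumption: the slide does not say).
    """
    result = list(labels)
    i = 0
    while i < len(result):
        if result[i] != "ambiguous":
            i += 1
            continue
        # Find the end of the maximal run of ambiguous chunks starting at i.
        j = i
        while j < len(result) and result[j] == "ambiguous":
            j += 1
        before = result[i - 1] if i > 0 else None
        after = result[j] if j < len(result) else None
        if j - i >= 2:
            result[i:j] = ["text"] * (j - i)
        elif before == "text" and after == "text":
            result[i] = "title"
        i = j
    return result


print(disambiguate(
    ["title", "text", "ambiguous", "text", "ambiguous", "ambiguous", "text"]
))
```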
19. IV. Main issues: noise in web pages
The "noise" of web pages (advertisements, lists of links, navigation help...) interferes with compound / title identification.
Example:
- short sequence
- emphasis
- linguistic form: base form verb at the beginning of a sentence
This is typical of a title or an instruction, but it is a list of links!
[Figure: web page in which lists of links are wrongly identified as titles and instructions]
20. IV. Main issues: refining goal/titles identification
- only sub-goal / sub-task relations are identified
- what about the hierarchy task / sub-task(s)?
- what about the head title / main goal?
  - the head title is not always the 1st identified title (noise)
  - sometimes there is no head title
- what if the action is implicit?
  - e.g. "the room and the bed"
  - implicit: how to clean the room and the bed
- some ideas:
  - choose a title that has vocabulary in common with instructions
  - identify action verbs in relation with the nouns of the title
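The first idea, choosing a title that shares vocabulary with the instructions, could be prototyped as a simple word-overlap score. This is only a sketch of that idea: the function names `content_words` and `pick_head_title`, the tiny stopword list and the use of raw word forms (no lemmatization) are all assumptions, and the slides do not say how overlap would actually be measured.

```python
import re

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {
    "the", "a", "an", "and", "or", "to", "of", "in", "on", "for",
    "how", "you", "your", "it", "is", "so", "with", "up", "any", "while",
}


def content_words(text):
    """Lowercased alphabetic words of the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def pick_head_title(titles, instructions):
    """Return the title sharing the most content words with the instructions.

    Ties go to the earliest title; returns None if there are no titles.
    """
    instruction_vocab = set()
    for instruction in instructions:
        instruction_vocab |= content_words(instruction)
    return max(
        titles,
        key=lambda title: len(content_words(title) & instruction_vocab),
        default=None,
    )


titles = ["Related links", "Fixing the first wall plate"]
instructions = [
    "First, position the face plate so one screw lines up with the mark on the wall.",
    "Mark any remaining screw holes while keeping the wall plate firmly in position.",
]
print(pick_head_title(titles, instructions))
```

A score like this would also give a way to discard noisy titles such as lists of links, since they tend to share little vocabulary with the instructions that follow.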