The document discusses how to evaluate NLP tools for entity extraction. It recommends defining the problem domain, assembling a test set within that domain, creating annotation guidelines, reviewing evaluation metrics like precision and recall, providing examples of evaluations, and considering inter-annotator agreement when creating a gold standard. The goal is to perform a rigorous evaluation that can inform decisions about which NLP tools are suitable for a given use case.
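Inter-annotator agreement is commonly summarized with a chance-corrected statistic such as Cohen's kappa. The summary above does not say which measure the presentation uses, so the following is only a minimal sketch under that assumption; the function name `cohen_kappa` and the per-token labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement, from each annotator's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used one identical label everywhere; kappa is
        # undefined here, and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical per-token labels from two annotators ("O" means no entity).
annotator_1 = ["ORG", "O", "O", "FRUIT", "O", "ORG"]
annotator_2 = ["ORG", "O", "O", "O", "O", "FRUIT"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

A low kappa usually suggests the annotation guidelines need tightening before the labels are treated as a gold standard.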
[Apple|organization] and [oranges|fruit]: How to evaluate NLP tools for entit... - Gil Irizarry
This document outlines steps for evaluating NLP tools for entity extraction:
1. Define the problem domain and assemble a test set specific to that domain
2. Create annotation guidelines to define what entities should be extracted and how annotators should label them
3. Review evaluation metrics like precision, recall, F1 score, and have annotators label a test set to create a gold standard for comparison
4. Compare tool output to the gold standard using the metrics to evaluate performance.
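Steps 3 and 4 above can be sketched as a small scoring function. This is an illustration, not the presenter's method: it assumes entities are represented as (start, end, type) character spans and that only exact matches on all three fields count as correct. The name `score_entities` and the sample data are hypothetical.

```python
def score_entities(gold, predicted):
    """Return (precision, recall, f1) using exact span-and-type matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    # Precision: share of the tool's predictions that appear in the gold standard.
    precision = true_positives / len(pred_set) if pred_set else 0.0
    # Recall: share of gold-standard entities that the tool found.
    recall = true_positives / len(gold_set) if gold_set else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


text = "Apple and oranges"
# Gold standard produced by annotators (hypothetical).
gold = [(0, 5, "organization"), (10, 17, "fruit")]
# Tool output: right span for "oranges" but the wrong type (hypothetical).
predicted = [(0, 5, "organization"), (10, 17, "organization")]

precision, recall, f1 = score_entities(gold, predicted)
print(precision, recall, f1)
```

Exact matching is the strictest choice; an evaluation could also give partial credit for overlapping spans, which would need to be stated in the annotation guidelines.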
Prioritisation & Discovery vs Delivery with Welcome & Lenses PMs - Product School
The document discusses product management frameworks for prioritizing discovery vs delivery. It presents frameworks for exploring problems, defining problem statements, prototyping solutions, and delivering solutions. It also discusses frameworks like the Double Diamond for innovation and compares waterfall vs agile approaches like Scrum. The website, productschool.com, offers part-time product management training courses and corporate training.
What is Means to be Strategic and Create Value (UX Strat Summit, SF 2014) - Nathan Shedroff
Designers are already inherently connected to strategy. They just need to know how to get into the room. Note: the talking points in the notes field aren't a full transcript. They're mostly just notes for myself while presenting.
This document provides information to help claimants make educated career and job target decisions. It discusses the challenges injured workers face and the importance of career assessment. Various assessment tools and methods are described, including verbal assessment, written assessments like interest inventories and skills scales, and career exploration resources. The goals of assessment are to determine a claimant's starting point, set achievable objectives, and develop a written ideal job description. Conducting information interviews with employers in potential fields is emphasized as a way to make informed training and career choices.
The document summarizes a career path presentation on Python given by Jo Gascoigne. The presentation covered Jo's background and company, an introduction to Python, curriculum vitae structure focusing on achievement statements, and tips for interviews. Key points included using PAR statements (Problem, Action, Result) in CVs and interviews to showcase achievements and results. Employers primarily look for skills, motivation, team fit, manageability, and affordability when hiring.
While working in Kabul, Afghanistan in 2007, I had problems with staff seeing the point of reporting. They thought it was something I needed; they didn't see it as something that was important to them. A colleague and I put together this presentation to help staff see how reporting benefits them -- their reputation at the organization, their career goals, even their likelihood of being considered for promotions. We made the presentation as interactive as possible. This presentation works best in small groups -- 15 people or fewer. Although focused on Afghanistan and UNDP-supported projects, this could easily be adapted to other situations. This presentation was part of an overall internal campaign we conducted to raise the quality of reporting from staff.
This document discusses principles of effective presentation design based on cognitive science research. It covers several cognitive load theories and effects, including split-attention effect, modality effect, redundancy effect, and imagination effect. It also discusses design principles for visual hierarchy, use of white space, and use of grids. The overall aim is to design presentations that minimize extraneous cognitive load and maximize learning for audiences.
The document provides guidelines for proper task estimation in 7 steps: 1) Review requirements and assets, 2) Define major tasks, 3) Break down work needed to complete each task, 4) Define sub-tasks step-by-step, 5) Estimate each sub-task, 6) Improve estimates by tracking time spent on similar tasks, and 7) Compare estimates to actual time and identify reasons for deviations. It emphasizes providing personal estimates based on individual work velocity and collecting data to continually improve estimates.
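Steps 6 and 7 amount to simple bookkeeping, which can be sketched as follows. This is a minimal illustration, assuming each sub-task is recorded as a (name, estimated hours, actual hours) tuple; the function name `estimation_report` and the sample figures are hypothetical and do not come from the summarized document.

```python
def estimation_report(tasks):
    """Compare estimated and actual hours for a list of sub-tasks.

    tasks: list of (name, estimated_hours, actual_hours) tuples.
    Returns (per_task, correction_factor), where per_task holds
    (name, deviation_hours, actual_to_estimate_ratio) for each sub-task.
    """
    per_task = []
    for name, estimated, actual in tasks:
        if estimated <= 0:
            raise ValueError(f"estimate for {name!r} must be positive")
        per_task.append((name, actual - estimated, actual / estimated))
    total_estimated = sum(estimated for _, estimated, _ in tasks)
    total_actual = sum(actual for _, _, actual in tasks)
    # A factor above 1 means the work took longer than estimated overall.
    return per_task, total_actual / total_estimated


# Hypothetical tracked sub-tasks: (name, estimated hours, actual hours).
history = [
    ("Review requirements", 2.0, 3.0),
    ("Build login form", 8.0, 6.0),
    ("Write tests", 4.0, 6.0),
]
per_task, factor = estimation_report(history)
for name, deviation, ratio in per_task:
    print(f"{name}: {deviation:+.1f} h, ratio {ratio:.2f}")
print(f"overall correction factor: {factor:.2f}")
```

The per-task deviations point to where the reasons for a miss should be investigated, and the overall factor is one possible way to scale future personal estimates.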
EA Benefits Realization in a Digital World - Kaine Ugwu
Enterprise Architecture is a promising approach to supporting digital transformation and providing the necessary agility to respond to changes. It has received attention from academic and industry professionals.
In addition, the practical aspects of creating and implementing EAs have been addressed in many case studies and research. Some of the benefits of EA include improved decision making, better alignment of business and IT, and reduced costs. However, the question of how EA creates these benefits has received little attention. This presentation discusses a way to measure the effectiveness of an EA function in realizing its goals.
How to Meet Goals and Inspire Your Team Using OKRs (Includes OKR Examples) - QuekelsBaro
This document provides information on OKRs (Objectives and Key Results), including how to write effective OKRs to meet goals and inspire teams. It begins with background on OKRs, explaining their origins at Intel in the 1970s and adoption by Google. The benefits of OKRs are outlined as Focus, Alignment, Commitment, Tracking, and Stretching. Guidelines for writing OKRs emphasize using a clear objective/key result formula, avoiding vague language, setting ambitious but achievable goals, and making objectives motivating. Examples of real OKRs from Google, Piktochart, and Upraise are also presented.
The document summarizes the key points from a business English class, including a review of vocabulary, homework, and exercises. It also includes details about a case study on Auric Bank UK, which is considering outsourcing some of its call center operations to reduce costs while maintaining good customer service. Options discussed include keeping operations in-house, outsourcing within the UK, or outsourcing to low-cost countries such as India. A loss of £1.5 billion from unprofitable investments is also noted.
The document provides an analysis of Steve Jobs' 2005 commencement speech at Stanford University. It discusses how Jobs drew on his own experiences being fired from Apple and undergoing cancer treatment to convey the lessons he learned. His credibility and status in the tech industry gave weight to his message of pursuing one's passions. The speech emphasized chasing dreams and not settling for work that isn't meaningful. It highlighted Jobs' skill in using personal stories and emotions to persuade and inspire the graduates.
This document provides an overview of non-technical roles in the tech industry. It begins by noting that tech companies employ 3 times as many non-technical workers as technical workers. It then lists and describes common non-technical roles including business operations, human resources, corporate development, research, product management, project management, operations, marketing, sales, business development, customer service, and finance/accounting. For each role, it provides an example project, typical daily responsibilities, common roles within that function, and the types of skills and experience sought for those roles. It concludes by noting that many people transition between different non-technical roles over their careers in tech.
How to Pass an Interview for Software Engineersuttoantruot
The document provides tips on how to prepare for and pass a technical interview for a software engineering position. It discusses researching the company and job position in advance, preparing answers to common technical and personality questions, and avoiding typical mistakes. The interview process typically involves the candidate presenting their background and skills, learning about the company and role, and demonstrating their technical abilities through questions and tasks. Being well prepared is key to overcoming nerves and showcasing qualifications for the position.
This document provides guidance on how to write a CV for a French applicant seeking an internship. It advises starting with a summary highlighting one's objectives and qualifications. It then recommends including sections on experience, skills, interests, community service, and references. For each section, it suggests including specific, quantifiable achievements and accomplishments that demonstrate one's value and fit for the role. The overall message is to directly sell oneself and one's strengths from the beginning to increase chances of securing the internship.
1 Undergraduate Program Rubric—BACHELOR OF SCIENCE IN CRI.docxjoyjonna282
Undergraduate Program Rubric—BACHELOR OF SCIENCE IN CRIMINAL JUSTICE
Expectations: Student work at the undergraduate level is expected to focus on a broad overview of the academic discipline, along with—where appropriate—basic theoretical
frameworks of professional practices and familiarity with discipline-specific tools and their application.
Criteria, scored at five levels: Exemplary (A), Accomplished (B), Proficient (C), Partially Proficient (D), Unacceptable (F)

Working knowledge of criminal justice system
Exemplary (A): Demonstrates thorough insight and application of key criminal justice practices.
Accomplished (B): Shows above average insight and application of key criminal justice practices.
Proficient (C): Demonstrates average insight and application of key criminal justice practices.
Partially Proficient (D): Demonstrates below average insight and application of key criminal justice practices.
Unacceptable (F): Shows poor insight and application of key criminal justice practices.

Theory analysis and application
Exemplary (A): Demonstrates thorough and effective analysis and application of crime causation theories.
Accomplished (B): Demonstrates above average ability to analyze and apply crime causation theories.
Proficient (C): Shows average ability to analyze and apply crime causation theories.
Partially Proficient (D): Shows below average ability to analyze and apply crime causation theories.
Unacceptable (F): Shows poor ability to analyze and apply crime causation theories.

EFFECTIVE COMMUNICATION: Approach and Purpose, Organization, Style, Grammar, Mechanics, Format, Presentation and Delivery (where applicable)
Exemplary (A): Demonstrates outstanding or exemplary application of written, visual, or oral skills. Demonstrates outstanding expression of topic, main idea, and purpose. Audience is addressed appropriately. Language clearly and effectively communicates ideas and content relevant to the assignment. Errors in grammar, spelling, and sentence structure are minimal. Organization is clear. Format is consistently appropriate to assignment. Presentation and delivery are confident and
Accomplished (B): Demonstrates sound or accomplished application of written, visual, or oral skills. Demonstrates sound or accomplished expression of topic, main idea, and purpose. Audience is usually addressed appropriately. Language does not interfere with the communication of ideas and content relevant to the assignment. Errors in grammar, spelling, and sentence structure are present, but do not distract. Organization is apparent and mostly clear. Format is appropriate to assignment, but not entirely consistent.
Proficient (C): Demonstrates adequate or proficient application of written, visual, or oral skills. Demonstrates adequate expression of topic, main idea, and purpose. Audience is generally addressed appropriately. Language is adequate, generally communicating ideas and content relevant to the assignment. Errors in grammar, spelling, and sentence structure are present and sometimes distract from meaning or presentation. Organization is ...
This document outlines the plan for an introductory class on professional procedures and portfolio development. The class will focus on identifying career goals, the importance of teamwork, and completing self-assessments. Students will take tests to learn about their strengths when working with others and in organizations, and write an assignment introducing themselves and their skills.
Make User Experience Part of The KPI Conversation With Universal Measures - UserZoom
Join Dr. Andrea Peer and learn:
- How Universal Measures makes tangible the abstract concept of experience for your organization
- How practitioners can make experience a critical KPI for their organization
- Ways to establish experience score goals for all lines of business
- The benefits Universal Measures brings to executives and stakeholders
This document provides an outline for a workshop on how to present creative work. It discusses preparing for a presentation by understanding the audience, environment and goals. The days before a presentation, creators should ensure their work is on strategy, on brand and compelling. Minutes before, they should rehearse while considering the audience, environment and team. The workshop covers setting up the presentation story, practicing delivery from a position of authority and anticipating objections.
This document provides guidance on monitoring and evaluation (M&E) for organizations. It discusses the importance of M&E and key concepts like indicators, results chains, and identifying evidence of change. The document emphasizes that M&E requires organizational and technical readiness, including clear frameworks, evidence-based planning, relevant skills, and experience. It also provides examples of performance measures and developing them for different sectors. Worksheets are included to help participants apply these M&E concepts.
Writing an appealing resume is the key to landing the job you have always dreamed about. For more information, see http://bit.ly/resumetips101
Job analysis is the systematic process of collecting information about the duties, responsibilities, skills, and working conditions of a job. It involves determining the tasks and behaviors required to perform the job successfully. The key outcomes of job analysis include job descriptions that summarize the principal duties and specifications of the job, as well as information used for recruitment, selection, training, performance evaluation, and compensation. Job analysis data is typically collected through methods such as observation, interviews, questionnaires, and logs completed by job incumbents. The results provide an objective basis for designing and classifying jobs within an organization.
Sydney Larson proposes using a structured format for answering interview questions that includes situation, task, action, and result. She gives an example where she had six weeks to get her company compliant with regulations to meet an investor requirement and tens of millions of dollars were at stake. She created a project plan, got buy-in from employees, followed up to ensure they were on track, and met the deadline, securing the needed funding.
This document provides information about career assessments and making educated career decisions. It discusses the importance of career assessments for improvement, accountability, and goal setting. It then describes different types of assessments, including verbal, written, interest inventories, and skills assessments. The document emphasizes using assessments to develop realistic and feasible career goals and job targets. It also discusses resources for career exploration like books, the internet, and information interviews. The overall message is that career assessments are an important part of making educated career decisions by understanding one's interests and skills.
Qualtrics experts will share new advanced methods to measure leadership traits and highlight individual strengths and weaknesses. Multi-rater assessments, such as 360-degree employee or student feedback, provide a holistic view of an individual by gathering feedback from peers and direct reports and comparing the results with the individual's own self-evaluation.
Building a Peer Evaluation Program: Best practices for beginners
What is peer evaluation
Why run peer evaluation
Peer evaluation workflow / process
Competencies & items
Reports
What to do with results
Rosette Name Indexer applies machine learning and artificial intelligence to the problem of matching names, including specific techniques for Hebrew names. Rosette is a leader in applying NLP and computational linguistics to text analytics.
Ai for Good: Bad Guys, Messy Data, & NLP - Gil Irizarry
All the information that is needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle in a stack of needles problem is huge and growing: Unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing technology (NLP), the field of AI that focuses on human language, have, for the first time, made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured textual data.
Similar to [Apple-organization] and [oranges-fruit] - How to evaluate NLP tools - Basis Webinar
Ai for Good: Bad Guys, Messy Data, & NLPGil Irizarry
All the information that is needed to find and stop bad actors from entering our financial system already exists and is available to you today; it’s just buried in terabits of messy, unstructured data all over the internet. For those performing investigations and evaluating risk, this needle in a stack of needles problem is huge and growing: Unstructured data already dominates the web (growing exponentially year over year), and the traditional technology these departments use cannot keep up. Recent developments in natural language processing technology (NLP), the field of AI that focuses on human language, have, for the first time, made it possible for automated systems to find and deliver identity-relevant intelligence hidden in unstructured textual data.
DevSecOps Orchestration of Text Analytics with ContainersGil Irizarry
The document discusses container security best practices. It begins by introducing containers and their advantages over VMs, but notes containers also introduce security risks if not managed properly. It then outlines several mitigation strategies like patching operating systems, scanning container images for vulnerabilities, not running as root, using namespaces to isolate containers, and implementing resource limits. The document concludes by offering a demonstration and inviting questions.
This document provides an overview of beginning native Android app development. It discusses Android app structure including the manifest, activities, intents and lifecycles. It also covers common Android views and layouts, accessing device capabilities like the camera and location, working with data via content providers, and rendering with OpenGL. Example code is provided for various app features like input handling, scrollable lists, and camera access. The document concludes with the process for submitting an app to the Google Play Store.
Make Cross-platform Mobile Apps Quickly - SIGGRAPH 2014Gil Irizarry
This document provides a summary of a presentation about making cross-platform mobile apps quickly using open source tools. It discusses using PhoneGap to create apps using HTML, CSS, and JavaScript that are cross-platform. It provides examples of building simple apps demonstrating concepts like accessing device data, using maps, touch events, and animation. The examples are meant to illustrate how to create mobile apps that work across Android and iOS without using their native languages.
This document provides an overview and examples of using HTML5 canvas to create graphics and mobile apps. It discusses using canvas to draw basic shapes, images, and textures. It also covers touch events, animation, and creating menus. Later examples demonstrate loading images, simple games with touch input, and playing sound. The document emphasizes best practices like only drawing after resources load and using requestAnimationFrame for smooth animation. Overall, it serves as a tutorial for beginners on building graphics and interactive content using the HTML5 canvas element.
Gil Irizarry presents techniques for building lightweight mobile apps quickly using open source tools like PhoneGap, jQuery Mobile, and Android SDK. The presentation includes 5 code examples that demonstrate getting data from online RSS feeds and the device, building interactive UIs, and using local storage. PhoneGap allows developing cross-platform mobile apps using HTML, CSS, and JavaScript that can access device capabilities like contacts.
Building The Agile Enterprise - LSSC '12Gil Irizarry
The document discusses how to scale Agile practices beyond individual teams to the enterprise level using Kanban and release management. It recommends prioritizing work, planning dependencies between teams, continuously integrating and testing code, and deploying features using a "release train" model even if not all work is complete by a deadline. Automating builds, tests and deployments is key to enabling small, frequent releases across multiple interdependent teams.
Agile The Kanban Way - Central MA PMI 2011Gil Irizarry
This document provides an overview of Kanban and how it was presented to the PMI Central MA Chapter. The key points covered include:
1) Kanban is a scheduling system that manages workflow using a visual board to limit work-in-progress and ensure continuous flow.
2) Core Kanban principles include visualizing workflow, limiting work-in-progress, managing flow, making policies explicit, and improving collaboratively.
3) Kanban was presented including demonstrating how to set up a Kanban board, establish policies and limits, and use cumulative flow diagrams to identify bottlenecks.
Transitioning to Kanban: Theory and Practice - Project Summit Boston 2011Gil Irizarry
This document provides an overview of transitioning a team to Kanban. It discusses the motivations for adopting Kanban, including reacting quicker to changes and improving quality. It covers Kanban concepts like value stream mapping, establishing work-in-progress limits, and using metrics like cumulative flow diagrams to identify bottlenecks and continuously improve. An example is provided of one company's experience transitioning its website team to Kanban and how it improved workflow, reduced bottlenecks, and enabled more consistent delivery.
The document discusses Constant Contact's transition from Scrum to Kanban. It provides background on Constant Contact and the motivations for adopting Kanban. It then discusses the theory behind Kanban and how the Website team at Constant Contact implemented Kanban in practice over several weeks and months. Key aspects included mapping their value stream, establishing work in progress limits, defining policies, tracking metrics like cycle time, and improving their process continuously based on those metrics.
This document summarizes a presentation on transitioning a team to using Kanban. It discusses the motivations for adopting Kanban, provides an overview of Kanban principles and practices, and shares the experience of one team at Constant Contact that transitioned to Kanban. This included mapping their workflow, establishing work in progress limits, defining policies, tracking metrics like cycle time, and evolving their process over time to achieve continuous flow and delivery.
May Marketo Masterclass, London MUG May 22 2024.pdfAdele Miller
Can't make Adobe Summit in Vegas? No sweat because the EMEA Marketo Engage Champions are coming to London to share their Summit sessions, insights and more!
This is a MUG with a twist you don't want to miss.
Graspan: A Big Data System for Big Code AnalysisAftab Hussain
We built a disk-based parallel graph system, Graspan, that uses a novel edge-pair centric computation model to compute dynamic transitive closures on very large program graphs.
We implement context-sensitive pointer/alias and dataflow analyses on Graspan. An evaluation of these analyses on large codebases such as Linux shows that their Graspan implementations scale to millions of lines of code and are much simpler than their original implementations.
These analyses were used to augment the existing checkers; these augmented checkers found 132 new NULL pointer bugs and 1308 unnecessary NULL tests in Linux 4.4.0-rc5, PostgreSQL 8.3.9, and Apache httpd 2.2.18.
- Accepted in ASPLOS ‘17, Xi’an, China.
- Featured in the tutorial, Systemized Program Analyses: A Big Data Perspective on Static Analysis Scalability, ASPLOS ‘17.
- Invited for presentation at SoCal PLS ‘16.
- Invited for poster presentation at PLDI SRC ‘16.
Microservice Teams - How the cloud changes the way we workSven Peters
A lot of technical challenges and complexity come with building a cloud-native and distributed architecture. The way we develop backend software has fundamentally changed in the last ten years. Managing a microservices architecture demands a lot of us to ensure observability and operational resiliency. But did you also change the way you run your development teams?
Sven will talk about Atlassian’s journey from a monolith to a multi-tenanted architecture and how it affected the way the engineering teams work. You will learn how we shifted to service ownership, moved to more autonomous teams (and its challenges), and established platform and enablement teams.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Découvrez les dernières innovations de Neo4j, et notamment les dernières intégrations cloud et les améliorations produits qui font de Neo4j un choix essentiel pour les développeurs qui créent des applications avec des données interconnectées et de l’IA générative.
Enterprise Resource Planning System includes various modules that reduce any business's workload. Additionally, it organizes the workflows, which drives towards enhancing productivity. Here are a detailed explanation of the ERP modules. Going through the points will help you understand how the software is changing the work dynamics.
To know more details here: https://blogs.nyggs.com/nyggs/enterprise-resource-planning-erp-system-modules/
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows seven steps methods for delivering their services to their customers. They called it the Software development life cycle process (SDLC).
Requirement — Collecting the Requirements is the first Phase in the SSLC process.
Feasibility Study — after completing the requirement process they move to the design phase.
Design — in this phase, they start designing the software.
Coding — when designing is completed, the developers start coding for the software.
Testing — in this phase when the coding of the software is done the testing team will start testing.
Installation — after completion of testing, the application opens to the live server and launches!
Maintenance — after completing the software development, customers start using the software.
GraphSummit Paris - The art of the possible with Graph TechnologyNeo4j
Sudhir Hasbe, Chief Product Officer, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Flutter is a popular open source, cross-platform framework developed by Google. In this webinar we'll explore Flutter and its architecture, delve into the Flutter Embedder and Flutter’s Dart language, discover how to leverage Flutter for embedded device development, learn about Automotive Grade Linux (AGL) and its consortium and understand the rationale behind AGL's choice of Flutter for next-gen IVI systems. Don’t miss this opportunity to discover whether Flutter is right for your project.
Why Mobile App Regression Testing is Critical for Sustained Success_ A Detail...kalichargn70th171
A dynamic process unfolds in the intricate realm of software development, dedicated to crafting and sustaining products that effortlessly address user needs. Amidst vital stages like market analysis and requirement assessments, the heart of software development lies in the meticulous creation and upkeep of source code. Code alterations are inherent, challenging code quality, particularly under stringent deadlines.
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Crescat
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
E-commerce Development Services- Hornet DynamicsHornet Dynamics
For any business hoping to succeed in the digital age, having a strong online presence is crucial. We offer Ecommerce Development Services that are customized according to your business requirements and client preferences, enabling you to create a dynamic, safe, and user-friendly online store.
Artificia Intellicence and XPath Extension FunctionsOctavian Nadolu
The purpose of this presentation is to provide an overview of how you can use AI from XSLT, XQuery, Schematron, or XML Refactoring operations, the potential benefits of using AI, and some of the challenges we face.
DDS Security Version 1.2 was adopted in 2024. This revision strengthens support for long runnings systems adding new cryptographic algorithms, certificate revocation, and hardness against DoS attacks.
3. BASIS TECHNOLOGY
About Me
Gil Irizarry - VP Engineering at Basis Technology, responsible for NLP and Text
Analytics software development
https://www.linkedin.com/in/gilirizarry/
https://www.slideshare.net/conoagil
gil@basistech.com
Basis Technology - leading provider of software solutions for extracting
meaningful intelligence from multilingual text and digital devices
4. BASIS TECHNOLOGY
Agenda
● The problem space
● Defining the domain
● Assemble a test set
● Annotation guidelines
● Review of measurement
● Evaluation examples
● Inter-annotator agreement
● The steps to evaluation
6. BASIS TECHNOLOGY
The Problem Space
● You have some text to analyze. Which tool to choose?
● Related question: You have multiple text or data annotators. Which are doing
a good job?
● The questions are made harder by the tools outputting different formats,
analyzing data differently, and annotators interpreting data differently
● Start by defining the problem space
8. BASIS TECHNOLOGY
Defining the domain
● What space are you in?
● More importantly, in what domain will you evaluate tools?
● Are you:
○ Reading news
○ Scanning patents
○ Looking for financial fraud
9. BASIS TECHNOLOGY
Assemble a test set
● NLP systems are often trained on a general corpus. Often this corpus
consists of mainstream news articles.
● Do you use this domain or a more specific one?
● If more specific, do you train a custom model?
10. BASIS TECHNOLOGY
Annotation Guidelines
Examples requiring definition and agreement in guidelines:
● “Alice shook Brenda’s hand when she entered the meeting.” Is “Brenda” or
“Brenda’s” the entity to be extracted (in addition to Alice of course)?
● Are pronouns expected to be extracted and resolved? “She” in the previous example
● What about tolerance to punctuation? The U.N. vs. the UN
● Should fictitious characters (“Harry Potter”) be tagged as “person”?
● When a location appears within an organization’s name, do you tag both the location and
the organization, or just the organization (“San Francisco Association of
Realtors”)?
11. BASIS TECHNOLOGY
Annotation Guidelines
Examples requiring definition and agreement in guidelines:
● Do you tag the name of a person if it is used as a modifier (“Martin Luther King Jr.
Day”)?
● Do you tag “Twitter” in “You could try reaching out to the Twitterverse”?
● Do you tag “Google” in “I googled it, but I couldn’t find any relevant results”?
● When do you include “the” in an entity? The Ukraine vs. Ukraine
● How do you differentiate between an entity that’s a company name and a product by
the same name? {[ORG]The New York Times} was criticized for an article about the
{[LOC]Netherlands} in the June 4 edition of {[PRO]The New York Times}.
● “Washington and Moscow continued their negotiations.” Are Washington and
Moscow locations or organizations?
12. BASIS TECHNOLOGY
Annotation Guidelines
Non-entity extraction issues:
● How many levels of sentiment do you expect?
● Ontology and text classification - what categories do you expect?
● For language identification, are dialects identified as separate languages?
What about macrolanguages?
14. BASIS TECHNOLOGY
Annotation Guidelines
● Map to Universal Dependencies Guidelines where possible:
https://universaldependencies.org/guidelines.html
● Map to DBpedia ontology where possible:
http://mappings.dbpedia.org/server/ontology/classes/
● Map to known database such as Wikidata where possible:
https://www.wikidata.org/wiki/Wikidata:Main_Page
15. BASIS TECHNOLOGY
Review of measurement: precision
Precision is the fraction of retrieved documents that are relevant to the query
16. BASIS TECHNOLOGY
Review of measurement: recall
Recall is the fraction of the relevant documents that are successfully retrieved
17. BASIS TECHNOLOGY
Review of measurement: F-score
F-score is the harmonic mean of precision and recall
Precision and recall are ratios, so a harmonic mean is a more appropriate
average for them than an arithmetic mean.
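A minimal sketch of the three measures applied to entity extraction, where extracted entities play the role of retrieved documents. The function name and the sample entities are illustrative assumptions, not taken from any particular tool.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted entities against a gold-standard set (exact match)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical data: (entity text, entity type) pairs.
gold = {("Tim Cook", "PERSON"), ("Anna Wintour", "PERSON"),
        ("Vogue", "ORG"), ("Uber", "ORG")}
predicted = {("Tim Cook", "PERSON"), ("Vogue", "ORG"), ("CEO", "PERSON")}

p, r, f1 = precision_recall_f1(gold, predicted)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Here two of the three predictions are correct (precision 2/3) and two of the four gold entities are found (recall 1/2). Exact matching on text and type is a simplifying choice; annotation guidelines may call for looser matching.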
18. BASIS TECHNOLOGY
Review of measurement: harmonic mean
A harmonic mean returns a single value to combine both precision and recall. In
the image below, a and b map to precision and recall, and H maps to the F-score. In
this example, note that increasing a would not increase the overall score.
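As a small illustration of why the harmonic mean is used, the sketch below compares it with the arithmetic mean for a few hypothetical precision/recall pairs. When one value is already much larger than the other, raising it further barely moves the harmonic mean, because the smaller value dominates.

```python
def arithmetic_mean(a, b):
    return (a + b) / 2


def harmonic_mean(a, b):
    # Defined as 0 when both inputs are 0 to avoid dividing by zero.
    return 2 * a * b / (a + b) if a + b else 0.0


# Hypothetical (precision, recall) pairs.
for precision, recall in [(0.5, 0.5), (0.9, 0.1), (1.0, 0.1)]:
    a_mean = arithmetic_mean(precision, recall)
    h_mean = harmonic_mean(precision, recall)
    print(f"P={precision:.1f} R={recall:.1f} "
          f"arithmetic={a_mean:.3f} harmonic={h_mean:.3f}")
```

Moving precision from 0.9 to 1.0 with recall stuck at 0.1 lifts the arithmetic mean from 0.50 to 0.55, while the harmonic mean stays at roughly 0.18.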
19. BASIS TECHNOLOGY
Review of measurement: F-score
The previous example of an F-score was actually an F1 score, which balances precision
and recall evenly. A more generalized form of the F-score is:
Fβ = ((1 + β²) x precision x recall) / ((β² x precision) + recall)
F2 (β = 2) weights recall higher than precision and F0.5 (β = 0.5) weights precision
higher than recall
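A short sketch of the generalized F-score using the standard F-beta definition. The function name and the sample precision and recall values are assumptions chosen for illustration.

```python
def f_beta(precision, recall, beta=1.0):
    """F-beta = (1 + beta^2) * P * R / (beta^2 * P + R)."""
    if precision == 0 and recall == 0:
        return 0.0
    beta_squared = beta ** 2
    return ((1 + beta_squared) * precision * recall
            / (beta_squared * precision + recall))


# Hypothetical tool whose recall is better than its precision.
precision, recall = 0.5, 0.8
f_half = f_beta(precision, recall, beta=0.5)  # favors precision
f_one = f_beta(precision, recall, beta=1.0)   # balanced
f_two = f_beta(precision, recall, beta=2.0)   # favors recall
print(f"F0.5={f_half:.3f} F1={f_one:.3f} F2={f_two:.3f}")
```

Because recall is the stronger of the two values in this example, the score rises as β increases: F0.5 is lowest and F2 is highest.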
20. BASIS TECHNOLOGY
Review of measurement: AP and MAP
● Average precision is a measure that combines recall and precision for ranked
retrieval results. For one information need, the average precision is the mean
of the precision scores after each relevant document is retrieved
● Mean average precision is the mean of the average precision scores over a range of queries
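The two measures above can be sketched as follows. Each query is represented by a hypothetical list of booleans in rank order, True where the result at that rank is relevant. This sketch averages over the relevant results that were retrieved; some definitions divide by the total number of relevant documents instead, so treat the denominator as an assumption.

```python
def average_precision(ranked_relevance):
    """Mean of the precision values taken at each relevant result."""
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precisions.append(hits / rank)  # precision at this rank
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(queries):
    """Mean of the average precision over several queries."""
    if not queries:
        return 0.0
    return sum(average_precision(q) for q in queries) / len(queries)


# Hypothetical ranked results for two queries.
query_1 = [True, False, True, False, True]
query_2 = [False, True, True]

ap_1 = average_precision(query_1)
ap_2 = average_precision(query_2)
map_score = mean_average_precision([query_1, query_2])
print(f"AP1={ap_1:.3f} AP2={ap_2:.3f} MAP={map_score:.3f}")
```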
21. BASIS TECHNOLOGY
Review of measurement: MUC score
● Message Understanding Conference (MUC) scoring allows for taking partial
success into account
○ Correct: response = key
○ Partial: response ~= key
○ Incorrect: response != key
○ Spurious: key is blank and response is not
○ Missing: response is blank and key is not
○ Noncommittal: key and response are both blank
○ Recall = (correct + (partial x 0.5)) / possible
○ Precision = (correct + (partial x 0.5)) / actual
○ Undergeneration = missing / possible
○ Overgeneration = spurious / actual
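The MUC formulas above can be sketched as a single function. It takes "possible" as correct + incorrect + partial + missing and "actual" as correct + incorrect + partial + spurious, following the definitions in the speaker notes. The function name and the counts in the example are hypothetical.

```python
def muc_scores(correct, partial, incorrect, spurious, missing):
    """Compute MUC-style scores from per-category counts."""
    possible = correct + incorrect + partial + missing  # entities in the gold key
    actual = correct + incorrect + partial + spurious   # entities the tool produced
    credit = correct + 0.5 * partial                    # partial matches earn half credit
    return {
        "recall": credit / possible if possible else 0.0,
        "precision": credit / actual if actual else 0.0,
        "undergeneration": missing / possible if possible else 0.0,
        "overgeneration": spurious / actual if actual else 0.0,
    }


# Hypothetical counts from comparing one tool's output with a gold standard.
scores = muc_scores(correct=6, partial=2, incorrect=1, spurious=2, missing=1)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

With these counts, possible is 10 and actual is 11, so recall is 7/10 and precision is 7/11.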
22. BASIS TECHNOLOGY
Evaluation Examples
As co-sponsor, Tim Cook was seated at a
table with Vogue editor Anna Wintour, but
he made time to get around and see his
other friends, including Uber CEO Travis
Kalanick. Cook's date for the night was
Laurene Powell Jobs, the widow of Apple
cofounder Steve Jobs. Powell currently
runs Emerson Collective, a company that
seeks to make investments in education.
Kalanick brought a date as well, Gabi
Holzwarth, a well-known violinist.
23. BASIS TECHNOLOGY
Evaluation Examples - gold standard
As co-sponsor, Tim Cook was seated at a
table with Vogue editor Anna Wintour, but
he made time to get around and see his
other friends, including Uber CEO Travis
Kalanick. Cook's date for the night was
Laurene Powell Jobs, the widow of Apple
cofounder Steve Jobs. Powell currently
runs Emerson Collective, a company that
seeks to make investments in education.
Kalanick brought a date as well, Gabi
Holzwarth, a well-known violinist.
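To make the comparison against a gold standard concrete, here is a rough sketch that scores two tools on the person names in the passage above. The gold set is one reading of the passage (the original highlighting is not reproduced here), and both tool outputs are invented for illustration; they are not real output from any product.

```python
def f1_against_gold(gold, predicted):
    """Exact-match F1 of a predicted set of mentions against the gold set."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Assumed gold standard: person mentions, including the short references
# "Cook", "Powell", and "Kalanick" to earlier full names.
gold_people = {
    "Tim Cook", "Anna Wintour", "Travis Kalanick", "Cook",
    "Laurene Powell Jobs", "Steve Jobs", "Powell", "Kalanick",
    "Gabi Holzwarth",
}

# Hypothetical outputs from two tools.
tool_outputs = {
    "tool_a": {"Tim Cook", "Anna Wintour", "Travis Kalanick", "Cook",
               "Laurene Powell Jobs", "Steve Jobs", "Kalanick",
               "Gabi Holzwarth", "CEO"},  # misses "Powell", tags a title
    "tool_b": {"Tim Cook", "Anna Wintour", "Travis Kalanick",
               "Laurene Powell Jobs", "Steve Jobs",
               "Gabi Holzwarth"},         # full names only
}

scores = {name: f1_against_gold(gold_people, output)
          for name, output in tool_outputs.items()}
best_tool = max(scores, key=scores.get)
print(scores, best_tool)
```

Using sets of strings collapses repeated mentions of the same text, which keeps the sketch short; a fuller evaluation would compare mentions by their character offsets.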
28. BASIS TECHNOLOGY
Inter-annotator Agreement
● Krippendorff’s alpha is a reliability coefficient developed to measure the
agreement among observers, coders, judges, raters, or measuring
instruments drawing distinctions among typically unstructured phenomena
● Cohen’s kappa is a measure of the agreement between two raters who
determine which category each of a finite number of subjects belongs to,
with agreement due to chance factored out
● Inter-annotator agreement scoring determines the agreement between
different annotators annotating the same unstructured text
● It is not intended to measure the output of a tool against a gold standard
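As an illustration of the second measure above, here is a minimal sketch of Cohen's kappa for two annotators who labeled the same items. The label sequences are hypothetical, and the function is a plain implementation of the usual formula, (observed agreement - chance agreement) / (1 - chance agreement), not code from any specific library.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Agreement between two annotators, corrected for chance agreement."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty items")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both pick the same label independently.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two annotators for the same eight tokens.
annotator_a = ["PER", "PER", "ORG", "LOC", "ORG", "PER", "O", "O"]
annotator_b = ["PER", "ORG", "ORG", "LOC", "ORG", "PER", "O", "LOC"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(f"kappa={kappa:.3f}")
```

The annotators agree on 6 of 8 items (0.75) and would agree on 0.25 by chance, giving a kappa of about 0.667.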
29. BASIS TECHNOLOGY
The Steps to Evaluation
● Define your requirements
● Assemble a valid test dataset
● Annotate the gold standard test dataset
● Get output from tools
● Evaluate the results
● Make your decision
Thank you for joining. While we wait for people to join, I'm going to spend two minutes telling you about Rosette.
Rosette is our text analytics brand. We pride ourselves on providing a high-quality, carefully curated, and TESTED set of text analytics and natural language processing capabilities.
Testing and evaluation of NLP has become one of our in-house specialties, and a service we provide to customers. This is what inspired Gil's talk today. There is also a "How To Evaluate NLP" series on our blog if you want to read more after this talk.
We also pride ourselves on comprehensive NLP coverage. This includes breadth in both capabilities AND language support. Rosette text analytics enables high-quality analytics in over 32 languages.
All the Rosette capabilities are highly adaptable, with easy tools for domain adaptation and many options for deployment. We work with our clients to engineer the best possible NLP solution for their needs, using every possible data source to make their AI smart and resilient. Major brands that you know deploy Rosette on-premise and in the cloud for their mission critical, high volume systems.
Now let me introduce your host for this talk, Basis Technology's VP of engineering, Gil Irizarry...
Rosette is a full NLP stack from language identification to morphology to entity extraction and resolution. We are moving into application development with annotation studio and identity resolution.
One tool will output 5 levels of sentiment and another only 3. One tool will output transitive vs. intransitive verbs and another will output only verbs. One will strip possessives (King’s Landing) and another won’t.
Rosette / Amazon Comprehend. Note that Rosette and Comprehend identify titles differently. Comprehend identified CEO as a person and didn’t identify the pronoun.
Finding data is easier but annotating data is hard
“The Ukraine” is now “Ukraine”; similarly, “the Sudan” is now “Sudan”. How do you handle the change over time?
Screenshot of the TOC of our Annotation Guidelines. 42 pages. In some meetings, it’s the only doc under NDA. Header says for all. That means for all languages. We also have specific guidelines for some languages.
Images from Wikipedia
Images from Wikipedia
A harmonic mean is a better balance of two values than a simple average
Increasing A would lower the overall score, since both G and H would get smaller
Changing the beta value allows you to tune the harmonic mean and weight either precision or recall more heavily
https://link.springer.com/referenceworkentry/10.1007%2F978-0-387-39940-9_482
Precision is a single value. Average precision takes into account precision over a range of results. Mean average precision is the mean over a range of queries.
Annotated sample of people names. Note “Cook’s” and “Powell” as references to earlier names. Note the “Emerson Collective” as an organization name is not highlighted.
AP = (sum of (True Positive / Predicted Positive)) / num of True Positive
MAP = the mean of AP over a range of different queries, for example varying the tolerances or confidences
Possible: The number of entities hand-annotated in the gold evaluation corpus, equal to (Correct + Incorrect + Partial + Missing)
Actual: The number of entities tagged by the test NER system, equal to (Correct + Incorrect + Partial + Spurious)
(R) Recall = (correct + (1/2 partial)) / possible
(P) Precision = (correct + (1/2 partial)) / actual
F = (2 * P * R) / (P + R)