Team 1
Introduction to Expert Systems
• Title: Author Identification System Using Keywords and Pattern Frequency
• Objective of system: To identify or verify authors through text analysis.
• How it works: The system analyzes word frequency, keyword frequency, and the frequency of part-of-speech patterns (POS permutations) that follow keywords.
Problem definition and system objectives
• Problem definition: Given multiple texts, it is difficult to identify or verify the author of each one. This is needed in many areas, such as copyright infringement disputes, pragmatism research, and the attribution of literary works.
• System purpose: The system addresses this problem using text analysis and Natural Language Processing (NLP) techniques.
Overview of the entire system
• Our system analyzes word frequency, keyword frequency, and the POS patterns that follow keywords in the input text.
• Together, these data form a feature vector that represents the writing style of a particular author.
• Finally, these feature vectors are used to identify or verify authors.
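As a rough illustration of the feature vector idea, the sketch below turns a text into a list of relative word frequencies. It covers only the word-frequency part of the vector. The function names, the regular-expression tokenizer, and the three-word vocabulary are assumptions made for this example and do not come from the slides.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def feature_vector(text, vocabulary):
    """Return the relative frequency of each vocabulary word in the text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero on empty input
    return [counts[word] / total for word in vocabulary]


# Hypothetical feature words chosen only for this example.
vocabulary = ["the", "really", "however"]
vector = feature_vector("The cat really, really likes the dog.", vocabulary)
print(vector)
```

Keyword frequencies and POS pattern frequencies could be appended to the same list in the same way, so that each author is represented by one vector.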
Specific details in the system
• Word frequency: Authors tend to favor certain words. Analyzing the frequency of these words can help identify authors.
• Keyword frequency: The frequency of certain keywords is also important because authors frequently write about a particular subject or topic.
• POS patterns: Authors also tend to use grammatical patterns consistently. For example, an author may almost always follow a particular keyword with the same part of speech.
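The POS pattern idea can be sketched as counting which part-of-speech tag follows each keyword. The example below assumes the text has already been tagged. The tags are assigned by hand here, and a real system would obtain them from a POS tagger. The function name and the sample sentence are hypothetical.

```python
from collections import Counter


def pos_after_keyword(tagged_tokens, keywords):
    """Count which POS tag directly follows each keyword occurrence."""
    patterns = Counter()
    # Pair every token with the token that comes right after it.
    for (word, _), (_, next_tag) in zip(tagged_tokens, tagged_tokens[1:]):
        if word.lower() in keywords:
            patterns[(word.lower(), next_tag)] += 1
    return patterns


# Hand-tagged sample (assumption: tags would normally come from a tagger).
tagged = [
    ("I", "PRON"), ("really", "ADV"), ("love", "VERB"), ("it", "PRON"),
    ("and", "CONJ"), ("really", "ADV"), ("miss", "VERB"), ("you", "PRON"),
]
patterns = pos_after_keyword(tagged, {"really"})
print(patterns)
```

In this sample the keyword "really" is followed by a verb both times, which is the kind of consistent pattern the slide describes.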
Team 2
Problem
• Debbie was on her honeymoon and wrote an email to her mother. However, the mother suspected that Debbie had not written the email and contacted the police. We need to design an expert system to determine whether Debbie really wrote the email.
Overall system
• Prepare four data sets: emails received from Debbie after the marriage (Questioned), emails received from Debbie before the marriage (Known1), emails received from Jamie (Known2), and an unspecified set of emails (Reference).
• Using statistical analysis, find Debbie's and Jamie's respective keywords and check whether the keywords in Questioned match either of them.
System in detail (1/2)
• Split the sentences in the four data sets into words and count how often each word occurs. Then sort the words by count.
• For each of the three data sets, collect the words whose percentage of use in the reference is smallest compared with their percentage of use in that data set. These become keywords.
• Compare the keywords of Questioned with those of the two people and calculate how many apply. The person with the highest number of applicable keywords is assumed to be the writer.
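The three steps above can be sketched as follows. This is only one possible reading: it assumes a keyword is a word that takes up a much larger share of a data set than of the reference. The function names, the `top_n` and `floor` parameters, and the toy emails are all made up for illustration.

```python
from collections import Counter
import re


def relative_freq(text):
    """Map each word to its share of all words in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1
    return {word: count / total for word, count in counts.items()}


def keywords(sample, reference, top_n=3, floor=1e-6):
    """Pick the words most over-used in the sample relative to the reference."""
    sample_freq = relative_freq(sample)
    ref_freq = relative_freq(reference)
    # floor stands in for words that never occur in the reference.
    ratio = {w: f / max(ref_freq.get(w, 0.0), floor) for w, f in sample_freq.items()}
    ranked = sorted(ratio, key=lambda w: (-ratio[w], w))  # ties broken alphabetically
    return set(ranked[:top_n])


def likely_writer(questioned, known, reference, top_n=3):
    """Score each known writer by the number of keywords shared with Questioned."""
    q_keys = keywords(questioned, reference, top_n)
    scores = {name: len(q_keys & keywords(text, reference, top_n))
              for name, text in known.items()}
    return max(scores, key=scores.get), scores


# Toy data, invented for this example.
reference = "the weather is fine and the train is late and the food is good"
known = {
    "Debbie": "the beach is lovely lovely and mum is lovely",
    "Jamie": "the hotel is fine mate and the bar is fine mate",
}
questioned = "the pool is lovely and mum is happy"
writer, scores = likely_writer(questioned, known, reference)
print(writer, scores)
```

With real emails the data sets would be far larger, and the cut-off for what counts as a keyword would need tuning.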
System in detail (2/2)
Flowchart:
1. Tokenization of sentences
2. Sort by word frequency
3. Find keywords based on references
4. Does the keyword in Known1 apply to the Questioned more than the keyword in Known2?
   Yes: Output Debbie label
   No: Output Jamie label
Team 3
Problem/Purpose
• Problem
It is not known who wrote a given sentence.
• Purpose
Determine, using the datasets, who wrote the input sentences.
Overall system
Step 1. Input sentences
Step 2. Compare them with the datasets (such as word frequency, keyword frequency, and keyword and pattern frequency)
Step 3. Output the result of Step 2
System in detail
① How to create the dataset?
・Collect many sentences written by the person we want to identify.
・Separate the sentences into words.
・Organize the dataset into word frequencies, keyword frequencies, and keyword and pattern frequencies.
System in detail (continued)
② How to compare the input sentences with the datasets?
・Separate the input sentences into words.
・Compare the input words with the dataset to see whether there are similarities in word frequency, keyword frequency, and keyword/pattern frequency.
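One way to make "similarities in word frequency" concrete is to compare word-count profiles with cosine similarity. The slides do not name a similarity measure, so cosine similarity is an assumption here, as are the function names and the sample sentences.

```python
from collections import Counter
import math
import re


def word_counts(sentences):
    """Separate sentences into words and count each word."""
    counts = Counter()
    for sentence in sentences:
        counts.update(re.findall(r"[a-z']+", sentence.lower()))
    return counts


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical datasets, one per candidate author.
datasets = {
    "author_a": word_counts(["I truly think so.", "I truly hope so."]),
    "author_b": word_counts(["Well, maybe not.", "Well, perhaps later."]),
}
query = word_counts(["I truly believe so."])
scores = {name: cosine_similarity(query, counts) for name, counts in datasets.items()}
best = max(scores, key=scores.get)
print(best, scores)
```

The same comparison could be repeated for keyword frequencies and keyword/pattern frequencies, with the three scores combined into a final decision.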
Team 4
Problem/Purpose
Problem:
Were the emails received from Debbie after her marriage written by Debbie herself?
Purpose:
To identify whether the emails were written by Debbie.
Overall system
Input:
Emails received from Debbie after marriage
Inference:
Word frequency
Keyword frequency
Keyword pattern
Output:
Jamie/Debbie
System in detail 1
・The knowledge needed
Known Dataset: Emails received from Debbie before marriage, emails received from Jamie
Reference Dataset: Large collection of emails from many different senders
・The inference needed
Frequency comparison, Keyword matching, Pattern analysis
System in detail 2
• Frequency comparison
Creates a word frequency list from each dataset of emails and checks whether any words or keywords overlap between them.
• Keyword matching
Splits the sentences in the emails into keywords, counts their frequency, and searches for keywords that match.
• Pattern analysis
Calculates keyword POS patterns by comparing the reference dataset with the other datasets, and evaluates the overlap of keyword patterns in the emails.
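The frequency comparison step might look like the sketch below, which builds a word frequency list per dataset and reports the words that are frequent in both. The `min_count` threshold, the function names, and the sample emails are assumptions for illustration only.

```python
from collections import Counter
import re


def frequency_list(emails):
    """Build a word frequency list from a dataset of emails."""
    counts = Counter()
    for email in emails:
        counts.update(re.findall(r"[a-z']+", email.lower()))
    return counts


def overlapping_words(counts_a, counts_b, min_count=2):
    """Return the words used at least min_count times in both datasets."""
    frequent_a = {w for w, c in counts_a.items() if c >= min_count}
    frequent_b = {w for w, c in counts_b.items() if c >= min_count}
    return frequent_a & frequent_b


# Invented sample emails for the three datasets.
questioned = frequency_list(["Hiya mum, all fine here.", "Hiya mum, back soon."])
known_debbie = frequency_list(["Hiya mum, see you soon.", "Hiya, miss you mum."])
known_jamie = frequency_list(["Hey mate, all good.", "Hey mate, cheers."])

overlap_debbie = overlapping_words(questioned, known_debbie)
overlap_jamie = overlapping_words(questioned, known_jamie)
print(sorted(overlap_debbie), sorted(overlap_jamie))
```

A larger overlap with one known dataset than with the other would count as evidence for that writer, alongside the keyword matching and pattern analysis steps.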
