Submit Search
Upload
Language and Offensive Word Detection
•
0 likes
•
14 views
IRJET Journal
Follow
https://www.irjet.net/archives/V9/i4/IRJET-V9I4536.pdf
Read less
Read more
Engineering
Report
Share
Report
Share
1 of 6
Download now
Download to read offline
Recommended
Threat Detection System Using Data-science and NLP
Threat Detection System Using Data-science and NLP
IRJET Journal
Automatic Grading of Handwritten Answers
Automatic Grading of Handwritten Answers
IRJET Journal
Image Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic People
IRJET Journal
Sentimental Analysis For Electronic Product Review
Sentimental Analysis For Electronic Product Review
IRJET Journal
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IRJET Journal
Text Summarization and Conversion of Speech to Text
Text Summarization and Conversion of Speech to Text
IRJET Journal
IRJET- Semantic Question Matching
IRJET- Semantic Question Matching
IRJET Journal
IRJET- Factoid Question and Answering System
IRJET- Factoid Question and Answering System
IRJET Journal
Recommended
Threat Detection System Using Data-science and NLP
Threat Detection System Using Data-science and NLP
IRJET Journal
Automatic Grading of Handwritten Answers
Automatic Grading of Handwritten Answers
IRJET Journal
Image Based Tool for Level 1 and Level 2 Autistic People
Image Based Tool for Level 1 and Level 2 Autistic People
IRJET Journal
Sentimental Analysis For Electronic Product Review
Sentimental Analysis For Electronic Product Review
IRJET Journal
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IDE Code Compiler for the physically challenged (Deaf, Blind & Mute)
IRJET Journal
Text Summarization and Conversion of Speech to Text
Text Summarization and Conversion of Speech to Text
IRJET Journal
IRJET- Semantic Question Matching
IRJET- Semantic Question Matching
IRJET Journal
IRJET- Factoid Question and Answering System
IRJET- Factoid Question and Answering System
IRJET Journal
Live Sign Language Translation: A Survey
Live Sign Language Translation: A Survey
IRJET Journal
IRJET- Speech Based Answer Sheet Evaluation System
IRJET- Speech Based Answer Sheet Evaluation System
IRJET Journal
IRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech Analysis
IRJET Journal
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
IRJET Journal
IRJET - Mobile Chatbot for Information Search
IRJET - Mobile Chatbot for Information Search
IRJET Journal
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
IRJET Journal
Derogatory Comment Classification
Derogatory Comment Classification
IRJET Journal
Text pre-processing of multilingual for sentiment analysis based on social ne...
Text pre-processing of multilingual for sentiment analysis based on social ne...
IJECEIAES
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET Journal
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET Journal
Speech To Speech Translation
Speech To Speech Translation
IRJET Journal
IRJET- ASL Language Translation using ML
IRJET- ASL Language Translation using ML
IRJET Journal
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
IRJET Journal
2. an efficient approach for web query preprocessing edit sat
2. an efficient approach for web query preprocessing edit sat
IAESIJEECS
IRJET - Analysis of Paraphrase Detection using NLP Techniques
IRJET - Analysis of Paraphrase Detection using NLP Techniques
IRJET Journal
An Efficient Approach to Produce Source Code by Interpreting Algorithm
An Efficient Approach to Produce Source Code by Interpreting Algorithm
IRJET Journal
IRJET - Optical Character Recognition and Translation
IRJET - Optical Character Recognition and Translation
IRJET Journal
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET Journal
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET Journal
IRJET- Voice to Code Editor using Speech Recognition
IRJET- Voice to Code Editor using Speech Recognition
IRJET Journal
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
More Related Content
Similar to Language and Offensive Word Detection
Live Sign Language Translation: A Survey
Live Sign Language Translation: A Survey
IRJET Journal
IRJET- Speech Based Answer Sheet Evaluation System
IRJET- Speech Based Answer Sheet Evaluation System
IRJET Journal
IRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech Analysis
IRJET Journal
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
IRJET Journal
IRJET - Mobile Chatbot for Information Search
IRJET - Mobile Chatbot for Information Search
IRJET Journal
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
IRJET Journal
Derogatory Comment Classification
Derogatory Comment Classification
IRJET Journal
Text pre-processing of multilingual for sentiment analysis based on social ne...
Text pre-processing of multilingual for sentiment analysis based on social ne...
IJECEIAES
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET Journal
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET Journal
Speech To Speech Translation
Speech To Speech Translation
IRJET Journal
IRJET- ASL Language Translation using ML
IRJET- ASL Language Translation using ML
IRJET Journal
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
IRJET Journal
2. an efficient approach for web query preprocessing edit sat
2. an efficient approach for web query preprocessing edit sat
IAESIJEECS
IRJET - Analysis of Paraphrase Detection using NLP Techniques
IRJET - Analysis of Paraphrase Detection using NLP Techniques
IRJET Journal
An Efficient Approach to Produce Source Code by Interpreting Algorithm
An Efficient Approach to Produce Source Code by Interpreting Algorithm
IRJET Journal
IRJET - Optical Character Recognition and Translation
IRJET - Optical Character Recognition and Translation
IRJET Journal
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET Journal
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET Journal
IRJET- Voice to Code Editor using Speech Recognition
IRJET- Voice to Code Editor using Speech Recognition
IRJET Journal
Similar to Language and Offensive Word Detection
(20)
Live Sign Language Translation: A Survey
Live Sign Language Translation: A Survey
IRJET- Speech Based Answer Sheet Evaluation System
IRJET- Speech Based Answer Sheet Evaluation System
IRJET - Threat Prediction using Speech Analysis
IRJET - Threat Prediction using Speech Analysis
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
SPEECH RECOGNITION WITH LANGUAGE SPECIFICATION
IRJET - Mobile Chatbot for Information Search
IRJET - Mobile Chatbot for Information Search
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
Sentiment Analysis: A comparative study of Deep Learning and Machine Learning
Derogatory Comment Classification
Derogatory Comment Classification
Text pre-processing of multilingual for sentiment analysis based on social ne...
Text pre-processing of multilingual for sentiment analysis based on social ne...
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET - Text Optimization/Summarizer using Natural Language Processing
IRJET- Vernacular Language Spell Checker & Autocorrection
IRJET- Vernacular Language Spell Checker & Autocorrection
Speech To Speech Translation
Speech To Speech Translation
IRJET- ASL Language Translation using ML
IRJET- ASL Language Translation using ML
Sentiment Analysis on Twitter Data
Sentiment Analysis on Twitter Data
2. an efficient approach for web query preprocessing edit sat
2. an efficient approach for web query preprocessing edit sat
IRJET - Analysis of Paraphrase Detection using NLP Techniques
IRJET - Analysis of Paraphrase Detection using NLP Techniques
An Efficient Approach to Produce Source Code by Interpreting Algorithm
An Efficient Approach to Produce Source Code by Interpreting Algorithm
IRJET - Optical Character Recognition and Translation
IRJET - Optical Character Recognition and Translation
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- A Pragmatic Supervised Learning Methodology of Hate Speech Detection i...
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET- QUEZARD : Question Wizard using Machine Learning and Artificial Intell...
IRJET- Voice to Code Editor using Speech Recognition
IRJET- Voice to Code Editor using Speech Recognition
More from IRJET Journal
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
IRJET Journal
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
IRJET Journal
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
IRJET Journal
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
IRJET Journal
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
IRJET Journal
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
IRJET Journal
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
IRJET Journal
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
IRJET Journal
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
IRJET Journal
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
IRJET Journal
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
IRJET Journal
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
IRJET Journal
React based fullstack edtech web application
React based fullstack edtech web application
IRJET Journal
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
IRJET Journal
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
IRJET Journal
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
IRJET Journal
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
IRJET Journal
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
IRJET Journal
More from IRJET Journal
(20)
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
React based fullstack edtech web application
React based fullstack edtech web application
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Recently uploaded
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
ranjana rawat
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Christo Ananth
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
Call Girls in Nagpur High Profile
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
rknatarajan
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Christo Ananth
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
sanyuktamishra911
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
Suhani Kapoor
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
RajkumarAkumalla
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur High Profile
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
simmis5
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
M Maged Hegazy, LLM, MBA, CCP, P3O
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
ranjana rawat
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Suman Mia
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
SIVASHANKAR N
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
Tsuyoshi Horigome
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
rakeshbaidya232001
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
sivaprakash250
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls in Nagpur High Profile
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
Asst.prof M.Gokilavani
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
ranjana rawat
Recently uploaded
(20)
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
Call for Papers - African Journal of Biological Sciences, E-ISSN: 2663-2187, ...
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
College Call Girls Nashik Nehal 7001305949 Independent Escort Service Nashik
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Java Programming :Event Handling(Types of Events)
Java Programming :Event Handling(Types of Events)
Roadmap to Membership of RICS - Pathways and Routes
Roadmap to Membership of RICS - Pathways and Routes
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
(RIA) Call Girls Bhosari ( 7001035870 ) HI-Fi Pune Escorts Service
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
CCS335 _ Neural Networks and Deep Learning Laboratory_Lab Complete Record
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANVI) Koregaon Park Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
Language and Offensive Word Detection
1.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3705 Language and Offensive Word Detection Akash Kotekar1, Anuj Jaijeevan2, Tauqeer Rumaney3, Vishakh GR4, Prof. Shubhangi Chavan5 1,2,3,4 UG Student, Dept. of Computer Engineering, Pillai College of Engineering, New Panvel, India. 5 Assistant Professor, Dept. of Information Technology, Pillai College of Engineering, New Panvel, India. ---------------------------------------------------------------------***-------------------------------------------------------------- Abstract— Language recognition is a task in natural language processing that recognizes the natural language that makes up a document's content automatically. Language recognition helps users and machine translation communicate more effectively. In many NLP applications, language recognition is a fundamental and crucial stage. Machine learning and n- gram-based language identifiers are used to train and recognize numerous languages in this research. The first and most important step in using a code mixed language translation tool is to identify the language. It's also found in tools for multilingual summarizing and paraphrasing. Our software is also aimed at detecting offensive terms in any type of text provided by the user. People that engage in some type of online content (for example, articles) targeting an individual or a group have grown in popularity as social media has grown in prominence in recent years. Our programme will analyze the text using natural language processing or machine learning methods and compare it to the dataset to see if it contains any offending terms. Keywords—Language recognition, multilingual, offensive word, NBSVM. 1. Introduction Language identification is a task in natural language processing that recognizes the natural language in which the content of a document is written automatically. It is vital to determine the language of the content before employing any natural language application. Language identification is a key and crucial stage in many NLP applications. Our technology also aims to detect offensive terms in any sort of text that the user provides. A text is considered threatening or abusive if it contains sexist or racist slurs, targets or condemns any community or religious perspective, or stimulates criminal conduct. Our programme will compare the text to the dataset and analyze it using natural language processing or machine learning methods to check whether it contains any offensive terms. 2. Literature Survey A. Language Detection Engine for Multilingual Texting on Mobile Devices: Sourabh Vasant Gothe, Sourav Ghosh, Sharmila Mani, Bhanodai Guggilla, Ankur Agarwal, and Chandramouli Sanchi introduced the Language Detection Engine (LDE), a system that improves multilingual typing user experience by precisely determining the language of input text in real- time. LDE is a combination of a character N-gram model, which calculates the chance of input text coming from a specific language, and a selector model, which employs the emission probabilities to determine the most likely language for a given text using logistic regression. B. Automatic Hate Speech Detection on Social Media: A Brief Survey: Ahlam Alrehili [3] proposes a comprehensive and state-of-the-art natural language processing (NLP) technique for automatic hate speech identification on OSNs in this study. We focused on eight frequently used strategies for automatic hate detection, with N-gram being the most efficient and user-friendly. C. Language Identification for Multilingual Machine Translation: S Arun Babhulgaonkar and Shefali Sonavane [1] developed n-gram and machine learning- based language identifiers and used them to identify three Indian languages in a document submitted for machine translation: Hindi, Marathi, and Sanskrit. Incorporating language identification components into machine translation enhanced translation quality.
2.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3706 2.1 Summary of Related Work The summary of methods used in literature is given in Table 1. Table 1 Summary of literature survey Paper Advantages Disadvantages S Arun Babhulgaonkar et al. 2020 [1] This paper shows that SVM gives better results than other Language identifiers. . It is also observed that most of the misclassified instances are short and noisy Sourabh Vasant Gothe et al. 2020 [2] LDE accurately detects codeswitching in a multilingual text with the help of a uniquely designed selector model. It is a shallow learning model.. Ahlam Alrehili 2019 [3] This paper gives a detailed study of the commonly used hate speech detection techniques. Only eight technique comparisons are done. P. Mathur et al. 2018 [4] The success of transfer learning for analyzing complex cross linguistic textual structures can be extended to include many more tasks involving code-switched and code- mixed data. Hinglish tweets in the dataset suffer from syntactic degradation after transliteration and translation which leads to a loss in the contextual structuring of the tweets. Below is overview of comparison of different parameters Table 2 Summary of literature survey Types of Classifier Language Accuracy n-gram based Language Identifier Hindi 73.58% n-gram based Language Identifier Marathi 76.53% n-gram based Language Identifier Sanskrit 76.04% Logistic Regression Hindi 81.81% Logistic Regression Marathi 79.80% Logistic Regression Sanskrit 81.44% Support Vector Machine Hindi 87.68% Support Vector Machine Marathi 90% Support Vector Machine Sanskrit 89.23% Naïve Bayes Classifier Hindi 75.96% Naïve Bayes Classifier Marathi 76.99% Naïve Bayes Classifier Sanskrit 79.16%
3.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3707 3. Proposed Work Our research is aimed at recognizing 44 distinct languages and offensive terms in any type of text that the user provides as input. Because social media has grown in popularity in recent years, there are people who engage in some type of online material (for example, articles) aimed towards a person or a group, etc. So, using natural language processing or machine learning methods, our programme will evaluate the text and refer back to the dataset to determine the language of the input text as well as whether or not it contains any offending terms (for English and hinglish only). 3.1 System Architecture The proposed system architecture is given in Figure 1. Fig. 1 Proposed system architecture A. Dataset Creation: We have collected datasets for 44 different languages and cleaned them by performing filtration on it. Then all the datasets were combined into one single dataset containing 69290 sentences. For offensive word detection, we downloaded two (english and hinglish) datasets, cleaned and merged them into one dataset containing 1712 words. B. Training and Testing: The dataset for language detection was split into a training (80%)(48503 sentences) and testing (20%)(20787 sentences) set. This dataset was trained and tested using the Logistic Regression and Tf-Idf model. C. TF-IDF Model: The TF-IDF (term frequency-inverse document frequency) statistic examines the relevance of a word to a document in a collection of documents. This is accomplished by multiplying two metrics: the number of times a word appears in a document and the word's inverse document frequency over a collection of documents. D. Logistic regression Model Logistic regression is a statistical machine learning technique that classifies data by taking extreme outcome variables and attempting to draw a logarithmic line that separates them. E. Support Vector Machine Model The "Support Vector Machine" (SVM) is a supervised machine learning technique that can solve classification and regression issues. Each data item is plotted as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a certain coordinate in the SVM algorithm. Then we accomplish classification by locating the hyper- plane that clearly distinguishes the two classes. F. Ensemble Model (Soft Voting) Ensemble learning is the process of systematically generating and combining many models, such as classifiers or experts, to tackle a specific computational intelligence problem. Models that forecast class membership probability are subject to soft voting. Soft voting can be utilized for models that don't predict class membership probability natively, but it may require some calibration of their probability-like scores before they can be employed in the ensemble (e.g. support vector machine, k-nearest neighbors, and decision trees). When you have two or more models that perform well on a predictive modelling job, you should use a voting ensemble. The ensemble models must agree on the majority of their forecasts. E. N-gram (Tokenization) In a document, N-grams are continuous sequences of words, symbols, or tokens. N-gram, in our model, decomposes a phrase into a series of tokens containing each word. We use N-gram as a tokenizer to identify
4.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3708 offensive phrases where two or more words combine to generate an offensive phrase. D. Flow of Project Language Detection – 1) Input Text 2) Preprocessing - a) Text converted to Lowercase b) Removing Numbers c) Removing Punctuations 3) Feature Extraction - a) Tf-idf of the clean input text 4) Model - a) Feeding feature matrix into Soft Voting classifier(Logistic Regression and Support Vector Machine) 5) Language Recognized Offensive Word Detection – 1) Input Text 2) Preprocessing - a) Text converted to Lowercase b) Removing Numbers c) Removing Punctuations 3) Feature Extraction - a) Tokenization using N-gram 4) Comparing tokens with the list of offensive words in the dataset. 5) Offensive Words Recognized 3 Requirement Analyses The implementation detail is given in this section. 3.1 Software Table 3.3 Software details Operating System Windows 10 Programming Language Python 3.2 Hardware Table 3.2 Hardware details Processor 2 GHz Intel HDD 180 GB RAM 2 GB 3.3 Dataset The Language Detection dataset, which provides text details for several languages, is used. We must construct a model that will be able to predict the given language using the text. We begin by loading the dataset and performing some preliminary processing. We start by filtering the data to find statements of the right length and language. These sentences are then divided into three groups: training (70%), validation (20%), and test (10%). We need to extract features from our collection of phrases to generate a feature matrix before we can fit a model. 4 Conclusions The SVM (Support Vector Machine) classifier had the best accuracy among the other classifiers, but we chose the Logistic Regression model after attaining a 99.2 percent accuracy from training and testing the dataset. We employed a simple method for recognizing offensive words for offensive word identification, but we also used N-gram for a more advanced search in the dataset.
5.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3709 Acknowledgment It is our privilege to express our sincerest regards to our supervisor Shubhangi Chavan for the valuable inputs, able guidance, encouragement, whole-hearted cooperation and constructive criticism throughout the duration of this work. We deeply express our sincere thanks to our Head of the Department Dr. Sharvari Govilkar and our Principal Dr. Sandeep M. Joshi for encouraging and allowing us to present this work.
6.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 09 Issue: 04 | April 2022 www.irjet.net p-ISSN: 2395-0072 © 2022, IRJET | Impact Factor value: 7.529 | ISO 9001:2008 Certified Journal | Page 3710 REFERENCES 1. S Arun Babhulgaonkar, Shefali Sonavane; “Language Identification for Multilingual Machine Translation”, 2020 International Conference on Communication and Signal Processing (ICCSP). 2. Sourabh Vasant Gothe, Sourav Ghosh, Sharmila Mani, Bhanodai Guggilla, Ankur Agarwal, Chandramouli Sanchi; “ Language Detection Engine for Multilingual Texting on Mobile Devices”, 2020 IEEE 14th International Conference on Semantic Computing (ICSC). 3. Ahlam Alrehili, ”Automatic Hate Speech Detection on Social Media: A Brief Survey”, in: IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), Nov. 2019. 4. P. Mathur, R. Shah, R.Sawhney, and D. Mahata, “Detecting offensive tweets in hindi-english code-switched language,” in Proceedings of the Sixth International Workshop on Natural Language Processing for SocialMedia,2018. 5. A. H. Razavi, D. Inkpen, S. Uritsky, S. Matwin; “Offensive Language Detection Using Multi- level Classification”, Canadian Conference on AI 2010. 6. Puja Chakraborty, Md. Hanif Seddiqui; “Threat and Abusive Language Detection on Social Media in the Bengali Language”, 2019 1st International Conference on Advances in Science, Engineering and Robotics Technology (ICASERT). 7. Alina Maria Ciobanu, Marcos Zampieri, Shervin Malmasi Santanu Pal, Liviu P. Dinu; “Discriminating between Indo-Aryan Languages Using SVM Ensembles”, 2018 “In Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects” Association for Computational Linguistics, USA. 8. Gabriel Araújo De Souza, Márjory Da Costa- Abreu, “Automatic offensive language detection from Twitter data using machine learning and feature selection of metadata” in International Joint Conference on Neural Networks (IJCNN), 19-24 July 2020.
Download now