CYBERBULLYING DETECTION USING
MACHINE LEARNING
PRESENTED BY GROUP I
Under the Guidance of
Ms. Surya Ashok,
HOD, Computer Science Department
TEAM MEMBERS:
ANITHA R
KRITHIKA V S
MEGHA M S
PRANIDHI K J
ABSTRACT
● With the widespread use of social media in this era,
cyberbullying has increased rapidly as a form of cybercrime.
● Cyberbullying is willful and repeated harm inflicted
through the use of computers, cell phones, and other electronic devices.
● The proposed system aims to detect cyberbullying by identifying abusive
comments and messages on social media platforms.
● The Naive Bayes machine learning algorithm is used to classify comments and
messages as bullying or non-bullying.
● The project ‘Cyberbullying Detection Using Machine Learning’ discusses and
implements a machine learning approach to address the threat of
cyberbullying and thus help make social media a safer place for its users.
SYSTEM SPECIFICATIONS
Hardware Specification
Processor : Intel Core i5
Speed : Above 1GHz
RAM capacity : 4GB or above
Hard Disk Space Required : 5 GB or above
Keyboard : Standard Keyboard
Mouse : Standard Mouse
Monitor : Standard color monitor
Software Specification
● Language Used : Python 3.10, HTML5, JavaScript ES6
➔ Here, HTML and JavaScript are used for designing the web application.
➔ The main advantage of using Python in this project is that it is open source.
➔ It also has a wide range of machine learning libraries available.
● Web Framework : Django 3.x
➔ Django is preferred in this project because of its simplicity, flexibility, reliability and scalability.
● Database : SQL Server 2019
➔ SQL Server 2019 (15.x) introduces new ways to work with SQL Server containers, such as
Machine Learning Services.
➔ It supports query interleaving, a tabular mode system configuration (in Analysis Services) that can
improve user query response times in high-concurrency scenarios.
EXISTING SYSTEM
● For several years, researchers have worked intensively on cyberbullying
detection to find a way to control or reduce cyberbullying on social media
platforms.
● In research work by the Massachusetts Institute of Technology, a system to detect
cyberbullying through textual context in YouTube video comments was
developed, but the system showed less precise classification and
more false positives.
● Most existing systems focus on the effects after a cyberbullying
incident, and there is no accurate system for online cyberbullying detection.
PROPOSED SYSTEM
● The proposed system employs machine learning to avoid human
intervention.
● A dataset containing cyberbullying and non-bullying comments is used to
train the machine learning model using the scikit-learn (sklearn) library in Python.
● The Naive Bayes algorithm is used for detecting abusive comments and
messages in social media.
● The Naive Bayes algorithm is based on Bayes' theorem, which states that:
P(A|B) = (P(B|A) P(A)) / P(B)
where A is the class (bullying or non-bullying) and B is the observed comment text.
● In the proposed system, automated detection of bullying comments in
social media is implemented.
● The proposed system is platform independent: it can be implemented on
any operating system, and it is free to use.
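To make Bayes' theorem from this slide concrete, here is a small worked example in Python. The counts are invented purely for illustration and are not taken from the project's dataset. A is the event "the comment is bullying" and B is the event "the comment contains a given word".

```python
# Hypothetical counts, for illustration only (not the project's data).
total_comments = 1000
bullying_comments = 200        # comments labelled bullying
word_in_bullying = 80          # bullying comments that contain the word
word_in_non_bullying = 40      # non-bullying comments that contain the word

p_a = bullying_comments / total_comments                          # P(A)
p_b_given_a = word_in_bullying / bullying_comments                # P(B|A)
p_b = (word_in_bullying + word_in_non_bullying) / total_comments  # P(B)

# Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_a_given_b = p_b_given_a * p_a / p_b

print(round(p_a_given_b, 4))  # 0.6667
```

With these made-up numbers, seeing the word raises the probability that a comment is bullying from the prior of 0.2 to about 0.67.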
MODULE DESCRIPTION
● User module.
● Admin module.
● Machine learning module.
MODULE FUNCTIONALITIES
❏ USER MODULE
● Users can sign up for the web application by registering with
details such as a user name and password.
● Registered users can sign in to their profile using their user ID and password.
● They can post videos, stories, and photos in the web application.
● Users can send friend requests to other users and can also chat with their
friends.
● Users can view, like, and comment on the videos and photos posted by their
friends in the web application.
❏ ADMIN MODULE
● The admin can manage and make changes to the web application.
● They can also view the requests from users.
● They can also view the comments that have been classified as bullying
or non-bullying.
● They can manage the notifications of users.
❏ MACHINE LEARNING MODULE
● The Machine Learning module is responsible for classifying
comments and messages as bullying or non-bullying.
● From a vast set of comments and messages, the Naive Bayes
algorithm is used to predict bullying comments and messages.
● This module includes the following steps :
➢ Data collection
➢ Data preprocessing
➢ Segmentation
➢ Feature extraction
➢ Training
➢ Testing
FLOWCHART OF CYBERBULLYING DETECTION SYSTEM
1. DATA COLLECTION
● Collecting data for training the Machine Learning model is the basic step
in the machine learning pipeline.
● The predictions made by Machine Learning systems can only be as good as
the data on which they have been trained.
● In this system, the dataset contains bullying as well as non-bullying
comments and messages.
● The dataset is downloaded from the Kaggle website.
● 80% of the dataset is used for training and the remaining 20% is used for
testing.
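The 80/20 split described above can be sketched in plain Python as follows. The function name `split_dataset`, the seed, and the toy data are assumptions made for this illustration. The slides do not say how the split was performed. With scikit-learn, `train_test_split` with `test_size=0.2` would be a common way to get an equivalent split.

```python
import random

def split_dataset(samples, train_fraction=0.8, seed=42):
    """Shuffle a copy of `samples` and split it into (train, test)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # fixed seed for reproducibility
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Toy (text, label) pairs; 1 = bullying, 0 = non-bullying.
data = [(f"comment {i}", i % 2) for i in range(10)]
train, test = split_dataset(data)
print(len(train), len(test))  # 8 2
```

Shuffling before cutting matters when the downloaded file is ordered by label, since an unshuffled cut could leave one class out of the test set.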
2. DATA PREPROCESSING
● Real-world raw data and images are often incomplete, inconsistent and lacking in
certain behaviors or trends. They are also likely to contain many errors. So, once
collected, they are pre-processed into a format the machine learning algorithm
can use for the model.
● Data preprocessing in Machine Learning is a crucial step that helps enhance the
quality of data to promote the extraction of meaningful insights from the data.
● The preprocessing step also includes the removal of stop words and special characters
and the conversion of uppercase letters to lowercase.
● The lemmatization step converts an inflected word into its root word. For
example, the word "running" is converted to its root word "run".
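The preprocessing steps listed above (lowercasing, removing special characters and stop words, lemmatization) might look roughly like this. The stop-word set and the lemma table here are tiny hypothetical stand-ins so that the sketch runs without downloads. A real system would more likely use a full stop-word list and a proper lemmatizer from an NLP library.

```python
import re

# Tiny illustrative resources (hypothetical stand-ins for a full stop-word
# list and a dictionary-based lemmatizer).
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "to", "and", "so"}
LEMMAS = {"running": "run", "losers": "loser", "hated": "hate"}

def preprocess(text):
    """Lowercase, strip special characters, drop stop words, lemmatize."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)    # keep letters and whitespace only
    tokens = text.split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [LEMMAS.get(t, t) for t in tokens]  # unknown words pass through

print(preprocess("You are SO running away, losers!!!"))
# ['run', 'away', 'loser']
```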
3. SEGMENTATION
● Segmentation can be defined as the process of separating sentences
into different tokens.
● N-grams are used for grouping tokens.
● N-grams have a variety of uses, such as auto-completion
of sentences.
● In this project, 2-grams (bigrams) are used to group tokens.
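As a minimal sketch of the 2-gram grouping described above, the helper below slides a window of size n over a token list. The function name `ngrams` and the sample sentence are illustrative choices, not taken from the project.

```python
def ngrams(tokens, n=2):
    """Return the list of n-grams (as tuples) over a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

tokens = "nobody likes you here".split()
print(ngrams(tokens))
# [('nobody', 'likes'), ('likes', 'you'), ('you', 'here')]
```

Grouping tokens in pairs keeps some word order, so a phrase can carry a different signal than its individual words.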
4. FEATURE EXTRACTION
● Feature extraction is the process of taking out a list of words from the text data
and then transforming them into a feature set which is usable by a classifier.
● In this system, TF-IDF vectorizer is used for feature extraction.
● TF-IDF stands for term frequency-inverse document frequency. It is a
measure used to quantify the importance or relevance of string
representations (such as words or phrases) in a document.
● TF-IDF associates each word in a document with a number that represents how
relevant each word is in that document.
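The idea behind TF-IDF can be sketched in a few lines using the textbook formula tf(t, d) × log(N / df(t)). This is an illustration under that assumption, not the project's code. scikit-learn's `TfidfVectorizer` uses a smoothed IDF and L2-normalizes each row by default, so its numbers will differ from the ones produced here. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    Uses tf(t, d) * log(N / df(t)), where tf is the term's relative
    frequency in the document and df is the number of documents with it.
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

docs = [["you", "are", "stupid"],
        ["you", "are", "kind"],
        ["have", "a", "nice", "day"]]
scores = tf_idf(docs)
# A word that appears in only one document outweighs a word shared by two.
print(scores[0]["stupid"] > scores[0]["you"])  # True
```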
5. TRAINING
● Model training is the key step in machine learning that results in a model ready
to be validated, tested, and deployed.
● The performance of the model determines the quality of the applications that
are built using it.
● Quality of training data and the training algorithm are both important assets
during the model training phase.
● Typically, the dataset is split into training and testing sets.
● All these aspects of model training make it both an involved and important
process in the overall machine learning development cycle.
6. TESTING
● In machine learning, model testing is referred to as the process where
the performance of a fully trained model is evaluated on a testing set.
● The testing set, consisting of a set of testing samples, should be
separated from both the training and validation sets, but it should
follow the same probability distribution as the training set.
● Each testing sample has a known value of the target.
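Because each testing sample has a known target, predictions can be tallied into a confusion matrix and summarized as accuracy, precision, and recall. The sketch below shows one way to do this for the binary bullying/non-bullying case. The labels are hypothetical and are not results from the project.

```python
def confusion_matrix(y_true, y_pred, positive=1):
    """Count (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1          # bullying correctly flagged
        elif pred == positive:
            fp += 1          # harmless comment wrongly flagged
        elif truth == positive:
            fn += 1          # bullying comment missed
        else:
            tn += 1          # harmless comment correctly passed
    return tp, fp, fn, tn

# Hypothetical labels: 1 = bullying, 0 = non-bullying.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tp, fp, fn, tn = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall = tp / (tp + fn)

print(tp, fp, fn, tn)               # 3 1 1 3
print(accuracy, precision, recall)  # 0.75 0.75 0.75
```

Precision is the measure most directly tied to false positives, which the Existing System slide names as a weakness of earlier work.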
DOMAIN THEORY
➔ Machine learning
● Machine learning (ML) is the study of computer algorithms that improve
automatically through experience.
● Machine learning involves computers discovering how they can perform tasks
without being explicitly programmed to do so.
● The Machine Learning process starts with inputting training data into the
selected algorithm.
● New input data is fed into the machine learning algorithm to test whether the
algorithm works correctly.
➔ NAIVE BAYES
● A Naive Bayes classifier is a probabilistic machine learning model
that is used for classification tasks.
● The classifier is based on Bayes' theorem.
Bayes' Theorem:
P(A|B) = (P(B|A) P(A)) / P(B)
● This system uses the Multinomial Naive Bayes classifier.
● The features/predictors used by the classifier are the frequencies of
the words present in the document.
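A multinomial Naive Bayes classifier over word frequencies can be sketched from scratch as below. This is a minimal illustration, not the project's implementation, which the slides say is built with scikit-learn. The class name `TinyMultinomialNB`, the Laplace smoothing value `alpha=1.0`, and the toy training comments are all assumptions. The sketch also uses raw word counts rather than TF-IDF weights.

```python
import math
from collections import Counter, defaultdict

class TinyMultinomialNB:
    """Minimal multinomial Naive Bayes over tokenized documents."""

    def fit(self, docs, labels, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing strength
        self.class_counts = Counter(labels)      # documents per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log P(class) + sum of log P(word | class), smoothed
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # skip words never seen in training
                    count = self.word_counts[label][w]
                    score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

train_docs = [["you", "are", "stupid"], ["shut", "up", "loser"],
              ["have", "a", "nice", "day"], ["you", "are", "kind"]]
train_labels = ["bullying", "bullying", "non-bullying", "non-bullying"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["stupid", "loser"]))  # bullying
print(model.predict(["nice", "day"]))      # non-bullying
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the smoothing term keeps a word unseen in one class from forcing that class's probability to zero.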
CONFUSION MATRIX
Fig.: Confusion Matrix
DATABASE TABLE
ADMIN
USER
POST
MESSAGES
COMMENTS
USER PROFILE
DATA FLOW DIAGRAMS
Fig.: Level 0 DFD
Fig.: Level 1 DFD
Fig.: Level 1 DFD of user
Fig.: Level 1.1 DFD of admin
ER DIAGRAM
ADMIN LOGIN
ADMIN HOME PAGE
SIGNUP PAGE
LOGIN PAGE
HOME PAGE
WARNING MESSAGE
RESTRICTED ACCOUNT
CONCLUSION
The overall aim of the project “Cyberbullying Detection Using Machine
Learning” is to develop a system that automatically classifies comments
and messages as bullying or non-bullying and also removes the bullying
comments from the web application.

CYBERBULLYING DETECTION USING MACHINE LEARNING-1 (1).pdf

  • 1. CYBERBULLYING DETECTION USING MACHINE LEARNING PRESENTED BY GROUP I Under the Guidance of Ms.Surya Ashok, HOD Computer Science department TEAM MEMBERS: ANITHA R KRITHIKA V S MEGHA M S PRANIDHI K J
  • 2. ABSTRACT ● With the widespread use of social media in this era, cyberbullying increased rapidly as a cybercrime. ● Cyberbullying is a willful and repeated harm inflicted through the use of computer, cell phones, and other electronic devices. ● The proposed system aims at detecting cyberbullying, it detects abusive comments and messages in social media platform. ● The Machine learning algorithm,Naive bayes is used to classify comments and messages as bullying and non-bullying. ● The project ‘Cyberbullying Detection Using Machine Learning’ discusses and implements the approach of machine learning in order to solve the threat of cyberbullying, and thus makes social media a safe place for the users.
  • 3. SYSTEM SPECIFICATIONS Hardware Specification Processor : Intel Core i5 Speed : Above 1GHz RAM capacity : 4GB or above Hard Disk Space Required : 5 GB or above Keyboard : Standard Keyboard Mouse : Standard Mouse Monitor : Standard color monitor
  • 4. Software Specification ● Language Used : Python 3.10, HTML5, JavaScript ES6 ➔ HTML and JavaScript are used for designing the web application. ➔ The main advantage of using Python in this project is that it is open source. ➔ It also has a vast range of machine learning libraries available. ● Web Framework : Django 3.7 ➔ Django is preferred in this project because of its simplicity, flexibility, reliability and scalability. ● Database : SQL Server 2019 ➔ SQL Server 2019 (15.x) introduces new ways to work with SQL Server Containers, such as Machine Learning Services. ➔ It supports query interleaving, which is a tabular mode system configuration that can improve user query response times in high-concurrency scenarios.
  • 5. EXISTING SYSTEM ● For several years, researchers have worked intensively on cyberbullying detection to find a way to control or reduce cyberbullying on social media platforms. ● In research by the Massachusetts Institute of Technology, a system was developed to detect cyberbullying from the textual context of YouTube video comments, but it showed lower classification precision and more false positives. ● Most existing systems focus on the effects after a cyberbullying incident, and there is no accurate system for online cyberbullying detection.
  • 6. PROPOSED SYSTEM ● The proposed system employs machine learning to avoid human intervention. ● A dataset containing cyberbullying and non-bullying comments is used to train the machine learning model using the Sklearn library in Python. ● The Naive Bayes algorithm is used for detecting abusive comments and messages in social media.
  • 7. ● The Naive Bayes algorithm is based on Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B) ● The proposed system implements automated detection of bullying comments in social media. ● The proposed system is platform independent: it can be run on any operating system, and it is free to use.
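As a worked illustration of Bayes' theorem, the sketch below computes the probability that a comment is bullying (event A) given that it contains a particular word (event B). All probabilities are invented for illustration and do not come from the project's dataset; the function and variable names are hypothetical.

```python
def posterior(p_b_given_a: float, p_a: float, p_b: float) -> float:
    """Bayes' theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    if p_b <= 0:
        raise ValueError("P(B) must be positive")
    return p_b_given_a * p_a / p_b


# Hypothetical numbers, chosen only to make the arithmetic easy to follow.
p_bullying = 0.2              # P(A): prior share of bullying comments
p_word_given_bullying = 0.5   # P(B|A): word appears in a bullying comment
p_word_given_clean = 0.05     # P(B|not A): word appears in a clean comment

# Law of total probability gives P(B), the overall chance of seeing the word.
p_word = (p_word_given_bullying * p_bullying
          + p_word_given_clean * (1 - p_bullying))

p_bullying_given_word = posterior(p_word_given_bullying, p_bullying, p_word)
print(round(p_bullying_given_word, 3))
```

With these assumed numbers, seeing the word raises the estimated probability of bullying from the prior of 0.2 to roughly 0.71.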
  • 8. MODULE DESCRIPTION ● User module. ● Admin module. ● Machine learning module.
  • 9. MODULE FUNCTIONALITIES ❏ USER MODULE ● Users can sign up to the web application by registering with details such as user name and password. ● Registered users can sign in to their profile using their user ID and password. ● They can post videos, stories and photos in the web application. ● Users can send friend requests to other users and can also chat with their friends. ● Users can view, like and comment on the videos and photos posted by their friends in the web application.
  • 10. ❏ ADMIN MODULE ● The admin can manage and make changes to the web application. ● They can also view the requests from users. ● They can also view the comments that have been classified as bullying and non-bullying. ● They can manage the notifications of users.
  • 11. ❏ MACHINE LEARNING MODULE ● The Machine Learning module is responsible for classifying comments and messages as bullying or non-bullying. ● From a vast set of comments and messages, the Naive Bayes algorithm is used to predict bullying comments and messages. ● This module includes the following steps : ➢ Data collection ➢ Data preprocessing ➢ Segmentation ➢ Feature extraction ➢ Training ➢ Testing
  • 12. FLOWCHART OF CYBERBULLYING DETECTION SYSTEM
  • 13. 1. DATA COLLECTION ● Collecting data for training the Machine Learning model is the basic step in the machine learning pipeline. ● The predictions made by Machine Learning systems can only be as good as the data on which they have been trained. ● In this system, the dataset contains bullying as well as non-bullying comments and messages. ● The dataset is downloaded from the Kaggle website. ● 80% of the dataset is used for training and the remaining 20% is used for testing.
  • 14. 2. DATA PREPROCESSING ● Real-world raw data and images are often incomplete, inconsistent and lacking in certain behaviors or trends. They are also likely to contain many errors. So, once collected, they are pre-processed into a format the machine learning algorithm can use for the model. ● Data preprocessing in Machine Learning is a crucial step that improves the quality of the data so that meaningful insights can be extracted from it. ● The preprocessing step also includes the removal of stop words and special characters and the conversion of uppercase letters to lowercase. ● The lemmatization step converts inflected words into their root word. For example, the word running is converted to its root word run.
  • 15. 3. SEGMENTATION ● Segmentation can be defined as the process of separating sentences into different tokens. ● N-grams are used for grouping tokens. ● N-grams have a variety of uses, for example auto-completion of sentences. ● In this project, 2-grams (bigrams) are used to group tokens.
  • 16. 4. FEATURE EXTRACTION ● Feature extraction is the process of taking out a list of words from the text data and then transforming them into a feature set which is usable by a classifier. ● In this system, a TF-IDF vectorizer is used for feature extraction. ● TF-IDF stands for term frequency-inverse document frequency; it is a measure used to quantify the importance or relevance of string representations in a document. ● TF-IDF associates each word in a document with a number that represents how relevant that word is in that document.
  • 17. 5. TRAINING ● Model training is the key step in machine learning that results in a model ready to be validated, tested, and deployed. ● The performance of the model determines the quality of the applications that are built using it. ● The quality of the training data and the training algorithm are both important assets during the model training phase. ● Typically, the dataset is split for training and testing. ● All these aspects of model training make it both an involved and important process in the overall machine learning development cycle.
  • 18. 6. TESTING ● In machine learning, model testing refers to the process in which the performance of a fully trained model is evaluated on a testing set. ● The testing set, consisting of a set of testing samples, should be kept separate from both the training and validation sets, but it should follow the same probability distribution as the training set. ● Each testing sample has a known value of the target.
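The 80/20 split described above can be sketched in plain Python as follows. This is only an illustration of the idea, not the project's code: the function name, the seed, and the toy comments are assumptions. The Sklearn library that the project uses provides `train_test_split` for the same purpose.

```python
import random


def split_dataset(samples, train_fraction=0.8, seed=42):
    """Shuffle a copy of `samples` and split it into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(samples)               # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical labelled data: (comment text, label) pairs.
comments = [(f"comment {i}", "bullying" if i % 2 else "non-bullying")
            for i in range(10)]
train_set, test_set = split_dataset(comments)
print(len(train_set), len(test_set))
```

Shuffling before the cut matters when the source file is ordered by label; otherwise the test set could end up containing a single class.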
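A minimal sketch of the preprocessing steps listed above (lowercasing, removing special characters, dropping stop words, and lemmatizing). The stop-word set and the lemma table are tiny hypothetical stand-ins; a real system would take both from an NLP library rather than hard-code them.

```python
import re

# Hypothetical, deliberately tiny resources used only for this illustration.
STOP_WORDS = {"a", "an", "the", "is", "are", "you", "and", "to", "of"}
LEMMAS = {"running": "run", "ran": "run", "losers": "loser"}


def preprocess(text):
    """Return the cleaned, lemmatized tokens of one comment."""
    text = text.lower()
    # Replace everything except lowercase ASCII letters and whitespace
    # (an ASCII-only simplification; digits are dropped as well).
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    return [LEMMAS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]


print(preprocess("You are SUCH a loser!!! Stop running away."))
```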
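Grouping tokens into 2-grams, as described above, can be sketched like this. The function name and the sample tokens are illustrative assumptions.

```python
def ngrams(tokens, n=2):
    """Return the list of consecutive n-token tuples in `tokens`."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = ["nobody", "likes", "you", "loser"]
print(ngrams(tokens))
```

Bigrams preserve some word order, so a phrase such as "nobody likes" becomes a feature of its own instead of two unrelated words.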
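To make the TF-IDF weighting concrete, here is a small sketch using the textbook definition: term frequency is the term's share of the document's tokens, and inverse document frequency is log(N / df). This is an assumption about the formula, not the project's code; library implementations such as Sklearn's `TfidfVectorizer` use a smoothed and normalized variant, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))          # count each term once per document
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized comments.
docs = [["you", "are", "stupid"], ["you", "are", "kind"], ["have", "nice", "day"]]
for row in tf_idf(docs):
    print(row)
```

A term that appears in every document gets weight 0 under this definition, which is why common words contribute little to the classifier.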
  • 19. DOMAIN THEORY ➔ Machine learning ● Machine learning (ML) is the study of computer algorithms that improve automatically through experience. ● Machine learning involves computers discovering how they can perform tasks without being explicitly programmed to do so. ● The Machine Learning process starts with inputting training data into the selected algorithm. ● New input data is fed into the machine learning algorithm to test whether the algorithm works correctly.
  • 20. ➔ NAIVE BAYES ● A Naive Bayes classifier is a probabilistic machine learning model used for classification tasks. ● The classifier is based on Bayes' theorem: P(A|B) = P(B|A) P(A) / P(B) ● This system uses the Multinomial Naive Bayes classifier. ● The features (predictors) used by the classifier are the frequencies of the words present in the document.
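The following is a from-scratch sketch of a multinomial Naive Bayes classifier over word counts, with add-one (Laplace) smoothing. It is meant only to show the mechanics; the class name `TinyMultinomialNB`, the smoothing choice, and the toy training comments are assumptions, and the project itself relies on the Sklearn library.

```python
import math
from collections import Counter


class TinyMultinomialNB:
    """Multinomial Naive Bayes on token lists, with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # smoothing strength; 1.0 is add-one smoothing

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        self.word_counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc)
        label_counts = Counter(labels)
        self.log_prior = {c: math.log(label_counts[c] / len(labels))
                          for c in self.classes}
        self.total = {c: sum(self.word_counts[c].values())
                      for c in self.classes}
        return self

    def log_scores(self, doc):
        """Unnormalized log P(class | doc) for every class."""
        vocab_size = len(self.vocab)
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in doc:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                numerator = self.word_counts[c][tok] + self.alpha
                denominator = self.total[c] + self.alpha * vocab_size
                score += math.log(numerator / denominator)
            scores[c] = score
        return scores

    def predict(self, doc):
        scores = self.log_scores(doc)
        return max(scores, key=scores.get)


# Hypothetical, already tokenized training comments.
train_docs = [["you", "are", "stupid"], ["stupid", "ugly", "loser"],
              ["have", "nice", "day"], ["you", "are", "kind"]]
train_labels = ["bullying", "bullying", "non-bullying", "non-bullying"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["stupid", "loser"]))
print(model.predict(["nice", "day"]))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.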
  • 21. CONFUSION MATRIX Fig : Confusion Matrix
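As an illustration of how a confusion matrix for the two classes can be computed and summarized, the sketch below counts the four cells and derives accuracy, precision and recall. The label strings, function names and sample predictions are hypothetical and are not the project's results.

```python
def confusion_counts(y_true, y_pred, positive="bullying"):
    """Count true/false positives and negatives for a binary task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts


def summarize(counts):
    """Derive accuracy, precision and recall; 0.0 when a ratio is undefined."""
    tp, fp, fn, tn = (counts[k] for k in ("tp", "fp", "fn", "tn"))
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels for eight test comments.
y_true = ["bullying"] * 3 + ["non-bullying"] * 5
y_pred = ["bullying", "bullying", "non-bullying",
          "bullying", "non-bullying", "non-bullying",
          "non-bullying", "non-bullying"]

cells = confusion_counts(y_true, y_pred)
print(cells)
print(summarize(cells))
```

Precision is the metric that reflects false positives, that is, harmless comments wrongly flagged as bullying.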
  • 25. DATA FLOW DIAGRAMS Fig. : Level 0 DFD
  • 27. Fig.: Level 1 DFD of user
  • 29. LEVEL 1.1 DFD OF ADMIN
  • 38. CONCLUSION The overall aim of the project “Cyberbullying Detection Using Machine Learning” is to develop a system that automatically classifies comments and messages as bullying or non-bullying and also removes the bullying comments from the web application.