Final Year Internship Presentation
on
Fake News Detection Using Machine Learning
Government Engineering College Ramanagara
Doddamannina Gudde, Near Janapadaloka, B.M. Road, Ramanagara, Karnataka 562159
Under the Guidance of
Dr. CHETHAN K C
Assistant Professor, Dept. of CSE,
GEC, Ramanagara
Presented by
Geetha C
CONTENTS
 INTRODUCTION ABOUT COMPANY
 TECHNOLOGIES LEARNT
 ABSTRACT
 INTRODUCTION TO PROJECT
 SYSTEM ARCHITECTURE
 METHODOLOGY
 CONCLUSION
COMPANY OVERVIEW
 Company name – SYSLOG TECHNOLOGIES
 Syslog Technologies is a fast-growing technology solutions and services provider.
 Founded in 2005 by a team of technology professionals with venture capital backing,
Syslog Technologies has built a successful track record of delivering end-to-end
solutions to its customers from various industrial sectors.
 Syslog Technologies has highly skilled and dedicated IT professionals who
provide customized IT solutions for several industries using their technical
expertise and experience.
 Our vision is to provide quality services that exceed the expectations of our
esteemed customers.
 Our mission is to build long-term relationships with our customers and clients
and provide exceptional customer service by pursuing business through
innovation and advanced technology.
TECHNOLOGIES LEARNT
 PYTHON
Python is an easy to learn, powerful programming language. It has efficient high-level data structures
and a simple but effective approach to object-oriented programming. Python’s elegant syntax and dynamic
typing, together with its interpreted nature, make it an ideal language for scripting and rapid application
development in many areas on most platforms.
 OPENCV
OpenCV (Open Source Computer Vision Library) is an open-source library of programming
functions aimed mainly at real-time computer vision applications. The library contains more than 500
optimized algorithms and is used around the world, with about forty thousand people in its user group.
It is written mainly in C and C++, which makes it portable to platforms such as
digital signal processors. Bindings for other languages, including Python, have been
developed to encourage adoption by a wider audience.
 ANACONDA
Anaconda is a distribution of the Python and R programming languages for scientific
computing (data science, machine learning applications, large-scale data processing, predictive
analytics, etc.), that aims to simplify package management and deployment. The distribution includes
data-science packages suitable for Windows, Linux, and macOS.
ANACONDA NAVIGATOR
Anaconda Navigator is a desktop graphical user interface (GUI) included in the Anaconda
distribution that allows users to launch applications and manage conda packages, environments and
channels without using command-line commands.
 The following applications are available by default in Navigator:
 JupyterLab
 Jupyter Notebook
 QtConsole
 Spyder
 Glue
 Orange
 RStudio
 Visual Studio Code
JUPYTER NOTEBOOK
 Project Jupyter is a project to develop open-source software, open standards, and services for interactive
computing across multiple programming languages.
 Jupyter has developed and supported the interactive computing products Jupyter Notebook, JupyterHub, and
JupyterLab. Jupyter is financially sponsored by NumFOCUS.
SPYDER
Spyder is an open-source cross-platform integrated development environment (IDE) for scientific
programming in the Python language. Spyder integrates with a number of prominent packages in the
scientific Python stack, including NumPy, Matplotlib, pandas, IPython, SymPy and Cython, as well as other
open-source software.
ABSTRACT
 This project applies NLP (Natural Language Processing) techniques to detect
‘fake news’, that is, misleading news stories that come from non-reputable
sources.
 Is it possible to build a model that can differentiate between “real” news and
“fake” news? The proposed work assembles a dataset of both fake and real
news and uses it to build a model that classifies an article as fake or real
based on its words and phrases.
INTRODUCTION
 Fake news spreads like wildfire, and this is a big issue in this era.
 Fake news today takes many forms, from satirical articles to fabricated
stories and planted government propaganda in some outlets.
 Fake news and lack of trust in the media are growing problems with huge
ramifications in our society.
 The aim is to produce a model that can accurately predict the likelihood that a
given article is fake news.
 The model is trained and tested on labeled data: supervised learning means that
each article in the dataset carries a real or fake label.
PROPOSED SYSTEM
1) The model is built on a count vectorizer or a tf-idf matrix, i.e., word tallies
weighted relative to how often the words are used in other articles in the dataset.
2) The main design choices in developing the model are the text transformation (count
vectorizer vs. tf-idf vectorizer) and the type of text to use (headlines vs.
full text).
3) The output is clear and understandable to the user. The system is intended to give
accurate predictions, be user friendly, and respond quickly.
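The two text transformations named above can be sketched with only the Python standard library, so that the arithmetic is visible. This is a minimal illustration, not the project's code: the tokenizer, the function names, and the three sample headlines are assumptions, and the weight used is the plain tf × log(N / df) form, whereas library vectorizers usually apply smoothed or normalized variants.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def count_vectors(docs):
    """Count transformation: raw word tallies for each document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tfidf_vectors(docs):
    """Tf-idf transformation: scale each tally by log(N / document frequency)."""
    counts = count_vectors(docs)
    n_docs = len(docs)
    # Iterating a Counter yields each distinct word once, so this counts
    # the number of documents in which every word appears.
    doc_freq = Counter(word for c in counts for word in c)
    return [
        {word: tf * math.log(n_docs / doc_freq[word]) for word, tf in c.items()}
        for c in counts
    ]


# Hypothetical headlines, made up for this example.
docs = [
    "shocking cure found doctors hate it",
    "council approves budget for new school",
    "shocking budget secret council hides",
]

counts = count_vectors(docs)
tfidf = tfidf_vectors(docs)

# "shocking" appears in two headlines, "cure" in only one, so tf-idf
# gives "cure" the larger weight even though both have a tally of 1.
print(counts[0]["shocking"], counts[0]["cure"])
print(tfidf[0]["shocking"], tfidf[0]["cure"])
```

The same functions can be applied to headlines or to full article text, which is the second design choice listed above.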
SYSTEM ARCHITECTURE
 Step 1: Read the dataset.
 Step 2: Random sampling is done on the dataset to make it balanced.
 Step 3: Divide the dataset into two parts, i.e., a train dataset and a test dataset.
 Step 4: Feature selection is applied for the proposed models.
 Step 5: Accuracy and other performance metrics are calculated to compare the
efficiency of the different algorithms.
 Step 6: The best algorithm for the given dataset is then selected based on efficiency.
 System Architecture
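The six steps can be walked through end to end in a short standard-library sketch. Everything here is an illustrative assumption rather than the project's implementation: the in-memory rows stand in for the real dataset, balancing is done by random undersampling, and the two toy classifiers (`MajorityBaseline` and `WordVote`) are hypothetical placeholders for the algorithms actually compared.

```python
import random
from collections import Counter


def balance(rows, seed=0):
    """Step 2: randomly undersample every class to the size of the smallest."""
    rng = random.Random(seed)
    by_label = {}
    for text, label in rows:
        by_label.setdefault(label, []).append((text, label))
    smallest = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, smallest))
    rng.shuffle(balanced)
    return balanced


def split(rows, test_fraction=0.25):
    """Step 3: hold out the tail of the already shuffled rows for testing."""
    n_test = max(1, int(len(rows) * test_fraction))
    return rows[:-n_test], rows[-n_test:]


class MajorityBaseline:
    """Always predicts the most common training label."""

    def fit(self, rows):
        self.label = Counter(label for _, label in rows).most_common(1)[0][0]
        return self

    def predict(self, text):
        return self.label


class WordVote:
    """Step 4 in miniature: words are the features, and each known word
    votes for the labels it was seen with during training."""

    def fit(self, rows):
        self.votes = {}
        for text, label in rows:
            for word in text.lower().split():
                self.votes.setdefault(word, Counter())[label] += 1
        self.fallback = MajorityBaseline().fit(rows).label
        return self

    def predict(self, text):
        tally = Counter()
        for word in text.lower().split():
            tally.update(self.votes.get(word, {}))
        return tally.most_common(1)[0][0] if tally else self.fallback


def accuracy(model, rows):
    """Step 5: fraction of test articles that are labeled correctly."""
    return sum(model.predict(text) == label for text, label in rows) / len(rows)


# Step 1: stand-in for reading the dataset (all rows are made up).
rows = [
    ("council approves budget for new school", "real"),
    ("city opens new public library", "real"),
    ("court rules on water dispute", "real"),
    ("minister presents annual budget", "real"),
    ("school exam results announced", "real"),
    ("rail line opens next month", "real"),
    ("shocking cure doctors hate", "fake"),
    ("secret plot revealed shocking truth", "fake"),
    ("miracle pill melts fat overnight", "fake"),
    ("aliens run the city council", "fake"),
]

balanced = balance(rows)
train, test = split(balanced)
results = {
    type(model).__name__: accuracy(model.fit(train), test)
    for model in (MajorityBaseline(), WordVote())
}
best = max(results, key=results.get)  # Step 6: keep the best-scoring algorithm
print(results, best)
```

With a dataset this small the scores mean little; the point is only the order of operations, in particular that balancing and splitting happen before any model sees the data.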
METHODOLOGY
 The proposed approach uses machine learning algorithms to classify news as
fake or real.
 Each model is trained multiple times with different sets of parameters, using a grid search
to find the configuration that gives the best outcome.
 The fake news detection problem can thus be addressed with machine learning methods.
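A grid search of the kind described above simply trains the model once per parameter combination and keeps the combination that scores best on held-out data. The sketch below shows the idea with the standard library only. It is an assumption-laden illustration: the tiny Naive Bayes classifier, its two parameters (`alpha` for smoothing, `min_count` for a vocabulary cutoff), the grid values, and the sample rows are all hypothetical, and a single validation split is used where a library grid search would typically use cross-validation.

```python
import itertools
import math
from collections import Counter


def train_nb(rows, alpha, min_count):
    """Fit a tiny multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = {}          # label -> Counter of word tallies
    label_counts = Counter()  # label -> number of articles
    for text, label in rows:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    total = Counter()
    for counts in word_counts.values():
        total.update(counts)
    # Keep only words seen at least min_count times across all articles.
    vocab = {word for word, n in total.items() if n >= min_count}
    return {"alpha": alpha, "vocab": vocab,
            "words": word_counts, "labels": label_counts}


def predict_nb(model, text):
    """Return the label with the highest log-probability for the text."""
    words = [w for w in text.lower().split() if w in model["vocab"]]
    n_articles = sum(model["labels"].values())
    best_label, best_score = None, -math.inf
    for label, n in sorted(model["labels"].items()):
        counts = model["words"][label]
        denom = (sum(counts[w] for w in model["vocab"])
                 + model["alpha"] * len(model["vocab"]))
        score = math.log(n / n_articles)
        for w in words:
            score += math.log((counts[w] + model["alpha"]) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def grid_search(train, valid, grid):
    """Try every parameter combination; keep the one with the best accuracy."""
    names = sorted(grid)
    best_params, best_acc = None, -1.0
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        model = train_nb(train, **params)
        acc = sum(predict_nb(model, t) == y for t, y in valid) / len(valid)
        if acc > best_acc:
            best_params, best_acc = params, acc
    return best_params, best_acc


# Hypothetical training and validation rows, made up for this example.
train = [
    ("council approves school budget", "real"),
    ("court rules on water dispute", "real"),
    ("minister presents annual budget", "real"),
    ("shocking cure doctors hate", "fake"),
    ("shocking secret plot revealed", "fake"),
    ("miracle cure melts fat", "fake"),
]
valid = [
    ("council presents budget", "real"),
    ("shocking miracle plot", "fake"),
]
grid = {"alpha": [0.1, 1.0], "min_count": [1, 2]}

best_params, best_acc = grid_search(train, valid, grid)
print(best_params, best_acc)
```

The cost grows with the product of the grid sizes (here 2 × 2 = 4 training runs), which is why grids are usually kept small.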
REQUIREMENTS
 SOFTWARE
 Anaconda Navigator (used as the hub for launching applications)
 Microsoft Excel
 HARDWARE
 Processor: i3 or above
 RAM: 4 GB or above
RESULTS AND DISCUSSION
CONCLUSION
 The feasibility of the project was analyzed, and a business proposal was put
forth with a very general plan for the project and some cost estimates.
 An innovative model for fake news detection using machine learning algorithms
has been presented.
 The model takes news events as input and, based on Twitter reviews and
classification algorithms, predicts the percentage likelihood that the news is fake or real.
 The feasibility analysis is meant to ensure that the proposed system is not a burden to the company.