This document provides a literature survey of email filtering. It discusses different areas of email filtering application including spam filtering, generalized filtering and segregation of emails, and inbound and outbound email filtering. It also outlines some issues faced in email filtering such as spammers avoiding terms treated as spam, the double opt-in problem used by spammers, and the encrypted email problem.
IRJET- Information Retrieval from Chat Application | IRJET Journal
This document describes a proposed enhancement to messaging applications that would allow users to more easily search for and retrieve past messages. The proposed system would add a search feature that allows users to search for messages from a specific person by entering their name and date/time. This would help users avoid having to scroll through long chat histories to find an old message. The document outlines the existing problem, proposed solution, system requirements, architecture, modules, testing approach, advantages, and applications of the proposed search feature for messaging applications.
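The search described above (filter a chat history by sender name and date) can be sketched as follows. This is only an illustrative sketch: the `Message` class, the `search_messages` function, and the sample history are hypothetical names and data, not taken from the summarized paper.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass
class Message:
    """Hypothetical chat message record (assumed fields)."""
    sender: str
    sent_at: datetime
    text: str


def search_messages(history, sender, on_date):
    """Return messages from `sender` sent on `on_date`, oldest first."""
    wanted = sender.strip().lower()
    hits = [
        m for m in history
        if m.sender.lower() == wanted and m.sent_at.date() == on_date
    ]
    return sorted(hits, key=lambda m: m.sent_at)


# Made-up sample history for illustration only.
history = [
    Message("Asha", datetime(2020, 3, 1, 9, 30), "Meeting at 10?"),
    Message("Ravi", datetime(2020, 3, 1, 9, 45), "Sure."),
    Message("Asha", datetime(2020, 3, 2, 18, 5), "Notes attached."),
]
results = search_messages(history, "asha", date(2020, 3, 1))
```

Matching the name case-insensitively is a design choice made here so that a user does not have to remember exact capitalization.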
The document outlines the software requirements specification for the OnlineTest-WifiStudy.com website, which aims to provide an online platform for users to independently connect with experts, take online tests and mock exams, and prepare for various tests and jobs. It describes the purpose, scope, functions, users, environment, design constraints, and both functional and non-functional requirements of the system. The functional requirements include features like sign up, login, search, categories, reset password, feedback, logout, contact support, ask doubts, discussion forums, and job listings.
The document describes the requirements and design of an address book application. It must allow users to add, edit, delete, and print contacts. Contacts contain name, address, city, state, zip, and phone number fields. The application must sort contacts alphabetically by last name or zip code. Non-functional requirements include security, performance, and usability. The system will be designed using a repository architecture model to share contact data between components. It will undergo testing, verification, and validation to ensure requirements are met.
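The sorting requirement above (alphabetical by last name, or by zip code) might look like the sketch below. The `Contact` class and `sort_contacts` function are hypothetical; splitting the name into first and last parts is an assumption made so that last-name sorting is possible.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    """Hypothetical contact record with the fields listed in the summary."""
    first_name: str
    last_name: str
    address: str
    city: str
    state: str
    zip_code: str
    phone: str


def sort_contacts(contacts, by="last_name"):
    """Return a new list sorted by last name or by zip code."""
    if by == "last_name":
        # Ties on last name are broken by first name (an assumption).
        key = lambda c: (c.last_name.lower(), c.first_name.lower())
    elif by == "zip_code":
        # Zip codes are compared as strings so leading zeros are kept.
        key = lambda c: (c.zip_code, c.last_name.lower())
    else:
        raise ValueError("by must be 'last_name' or 'zip_code'")
    return sorted(contacts, key=key)


# Made-up sample data for illustration only.
book = [
    Contact("Maya", "Rao", "1 Elm St", "Austin", "TX", "73301", "555-0101"),
    Contact("Liam", "Chen", "9 Oak Ave", "Boston", "MA", "02108", "555-0102"),
]
```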
This document provides an overview of the Data Tag project, which aims to intelligently tag textual data and web pages based on their semantic context rather than just keywords. It begins with an introduction describing the purpose, system overview, and problem statement. It then discusses requirements such as user characteristics, functional requirements, dependencies, and constraints. The design section covers the functional design using data flow diagrams, database design using Redis, and GUI design. It also describes the coding, testing, installation, user instructions, future work, and provides a summary.
1. The document describes a voice-based email system that allows users to compose and read emails using only their voice. It transforms voice inputs into text that can be inserted into emails, and converts email text to speech for reading aloud.
2. The system is designed to help people with disabilities who cannot use keyboards. It allows users to access email functions like inbox, sent messages, drafts, and trash simply by using voice commands and mouse clicks.
3. Key features include speech recognition for typing messages vocally and text-to-speech for reading emails aloud. The system aims to make email more accessible for visually impaired users by eliminating the need for screen readers and keyboard shortcuts.
1) The document describes a student management system project in C programming. It includes details like the student's name, ID, course code, and a table of contents for the project.
2) It introduces the current paper-based student record keeping system and proposes a computerized student management system to address issues like data security, accessibility and efficiency.
3) The proposed system aims to provide a user-friendly interface for basic student data management like adding, modifying and searching records, with username/password security for authorized access only.
LABRARY MANAGEMENT SYSTEM By ARPIT TRIPATHI | Arpit Tripathi
This document provides an overview of a library management system project being developed by students Arpit Tripathi and Mohd Osama Khan at Integral University in Lucknow, India. The project is being developed under the supervision of Assistant Professor Malik Shahzad Ahamed Iqbal and Lab Instructor Abida Khanam to partially fulfill the requirements for a Bachelor of Computer Application degree. The document outlines the aims, objectives, background, and requirements of the library management system as well as providing details on the system analysis, design, implementation, and testing of the project.
This document provides a software requirements specification for Skype. It begins with an introduction and definitions of terms. The overall description then states that Skype is a freemium voice-over-IP service that allows users to communicate via video calling, messaging, and file sharing. It supports calls between Skype users as well as to mobile and landline phones. The document outlines user classes, hardware requirements, functional requirements including downloading the app, user registration, login, profile management, calling, messaging and sharing features. It also covers design constraints, assumptions and non-functional requirements.
This document provides guidelines for the Master of Computer Applications (MCA) project at Indira Gandhi National Open University. It outlines the calendar for project submission, the process for approving project proposals, eligibility criteria for project guides, points to consider when preparing project proposals and reports, assessment guidelines, software areas, and reimbursement for guides. Students must work on their project for at least six months, preferably in industry. The objective is to develop quality software using standard practices like requirements analysis, design, development, testing and documentation. Topics should be complex enough to justify an MCA project.
Phone book with project report for BCA, MCA | Sp Gurjar
This document appears to be a project report for a Phone Book application developed in Visual Basic .NET. The report includes chapters covering an introduction to the project, requirements specification, system design, implementation details through code snippets, testing plans, and conclusions. The Phone Book application allows users to store, search, update, and delete contact information from a central database for easy access from anywhere. Administrative users can manage the data while regular users can only view contacts. The system aims to simplify contact management and storage compared to traditional paper phone books.
IRJET- Question-Answer Text Mining using Machine Learning | IRJET Journal
This document discusses using machine learning techniques to minimize duplicate questions on question and answer platforms. It proposes a system that uses a sequence-to-sequence model within a recurrent neural network to extract keywords from questions and find similar existing questions. When a new question is submitted, these keywords are used to search the database and suggest relevant, previously asked questions to the user. This aims to reduce redundancy and help users more efficiently find answers by navigating to similar existing questions. The system architecture includes user registration and authentication, an Express server backend, MongoDB database, and a machine learning model for keyword extraction and question matching.
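The matching step above can be illustrated with a much simpler stand-in. The paper uses a sequence-to-sequence recurrent model for keyword extraction; the sketch below swaps that for a stop-word filter and ranks existing questions by Jaccard overlap of keywords. The stop-word list, the threshold, and the function names are all assumptions made for illustration.

```python
import re

# Small, hand-picked stop-word list (an assumption, not from the paper).
STOP_WORDS = {
    "a", "an", "the", "is", "are", "how", "what", "do", "i",
    "in", "to", "of", "for", "can", "my",
}


def keywords(question):
    """Lowercase the question and keep words that are not stop words."""
    words = re.findall(r"[a-z0-9]+", question.lower())
    return {w for w in words if w not in STOP_WORDS}


def suggest_similar(new_question, existing, threshold=0.5):
    """Return existing questions whose keyword overlap meets the threshold."""
    new_kw = keywords(new_question)
    scored = []
    for question in existing:
        kw = keywords(question)
        union = new_kw | kw
        score = len(new_kw & kw) / len(union) if union else 0.0
        if score >= threshold:
            scored.append((score, question))
    # Highest overlap first.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [question for _, question in scored]


# Made-up sample questions for illustration only.
existing = [
    "How do I reverse a list in Python?",
    "What is a Python decorator?",
]
suggestions = suggest_similar("How to reverse a Python list?", existing)
```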
Candidate Ranking and Evaluation System based on Digital Footprints | IOSRjournaljce
Digital resume provides insights about a candidate to the organization. This paper proposes a system where digital resumes of candidates are generated by extracting data from social networking sites like Facebook, Twitter and LinkedIn. Data which is relevant to recruitment is obtained from unstructured data using Data Mining algorithms. Candidates are evaluated based on their digital resumes and ranked accordingly. Ranking is done based on the requirements specified by an organization for a key position. The key aspects of this paper are a) Specification and design of system. b) Generation of digital Resume. c) Ranking of candidates. According to the ranking provided by this system, Recruiters can shortlist candidates for interviews. Thus, it revolutionizes the traditional recruitment process.
Advanced Question Paper Generator Implemented using Fuzzy Logic | IRJET Journal
This document describes an advanced question paper generator system implemented using fuzzy logic. The system allows professors to generate question papers automatically by selecting the difficulty level and pattern. It uses fuzzy logic to determine the difficulty level of questions based on their analytical and descriptive quotients stored in the database. The system provides authentication and authorization for professors and admin. Professors can add, update and delete questions for the subjects allocated to them. The admin can manage user accounts and subject allocations. The generated question papers are in PDF format for ease of use and security.
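One way a difficulty level could be derived from the two quotients with fuzzy logic is sketched below. The summary does not give the membership functions or rules, so everything here is assumed: quotients are taken to lie in [0, 1], the membership functions are simple triangular shapes, and the three rules are illustrative only.

```python
def low(x):
    """Membership in 'low': 1 at x=0, falling to 0 at x=0.5 (assumed shape)."""
    return max(0.0, 1.0 - 2.0 * x)


def medium(x):
    """Membership in 'medium': peak of 1 at x=0.5 (assumed shape)."""
    return max(0.0, 1.0 - abs(2.0 * x - 1.0))


def high(x):
    """Membership in 'high': 0 at x=0.5, rising to 1 at x=1 (assumed shape)."""
    return max(0.0, 2.0 * x - 1.0)


def difficulty(analytical, descriptive):
    """Pick the difficulty label with the strongest rule activation."""
    for value in (analytical, descriptive):
        if not 0.0 <= value <= 1.0:
            raise ValueError("quotients must lie in [0, 1]")
    degrees = {
        # Easy only if both quotients are low (fuzzy AND = min).
        "easy": min(low(analytical), low(descriptive)),
        # Medium if either quotient is medium (fuzzy OR = max).
        "medium": max(medium(analytical), medium(descriptive)),
        # Hard if either quotient is high (fuzzy OR = max).
        "hard": max(high(analytical), high(descriptive)),
    }
    return max(degrees, key=degrees.get)
```

With these assumed rules, a question that is strongly analytical is labelled hard even if its descriptive quotient is low.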
Pranavi verma-it 402 class ix-unit 11_presentation | PranaviVerma
This document provides an overview of email messaging and covers various topics related to using email. It discusses the basics of email, creating email accounts with Gmail and Outlook, linking email accounts to email applications, writing and sending email messages, receiving and responding to emails, using the ribbon interface in Outlook, formatting email text, attaching files to emails, and more. The goal is to teach users how to effectively use email.
This is a VTU final-year project report; the full report is attached below. The front pages, attached along with it, follow all the guidelines specified by VTU according to our college.
Mohammad Jasim Uddin is seeking an IT job and has over 10 years of experience in IT support roles. He currently works as a System Support Engineer for Airtel Bangladesh Ltd where he performs desktop and laptop troubleshooting, network setup and support, and user maintenance. Previously he held IT roles at Shine Enterprise and Divine Computer & Services where he also provided IT hardware and networking support. Uddin has a Bachelor's degree in Computer Science and Engineering and certifications in Cisco CCNA, Linux, Windows Server, and Oracle.
The document describes ConnectMe, a social media application that allows users to connect with friends, share updates and locations. Key features include digitizing visiting cards, automatically updating contacts, finding friends on a map, and accessing social media updates. The application has a client-server architecture and uses technologies like optical character recognition and natural language processing to extract text from scanned cards and integrate with social networks. It follows standard software development practices including use case modeling, UML diagrams, component structuring, and milestone planning.
This document describes a college enquiry chatbot that was developed to provide students with a way to get information about their college without having to visit in person. The chatbot uses algorithms to analyze user queries and respond to common questions about things like fees, admission processes, exams, and other college activities. It was created to reduce the time and effort spent by students and parents in obtaining information from the college. The chatbot system includes a database to store question and answer pairs, and an admin interface to update responses for questions not currently in the database.
A privacy learning objects identity system for smartphones based on a virtu... | ijcsit
Smartphones are widely used today and offer many features, such as GPS map navigation, photo capture with a built-in digital camera, and internet connection via Wi-Fi or 3G, so that they function as computers. These devices are used for various purposes, including online learning, where learners can study anywhere and at any time, for example in the street, at home, in the office, or at school. However, the methods by which teachers in these virtual environments remember their learners' "faces" in the classroom or manage a student identification number (student ID or user) are not reliable when the teacher cannot see all of the learners in the class or know who is online from a particular account. In this paper, we propose a system, Android Virtual Learner Identify (AVLI), which collects images of the learning object's face captured directly from the camera, the location where the learner is studying, and configuration information including time, MAC and IP addresses, IMEI number, and GPS location. The system then saves learner profiles to help teachers or education managers on the Virtual Learning Environment (VLE) identify learning objects. We used the VLE that we built on the mobile.ona.vn domain. We implemented the AVLI prototype on an Android phone, with password encryption and images taken directly from the camera, to ensure that the information is transmitted and stored securely in the Virtual Learning Environment System Database (VLE Data) of learning objects while preserving the ability of a teacher or education manager to identify learning objects.
IRJET- Development of College Enquiry Chatbot using Snatchbot | IRJET Journal
This document describes the development of a college enquiry chatbot using SnatchBot. The chatbot was developed to provide information to users about college activities and ease the workload of office staff. It uses natural language processing and a keyword matching algorithm to match user queries to responses from its knowledge base. If no match is found, the user is provided a default message. The chatbot has a user-friendly GUI and is accessible anytime via web. It also allows users to provide feedback if answers are invalid, which is sent to the admin for knowledge base updates.
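The keyword matching with a default fallback described above can be sketched in a few lines. This does not use SnatchBot itself; the knowledge base entries, the reply texts, and the `reply` function are hypothetical placeholders chosen for illustration.

```python
import re

# Fallback used when no keyword matches (placeholder wording).
DEFAULT_REPLY = "Sorry, I don't have an answer for that yet."

# Hypothetical knowledge base: (keywords, answer) pairs.
KNOWLEDGE_BASE = [
    ({"fee", "fees", "tuition"},
     "Fee details are available from the accounts office."),
    ({"admission", "admissions", "apply"},
     "Admission details are available from the admissions office."),
    ({"exam", "exams", "timetable"},
     "Exam timetables are posted on the notice board."),
]


def reply(query):
    """Return the answer whose keywords overlap most with the query."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_answer, best_score = DEFAULT_REPLY, 0
    for entry_keywords, answer in KNOWLEDGE_BASE:
        score = len(words & entry_keywords)
        if score > best_score:
            best_answer, best_score = answer, score
    return best_answer
```

A query with no matching keyword, such as "Where is the canteen?", falls through to `DEFAULT_REPLY`; in the described system that is also the point where user feedback could be forwarded to the admin.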
IJRET: International Journal of Research in Engineering and Technology is an international, peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Accessing remote android mobile contents | eSAT Journals
not only used for calling purposes. Today all mobile users use Android phones and can store images, videos, files, contacts, etc. Nowadays mobile phones are used not only for calling but also for important work such as banking, corporate tasks, and education. A user who forgets the phone at home may need some important data from it remotely. Several Android applications available for smartphones today can help with this. This paper presents an Android app for this purpose, "Accessing Remote Android Mobile Contents". The app takes the user's requirement as input and retrieves the requested data. The application first requires an internet connection; if connected, it opens a login prompt in which the user must provide the unique user ID and password chosen during installation of the app. Once the user is validated by the remote mobile phone, a menu appears on the user's screen. The user selects a menu option and sends it to the remote phone, and then receives the data (contacts) from the remote mobile.
Keywords: Android, Remote Mobile, Local Mobile, Eclipse.
IRJET- College Enquiry Chat-Bot using API.AI | IRJET Journal
This document describes a college enquiry chatbot developed using API.ai (now known as Dialogflow). The chatbot is an Android application that allows students to get answers to their college-related queries without having to visit the college in person. It analyzes user queries using natural language processing and responds in text and audio format by integrating text-to-speech. The chatbot was built using Dialogflow to match user inputs with predefined intents and return appropriate responses from a database of FAQs. It aims to provide students with a convenient way to stay updated on college activities and information.
The document provides steps to create an Android quiz app with a relative layout, background image, image buttons, and text styling. It describes adding a relative layout, adding multiple versions of images for different screens, creating image buttons and positioning them, adding accessibility content descriptions, styling text with a custom font, and making the buttons interactive by adding onclick attributes.
Phase 1 Documentation (Added System Req) | Reinier Eiman
This document outlines the requirements for developing an Administration of Sick Notes system. It will allow lecturers and secretaries at Cape Peninsula University of Technology to store and retrieve student sick note records digitally. The system will use Java for development, NetBeans as the IDE, and an Oracle database. It will have administrator and user functions like uploading scanned sick notes and student IDs, and retrieving student records. The system architecture involves a student providing their sick note and ID to a secretary, who will scan them into the student's digital file. Lecturers can then access generated student reports on absences. The goal is to improve on the current manual paper-based system.
Voice Based E-Mail System For Blind People Using Speech Recognition Technology | IRJET Journal
This document describes a voice-based email system that is being developed to help blind and visually impaired people send emails independently using only voice commands and mouse clicks. The proposed system uses speech recognition technology to allow users to login, compose emails by recording voice messages, read emails in their inbox by having the system read them aloud, and access sent emails and deleted emails from their trash folder. The system is being created using tools like Flask for the web framework, Python for programming, and HTML5 for structuring the website. It follows a top-down design approach and uses algorithms like speech recognition to enable the voice functionality.
IRJET - Voice based E-Mail for Visually Challenged | IRJET Journal
This document describes a voice-based email system designed for visually impaired users. The system allows blind users to access email functions like sending, receiving, and reading emails using only voice commands without needing a keyboard, mouse, or screen reader. Key features include text-to-speech and speech-to-text functions to convert between voice and text. The proposed system aims to make email more accessible for visually impaired people by eliminating the need to remember keyboard shortcuts or rely on others to access emails. It could help about 250 million people with visual impairments globally who currently have difficulty using the internet and email.
for important work like banking, corporate, education etc. In case, if user forgets mobile at home and wants some important data
from it remotely from one place to another place. For this there are several android applications available for smart phones today
that can help one to get it. This paper is presenting an android app for the same “Accessing Remote Android Mobile Contents”.
This app will take user requirement as input and retrieve it. This application requires internet connection first, if it is connected it
will open login prompt. In which user has to provide unique user id and password which is decided during installation of app.
Once the user is validated by remote mobile phone, user will see a menu on his/her screen. User has to select menu option form
that and send it to remote phone. User will receive data (contacts) from remote mobile.
Keywords: Android, Remote Mobile, Local Mobile, Eclipse.
IRJET- College Enquiry Chat-Bot using API.AIIRJET Journal
This document describes a college enquiry chatbot developed using API.ai (now known as Dialogflow). The chatbot is an Android application that allows students to get answers to their college-related queries without having to visit the college in person. It analyzes user queries using natural language processing and responds in text and audio format by integrating text-to-speech. The chatbot was built using Dialogflow to match user inputs with predefined intents and return appropriate responses from a database of FAQs. It aims to provide students with a convenient way to stay updated on college activities and information.
The document provides steps to create an Android quiz app with a relative layout, background image, image buttons, and text styling. It describes adding a relative layout, adding multiple versions of images for different screens, creating image buttons and positioning them, adding accessibility content descriptions, styling text with a custom font, and making the buttons interactive by adding onclick attributes.
Phase 1 Documentation (Added System Req)Reinier Eiman
This document outlines the requirements for developing an Administration of Sick Notes system. It will allow lecturers and secretaries at Cape Peninsula University of Technology to store and retrieve student sick note records digitally. The system will use Java for development, NetBeans as the IDE, and an Oracle database. It will have administrator and user functions like uploading scanned sick notes and student IDs, and retrieving student records. The system architecture involves a student providing their sick note and ID to a secretary, who will scan them into the student's digital file. Lecturers can then access generated student reports on absences. The goal is to improve on the current manual paper-based system.
Voice Based E-Mail System For Blind People Using Speech Recognition TechnologyIRJET Journal
This document describes a voice-based email system that is being developed to help blind and visually impaired people send emails independently using only voice commands and mouse clicks. The proposed system uses speech recognition technology to allow users to login, compose emails by recording voice messages, read emails in their inbox by having the system read them aloud, and access sent emails and deleted emails from their trash folder. The system is being created using tools like Flask for the web framework, Python for programming, and HTML5 for structuring the website. It follows a top-down design approach and uses algorithms like speech recognition to enable the voice functionality.
IRJET - Voice based E-Mail for Visually ChallengedIRJET Journal
This document describes a voice-based email system designed for visually impaired users. The system allows blind users to access email functions like sending, receiving, and reading emails using only voice commands without needing a keyboard, mouse, or screen reader. Key features include text-to-speech and speech-to-text functions to convert between voice and text. The proposed system aims to make email more accessible for visually impaired people by eliminating the need to remember keyboard shortcuts or rely on others to access emails. It could help about 250 million people with visual impairments globally who currently have difficulty using the internet and email.
The document provides details about a library management system project done by Sumedh Kumar Singh at MECON Limited, Ranchi under the guidance of Mr. P.K. Dubey. The project report includes sections on feasibility study, system architecture, database creation and tables, forms design, and deployment. The proposed system automates processes like book and member management, book issuing and returning, and calculates any fines. It aims to provide efficient services to users and reduce the workload for library staff.
This document describes the development of a web-based email client called the Aisha Email System. The system allows multiple users to login with the same credentials and send/receive personalized emails. It includes features like inbox, compose, address book, and attachment functionality. The system was developed using ASP.NET for the interfaces and MS SQL Server 2000 for centralized email storage. Security, authentication, and authorization measures were implemented throughout. The goal was to create a distributed and user-friendly email application accessible through a web browser.
This document describes a project to develop a system for detecting suspicious emails. The system aims to identify emails containing offensive or anti-social content and block them while also identifying suspicious users. It uses keyword analysis to determine if a user's emails contain suspicious terms. The proposed system has five modules: login, registration, administration, user, and mailing. The administration module allows administrators to manage keywords and blocked emails while the user module lets authenticated users compose and send emails. The overall goal is to enhance email security and help organizations like law enforcement.
This document describes an e-reception system developed by three students as a university project. The system was created using Visual Studio 2015 and SQLite database to manage resources for any organization. It allows for interaction between visitors and the organization electronically and helps update clients about activities. The system aims to simplify reception work and share information on events, training, exams and products. It was motivated by problems with accessing information on other academic websites and is intended to be easy to use across Windows devices.
This document describes a project to develop a graphical user interface (GUI) client email program that allows users to access remote message stores as if they were local. The project uses JavaMail API to provide email functionality like retrieving, composing, forwarding and deleting emails. It allows users to access their email accounts from multiple computers in a compatible way with Internet email standards. The project aims to provide a similar user experience to Outlook for various email tasks. It develops a GUI program that navigates users through different email functions. The overall goal is to gain insights into building real-time applications like Yahoo Mail and Gmail.
This document discusses various applications of the internet, with a focus on e-learning. It begins by introducing e-learning and describing it as learning facilitated through electronic means like online content, lectures, and tutorials. It then discusses e-learning processes, providing examples from IGNOU. MOODLE is introduced as a popular open-source learning management system. Advantages and disadvantages of e-learning are briefly covered. The document also mentions other electronic educational resources available online like journals, databases, and e-books. Finally, it discusses wikis as collaborative knowledge resources that allow users to easily edit and link pages.
1. The document describes a Students Club web-based chat application that allows students to communicate with text messaging in private chats or groups.
2. The application aims to help students easily discuss assignments, projects, and activities with their peers. It provides utilities to reduce distractions and make group discussions more productive.
3. The proposed system would create a centralized repository for user profiles and chat histories. It would allow students to easily access chat rooms and search for others in their department or field of study.
Ignou MCA 4th semester mini project report. College admission system. This project is based on real working system of University seat allocation to affiliate colleges. College admission system provide seat allocation process for various UG PG programs for every academic session.
Heart rate Encapsulation and Response Tool using Sentiment AnalysisIJECEIAES
Users of every system expect it to get better. Providing feedback to the owners or management was difficult but with the advent of technology, it has become handy. Users can now post their comments through online blogs, android apps and websites. Due to the enormous data piling up every second causes a problem in analyzing it. In this paper, sentiment analysis is used for analyzing comments and reviews for hospital management system are demonstrated with real time data. The tools, algorithms and methodology that could fetch accurate results is described. Experimental results indicate 90% of accuracy in proposed system. The review report generated would help the hospital management to identify the positive and negative feedback which further assists them in improving their facilities that could not only create customer satisfaction but also enhanced business processes.
This document describes a mail server project developed using Java. The mail server allows users within an intranet environment to communicate via electronic mail. It handles sending, receiving, and storing of emails. The project is divided into client and server modules. The client module uses Outlook Express for mailing, while the server module implements SMTP and uses servlets coded in Java. Key features include a global address book, support for POP3 and SMTP protocols, and security. The project was tested for errors and feasibility was analyzed. The mail server provides a user-friendly system to facilitate internal email communication.
Online dating system management project report.pdfKamal Acharya
The objective of our project is to develop an application that offers online dating services where individuals or users can find and contact each other over the internet to arrange a date usually with the objective of developing a romantic, personal and sexual relationship.
Users of an online dating service would currently provide personal information, to enable them to search the service provider's database for other individuals. Members use grade other members set, such as age range, gender and location.
IRJET - Voice based E-Mail for Visually ImpairedIRJET Journal
The document describes a voice-based email system for visually impaired people. It aims to allow visually impaired users to send and receive emails independently through voice commands without needing a screen reader or keyboard. The system uses speech recognition to convert voice inputs to text for composing emails and text-to-speech to read composed emails and responses aloud to users. It includes modules for user registration, login, accessing the inbox and sending emails entirely through voice while eliminating the need for keyboard shortcuts or screen readers. The system aims to improve accessibility and communication through email for visually impaired users.
The document describes the development of a web application for an online newspaper. It discusses the objectives, which are to provide daily news, breaking news, and make information easily accessible to people. It also covers the technologies used like PHP, MySQL, CSS, and the development models of waterfall and prototyping. Data gathering and analysis are explained as important parts of the initial analysis phase of the project.
The document summarizes a project report for a website called Global Freelancer. Global Freelancer is an online marketplace that allows businesses and individuals to outsource work globally. There are three types of users - administrators, service providers, and service buyers. The project report outlines the system study conducted, including defining problems with existing freelance systems, analyzing requirements, and assessing the feasibility of the proposed Global Freelancer system. It also provides details on the system design, coding and testing approach.
EMAIL FILTERING AND ANALYSIS
USING CLASSIFICATION ALGORITHMS
Submitted in partial fulfillment of the requirements
of the degree of
Bachelor of Engineering in Information Technology
By
Akshay Iyer
Dipti Pamnani
Akanksha Pandey
Karmanya Pathak
Supervisor:
Mrs. Jayshree Hajgude
Department of Information Technology
Vivekanand Education Society’s Institute of Technology
2013-14
Project Report Approval for B. E.
This project report entitled EMAIL FILTERING AND ANALYSIS USING
CLASSIFICATION ALGORITHMS by Akshay Iyer, Dipti Pamnani,
Akanksha Pandey, and Karmanya Pathak is approved for the degree of
Bachelor of Engineering in Information Technology.
Examiners
1.---------------------------------------------
2.---------------------------------------------
Supervisors
1.---------------------------------------------
2.---------------------------------------------
Chairman
-----------------------------------------------
Date:
Place:
Declaration
I declare that this written submission represents my ideas in my own words and
where others' ideas or words have been included, I have adequately cited and
referenced the original sources. I also declare that I have adhered to all principles
of academic honesty and integrity and have not misrepresented or fabricated or
falsified any idea/data/fact/source in my submission. I understand that any
violation of the above will be cause for disciplinary action by the Institute and can
also evoke penal action from the sources which have thus not been properly cited
or from whom proper permission has not been taken when needed.
-----------------------------------------
Akshay Iyer
-----------------------------------------
Dipti Pamnani
-----------------------------------------
Akanksha Pandey
-----------------------------------------
Karmanya Pathak
Date:
ACKNOWLEDGEMENT
This project has been a great learning experience for us. Through the course of this year, we have worked
as a team towards its successful completion. Although on paper it is only we who have made this
project, in reality there are several people without whom it could not have been designed and
finalized the way it looks now.
First of all, we would like to thank our Principal, Dr. (Mrs.) J. M. Nair, and our Vice-Principal, Dr.
S. Mukhopadhyay, for their support and guidance throughout the project implementation period. Without
their help, the project would not have been possible.
We are truly indebted to our internal project guide, Mrs. Jayshree Hajgude, for her immense
guidance and support. She has encouraged us and channelled our enthusiasm effectively.
We would like to thank Mrs. Vijayalakshmi Muralidharan, HOD of the Information Technology Department.
We would also like to thank our lab in-charges, Mr. Amar Jaiswar and Mr. Ulhas Pawar, who have been
very kind to us.
Last but not least, we want to thank our college, Vivekanand Education Society's Institute of
Technology, for providing us with excellent reference materials and great computing facilities.
ABSTRACT
With the various developments taking place in the field of technology, especially in
communication, a wide variety of malpractices have emerged which
may prove harmful to the user. Most of these are currently observed in a user's email account.
An email user has an Inbox containing a wide variety of mails, and these mails are present in an
unorganized manner. Some of the mails received by the user may also contain harmful content
with potentially severe consequences (normally termed spam). With this idea in mind, the
topic of our BE Project is Email Filtering.
Email Filtering is the process used to classify emails into various categories on the
basis of their content. The application fetches the emails from a user's id and stores them on a server.
It then classifies them into spam and non-spam using classification algorithms, and also into
user-defined categories on the basis of keywords entered by the user. The user can also send, forward and
reply to a particular mail. The application additionally performs historical spam analysis on the
basis of the content downloaded by the user. The user can access, read, store and copy the contents of his
email.
The project report begins with a short introduction to Email Filtering and the reasons we chose
this topic. This is followed by the Literature Survey, which describes the various areas where
similar operations are performed and the various features of Email Filtering. We also explain
the algorithms which we use to classify the emails.
The report then focuses on the Implementation Flow and various Use Case Designs, which help in
understanding the features of the project. This chapter is followed by the actual
implementation of the project, where you will find information about the various snippets of
code that are part of the project, along with a detailed explanation of each window of the Email
Filtering application. The next chapter displays screenshots of
the Email Filtering application and the various analyses performed by it, including different types
of graphical information. This is followed by the conclusion and the
future scope of the project, describing the features to be implemented in the future. The last
chapter consists of a list of references which played an important role in the
completion of the project.
Table of Contents
1. INTRODUCTION
1.1. What is Email Filtering
1.2. Motivation
1.3. Problem Definition
1.4. Objectives
2. LITERATURE SURVEY
2.1. Different areas of Application
2.2. Issues Faced
2.3. Recent Applications
3. ANALYSIS
3.1. C4.5 Algorithm
3.2. Naïve Bayes Algorithm
3.3. Formulae
4. DESIGN
4.1. Implementation Flow
4.2. Use Case Diagram
4.3. Class Diagram
4.4. Activity Diagram
5. IMPLEMENTATION
5.1. The Connection Dialog Box
5.2. The Email Client Window
5.3. The Message Dialog Box
5.4. The File Chooser
5.5. The Downloading Dialog Box
5.6. The Analysis Window
6. RESULTS
7. CONCLUSION
8. FUTURE SCOPE
9. REFERENCES
LIST OF IMAGES
1. A graph showing the rate of spam and its increase in the past few years
2. The Gmail Inbox, with the various folders into which mails get classified
3. A logo of Apache SpamAssassin
4. Implementation Flow
5. Use Case Diagram
6. Class Diagram
7. Activity Diagram
8. A screenshot of the connect dialog window
9. A screenshot of the home screen which opens once the user has logged in
10. A screenshot of the Main Page where all operations can be performed
11. A screenshot of the message viewer tab
12. The Save Dialog Box, which appears when "store in PC" has been clicked
13. A screenshot of the Messaging Tab
14. A screenshot of the New Message box
15. A screenshot of the Reply Message box
16. A screenshot of the Forward Message box
17. A screenshot of the credits page
18. The Message Dialog
19. The File Chooser
20. The Downloading Dialog
21. A screenshot of the Statistics tab
22. The Annual Spam Rate Report
23. The Monthly Spam Rate Report
24. The Weekly Spam Rate Report
25. Comparative Spam Rate Report
26. User Defined Messages Quantity
LIST OF TABLES
1. The structure of the login details table
2. The structure of the main table where all the mails are stored
3. The structure of the keyword table where all the keywords are stored
1.1 What is Email Filtering?
Email Filtering refers to the classification of an account's emails into two types:
Spam and
Non-Spam.
The user first logs in to his account using a valid id and password. Upon logging in, the user's mails
are fetched into the database and classified into spam and non-spam. The user can also create
custom labels, under which mails are classified using keywords provided by the user. He can also browse
unread or read emails. This makes the mail service easy to use and user-friendly.
A basic task in email filtering is to mine the data from an email and classify it into the different
categories using a data mining classification algorithm. Decision tree classification is a method
commonly used in data mining.
Email Filtering involves spam filtering, generalized filtering and segregation of emails, and filtering of
inbound and outbound emails. Spam mails are filtered out since they are not important to most users.
Generalized filtering and segregation is the sorting of mails into different categories such as sent and
non-spam.
Companies filter outbound emails so that sensitive data regarding the working of the company does not
leak, intentionally or accidentally, through email.
To summarize, email filtering:
Segregates inbound mails into different categories.
Filters outbound mails so as not to leak sensitive information.
The different categories into which the emails are classified are:
Spam
Non-Spam
The user can also define categories of his own choice by entering keyword values. These values are
then associated with all the mails that have been processed.
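The keyword-based custom labels described above can be illustrated with a minimal sketch. It assumes a simple case-insensitive substring match on the subject and body; the function name `assign_labels` and the sample keywords are hypothetical and are not taken from the project's code.

```python
def assign_labels(subject, body, label_keywords):
    """Return every user-defined label whose keywords occur in the mail.

    label_keywords maps a label name to a list of keywords (hypothetical
    structure, chosen only for this illustration).
    """
    text = f"{subject} {body}".lower()
    labels = []
    for label, keywords in label_keywords.items():
        # A label applies as soon as any one of its keywords is present.
        if any(keyword.lower() in text for keyword in keywords):
            labels.append(label)
    return labels


# Example: two user-defined categories with assumed keywords.
user_labels = {
    "Flipkart": ["flipkart", "order shipped"],
    "MakeMyTrip": ["makemytrip", "e-ticket"],
}
print(assign_labels("Your Flipkart order", "Order shipped today.", user_labels))
```

Substring matching is the simplest possible choice; a real application might instead match whole words or store the keywords in a database table.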
1.2. Motivation for this domain
With the increase in internet users, communication and the transfer of files and data over the internet
have increased drastically. In such times, it is difficult to know what kinds of
emails are entering your organisation or system.
Most present filtering techniques are unable to keep up with the frequently changing patterns of mail
that senders adopt over time.
A graph showing the rate of spam and its increase in the past few years
In absolute numbers, the average number of spam mails sent per day increased from 2.4 billion in
2002 to 300 billion in 2010.
Google recently announced security improvements to Gmail to further protect users' emails
from snooping. Gmail now always uses an encrypted HTTPS connection when you check or send
email, and encrypts all messages moving internally on Google's servers.
With the growth in technology, desktop-based email applications are increasingly
used. Outlook Express has changed the way the world reads and communicates with the help of
email.
1.3. Problem Definition
As the Internet grows at a phenomenal rate, electronic mail (abbreviated as E-mail) has become a
widely used electronic form of communication on the Internet. Every day, a huge number of people
exchange messages in this fast and inexpensive way. With growing excitement about electronic commerce,
the usage of E-mail will increase even more dramatically. However, the advantages of E-mail also
lead to its overuse by companies, organizations and people to promote products and spread information
that serves their own purposes. A user's mailbox may often be crammed with E-mail messages,
some or even a large portion of which are of no interest to her/him. Searching for interesting
messages every day is becoming tedious and annoying. As a consequence, a personal E-mail filter is
needed.
In recent years most communication has come to happen through e-mails, which are often affected
by passive or active attacks. Effective e-mail filtering measures are urgently required to handle
such attacks. The basic idea behind e-mail filtering is to organize incoming e-mails, using
a mail filter to prioritize messages and to sort them into folders based on subject matter or
other criteria.
The purpose of our application is to classify incoming mails into the following categories:
Spam and
Non Spam
There are also various other categories which can be created and defined by the user himself,
for example:
Facebook
Flipkart
Amazon
MakeMyTrip
1.4. Objectives
User Interactive
Whenever the user would like to make modifications to his application, he
will be able to do so easily and without any glitches.
The user will be able to use the application as per his requirements and reap its benefits.
Security
Security is an important issue which needs to be considered before going about the actual
procedure. The user should be able to use the client application safely, without fear of
security breaches or SQL injection attacks.
Spam Detection
This is the major aim of our project: to classify mails according to the
presence of malicious content which may be harmful to the user's computer and is therefore
regarded as spam.
User Defined Mail Analysis
This is a new feature included in our project.
The user defines his own keywords, and the mails are classified on the basis of those keywords.
He can then view all the concerned mails under one window, easily and without any glitch.
Historical Spam Analysis
This is another feature of our project.
All the mails received by the user can be analysed over time, and on the
basis of that analysis historical spam data can be reported.
The user can easily track when he received the most spam, including the year in which he received
the maximum amount of spam mail.
The user can do the same on a monthly and weekly basis.
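The yearly, monthly and weekly spam analysis described above amounts to grouping spam messages by time period and counting them. The following is a minimal sketch under assumed inputs: each message is a hypothetical `(received_date, is_spam)` pair, and weeks are taken to be ISO calendar weeks. The project's actual storage and reporting code may differ.

```python
from collections import Counter
from datetime import date


def spam_counts(messages, period="year"):
    """Count spam messages per year, month, or ISO week.

    messages: iterable of (received_date, is_spam) pairs (assumed format).
    """
    counts = Counter()
    for received, is_spam in messages:
        if not is_spam:
            continue
        if period == "year":
            key = received.year
        elif period == "month":
            key = (received.year, received.month)
        elif period == "week":
            iso = received.isocalendar()
            key = (iso[0], iso[1])  # (ISO year, ISO week number)
        else:
            raise ValueError(f"unknown period: {period}")
        counts[key] += 1
    return counts


# Example with made-up data: find the year with the most spam.
sample = [
    (date(2013, 1, 7), True),
    (date(2013, 1, 8), True),
    (date(2013, 6, 1), False),
    (date(2014, 2, 3), True),
]
worst_year, worst_count = spam_counts(sample, "year").most_common(1)[0]
print(worst_year, worst_count)
```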
2.1. Different areas of Application
Spam Filtering
With the growth of the Internet, the number of spam mails has increased as well.
A spam filter is a program that is used to detect unsolicited and unwanted email and prevent those
messages from getting to a user’s inbox. Like other types of filtering programs, a spam filter looks for
certain criteria on which it bases judgments.
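One way a spam filter can base its judgment on message content is a Naïve Bayes classifier, which the table of contents lists among the algorithms analysed in this report. The sketch below is a generic multinomial Naïve Bayes with Laplace smoothing written for illustration only. The class name, the whitespace tokenizer and the training sentences are assumptions, and it is not the project's implementation.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Tiny multinomial Naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.doc_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def classify(self, text):
        # Assumes at least one training mail has been seen for each class.
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in ("spam", "ham"):
            # Start from the log prior of the class.
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Laplace (add-one) smoothing avoids zero probabilities.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


spam_filter = NaiveBayesSpamFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting agenda for monday", "ham")
spam_filter.train("project report attached", "ham")
print(spam_filter.classify("claim your free prize"))
```

Working in log space keeps the product of many small probabilities from underflowing, which is why the scores are summed rather than multiplied.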
Generalized Filtering and Segregation of E-mails
Email filtering is the processing of email to organize it according to specified criteria. Most often this
refers to the automatic processing of incoming messages, but the term also applies to the intervention
of human intelligence in addition to anti-spam techniques, and to outgoing emails as well as those
being received.
Filtering mails into classes such as spam, travel and social, along with a country-based
classification of official mails for ease of access to mails from specific sub-branches, would help make
the mail service more efficient in terms of accessibility and user-friendliness.
Inbound and Outbound Filtering of E-mails
Mail filters can operate on inbound and outbound email traffic. Inbound email filtering involves
scanning messages from the Internet addressed to users protected by the filtering system or for lawful
interception.
Outbound email filtering involves the reverse – scanning email messages from local users before any
potentially harmful messages can be delivered to others on the Internet.
One method of outbound email filtering that is commonly used by Internet service
providers is transparent SMTP proxy, in which email traffic is intercepted and filtered via a
transparent proxy within the network.
Outbound filtering can also take place in an email server. Many corporations employ data leak
prevention technology in their outbound mail servers to prevent the leakage of sensitive information
via email.
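The data leak prevention check mentioned above can be sketched as a pattern scan applied to each outgoing message before delivery. The two patterns below (a 16-digit card-like number and the word "confidential") are hypothetical examples chosen for illustration; real products use far richer rule sets.

```python
import re

# Hypothetical rules: rule name -> compiled pattern.
SENSITIVE_PATTERNS = {
    # 16 digits, optionally separated by spaces or hyphens.
    "card_number": re.compile(r"\b(?:\d[ -]?){15}\d\b"),
    "confidential_marker": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def find_leaks(body):
    """Return the names of all rules that match the outgoing mail body."""
    return [name for name, pattern in SENSITIVE_PATTERNS.items()
            if pattern.search(body)]


def allow_outbound(body):
    """An outgoing mail is allowed only if no rule matches."""
    return not find_leaks(body)


print(find_leaks("Card: 4111 1111 1111 1111"))
print(allow_outbound("Lunch at noon?"))
```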
2.2. Issues Faced
Avoidance of vocabulary treated as Spam by Spammers
The subject and body content are chosen carefully by spammers. Being aware of terms, text
processing rules of a filter, etc. helps the spammers to use alternate words still serving the same
purpose yet not falling prey to the filter. This helps them to pass the filter and the mail is treated as a
non-spam mail which otherwise would have formed part of spam bulk.
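A related and widely seen form of this evasion is spelling obfuscation, where a filtered word is written with look-alike characters or inserted punctuation. The sketch below shows one possible countermeasure, normalizing text before matching it against a term list. The substitution table and the term list are assumptions made for illustration, and the approach does not address genuine synonyms.

```python
import re

# Hypothetical look-alike substitutions; real spam uses many more variants.
LOOKALIKES = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize(text):
    """Lower-case, undo look-alike characters and drop inserted symbols."""
    words = []
    for token in text.lower().split():
        token = token.translate(LOOKALIKES)
        token = re.sub(r"[^a-z]", "", token)  # removes dots, dashes, etc.
        if token:
            words.append(token)
    return " ".join(words)


def contains_spam_terms(text, spam_terms):
    """Check the normalized words against an assumed list of spam terms."""
    return any(word in spam_terms for word in normalize(text).split())


print(normalize("FR33 m0ney"))
print(contains_spam_terms("Claim your f.r.e.e pr1ze!", {"free", "prize"}))
```

Note that this normalization also mangles legitimate numbers and amounts, so it would only be suitable as one input to a classifier and not as a filter on its own.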
The Double Opt-In Problem
One of the main problems spammers face is obtaining explicit permission to mail a particular
user. A method they exploit to obtain it is Double Opt-In.
It works in the following manner:
1. The user enters their email address into an online form.
2. They receive a confirmation link.
3. On clicking the confirmation link, the spammer gets explicit permission to send mails to the user.
These mails, though actually spam, are then treated as normal, non-spam mails.
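The three steps above can be sketched as a small token-based state holder in Java. This is only an illustration of the general double opt-in mechanism; the class and method names (DoubleOptInSketch, requestSubscription, confirm, mayMail) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

class DoubleOptInSketch {
    // Tokens sent out in confirmation links, mapped to the address that requested them.
    private final Map<String, String> pending = new HashMap<>();
    // Addresses whose owners clicked the confirmation link.
    private final Set<String> confirmed = new HashSet<>();

    // Step 1: the address is entered into a form; step 2: a link carrying this token is mailed.
    String requestSubscription(String address) {
        String token = UUID.randomUUID().toString();
        pending.put(token, address);
        return token;
    }

    // Step 3: clicking the link confirms the address; each token works only once.
    boolean confirm(String token) {
        String address = pending.remove(token);
        if (address == null) {
            return false;
        }
        confirmed.add(address);
        return true;
    }

    // The sender treats only confirmed addresses as having given explicit permission.
    boolean mayMail(String address) {
        return confirmed.contains(address);
    }
}
```

Because the recipient has formally consented, a filter that trusts opted-in senders will let the later mails through.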
The Encrypted E-Mail Problem
The encrypted e-mail problem is one of the most important problems faced by e-mail client
applications. Most transaction notices from banks and corporate companies are sent to the
concerned user in an encrypted format in order to ensure security.
Many mails from telecom and multinational companies concerning payments or money transfers
are also sent in an encrypted format.
The message that appears in the user's inbox is not the readable mail itself; it is encrypted
with a key that can be retrieved only with user credentials, such as the user's bank account
number or password.
Thus, it is extremely difficult to classify mails in this format.
Gmail recently announced that it has taken a step forward in the correct classification of
encrypted mails, which it plans to implement soon.
2.3. Recent Applications
Gmail
Email filtering has been, and continues to be, developed and used by various email service
providers. Gmail recently added several categories besides spam, including travel and
promotions. This has helped Gmail users achieve an efficient classification of all
incoming mails. The effectiveness of Gmail filters was recorded at 99.05%.
The Gmail Inbox, which has various folders into which mails get classified
SpamAssassin
SpamAssassin is a mail filter to identify spam. It is an intelligent email filter which uses a diverse
range of tests to identify unsolicited bulk email, more commonly known as Spam. These tests are
applied to email headers and content to classify email using advanced statistical methods. In addition,
SpamAssassin has a modular architecture that allows other technologies to be quickly wielded against
spam and is designed for easy integration into virtually any email system.
A logo of the Apache SpamAssassin
3.1. The C4.5 Algorithm
C4.5 is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension
of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for
classification, and for this reason, C4.5 is often referred to as a statistical classifier.
C4.5 builds decision trees from a set of training data in the same way as ID3, using the concept
of information entropy. The training data is a set $S = \{s_1, s_2, \dots\}$
of already classified samples. Each sample $s_i$ consists of a p-dimensional vector
$(x_{1,i}, x_{2,i}, \dots, x_{p,i})$,
where the $x_j$ represent attributes or features of the sample, as well as the class in which $s_i$ falls.
At each node of the tree, C4.5 chooses the attribute of the data that most effectively splits its set of
samples into subsets enriched in one class or the other. The splitting criterion is the
normalized information gain (difference in entropy). The attribute with the highest normalized
information gain is chosen to make the decision. The C4.5 algorithm then recurses on the
smaller sublists.
This algorithm has a few base cases.
1. All the samples in the list belong to the same class. When this happens, it simply creates a leaf
node for the decision tree saying to choose that class.
2. None of the features provide any information gain. In this case, C4.5 creates a decision node
higher up the tree using the expected value of the class.
3. An instance of a previously unseen class is encountered. Again, C4.5 creates a decision node
higher up the tree using the expected value.
Pseudo code
In pseudo code, the general algorithm for building decision trees is:
1. Check for base cases
2. For each attribute a
   Find the normalized information gain ratio from splitting on a
3. Let a_best be the attribute with the highest normalized information gain
4. Create a decision node that splits on a_best
5. Recurse on the sublists obtained by splitting on a_best, and add those nodes as children of node
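Step 2 of the pseudo code can be made concrete with a short Java sketch that computes the normalized information gain (gain ratio) of one nominal attribute. It is a simplified illustration, not the project's or Weka's implementation: the class name GainRatioSketch is hypothetical, and it ignores the continuous attributes and missing values that full C4.5 also handles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GainRatioSketch {

    // Shannon entropy (in bits) of a list of nominal values.
    static double entropy(List<String> values) {
        Map<String, Integer> counts = new HashMap<>();
        for (String v : values) {
            counts.merge(v, 1, Integer::sum);
        }
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / values.size();
            h -= p * (Math.log(p) / Math.log(2));
        }
        return h;
    }

    // Gain ratio of splitting the samples by one attribute.
    // attributeValues.get(i) and labels.get(i) describe the same sample.
    static double gainRatio(List<String> attributeValues, List<String> labels) {
        Map<String, List<String>> partitions = new HashMap<>();
        for (int i = 0; i < labels.size(); i++) {
            partitions.computeIfAbsent(attributeValues.get(i), k -> new ArrayList<>())
                      .add(labels.get(i));
        }
        // Weighted entropy of the class labels after the split.
        double remainder = 0.0;
        for (List<String> part : partitions.values()) {
            remainder += ((double) part.size() / labels.size()) * entropy(part);
        }
        double gain = entropy(labels) - remainder;
        // Split information: entropy of the attribute values themselves.
        double splitInfo = entropy(attributeValues);
        return splitInfo == 0.0 ? 0.0 : gain / splitInfo;
    }

    public static void main(String[] args) {
        List<String> hasLink = Arrays.asList("yes", "yes", "no", "no");
        List<String> labels = Arrays.asList("spam", "spam", "ham", "ham");
        System.out.println(gainRatio(hasLink, labels));
    }
}
```

A tree builder would call gainRatio once per candidate attribute and split on the attribute with the largest value, then repeat on each partition.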
3.2. The Naïve Bayes Algorithm
A naive Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with
strong (naive) independence assumptions. A more descriptive term for the underlying probability
model would be "independent feature model". An overview of statistical classifiers is given in the
article on pattern recognition.
In simple terms, a naive Bayes classifier assumes that the value of a particular feature is unrelated to
the presence or absence of any other feature, given the class variable. For example, a fruit may be
considered to be an apple if it is red, round, and about 3" in diameter. A naive Bayes classifier
considers each of these features to contribute independently to the probability that this fruit is an
apple, regardless of the presence or absence of the other features.
For some types of probability models, naive Bayes classifiers can be trained very efficiently in
a supervised learning setting. In many practical applications, parameter estimation for naive Bayes
models uses the method of maximum likelihood; in other words, one can work with the naive Bayes
model without accepting Bayesian probability or using any Bayesian methods.
Despite their naive design and apparently oversimplified assumptions, naive Bayes classifiers have
worked quite well in many complex real-world situations. In 2004, an analysis of the Bayesian
classification problem showed that there are sound theoretical reasons for the apparently
implausible efficacy of naive Bayes classifiers. Still, a comprehensive comparison with other
classification algorithms in 2006 showed that Bayes classification is outperformed by other
approaches, such as boosted trees or random forests.
Advantages:
An advantage of naive Bayes is that it only requires a small amount of training data to estimate the
parameters (means and variances of the variables) necessary for classification. Because independent
variables are assumed, only the variances of the variables for each class need to be determined and not
the entire covariance matrix.
Probabilistic model:
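Abstractly, the probability model for a classifier is a conditional model
$$p(C \mid F_1, \dots, F_n)$$
over a dependent class variable $C$ with a small number of outcomes or classes, conditional on
several feature variables $F_1$ through $F_n$. The problem is that if the number of features $n$ is
large or when a feature can take on a large number of values, then basing such a model on
probability tables is infeasible. We therefore reformulate the model to make it more tractable.
Using Bayes' theorem, this can be written
$$p(C \mid F_1, \dots, F_n) = \frac{p(C)\, p(F_1, \dots, F_n \mid C)}{p(F_1, \dots, F_n)}.$$
In plain English, using Bayesian probability terminology, the above equation can be written
as
$$\text{posterior} = \frac{\text{prior} \times \text{likelihood}}{\text{evidence}}.$$
In practice, there is interest only in the numerator of that fraction, because the denominator does not
depend on $C$ and the values of the features $F_i$ are given, so that the denominator is effectively
constant. The numerator is equivalent to the joint probability model
$$p(C, F_1, \dots, F_n),$$
which can be rewritten as follows, using the chain rule for repeated applications of the definition
of conditional probability:
$$\begin{aligned}
p(C, F_1, \dots, F_n) &= p(C)\, p(F_1, \dots, F_n \mid C) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2, \dots, F_n \mid C, F_1) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2 \mid C, F_1)\, p(F_3, \dots, F_n \mid C, F_1, F_2) \\
&= \cdots \\
&= p(C)\, p(F_1 \mid C)\, p(F_2 \mid C, F_1) \cdots p(F_n \mid C, F_1, \dots, F_{n-1}).
\end{aligned}$$
Now the "naive" conditional independence assumptions come into play: assume that each
feature $F_i$ is conditionally independent of every other feature $F_j$ for $j \neq i$, given the
category $C$. This means that
$$p(F_i \mid C, F_j) = p(F_i \mid C),$$
$$p(F_i \mid C, F_j, F_k) = p(F_i \mid C), \qquad p(F_i \mid C, F_j, F_k, F_l) = p(F_i \mid C),$$
and so on, for $i \neq j, k, l$. Thus, the joint model can be expressed as
$$\begin{aligned}
p(C \mid F_1, \dots, F_n) &\propto p(C, F_1, \dots, F_n) \\
&= p(C)\, p(F_1 \mid C)\, p(F_2 \mid C)\, p(F_3 \mid C) \cdots \\
&= p(C) \prod_{i=1}^{n} p(F_i \mid C).
\end{aligned}$$
This means that under the above independence assumptions, the conditional distribution over the class
variable $C$ is:
$$p(C \mid F_1, \dots, F_n) = \frac{1}{Z}\, p(C) \prod_{i=1}^{n} p(F_i \mid C),$$
where the evidence $Z = p(F_1, \dots, F_n)$ is a scaling factor dependent only
on $F_1, \dots, F_n$, that is, a constant if the values of the feature variables are known.
Constructing a classifier from the probability model:
The discussion so far has derived the independent feature model, that is, the naive Bayes probability
model. The naive Bayes classifier combines this model with a decision rule. One common rule is to
pick the hypothesis that is most probable; this is known as the maximum a posteriori or MAP decision
rule. The corresponding classifier, a Bayes classifier, is the function $\mathrm{classify}$ defined as follows:
$$\mathrm{classify}(f_1, \dots, f_n) = \underset{c}{\operatorname{argmax}}\; p(C = c) \prod_{i=1}^{n} p(F_i = f_i \mid C = c).$$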
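The MAP decision rule described above can be sketched in plain Java for word features. This is a minimal illustration under stated assumptions, not the Weka classifier the project uses: the class name NaiveBayesSketch is hypothetical, features are whitespace-separated words, add-one (Laplace) smoothing is assumed so unseen words do not zero out a class, and log-probabilities are summed instead of multiplying probabilities to avoid underflow.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NaiveBayesSketch {
    private final Map<String, Integer> docsPerClass = new HashMap<>();
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> wordsPerClass = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    // Records one labelled message: updates the class prior and per-class word counts.
    void train(String label, String text) {
        docsPerClass.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String word : text.toLowerCase().split("\\s+")) {
            if (word.isEmpty()) {
                continue;
            }
            counts.merge(word, 1, Integer::sum);
            wordsPerClass.merge(label, 1, Integer::sum);
            vocabulary.add(word);
        }
    }

    // MAP rule in log space: argmax over c of log p(c) + sum_i log p(f_i | c).
    // Returns null if nothing has been trained yet.
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docsPerClass.keySet()) {
            double score = Math.log((double) docsPerClass.get(label) / totalDocs);
            Map<String, Integer> counts = wordCounts.get(label);
            int total = wordsPerClass.getOrDefault(label, 0);
            for (String word : text.toLowerCase().split("\\s+")) {
                if (word.isEmpty()) {
                    continue;
                }
                int c = counts.getOrDefault(word, 0);
                // Add-one smoothing over the known vocabulary.
                score += Math.log((c + 1.0) / (total + vocabulary.size()));
            }
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

The evidence term Z is never computed, since it is the same for every class and does not change which class attains the maximum.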
4.1. Implementation Flow
Overall flow: Home → Signup → Login → Creation of 2 tables in MySQL → Creation of 3 separate fields in the main table (Naïve Bayes, C4.5, Keyword) → Graphical display of the mails fetched and the unread mails.
Signup: Fill credentials (Username, Password, Name, Surname, Phone no). The credentials get stored in a table called login details.
Login: Fill credentials (Username, Password). Authenticate based on details in login details.
Classification: Selection between Naïve Bayes, C4.5 and keyword-based classification, with a multi-select option available to the user. On selection and submission of choices by clicking on the CLASSIFY button, mails are classified into spam and non-spam.
Message Viewer: Allows the user to select Spam, Non-spam or Keyword. Gives a view of mails with From and Subject as per the choices made. Allows for a keyword-based view, where a search is made by looking at the subject as well as the content. An option to store mail to PC is made available. An option to copy e-mail/content to the clipboard is also available.
Statistics: Allows for a graphical comparison of e-mails and an annual, monthly or weekly statistical view of e-mail based on historical data.
Messaging: Read e-mails, reply to e-mails, forward e-mails.
5.1 The Connection Dialog Box
The connection window takes the login credentials and other required information from the user
and stores it in the server. The signup credentials include the username, the password, and the
Name, Surname, Country and Mobile Number of the user. The user also needs to provide the server
used to access mail and the server used to send messages. As specified earlier, the IMAP server
is used for message access and the SMTP server for message transport.
(See Screenshot 1)
The screenshot shows the operations performed by the connect dialog window and the prerequisites
for signing up. As soon as the user signs up, two separate tables are created for the user. The
first is the main user table, where all fetched mails are stored. The second is the keyword
table, which stores all the user-defined keywords that have been searched by the user.
ConnectDialog.java
package emailfiltering;
import java.awt.*;
import java.awt.event.*;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.Statement;
import javax.swing.*;
public class ConnectDialog extends javax.swing.JDialog {
Connection conn = null;
Statement stmt = null, stmt1 = null;
ResultSet rs = null;
String un, ps, n, sn, co, imap, smtp, mobile;
public ConnectDialog(Frame parent) {
// Call super constructor, specifying that dialog is modal.
super(parent, true);
initComponents();
try {
Class.forName("com.mysql.jdbc.Driver");
conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/email", "root",
"");
System.out.println("Connection Established Successfully");
} catch (Exception e) {
System.out.println(e);
}
// Set application title.
setTitle("Connect");
// Handle closing events.
addWindowListener(new WindowAdapter() {
public void windowClosing(WindowEvent e) {
actionCancel();
}
});
}
private void actionConnect() {
if (usernameTextField.getText().trim().length() < 1
|| passwordField.getPassword().length < 1) {
JOptionPane.showMessageDialog(this,
"One or more settings is missing.",
"Missing Setting(s)", JOptionPane.ERROR_MESSAGE);
return;
}
// Close dialog.
dispose();
}
// Cancel connecting and exit program.
private void actionCancel() {
System.exit(0);
}
public String getUsername() {
return usernameTextField.getText();
}
// Get e-mail password.
public String getPassword() {
return new String(passwordField.getPassword());
}
@SuppressWarnings("unchecked")
// <editor-fold defaultstate="collapsed" desc="Generated Code">
private void connectButtonActionPerformed(java.awt.event.ActionEvent evt) {
actionConnect();
}
private void cancelButtonActionPerformed(java.awt.event.ActionEvent evt) {
actionCancel();
}
private void signupActionPerformed(java.awt.event.ActionEvent evt) {
un = username.getText();
ps = password.getText();
n = name.getText();
sn = surname.getText();
co = country.getText();
imap = servername.getText();
smtp = smtpserver.getText();
mobile = phoneno.getText();
// Insert the new user's details. Note that the password is stored as plain text.
String sql = "INSERT INTO `logindetails` "
        + "(`Username`,`Password`,`Name`,`Surname`,`Country`,`Server`,`SMTPServer`,`Phoneno`) "
        + "VALUES (?,?,?,?,?,?,?,?)";
// try-with-resources closes the statement even if the insert fails.
try (PreparedStatement pstmt = conn.prepareStatement(sql)) {
    pstmt.setString(1, un);
    pstmt.setString(2, ps);
    pstmt.setString(3, n);
    pstmt.setString(4, sn);
    pstmt.setString(5, co);
    pstmt.setString(6, imap);
    pstmt.setString(7, smtp);
    pstmt.setString(8, mobile);
    pstmt.executeUpdate();
} catch (Exception e) {
    System.out.println(e);
}
// Derive the per-user table name from the part of the address before '@'.
int index = un.indexOf("@");
if (index < 1) {
    // Without this check, substring() below would throw for a username with no '@'.
    JOptionPane.showMessageDialog(this, "Username must be a full e-mail address.",
            "Invalid Username", JOptionPane.ERROR_MESSAGE);
    return;
}
String tablename = un.substring(0, index).replace(".", "");
try {
    // Main table: one row per fetched mail.
    String sql = "CREATE TABLE IF NOT EXISTS `" + tablename + "` ( `From` text NOT NULL, "
            + "`Subject` text NOT NULL, `Content` longtext NOT NULL, `Naivebayes` text NOT NULL, "
            + "`C45` text NOT NULL, `Day` varchar(3) NOT NULL, `Month` varchar(3) NOT NULL, "
            + "`Date` int(2) NOT NULL, `Year` int(4) NOT NULL, `Time` int(2) NOT NULL, "
            + "`Keyword` text NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1";
    stmt = conn.createStatement();
    stmt.executeUpdate(sql);
    // Keyword table: the keywords this user has searched for.
    String sql1 = "CREATE TABLE IF NOT EXISTS `" + tablename + "_keyword` ( "
            + "`Keyword` text NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=latin1";
    stmt1 = conn.createStatement();
    stmt1.executeUpdate(sql1);
} catch (Exception e) {
    System.out.println(e);
}
}
// Variables declaration - do not modify
private javax.swing.JButton cancelButton;
private javax.swing.JButton connectButton;
private javax.swing.JTextField country;
private javax.swing.JLabel jLabel10;
private javax.swing.JLabel jLabel11;
private javax.swing.JLabel jLabel12;
private javax.swing.JLabel jLabel13;
private javax.swing.JLabel jLabel14;
private javax.swing.JLabel jLabel15;
private javax.swing.JLabel jLabel16;
private javax.swing.JLabel jLabel2;
private javax.swing.JLabel jLabel4;
private javax.swing.JLabel jLabel5;
private javax.swing.JLabel jLabel6;
private javax.swing.JLabel jLabel7;
private javax.swing.JLabel jLabel8;
private javax.swing.JLabel jLabel9;
private javax.swing.JTextField name;
private javax.swing.JTextField password;
private javax.swing.JPasswordField passwordField;
private javax.swing.JTextField phoneno;
private javax.swing.JTextField servername;
private javax.swing.JButton signup;
private javax.swing.JTextField smtpserver;
private javax.swing.JTextField surname;
private javax.swing.JTextField username;
private javax.swing.JTextField usernameTextField;
// End of variables declaration
}
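The signup code builds a table name from the username and concatenates it into a CREATE TABLE statement, so the name itself has to be trustworthy. The sketch below shows one way to derive and validate it. The helper name deriveTableName and the allowed character set are assumptions for illustration. The 56-character cap is chosen so that the name plus the "_keyword" suffix stays within MySQL's 64-character identifier limit.

```java
import java.util.regex.Pattern;

class TableNameSketch {
    // Assumed whitelist: letters, digits and underscore only, at most 56 characters.
    private static final Pattern SAFE = Pattern.compile("[A-Za-z0-9_]{1,56}");

    // Returns the local part of the address with dots removed, or throws if it is unusable.
    static String deriveTableName(String emailAddress) {
        int at = emailAddress.indexOf('@');
        if (at <= 0) {
            throw new IllegalArgumentException("Not an e-mail address: " + emailAddress);
        }
        String name = emailAddress.substring(0, at).replace(".", "");
        if (!SAFE.matcher(name).matches()) {
            throw new IllegalArgumentException("Unsupported characters in: " + name);
        }
        return name;
    }
}
```

For example, deriveTableName("john.doe@gmail.com") yields johndoe, while an address whose local part contains a backtick or a space is rejected before it reaches the SQL string.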
When the user signs up for the first time, all his information is stored in the ‘logindetails’ table
in the server. The structure of the table and the MySQL query that creates it are shown below.
The structure of the login details table
MySQL Query
CREATE TABLE IF NOT EXISTS `logindetails` (
`Username` varchar(30) NOT NULL,
`Password` varchar(30) NOT NULL,
`Name` varchar(30) NOT NULL,
`Surname` varchar(30) NOT NULL,
`Country` varchar(30) NOT NULL,
`Server` varchar(30) NOT NULL,
`SMTPServer` varchar(30) NOT NULL,
`Phoneno` varchar(30) NOT NULL,
`messagecount` int(11) NOT NULL,
`classifiedcount` int(11) NOT NULL,
PRIMARY KEY (`Username`),
UNIQUE KEY `Phoneno` (`Phoneno`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
Once the user has signed up, the following two tables are created for the user.
The structure of the main table where all the mails are stored
This table contains information about each downloaded mail: the sender, the subject, the content,
the results of the two classification algorithms, the date and time, and a keyword column where
the keywords associated with that mail are stored.
MySQL Query
CREATE TABLE IF NOT EXISTS `username` (
`From` text NOT NULL,
`Subject` text NOT NULL,
`Content` longtext NOT NULL,
`Naivebayes` text NOT NULL,
`C45` text NOT NULL,
`Day` varchar(3) NOT NULL,
`Month` varchar(3) NOT NULL,
`Date` int(2) NOT NULL,
`Year` int(4) NOT NULL,
`Time` int(2) NOT NULL,
`Keyword` text NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
The structure of the keyword table where all the keywords are stored
MySQL Query
CREATE TABLE IF NOT EXISTS `username_keyword` (
`Keyword` text NOT NULL,
`Count` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
5.2. The Email Client Window
The Email Client window is the main window of the application and hosts its major functionality.
It is divided into 6 parts, each represented by a tab at the top of the window. All operations
are performed from within the Email Client.
The Email Client comprises the following 6 tabs.
The Welcome Tab
The welcome tab is the basic homepage where the user can view basic information, such as how
many mails have been downloaded and how many are unread.
The Main Page
Here the user performs the core operations of the client application. The user runs the
Naïve Bayes and C4.5 classification algorithms and can also search for specific user-defined
keywords.
The Message Viewer
The user can view his mails according to the conditions specified in this window, so that he
can read them as per his preference.
The Statistics Window
The statistics window presents a graphical and historical analysis of previously fetched data.
The Messaging Window
The user can send a message from the desktop application to another user’s email account.
Credits
This window contains information about the developers, along with a feedback form through which
the user can send feedback about his experience with the application.
THE WELCOME TAB
(See Screenshot 2)
The screenshot shows a graphical display of how many of the user's received mails are read and
how many are unread. The red portion of the pie chart represents the unread mail currently in
the user's mailbox. The refresh button lets the user refresh his mailbox to retrieve mails that
have not been fetched yet. This is done by the connect method, which is executed by clicking
Connect in the connect dialog box.
The Connect Method
final ConnectDialog dialog = new ConnectDialog(this);
dialog.setVisible(true); // Dialog.show() is deprecated in favour of setVisible(true)
username=dialog.getUsername();
password=dialog.getPassword();
final DownloadingDialog downloadingDialog =
new DownloadingDialog(this);
SwingUtilities.invokeLater(new Runnable() {
public void run() {
downloadingDialog.setVisible(true);
}
});
//Establish JavaMail session and connect to server.
Store store = null;
try {
//Initialize JavaMail session with SMTP server.
Properties props = new Properties();
props.setProperty("mail.store.protocol", "imaps");
props.put("mail.smtp.host","smtp.gmail.com");
props.put("mail.smtp.starttls.enable","true");
props.put("mail.smtp.auth", "true");
session = Session.getInstance(props,
new javax.mail.Authenticator() {
protected PasswordAuthentication getPasswordAuthentication() {
return new PasswordAuthentication(dialog.getUsername(),dialog.getPassword());
}
});
store = session.getStore("imaps");
store.connect("imap.gmail.com",dialog.getUsername(),dialog.getPassword());
} catch (Exception e) {
//Close the downloading dialog.
downloadingDialog.dispose();
//Show error dialog.
showError("Unable to connect.", true);
}
//Download message headers from server.
try {
int j=0;
//Open main "INBOX" folder.
Folder folder = store.getFolder("INBOX");
folder.open(Folder.READ_WRITE);
// ...
catch(Exception e)
{
System.out.println("there is an exception");
System.out.println(e);
}
}
catch (Exception e) {
System.out.println("No Information");
}
Message[] messages = folder.getMessages();
//Retrieve message headers for each message in folder.
FetchProfile profile = new FetchProfile();
profile.add(FetchProfile.Item.ENVELOPE);
folder.fetch(messages, profile);
}
} catch (Exception e) {
// Close the downloading dialog.
downloadingDialog.dispose();
// Show error dialog.
showError("Unable to download messages.", true);
}
// Close the downloading dialog.
downloadingDialog.dispose();
}
THE MAIN PAGE
The main page is the window where the major classification operations are performed. Two
algorithms are used: Naïve Bayes and C4.5.
(See Screenshot 3)
Classification uses a training dataset, which is imported first; the user then performs the
various operations on it.
Training dataset creation
private void createTrainingSet(String dataset) throws Exception
{
emailMessage = new Attribute("emailMessage", (FastVector) null);
emailClass = new FastVector(3);
emailClass.addElement("spam");
emailClass.addElement("no spam");
emailClass.addElement("?");
eClass = new Attribute("emailClass", emailClass);
records = new FastVector(2);
records.addElement(eClass);
records.addElement(emailMessage);
trainingSet = new Instances("SpamClsfyTraining", records, 40);
trainingSet.setClassIndex(0);
this.readTrainingDataset(dataset);
ArffSaver saver = new ArffSaver();
saver.setInstances(trainingSet);
saver.setFile(new File("C:Akshaytraining.arff"));
saver.writeBatch();
}
Classification Implementation
private void performClassification(Object model, String modelName) throws
Exception
{
System.out.println("**==" + modelName + "==**");
// Convert message text to word vectors; the argument is the number of words to keep.
StringToWordVector stringToVector = new StringToWordVector(1000);
// Filter options must be set before setInputFormat() is called.
stringToVector.setOutputWordCounts(true);
stringToVector.setUseStoplist(false);
stringToVector.setInputFormat(trainingSet);
// Batch filtering: the dictionary built from the training set is reused for the test set.
Instances filteredData = Filter.useFilter(trainingSet, stringToVector);
Instances filteredTestData = Filter.useFilter(testingSet, stringToVector);
Classifier cModel = (Classifier) model;
cModel.buildClassifier(filteredData);
Evaluation eTest = new Evaluation(filteredTestData);
eTest.evaluateModel(cModel, filteredTestData);
// Number of correctly classified test instances.
int x = (int) eTest.correct();
System.out.println(x);
if(x==1)
{
if(nb==1)
{
System.out.println("Naive Bayes Spam");
}
if(c==1)
{
System.out.println("C4.5 Spam");
}
}
else
{
if(nb==1)
{
System.out.println("Naive Bayes Non Spam");
}
if(c==1)
{
System.out.println("C4.5 Non Spam");
}
}
}
There is also a keyword-based search feature, in which the application searches the mails for a
user-specified keyword.
Keyword Search
private void searchActionPerformed(java.awt.event.ActionEvent evt) {
if(keyword.getText().trim().isEmpty())
{
JOptionPane.showMessageDialog(new JFrame(),"Please Enter The Keyword", "Error", JOptionPane.ERROR_MESSAGE);
}
else
{
try
{
String sql1="INSERT INTO `username_keyword` (`Keyword`) VALUES (?);";
PreparedStatement pstmt = conn.prepareStatement(sql1);
pstmt.setString(1,keyword.getText());
pstmt.executeUpdate();
pst1=conn.prepareStatement("SELECT * FROM `username_keyword`");
rs1=pst1.executeQuery();
keywordviewer.setModel(DbUtils.resultSetToTableModel(rs1));
pst=conn.prepareStatement("SELECT * FROM `username`");
rs=pst.executeQuery();
int i=1;
while(rs.next())
{
String subject = rs.getString("Subject");
String content = rs.getString("Content");
String pastkeywordlist = rs.getString("Keyword");
String newkeyword;
if(pastkeywordlist.equals(""))
{
newkeyword=keyword.getText();
}
else
{
newkeyword=pastkeywordlist + "," + keyword.getText();
}
System.out.println(EmailFiltering.containtsKeyWord(subject, content,
keyword.getText()));
if(EmailFiltering.containtsKeyWord(subject, content, keyword.getText()))
{
String sql="UPDATE `username` SET `Keyword` = ? WHERE `Subject` = ? AND `Content` = ?";
PreparedStatement pstmt1=conn.prepareStatement(sql);
pstmt1.setString(1,newkeyword);
pstmt1.setString(2,subject);
pstmt1.setString(3,content);
pstmt1.executeUpdate();
}
}
}
catch(Exception e)
{
System.out.println(e);
}
}
FillCombo();
}
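The listing above calls a three-argument keyword check on EmailFiltering whose body is not shown in the report. A plausible sketch of such a check, together with the comma-separated keyword list handling, is given below. The class KeywordMatchSketch and both method names are hypothetical. Case-insensitive substring matching and skipping duplicate keywords are assumptions, not behaviour confirmed by the source.

```java
import java.util.Locale;

class KeywordMatchSketch {
    // True if the keyword occurs in the subject or the content, ignoring case.
    static boolean containsKeyword(String subject, String content, String keyword) {
        if (keyword == null || keyword.trim().isEmpty()) {
            return false;
        }
        String needle = keyword.trim().toLowerCase(Locale.ROOT);
        return lower(subject).contains(needle) || lower(content).contains(needle);
    }

    private static String lower(String s) {
        return s == null ? "" : s.toLowerCase(Locale.ROOT);
    }

    // Appends a keyword to the comma-separated list stored in the Keyword column.
    // Unlike the listing above, this sketch skips keywords that are already present.
    static String appendKeyword(String pastKeywordList, String keyword) {
        if (pastKeywordList == null || pastKeywordList.isEmpty()) {
            return keyword;
        }
        for (String existing : pastKeywordList.split(",")) {
            if (existing.equals(keyword)) {
                return pastKeywordList;
            }
        }
        return pastKeywordList + "," + keyword;
    }
}
```

Skipping duplicates keeps the Keyword column from growing each time the same search is repeated.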
THE MESSAGE VIEWER
The message viewer lets the user view mails according to the segregation produced by the
classification algorithms the user has run. It also offers a keyword-based view, which lists
the mails associated with a selected keyword.
Two additional buttons are provided: one stores the message as a file in a location chosen by
the user, and the other copies the message contents to the clipboard.
(See Screenshot 4)
View Messages on the basis of Classification
private void update_table()
{
try
{
// cv is the classifier column chosen in the combo box, sb the class value to match.
String cv=columnvalue.getSelectedItem().toString();
String sb=spambox.getSelectedItem().toString();
// A column name cannot be a bind parameter; the class value is bound with '?'.
pst=conn.prepareStatement("SELECT `From`,`Subject` FROM `username` WHERE `"+cv+"` = ?");
pst.setString(1,sb);
rs=pst.executeQuery();
messageviewer.setModel(DbUtils.resultSetToTableModel(rs));
}
catch(Exception e)
{
System.out.println(e);
}}
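Because the column name in this query is spliced into the SQL text, it is worth restricting it to known values. The sketch below builds the query string from a whitelist. The class ViewerQuerySketch, the helper buildQuery and the choice of allowed columns (taken from the main table definition) are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ViewerQuerySketch {
    // Classifier columns of the main table that the viewer may filter on.
    private static final Set<String> ALLOWED_COLUMNS =
            new HashSet<>(Arrays.asList("Naivebayes", "C45"));

    // Builds the SELECT text; the class value is left as a '?' bind parameter.
    static String buildQuery(String table, String column) {
        if (!ALLOWED_COLUMNS.contains(column)) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        if (!table.matches("[A-Za-z0-9_]+")) {
            throw new IllegalArgumentException("Unsafe table name: " + table);
        }
        return "SELECT `From`, `Subject` FROM `" + table + "` WHERE `" + column + "` = ?";
    }
}
```

The returned string would be passed to conn.prepareStatement, with the selected class value supplied through setString(1, ...).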
View Messages on the basis of Keywords
private void keywordbuttonActionPerformed(java.awt.event.ActionEvent evt) {
String keywordvt=keywordcombobox.getSelectedItem().toString();
System.out.println(keywordvt);
try
{
String sql="SELECT * FROM `username`";
pst=conn.prepareStatement(sql);
rs=pst.executeQuery();
while(rs.next())
{
String keywordtb=rs.getString("Keyword");
System.out.println(keywordtb);
System.out.println(EmailFiltering.containsKeyWord(keywordtb,keywordvt));
if(EmailFiltering.containsKeyWord(keywordtb,keywordvt))
{
pst1=conn.prepareStatement("SELECT `From`,`Subject` FROM `username` WHERE `Keyword` = ?");
pst1.setString(1,keywordtb);
rs1=pst1.executeQuery();
messageviewer.setModel(DbUtils.resultSetToTableModel(rs1));
//pst.close();
}
System.out.println();
}}
catch(Exception e)
{
System.out.println(e);
}}
Store the particular text file in a specific location
(See Screenshot 5)
Store in PC
private void savepcActionPerformed(java.awt.event.ActionEvent evt) {
System.out.println("Working");
final FileChooser filec=new FileChooser(this,true);
int result = FileChooser.jFileChooser2.showSaveDialog(this);
if (result == FileChooser.jFileChooser2.APPROVE_OPTION) {
String path=FileChooser.jFileChooser2.getSelectedFile().getAbsolutePath();
// try-with-resources closes the writer even if writing fails.
try (PrintWriter outputStream=new PrintWriter(path)) {
    outputStream.println(EmailFiltering.jTextArea1.getText());
}
catch(Exception e)
{
    System.out.println(e);
}
} else if (result == FileChooser.jFileChooser2.CANCEL_OPTION) {
System.out.println("Cancel was selected");
}
FileChooser.jFileChooser2.setVisible(false);
}
Copy Text
private void copytextActionPerformed(java.awt.event.ActionEvent evt) {
String name= jTextArea1.getText();
StringSelection stringSelection = new StringSelection(name);
Clipboard clipboard = Toolkit.getDefaultToolkit().getSystemClipboard();
clipboard.setContents(stringSelection,null);
}
THE MESSAGING TAB
This tab lets the user send mails from the desktop application itself. The user can also select
a message and forward it to any user, or reply to a mail he has received. All these features are
implemented with the help of the Message Dialog box.
(See Screenshot 6)
Send Message
private void sendMessage(String to,String Subject,String Content) {
MessageDialog dialog=new MessageDialog(this,true);
dialog.totextbox.setText(to);
dialog.subjecttextbox.setText(Subject);
dialog.contenttextbox.setText(Content);
dialog.setVisible(true);
try {
// ...
messageto=rs.getString("From");
messagecontent=rs.getString("Content");
replycontent=replycontent1+messagecontent;
sendMessage(messageto,replysubject,replycontent);
break;
}
}
catch(Exception e)
{
System.out.println(e);
}
}
(See Screenshot 9)
Function:
private void actionForward() {
int row=messagereader.getSelectedRow();
String messagesubject=(messagereader.getModel().getValueAt(row,1).toString());
String messageto="";
String messagecontent="";
String forwardcontent1=" ----------------- \n" +
" FORWARDED MESSAGE \n" +
" ----------------- \n";
String forwardcontent;
// The selected subject is bound as a parameter instead of being concatenated into the SQL.
String sql="select `From`,`Content` from `ourbeproject2014` where subject = ?";
try
{
pst=conn.prepareStatement(sql);
pst.setString(1,messagesubject);
rs=pst.executeQuery();
while(rs.next())
{
messagecontent=rs.getString("Content");
forwardcontent=forwardcontent1+messagecontent;
sendMessage(messageto,messagesubject,forwardcontent);
break;
}
}
catch(Exception e)
{
System.out.println(e);
}
}
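The text handling in the forward function can be separated from the database and dialog code. The sketch below shows the idea. The class ForwardBodySketch and its methods are hypothetical. The "Fwd: " subject prefix is a common mail convention assumed here, since the report's function reuses the original subject unchanged.

```java
class ForwardBodySketch {
    // Banner placed above the original content, mirroring the one in the forward function.
    static final String FORWARD_HEADER =
            " ----------------- \n"
          + " FORWARDED MESSAGE \n"
          + " ----------------- \n";

    // Body of the forwarded mail: banner followed by the original content.
    static String forwardBody(String originalContent) {
        return FORWARD_HEADER + (originalContent == null ? "" : originalContent);
    }

    // Subject of the forwarded mail; the prefix is added only if it is not already there.
    static String forwardSubject(String subject) {
        String s = subject == null ? "" : subject.trim();
        return s.regionMatches(true, 0, "Fwd:", 0, 4) ? s : "Fwd: " + s;
    }
}
```

Keeping these helpers free of Swing and JDBC calls makes them easy to check on their own.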
CREDITS
(See Screenshot 10)
The user can send feedback about his experience with the application.
5.3. The Message Dialog Box
The message dialog box is used to send a new mail, reply to an existing mail, or forward a mail.
Several code snippets rely on this dialog, so it plays an important role in the functionality of
the project.
(See Screenshot 11)
MessageDialog.java
package emailfiltering;
public class MessageDialog extends javax.swing.JDialog {
public MessageDialog(java.awt.Frame parent, boolean modal) {
super(parent, modal);
initComponents();
}
private void totextboxActionPerformed(java.awt.event.ActionEvent evt) {
}
private void jButton1ActionPerformed(java.awt.event.ActionEvent evt) {
dispose();
}
public static javax.swing.JTextArea contenttextbox;
public static javax.swing.JTextField fromtextbox;
private javax.swing.JButton jButton1;
private javax.swing.JLabel jLabel1;
private javax.swing.JLabel jLabel2;
private javax.swing.JLabel jLabel3;
private javax.swing.JScrollPane jScrollPane1;
public static javax.swing.JTextField subjecttextbox;
public javax.swing.JTextField totextbox;
// End of variables declaration
}
5.4. The File Chooser
The file chooser wraps Java's built-in Swing file chooser so that the user can browse to a
particular location in order to save the file.
(See Screenshot 12)
FileChooser.java
package emailfiltering;
public class FileChooser extends javax.swing.JDialog {
public FileChooser(java.awt.Frame parent, boolean modal) {
super(parent, modal);
initComponents();
}
private void jFileChooser2ActionPerformed(java.awt.event.ActionEvent evt) {
}
public static void main(String args[]) {
java.awt.EventQueue.invokeLater(new Runnable() {
public void run() {
FileChooser dialog = new FileChooser(new javax.swing.JFrame(), true);
dialog.addWindowListener(new java.awt.event.WindowAdapter() {
@Override
public void windowClosing(java.awt.event.WindowEvent e) {
System.exit(0);
}
});
dialog.setVisible(true);
}
});}
public static javax.swing.JFileChooser jFileChooser2;
}
5.5. The Downloading Dialog
The downloading dialog appears whenever mails are being downloaded from the server. It appears
when the Connect button in the connect dialog box is clicked and remains visible until the mails
have been fetched.
(See Screenshot 13)
DownloadingDialog.java
package emailfiltering;
import java.awt.*;
import javax.swing.*;
public class DownloadingDialog extends JDialog {
public DownloadingDialog(Frame parent) {
// Call super constructor, specifying that dialog is modal.
super(parent, true);
// Set dialog title.
setTitle("E-mail Client");
// Instruct window not to close when the "X" is clicked.
setDefaultCloseOperation(DO_NOTHING_ON_CLOSE);
// Put a message with a nice border in this dialog.
JPanel contentPane = new JPanel();
contentPane.setBorder(
BorderFactory.createEmptyBorder(5, 5, 5, 5));
contentPane.add(new JLabel("Downloading messages..."));
setContentPane(contentPane);
// Size dialog to components.
pack();
// Center dialog over application.
setLocationRelativeTo(parent);
}
}
5.6. Analysis Window
THE STATISTICS WINDOW
(See Screenshot 14)
The statistics window is useful for historical analysis of mails: it shows how much spam and non-spam
mail has been received over the past few years.
Annual Statistics
The annual statistics cover the years 2007 to 2017 and show how many mails were received each year,
how many of them were spam, and how many were non-spam.
(See Screenshot 15)
Function:
private void annuallyActionPerformed(java.awt.event.ActionEvent evt) {
DefaultCategoryDataset datasetyearly = new DefaultCategoryDataset();
int year=2007;
while(year<=2017)
{
System.out.println(year);
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='spam' AND YEAR='"+year+"'");
rs=pst.executeQuery();
int spamyearcount;
String yearvalue=Integer.toString(year);
while(rs.next())
{
spamyearcount=rs.getInt("count");
System.out.println(spamyearcount);
datasetyearly.addValue(spamyearcount,"Spam",yearvalue);
}
}
catch(Exception e)
{
System.out.println(e);
}
year=year+1;
}
year=2007;
while(year<=2017)
{
System.out.println(year);
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='nonspam' AND YEAR='"+year+"'");
rs=pst.executeQuery();
int nonspamyearcount;
String yearvalue=Integer.toString(year);
while(rs.next())
{
nonspamyearcount=rs.getInt("count");
System.out.println(nonspamyearcount);
datasetyearly.addValue(nonspamyearcount,"Non Spam",yearvalue);
}
}
catch(Exception e)
{
System.out.println(e);
}
year=year+1;
}
JFreeChart stackedChart = ChartFactory.createStackedBarChart("Annual Spam Rate Report",
"Year", "Mail",datasetyearly, PlotOrientation.VERTICAL, true, true, false);
CategoryPlot barchrt=stackedChart.getCategoryPlot();
setResizable(false);
barchrt.setRangeGridlinePaint(Color.BLACK);
jPanel13.setLayout(new java.awt.BorderLayout());
ChartPanel panelpie =new ChartPanel(stackedChart);
jPanel13.removeAll();
jPanel13.add(panelpie,BorderLayout.CENTER);
jPanel13.validate();
}
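The function above issues one query per year and per class. As a hedged alternative sketch, the same counts could be fetched with a single GROUP BY query. The table and column names are taken from the queries above; the class name YearlyCountsSketch and its methods are hypothetical. Note that, unlike the loops above, this query returns no row for a year without mail and does not restrict the range to 2007 to 2017.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the original project.
final class YearlyCountsSketch {

    // One query that returns a count for every (year, class) pair present in the table.
    static final String SQL =
        "SELECT YEAR, NAIVEBAYES, COUNT(*) AS count FROM `username` "
      + "WHERE NAIVEBAYES IN ('spam', 'nonspam') "
      + "GROUP BY YEAR, NAIVEBAYES ORDER BY YEAR";

    // Adds one result row to a nested map of the form year -> class label -> count.
    static void addRow(Map<String, Map<String, Integer>> totals,
                       String year, String label, int count) {
        totals.computeIfAbsent(year, y -> new LinkedHashMap<>())
              .merge(label, count, Integer::sum);
    }

    // Runs the query on an open connection and collects the rows.
    static Map<String, Map<String, Integer>> load(Connection conn) throws SQLException {
        Map<String, Map<String, Integer>> totals = new LinkedHashMap<>();
        try (PreparedStatement pst = conn.prepareStatement(SQL);
             ResultSet rs = pst.executeQuery()) {
            while (rs.next()) {
                addRow(totals, rs.getString("YEAR"),
                       rs.getString("NAIVEBAYES"), rs.getInt("count"));
            }
        }
        return totals;
    }
}
```

Each entry of the returned map could then be passed to the dataset's addValue call, in the same way as the per-year counts above. The try-with-resources block also closes the statement and result set, which the function above leaves open.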
Monthly Statistics
The annual statistics can be broken down further by month. The user selects the year to analyse, and
the chart shows how many spam and non-spam mails were fetched and stored in each month of that year,
from January to December.
(See Screenshot 16)
Function:
private void monthlyActionPerformed(java.awt.event.ActionEvent evt) {
DefaultCategoryDataset datasetmonthly = new DefaultCategoryDataset();
String my;
String[] month = new String[] {"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul",
"Aug", "Sep", "Oct", "Nov", "Dec"};
my=monthyear.getSelectedItem().toString();
System.out.println(my);
int i=0;
while(i<month.length)
{
System.out.println(month[i]);
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "MONTH='"+month[i]+"' AND NAIVEBAYES='spam' AND YEAR='"+my+"'");
rs=pst.executeQuery();
int spammonthcount;
while(rs.next())
{
spammonthcount=rs.getInt("count");
System.out.println(spammonthcount);
datasetmonthly.addValue(spammonthcount,"Spam",month[i]);
}
rs.close();
}
catch(Exception e)
{
System.out.println(e);
}
i++;
}
i=0;
while(i<month.length)
{
System.out.println(month[i]);
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "MONTH='"+month[i]+"' AND NAIVEBAYES='nonspam' AND YEAR='"+my+"'");
rs=pst.executeQuery();
int nonspammonthcount;
while(rs.next())
{
nonspammonthcount=rs.getInt("count");
System.out.println(nonspammonthcount);
datasetmonthly.addValue(nonspammonthcount,"Non Spam",month[i]);
}
rs.close();
}
catch(Exception e)
{
System.out.println(e);
}
i++;
}
JFreeChart stackedChart = ChartFactory.createStackedBarChart("Monthly Spam Rate Report",
"Month", "Mails",
datasetmonthly, PlotOrientation.VERTICAL, true, true, false);
CategoryPlot barchrt=stackedChart.getCategoryPlot();
setResizable(false);
barchrt.setRangeGridlinePaint(Color.BLACK);
jPanel13.setLayout(new java.awt.BorderLayout());
ChartPanel panelpie =new ChartPanel(stackedChart);
jPanel13.removeAll();
jPanel13.add(panelpie,BorderLayout.CENTER);
jPanel13.validate();
}
Weekly Statistics
The monthly statistics can be broken down further by week. The user selects the month and year to
analyse, and the chart shows how many spam and non-spam mails were fetched and stored in each week.
Each month is divided into four weeks by day of the month; the fourth week also covers the days after
the 28th:
Week 1: 1-7
Week 2: 8-14
Week 3: 15-21
Week 4: 22-31
(See Screenshot 17)
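The week boundaries listed above can be expressed as a small helper. This is a sketch only; the class name WeekBucketSketch and the method weekOfMonth are hypothetical and do not appear in the project source.

```java
// Hypothetical helper, not part of the original project.
final class WeekBucketSketch {

    // Maps a day of the month (1-31) to its week number (1-4).
    // Days 29 to 31 are folded into week 4, matching the table above.
    static int weekOfMonth(int day) {
        if (day < 1 || day > 31) {
            throw new IllegalArgumentException("day out of range: " + day);
        }
        return Math.min((day - 1) / 7 + 1, 4);
    }
}
```

Keeping the boundaries in one place avoids repeating the four range checks in both the spam and the non-spam loop.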
Function:
private void weeklyActionPerformed(java.awt.event.ActionEvent evt) {
DefaultCategoryDataset datasetweekly = new DefaultCategoryDataset();
int w1=0,w2=0,w3=0,w4=0;
String wm,wy;
wm=weekmonth.getSelectedItem().toString();
wy=weekyear.getSelectedItem().toString();
System.out.println(wm);
System.out.println(wy);
int i=1;
while(i<=31)
{
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='spam' AND DATE='"+i+"' AND MONTH='"+wm+"' AND YEAR='"+wy+"'");
rs=pst.executeQuery();
int spamweekcount;
while(rs.next())
{
spamweekcount=rs.getInt("count");
if(i>=1 && i<8)
{ w1=w1+spamweekcount; }
if(i>=8 && i<15)
{ w2=w2+spamweekcount; }
if(i>=15 && i<22)
{ w3=w3+spamweekcount; }
if(i>=22 && i<=31) // Week 4 runs through day 31
{ w4=w4+spamweekcount; }
}
rs.close();
}
catch(Exception e)
{
System.out.println(e);
}
i++;
}
datasetweekly.addValue(w1, "Spam","Week1");
datasetweekly.addValue(w2, "Spam","Week2");
datasetweekly.addValue(w3, "Spam","Week3");
datasetweekly.addValue(w4, "Spam","Week4");
i=1;
w1=0;w2=0;w3=0;w4=0;
while(i<=31)
{
try
{
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='nonspam' AND DATE='"+i+"' AND MONTH='"+wm+"' AND YEAR='"+wy+"'");
rs=pst.executeQuery();
int nonspamweekcount;
while(rs.next())
{
nonspamweekcount=rs.getInt("count");
if(i>=1 && i<8)
{ w1=w1+nonspamweekcount; }
if(i>=8 && i<15)
{ w2=w2+nonspamweekcount; }
if(i>=15 && i<22)
{ w3=w3+nonspamweekcount; }
if(i>=22 && i<=31) // Week 4 runs through day 31
{ w4=w4+nonspamweekcount; }
}
rs.close();
}
catch(Exception e)
{
System.out.println(e);
}
i++;
}
datasetweekly.addValue(w1, "Non Spam","Week1");
datasetweekly.addValue(w2, "Non Spam","Week2");
datasetweekly.addValue(w3, "Non Spam","Week3");
datasetweekly.addValue(w4, "Non Spam","Week4");
JFreeChart stackedChart = ChartFactory.createStackedBarChart("Weekly Spam Rate Report",
wm+","+wy, "Messages",
datasetweekly, PlotOrientation.VERTICAL, true, true, false);
CategoryPlot barchrt=stackedChart.getCategoryPlot();
barchrt.setRangeGridlinePaint(Color.RED);
setResizable(false);
jPanel13.setLayout(new java.awt.BorderLayout());
ChartPanel panelpie =new ChartPanel(stackedChart);
jPanel13.removeAll();
jPanel13.add(panelpie,BorderLayout.CENTER);
jPanel13.validate();
}
Comparative Analysis:
This view compares Naïve Bayes and C4.5 by charting how many mails each algorithm classified as spam
and as non-spam, so that the user can see which algorithm catches more spam.
(See Screenshot 18)
Function:
private void comparativeActionPerformed(java.awt.event.ActionEvent evt) {
DefaultCategoryDataset datasetcomparative = new DefaultCategoryDataset();
try
{
// pst feeds the "C45" bars and pst1 the "Naive Bayes" bars below.
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "C45='spam'");
pst1=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='spam'");
rs=pst.executeQuery();
int c45spamcount;
int naivebayesspamcount;
while(rs.next())
{
c45spamcount=rs.getInt("count");
System.out.println(c45spamcount);
datasetcomparative.addValue(c45spamcount,"Spam","C45");
}
rs1=pst1.executeQuery();
while(rs1.next())
{
naivebayesspamcount=rs1.getInt("count");
System.out.println(naivebayesspamcount);
datasetcomparative.addValue(naivebayesspamcount,"Spam","Naive Bayes");
}
}
catch(Exception e)
{
System.out.println(e);
}
try
{
// pst feeds the "C45" bars and pst1 the "Naive Bayes" bars below.
pst=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "C45='nonspam'");
pst1=conn.prepareStatement("SELECT COUNT( * ) AS count FROM `username` WHERE "
+ "NAIVEBAYES='nonspam'");
rs=pst.executeQuery();
int c45nonspamcount;
int naivebayesnonspamcount;
while(rs.next())
{
c45nonspamcount=rs.getInt("count");
System.out.println(c45nonspamcount);
datasetcomparative.addValue(c45nonspamcount,"Non Spam","C45");
}
rs1=pst1.executeQuery();
while(rs1.next())
{
naivebayesnonspamcount=rs1.getInt("count");
System.out.println(naivebayesnonspamcount);
datasetcomparative.addValue(naivebayesnonspamcount,"Non Spam","Naive Bayes");
}
}
catch(Exception e)
{
System.out.println(e);
}
JFreeChart stackedChart = ChartFactory.createStackedBarChart("Comparative Spam Rate Report",
"Algorithm", "Spam/NonSpam",datasetcomparative,
PlotOrientation.VERTICAL, true, true, false);
CategoryPlot barchrt=stackedChart.getCategoryPlot();
setResizable(false);
barchrt.setRangeGridlinePaint(Color.BLACK);
jPanel13.setLayout(new java.awt.BorderLayout());
ChartPanel panelpie =new ChartPanel(stackedChart);
jPanel13.removeAll();
jPanel13.add(panelpie,BorderLayout.CENTER);
jPanel13.validate();
}
User Defined
This feature compares the mails that have been grouped by the keywords specified by the user. It
helps the user see how many mails were received for each keyword.
(See Screenshot 19)
Function:
private void userdefinedActionPerformed(java.awt.event.ActionEvent evt) {
DefaultCategoryDataset barChartData=new DefaultCategoryDataset();
String sql="SELECT * FROM `username_keyword`";
try
{
pst=conn.prepareStatement(sql);
rs=pst.executeQuery();
while(rs.next())
barChartData.setValue(rs.getInt("Count"),"Messages",rs.getString("Keyword"));
}
catch(Exception e)
{
System.out.println(e);
}
JFreeChart barChart=ChartFactory.createBarChart("User Preference Messages Quantity",
"Keyword","Message", barChartData, PlotOrientation.VERTICAL,
rootPaneCheckingEnabled, rootPaneCheckingEnabled, rootPaneCheckingEnabled);
CategoryPlot barchrt=barChart.getCategoryPlot();
barchrt.setRangeGridlinePaint(Color.ORANGE);
jPanel13.setLayout(new java.awt.BorderLayout());
setResizable(false);
ChartPanel panelpie =new ChartPanel(barChart);
jPanel13.removeAll();
jPanel13.add(panelpie,BorderLayout.CENTER);
jPanel13.validate();}
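The function above only reads the per-keyword counts from the `username_keyword` table; how that table is filled is not shown in this section. As a sketch under stated assumptions, the counts could be produced by tallying how many message subjects contain each keyword. The class name KeywordTallySketch, the use of subject lines, and the case-insensitive substring match are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the original project.
final class KeywordTallySketch {

    // Counts, for each keyword, how many subjects contain it (case-insensitive).
    // Keywords keep their input order so a chart can show them in that order.
    static Map<String, Integer> tally(List<String> subjects, List<String> keywords) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String keyword : keywords) {
            String needle = keyword.toLowerCase(Locale.ROOT);
            int matches = 0;
            for (String subject : subjects) {
                if (subject.toLowerCase(Locale.ROOT).contains(needle)) {
                    matches++;
                }
            }
            counts.put(keyword, matches);
        }
        return counts;
    }
}
```

Each map entry corresponds to one bar of the chart: the key is the keyword and the value is the message count.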
Results
SCREENSHOTS:
Screenshot 1: A screenshot of the connect dialog window.
Screenshot 2: A screenshot of the home screen, which opens after the user logs in
Screenshot 3: A Screenshot of the Main Page where all operations can be performed
Screenshot 4: A Screenshot of the message viewer tab
Screenshot 5: The Save dialog box, which appears when Store in PC is clicked
Screenshot 6: A Screenshot of the Messaging Tab
Screenshot 7: A Screenshot of New Message box
Screenshot 8: A Screenshot of Reply Message box
Screenshot 9: A Screenshot of Forward Message Box
Screenshot 10: A screenshot of the credits page
Screenshot 11: The Message Dialog
Conclusion
Considering the necessity of e-mail in an individual's life, classifying messages is of utmost
importance. By employing various spam filtering techniques and classification algorithms, messages
can be classified into various categories. Hence, e-mail filtering, classification and analysis using
a data mining approach has been achieved successfully.
Future Scope
Cloud Based Email Archiving System
The concept of cloud based email archiving is simple. Broadly put, a service provider processes,
manages and stores your business data on a hosted server at a remote location, either as a substitute
for or, more typically, as an enhancement to your on-premise infrastructure.
Research reveals that cloud-based email archiving is becoming more popular over time, with prominent
growth in the number of corporate users served by this archival model.
An email spam filter service on the cloud thus offers an array of significant benefits, which include:
1. A rather predictable cost of ownership.
2. The ability to let specialist providers manage all the key email and related functions.
3. The capability of freeing up IT staff for other initiatives.
4. A shift from the capital expenditure (CAPEX) model to the operating expenditure (OPEX) model.
5. Ease and convenience of managing IT services.
6. A comprehensive and thorough e-discovery solution.
7. Reduced chance of virus, spam and malware attacks.
8. Inbound and outbound email filtering.
9. Agile email accessibility.
Email storage on the cloud has been in use by large corporations for many years. The future of cloud
based email archiving thus looks bright, with services ranging from email archiving and retrieval to
spam filtration.
Encrypted message based E-Mail Classification
This application would enable the user to fetch messages from the server and to classify messages
that have been encrypted with various encryption algorithms.
The e-mail application will consist of various encryption/decryption algorithms such as:
1. AES.
2. DES.
3. Additive Cipher.
4. Huffman’s Algorithm.
5. RSA Algorithm.
The application will take the text obtained from the e-mail server and attempt to decrypt it with
each of these algorithms. From the results obtained, the best candidate will be selected amongst all
the decrypted texts. If, however, no algorithm manages to decrypt the text, the message will be
treated as non-encrypted text, and further filtering according to the categories will take place.
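The select-the-best-candidate step described above can be sketched for the simplest algorithm in the list, the additive cipher. This is an illustration under stated assumptions, not the planned implementation: the class name AdditiveCipherSketch, the small common-word list and the word-count scoring rule are all hypothetical. The sketch tries every key, scores each candidate by how many common English words it contains, and returns the original text unchanged when no key scores better.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the original project.
final class AdditiveCipherSketch {

    // Assumed scoring vocabulary: a few frequent English words.
    private static final Set<String> COMMON = new HashSet<>(Arrays.asList(
        "the", "and", "you", "for", "this", "that", "with", "your", "are", "have"));

    // Shifts every letter back by 'key' positions; other characters are kept.
    static String shiftBack(String text, int key) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c >= 'a' && c <= 'z') {
                out.append((char) ('a' + (c - 'a' - key + 26) % 26));
            } else if (c >= 'A' && c <= 'Z') {
                out.append((char) ('A' + (c - 'A' - key + 26) % 26));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    // Counts how many words of the text appear in the common-word list.
    static int score(String text) {
        int hits = 0;
        for (String word : text.toLowerCase().split("[^a-z]+")) {
            if (COMMON.contains(word)) {
                hits++;
            }
        }
        return hits;
    }

    // Returns the best-scoring candidate, or the input itself if no key improves on it.
    static String decryptOrPassThrough(String body) {
        String best = body;
        int bestScore = score(body);
        for (int key = 1; key < 26; key++) {
            String candidate = shiftBack(body, key);
            int candidateScore = score(candidate);
            if (candidateScore > bestScore) {
                best = candidate;
                bestScore = candidateScore;
            }
        }
        return best;
    }
}
```

The pass-through case corresponds to the fallback described above: a message that no key improves is handed on as non-encrypted text for the usual category filtering.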
An Android Based Application for accessing Emails
An Android based application can be created to access and classify emails. This will enable the user
to access his e-mails from any location. The same server could be used for accessing and storing
mails. The application could also make the system more user friendly.
Location based Analysis of Spam Rate
Location based analysis of spam is a feature that could be implemented in the future. We can take the
location information from the user, or retrieve it from the user's email account, and use it when
classifying whether a particular email is spam or not. With location based analysis we can find out
which country has the maximum spam concentration. This can be displayed graphically using Google Maps
and Java maps in our application.