SlideShare a Scribd company logo
1 of 3
Download to read offline
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2631
Touch with Industry
Priyanka Thakur 1, Karuna Sharma 2, Priyanka Chaudhari 3, Sayali Borade 4
1,2,3,4 Student, Computer Department, MET’s BKC IOE Nashik, Maharashtra, India
---------------------------------------------------------------------***---------------------------------------------------------------------
Abstract - Web content mining is the extraction and
integration of relevant data, informationandknowledgefrom
different forms of web page contents. Extracting data from
web pages is challenging task as web data is mainly in semi-
structured or unstructured format, while web content mining
deals primarily with structured data. We have proposed a
system with title "Touch with Industry" that mainlyfocuses on
gathering information from various trusted sites on the basis
of company names and fields selected by the users. The
provided fields not limited to Company CEO, Address, Contact
No, Year of Establishment, Employee Strength, Images of
Company, and People on LinkedIn.
The proposed system uses web crawling, whichmakes iteasier
for the application to return the most relevant resultstousers.
The proposed system is advantageous to people like
consultants, HR representatives, Employed and Unemployed
peoples who need company overview in short period of time,
which reduces the searching time and provide relevant
information from different web pages in one go.
Key Words: web mining, web crawling, pattern matchin,
filtering.
1.INTRODUCTION
The World Wide Web has large online databases of the
companies. This database needstobeaccessedby millionsof
people. According to recent survey, it has been found that
some websites provide fake information to people. Also, it
has been observed that people needs to search a lot for
relevant information which consumes large amount of time
1.1 Project Idea
The idea behind implementing project is to provide solution
to problem of people which they face while they are
searching for company information. This system is mainly
proposed to promote efficient communication between job
seekers and consultants which will lead to ideal and effective
system.
1.2 Motivation
Basically the motivation came withthesurveyamongpeople
who faced problem while searching company details. Users
need to crawl through a number of pages to get desired
information about a company.Toovercomethisproblemthe
proposed system will be develop a desktop application
through which users can easily search about a company in
less amount of time and get relevant details.
2. LITERATURE SURVEY
The exhaustive literaturesurveyconsistingoftheconceptual
base for this project is briefly outlined here.
1. Web Data Extraction: Extracting structured data from
deep Web pages is a challenging problem due to the
underlying intricate structures of suchpages.Thismotivates
us to seek a different way for deep Web data extraction to
overcome the limitations of previous works by utilizing
some interesting common visual features on the deep Web
pages. In this paper, a novel vision-based approach that is
Web- pageprogramming-language-independentisproposed.
This approach primarily utilizes the visual features on the
deep Web pages to implement deep Web data extraction,
including data record extractionanddata itemextraction.[1]
2. Data Restruction: It synthesizes ideas from query
languages for the Web, for semi structured data and for
website restructuringand makesseveral contributions, most
notably, the idea of querying documents by manipulating
their abstract syntax trees and the support of the concept of
web as a data type. [2]
3. Hoovers: It is an official website to search the company's
profile. It searches the world’s largest database of company
and industry information. It provides less detail company
information to non-subscribers. It maintains a database of
about 85 million companies and 100 million peoples.[3]
4. Web Information Extraction Systems: The Internet
presents a huge amount of useful information which is
usually formatted for its users, which makes it difficult to
extract relevant data from various sources. This paper
surveys the major Web data extraction approaches and
compares them in three dimensions: the task domain, the
automation degree, and the techniques used. [4]
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2632
2.1.1 Existing Systems
The following mentioned applications are available for the
various feature on Company Information Extraction but
having the limitations with some regards to satisfy the user.
1. Company Profile Search
_ Organised Profile
_ Detailed Search
_ Profit Accomplished
2. Lead Finder Jack
_ Crawls onto Google
_ Searches companies having paid advertisements
_ Low accuracy
3. PROBLEM DEFINATION AND SCOPE
Suppose a user wants to search specific information about
the company. For this purpose one makes use of search
engine which indeed displays a lot of websites whichmaybe
irrelevant to the user and this is a tedious and time
consuming task. To overcome the problem specified above
we develop a desktop based application that provides the
solution to the issues related with existing company
information finding system. Using trusted sites to retrieve
relevant information related to companies and providing
accurate data to the user.
3.1 Goals and objectives
i) Goals: To eliminate the burden of searching company
information. To provide user friendly and efficient service.
ii)Objective: The main objective of proposed system is user
comfort. The users can easily accessthecompanydetailsand
get the desired information at a glance.
3.2 Statement of scope
The scope of proposed system is justifiable because in wide
range people search for industry details. So wide range of
people can make use of proposed system.
3.3 Outcome
The outcome of the proposed system is,
i)Accuracy of information
ii)Selection of features according to users choice
iii)Efficient searching
4. DETAILED DESIGN
4.1 Introduction:
System architecture will benefit in such a way that every
user of the system gets advantage. When a user searches
about a company he needs to register into the system. After
logging into the system user has to enter the company name
and select the filters as per his requirement. The system will
return the required information by applying data extraction
techniques.
4.2 Architecture Design:
The input to the system is the companynameandthefeature
of the company. The proposed system will then createa web
request and send it to the search engine. The search engine
then replies to the system, the reply contains all the
information of the company. This system supports GET and
SET protocol method for the web request. SET method
creates a web request and GET method sends the request to
the system. After, receiving the information the proposed
system filter the HTML tags i.e. filters the irrelevant
information based on the feature entered as input. Further
pattern matching is carried out in which the input pattern
(features of input query) is matched with the responded
received by the system after filtering. After the pattern
matching, the information whose pattern matches with the
input query are displayed to the user i.e. user receives the
relevant information as output. The proposed system will
make use of only trusted sites, so only the correct
information will be received by the tool and display as
output.
4.3 Algorithm
Extracting structured data from web pages :
On entering the company name the process of the system
will get initiated. company_name is the variable taken for
company name. The system sends the web request to the
websites with company_name. The system receives the
HTML response from the websites. Then we apply regular
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056
Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072
© 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2633
expressions and extract text node and data node. Pattern
matching is performed and illegal characters are removed.
After extracting relevant data related company links will be
displayed to the user. The user has to enter the correct
company name . After that system again sends web request
with CIN no and get HTML response. The variable used for
CIN no is cin_no. Then html tags are removed by using
regular expressions and replacing it by newline n.
Split lines by 'n'. For each line do ( find heading
nodes(attributenames)atpositioni attribute[i].value=
lines[i+1] ). Then the system save attribute value and
show data.
5. CONCLUSIONS AND FUTURE WORK
We have presented Touch With Industry system, which
provides a framework for approaching many web data
management task from a unified perspective. This system
operates in three levels: Input, process and output. The
proposed system provides the user with the relevant
information by filtering the HTML tags. The system also
follows pattern matching process after filtering HTML tags
which allows the system to match the response with the
input parameters given by user. The Touch With Industry
system can be extended to other domain of application.
Additionally, as future work we can notify the logged in
users about the jobs vacancy available invariouscompanies.
We can also provide the facility to the user for uploading
their resume for getting job in their area of interest.
REFERENCES
[1] ViDE: A Vision-Based Approach for Deep Web Data
Extraction by Wei Liu, Xiaofeng Meng.
[2] WebOQL: Restructuring Documents,Databasesand Webs
by Gustavo O. Arocena, Alberto O. Mendelzon
[3] S. Huffman, “Learning Information Extraction Patterns
from Examples,” Connectionist, Statistical, and Symbolic
Approaches to Learning for Natural Language Processing,
Springer-Verlag, 1996.
[4] A Survey of Web InformationExtractionSystemsbyChia-
Hui Chang, Mohammed Kayed and Khaled F. Shaalan
[5] Using Visual Clues Concept for Extracting Main Data
from Deep Web Pages by Satish Pusdekar , ShaikhChhaware
[6] A Survey of Web Information Extraction Systems by
Chia-Hui Chang , Mohammed Kayed
[7]D. Freitag, “Information Extraction from HTML:
Application of a General Learning Approach,” Proc. 15th
Conf. Artificial Intelligence (AAAI ’98), 1998.
[8] J. Wang and F.H. Lochovsky, “Data Extraction and Label
Assignment for Web Databases,” Proc. 12th Int’l Conf. World
Wide Web (WWW), pp. 187-196, 2003.
[9] A. Arasu and H. Garcia-Molina, “Extracting Structured
Data from Web Pages,” Proc. ACM SIGMOD Int’l Conf.
Management of Data, pp. 337-348, 2003.
BIOGRAPHIES
Priyanka S. Thakur appearingfor
BE degree from the Department of
Computer Engineering, MET’s
Bhujbal Knowledge City IOE,
Nashik.
Karuna R. Sharma appearing for
BE degree from the Department of
Computer Engineering, MET’s
Bhujbal Knowledge City IOE,
Nashik.
Priyanka B. Chaudhari appearing
for BE degree from the
Department of Computer
Engineering, MET’s Bhujbal
Knowledge City IOE, Nashik.
Sayali P. Borade appearing for BE
degree from the Department of
Computer Engineering, MET’s
Bhujbal Knowledge City IOE,
Nashik.

More Related Content

What's hot

IRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET Journal
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET Journal
 
Time-Ordered Collaborative Filtering for News Recommendation
Time-Ordered Collaborative Filtering for News RecommendationTime-Ordered Collaborative Filtering for News Recommendation
Time-Ordered Collaborative Filtering for News RecommendationIRJET Journal
 
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICES
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICESIMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICES
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICESIAEME Publication
 
IRJET- PDF Extraction using Data Mining Techniques
IRJET- PDF Extraction using Data Mining TechniquesIRJET- PDF Extraction using Data Mining Techniques
IRJET- PDF Extraction using Data Mining TechniquesIRJET Journal
 
Making isp business profitable using data mining
Making isp business profitable using data miningMaking isp business profitable using data mining
Making isp business profitable using data miningijitcs
 
Benchmarking supervised learning models for sentiment analysis
Benchmarking supervised learning models for sentiment analysisBenchmarking supervised learning models for sentiment analysis
Benchmarking supervised learning models for sentiment analysisConference Papers
 
IRJET- UID Secure Travel Identity
IRJET- UID Secure Travel IdentityIRJET- UID Secure Travel Identity
IRJET- UID Secure Travel IdentityIRJET Journal
 
IRJET- Intelligence Extraction using Machine Learning Technics
IRJET- Intelligence Extraction using Machine Learning TechnicsIRJET- Intelligence Extraction using Machine Learning Technics
IRJET- Intelligence Extraction using Machine Learning TechnicsIRJET Journal
 
IRJET- Customer Feedback Analysis using Machine Learning
IRJET-  	  Customer Feedback Analysis using Machine LearningIRJET-  	  Customer Feedback Analysis using Machine Learning
IRJET- Customer Feedback Analysis using Machine LearningIRJET Journal
 
Candidate Ranking and Evaluation System based on Digital Footprints
Candidate Ranking and Evaluation System based on Digital FootprintsCandidate Ranking and Evaluation System based on Digital Footprints
Candidate Ranking and Evaluation System based on Digital FootprintsIOSRjournaljce
 
Conference Paper: Enabling Privacy Mechanisms in Apache Storm
Conference Paper: Enabling Privacy Mechanisms in Apache StormConference Paper: Enabling Privacy Mechanisms in Apache Storm
Conference Paper: Enabling Privacy Mechanisms in Apache StormEricsson
 
Combining efficiency, fidelity, and flexibility in resource information services
Combining efficiency, fidelity, and flexibility in resource information servicesCombining efficiency, fidelity, and flexibility in resource information services
Combining efficiency, fidelity, and flexibility in resource information servicesPvrtechnologies Nellore
 
Security risk analysis of bring your own device system in manufacturing compa...
Security risk analysis of bring your own device system in manufacturing compa...Security risk analysis of bring your own device system in manufacturing compa...
Security risk analysis of bring your own device system in manufacturing compa...TELKOMNIKA JOURNAL
 
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008Journal For Research
 

What's hot (20)

IRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of ClusteringIRJET - Encoded Polymorphic Aspect of Clustering
IRJET - Encoded Polymorphic Aspect of Clustering
 
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...IRJET-  	  Identify the Human or Bots Twitter Data using Machine Learning Alg...
IRJET- Identify the Human or Bots Twitter Data using Machine Learning Alg...
 
Time-Ordered Collaborative Filtering for News Recommendation
Time-Ordered Collaborative Filtering for News RecommendationTime-Ordered Collaborative Filtering for News Recommendation
Time-Ordered Collaborative Filtering for News Recommendation
 
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICES
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICESIMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICES
IMPLEMENTATION OF SASF CRAWLER BASED ON MINING SERVICES
 
EFFECTIVE IN-HOUSE VOTING AND VERIFICATION USING BLOCK CHAIN IMPLEMENTATION
EFFECTIVE IN-HOUSE VOTING AND VERIFICATION USING BLOCK CHAIN IMPLEMENTATIONEFFECTIVE IN-HOUSE VOTING AND VERIFICATION USING BLOCK CHAIN IMPLEMENTATION
EFFECTIVE IN-HOUSE VOTING AND VERIFICATION USING BLOCK CHAIN IMPLEMENTATION
 
IRJET- PDF Extraction using Data Mining Techniques
IRJET- PDF Extraction using Data Mining TechniquesIRJET- PDF Extraction using Data Mining Techniques
IRJET- PDF Extraction using Data Mining Techniques
 
DISTRIBUTION SYSTEM SERVICE PROVIDED BY ELECTRIC VEHICLES
DISTRIBUTION SYSTEM SERVICE PROVIDED  BY ELECTRIC VEHICLESDISTRIBUTION SYSTEM SERVICE PROVIDED  BY ELECTRIC VEHICLES
DISTRIBUTION SYSTEM SERVICE PROVIDED BY ELECTRIC VEHICLES
 
Making isp business profitable using data mining
Making isp business profitable using data miningMaking isp business profitable using data mining
Making isp business profitable using data mining
 
Aisha Email System
Aisha Email SystemAisha Email System
Aisha Email System
 
Benchmarking supervised learning models for sentiment analysis
Benchmarking supervised learning models for sentiment analysisBenchmarking supervised learning models for sentiment analysis
Benchmarking supervised learning models for sentiment analysis
 
IRJET- UID Secure Travel Identity
IRJET- UID Secure Travel IdentityIRJET- UID Secure Travel Identity
IRJET- UID Secure Travel Identity
 
A DISTRIBUTED MACHINE LEARNING BASED IDS FOR CLOUD COMPUTING
A DISTRIBUTED MACHINE LEARNING BASED IDS FOR  CLOUD COMPUTINGA DISTRIBUTED MACHINE LEARNING BASED IDS FOR  CLOUD COMPUTING
A DISTRIBUTED MACHINE LEARNING BASED IDS FOR CLOUD COMPUTING
 
IRJET- Intelligence Extraction using Machine Learning Technics
IRJET- Intelligence Extraction using Machine Learning TechnicsIRJET- Intelligence Extraction using Machine Learning Technics
IRJET- Intelligence Extraction using Machine Learning Technics
 
IRJET- Customer Feedback Analysis using Machine Learning
IRJET-  	  Customer Feedback Analysis using Machine LearningIRJET-  	  Customer Feedback Analysis using Machine Learning
IRJET- Customer Feedback Analysis using Machine Learning
 
Candidate Ranking and Evaluation System based on Digital Footprints
Candidate Ranking and Evaluation System based on Digital FootprintsCandidate Ranking and Evaluation System based on Digital Footprints
Candidate Ranking and Evaluation System based on Digital Footprints
 
AUTHENTICATED MEDICAL DOCUMENTS RELEASING WITH PRIVACY PROTECTION AND RELEAS...
AUTHENTICATED MEDICAL DOCUMENTS RELEASING  WITH PRIVACY PROTECTION AND RELEAS...AUTHENTICATED MEDICAL DOCUMENTS RELEASING  WITH PRIVACY PROTECTION AND RELEAS...
AUTHENTICATED MEDICAL DOCUMENTS RELEASING WITH PRIVACY PROTECTION AND RELEAS...
 
Conference Paper: Enabling Privacy Mechanisms in Apache Storm
Conference Paper: Enabling Privacy Mechanisms in Apache StormConference Paper: Enabling Privacy Mechanisms in Apache Storm
Conference Paper: Enabling Privacy Mechanisms in Apache Storm
 
Combining efficiency, fidelity, and flexibility in resource information services
Combining efficiency, fidelity, and flexibility in resource information servicesCombining efficiency, fidelity, and flexibility in resource information services
Combining efficiency, fidelity, and flexibility in resource information services
 
Security risk analysis of bring your own device system in manufacturing compa...
Security risk analysis of bring your own device system in manufacturing compa...Security risk analysis of bring your own device system in manufacturing compa...
Security risk analysis of bring your own device system in manufacturing compa...
 
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
CHATBOT FOR COLLEGE RELATED QUERIES | J4RV4I1008
 

Similar to Touch With Industry

Development of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemDevelopment of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemEditor IJCATR
 
Development of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemDevelopment of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemEditor IJCATR
 
Finger Gesture Based Rating System
Finger Gesture Based Rating SystemFinger Gesture Based Rating System
Finger Gesture Based Rating SystemIRJET Journal
 
Ijsred v2 i5p95
Ijsred v2 i5p95Ijsred v2 i5p95
Ijsred v2 i5p95IJSRED
 
A Review Of Computerized Payroll System
A Review Of Computerized Payroll SystemA Review Of Computerized Payroll System
A Review Of Computerized Payroll SystemApril Knyff
 
A Trinity Construction for Web Extraction Using Efficient Algorithm
A Trinity Construction for Web Extraction Using Efficient AlgorithmA Trinity Construction for Web Extraction Using Efficient Algorithm
A Trinity Construction for Web Extraction Using Efficient AlgorithmIOSR Journals
 
IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET Journal
 
IRJET- In-House File Tracking System
IRJET-  	  In-House File Tracking SystemIRJET-  	  In-House File Tracking System
IRJET- In-House File Tracking SystemIRJET Journal
 
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...Providing Highly Accurate Service Recommendation over Big Data using Adaptive...
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...IRJET Journal
 
IRJET- Noisy Content Detection on Web Data using Machine Learning
IRJET- Noisy Content Detection on Web Data using Machine LearningIRJET- Noisy Content Detection on Web Data using Machine Learning
IRJET- Noisy Content Detection on Web Data using Machine LearningIRJET Journal
 
IRJET- Website Health Checker
IRJET- Website Health CheckerIRJET- Website Health Checker
IRJET- Website Health CheckerIRJET Journal
 
Prototype of the Export Information System for Managing Cargo Data
Prototype of the Export Information System for Managing Cargo DataPrototype of the Export Information System for Managing Cargo Data
Prototype of the Export Information System for Managing Cargo DataIJSRED
 
SOFTWARE DESIGN .docx
SOFTWARE DESIGN                                                   .docxSOFTWARE DESIGN                                                   .docx
SOFTWARE DESIGN .docxrronald3
 

Similar to Touch With Industry (20)

Job portal
Job portalJob portal
Job portal
 
G017334248
G017334248G017334248
G017334248
 
Ijcatr04071001
Ijcatr04071001Ijcatr04071001
Ijcatr04071001
 
Development of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemDevelopment of Web-based Job Fair Information System
Development of Web-based Job Fair Information System
 
Development of Web-based Job Fair Information System
Development of Web-based Job Fair Information SystemDevelopment of Web-based Job Fair Information System
Development of Web-based Job Fair Information System
 
E017413647
E017413647E017413647
E017413647
 
5 job adda doc 2
5 job adda doc 25 job adda doc 2
5 job adda doc 2
 
5 job adda doc 2
5 job adda doc 25 job adda doc 2
5 job adda doc 2
 
Finger Gesture Based Rating System
Finger Gesture Based Rating SystemFinger Gesture Based Rating System
Finger Gesture Based Rating System
 
Ijsred v2 i5p95
Ijsred v2 i5p95Ijsred v2 i5p95
Ijsred v2 i5p95
 
A Review Of Computerized Payroll System
A Review Of Computerized Payroll SystemA Review Of Computerized Payroll System
A Review Of Computerized Payroll System
 
H017124652
H017124652H017124652
H017124652
 
A Trinity Construction for Web Extraction Using Efficient Algorithm
A Trinity Construction for Web Extraction Using Efficient AlgorithmA Trinity Construction for Web Extraction Using Efficient Algorithm
A Trinity Construction for Web Extraction Using Efficient Algorithm
 
IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...IRJET-Computational model for the processing of documents and support to the ...
IRJET-Computational model for the processing of documents and support to the ...
 
IRJET- In-House File Tracking System
IRJET-  	  In-House File Tracking SystemIRJET-  	  In-House File Tracking System
IRJET- In-House File Tracking System
 
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...Providing Highly Accurate Service Recommendation over Big Data using Adaptive...
Providing Highly Accurate Service Recommendation over Big Data using Adaptive...
 
IRJET- Noisy Content Detection on Web Data using Machine Learning
IRJET- Noisy Content Detection on Web Data using Machine LearningIRJET- Noisy Content Detection on Web Data using Machine Learning
IRJET- Noisy Content Detection on Web Data using Machine Learning
 
IRJET- Website Health Checker
IRJET- Website Health CheckerIRJET- Website Health Checker
IRJET- Website Health Checker
 
Prototype of the Export Information System for Managing Cargo Data
Prototype of the Export Information System for Managing Cargo DataPrototype of the Export Information System for Managing Cargo Data
Prototype of the Export Information System for Managing Cargo Data
 
SOFTWARE DESIGN .docx
SOFTWARE DESIGN                                                   .docxSOFTWARE DESIGN                                                   .docx
SOFTWARE DESIGN .docx
 

More from IRJET Journal

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...IRJET Journal
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTUREIRJET Journal
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...IRJET Journal
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsIRJET Journal
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...IRJET Journal
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...IRJET Journal
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...IRJET Journal
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...IRJET Journal
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASIRJET Journal
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...IRJET Journal
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProIRJET Journal
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...IRJET Journal
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemIRJET Journal
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesIRJET Journal
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web applicationIRJET Journal
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...IRJET Journal
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.IRJET Journal
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...IRJET Journal
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignIRJET Journal
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...IRJET Journal
 

More from IRJET Journal (20)

TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
 
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURESTUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
 
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
 
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil CharacteristicsEffect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
 
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
 
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
 
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
 
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
 
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADASA REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
 
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
 
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD ProP.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
 
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
 
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare SystemSurvey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
 
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridgesReview on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
 
React based fullstack edtech web application
React based fullstack edtech web applicationReact based fullstack edtech web application
React based fullstack edtech web application
 
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
 
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
 
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
 
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic DesignMultistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
 
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
 

Recently uploaded

CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.eptoze12
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)Dr SOUNDIRARAJ N
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionDr.Costas Sachpazis
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHC Sai Kiran
 
Effects of rheological properties on mixing
Effects of rheological properties on mixingEffects of rheological properties on mixing
Effects of rheological properties on mixingviprabot1
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxbritheesh05
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...Chandu841456
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptSAURABHKUMAR892774
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture designssuser87fa0c1
 

Recently uploaded (20)

CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
POWER SYSTEMS-1 Complete notes examples
POWER SYSTEMS-1 Complete notes  examplesPOWER SYSTEMS-1 Complete notes  examples
POWER SYSTEMS-1 Complete notes examples
 
Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.Oxy acetylene welding presentation note.
Oxy acetylene welding presentation note.
 
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
UNIT III ANALOG ELECTRONICS (BASIC ELECTRONICS)
 
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective IntroductionSachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
Sachpazis Costas: Geotechnical Engineering: A student's Perspective Introduction
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
Introduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECHIntroduction to Machine Learning Unit-3 for II MECH
Introduction to Machine Learning Unit-3 for II MECH
 
Effects of rheological properties on mixing
Effects of rheological properties on mixingEffects of rheological properties on mixing
Effects of rheological properties on mixing
 
young call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Serviceyoung call girls in Green Park🔝 9953056974 🔝 escort Service
young call girls in Green Park🔝 9953056974 🔝 escort Service
 
Artificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptxArtificial-Intelligence-in-Electronics (K).pptx
Artificial-Intelligence-in-Electronics (K).pptx
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...An experimental study in using natural admixture as an alternative for chemic...
An experimental study in using natural admixture as an alternative for chemic...
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Design and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdfDesign and analysis of solar grass cutter.pdf
Design and analysis of solar grass cutter.pdf
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Arduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.pptArduino_CSE ece ppt for working and principal of arduino.ppt
Arduino_CSE ece ppt for working and principal of arduino.ppt
 
pipeline in computer architecture design
pipeline in computer architecture  designpipeline in computer architecture  design
pipeline in computer architecture design
 

Touch With Industry

  • 1. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2631 Touch with Industry Priyanka Thakur 1, Karuna Sharma 2, Priyanka Chaudhari 3, Sayali Borade 4 1,2,3,4 Student, Computer Department, MET’s BKC IOE Nashik, Maharashtra, India ---------------------------------------------------------------------***--------------------------------------------------------------------- Abstract - Web content mining is the extraction and integration of relevant data, informationandknowledgefrom different forms of web page contents. Extracting data from web pages is challenging task as web data is mainly in semi- structured or unstructured format, while web content mining deals primarily with structured data. We have proposed a system with title "Touch with Industry" that mainlyfocuses on gathering information from various trusted sites on the basis of company names and fields selected by the users. The provided fields not limited to Company CEO, Address, Contact No, Year of Establishment, Employee Strength, Images of Company, and People on LinkedIn. The proposed system uses web crawling, whichmakes iteasier for the application to return the most relevant resultstousers. The proposed system is advantageous to people like consultants, HR representatives, Employed and Unemployed peoples who need company overview in short period of time, which reduces the searching time and provide relevant information from different web pages in one go. Key Words: web mining, web crawling, pattern matchin, filtering. 1.INTRODUCTION The World Wide Web has large online databases of the companies. This database needstobeaccessedby millionsof people. According to recent survey, it has been found that some websites provide fake information to people. Also, it has been observed that people needs to search a lot for relevant information which consumes large amount of time 1.1 Project Idea The idea behind implementing project is to provide solution to problem of people which they face while they are searching for company information. This system is mainly proposed to promote efficient communication between job seekers and consultants which will lead to ideal and effective system. 1.2 Motivation Basically the motivation came withthesurveyamongpeople who faced problem while searching company details. Users need to crawl through a number of pages to get desired information about a company.Toovercomethisproblemthe proposed system will be develop a desktop application through which users can easily search about a company in less amount of time and get relevant details. 2. LITERATURE SURVEY The exhaustive literaturesurveyconsistingoftheconceptual base for this project is briefly outlined here. 1. Web Data Extraction: Extracting structured data from deep Web pages is a challenging problem due to the underlying intricate structures of suchpages.Thismotivates us to seek a different way for deep Web data extraction to overcome the limitations of previous works by utilizing some interesting common visual features on the deep Web pages. In this paper, a novel vision-based approach that is Web- pageprogramming-language-independentisproposed. This approach primarily utilizes the visual features on the deep Web pages to implement deep Web data extraction, including data record extractionanddata itemextraction.[1] 2. Data Restruction: It synthesizes ideas from query languages for the Web, for semi structured data and for website restructuringand makesseveral contributions, most notably, the idea of querying documents by manipulating their abstract syntax trees and the support of the concept of web as a data type. [2] 3. Hoovers: It is an official website to search the company's profile. It searches the world’s largest database of company and industry information. It provides less detail company information to non-subscribers. It maintains a database of about 85 million companies and 100 million peoples.[3] 4. Web Information Extraction Systems: The Internet presents a huge amount of useful information which is usually formatted for its users, which makes it difficult to extract relevant data from various sources. This paper surveys the major Web data extraction approaches and compares them in three dimensions: the task domain, the automation degree, and the techniques used. [4]
  • 2. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2632 2.1.1 Existing Systems The following mentioned applications are available for the various feature on Company Information Extraction but having the limitations with some regards to satisfy the user. 1. Company Profile Search _ Organised Profile _ Detailed Search _ Profit Accomplished 2. Lead Finder Jack _ Crawls onto Google _ Searches companies having paid advertisements _ Low accuracy 3. PROBLEM DEFINATION AND SCOPE Suppose a user wants to search specific information about the company. For this purpose one makes use of search engine which indeed displays a lot of websites whichmaybe irrelevant to the user and this is a tedious and time consuming task. To overcome the problem specified above we develop a desktop based application that provides the solution to the issues related with existing company information finding system. Using trusted sites to retrieve relevant information related to companies and providing accurate data to the user. 3.1 Goals and objectives i) Goals: To eliminate the burden of searching company information. To provide user friendly and efficient service. ii)Objective: The main objective of proposed system is user comfort. The users can easily accessthecompanydetailsand get the desired information at a glance. 3.2 Statement of scope The scope of proposed system is justifiable because in wide range people search for industry details. So wide range of people can make use of proposed system. 3.3 Outcome The outcome of the proposed system is, i)Accuracy of information ii)Selection of features according to users choice iii)Efficient searching 4. DETAILED DESIGN 4.1 Introduction: System architecture will benefit in such a way that every user of the system gets advantage. When a user searches about a company he needs to register into the system. After logging into the system user has to enter the company name and select the filters as per his requirement. The system will return the required information by applying data extraction techniques. 4.2 Architecture Design: The input to the system is the companynameandthefeature of the company. The proposed system will then createa web request and send it to the search engine. The search engine then replies to the system, the reply contains all the information of the company. This system supports GET and SET protocol method for the web request. SET method creates a web request and GET method sends the request to the system. After, receiving the information the proposed system filter the HTML tags i.e. filters the irrelevant information based on the feature entered as input. Further pattern matching is carried out in which the input pattern (features of input query) is matched with the responded received by the system after filtering. After the pattern matching, the information whose pattern matches with the input query are displayed to the user i.e. user receives the relevant information as output. The proposed system will make use of only trusted sites, so only the correct information will be received by the tool and display as output. 4.3 Algorithm Extracting structured data from web pages : On entering the company name the process of the system will get initiated. company_name is the variable taken for company name. The system sends the web request to the websites with company_name. The system receives the HTML response from the websites. Then we apply regular
  • 3. International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395 -0056 Volume: 04 Issue: 03 | Mar -2017 www.irjet.net p-ISSN: 2395-0072 © 2017, IRJET | Impact Factor value: 5.181 | ISO 9001:2008 Certified Journal | Page 2633 expressions and extract text node and data node. Pattern matching is performed and illegal characters are removed. After extracting relevant data related company links will be displayed to the user. The user has to enter the correct company name . After that system again sends web request with CIN no and get HTML response. The variable used for CIN no is cin_no. Then html tags are removed by using regular expressions and replacing it by newline n. Split lines by 'n'. For each line do ( find heading nodes(attributenames)atpositioni attribute[i].value= lines[i+1] ). Then the system save attribute value and show data. 5. CONCLUSIONS AND FUTURE WORK We have presented Touch With Industry system, which provides a framework for approaching many web data management task from a unified perspective. This system operates in three levels: Input, process and output. The proposed system provides the user with the relevant information by filtering the HTML tags. The system also follows pattern matching process after filtering HTML tags which allows the system to match the response with the input parameters given by user. The Touch With Industry system can be extended to other domain of application. Additionally, as future work we can notify the logged in users about the jobs vacancy available invariouscompanies. We can also provide the facility to the user for uploading their resume for getting job in their area of interest. REFERENCES [1] ViDE: A Vision-Based Approach for Deep Web Data Extraction by Wei Liu, Xiaofeng Meng. [2] WebOQL: Restructuring Documents,Databasesand Webs by Gustavo O. Arocena, Alberto O. Mendelzon [3] S. Huffman, “Learning Information Extraction Patterns from Examples,” Connectionist, Statistical, and Symbolic Approaches to Learning for Natural Language Processing, Springer-Verlag, 1996. [4] A Survey of Web InformationExtractionSystemsbyChia- Hui Chang, Mohammed Kayed and Khaled F. Shaalan [5] Using Visual Clues Concept for Extracting Main Data from Deep Web Pages by Satish Pusdekar , ShaikhChhaware [6] A Survey of Web Information Extraction Systems by Chia-Hui Chang , Mohammed Kayed [7]D. Freitag, “Information Extraction from HTML: Application of a General Learning Approach,” Proc. 15th Conf. Artificial Intelligence (AAAI ’98), 1998. [8] J. Wang and F.H. Lochovsky, “Data Extraction and Label Assignment for Web Databases,” Proc. 12th Int’l Conf. World Wide Web (WWW), pp. 187-196, 2003. [9] A. Arasu and H. Garcia-Molina, “Extracting Structured Data from Web Pages,” Proc. ACM SIGMOD Int’l Conf. Management of Data, pp. 337-348, 2003. BIOGRAPHIES Priyanka S. Thakur appearingfor BE degree from the Department of Computer Engineering, MET’s Bhujbal Knowledge City IOE, Nashik. Karuna R. Sharma appearing for BE degree from the Department of Computer Engineering, MET’s Bhujbal Knowledge City IOE, Nashik. Priyanka B. Chaudhari appearing for BE degree from the Department of Computer Engineering, MET’s Bhujbal Knowledge City IOE, Nashik. Sayali P. Borade appearing for BE degree from the Department of Computer Engineering, MET’s Bhujbal Knowledge City IOE, Nashik.