A SEMANTIC BASED APPROACH FOR INFORMATION RETRIEVAL FROM HTML DOCUMENTS USING...
Most internet applications are built using web technologies such as HTML. Web pages are designed so that they display data records from underlying databases, or simply display text in an unstructured format while following some fixed template. Summarizing such data, dispersed across different web pages, is tedious and consumes considerable time and manual effort. A supervised learning technique called Wrapper Induction can be used across web pages to learn data extraction rules. Applying these learnt rules to web pages makes information extraction an easier process. This paper focuses on developing a tool for information extraction from unstructured data. The use of semantic web technologies greatly simplifies the process. The tool enables us to query data scattered over multiple web pages in distinct ways. This is accomplished by the following steps: extracting the data from multiple web pages, storing it in the form of RDF triples, integrating multiple RDF files using an ontology, generating a SPARQL query based on the user query, and generating a report in the form of tables or charts from the results of the SPARQL query. The relationships between related web pages are identified using the ontology and used to query in better ways, thus enhancing search efficacy.
The Resource Description Framework (RDF) is a standard model for representing metadata about WWW resources, such as the title, author, and modification date of a Web page, but it can be used for storing any other data. It is based on subject-predicate-object triples that form a graph of data [21].
All data in the semantic web uses RDF as the primary representation language [16]. RDF Schema (RDFS) can be used to describe taxonomies of classes and properties and to create lightweight ontologies from them. Ontologies describe the conceptualization, i.e. the structure of the domain, which includes the domain model with possible restrictions [18]. More detailed ontologies can be created with the Web Ontology Language (OWL). It is syntactically embedded into RDF, so, like RDFS, it provides additional standardized vocabulary. For querying RDF data as well as RDFS and OWL ontologies with knowledge bases, the SPARQL Protocol and RDF Query Language (SPARQL) is available. SPARQL is an SQL-like language, but it uses RDF triples and resources both for matching parts of the query and for returning results [17]. With the help of ontologies, data is stored and organized in a meaningful way. This supports context-based search, which, unlike keyword-based search, yields fewer irrelevant results.
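As a concrete illustration of triples and SPARQL pattern matching, the following is a minimal sketch using the Python rdflib library; the namespace, resource URI, and property names are hypothetical, and the paper itself uses Jena for querying (see Section 4).

from rdflib import Graph, Literal, Namespace, URIRef

STAFF = Namespace("http://example.org/staff#")  # hypothetical namespace

g = Graph()
# One publication record stored as subject-predicate-object triples.
pub = URIRef("http://example.org/pub/1")        # hypothetical resource
g.add((pub, STAFF.authors, Literal("A.M.Abirami")))
g.add((pub, STAFF.journal, Literal("Example Journal")))
g.add((pub, STAFF.year, Literal("2012")))

# SPARQL matches triple patterns and returns the bound variables.
query = """
    SELECT ?authors ?journal
    WHERE {
        ?pub staff:authors ?authors ;
             staff:journal ?journal ;
             staff:year "2012" .
    }
"""
for row in g.query(query, initNs={"staff": STAFF}):
    print(row.authors, "-", row.journal)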
2. LITERATURE SURVEY
Raghu et al. [1] developed a yellow pages service provider using semantic web technologies and improved the search results by increasing relevancy through a feedback mechanism for geo-spatial and mobile applications. Richard Vlach et al. [2] developed a single schema to extract information from multiple web sources and handled ambiguous text. Wood et al. [3] extracted botanical data presented in the form of text and defined a model for correlating the data using ontologies. Rifat et al. [4] suggested a lazy strategy scheme for information extraction alongside the usual machine learning techniques, building a specialized model for each test instance. Jie Zou et al. [5] segmented HTML documents into logical zones for medical journal articles using a Hidden Markov Model approach. In their technical report [6], Rifat Ozcan et al. used ontologies and latent semantic analysis to reduce irrelevancy.
Ayaz et al. [7] discussed how data from web sites can be converted to XML format for better information analysis and evaluation. Harish et al. [8] developed an interactive semantic tool to extract pertinent information from unstructured data and visualized it through a spring graph using NLP and semantic technologies. James Mayfield et al. [9] discussed how information retrieval could be tightly coupled with inference so that semantic web search engines can lead to improvements in retrieval. Mohammed et al. [10] developed a web based, multimedia-enabled eLearning system for selecting courses suited to students' needs using the dynamic mash-up technique. Uijal et al. [11] presented a Linked Data approach to discovering resume information, enabling the aggregation, sharing and reuse of information among different resume documents. Chris Welty et al. [12] transformed data into a knowledge base and used the deeper semantics of OWL to improve the precision of relation extraction. Maceij Janik et al. [15] classified Wikipedia documents using ontology and claimed that their model did not require a training set for classification, as the classification can be done with the help of ontologies. Canan Pembe et al. [18] proposed a novel approach for improving web search by representing HTML documents hierarchically, using semantic knowledge to represent the sections and subsections of the documents.
3. METHODOLOGY
The general methodology adopted in this paper is represented diagrammatically in Fig. 1 and explained in this section. Keywords are given to the search engines and the search results are analyzed further for relevant information extraction. The layout of the different HTML pages is learnt, and the pointer is moved to the section where the necessary data or information is present. We have conducted various experiments on different HTML tags for retrieving the information, which are explained in detail in Section 4. The relevancy of web pages is determined with the help of synsets generated by WordNet. Beforehand, the ontologies, i.e. the vocabularies used for the different domains, are generated using the Protégé tool [15]. The user query is mapped against the updated repositories and ontologies by the ontology mapping module, and the fine-tuned results are then given back to the user. This feedback mechanism, introduced through ontology mapping, increases the relevancy.
Fig. 1. Generic model for relevant Information Extraction from web documents
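As noted above, page relevancy is estimated from WordNet synsets. The sketch below illustrates the idea in Java with the synonym sets supplied inline as hypothetical data; in the actual pipeline they would come from WordNet through one of its APIs, and relevancy() is an illustrative helper rather than the paper's exact scoring function.

import java.util.*;

public class SynsetRelevancy {

    // Fraction of query concepts whose synonym set overlaps the page's terms.
    static double relevancy(Set<String> pageTerms, Map<String, Set<String>> querySynsets) {
        if (querySynsets.isEmpty()) return 0.0;
        int matched = 0;
        for (Set<String> synset : querySynsets.values()) {
            if (!Collections.disjoint(synset, pageTerms)) matched++;
        }
        return (double) matched / querySynsets.size();
    }

    public static void main(String[] args) {
        // Hypothetical synsets; the pipeline obtains these from WordNet.
        Map<String, Set<String>> synsets = new HashMap<>();
        synsets.put("faculty", Set.of("faculty", "staff", "professor"));
        synsets.put("publication", Set.of("publication", "paper", "article"));

        Set<String> pageTerms = Set.of("staff", "profile", "journal", "paper");
        System.out.println("relevancy = " + relevancy(pageTerms, synsets));
    }
}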
A sample ontology representation [19] used for Experiment 1 is shown in Fig. 2, where the
relationships can easily be maintained for knowledge inference.
Fig. 2. Ontology representation of College staff members
4. EXPERIMENTAL RESULTS
We've conducted two different experiments with HTML pages to extract information and to discover
new knowledge from them. We've used RDF for storing the content, and SPARQL with Jena [18] for
querying the RDF content. Ontologies created using the Protégé tool are used for context-based
search; they establish relationships between concepts and thereby increase relevancy.
4.1 Experiment 1
Staff profiles in various formats like .html, .pdf and .doc are fed to the structured data
extraction module after preprocessing and converted into RDF format. The model followed for this
process is shown in Fig. 3. For this experiment, we've taken our college web site (www.tce.edu)
and collected the profiles of all staff members of all departments. We've ignored the HTML pages
in which journal publication details are not present and have considered only the profiles which
include them. The number of documents taken for the experiment is shown in Table 1. We've used
the HTMLTidy parser to make each document syntactically clean; in this process, all unclosed tags
are closed and unwanted lines are removed.
Fig. 3. Model for Information Extraction from HTML pages (list tags)
Once the pre-processing phase is completed, the clean HTML pages are fed into the structured data
extraction phase. This is accomplished using the Wrapper Induction technique [20], a supervised,
semi-automatic learning approach in which a set of extraction rules is learned from a collection
of manually labeled pages. The rules are then employed to extract target data items from other,
similarly formatted pages; a small sketch of applying such a rule is given below.
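The sketch below shows what applying one such learned rule might look like, here expressed as a CSS selector over the tidied HTML; the jsoup library and the ul.publications selector are assumptions for illustration, since the actual rules are induced from the labeled pages.

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class WrapperRuleDemo {
    public static void main(String[] args) {
        // A tidied HTML fragment standing in for a staff profile page.
        String html = "<html><body><ul class='publications'>"
                    + "<li>An Enhanced ..., 2012</li>"
                    + "<li>A Semantic ..., 2013</li>"
                    + "</ul></body></html>";

        Document doc = Jsoup.parse(html);
        // The learned 'rule': each <li> under ul.publications is one record.
        for (Element item : doc.select("ul.publications li")) {
            System.out.println(item.text());
        }
    }
}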
For example, the journal publication entry of a staff member may look like the following after conversion:
<rdf:Description>
  <staff:authors>A.M.Abirami</staff:authors>
  <staff:title>An Enhanced ….. </staff:title>
  <staff:journal>International Journal …. Technology</staff:journal>
  <staff:year>2012</staff:year>
</rdf:Description>
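A short sketch of emitting such a description programmatically with the Jena Model API follows; the namespace URI is a hypothetical placeholder.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;

public class RdfExportDemo {
    public static void main(String[] args) {
        String ns = "http://example.org/staff#";  // illustrative namespace
        Model model = ModelFactory.createDefaultModel();
        model.setNsPrefix("staff", ns);

        // One publication record produced by the extraction step.
        Resource pub = model.createResource();
        pub.addProperty(model.createProperty(ns, "authors"), "A.M.Abirami");
        pub.addProperty(model.createProperty(ns, "year"), "2012");

        model.write(System.out, "RDF/XML");  // serialize in RDF/XML syntax
    }
}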
Similarly, the profiles of all staff members are converted into RDF format. The words related to
each research area are also stored in RDF. The exact domain in which each staff member is doing
research is identified with the help of these words and the titles of their publications. The
similarity between staff members is measured with respect to their publications, so staff with
similar profiles are easily identified. The cosine similarity measure is used for this purpose (a
minimal sketch follows this paragraph); latent semantic indexing can be used as an alternative.
These sets of profiles can be recommended to newcomers who fall into the same category of
research, or a staff member can easily find his or her research colleagues and strengthen the
relationships between them. The model suggested in this paper is thus helpful for categorizing
web documents. We've developed a tool for visualization, which reports the number of people
working in a particular research field, the number of publications from each department in each
journal, and the like. Table 1 shows the new knowledge discovered from the set of profiles:
among the staff members of each department, the persons working in the same research category
are easily discovered.
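As mentioned above, the similarity computation compares term-frequency vectors built from the publication data. A minimal sketch of the cosine measure, with illustrative term counts standing in for the real publication-title vocabulary:

import java.util.*;

public class CosineSimilarity {

    // Cosine similarity between two term-frequency maps.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0);
            normA += e.getValue() * e.getValue();
        }
        for (int v : b.values()) normB += v * v;
        return (normA == 0 || normB == 0) ? 0.0 : dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Illustrative term counts from two staff members' publication titles.
        Map<String, Integer> staff1 = Map.of("mining", 3, "semantic", 1, "web", 2);
        Map<String, Integer> staff2 = Map.of("mining", 2, "clustering", 1, "web", 1);
        System.out.printf("similarity = %.3f%n", cosine(staff1, staff2));
    }
}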
Table 1. Knowledge discovered from staff member profiles

Department   Total Documents   Research Category       Matched Documents
CSE          14                Distributed Computing   CSE - 5, IT - 4, MCA - 1, ECE - 3
IT           11                Data Mining             CSE - 2, IT - 3, MCA - 2
MCA          7                 Security                CSE - 3, IT - 3
ECE          26                Software Engineering    CSE - 2, IT - 1, MCA - 2
                               Image Processing        CSE - 2, MCA - 1, ECE - 4
4.2 Experiment 2
Nowadays, all colleges use web applications to maintain students' personal and academic details,
but these may be maintained and managed by a single department. Not all department members are
given access to the database and its tables. Sometimes it becomes necessary to analyze students'
attendance and performance details, but we are left with only the data in the web pages. In this
case, it is better if these data are converted into a suitable machine-readable format so that
inference and analysis can easily be made on them; otherwise, considerable human effort is
required for this process.
Fig. 4. Model for Information Extraction from HTML pages (Tables)
We've followed the model shown in Fig. 4 to extract the relevant and required information about
students from the different web pages. We've used DOM APIs to extract the content from the HTML
tables and converted it into XML for querying. An ontology is used for building the URLs, and
the tool we developed learns the set of information to be retrieved from the login id and other
databases. Ontologies are constructed for the concepts student, staff member, proctor, subject
and semester. The relationships between the concepts are established using OWL; a pictorial
representation is shown in Fig. 5, and a small sketch of declaring such relationships follows
the figure. We've extracted our students' data from the HTML pages into XML and RDF, queried it
with different languages (XSLT, XPath and SPARQL), and compared the time taken for inference, as
shown in Table 2.
Fig. 5. Ontology representation for Experiment 2
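The sketch below shows how such concept relationships could be declared with Jena's ontology API; the namespace and the property name hasProctor are illustrative choices, not the exact vocabulary of our ontology.

import org.apache.jena.ontology.ObjectProperty;
import org.apache.jena.ontology.OntClass;
import org.apache.jena.ontology.OntModel;
import org.apache.jena.rdf.model.ModelFactory;

public class OntologyDemo {
    public static void main(String[] args) {
        String ns = "http://example.org/college#";  // illustrative namespace
        OntModel model = ModelFactory.createOntologyModel();

        // Concepts from Experiment 2.
        OntClass student = model.createClass(ns + "Student");
        OntClass staff = model.createClass(ns + "StaffMember");

        // Relationship: every student is assigned a proctor who is a staff member.
        ObjectProperty hasProctor = model.createObjectProperty(ns + "hasProctor");
        hasProctor.addDomain(student);
        hasProctor.addRange(staff);

        model.write(System.out, "RDF/XML");  // emit the OWL vocabulary
    }
}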
Table 2. Knowledge acquisition from different web pages

HTML      Manual Effort   Time taken (ms),             Time taken (ms),
Records   (in minutes)    single XML/RDF file          multiple XML/RDF files
                          XSLT     XPath    SPARQL     XPath     SPARQL
60        30              90       78       207        6458      890
140       80              350      218      220        25272     1190
200       130             540      360      230        97984     1388
500       280             985      795      240        213783    2398
Here, student details such as personal information, attendance and test marks are displayed in
different HTML tables; for example, the attendance of a single class is displayed on one web
page. In order to collate the details of a single student, we therefore need to parse three or
more web pages. The information extraction module traverses all these web pages and parses the
content into the required format, XML or RDF, enabling easy query processing. 'Single XML/RDF
file' means that a group of students' details is stored in one file; 'multiple XML/RDF files'
means that each student's details are stored in a separate file.
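To make the query step concrete, the following is a minimal sketch of the XPath variant using the standard javax.xml APIs. The XML layout, the element names and the roll attribute are illustrative assumptions, not the exact format produced by our tool.

import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class XPathDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for the XML produced from the attendance HTML tables.
        String xml = "<students>"
                   + "<student roll='101'><attendance>82</attendance></student>"
                   + "<student roll='102'><attendance>67</attendance></student>"
                   + "</students>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));

        // Find students whose attendance is below 75 per cent.
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList low = (NodeList) xpath.evaluate(
                "/students/student[attendance < 75]/@roll", doc, XPathConstants.NODESET);
        for (int i = 0; i < low.getLength(); i++) {
            System.out.println("Low attendance: roll " + low.item(i).getNodeValue());
        }
    }
}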
5. CONCLUSION
We have developed different semantic web applications to convert unstructured or semi-structured
text into XML/RDF format to enable easy machine processing. This approach leads to quick
inferences and new knowledge discovery from the data. As a future enhancement, we will apply
machine learning algorithms to classify web documents based on their content.
REFERENCES
[1] Raghu Anantharangachar and Ramani Srinivasan: Semantic Web Techniques for Yellow Page
Service Providers. International Journal of Web & Semantic Technology (IJWesT), Vol. 3, No. 3,
July 2012.
[2] Richard Vlach, Wassili Kazakos: Using Common Schemas for Information Extraction from
Heterogeneous Web Catalogs. ADBIS 2003, LNCS 2798, pp. 118-132, 2003.
[3] M. M. Wood, S. J. Lydon, V. Tablan, D. Maynard and H. Cunningham: Populating a Database from
Parallel Texts Using Ontology-Based Information Extraction. LNCS 3136, pp. 254-264, 2004.
[4] Rifat Ozcan, Ismail Sengor Altingovde and Ozgur Ulusoy: In Praise of Laziness: A Lazy
Strategy for Web Information Extraction. ECIR 2012, LNCS 7224, pp. 565-568, 2012.
[5] Jie Zou, Daniel Le and George R. Thoma: Structure and Content Analysis for HTML Medical
Articles: A Hidden Markov Model Approach. DocEng '07, ACM 978-1-59593-776.
[6] Rifat Ozcan, Y. Alp Aslandogan: Concept-Based Information Retrieval Using Ontologies and
Latent Semantic Analysis. Technical Report CSE-2004-8.
[7] Ayaz Ahmed Shariff K, Mohammed Ali Hussain and Sambath Kumar: Leveraging Unstructured Data
into Intelligent Information – Analysis & Evaluation. IPCSIT Vol. 4, 2011 International
Conference on Information and Network Technology.
[8] Harish Jadhao, Jagannath Aghav and Anil Vegiraju: Semantic Tool for Analysing Unstructured
Data. International Journal of Scientific & Engineering Research, Vol. 3, Issue 8.
[9] James Mayfield, Tim Finin: Information Retrieval on the Semantic Web: Integrating Inference
and Retrieval. SIGIR 2003 Semantic Web Workshop.
[10] Mohammed Al-Zoube, Baha Khasawneh: A Data Mashup for Dynamic Composition of Adaptive
Courses. The International Arab Journal of Information Technology, Vol. 7, No. 2, April 2010.
[11] Ujjal Marjit, Kumar Sharma and Utpal Biswas: Discovering Resume Information Using Linked
Data. International Journal of Web & Semantic Technology (IJWesT), Vol. 3, No. 2, April 2012.
[12] Chris Welty, J. William Murdock: Towards Knowledge Acquisition from Information Extraction.
ISWC '06: Proceedings of the 5th International Conference on the Semantic Web, pp. 709-722,
2006.
[13] Maciej Janik, Krys Kochut: Wikipedia in Action: Ontological Knowledge in Text
Categorization. University of Georgia, Technical Report No. UGA-CS-TR-07-001.
[14] Canan Pembe, Tunga Gungor: Heading-Based Sectional Hierarchy Identification for HTML
Documents. International Symposium on Computer and Information Sciences (ISCIS 2007).
[15] http://protege.stanford.edu/
[16] http://www.w3schools.com/rdf/rdf_example.asp
[17] http://www.w3.org/TR/rdf-sparql-query/
[18] http://jena.sourceforge.net/tutorial/RDF_API/
[19] http://www.obitko.com/tutorials/ontologies-semanticweb/ontologies.htm
[20] Bing Liu: Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data.
[21] Grigoris Antoniou and Frank van Harmelen: A Semantic Web Primer.