Extensible markup language (XML) is well known as the standard for data exchange over the internet. It is flexible and expressive enough to capture relationships between the data it stores, yet structural complexity and semantic relationships are not well expressed. Ontologies, on the other hand, model structural, semantic and domain knowledge effectively, and combining an ontology with visualization lets users examine the data more closely according to their requirements. In this paper, we propose several mapping rules for transforming XML into an ontology representation. We then show how the ontology is constructed from the proposed rules using a sample domain ontology over the University of Wisconsin-Milwaukee (UWM) and Mondial datasets. We also examine the schemas, query workload, and evaluation to derive extended knowledge from the existing ontology. The correctness of the ontology representation is demonstrated by its support for various types of complex queries in the simple protocol and resource description framework query language (SPARQL).
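As a rough illustration of the idea, the sketch below applies two hypothetical mapping rules (not the paper's exact rule set): complex elements become classes, nested simple elements become datatype properties, and each fact is held as a (subject, predicate, object) triple that a SPARQL-style lookup can then filter.

```python
# Minimal sketch of XML-to-ontology mapping; rules and names are illustrative.
import xml.etree.ElementTree as ET

XML = """
<university>
  <course><title>Databases</title><credits>3</credits></course>
  <course><title>Ontologies</title><credits>4</credits></course>
</university>
"""

def xml_to_triples(xml_text):
    triples = []
    root = ET.fromstring(xml_text)
    for i, elem in enumerate(root):
        subject = f"{elem.tag}_{i}"          # instance URI stand-in
        triples.append((subject, "rdf:type", elem.tag.capitalize()))
        for child in elem:                   # simple children -> datatype properties
            triples.append((subject, f"has{child.tag.capitalize()}", child.text))
    return triples

triples = xml_to_triples(XML)
# A SPARQL-like lookup: titles of all Course instances.
titles = [o for s, p, o in triples if p == "hasTitle"]
print(titles)
```

A real pipeline would emit RDF/OWL (e.g. via an RDF library) rather than Python tuples; the tuples only make the shape of the mapping visible.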
Catalog-based Conversion from Relational Database into XML Schema (XSD), CSCJournals
In an age of information revolution, where information is exchanged and data is transported among government, commercial, service and industrial sectors, new database models have become essential because traditional models cannot support this trend. eXtensible Markup Language (XML) is a standard model for data interchange over the internet and mobile device networks, and it has become a common language for exchanging and sharing the data of traditional models easily and inexpensively. In this research, we propose a new technique to convert relational database contents and schema into an XML schema (XSD, XML Schema Definition); the main idea is to extract the relational database catalog using Structured Query Language (SQL). The conversion proceeds in three steps. First, we extract the relation instances (actual content) and the schema catalog using SQL queries; these provide the information required to build the XML document and its schema. Second, we transform the actual content into an XML document tree, converting the columns of the relations (tables) into XML document elements. Third, we transform the schema catalog into an XML schema describing the structure of the XML document: we map element datatypes and the various data constraints such as length, NOT NULL, CHECK and DEFAULT, and we define primary and foreign keys and the referential integrity between tables. Overall, the results are very promising, and the technique is straightforward, requiring no complex procedures that could adversely affect accuracy. We performed many experiments and report their elapsed CPU times.
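The catalog-driven conversion can be sketched as follows, using SQLite's PRAGMA catalog as a stand-in for the SQL catalog queries described above (table, column and type names are illustrative, not the paper's implementation):

```python
# Read one table's definition from the database catalog and emit an XSD fragment.
import sqlite3

TYPE_MAP = {"INTEGER": "xs:integer", "TEXT": "xs:string", "REAL": "xs:decimal"}

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def table_to_xsd(con, table):
    cols = con.execute(f"PRAGMA table_info({table})").fetchall()
    lines = [f'<xs:element name="{table}"><xs:complexType><xs:sequence>']
    for _cid, name, ctype, notnull, _default, _pk in cols:
        min_occurs = 1 if notnull else 0   # NOT NULL constraint -> required element
        lines.append(
            f'  <xs:element name="{name}" type="{TYPE_MAP.get(ctype, "xs:string")}"'
            f' minOccurs="{min_occurs}"/>'
        )
    lines.append("</xs:sequence></xs:complexType></xs:element>")
    return "\n".join(lines)

print(table_to_xsd(con, "student"))
```

The full technique additionally emits key/keyref declarations for primary and foreign keys and a second pass that serializes row contents as element trees.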
XML COMPACTION IMPROVEMENTS BASED ON BINARY STRING ENCODINGS, ijdms
Due to its flexibility and ease of use, XML is nowadays widely used in a vast number of application areas, and new information is increasingly encoded as XML documents. It is therefore important to provide a repository for XML documents that supports efficient management and storage of XML data. Many proposals have been made for this purpose, the most common being node labeling schemes. On the other hand, XML repeatedly uses tags to describe the data itself. This self-describing nature makes XML verbose, so its storage requirements are often inflated and can be excessive; the increased size in turn raises the cost of data manipulation. It therefore also seems natural to apply compression techniques to improve the efficiency of storing and querying XML data. In previous work, we aimed to combine the advantages of both areas (labeling and compaction technologies); specifically, we exploited structural peculiarities of XML to reduce storage space requirements and improve the efficiency of XML query processing using labeling schemes. In this paper, we continue our investigation of variations of binary string encoding forms to decrease label size. We also report experimental results examining the impact of binary string encoding on query performance and on the storage needed for the compacted XML documents.
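One concrete family of such encodings, shown here purely for illustration (it is not necessarily the scheme evaluated in the paper), writes each component of a Dewey-style label as a variable-length quantity: 7 data bits per byte plus a continuation bit, so every component boundary falls on a byte boundary and the code stays prefix-free per component.

```python
# Byte-aligned, prefix-free encoding of multi-component node labels (sketch).
def encode_component(n):
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # continuation bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def encode_label(label):              # a Dewey-style label like (1, 3, 130)
    return b"".join(encode_component(c) for c in label)

def decode_label(data):
    label, n, shift = [], 0, 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:           # last byte of this component
            label.append(n)
            n, shift = 0, 0
    return tuple(label)

lbl = (1, 3, 130)
assert decode_label(encode_label(lbl)) == lbl
print(len(encode_label(lbl)))         # small components cost one byte, 130 costs two
```

Varying the group width or the continuation convention gives the different encoding forms whose storage/query trade-offs such experiments compare.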
Very little research has been done on XML security over relational databases, even though XML has become the de facto standard for data representation and exchange on the internet and many XML documents are stored in RDBMSs. In [14], the author proposed an access control model for schema-based storage of XML documents in relational storage, translating XML access control rules into relational access control rules; however, the proposed algorithms had performance drawbacks. In this paper, we use the same access control model as [14] and try to overcome its drawbacks by proposing an efficient technique to store the XML access control rules in a relational storage of the XML DTD. The mapping of the XML DTD to a relational schema follows [7], and we also propose an algorithm to translate XPath queries into SQL queries based on the mapping algorithm in [7].
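The flavor of such XPath-to-SQL translation can be sketched over a generic edge-table mapping (table and column names here are illustrative; the paper builds on the specific mapping of [7], which is not reproduced):

```python
# Translate a simple child-axis XPath into nested SQL over an edge table.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE node (id INTEGER, parent INTEGER, tag TEXT, value TEXT)")
con.executemany("INSERT INTO node VALUES (?,?,?,?)", [
    (1, None, "dept", None),
    (2, 1, "employee", None),
    (3, 2, "name", "Ada"),
])

def xpath_to_sql(xpath):
    steps = [s for s in xpath.split("/") if s]
    sql = f"SELECT n0.* FROM node n0 WHERE n0.tag='{steps[0]}' AND n0.parent IS NULL"
    for i, step in enumerate(steps[1:], start=1):
        # each location step becomes one parent/child self-join
        sql = (f"SELECT n{i}.* FROM ({sql}) n{i-1} JOIN node n{i} "
               f"ON n{i}.parent = n{i-1}.id WHERE n{i}.tag='{step}'")
    return sql

rows = con.execute(xpath_to_sql("/dept/employee/name")).fetchall()
print(rows)
```

A production translator would parameterize tag values rather than interpolate them, and would additionally inject the relational access control predicates derived from the XML rules into each generated query.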
A large volume of information on the Web is stored in XML format, and clustering is one way to manage these documents. XML documents have both structural and content aspects, but most current clustering methods consider only one of the two. In this paper, we propose SCEM (Expectation Maximization Structure and Content), which clusters XML documents effectively by combining content and structural features. A further contribution is our use of probabilistic distributions whose probability parameters correspond to individual clusters; this generality yields better effectiveness than other clustering methods. Experimental results on real datasets show the effectiveness of the proposed method, particularly on large XML documents without a schema, and it can also be used to improve the accuracy and effectiveness of XML information retrieval.
2008 Industry Standards for C2 CDM and Framework, Bob Marcus
The document discusses standards for data modeling, metadata tagging, XML processing and schemas, and transport that enable interoperability across networks with varying degrees of coupling between systems. It categorizes use cases as intranet-centric within a single organization, extranet-centric across organizations with common standards, and internet-centric for ad hoc interactions. The relationships between these categories and appropriate standards are illustrated. Key standards discussed include XML schemas, RDF, OWL, and data models for command and control like C2IEDM.
Towards a Query Rewriting Algorithm Over Proteomics XML Resources, CSCJournals
Querying and sharing Web proteomics data is not an easy task. Given that several data sources can answer the same sub-goals of a global query, many candidate rewritings are possible. The user query is formulated using concepts and properties related to proteomics research (a domain ontology), and semantic mappings describe the contents of the underlying sources. In this paper, we characterize the query rewriting problem by representing the semantic mappings as an associated hypergraph, so that generating candidate rewritings can be formulated as discovering the minimal transversals of a hypergraph. We exploit and adapt algorithms from hypergraph theory to find all candidate rewritings for a query answering problem. In future work, relevance criteria could help determine optimal, high-quality rewritings according to user needs and source performance.
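The transversal formulation can be made concrete with a toy example: each hyperedge is the set of sources able to answer one sub-goal, and every minimal transversal is a non-redundant choice of sources covering all sub-goals, i.e. one candidate rewriting. The brute-force enumeration below is only a sketch; dedicated transversal algorithms scale far better.

```python
# Enumerate minimal transversals (hitting sets) of a small hypergraph.
from itertools import combinations

def minimal_transversals(vertices, edges):
    hits = lambda t: all(t & e for e in edges)   # t intersects every hyperedge
    found = []
    for size in range(1, len(vertices) + 1):     # smallest sets first
        for combo in combinations(sorted(vertices), size):
            t = set(combo)
            # keep t only if it hits all edges and no smaller transversal is inside it
            if hits(t) and not any(f <= t for f in found):
                found.append(t)
    return found

# Sub-goal g1 answerable by sources {S1, S2}, g2 by {S2, S3}, g3 only by {S3}.
edges = [{"S1", "S2"}, {"S2", "S3"}, {"S3"}]
print(minimal_transversals({"S1", "S2", "S3"}, edges))
```

Here each printed set corresponds to one candidate rewriting of the global query over concrete sources.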
This document discusses querying XML schemas using XSPath, a path language for XML schemas. It begins with an abstract stating that XML schemas are used to constrain the structure and content of XML documents, but previous methods of navigating schemas were not easy. XSPath allows navigating and selecting nodes in XML schemas. The document then provides background on XML schemas and XPath, and describes related work on schema exploration and querying. It explains how schemas can be represented and provides an abbreviated syntax and examples of XSPath.
HOLISTIC EVALUATION OF XML QUERIES WITH STRUCTURAL PREFERENCES ON AN ANNOTATE..., ijseajournal
With the emergence of XML as the de facto format for storing and exchanging information over the Internet, the search for ever more innovative and effective querying techniques is a major, ongoing concern of the XML database community. Most studies addressing this problem are oriented towards so-called exact queries, which, especially over semi-structured documents, tend to yield either abundant results (for vague queries) or empty results (for very precise queries). Observing that users are not necessarily interested in all possible solutions, but rather in those closest to their needs, an important line of research has opened on the evaluation of preference queries. In this paper, we propose an approach for evaluating such queries when the preferences concern the structure of the document. The solution revolves around a three-phase evaluation plan: rewriting, evaluation, and merging. The rewriting phase applies a partitioning-transformation operation to the initial query to obtain a hierarchical set of preference path queries, which are evaluated holistically in the second phase by an instrumented version of the TwigStack algorithm. The merge phase synthesizes the best results.
INVESTIGATING BINARY STRING ENCODING FOR COMPACT REPRESENTATION OF XML DOCUMENTS, csandit
This document summarizes an investigation into using binary string encoding to compactly represent XML documents. The study explores prefix-free encoding schemes that map XML node labels to binary strings to minimize storage requirements. Experimental results show that a proposed prefix-free encoding scheme where each bitstring division observes byte boundaries achieved the most efficient storage sizes for various real and synthetic XML datasets compacted using different labeling and compaction approaches. Future work could explore additional encoding formats and their impact on storage costs and query/update performance for compacted XML data.
The potential of ontologies to reduce human interference has a wide range of applications. This paper identifies the requirements for an ontology development platform to support an artificially intelligent web. To facilitate this process, RDF and OWL have been developed as standard formats for sharing and integrating data and knowledge, expressed as rich conceptual schemas called ontologies. Based on this framework, an architectural paradigm is put forward for ontology engineering and the development of ontology applications, along with a development portal designed to support ontology engineering, content authoring and application development, with a view to maximal scalability in the size and complexity of semantic knowledge and flexible reuse of ontology models and ontology application processes in a distributed, collaborative engineering environment.
The document discusses space efficient data structures for storing JSON documents in memory. It introduces JSON and compares it to XML. Succinct data structures are discussed as a way to represent trees in a space efficient manner close to theoretical minimum while still supporting efficient operations. The paper aims to apply succinct data structures to represent the tree structure of JSON documents to allow processing large documents in limited memory.
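The core idea can be sketched in a few lines: the shape of a JSON document's tree can be serialized as a balanced-parentheses sequence, 2 bits per node (near the information-theoretic minimum for tree structure), with keys and values kept in side arrays. This is only an illustration of the encoding, not a full succinct index with rank/select support.

```python
# Balanced-parentheses encoding of a JSON document's tree shape (sketch).
import json

def to_bp(value, bits, labels):
    bits.append("(")                       # enter node
    if isinstance(value, dict):
        for key, child in value.items():
            labels.append(key)             # keys go to a side array
            to_bp(child, bits, labels)
    elif isinstance(value, list):
        for child in value:
            to_bp(child, bits, labels)
    else:
        labels.append(value)               # scalar values go to the side array too
    bits.append(")")                       # exit node

doc = json.loads('{"name": "ok", "tags": [1, 2]}')
bits, labels = [], []
to_bp(doc, bits, labels)
print("".join(bits))                       # tree shape only: 2 bits per node
```

A real succinct representation stores the parentheses as a bit vector with auxiliary structures so that parent/child navigation runs in constant time without decompressing the document.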
Transforming data-centric eXtensible markup language into relational database..., journalBEEI
eXtensible markup language (XML) has emerged internationally as the format for data representation over the web, yet most organizations still use relational databases as their database solutions. It is therefore crucial to provide seamless integration via effective transformation between these database infrastructures. In this paper, we propose XML-REG to bridge the two technologies using node-based and path-based approaches: the node-based approach annotates each positional node uniquely, while the path-based approach provides summarized path information to join the nodes. On top of that, a new range labelling scheme annotates nodes uniquely while preserving the structural relationships between them; if a new node is added to the document, no re-labelling is required, as a fresh label is assigned to the node by the proposed scheme. Experimental evaluations indicate that XML-REG outperforms XMap, XRecursive, XAncestor and Mini-XML in storing time, query retrieval time and scalability. This research produces a core framework for XML-to-relational-database (RDB) mapping that could be adopted in various industries.
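Range labelling with slack can be illustrated as follows (this is the general technique, not XML-REG's exact scheme): each node gets a (start, end, level) interval, a node is an ancestor exactly when its interval contains the descendant's, and gaps left between consecutive intervals let new siblings be inserted later without relabelling existing nodes.

```python
# Range labelling of a small tree, reserving gaps for future insertions.
GAP = 100  # slack reserved around each interval for future inserts

def label_tree(node, start, level, out):
    name, children = node
    cursor = start + 1
    for child in children:
        cursor = label_tree(child, cursor, level + 1, out)
    end = max(cursor, start + GAP)        # reserve room inside the range
    out[name] = (start, end, level)
    return end + GAP                      # and room after it

tree = ("dept", [("employee", [("name", [])])])
labels = {}
label_tree(tree, 0, 0, labels)

def is_ancestor(a, d):
    (s1, e1, _), (s2, e2, _) = labels[a], labels[d]
    return s1 < s2 and e2 <= e1           # containment test: no joins needed

print(is_ancestor("dept", "name"))
```

The containment test is what makes such labels attractive in an RDB: ancestor-descendant queries become simple range comparisons on indexed columns.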
Formal Models and Algorithms for XML Data Interoperability, Thomas Lee
In this paper, we study the data interoperability problem of web services in terms of XML schema compatibility. When Web Service A sends XML messages to Web Service B, A is interoperable with B if B can accept all messages from A. That is, the XML schema R with which B receives XML instances must be compatible with the XML schema S with which A sends XML instances, i.e., S must be a subschema of R. We propose a formal model called the Schema Automaton (SA) to model W3C XML Schema (XSD) and develop several algorithms for XML schema computations: schema minimization, schema equivalence testing, subschema testing, and subschema extraction. We have conducted experiments on an e-commerce standard XSD called xCBL to demonstrate the practicality of our algorithms. One experiment refuted the claim that the xCBL 3.5 XSD is backward compatible with the xCBL 3.0 XSD. Another showed that the xCBL XSDs can be effectively trimmed into small subschemas for specific applications, significantly reducing the schema processing time.
This paper investigates using Object Role Modeling (ORM), a graphical conceptual modeling technique, to design XML schemas. The authors describe an algorithm to automatically generate an XML schema file from an ORM conceptual data model. Their approach aims to reduce data redundancy and increase connectivity in the resulting XML instances. Major object types are identified in the ORM model and used as the starting point for hierarchies in the generated XML schema to achieve this goal.
Ontology languages model the semantics of concepts within a particular domain and the relationships between those concepts. The Semantic Web standard provides a number of modelling languages that differ in their level of expressivity and are organized in a Semantic Web Stack, each language level building on the expressivity of the one below. Several problems arise when one attempts to use independently developed ontologies, and adapting existing ontologies for new purposes requires performing certain operations on them; these operations are currently carried out in a semi-automated manner. This paper seeks to model categorically the syntax and semantics of an RDF ontology as a step towards formalizing ontological operations using category theory.
This document proposes an enhanced method of XML validation using the SRML metalanguage. It extends SRML to allow for more complex validation rules and the ability to correct invalid data. The validation is done in two steps, first using XSD for structure and then SRML rules for content. It also explores applying the SRML validator to databases for record validation on database operations. This provides more granular validation of dataset records compared to validating the entire XML file.
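The two-step validation described above can be sketched as follows; here a trivial required-children check stands in for the XSD structural pass, and the rule format (predicate plus corrector) is illustrative rather than SRML's actual syntax.

```python
# Two-step validation: structure first, then content rules that can correct data.
import xml.etree.ElementTree as ET

REQUIRED = {"record": ["age"]}
RULES = {  # field -> (validity predicate, correction function)
    "age": (lambda v: v.isdigit() and 0 <= int(v) <= 130,
            lambda v: "0"),
}

def validate(xml_text):
    root = ET.fromstring(xml_text)
    for tag in REQUIRED.get(root.tag, []):          # step 1: structural check
        if root.find(tag) is None:
            return False, f"missing <{tag}>"
    for field, (ok, fix) in RULES.items():          # step 2: content rules
        elem = root.find(field)
        if elem is not None and not ok(elem.text):
            elem.text = fix(elem.text)              # correct invalid data in place
    return True, ET.tostring(root, encoding="unicode")

valid, result = validate("<record><age>-5</age></record>")
print(valid, result)
```

Applied per record on database insert or update, this granularity is what distinguishes the approach from validating an entire XML file at once.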
This document discusses ontology mapping. It begins with an introduction to the semantic web and ontologies. Ontology mapping is important for allowing different ontologies to be aligned and related. There are different types of ontology mapping including alignment, merging, and mapping. The document then surveys some popular ontology mapping techniques including GLUE, PROMPT, and QOM. It evaluates these techniques and discusses their inputs, outputs, and approaches. The document concludes that semantic web research is important for advancing web technologies and realizing the goals of web 3.0. Future work could involve developing new ontology mapping techniques and publishing research on existing mapping methods.
A Survey on Heterogeneous Data Exchange using Xml, IRJET Journal
This document summarizes a research paper on heterogeneous data exchange using XML. It discusses how XML has become a standard for data transmission due to its flexibility, extensibility and ability to represent heterogeneous data. The document then reviews related work on XML data exchange and mapping between relational and XML models. It also describes the process of exporting data from a source database to XML, importing XML data by validating, transforming and storing it in the target database, and transmitting data between different servers.
The document provides an overview of approaches for clustering XML data based on structure and content. It first outlines applications where XML clustering is useful, including XML query processing and data integration. It then presents a generic framework for XML clustering with three phases: data representation, similarity computation, and clustering/grouping. The document surveys current approaches and aims to classify them and identify common features. It also discusses challenges in XML clustering and future research directions.
The world is witnessing an unprecedented information revolution, with databases growing at great speed in all areas. Databases are interconnected through their content and schemas but use different elements and structures to express the same concepts and relations, which can cause semantic and structural conflicts. This paper proposes a new technique, XDEHD, for integrating heterogeneous eXtensible Markup Language (XML) schemas. The returned mediated schema contains all concepts and relations of the sources without duplication. The technique has three steps. First, it extracts all subschemas from the sources by decomposing the source schemas; each subschema has three levels: ancestor, root and leaf. Second, it matches and compares the subschemas and returns the related candidate subschemas, using a semantic closeness function to measure how similarly the concepts of the subschemas are modelled in the sources. Finally, it creates the mediated schema by integrating the candidate subschemas to obtain a minimal, complete unified schema; an association strength function computes how closely each pair in a candidate subschema is related across all data sources, and an element repetition function counts how many times each element is repeated between the candidate subschemas.
A Semi-Automatic Ontology Extension Method for Semantic Web Services, IDES Editor
This paper provides a novel semi-automatic ontology extension method for Semantic Web Services (SWS). This is significant because the ontology extension methods in the literature mostly deal with the semantic description of static Web resources such as text documents; methods are therefore needed that can serve dynamic Web resources such as SWS. The method developed in this paper avoids redundancy and respects consistency so as to assure the high quality of the resulting shared ontologies.
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS, ijseajournal
In this paper we propose a novel method to cluster categorical data while retaining its context. Clustering is typically performed on numerical data, but it is often useful to cluster categorical data as well, especially in real-world settings. Several methods exist for clustering categorical data; our approach is unique in using recent text-processing and machine learning advances such as GloVe and t-SNE to develop a context-aware clustering approach based on pre-trained word embeddings. We encode words or categorical values as numerical, context-aware vectors and cluster the data points using common clustering algorithms such as K-means.
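The pipeline can be sketched with toy 2-d vectors standing in for GloVe embeddings (real GloVe vectors are 50-300 dimensional and loaded from a pre-trained file): categories are mapped to vectors, then grouped with a plain k-means loop.

```python
# Embed categorical values, then cluster the vectors with k-means (sketch).
import math

EMBED = {  # hypothetical "context-aware" vectors: pets vs. vehicles
    "cat": (0.9, 0.1), "dog": (0.8, 0.2),
    "car": (0.1, 0.9), "bus": (0.2, 0.8),
}

def kmeans(points, k, iters=10):
    centers = points[:k]                   # naive seeding; empty clusters not handled
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:                   # assign each point to nearest center
            j = min(range(k), key=lambda j: math.dist(p, centers[j]))
            groups[j].append(p)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) for g in groups]
    return groups

words = list(EMBED)
groups = kmeans([EMBED[w] for w in words], k=2)
clusters = [[w for w in words if EMBED[w] in g] for g in groups]
print(clusters)
```

Because the vectors carry context, semantically related categories land in the same cluster even though the raw labels share no characters.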
Expression of Query in XML object-oriented database, Editor IJCATR
This document discusses expressing queries in an XML object-oriented database. It proposes a method for querying an object-oriented database where the user can assign weights to restrictions in conjunctive or disjunctive queries based on importance. The queries and resulting objects are represented using XML labels to simplify them and make them closer to user needs. It also presents a case study of a book information registration system to evaluate the proposed query method on an XML object-oriented database.
Expression of Query in XML object-oriented database, Editor IJCATR
With the advent of object-oriented databases, the concept of behavior in databases was introduced; previously, relational databases provided only a logical model of the data and paid no attention to the operations applied to it. In this paper, a method is presented for querying an object-oriented database. The method gives good results when the user expresses restrictions in combinational form (disjunctive and conjunctive) and assigns each restriction a weight based on its importance; the results obtained are then sorted by their degree of membership in the response set. Queries are then expressed using XML labels, with the aim of simplifying queries and their resulting objects so that they closely match the user's needs and expectations.
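The weighted-restriction idea can be sketched as follows; the membership score used here (weighted fraction of satisfied restrictions) is one simple choice made for illustration, not necessarily the paper's exact formula.

```python
# Rank objects by a weighted degree of membership in the response set.
books = [
    {"title": "XML Basics", "year": 2004, "pages": 300},
    {"title": "Query Languages", "year": 1999, "pages": 150},
]

restrictions = [  # (predicate, user-assigned importance weight)
    (lambda b: b["year"] >= 2000, 0.7),
    (lambda b: b["pages"] >= 200, 0.3),
]

def score(obj):
    total = sum(w for _, w in restrictions)
    return sum(w for pred, w in restrictions if pred(obj)) / total

ranked = sorted(books, key=score, reverse=True)
print([(b["title"], score(b)) for b in ranked])
```

An object failing only low-weight restrictions still ranks highly, which is exactly the soft behavior a purely conjunctive query cannot express.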
USING RELATIONAL MODEL TO STORE OWL ONTOLOGIES AND FACTS (csandit)
The storing and processing of OWL instances are important subjects in database modeling. Much research has focused on managing OWL instances efficiently, and some systems store and manage OWL instances in relational models to ensure their persistence. Nevertheless, several approaches keep only RDF triples as instances in relational tables, without accounting for the graph structure of the instances or the links between concepts. In this paper, we propose an architecture that lets relational tables behave as an OWL model by adapting them to OWL instances and to an OWL hierarchy structure. Two kinds of tables are used: an OWLtable and instance tables. The instance tables hold the facts, while the OWLtable holds a specification of how the concepts are structured; an instance table must conform to the OWLtable to be valid. A mechanism for constructing the OWLtable and the instance tables is defined in order to enable and enhance inference and semantic querying of OWL in a relational-model context.
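A minimal sketch of the two-table idea: one table records how concepts are structured (here reduced to the subclass hierarchy) and another holds the instances, so a semantic query can walk the stored hierarchy. Table and column names are illustrative assumptions, not the paper's schema.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE owltable (concept TEXT, parent TEXT)")
con.execute("CREATE TABLE instances (id TEXT, concept TEXT)")

# Concept hierarchy: Student is a subclass of Person.
con.executemany("INSERT INTO owltable VALUES (?, ?)",
                [("Person", None), ("Student", "Person")])
con.executemany("INSERT INTO instances VALUES (?, ?)",
                [("alice", "Student"), ("bob", "Person")])

def members(concept):
    """Semantic query: instances of a concept, including its subclasses,
    via a recursive walk over the stored hierarchy."""
    cur = con.execute("""
        WITH RECURSIVE sub(c) AS (
            SELECT ? UNION
            SELECT concept FROM owltable, sub WHERE parent = sub.c)
        SELECT id FROM instances WHERE concept IN (SELECT c FROM sub)""",
        (concept,))
    return sorted(row[0] for row in cur)
```

Asking for members of `Person` then returns `alice` too, even though she is stored as a `Student`, which is the kind of inference the OWLtable is meant to enable.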
Effective Data Retrieval in XML using TreeMatch Algorithm (IRJET Journal)
This document summarizes research on effective data retrieval from XML documents using the TreeMatch algorithm. It begins with an abstract that introduces TreeMatch and its ability to retrieve data quickly from XML documents by matching tree-shaped patterns. It then reviews related work on XML tree-matching algorithms and their issues, such as suboptimality. The document proposes using TreeMatch to overcome problems with wildcards, negation, and siblings when querying XML documents with XPath or XQuery, and details the algorithm's ability to process different types of XML tree-pattern queries efficiently while avoiding intermediate results. It concludes that TreeMatch can efficiently handle three types of XML tree-pattern queries and overcomes the problem of suboptimality.
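The TreeMatch algorithm itself is not reproduced here; the toy matcher below only illustrates what a tree-shaped pattern query is. A pattern node matches a document node when the labels agree (with `*` as a wildcard) and every pattern child matches some document child; all names and the tree encoding are illustrative.

```python
def matches(pattern, node):
    """Both trees are (label, [children]) tuples; '*' is a wildcard."""
    label, p_children = pattern
    n_label, n_children = node
    if label not in ("*", n_label):
        return False
    # Every pattern child must match some child of the document node.
    return all(any(matches(pc, nc) for nc in n_children) for pc in p_children)

# <book><title/><author/></book>
doc = ("book", [("title", []), ("author", [])])

assert matches(("book", [("title", [])]), doc)      # plain pattern
assert matches(("*", [("author", [])]), doc)        # wildcard root
assert not matches(("book", [("price", [])]), doc)  # no <price> child
```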
Convolutional neural network with binary moth flame optimization for emotion ... (IAESIJAI)
Electroencephalograph (EEG) signals reflect brain activity in real time, and using them to analyze human emotional states is a common line of study. EEG signatures of emotion are not uniform: they differ from one person to another, since each person responds differently to the same stimuli. EEG signals are therefore subject-dependent and have proven effective for subject-dependent emotion detection. To achieve higher accuracy and a high true-positive rate, the proposed system uses a binary moth flame optimization (BMFO) algorithm for feature selection and convolutional neural networks (CNNs) for classification. Optimal features are chosen using accuracy as the objective function and are then classified with a CNN to discriminate between emotional states.
A novel ensemble model for detecting fake news (IAESIJAI)
Given the growing proliferation of fake news over the past couple of years, our objective in this paper is to propose an ensemble model for the automatic classification of news articles as either real or fake. For this purpose, we opt for a blending technique that combines three models, namely a bidirectional long short-term memory (Bi-LSTM) network, a stochastic gradient descent classifier, and a ridge classifier. The implementation of the proposed model (BI-LSR) on real-world datasets has shown outstanding results, achieving an accuracy score of 99.16%. This ensemble has thus proven to perform better than individual conventional machine learning and deep learning models, as well as many ensemble learning approaches cited in the literature.
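The blending idea can be sketched as follows. The three toy "models" and their cue-based probabilities are stand-ins invented for illustration (the paper blends a Bi-LSTM, an SGD classifier, and a ridge classifier): each base model emits a probability that an article is fake, and the blender combines them, here with a weighted average and a 0.5 threshold.

```python
def blend(probs, weights):
    """Weighted average of the base models' fake-probabilities."""
    return sum(w * p for w, p in zip(weights, probs)) / sum(weights)

def classify(article, models, weights):
    probs = [m(article) for m in models]
    return "fake" if blend(probs, weights) >= 0.5 else "real"

# Toy base models keyed on superficial cues (illustrative only).
models = [
    lambda a: 0.9 if "!!!" in a else 0.2,            # punctuation cue
    lambda a: 0.8 if a.isupper() else 0.3,           # all-caps cue
    lambda a: 0.7 if "miracle" in a.lower() else 0.1,
]
weights = [0.5, 0.2, 0.3]
```

In a real blending setup the weights themselves would be learned by a meta-model on a held-out split rather than fixed by hand.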
Similar to Mapping of extensible markup language-to-ontology representation for effective data integration
HOLISTIC EVALUATION OF XML QUERIES WITH STRUCTURAL PREFERENCES ON AN ANNOTATE... (ijseajournal)
With the emergence of XML as the de facto format for storing and exchanging information over the Internet, the search for ever more innovative and effective querying techniques is a major, ongoing concern of the XML database community. Most studies addressing this problem are oriented towards the evaluation of so-called exact queries, which, especially for semi-structured documents, tend to yield either overabundant results (for vague queries) or empty results (for very precise queries). Observing that users are not necessarily interested in all possible solutions, but rather in those closest to their needs, an important field of research has opened up around the evaluation of preference queries. In this paper, we propose an approach for evaluating such queries when the preferences concern the structure of the document. The solution revolves around a three-phase evaluation plan: rewriting, evaluation, and merging. The rewriting phase uses a partitioning-transformation operation on the initial query to obtain a hierarchical set of preference path queries, which are evaluated holistically in the second phase by an instrumented version of the TwigStack algorithm. The merge phase synthesizes the best results.
INVESTIGATING BINARY STRING ENCODING FOR COMPACT REPRESENTATION OF XML DOCUMENTS (csandit)
This document summarizes an investigation into using binary string encoding to compactly represent XML documents. The study explores prefix-free encoding schemes that map XML node labels to binary strings to minimize storage requirements. Experimental results show that a proposed prefix-free encoding scheme where each bitstring division observes byte boundaries achieved the most efficient storage sizes for various real and synthetic XML datasets compacted using different labeling and compaction approaches. Future work could explore additional encoding formats and their impact on storage costs and query/update performance for compacted XML data.
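One concrete way to obtain a prefix-free, byte-aligned code, in the spirit of the study though not its exact scheme, is varint-style encoding: the high bit of each byte marks "more bytes follow", so every codeword ends on a byte boundary and no codeword is a prefix of another.

```python
def encode(n: int) -> bytes:
    """Encode a non-negative label number as a prefix-free byte string."""
    out = []
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # continuation bit set: more follows
        else:
            out.append(byte)         # final byte: high bit clear
            return bytes(out)

def decode(data: bytes) -> int:
    n, shift = 0, 0
    for byte in data:
        n |= (byte & 0x7F) << shift
        shift += 7
    return n
```

Prefix-freeness follows because a codeword's last byte has its high bit clear, while every non-final byte has it set, so one codeword can never be the start of another.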
The increased potential of the ontologies to reduce the human interference has wide range of applications. This paper identifies requirements for an ontology development platform to innovate artificially intelligent web. To facilitate this process, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge. The knowledge in the form of rich conceptual schemas called ontologies. Based on the framework, an architectural paradigm is put forward in view of ontology engineering and development of ontology applications and a development portal designed to support ontology engineering, content authoring and application development with a view to maximal scalability in size and complexity of semantic knowledge and flexible reuse of ontology models and ontology application processes in a distributed and collaborative engineering environment.
The document discusses space efficient data structures for storing JSON documents in memory. It introduces JSON and compares it to XML. Succinct data structures are discussed as a way to represent trees in a space efficient manner close to theoretical minimum while still supporting efficient operations. The paper aims to apply succinct data structures to represent the tree structure of JSON documents to allow processing large documents in limited memory.
Transforming data-centric eXtensible markup language into relational database... (journalBEEI)
eXtensible markup language (XML) appeared internationally as the format for data representation over the web. Yet, most organizations are still utilising relational databases as their database solutions. As such, it is crucial to provide seamless integration via effective transformation between these database infrastructures. In this paper, we propose XML-REG to bridge these two technologies based on node-based and path-based approaches. The node-based approach is good to annotate each positional node uniquely, while the path-based approach provides summarised path information to join the nodes. On top of that, a new range labelling is also proposed to annotate nodes uniquely by ensuring the structural relationships are maintained between nodes. If a new node is to be added to the document, re-labelling is not required as the new label will be assigned to the node via the new proposed labelling scheme. Experimental evaluations indicated that the performance of XML-REG exceeded XMap, XRecursive, XAncestor and Mini-XML concerning storing time, query retrieval time and scalability. This research produces a core framework for XML to relational databases (RDB) mapping, which could be adopted in various industries.
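The details of XML-REG's labels are in the paper; the sketch below shows only the general range-labelling idea such schemes build on. Each node gets a (start, end) interval from a depth-first walk (node names assumed unique here for simplicity), so ancestor/descendant tests reduce to interval containment.

```python
def assign_ranges(tree, labels=None, counter=None):
    """tree: (name, [children]). Assigns (start, end) so that a node's
    interval strictly encloses exactly its descendants' intervals."""
    if labels is None:
        labels, counter = {}, [0]
    name, children = tree
    counter[0] += 1
    start = counter[0]
    for child in children:
        assign_ranges(child, labels, counter)
    counter[0] += 1
    labels[name] = (start, counter[0])
    return labels

def is_ancestor(labels, a, d):
    (s1, e1), (s2, e2) = labels[a], labels[d]
    return s1 < s2 and e2 < e1

doc = ("book", [("title", []), ("authors", [("author", [])])])
L = assign_ranges(doc)
```

In a plain scheme like this, inserting a node forces relabelling of everything after it; leaving gaps between the assigned numbers (as extended range schemes do) is what avoids that.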
Formal Models and Algorithms for XML Data Interoperability (Thomas Lee)
In this paper, we study the data interoperability problem of web services in terms of XML schema compatibility. When Web Service A sends XML messages to Web Service B, A is interoperable with B if B can accept all messages from A. That is, the XML schema R for B to receive XML instances must be compatible with the XML schema S for A to send XML instances, i.e., A is a subschema of B. We propose a formal model called Schema Automaton (SA) to model W3C XML Schema (XSD) and develop several algorithms to perform different XML schema computations. The computations include schema minimization, schema equivalence testing, subschema testing, and subschema extraction. We have conducted experiments on an e-commerce standard XSD called xCBL to demonstrate the practicality of our algorithms. One experiment has refuted the claim that the xCBL 3.5 XSD is backward compatible with the xCBL 3.0 XSD. Another experiment has shown that the xCBL XSDs can be effectively trimmed into small subschemas for specific applications, which has significantly reduced the schema processing time.
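The paper's Schema Automaton covers full XSD; the sketch below shows only the underlying containment test on plain DFAs over element names: L(A) ⊆ L(B) iff no reachable state pair has A accepting while B rejects. The DFA encoding and the example content models are illustrative assumptions.

```python
DEAD = "__dead__"  # sink state for B once it has rejected

def is_subschema(A, B, alphabet):
    """A DFA is (transitions, start, accepting), with
    transitions[state][symbol] -> state; missing entries mean reject."""
    (ta, sa, fa), (tb, sb, fb) = A, B
    seen, stack = set(), [(sa, sb)]
    while stack:
        qa, qb = stack.pop()
        if (qa, qb) in seen:
            continue
        seen.add((qa, qb))
        if qa in fa and qb not in fb:
            return False  # some word is valid for A but not for B
        for sym in alphabet:
            na = ta.get(qa, {}).get(sym)
            if na is None:
                continue  # A rejects this extension: irrelevant for ⊆
            nb = tb.get(qb, {}).get(sym, DEAD)
            stack.append((na, nb))
    return True

# Content model (title, author) vs the more permissive (title, author?).
A = ({0: {"title": 1}, 1: {"author": 2}}, 0, {2})
B = ({0: {"title": 1}, 1: {"author": 2}}, 0, {1, 2})
```

The asymmetry of the result mirrors the paper's backward-compatibility experiment: every A-document is valid for B, but "title" alone is valid only for B.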
This paper investigates using Object Role Modeling (ORM), a graphical conceptual modeling technique, to design XML schemas. The authors describe an algorithm to automatically generate an XML schema file from an ORM conceptual data model. Their approach aims to reduce data redundancy and increase connectivity in the resulting XML instances. Major object types are identified in the ORM model and used as the starting point for hierarchies in the generated XML schema to achieve this goal.
Ontology languages are used in modelling the semantics of concepts within a particular domain and the relationships between those concepts. The Semantic Web standard provides a number of modelling languages that differ in their level of expressivity and are organized in a Semantic Web Stack in such a way that each language level builds on the expressivity of the other. There are several problems when one attempts to use independently developed ontologies. When existing ontologies are adapted for new purposes it requires that certain operations are performed on them. These operations are currently performed in a semi-automated manner. This paper seeks to model categorically the syntax and semantics of RDF ontology as a step towards the formalization of ontological operations using category theory.
This document proposes an enhanced method of XML validation using the SRML metalanguage. It extends SRML to allow for more complex validation rules and the ability to correct invalid data. The validation is done in two steps, first using XSD for structure and then SRML rules for content. It also explores applying the SRML validator to databases for record validation on database operations. This provides more granular validation of dataset records compared to validating the entire XML file.
This document discusses ontology mapping. It begins with an introduction to the semantic web and ontologies. Ontology mapping is important for allowing different ontologies to be aligned and related. There are different types of ontology mapping including alignment, merging, and mapping. The document then surveys some popular ontology mapping techniques including GLUE, PROMPT, and QOM. It evaluates these techniques and discusses their inputs, outputs, and approaches. The document concludes that semantic web research is important for advancing web technologies and realizing the goals of web 3.0. Future work could involve developing new ontology mapping techniques and publishing research on existing mapping methods.
A Survey on Heterogeneous Data Exchange using Xml (IRJET Journal)
This document summarizes a research paper on heterogeneous data exchange using XML. It discusses how XML has become a standard for data transmission due to its flexibility, extensibility and ability to represent heterogeneous data. The document then reviews related work on XML data exchange and mapping between relational and XML models. It also describes the process of exporting data from a source database to XML, importing XML data by validating, transforming and storing it in the target database, and transmitting data between different servers.
The document provides an overview of approaches for clustering XML data based on structure and content. It first outlines applications where XML clustering is useful, including XML query processing and data integration. It then presents a generic framework for XML clustering with three phases: data representation, similarity computation, and clustering/grouping. The document surveys current approaches and aims to classify them and identify common features. It also discusses challenges in XML clustering and future research directions.
The world is witnessing an unprecedented information revolution, with databases growing at great speed in every domain. Databases interconnect through their content and schemas but use different elements and structures to express the same concepts and relations, which can cause semantic and structural conflicts. This paper proposes a new technique, named XDEHD, for integrating heterogeneous eXtensible Markup Language (XML) schemas. The returned mediated schema contains all concepts and relations of the sources without duplication. The technique divides into three steps. First, it extracts all subschemas from the sources by decomposing the source schemas; each subschema contains three levels: ancestor, root, and leaf. Second, it matches and compares the subschemas and returns the related candidate subschemas; a semantic-closeness function measures how similarly the concepts of the subschemas are modelled in the sources. Finally, it creates the mediated schema by integrating the candidate subschemas to obtain a minimal and complete unified schema; an association-strength function computes how closely each pair in a candidate subschema is related across all data sources, and an element-repetition function counts how many times each element is repeated between the candidate subschemas.
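The paper defines its own semantic-closeness measure; the stand-in below merely shows the kind of signal such a function combines when matching candidate subschemas: a synonym table for known equivalences plus string similarity for spelling variants. The synonym pairs, threshold, and element names are all illustrative.

```python
from difflib import SequenceMatcher

SYNONYMS = {frozenset({"employee", "staff"}), frozenset({"salary", "wage"})}

def closeness(a: str, b: str) -> float:
    """Crude stand-in for a semantic-closeness function, in [0, 1]."""
    a, b = a.lower(), b.lower()
    if a == b or frozenset({a, b}) in SYNONYMS:
        return 1.0
    return SequenceMatcher(None, a, b).ratio()

# Candidate elements from two sources are matched when closeness
# exceeds a threshold (0.7 here is an arbitrary choice).
pairs = [("Employee", "Staff"), ("salary", "salaries"), ("salary", "city")]
matched = [(x, y) for x, y in pairs if closeness(x, y) > 0.7]
```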
A Semi-Automatic Ontology Extension Method for Semantic Web Services (IDES Editor)
This paper provides a novel semi-automatic ontology extension method for Semantic Web Services (SWS). This is significant because the ontology extension methods in the literature mostly deal with the semantic description of static Web resources such as text documents; methods are therefore needed that can serve dynamic Web resources such as SWS. The method developed in this paper avoids redundancy and preserves consistency so as to assure the high quality of the resulting shared ontologies.
CONTEXT-AWARE CLUSTERING USING GLOVE AND K-MEANS (ijseajournal)
In this paper we propose a novel method to cluster categorical data while retaining its context. Clustering is typically performed on numerical data, yet it is often useful to cluster categorical data as well, especially in real-world settings. Several methods exist for clustering categorical data, but our approach is unique in that we use recent text-processing and machine-learning advances such as GloVe and t-SNE to develop a context-aware clustering approach using pre-trained word embeddings. We encode words and categorical values as numerical, context-aware vectors, then cluster the data points using common clustering algorithms such as K-means.
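A compact sketch of that pipeline: categorical values are replaced by word vectors and then clustered with K-means. The 2-d vectors below are toy stand-ins for real GloVe embeddings (which are 50- to 300-dimensional), and the tiny K-means is just enough for the illustration.

```python
# Toy "embeddings": animals cluster near (1, 0), vehicles near (0, 1).
EMB = {"cat": (0.9, 0.1), "dog": (1.0, 0.2), "wolf": (0.95, 0.3),
       "car": (0.1, 0.9), "bus": (0.2, 1.0), "train": (0.15, 0.95)}

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, centers, iters=10):
    """Plain Lloyd's algorithm on tuples of floats."""
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            groups[min(range(len(centers)),
                       key=lambda i: dist2(p, centers[i]))].append(p)
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else ctr
                   for g, ctr in zip(groups, centers)]
    return centers

words = list(EMB)
vecs = [EMB[w] for w in words]
centers = kmeans(vecs, [vecs[0], vecs[3]])
# Assign each categorical value to its nearest cluster centre.
assign = {w: min(range(2), key=lambda i: dist2(EMB[w], centers[i]))
          for w in words}
```

Because the embeddings carry context, semantically related categories ("cat", "dog", "wolf") land in the same cluster even though the raw strings share nothing.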
K-centroid convergence clustering identification in one-label per type for di... (IAESIJAI)
Disease prediction is a high-demand field that requires significant support from machine learning (ML) to improve result quality. This research applies K-means-style clustering to supervised disease prediction where each class has only one labelled data point. The K-centroid convergence clustering identification (KC3I) system is based on semi-supervised K-means clustering but requires only a single labelled point per class; the training dataset is then used to update the centroids. The KC3I model also includes a dictionary box that indexes all centroids before and after the updating process, with each centroid matched to a corresponding label. After training, each incoming feature vector is assigned to a cluster by the trained centroids based on Euclidean distance and is then converted into the class name coherent with that centroid's index. Two validation stages were carried out and met expectations in terms of precision, recall, F1-score, and absolute accuracy. The final part demonstrates the possibility of feature reduction by selecting the most crucial features with the extra-trees classifier method: feeding the KC3I system only the most important features preserves the same accuracy.
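The one-label-per-class idea can be sketched as follows (a simplification of KC3I, with invented toy data): each class contributes a single labelled point as its initial centroid, unlabelled training data refine the centroids K-means style, and a dictionary keeps the centroid-index-to-class-name mapping so nearest-centroid lookups return a class name.

```python
def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def train(labelled, unlabelled, iters=5):
    names = [name for name, _ in labelled]   # the "dictionary box"
    centers = [vec for _, vec in labelled]   # one labelled point per class
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in unlabelled:
            groups[nearest(p, centers)].append(p)
        # Move each centroid to the mean of its assigned points.
        centers = [tuple(sum(c) / len(g) for c in zip(*g)) if g else ctr
                   for g, ctr in zip(groups, centers)]
    return names, centers

def classify(p, names, centers):
    return names[nearest(p, centers)]

labelled = [("healthy", (0.0, 0.0)), ("sick", (1.0, 1.0))]
unlabelled = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 1.1)]
names, centers = train(labelled, unlabelled)
```

Euclidean distance (squared, which preserves the ordering) drives both the centroid updates and the final class-name lookup, as in the paper's description.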
Plant leaf detection through machine learning based image classification appr... (IAESIJAI)
Since maize is a staple food, especially for vegetarians and vegans, maize leaf disease has a significant influence on the food industry and on maize crop productivity. Maize quality must therefore be kept optimal, which requires safeguarding the crop from several illnesses. As a result, there is great demand for an automated system that can identify disease early and trigger the appropriate action. Early disease identification is crucial but also poses a major challenge. In this research project we adopt the basic k-nearest neighbor (KNN) model and concentrate on building an enhanced k-nearest neighbor (EKNN) model that helps identify several classes of disease. Additional high-quality fine and coarse features are generated to gather discriminative, boundary, pattern, and structural information, which is then used, together with gradient-based features, in the classification process. The proposed model is assessed on the Plant-Village dataset and compared with many standard classification models using various metrics.
Backbone search for object detection for applications in intrusion warning sy... (IAESIJAI)
In this work, we propose a novel backbone search method for object detection for applications in intrusion warning systems. The goal is to find a compact model for use in embedded thermal imaging cameras widely used in intrusion warning systems. The proposed method is based on faster region-based convolutional neural network (Faster R-CNN) because it can detect small objects. Inspired by EfficientNet, the sought-after backbone architecture is obtained by finding the most suitable width scale for the base backbone (ResNet50). The evaluation metrics are mean average precision (mAP), number of parameters, and number of multiply–accumulate operations (MACs). The experimental results showed that the proposed method is effective in building a lightweight neural network for the task of object detection. The obtained model can keep the predefined mAP while minimizing the number of parameters and computational resources. All experiments are executed elaborately on the person detection in intrusion warning systems (PDIWS) dataset.
Deep learning method for lung cancer identification and classification (IAESIJAI)
Lung cancer (LC) is claiming many lives and is becoming a serious cause for concern. Detecting LC at an early stage improves the chances of recovery, and the accuracy of early detection can be improved with a convolutional neural network (CNN) based deep learning approach. In this paper we present two methodologies for lung cancer detection (LCD) applied to the Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI) datasets. The LC images are classified using a support vector machine (SVM) and a deep CNN; the CNN is trained with (i) multiple batches and (ii) a single batch to classify images as cancerous or non-cancerous. All methods are implemented in MATLAB. The classification accuracy obtained by the SVM is 65%, whereas the deep CNN produced detection accuracies of 80% and 100% for multiple-batch and single-batch training, respectively. The novelty of our experimentation is the near-100% classification accuracy obtained by our deep CNN model when tested on 25 lung computed tomography (CT) test images, each 512×512 pixels, in fewer than 20 iterations, compared with work by other researchers using cropped LC nodule images.
Optically processed Kannada script realization with Siamese neural network model (IAESIJAI)
Optical character recognition (OCR) is a technology that allows computers to recognize and extract text from images or scanned documents; it is commonly used to convert printed or handwritten text into machine-readable form. This study presents an OCR system for Kannada characters based on a Siamese neural network (SNN). The SNN, a deep neural network comprising two identical convolutional neural networks (CNNs), compares scripts and ranks them by dissimilarity; when a low dissimilarity score is found, a character match is predicted. The authors use five classes of Kannada characters, initially preprocessed by grey-scaling and conversion to PGM format, which are input directly into the deep convolutional network trained on matching and non-matching image pairs with a contrastive loss function in the Siamese architecture. The proposed OCR system takes very little time and gives more accurate results than a regular CNN. The model can become a powerful identification tool, particularly where writing styles vary widely or training data is limited.
Embedded artificial intelligence system using deep learning and raspberrypi f... (IAESIJAI)
Melanoma is a kind of skin cancer that originates in the melanocytes responsible for producing melanin; it can be severe and potentially deadly because it can metastasize to other regions of the body if not detected and treated early. To facilitate early diagnosis, various low-cost, reliable, and accurate computer-assisted diagnostic systems have recently been proposed based on artificial intelligence (AI), particularly deep learning techniques. This work proposes an intelligent system that combines the internet of things (IoT), a Raspberry Pi connected to a camera, and a deep convolutional neural network (CNN) model for real-time detection and classification of melanoma lesions. The key stages of the model before serializing it to the Raspberry Pi are: first, preprocessing, covering data cleaning, data transformation (normalization), and data augmentation to reduce overfitting during training; then feature extraction with the deep CNN; and finally classification with a sigmoid activation function. The experimental results indicate the efficiency of the proposed system: an accuracy of 92%, a precision of 91%, a sensitivity of 91%, and an area under the receiver operating characteristic curve (AUC-ROC) of 0.9133.
Deep learning based biometric authentication using electrocardiogram and iris (IAESIJAI)
Authentication systems play an important role in a wide range of applications. Traditional token- and password-based authentication systems are now being replaced by biometric ones, generally based on data obtained from the face, iris, electrocardiogram (ECG), fingerprint, or palm print. Such unimodal authentication models, however, suffer from accuracy and reliability issues, so multimodal biometric authentication has gained attention as a way to build robust systems; recent developments in deep learning likewise offer more robust architectures than traditional machine-learning-based authentication. In this work we adopt ECG and iris data and train the extracted features with a hybrid convolutional neural network/long short-term memory (CNN-LSTM) model. For ECG, R-peak detection is an important step for feature extraction, and morphological features are extracted; for iris data, Gabor-wavelet, gray level co-occurrence matrix (GLCM), gray level difference matrix (GLDM), and principal component analysis (PCA) based feature extraction methods are applied. The final feature vector, obtained from the MIT-BIH and IIT Delhi Iris datasets, is trained and tested with the CNN-LSTM. The experimental analysis shows that the proposed approach achieves an average accuracy, precision, and F1-score of 0.985, 0.962, and 0.975, respectively.
Hybrid channel and spatial attention-UNet for skin lesion segmentation | IAESIJAI
Melanoma is a type of skin cancer which has affected many lives globally. American Cancer Society research suggests that it is a serious type of skin cancer that can lead to mortality, but it is almost 100% curable if detected and treated in its early stages. Currently, automated computer vision-based schemes are widely adopted, but these systems suffer from poor segmentation accuracy. To overcome this issue, deep learning (DL) has become a promising solution, performing extensive training for pattern learning and providing better classification accuracy. However, skin lesion segmentation is affected by skin hair, unclear boundaries, pigmentation, and moles. To address this, we adopt a UNet-based deep learning scheme and incorporate an attention mechanism that considers low-level and high-level statistics combined with a feedback and skip connection module. This helps to obtain robust features without neglecting the channel information. Further, we use channel attention and spatial attention modulation to achieve the final segmentation. The proposed DL-based scheme is evaluated on a publicly available dataset, and experimental investigation shows that the proposed Hybrid Attention UNet approach achieves average performance of 0.9715, 0.9962, and 0.9710.
Photoplethysmogram signal reconstruction through integrated compression sensi... | IAESIJAI
The transmission of photoplethysmogram (PPG) signals in real time is extremely challenging and motivates the use of an internet of things (IoT) environment for healthcare monitoring. This paper proposes an approach for PPG signal reconstruction through integrated compression sensing and basis function aware shallow learning (CSBSL). The Integrated-CSBSL approach performs combined compression of PPG signals across multiple channels, thereby improving the reconstruction accuracy essential in healthcare monitoring. An optimal basis function aware shallow learning procedure is employed on PPG signals with prior initialization; this is further fine-tuned by utilizing knowledge of the other channels, which exploits the additional sparsity of the PPG signals. The proposed learning method, combined with the PPG signals, retains knowledge of spatial and temporal correlation. The proposed Integrated-CSBSL approach consists of two steps; in the first step, the basis function based shallow learning is carried out by training on the PPG signals. The proposed method is evaluated using multichannel PPG signal reconstruction, which potentially benefits clinical applications through PPG monitoring and diagnosis.
Speaker identification under noisy conditions using hybrid convolutional neur... | IAESIJAI
Speaker identification is a biometric task that classifies or identifies a person among other speakers based on speech characteristics. Recently, deep learning models have outperformed conventional machine learning models in speaker identification. Spectrograms of the speech have been used as input in deep learning-based speaker identification on clean speech. However, the performance of speaker identification systems degrades under noisy conditions. Cochleograms have shown better results than spectrograms in deep learning-based speaker recognition under noisy and mismatched conditions. Moreover, hybrid convolutional neural network (CNN) and recurrent neural network (RNN) variants have shown better performance than CNN or RNN variants alone in recent studies. However, no attempt has been made to use a hybrid CNN and enhanced RNN variant for speaker identification with cochleogram input to improve performance under noisy and mismatched conditions. In this study, speaker identification using a hybrid CNN and gated recurrent unit (GRU) is proposed for noisy conditions using cochleogram input. The VoxCeleb1 audio dataset with real-world noises, white Gaussian noise (WGN), and without additive noise was employed for the experiments. The experimental results and the comparison with existing works show that the proposed model performs better than the other models in this study and existing works.
Multi-channel microseismic signals classification with convolutional neural n... | IAESIJAI
Identifying and classifying microseismic signals is essential for warning of dangers in mines. Deep learning has replaced traditional methods, but labor-intensive manual identification and varying deep learning outcomes pose challenges. This paper proposes a transfer learning-based convolutional neural network (CNN) method called microseismic signals-convolutional neural network (MS-CNN) to automatically recognize and classify microseismic events and blasts. The model was trained on a limited data sample to obtain an optimal weight model for microseismic waveform recognition and classification. A comparative analysis was performed against an existing CNN model and classical image classification models such as AlexNet, GoogLeNet, and ResNet50. The outcomes demonstrate that the MS-CNN model achieved the best recognition and classification performance (99.6% accuracy) in the shortest time (0.31 s to identify the 277 images in the test set). Thus, the MS-CNN model can efficiently recognize and classify microseismic events and blasts in practical engineering applications, improving the timeliness of microseismic signal recognition and further enhancing the accuracy of event classification.
Sophisticated face mask dataset: a novel dataset for effective coronavirus di... | IAESIJAI
Efficient and accurate coronavirus disease (COVID-19) surveillance necessitates robust identification of individuals wearing face masks. This research introduces the sophisticated face mask dataset (SFMD), a comprehensive compilation of high-quality face mask images enriched with detailed annotations on mask types, fits, and usage patterns. Leveraging cutting-edge deep learning models—EfficientNet-B2, ResNet50, and MobileNet-V2—, we compare SFMD against two established benchmarks: the real-world masked face dataset (RMFD) and the masked face recognition dataset (MFRD). Across all models, SFMD consistently outperforms RMFD and MFRD in key metrics, including accuracy, precision, recall, and F1 score. Additionally, our study demonstrates the dataset's capability to cultivate robust models resilient to intricate scenarios like low-light conditions and facial occlusions due to accessories or facial hair.
Transfer learning for epilepsy detection using spectrogram images | IAESIJAI
Epilepsy stands out as one of the most common neurological diseases. The neural activity of the brain is observed using electroencephalography (EEG). Manual inspection of EEG brain signals is a slow and arduous process, which puts a heavy load on neurologists and affects their performance. The aim of this study is to find the best-performing transfer learning model for automatically distinguishing epileptic from normal activity, classifying EEG signals using spectrogram images that represent the percentage of energy for each coefficient of the continuous wavelet transform. The dataset comprises EEG signals recorded at an epilepsy monitoring unit. The study presents an application of transfer learning by comparing three models, AlexNet, visual geometry group (VGG19), and residual neural network (ResNet), in different combinations with seven different classifiers. Testing these models yielded different values of accuracy and the other metrics used to judge their performance; the best combination was ResNet with a support vector machine (SVM) classifier, which classified EEG signals with a high success rate of 97.22% accuracy and an error rate of 2.78%.
Deep neural network for lateral control of self-driving cars in urban environ... | IAESIJAI
The exponential growth of the automotive industry clearly indicates that self-driving cars are the future of transportation. However, their biggest challenge lies in lateral control, particularly in urban bottlenecking environments, where disturbances and obstacles are abundant. In these situations, the ego vehicle has to follow its own trajectory while rapidly correcting deviation errors without colliding with other nearby vehicles. Various research efforts have focused on developing lateral control approaches, but these methods remain limited in terms of response speed and control accuracy. This paper presents a control strategy using a deep neural network (DNN) controller to effectively keep the car on the centerline of its trajectory and adapt to disturbances arising from deviations or trajectory curvature. The controller focuses on minimizing deviation errors. The Matlab/Simulink software is used for designing and training the DNN. Finally, simulation results confirm that the suggested controller has several advantages in terms of precision, with lateral deviation remaining below 0.65 meters, and rapidity, with a response time of 0.7 seconds, compared to traditional controllers in solving lateral control.
Attention mechanism-based model for cardiomegaly recognition in chest X-Ray i... | IAESIJAI
Recently, cardiovascular diseases (CVDs) have become a rapidly growing problem in the world, especially in developing countries. The latter are facing a lifestyle change that introduces new risk factors for heart disease, which requires particular and urgent attention. Cardiomegaly is a sign of cardiovascular disease that refers to various conditions; it is associated with enlargement of the heart, which can be either transient or permanent depending on the underlying condition. Furthermore, cardiomegaly is visible on any imaging test, including chest X-ray images, which are one of the most common tools used by cardiologists to detect and diagnose many diseases. In this paper, we propose an innovative deep learning (DL) model based on an attention module and the MobileNet architecture to recognize cardiomegaly patients using the popular Chest X-Ray8 dataset. The attention module captures the spatial relationship between the relevant regions in chest X-ray images. The experimental results show that the proposed model achieved interesting results with an accuracy rate of 81%, which makes it suitable for detecting cardiomegaly disease.
Efficient commodity price forecasting using long short-term memory model | IAESIJAI
Predicting commodity prices, particularly food prices, is a significant concern for various stakeholders, especially in regions that are highly sensitive to commodity price volatility. Historically, many machine learning models such as autoregressive integrated moving average (ARIMA) and support vector machine (SVM) have been suggested for the forecasting task, but these models struggle to capture the multifaceted and dynamic factors influencing these prices. Recently, deep learning approaches have demonstrated considerable promise in handling complex forecasting tasks. This paper presents a novel long short-term memory (LSTM) network-based model for commodity price forecasting. The model uses five essential commodities, namely bread, meat, milk, oil, and petrol. The proposed model focuses on advanced feature engineering involving moving averages, price volatility, and past prices. The results reveal that our model outperforms traditional methods, achieving 0.14, 3.04%, and 98.2% for root mean square error (RMSE), mean absolute percentage error (MAPE), and R-squared (R2), respectively. The model is also simple: it consists of a single-cell LSTM architecture, which reduced the training time to a few minutes instead of hours. This paper contributes to the economic literature on price prediction using advanced deep learning techniques and provides practical implications for managing commodity price instability globally.
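The feature engineering the abstract mentions (moving averages and price volatility over past prices) can be sketched in plain Python. The window sizes and function names here are illustrative assumptions, not the paper's actual pipeline.

```python
import statistics

def moving_average(prices, window):
    """Trailing moving average feature; None until a full window is available."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(prices[i + 1 - window:i + 1]) / window)
    return out

def volatility(prices, window):
    """Rolling (population) standard deviation as a simple volatility proxy."""
    out = []
    for i in range(len(prices)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(statistics.pstdev(prices[i + 1 - window:i + 1]))
    return out
```

Each series would then be stacked with the raw past prices to form the LSTM input features.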
1-dimensional convolutional neural networks for predicting sudden cardiac | IAESIJAI
Sudden cardiac arrest (SCA) is a serious heart problem that occurs without symptoms or warning and causes high mortality, so it is important to predict the incidence of SCA. Current methods for predicting ventricular fibrillation (VF) episodes require monitoring patients over extended periods. New technologies, especially machine learning, are gaining popularity due to the benefits they provide. However, most existing systems rely on manual processes, which can lead to inefficiencies in disseminating patient information, while existing deep learning methods rely on large datasets that are not publicly available. In this study, we propose a deep learning method based on one-dimensional convolutional neural networks that learns from discrete Fourier transform (DFT) features of raw electrocardiogram (ECG) signals. The results show that our method was able to predict the onset of SCA with an accuracy of 96% approximately 90 minutes before it occurred. Such predictions could save many lives; that is, optimized deep learning models can outperform manual models in analyzing long-term signals.
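The DFT feature extraction step can be illustrated directly with Python's standard library. This is a naive O(n²) DFT written for clarity, not the optimized FFT a real pipeline would use, and the function name is hypothetical.

```python
import cmath

def dft_magnitudes(samples):
    """Magnitude spectrum of a real-valued signal via the discrete Fourier
    transform; these magnitudes serve as input features for a classifier."""
    n = len(samples)
    mags = []
    for k in range(n):
        acc = sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t in range(n))
        mags.append(abs(acc))
    return mags
```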
A deep learning-based approach for early detection of disease in sugarcane pl... | IAESIJAI
In many regions of the nation, agriculture serves as the primary industry, and the farming environment now poses a number of challenges to farmers. One of the major concerns, and the focus of this research, is disease prediction. A methodology is suggested to automate the process of identifying disease during plant growth and warning farmers in advance so they can take appropriate action, since disease in crop plants has an impact on agricultural production. In this work, a novel DenseNet-support vector machine: explainable artificial intelligence (DNet-SVM: XAI) model that combines a DenseNet with a support vector machine (SVM) and local interpretable model-agnostic explanation (LIME) interpretation has been proposed. DNet-SVM: XAI was created through a series of modifications to DenseNet201, including the addition of an SVM classifier. Before using the SVM to identify whether an image is healthy or unhealthy, features are first extracted from the images using the DenseNet convolution network. In addition to offering a likely explanation for the prediction, the reasoning is carried out using the visual cue produced by LIME. In light of this, the proposed approach, with its demonstrated interpretability and precision, may successfully assist farmers in detecting infected plants and recommending pesticides for the identified disease.
[OReilly Superstream] Occupy the Space: A grassroots guide to engineering (an... | Jason Yip
The typical problem in product engineering is not bad strategy, so much as “no strategy”. This leads to confusion, lack of motivation, and incoherent action. The next time you look for a strategy and find an empty space, instead of waiting for it to be filled, I will show you how to fill it in yourself. If you’re wrong, it forces a correction. If you’re right, it helps create focus. I’ll share how I’ve approached this in the past, both what works and lessons for what didn’t work so well.
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation F... | AlexanderRichford
QR Secure: A Hybrid Approach Using Machine Learning and Security Validation Functions to Prevent Interaction with Malicious QR Codes.
Aim of the Study: The goal of this research was to develop a robust hybrid approach for identifying malicious and insecure URLs derived from QR codes, ensuring safe interactions.
This is achieved through:
Machine Learning Model: Predicts the likelihood of a URL being malicious.
Security Validation Functions: Ensures the derived URL has a valid certificate and proper URL format.
This innovative blend of technology aims to enhance cybersecurity measures and protect users from potential threats hidden within QR codes 🖥 🔒
This study was my first introduction to using ML which has shown me the immense potential of ML in creating more secure digital environments!
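The security validation functions described above can be sketched with Python's standard library. This is an illustrative approximation of the idea, not the study's actual code; the function names are hypothetical. The certificate check requires network access, so typical offline tests would exercise only the structural URL check.

```python
import socket
import ssl
from urllib.parse import urlparse

def is_well_formed(url):
    """Structural check: require an https scheme and a hostname."""
    parts = urlparse(url)
    return parts.scheme == "https" and bool(parts.hostname)

def has_valid_certificate(url, timeout=5):
    """Attempt a TLS handshake with default verification; True only if the
    server's certificate chain validates. (Requires network access.)"""
    host = urlparse(url).hostname
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, 443), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return True
    except (ssl.SSLError, OSError):
        return False
```

A decoded QR URL would pass through both checks before the ML model's maliciousness score is consulted.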
How information systems are built or acquired puts information, which is what they should be about, in a secondary place. Our language adapted accordingly, and we no longer talk about information systems but applications. Applications evolved in a way to break data into diverse fragments, tightly coupled with applications and expensive to integrate. The result is technical debt, which is re-paid by taking even bigger "loans", resulting in an ever-increasing technical debt. Software engineering and procurement practices work in sync with market forces to maintain this trend. This talk demonstrates how natural this situation is. The question is: can something be done to reverse the trend?
GlobalLogic Java Community Webinar #18 “How to Improve Web Application Perfor... | GlobalLogic Ukraine
In this talk we will answer why application performance needs to be improved and which methods are most effective for doing so. We will also discuss what a cache is, what kinds of caches exist, and, most importantly, how to find a performance bottleneck.
Video and event details: https://bit.ly/45tILxj
Session 1 - Intro to Robotic Process Automation.pdf | UiPathCommunity
👉 Check out our full 'Africa Series - Automation Student Developers (EN)' page to register for the full program:
https://bit.ly/Automation_Student_Kickstart
In this session, we shall introduce you to the world of automation, the UiPath Platform, and guide you on how to install and setup UiPath Studio on your Windows PC.
📕 Detailed agenda:
What is RPA? Benefits of RPA?
RPA Applications
The UiPath End-to-End Automation Platform
UiPath Studio CE Installation and Setup
💻 Extra training through UiPath Academy:
Introduction to Automation
UiPath Business Automation Platform
Explore automation development with UiPath Studio
👉 Register here for our upcoming Session 2 on June 20: Introduction to UiPath Studio Fundamentals: https://community.uipath.com/events/details/uipath-lagos-presents-session-2-introduction-to-uipath-studio-fundamentals/
Introducing BoxLang : A new JVM language for productivity and modularity! | Ortus Solutions, Corp
Just like life, our code must adapt to the ever changing world we live in. From one day coding for the web, to the next for our tablets or APIs or for running serverless applications. Multi-runtime development is the future of coding, the future is to be dynamic. Let us introduce you to BoxLang.
Dynamic. Modular. Productive.
BoxLang redefines development with its dynamic nature, empowering developers to craft expressive and functional code effortlessly. Its modular architecture prioritizes flexibility, allowing for seamless integration into existing ecosystems.
Interoperability at its Core
With 100% interoperability with Java, BoxLang seamlessly bridges the gap between traditional and modern development paradigms, unlocking new possibilities for innovation and collaboration.
Multi-Runtime
From the tiny 2m operating system binary to running on our pure Java web server, CommandBox, Jakarta EE, AWS Lambda, Microsoft Functions, WebAssembly, Android and more. BoxLang has been designed to enhance and adapt according to its runtime.
The Fusion of Modernity and Tradition
Experience the fusion of modern features inspired by CFML, Node, Ruby, Kotlin, Java, and Clojure, combined with the familiarity of Java bytecode compilation, making BoxLang a language of choice for forward-thinking developers.
Empowering Transition with Transpiler Support
Transitioning from CFML to BoxLang is seamless with our JIT transpiler, facilitating smooth migration and preserving existing code investments.
Unlocking Creativity with IDE Tools
Unleash your creativity with powerful IDE tools tailored for BoxLang, providing an intuitive development experience and streamlining your workflow. Join us as we embark on a journey to redefine JVM development. Welcome to the era of BoxLang.
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
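One mutation operator of the kind described (emulating a fault in a chatbot design) might look like the following sketch, which deletes a training phrase from an intent definition. The intent dictionary shape and function names are illustrative assumptions, not the paper's actual operator set.

```python
import copy

def delete_training_phrase(intent, index):
    """Mutation operator: drop one training phrase from an intent definition,
    emulating an under-specified intent in the chatbot design."""
    mutant = copy.deepcopy(intent)  # never modify the original design
    del mutant["phrases"][index]
    return mutant

def mutants_of(intent):
    """Generate one mutant per training phrase of the intent."""
    return [delete_training_phrase(intent, i)
            for i in range(len(intent["phrases"]))]
```

A test suite then "kills" a mutant if some scenario's outcome differs between the original chatbot and the mutated one.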
LF Energy Webinar: Carbon Data Specifications: Mechanisms to Improve Data Acc... | DanBrown980551
This LF Energy webinar took place June 20, 2024. It featured:
-Alex Thornton, LF Energy
-Hallie Cramer, Google
-Daniel Roesler, UtilityAPI
-Henry Richardson, WattTime
In response to the urgency and scale required to effectively address climate change, open source solutions offer significant potential for driving innovation and progress. Currently, there is a growing demand for standardization and interoperability in energy data and modeling. Open source standards and specifications within the energy sector can also alleviate challenges associated with data fragmentation, transparency, and accessibility. At the same time, it is crucial to consider privacy and security concerns throughout the development of open source platforms.
This webinar will delve into the motivations behind establishing LF Energy’s Carbon Data Specification Consortium. It will provide an overview of the draft specifications and the ongoing progress made by the respective working groups.
Three primary specifications will be discussed:
-Discovery and client registration, emphasizing transparent processes and secure and private access
-Customer data, centering around customer tariffs, bills, energy usage, and full consumption disclosure
-Power systems data, focusing on grid data, inclusive of transmission and distribution networks, generation, intergrid power flows, and market settlement data
What is an RPA CoE? Session 2 – CoE Roles | DianaGray10
In this session, we will review the players involved in the CoE and how each role impacts opportunities.
Topics covered:
• What roles are essential?
• What place in the automation journey does each role play?
Speaker:
Chris Bolin, Senior Intelligent Automation Architect Anika Systems
Northern Engraving | Nameplate Manufacturing Process - 2024 | Northern Engraving
Manufacturing custom quality metal nameplates and badges involves several standard operations. Processes include sheet prep, lithography, screening, coating, punch press and inspection. All decoration is completed in the flat sheet with adhesive and tooling operations following. The possibilities for creating unique durable nameplates are endless. How will you create your brand identity? We can help!
"Frontline Battles with DDoS: Best practices and Lessons Learned", Igor Ivaniuk | Fwdays
In this talk we will discuss DDoS protection tools and best practices, network architectures, and what AWS has to offer. We will also look into one of the largest DDoS attacks on Ukrainian infrastructure, which happened in February 2022. We'll see what techniques helped to keep web resources available for Ukrainians and how AWS improved DDoS protection for all customers based on the Ukraine experience.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors | DianaGray10
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
Creating a compelling user experience for any software, without the limitations of APIs.
Accelerating the app creation process, saving time and effort
Enjoying high-performance CRUD (create, read, update, delete) operations, for seamless data management.
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Essentials of Automations: Exploring Attributes & Automation Parameters | Safe Software
Building automations in FME Flow can save time, money, and help businesses scale by eliminating data silos and providing data to stakeholders in real-time. One essential component to orchestrating complex automations is the use of attributes & automation parameters (both formerly known as “keys”). In fact, it’s unlikely you’ll ever build an Automation without using these components, but what exactly are they?
Attributes & automation parameters enable the automation author to pass data values from one automation component to the next. During this webinar, our FME Flow Specialists will cover leveraging the three types of these output attributes & parameters in FME Flow: Event, Custom, and Automation. As a bonus, they’ll also be making use of the Split-Merge Block functionality.
You’ll leave this webinar with a better understanding of how to maximize the potential of automations by making use of attributes & automation parameters, with the ultimate goal of setting your enterprise integration workflows up on autopilot.
Discover the Unseen: Tailored Recommendation of Unwatched Content | ScyllaDB
The session shares how JioCinema approaches "watch discounting." This capability ensures that if a user has watched a certain amount of a show or movie, the platform no longer recommends that particular content to the user. Flawless operation of this feature promotes the discovery of new content, improving the overall user experience.
JioCinema is an Indian over-the-top media streaming service owned by Viacom18.
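The watch-discounting idea can be sketched as a simple filter over candidate recommendations. The cutoff value and data shapes are illustrative assumptions, not JioCinema's implementation.

```python
def recommend(candidates, watch_fraction, cutoff=0.9):
    """Watch discounting: drop content the user has already watched past
    `cutoff`, so recommendations favor undiscovered content."""
    return [c for c in candidates if watch_fraction.get(c, 0.0) < cutoff]
```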
In our second session, we shall learn all about the main features and fundamentals of UiPath Studio that enable us to use the building blocks for any automation project.
📕 Detailed agenda:
Variables and Datatypes
Workflow Layouts
Arguments
Control Flows and Loops
Conditional Statements
💻 Extra training through UiPath Academy:
Variables, Constants, and Arguments in Studio
Control Flow in Studio
Mapping of extensible markup language-to-ontology representation for effective data integration
IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 12, No. 1, March 2023, pp. 432~442
ISSN: 2252-8938, DOI: 10.11591/ijai.v12.i1.pp432-442
Journal homepage: http://ijai.iaescore.com
Mapping of extensible markup language-to-ontology representation for effective data integration
Su-Cheng Haw¹, Lit-Jie Chew¹, Dana Sulistyo Kusumo², Palanichamy Naveen¹, Kok-Why Ng¹
¹ Faculty of Computing and Informatics, Multimedia University, Cyberjaya, Malaysia
² School of Computing, Telkom University, Bandung, Indonesia
Article history: Received Mar 3, 2022; Revised Aug 17, 2022; Accepted Sep 15, 2022

ABSTRACT
Extensible markup language (XML) is well-known as the standard for data exchange over the internet. It is flexible and highly expressive in representing the relationships between the stored data. Yet, the structural complexity and the semantic relationships are not well expressed. On the other hand, ontology models the structural, semantic and domain knowledge effectively. By combining ontology with visualization, one will be able to have a closer view based on the respective user requirements. In this paper, we propose several mapping rules for the transformation of XML into an ontology representation. Subsequently, we show how the ontology is constructed based on the proposed rules using the sample domain ontology from the University of Wisconsin-Milwaukee (UWM) and Mondial datasets. We also look at the schemas, query workload, and evaluation, to derive extended knowledge from the existing ontology. The correctness of the ontology representation has been proven effective through supporting various types of complex queries in the simple protocol and resource description framework query language (SPARQL).

Keywords: Mapping rules; Mapping scheme; Ontology representation; Semantic relationship; Extensible markup language to ontology
This is an open access article under the CC BY-SA license.
Corresponding Author:
Su-Cheng Haw
Faculty of Computing and Informatics, Multimedia University
Persiaran Multimedia, 63100 Cyberjaya, Selangor, Malaysia
Email: sucheng@mmu.edu.my
1. INTRODUCTION
Extensible markup language (XML) has been widely used as the data exchange format over the
internet [1], [2]. Big data analytics is the trend in various industries to boost their industrial performance, and in
fact, XML data format usually forms the basis of data streaming used in the analytical process [3], [4].
However, XML represents data only at the syntactic level. On the other hand, ontology is a knowledge representation that establishes a shared vocabulary and conceptualizations and models domain knowledge for various applications [5]–[8]. Ontology is often expressed in the web ontology language (OWL) format.
Several ontology generation (also known as ontology mapping) techniques exist to bridge the gap between the syntactic XML and semantic OWL representations [9]–[11]. Ontology enrichment is another objective of the transformation [12]: it extends the ontology by adding elements and constructors (classes, object properties, data types, concept relations, axioms, properties). In addition, the ontology population process adds individuals, or attributes of existing individuals, from the XML data to the ontology representation.
In general, the mapping approaches from XML to ontology representation can be grouped into two
main categories: the instance approach and the validation approach [13]. The instance approach intends to
convert XML documents directly to ontology representation without using schema knowledge. Most of these
approaches generate new ontologies from the XML content alone by using the XML path language (XPath) query
language, based on path expressions, to navigate to each respective node in the XML. Klein presented
the first mapping tool to translate XML documents directly to an ontology language such as resource
description framework (RDF) or OWL [14]. He proposed a method to transform ambiguous XML data into RDF statements on a one-way mapping basis. Bohring and Auer [15] proposed a framework to map XML into OWL, which builds on the XML instance document to derive an XML schema definition (XSD) where possible, and finally transforms it into OWL. O’Connor and Das [16] proposed a domain-specific language called XML Master, developed using OWL syntax and the XPath query language.
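The instance approach described above (navigating an XML document by path and emitting ontology statements directly from the content, without a schema) can be sketched with Python's standard library. The mapping rules below are illustrative assumptions, not those of any cited tool: each element becomes an individual typed by its tag, and each attribute or text value becomes a data property triple.

```python
import xml.etree.ElementTree as ET

def xml_to_triples(xml_text):
    """Instance-approach sketch: walk the XML tree and emit RDF-style
    (subject, predicate, object) triples for types, attributes, text,
    and parent-child relations."""
    root = ET.fromstring(xml_text)
    triples, counter = [], {}

    def walk(elem, subject):
        triples.append((subject, "rdf:type", elem.tag.capitalize()))
        for name, value in elem.attrib.items():
            triples.append((subject, name, value))
        if elem.text and elem.text.strip():
            triples.append((subject, "value", elem.text.strip()))
        for child in elem:
            counter[child.tag] = counter.get(child.tag, 0) + 1
            child_id = f"{child.tag}{counter[child.tag]}"
            triples.append((subject, f"has{child.tag.capitalize()}", child_id))
            walk(child, child_id)

    walk(root, root.tag + "1")
    return triples

# Tiny example in the spirit of the UWM university domain used in the paper.
doc = "<university><course code='CS101'>Databases</course></university>"
triples = xml_to_triples(doc)
```

A validation-approach tool would instead derive classes and property ranges from the XSD; this sketch shows why schema-less mapping captures content but not constraints.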
The validation approach generates the ontology from the schema. XSD and document type definition (DTD) are the two major schema languages in use today. However, DTDs are not in XML format and do not support namespaces, while XSD provides more advanced features. Validation approaches therefore exploit XSD, which defines the elements and the simple or complex XML types. Ferdinand et al. [17] proposed a semi-automated approach named ontology web language mapping (OWLMAP), constructed from a set of mapping rules that handle complex types, simple types, attributes, elements, element types, substitution groups and so on. The XML schema to
ontology web language (XS2OWL) [18] approach aims to support interoperability between the XML and
OWL environments. The tool automatically transforms the input XSD into: i) a main ontology, which
directly reflects the defined transformation rules and ii) a mapping ontology. The mapping ontology is
used to keep the RDF identifiers (rdf:IDs) of the OWL constructs of the main ontology which
cannot be generated directly from the main ontology. There are four classes of mapping ontology, which are
complex type info type, element info type, data type property info type and object property info type.
Bedini et al. [19] developed a prototype named Janus, which consists of 40 transformation rules to map the
XSD constructs to ontology (OWL2-RL) constructs. Their approach managed to minimize the information
loss during the transformation process. As an example, the construct ‘restriction’, derived from the
restriction of a simple type, allows the creation of several simple types from the predefined simple types in
XSD. However, the transformation is designed around one application domain, computing statistical analysis
for the business to business (B2B) domain, which becomes the constraint of this approach. The efficient XML
to OWL converter (EXCO) [20] is a tool that can manage both enrichment and population of XML
documents into OWL by covering both internal and external references. Thuy et al. [21] proposed S-Trans,
which transforms XML healthcare data into an ontology based on extraction of the XML schema with an
added description of the semantic knowledge. Subsequently, in another study, Thuy et al. [22] proposed to
reduce the data redundancy resulting from duplicate elements in the XML schema by measuring the
similarity between these duplicates before the transformation process.
More recently, Shapkin and Shumsky [23] proposed modularizing the transformation from XML to
ontologies based on designed templates, which are constructed from class and property values.
Singapogu et al. [24] proposed a mapping that inspects the XML schema elements to automatically structure
and represent them in a first draft of the ontology. Subsequently, a part-of-speech tagging method is
employed to extract the domain knowledge used to refine and enrich the ontology. Jounaidi and Bahaj [25]
formulated rules for mapping the XML schema into an ontology representation. This mapping also covers
the relationships between the nodes to ensure the structure is maintained. They also proposed a canonical
data model (CDM) to transform the XML schema into an OWL ontology [26]. Hacherouf et al. [27] proposed
patterns identification for XSD conversion to OWL (PIXCO), a method based on formal concept analysis
(FCA) to model the transformation patterns. Several processes are involved, including the construction of the
XML schema, the identification of the transformation patterns and the OWL modelling.
From the review, we observed that the EXCO [20] tool is stable and produces enriched
information. Nevertheless, EXCO can be further improved to support some advanced operators and
restrictions. Our proposed framework extends EXCO with some new functionalities, as described in the
next section.
2. THE PROPOSED METHOD
Figure 1 depicts the overall framework of our proposed approach, which adopts the validation
approach. At first, if neither an XSD nor a DTD is available as input, one is generated automatically from the
XML documents. The generation of the target OWL is composed of a few stages, as elaborated next.
2.1. Stage 1: initial transformation step
The trang application programming interface (API) is used in the transformation between an XML
document and its XML schema to define the restrictions on the XML structure. Trang is an open source tool
for working with XML files that can convert schemas into the XSD format. In addition, trang is also capable
of inferring a schema from an XML document itself if the schema is not present.
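As a minimal sketch of what such schema inference involves (this is not trang itself, and the function name is hypothetical), one can walk an XML instance and record, for each element name, the child element names that occur under it; this is the core information needed to emit xs:element and xs:complexType declarations:

```python
import xml.etree.ElementTree as ET

def infer_elements(xml_text):
    """Collect each element name and the names of its child elements,
    the minimal information needed to draft xs:element and
    xs:complexType declarations for a schema."""
    root = ET.fromstring(xml_text)
    children = {}

    def walk(node):
        kids = children.setdefault(node.tag, [])
        for child in node:
            if child.tag not in kids:
                kids.append(child.tag)
            walk(child)

    walk(root)
    return children

sample = "<root><course_listing><title>DB</title></course_listing></root>"
print(infer_elements(sample))
# {'root': ['course_listing'], 'course_listing': ['title'], 'title': []}
```

A real tool such as trang additionally infers data types, occurrence constraints and attribute declarations, which this sketch omits.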
ISSN: 2252-8938, Int J Artif Intell, Vol. 12, No. 1, March 2023: 432-442
Mapping of extensible markup language-to-ontology representation for … (Su-Cheng Haw)
Figure 1. The overview of the framework
2.2. Stage 2: resolving the conflict
Next, to resolve the internal and external references, consolidation mapping is adopted from the
method of [19]: i) collecting schema files, where the XSD schemas are collected and references to their
locations are put into a hash table; the namespaces and references are saved to avoid duplication;
ii) merging schema files, where the namespace prefixes of the saved schemas are unified so that they can be
merged into the main schema file; and iii) reorganizing the schema, to reorganize the internal references
within the main schema file. The referred elements are simply appended into the node, and this is done
hierarchically through the descendants. Finally, the useless elements are eliminated.
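Steps i) and ii) above can be sketched as follows; this is an illustrative reading of the consolidation idea, not the code of [19], and `consolidate` is a hypothetical name:

```python
def consolidate(schemas):
    """Sketch of consolidation mapping: collect schemas into a hash
    table keyed by namespace (step i), then merge the saved schema
    bodies into one main schema (step ii)."""
    table = {}
    for namespace, body in schemas:
        if namespace not in table:  # saved references avoid duplication
            table[namespace] = body
    merged = "\n".join(table.values())  # unify into the main schema file
    return table, merged

table, merged = consolidate([("ns1", "<a/>"), ("ns2", "<b/>"), ("ns1", "<a/>")])
print(len(table))  # prints 2 (the duplicate ns1 entry was skipped)
```

Step iii), reorganizing the internal references hierarchically, is omitted here since it depends on the schema structure itself.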
2.3. Stage 3: automated transformation
The automatic transformation is handled through an algorithm developed based on the
transformation model to map the consolidated output constructs into ontology web language description
logics (OWL-DL) constructs. The process is done without any user intervention, easing the user's burden by
providing the initial mapping construction.
2.4. Stage 4: refinement stage
This stage refines the generated ontology and the mapping bridges. Invalid mappings can be
cleaned and reconstructed. Ontology mappings that cannot be generated directly from the XML schema can
be manually mapped using the mapping ontology that keeps the rdf:IDs of the OWL constructs.
3. METHOD
3.1. Translation on UWM dataset
The University of Wisconsin-Milwaukee (UWM) XML document from the University of
Washington (UW) database group [28] is used as an example. The UWM data are course data derived from
the UWM website. Figure 2(a) shows a partial view of the UWM XML document, which contains a series of
<course_listing> records, each containing the record's elements and values. The next step of the ontology
generation process is the conversion of the XML document into the XML schema, XSD. The generated
XML schema is depicted in Figure 2(b). The generated XSD consists of the elements, sub-elements, and
property restrictions such as the cardinality types, and also the class combination operators (union of,
complement of, intersection of).
The rules for the generation of OWL constructs are shown in Figure 3. An OWL class can be created
from xsd:complexType definitions and from xsd:element declarations that are independent entities. An OWL
data type property is created from elements that contain only a literal with no attributes, as well as from
XML attributes. Constraint properties from the XML schema, such as minOccurs and maxOccurs, are
mapped to the cardinality constraints in OWL, namely owl:minCardinality and owl:maxCardinality. The
inheritance (is-a) relationship derived from the XML schema is mapped to RDF schema (RDFS)
rdfs:subClassOf in OWL; similarly, inheritance between elements is mapped to rdfs:subPropertyOf. The
compositors that combine elements, namely sequence, all and choice, are mapped to owl:intersectionOf,
owl:unionOf or owl:complementOf. Lastly, the model group definitions and attribute group definitions,
which are specialisations of complex types since they contain only element and attribute declarations
respectively, are also mapped to OWL classes. The summary of the transformation mapping rules is shown
in Figure 3.
Figure 2. Partial view of (a) the UWM XML document and (b) the corresponding generated XSD
Figure 3. Transformation rules from XSD to OWL
Figure 4 and Table 1 show the generated OWL classes and properties respectively, where
Table 1(a) lists the object type properties while Table 1(b) lists the data type properties. During the
transformation, when two elements share the same name but sit on different levels, a "has" prefix is added to
the owl:ObjectProperty name. In addition, an rdf:ID is generated for each instance of a class in order. The
generated OWL ontology is shown in Figure 5. This ontology comprises seven complex types, so seven
OWL classes, root, course_listing, restriction, A, section_listing, hours, and bldg_and_rm, are created. The
course_listing, section_listing, hours and bldg_and_rm classes further contain their respective properties as
shown in Figure 5.
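These two naming conventions can be sketched as follows; the function names are hypothetical and this is not the tool's actual code (note that the real generator also camel-cases the result, producing hasSectionListing rather than hasSection_listing):

```python
from itertools import count

# When an element name clashes with a name on a different level, the
# object property gets a "has" prefix; otherwise the name is kept.
def object_property_name(element, clashing_names):
    if element in clashing_names:
        return "has" + element.capitalize()
    return element

# Each class instance receives a sequentially generated rdf:ID.
_ids = count(1)

def new_rdf_id():
    return f"id{next(_ids)}"

print(object_property_name("section_listing", {"section_listing"}))
# hasSection_listing
```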
Figure 4. The classes of UWM OWL
Table 1. The properties of UWM OWL (a) object type property and (b) data type property
(a)
Name Domain Range
Has course Root Course_listing
Course Course_listing Restrictions
Has A Restrictions A
Has section listing Course_listing Section_listing
Has hours Section_listing Hours
Has bldg and rm Section_listing Bldg_and_rm
(b)
Name Domain Range
Note Course_listing xs:string
Course Course_listing xs:string
Title Course_listing xs:string
Credits Course_listing xs:string
Level Course_listing xs:string
HREF A xs:string
Section_note Section_listing xs:string
Section Section_listing xs:string
Days Section_listing xs:string
Instructors Section_listing xs:string
Comments Section_listing xs:string
Start Hours xs:string
End Hours xs:string
Bldg Bldg_and_rm xs:string
Rm Bldg_and_rm xs:string
Figure 5. The generated UWM OWL ontology
Next, the ontology population method that we adopted takes the XML instances and XSD files as
inputs. The XSD documents act as the reference to translate the XML instances into the OWL ontology. The
extracted OWL ontology model is composed of: i) classes for concept definitions; ii) object properties for
object relationships; and iii) data type properties for the relationships between objects and data values. The
following snippet shows an example of the transformation of XML elements into instances, including data
type properties, according to the OWL model.
<course_listing rdf:ID="id11234544">
  <hasSectionListing rdf:resource="#id2213444"/>
</course_listing>
<section_listing rdf:ID="id2213444">
  <section_note rdf:datatype="&xs;string"></section_note>
  <section rdf:datatype="&xs;string">Se 001</section>
  <days rdf:datatype="&xs;string">R</days>
  <instructor rdf:datatype="&xs;string">Silberg</instructor>
  <comments rdf:datatype="&xs;string"></comments>
</section_listing>
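The population step illustrated by the snippet above can be sketched as follows; this is a simplified stand-in for the actual tool that assumes every leaf child maps to an xs:string data type property and generates sequential rdf:IDs:

```python
import xml.etree.ElementTree as ET
from itertools import count

_ids = count(1)

def populate(record_xml):
    """Turn one XML record into an RDF/XML individual: the record
    becomes an instance with a generated rdf:ID, and its children
    become data type property values."""
    node = ET.fromstring(record_xml)
    rdf_id = f"id{next(_ids)}"
    lines = [f'<{node.tag} rdf:ID="{rdf_id}">']
    for child in node:
        lines.append(
            f'  <{child.tag} rdf:datatype="&xs;string">{child.text or ""}</{child.tag}>'
        )
    lines.append(f"</{node.tag}>")
    return "\n".join(lines)

print(populate("<section_listing><section>Se 001</section><days>R</days></section_listing>"))
```

Running this prints a `<section_listing rdf:ID="id1">` individual with `section` and `days` data type properties, in the same shape as the snippet above.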
3.2. Translation on mondial dataset
In addition, the same XML-OWL transformation is applied to the mondial XML document. Tables 2
and 3 show the generated OWL classes and properties. Table 3(a) lists the object type properties while
Table 3(b) lists the data type properties. The generated mondial OWL ontology is depicted in Figure 6.
Table 2. The classes of mondial OWL
Name Constraints
Mondial Has continent, min cardinality=0, max cardinality=unbounded
Continent Has country, min cardinality=0, max cardinality=unbounded
Country Has city, min cardinality=0, max cardinality=unbounded
Has ethnicgroups, min cardinality=0, max cardinality=unbounded
Has religions, min cardinality=0, max cardinality=unbounded
Has encompassed, min cardinality=0, max cardinality=unbounded
Has border, min cardinality=0, max cardinality=unbounded
City Has population, min cardinality=0, max cardinality=unbounded
Population -
Ethnicgroups -
Religions -
Encompassed -
Border -
Table 3. The properties of mondial OWL, (a) object type property and (b) data type property
(a)
Name Domain Range
Has continent Mondial Continent
Has country Mondial Country
Has city Country City
Has population City Population
Has ethnicgroups Country Ethnicgroups
Has religions Country Religions
Has encompassed Country Encompassed
Has border Country Border
(b)
Name Domain Range
Name City, Continent, Country xs:string
Id Continent, City xs:string
Population City, Country xs:int
Latitude City xs:double
Longitude City xs:double
Percentage Ethnicgroups, Religions, Encompassed xs:int
Length Length xs:int
Continent Encompassed xs:int
Country Border xs:string
Gdp_total Country xs:int
Datacode Country xs:string
Population_growth Country xs:double
Car_code Country xs:string
Indep_date Country xs:string
Infant_mortality Country xs:double
Government Country xs:string
Inflation Country xs:int
Gdp_agri Country xs:int
Total_area Country xs:int
Figure 6. The generated mondial OWL ontology
4. RESULTS AND DISCUSSION
The proposed approach has been implemented and applied to several XML datasets from
different domains, including the UWM and mondial datasets. The final ontology is evaluated by comparing
the semantics captured in the manually defined ontologies with the semantics captured by the automatic
generation. Based on this comparison, the semantics captured manually are shown to be the same as those in
the automatic transformation result. The correctness of the ontology representation has been verified by
checking the query results against manual verification. Several designed query tests were executed against
the constructed UWM and mondial ontologies by using the simple protocol and resource description
framework query language (SPARQL) playground, a standalone multi-platform web application [29].
4.1. Query results on UWM dataset
Two queries were executed to check the correctness of the number of returned results on the UWM
ontology representation as compared with the results of the same queries run on the XML dataset itself.
Figure 7 shows the first query, query 1, which lists the course_listing records with a credit of 7. Figure 8
depicts query 2, which lists the number of sections of each course grouped by credit. The number of returned
results shows that the ontology constructed via our mapping scheme is correct.
Figure 7. Test case on Query 1 on UWM dataset
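For illustration, query 1 could be expressed along these lines in SPARQL; the uwm: prefix IRI and the exact class and property names are hypothetical stand-ins for the generated ontology's identifiers, not a verbatim copy of Figure 7:

```sparql
PREFIX uwm: <http://example.org/uwm#>

SELECT ?course
WHERE {
  ?course a uwm:Course_listing ;
          uwm:credits "7" .
}
```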
Figure 8. Test case on Query 2 on UWM dataset
4.2. Query results on mondial dataset
Figure 9 shows the first query, query 1, which lists the countries at latitude 50.3 with their
respective religions. Figure 10 depicts query 2, which shows the cities with a population of more than 10,000.
The number of returned results shows that the ontology constructed via our mapping scheme is correct.
Figure 9. Test case on Query 1 on Mondial dataset
Figure 10. Test case on Query 2 on Mondial dataset
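Similarly, query 2 on the mondial ontology could be sketched as follows in SPARQL; the m: prefix IRI and property names are hypothetical stand-ins for the generated identifiers, not a verbatim copy of Figure 10:

```sparql
PREFIX m: <http://example.org/mondial#>

SELECT ?city ?pop
WHERE {
  ?city a m:City ;
        m:population ?pop .
  FILTER (?pop > 10000)
}
```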
5. CONCLUSION
In this paper, we proposed a set of transformation rules to translate XML documents into an OWL
ontology representation. Through comparison with the manual transformation approach, the generated
ontology is found to accurately define the semantics of the XML document. In future work, we will focus on
the generation of OWL for the unsupported constructs of the validation schema.
ACKNOWLEDGEMENTS
This work is supported by the funding of MMU-Tel U Joint Research Grant (MMUE/210062).
REFERENCES
[1] S.-C. Haw, A. Amin, C.-O. Wong, and S. Subramaniam, “Improving the support for XML dynamic updates using a hybridization
labeling scheme (ORD-GAP),” F1000Research, vol. 10, pp. 1–14, Sep. 2021, doi: 10.12688/f1000research.69108.1.
[2] L. Bao, J. Yang, C. Q. Wu, H. Qi, X. Zhang, and S. Cai, “XML2HBase: storing and querying large collections of XML
documents using a NoSQL database system,” Journal of Parallel and Distributed Computing, vol. 161, pp. 83–99, Mar. 2022,
doi: 10.1016/j.jpdc.2021.11.003.
[3] X. Jin, B. W. Wah, X. Cheng, and Y. Wang, “Significance and challenges of big data research,” Big Data Research, vol. 2, no. 2,
pp. 59–64, Jun. 2015, doi: 10.1016/j.bdr.2015.01.006.
[4] J. Yang et al., “Intelligent bridge management via big data knowledge engineering,” Automation in Construction, vol. 135, pp. 1–14,
Mar. 2022, doi: 10.1016/j.autcon.2021.104118.
[5] N. Luo, M. Pritoni, and T. Hong, “An overview of data tools for representing and managing building information and
performance data,” Renewable and Sustainable Energy Reviews, vol. 147, pp. 1–14, Sep. 2021, doi: 10.1016/j.rser.2021.111224.
[6] A. C. Khadir, H. Aliane, and A. Guessoum, “Ontology learning: grand tour and challenges,” Computer Science Review, vol. 39,
pp. 1–14, Feb. 2021, doi: 10.1016/j.cosrev.2020.100339.
[7] H. Q. Dung, L. T. Q. Le, N. H. H. Tho, T. Q. Truong, and C. H. Nguyen-Dinh, “A novel ontology framework supporting model-
based tourism recommender,” IAES International Journal of Artificial Intelligence (IJ-AI), vol. 10, no. 4, pp. 1060–1068, Dec.
2021, doi: 10.11591/ijai.v10.i4.pp1060-1068.
[8] D. Tomaszuk and D. Hyland-Wood, “RDF 1.1: knowledge representation and data integration language for the web,” Symmetry,
vol. 12, no. 1, p. 84, Jan. 2020, doi: 10.3390/sym12010084.
[9] S. Elnagar, V. Yoon, and M. A. Thomas, “An automatic ontology generation framework with an organizational perspective,” in
Proceedings of the Annual Hawaii International Conference on System Sciences, Jan. 2020, vol. 2020-January, pp. 4860–4869,
doi: 10.24251/hicss.2020.597.
[10] O. Corcho, F. Priyatna, and D. Chaves-Fraga, “Towards a new generation of ontology based data access,” Semantic Web, vol. 11,
no. 1, pp. 153–160, Jan. 2020, doi: 10.3233/SW-190384.
[11] C. Blankenberg, B. Gebel-Sauer, and P. Schubert, “Using a graph database for the ontology-based information integration of
business objects from heterogenous business information systems,” Procedia Computer Science, vol. 196, pp. 314–323, 2022,
doi: 10.1016/j.procs.2021.12.019.
[12] M. Ali, S. Fathalla, S. Ibrahim, M. Kholief, and Y. Hassan, “Cross-lingual ontology enrichment based on multi-agent
architecture,” Procedia Computer Science, vol. 137, pp. 127–138, 2018, doi: 10.1016/j.procs.2018.09.013.
[13] M. Hacherouf, S. N. Bahloul, and C. Cruz, “Transforming XML documents to OWL ontologies: a survey,” Journal of
Information Science, vol. 41, no. 2, pp. 242–259, Apr. 2015, doi: 10.1177/0165551514565972.
[14] M. Klein, “Interpreting XML documents via an RDF schema ontology,” in Proceedings. 13th International Workshop on
Database and Expert Systems Applications, 2002, pp. 889–893, doi: 10.1109/DEXA.2002.1046008.
[15] H. Bohring and S. Auer, “Mapping XML to OWL ontologies,” in Lecture Notes in Informatics (LNI), Proceedings - Series of the
Gesellschaft fur Informatik (GI), Bonn: Gesellschaft für Informatik, 2005, pp. 147–156.
[16] M. J. O’Connor and A. Das, “Acquiring OWL ontologies from XML documents,” in Proceedings of the sixth international
conference on Knowledge capture - K-CAP ’11, 2011, pp. 17–24, doi: 10.1145/1999676.1999681.
[17] M. Ferdinand, C. Zirpins, and D. Trastour, “Lifting XML schema to OWL,” in Lecture Notes in Computer Science (including
subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Berlin: Springer, 2004, pp. 354–358.
[18] C. Tsinaraki and S. Christodoulakis, “Interoperability of XML schema applications with OWL domain knowledge and semantic
web tools,” in On the Move to Meaningful Internet Systems 2007: CoopIS, DOA, ODBASE, GADA, and IS, Berlin, Heidelberg:
Springer, 2007, pp. 850–869.
[19] I. Bedini, C. Matheus, P. F. Patel-Schneider, A. Boran, and B. Nguyen, “Transforming XML schema to OWL using patterns,” in
2011 IEEE Fifth International Conference on Semantic Computing, Sep. 2011, pp. 102–109, doi: 10.1109/ICSC.2011.77.
[20] D. Lacoste, K. P. Sawant, and S. Roy, “An efficient XML to OWL converter,” in Proceedings of the 4th India Software
Engineering Conference on - ISEC ’11, 2011, pp. 145–154, doi: 10.1145/1953355.1953376.
[21] P. T. T. Thuy, Y.-K. Lee, and S. Lee, “S-Trans: semantic transformation of XML healthcare data into OWL ontology,”
Knowledge-Based Systems, vol. 35, pp. 349–356, Nov. 2012, doi: 10.1016/j.knosys.2012.04.009.
[22] P. T. T. Thuy, Y. K. Lee, and S. Lee, “A semantic approach for transforming XML data into RDF ontology,” Wireless Personal
Communications, vol. 73, no. 4, pp. 1387–1402, Dec. 2013, doi: 10.1007/s11277-013-1256-z.
[23] P. Shapkin and L. Shumsky, “A language for transforming the RDF data on the basis of ontologies,” in WEBIST 2015 - 11th
International Conference on Web Information Systems and Technologies, Proceedings, 2015, pp. 504–511, doi:
10.5220/0005479505040511.
[24] S. S. Singapogu, P. C. G. Costa, and J. M. Pullen, “Automated ontology creation using XML schema elements,” CEUR Workshop
Proceedings, vol. 1523, pp. 10–17, 2015.
[25] A. Jounaidi and M. Bahaj, “Designing and implementing XML schema inside OWL ontology,” in 2017 International Conference
on Wireless Networks and Mobile Communications (WINCOM), Nov. 2017, pp. 1–7, doi: 10.1109/WINCOM.2017.8238166.
[26] A. Jounaidi and M. Bahaj, “Converting of an xml schema to an owl ontology using a canonical data model,” Journal of
Theoretical and Applied Information Technology, vol. 96, no. 5, pp. 1422–1435, 2018.
[27] M. Hacherouf, S. Nait-Bahloul, and C. Cruz, “Transforming XML schemas into OWL ontologies using formal concept analysis,”
Software & Systems Modeling, vol. 18, no. 3, pp. 2093–2110, Jun. 2019, doi: 10.1007/s10270-017-0651-4.
[28] University of Washington, “XML repository UWM and mondial datasets.”
http://aiweb.cs.washington.edu/research/projects/xmltk/xmldata/www/repository.html#dblp (accessed May 15, 2018).
[29] S. Bischof, S. Decker, T. Krennwallner, N. Lopes, and A. Polleres, “Mapping between RDF and XML with XSPARQL,” Journal
on Data Semantics, vol. 1, no. 3, pp. 147–185, Sep. 2012, doi: 10.1007/s13740-012-0008-7.
BIOGRAPHIES OF AUTHORS
Su-Cheng Haw is Professor at Faculty of Computing and Informatics,
Multimedia University, where she leads several funded research projects on XML databases. Her
research interests include XML databases, query optimization, data modeling, semantic web,
and recommender system. She is also the chairperson for Center for Web Engineering (CWE)
and chief editor for Journal of Informatics and Web Engineering (JIWE). She can be contacted
at email: sucheng@mmu.edu.my.
Lit-Jie Chew holds a Bachelor of Computer Science (Hons.) major in data
science from Multimedia University, Malaysia and is currently pursuing a Master of Science
(M.Sc.) in Information Technology at Multimedia University, Malaysia. His research areas of
interest include recommender system, ontology, and IoT. He can be contacted at
chewljie@gmail.com.
Dana Sulistiyo Kusumo is a senior lecturer in the School of Computing, Telkom
University, Indonesia. He received BEng in Informatics from the Telkom University,
Indonesia, MEng in Information Technology from Bandung Institute of Technology, Indonesia
and a Ph.D. degree in Computer Science and Engineering from the School of Computer
Science and Engineering, The University of New South Wales, Australia. His research
interests include Software Engineering and Information Architecture. He is the Head of
Information Science and Engineering Lab and also a member of the Software Engineering
Research Group at the School of Computing, Telkom University, Indonesia. He can be
contacted at email: danakusumo@telkomuniversity.ac.id.
Palanichamy Naveen joined the Faculty of Computing and Informatics,
Multimedia University after receiving her Ph.D. from Curtin University, Malaysia. She received
her Bachelor of Engineering (CSE) and Master of Engineering (CSE) from Anna University,
India. Her research interest includes smart grid, cloud computing, machine learning, deep
learning and recommender system. She is involved in multiple research projects funded by
Multimedia University. She can be contacted at email: p.naveen@mmu.edu.my.
Kok-Why Ng is a senior lecturer in the Faculty of Computing and Informatics
(FCI) in Multimedia University (MMU), Malaysia. He did his B.Sc. (Math) in USM, Penang,
and his M.Sc (IT) and Ph.D (IT) in MMU, Malaysia. His research interests are in
Recommender System, 3D Geometric Modeling and Animation. He is also active in some
research projects related to artificial intelligence, deep learning and human blood cells.
He can be contacted at email: kwng@mmu.edu.my.