International Journal of Web & Semantic Technology (IJWesT) Vol.3, No.3, July 2012
DOI: 10.5121/ijwest.2012.3308
METADATA: TOWARDS MACHINE-ENABLED INTELLIGENCE
Vandana Dhingra¹ and Komal Kumar Bhatia²
¹ School of Computational Sciences, Apeejay Stya University, Gurgaon, India
vandana.dhingra@asu.apeejay.edu
² Department of Computer Science, YMCA University, Faridabad, India
komal_bhatia1@rediffmail.com
ABSTRACT
The World Wide Web has revolutionized the availability of data, but with its current structural model it is becoming increasingly difficult to retrieve relevant information, with reasonable precision and recall, using the major search engines. The use of metadata, combined with improved searching techniques, helps to enhance the retrieval of relevant information. The design of structured descriptions of Web resources enables greater search precision and more accurate relevance ranking of retrieved information. One such effort towards standardization is the Dublin Core metadata standard; other standards likewise enhance the retrieval of a wide range of information resources. This paper discusses the importance of metadata, various metadata schemas and elements, and the need for standardization of metadata. It further discusses how metadata can be generated using various tools that assist intelligent agents in efficient retrieval.
KEYWORDS
Semantic web, Intelligent retrieval, FOAF, RDF, Metadata Schema & Dublin Core
1. INTRODUCTION
Metadata, in general terms, is defined as "data about data" or information about information: data that describes information resources. Metadata is structured information that describes and locates an information resource and makes it easier to retrieve, use, or manage. The purpose of metadata is 'to facilitate search, evaluation, acquisition, and use' of resources [1]. It is structured data that allows machine readability and understandability, which is the vision of the Semantic Web.
Metadata can be used to add information that describes a document, such as:
• Title of document
• Author of document
• Time and date of creation
• Standards used
There are three main types of metadata [2]:
1. Descriptive metadata describes a resource for purposes such as discovery and identification. It
can include elements such as title, abstract, author, and keywords.
2. Structural metadata indicates how compound objects are put together, for example, how pages
are ordered to form chapters.
3. Administrative metadata provides information to help manage a resource, such as when and
how it was created, file type and other technical information, and who can access it. There are
several subsets of administrative metadata; two that sometimes are listed as separate metadata types
are:
a. Rights management metadata, which deals with intellectual property rights, and
b. Preservation metadata, which contains information needed to archive and preserve a
resource.
Metadata descriptions present two advantages [3]:
• They enable the abstraction of details such as the format and organization of data, and capture
the information content of the underlying data independent of representational details. This
represents the first step in reduction of information overload, as intentional metadata descriptions
are in general an order of magnitude smaller than the underlying data.
• They enable representation of domain knowledge describing the information domain to which
the underlying data belong. This knowledge may then be used to make inferences about the
underlying data. This helps in reducing information overload as the inferences may be used to
determine the relevance of the underlying data without accessing the data itself, as illustrated in the small sketch below.
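To make this second point concrete, the following small Python sketch (purely illustrative; the records and field names are hypothetical and not taken from the paper) judges relevance from metadata descriptions alone, without fetching the underlying resources:

# Hypothetical metadata records: each dictionary is a compact description of a
# much larger underlying resource that is never downloaded here.
records = [
    {"title": "Semantic Web Tutorial", "subject": "semantic web, RDF, metadata"},
    {"title": "Holiday Photo Album", "subject": "travel, photography"},
]

def is_relevant(record, query):
    # Relevance is decided from the metadata fields only.
    text = (record["title"] + " " + record["subject"]).lower()
    return query.lower() in text

print([r["title"] for r in records if is_relevant(r, "rdf")])
# Prints: ['Semantic Web Tutorial']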
The remainder of this paper is organized as follows: Section 2 gives an overview of metadata schemas and element sets. Section 3 describes the structuring of metadata, including the Dublin Core standard. Section 4 describes the role of metadata in the Semantic Web. Section 5 covers how metadata can be generated for existing web pages, along with related work, and Section 6 concludes the paper.
2. Metadata Schemas and Element Sets
2.1. Metadata Schema
A schema is a set of rules covering the elements and requirements for coding. Metadata schemas
are sets of metadata elements designed for a specific purpose, such as describing a particular type
of information resource. The definition or meaning of the elements themselves is known as the
semantics of the schema.
Some of the most popular metadata schemas include:
Dublin Core
AACR2 (Anglo-American Cataloguing Rules)
GILS (Government Information Locator Service)
EAD (Encoded Archives Description)
IMS (IMS Global Learning Consortium)
AGLS (Australian Government Locator Service)
Examples of schemas in the Semantic Web include Dublin Core, FOAF (Friend of a Friend), and many others. Metadata schemas specify the names of elements and their semantics. Common metadata encoding schemes include:
• SGML (Standard Generalized Mark-up Language)
• XML (Extensible Mark-up Language), developed by the World Wide Web Consortium (W3C)
• HTML (Hyper-Text Markup Language)
2.2. Metadata Deployment
Metadata may be deployed in a number of ways:
• Embedding the metadata in the Web page using META tags in the HTML coding of the
page
• As a separate HTML document linked to the resource it describes
• In a database linked to the resource. The records may either have been directly created
within the database or extracted from another source, such as Web pages.
3. Structuring of Metadata
The metadata of each web document has its own structure, so some common structure is needed if this data is to be processed by machines in a uniform manner. Different authors will describe a given page in their own ways, so each page will have its own unique structure and it will not be possible for an automated agent to process this metadata uniformly. For processing by machines, therefore, a metadata standard is needed. Such standards are called metadata schemas, and Dublin Core is one such standard.
3.1 Dublin Core Standard
The Dublin Core Metadata Element Set originally comprised 13 elements. Basic metadata elements indicate the title, author, year of publication and similar simple bibliographic data; richer metadata structures also cover technical features, copyright properties, annotations and so on [4]. The original 13 core elements were later increased to 15 [5]:
• Title: a name given to the resource
• Creator: an entity primarily responsible for making the resource
• Subject: the topic of the resource
• Description: an account of the resource
• Publisher: an entity responsible for making the resource available
• Contributor: an entity responsible for making contributions to the resource
• Date: a point or period of time associated with an event in the lifecycle of a resource
• Type: the nature or genre of the resource
• Format: the file format, physical medium, or dimensions of the resource
• Identifier: an unambiguous reference to the resource within a given context
• Source: the resource from which the described resource is derived
• Language: a language of the resource
• Relation: a related resource
• Coverage: the spatial or temporal topic of the resource, the spatial applicability of the
resource, or the jurisdiction under which the resource is relevant
• Rights: information about rights held in and over the resource
Figure 1 shows how the DC metadata standard can be used in an HTML web page. The Dublin Core metatags can be placed within the head section of the HTML code of web pages. Normally, the Dublin Core elements are preceded by the "DC" prefix. The best method to use
whenever you need to use Dublin Core elements in your HTML code is to have this metadata
embedded. The main characteristics of Dublin Core are:
• Simplicity (of creation and maintenance)
• Interoperability (among collections and indexing systems)
• International applicability
• Extensibility
• Modularity
<html>
<head>
<title>Example Metadata in Dublin Core</title>
<meta name="DC.Title" content="Example Metadata in Dublin Core">
<meta name="DC.Creator" content="Vandana Dhingra">
<meta name="DC.Creator.Address" content="vandana@yahoo.com">
<meta name="DC.Date.Created" content="2012-02-10">
<meta name="DC.Type" content="Text.Homepage.Tutorial">
<meta name="DC.Format" content="Text/Html">
</head>
<body>
This is my first tutorial in Dublin Core
</body>
</html>

Figure 1: Example of how the Dublin Core metadata elements can be used in Web content
4. Role of Metadata in Semantic Web
Tim Berners-Lee introduced the term "Semantic Web", foreseeing that we are creating a web which can only be managed with the help of intelligent software agents [6]. Berners-Lee's vision for the Web is termed the "Semantic Web", where semantic metadata are the building blocks. One effort towards this is the semantic search engine Swoogle [7], an implemented metadata and search engine for online Semantic Web documents. Swoogle analyzes these documents and their constituent parts (e.g., terms and triples) and records meaningful metadata about them [8]. By annotating or enhancing documents with semantic metadata, software programs (agents) can automatically understand the full context and meaning of each document and can make correct decisions about who can use the documents and how they should be used. RDF offers a set of rules for creating semantic relations, and RDF Schema can be used to define elements and vocabularies. The relations are defined with a very simple mechanism that can also be processed by machines. In an RDF environment every resource has to have a unique identifier (URI); it can have properties, and properties can have values.
Users alone cannot process the huge amount of knowledge available on the web. The Semantic Web is based on the fundamental idea that web resources should be annotated with semantic mark-up that captures information about their meaning [9]. The objective is to provide a common framework that allows data to be shared and reused across applications and that brings structure to the meaningful content of web pages, creating an environment where software agents roaming from page to page can readily carry out advanced searches. Hence, metadata initiatives are important steps towards the realization of the Semantic Web. The following subsections describe various representation languages for describing metadata.
4.1 RDF/XML
RDF provides a rich foundation for describing metadata. RDF (Resource Description Framework) allows multiple metadata schemes to be read by humans as well as parsed by machines. It uses XML (Extensible Markup Language) to express structure, thereby allowing metadata communities to define the actual semantics [10]. RDF allows multiple objects to be described without specifying the level of detail required. The underlying glue, XML, simply requires that all namespaces be defined; once defined, they can be used to the extent needed by the provider of the metadata.
The Resource Description Framework (RDF), developed by the World Wide Web Consortium
(W3C), is a data model for the description of resources on the Web that provides a mechanism for
integrating multiple metadata schemes. For metadata, the most popular bindings nowadays are
XML, or, more specifically, RDF [11]. In RDF a namespace is defined by a URL pointing to a
namespaces can be defined, allowing elements from different schemes to be combined in a single
resource description. Multiple descriptions, created at different times for different purposes, can
also be linked to each other.
Example: a description with a single statement that uses a literal value. The statement "Vandana Dhingra is the creator of the resource http://vandana.dhingra/home/index.html" [12] can be represented in RDF as follows:

<?xml version="1.0"?>
<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:s="http://description.org/schema/">
  <rdf:Description rdf:about="http://vandana.dhingra/home/index.html">
    <s:Creator>Vandana Dhingra</s:Creator>
  </rdf:Description>
</rdf:RDF>

Figure 2: Metadata represented in an RDF document

The ontology creation language OIL extends RDF Schema and allows one to be much more specific about what sort of thing a person is, the properties a thing needs to have to be a wn:person, and so on [13].

4.2 FOAF
FOAF is one of the metadata standards for the Semantic Web; Figure 3 shows an example of a Web document described in FOAF. FOAF is all about creating and using machine-readable homepages that describe people, the links between them, and the things they create and do. FOAF defines an RDF vocabulary for expressing metadata about people and the relations among them.

<rdf:Description>
  <dc:title>Tutorial</dc:title>
  <dc:creator>
    <foaf:Person>
      <foaf:name>Vandana Dhingra</foaf:name>
      <foaf:homepage rdf:resource="http://rdfweb.org/people/vandana"/>
    </foaf:Person>
  </dc:creator>
  <dc:description>Semantic Web Tutorial</dc:description>
</rdf:Description>

Figure 3: Metadata represented in FOAF
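The Dublin Core and FOAF descriptions above can also be built and combined programmatically. The following is a minimal sketch assuming the Python rdflib library (a tool chosen for this illustration; the paper itself does not prescribe it). It combines Dublin Core and FOAF terms in a single machine-processable description and serializes it as RDF/XML, the binding discussed in Section 4.1:

# Minimal sketch using rdflib (an assumption of this illustration) to combine
# Dublin Core and FOAF vocabularies in one description.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import FOAF, RDF

DC = Namespace("http://purl.org/dc/elements/1.1/")

g = Graph()
g.bind("dc", DC)
g.bind("foaf", FOAF)

resource = URIRef("http://vandana.dhingra/home/index.html")  # resource from Figure 2
person = URIRef("http://rdfweb.org/people/vandana")          # person from Figure 3

# Dublin Core statements about the document.
g.add((resource, DC.title, Literal("Semantic Web Tutorial")))
g.add((resource, DC.creator, person))

# FOAF statements about the creator.
g.add((person, RDF.type, FOAF.Person))
g.add((person, FOAF.name, Literal("Vandana Dhingra")))
g.add((person, FOAF.homepage, URIRef("http://rdfweb.org/people/vandana")))

# Serialize the combined description as RDF/XML.
print(g.serialize(format="xml"))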
5. Generation of Metadata for Existing Web Pages
Metadata generation tools generate metadata from Web resources and store it in RDF for later use. There are various tools to create metadata for already existing Web pages; two such tools are described below.

1. Reggie [14] is a tool capable of extracting metadata from given Web pages. The user can select any existing schema file or can create his/her own schema files. Reggie extracts the META tags from a given URL and attempts to add them to the most appropriate fields of the chosen schema.
2. DC-DOT [15] is another tool for metadata extraction. In contrast to Reggie, where the user provides the schema file, DC-DOT specifically uses the Dublin Core schema, a metadata element set for the description of electronic resources, to extract metadata from a given Web resource. DC-DOT uses the information contained in the META tags of a Web source to generate the RDF model.
Using DC-dot: the tool reads the page you submit and automatically generates DC metadata for the web page.
Figure 4: Generation of Dublin Core metadata
Figure 5: A Dublin Core description represented in RDF
This metadata is generally read by automated tools. DC-dot reads the page at the submitted URL and automatically generates DC metadata for the user. Figure 4 shows the generated metadata for the submitted URL http://ymcaust.ac.in, and Figure 5 shows the generated RDF document with the metadata of the same URL.
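As a rough sketch of the kind of extraction performed by such tools (this is not the actual Reggie or DC-dot implementation, and the example URL is hypothetical), the following Python code fetches a page, collects the name/content pairs from its embedded META tags, and prints them; tags written with the "DC." prefix, as in Figure 1, would surface here as Dublin Core metadata:

# Rough sketch of META-tag extraction in the spirit of Reggie/DC-dot; not the
# actual implementation of either tool.
from html.parser import HTMLParser
from urllib.request import urlopen


class MetaTagCollector(HTMLParser):
    """Collects name/content pairs from <meta> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        name, content = attrs.get("name"), attrs.get("content")
        if name and content:
            self.meta[name] = content


def extract_metadata(url):
    """Fetch a page and return its embedded META-tag metadata as a dict."""
    html = urlopen(url).read().decode("utf-8", errors="replace")
    collector = MetaTagCollector()
    collector.feed(html)
    return collector.meta


if __name__ == "__main__":
    # Hypothetical URL; any page that embeds DC.* META tags would do.
    for name, value in extract_metadata("http://example.org/").items():
        print(name + ": " + value)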
6. Conclusion
Metadata is a very useful element in organizing content on the Internet and in enabling relevant search results to be found effectively. In this paper we have discussed metadata, how its use impacts searching efficiency for search agents, and the generation of metadata for existing web pages. However, many problems and issues remain unsolved. Open research issues in this area include:
• Interoperability with respect to syntax, semantics, vocabularies, languages, and underlying models.
• Identifying the best metadata models, schemas, and vocabularies for metadata models.
Authors
Ms. Vandana Dhingra is pursuing a Ph.D. from YMCAIE University in the field of the Semantic Web. She holds an M.Tech (Information Technology) and a B.E. (Computer Science and Engineering) with First Class with distinction, and has 15 years of teaching experience in reputed engineering colleges. She has administrative experience as Head of the Computer Department for five years. She has published research papers in the IEEE Digital Library and at various international and national conferences, and is currently working at Apeejay Stya University. Her subjects of interest include Digital System Design using VHDL, Operating Systems and Distributed Operating Systems. Her research interests are the Semantic Web, information retrieval, crawlers, search engines and ontologies.
Dr. Komal Kumar Bhatia received his B.E., M.Tech. and Ph.D. degrees in Computer Science and Engineering with Honors from Maharishi Dayanand University. Presently, he is working as an Associate Professor in the Computer Engineering Department at YMCA University of Science & Technology, Faridabad. He is guiding Ph.D. students in Computer Engineering and has supervised many M.Tech theses. His subject interests include analysis and design of algorithms and information retrieval. He has almost 80 research papers in refereed international journals, national journals, and international conferences to his credit. His research interests include web crawlers, the hidden web, information retrieval, the deep web and the Semantic Web.