The document describes a web-based ontology editor that was developed to make ontology building easier for non-experts. The editor uses several approaches, including being web-based to not require installation, limiting technical terms and functions, and extracting property-value pairs from web pages to assist with registering instances. It provides an intuitive graphical view and list views of ontologies, along with sample applications to demonstrate usage. The key contribution discussed is an approach for extracting candidate property-value pairs using bootstrapping and dependency parsing techniques, and having users select the correct values. The accuracy of this extraction method is evaluated.
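The extraction step lends itself to a short illustration. Below is a minimal sketch, not the paper's actual pipeline, of mining candidate property-value pairs from dependency parses with spaCy; the two patterns and the example sentence are invented, and, as in the editor, a user would still select the correct pairs from the candidates:

```python
# Minimal sketch: extract candidate property-value pairs from text using
# dependency parsing with spaCy. Illustrative only; not the paper's pipeline.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def candidate_pairs(text):
    """Yield (property, value) candidates from two simple dependency patterns."""
    doc = nlp(text)
    for token in doc:
        # Pattern 1: "The <property> is <value>" (copular construction)
        if token.dep_ in ("attr", "acomp") and token.head.lemma_ == "be":
            for subj in (c for c in token.head.children if c.dep_ == "nsubj"):
                yield (subj.text, token.text)
        # Pattern 2: numeric modifier, e.g. "weighs 250 grams" -> ("grams", "250")
        if token.dep_ == "nummod":
            yield (token.head.text, token.text)

if __name__ == "__main__":
    for prop, val in candidate_pairs("The capacity is 500 ml. It weighs 250 grams."):
        print(prop, "->", val)  # the user picks the correct pairs from these
```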
The document describes an algorithm for parsing resumes into groups based on the formatting of titles. It was tested on resumes of varying lengths, with longer resumes, such as Collins' 112 pages, posing more challenges. For the Collins resume, the system was able to recognize and group titles with 30-40% accuracy. The next phase will focus on individual sections, extracting the needed information from groups more semantically, reducing errors, and removing noise to improve accuracy.
Intelligent Hiring with Resume Parser and Ranking using Natural Language Proc... (Zainul Sayed)
Using Natural Language Processing (NLP) and Machine Learning (ML) to rank resumes against given constraints, this intelligent system ranks resumes of any format according to the constraints or requirements provided by the client company. The system takes a bulk of input resumes from the client company, which also provides the requirements and constraints according to which the resumes are to be ranked. Beyond the details acquired from the resumes, the system also reads the candidates' social profiles (such as LinkedIn and GitHub), which provide more genuine information about each candidate.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nanotechnology & Science, Power Electronics, Electronics & Communication Engineering, Computational Mathematics, Image Processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design, etc.
Towards From Manual to Automatic Semantic Annotation: Based on Ontology Eleme... (IJwest)
This document describes a proposed system for automatic semantic annotation of web documents based on ontology elements and relationships. It begins with an introduction to semantic web and annotation. The proposed system architecture matches topics in text to entities in an ontology document. It utilizes WordNet as a lexical ontology and ontology resources to extract knowledge from text and generate annotations. The main components of the system include a text analyzer, ontology parser, and knowledge extractor. The system aims to automatically generate metadata to improve information retrieval for non-technical users.
This document presents a framework for automatically generating entity-relationship (ER) diagrams from natural language text input. It involves five main modules: 1) text preprocessing and summary generation, 2) translating the summary to a Semantic Business Vocabulary and Rules (SBVR) format, 3) part-of-speech tagging, 4) extracting ER diagram requirements by identifying entities, relationships, and attributes, and 5) generating an XMI file that can be imported into a UML modeling tool to visualize the generated ER diagram. Keywords are extracted from the input text using term frequency, and sentences are scored and selected for the summary based on important keywords and nouns. The framework aims to reduce the complexity of manually creating ER diagrams by automating their generation from the input text.
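As a rough illustration of the term-frequency scoring the first module describes, here is a minimal extractive-summary sketch; the stop-word list and scoring rule are simplified stand-ins, not the framework's actual implementation:

```python
# Minimal sketch of term-frequency sentence scoring for summary generation.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "are", "of", "to", "and", "in", "that"}

def summarize(text, n_sentences=2):
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]
    tf = Counter(words)  # keyword weight = raw term frequency

    def score(sent):
        tokens = re.findall(r"[a-z]+", sent.lower())
        return sum(tf[t] for t in tokens if t not in STOPWORDS) / max(len(tokens), 1)

    ranked = sorted(sentences, key=score, reverse=True)[:n_sentences]
    # keep the original order so the summary reads naturally
    return " ".join(s for s in sentences if s in ranked)
```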
2015 - User Modeling of Skills and Expertise from Resumes - KMIS (Hua Li, PhD)
The document describes a Resume Expertise Modeling Algorithm (REMA) that analyzes resumes to automatically generate expertise models. REMA extracts expertise topics from resumes using natural language processing. It then builds weighted expertise models by incrementally processing resume information over time using reinforcement to increase weights for recent expertise and forgetting to decrease weights for outdated skills. The authors developed a prototype system using REMA and are evaluating its performance at identifying people's skills and expertise from their resumes.
Tracing Requirements as a Problem of Machine Learning (ijseajournal)
The document discusses using a machine learning approach to classify traceability links between requirements. It proposes a 2-learner model that uses both lexical features from word pairs as well as features derived from a hand-built ontology. The model achieves a 56% reduction in error compared to a baseline using only lexical features, and performance is improved further by generating additional pseudo training instances from the ontology.
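The paper's model details are not reproduced here, but a two-learner combination can be sketched as follows with scikit-learn; the feature matrices, the choice of logistic regression, and the probability-averaging rule are all illustrative assumptions:

```python
# Rough sketch of a two-learner setup: one classifier over lexical (word-pair)
# features, one over ontology-derived features, combined by averaging their
# predicted probabilities. Toy random data stands in for real features.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_two_learner(X_lex, X_onto, y):
    lex_clf = LogisticRegression(max_iter=1000).fit(X_lex, y)
    onto_clf = LogisticRegression(max_iter=1000).fit(X_onto, y)
    return lex_clf, onto_clf

def predict_two_learner(models, X_lex, X_onto):
    lex_clf, onto_clf = models
    p = (lex_clf.predict_proba(X_lex) + onto_clf.predict_proba(X_onto)) / 2
    return p.argmax(axis=1)  # 1 = traceability link, 0 = no link

rng = np.random.default_rng(0)
X_lex, X_onto = rng.random((20, 5)), rng.random((20, 3))
y = rng.integers(0, 2, 20)
models = fit_two_learner(X_lex, X_onto, y)
print(predict_two_learner(models, X_lex, X_onto))
```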
This document discusses sentiment analysis of tweets from Twitter. It begins with an introduction to how social media allows people to share opinions and how analyzing sentiment can be useful. It then discusses previous work on sentiment analysis of Twitter data, focusing on techniques like Naive Bayes classification. The document outlines a proposed approach to collecting Twitter data using APIs, preprocessing the data by removing stop words and emoticons, and classifying sentiment using Naive Bayes. Finally, it discusses applications of sentiment analysis and potential areas for future work, such as handling multiple languages and semantic analysis.
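A minimal sketch of the pipeline the document outlines (strip stop words and emoticons, classify with Naive Bayes), assuming scikit-learn and toy training data in place of tweets collected via the APIs:

```python
# Minimal sketch: preprocess tweets, then classify sentiment with Naive Bayes.
import re
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def preprocess(tweet):
    tweet = re.sub(r"[:;=][-~]?[)(DPp]", " ", tweet)   # drop common emoticons
    tweet = re.sub(r"@\w+|https?://\S+", " ", tweet)   # drop mentions and URLs
    return tweet.lower()

train = ["i love this phone", "great product", "terrible service", "i hate delays"]
labels = ["pos", "pos", "neg", "neg"]

model = make_pipeline(CountVectorizer(stop_words="english"), MultinomialNB())
model.fit([preprocess(t) for t in train], labels)
print(model.predict([preprocess("love the great camera :)")]))  # -> ['pos']
```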
The document proposes developing a skills taxonomy to address problems recruiters face in evaluating job applicants. It discusses related work on skills taxonomies, which lacked comprehensive relationship information or public availability. As a motivating example, it describes using an individual's Stack Overflow posts and tags to create a skills cloud, but initial methods produced noisy results. A hybrid approach is proposed that seeds a taxonomy with public resources like Wikipedia and identifies skills and relationships through data mining. Experimental results showed a 98% classification rate.
Generating requirements analysis models from textual requirements (fortes)
This document describes a process for generating use case models from textual requirements. The process uses the EA-Miner tool to analyze textual requirements and extract information like functional concerns, RDL sentences, and a syntactically tagged document. This extracted information is used to derive initial candidate use cases, actors, and relationships. The candidate model is then refined by activities like removing undesirable use cases, completing abstraction names, adding new use cases/actors, and defining relationships between use cases. The overall goal is to reduce the time and effort required to produce requirements artifacts from textual specifications.
IRJET - A Novel Approach Automatically Categorizing Software Technologies (IRJET Journal)
This document proposes an automatic approach called Witt to categorize software technologies based on their descriptions. Witt takes a sentence describing a technology as input and outputs a general category (e.g. integrated development environment) along with qualifying attributes. It applies natural language processing and the Levenshtein distance algorithm to compare string similarities and categorize technologies from large datasets. The system architecture first obtains data on software methodologies and labels. It then applies NLP and Levenshtein distance to find hypernyms and transform them into categories with attributes for classification.
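The Levenshtein component is standard and easy to show; below is the usual dynamic-programming form (this much is textbook, though Witt's surrounding categorization pipeline is not reproduced):

```python
# Levenshtein (edit) distance in its standard dynamic-programming form,
# keeping only two rows of the DP table at a time.
def levenshtein(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

print(levenshtein("integrated development environment", "development environment"))
```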
A web based approach: Acronym Definition Extraction (IRJET Journal)
The document presents an automatic web-based approach to extract definitions of acronyms. It uses web resources like Google, Bing, Wikipedia, and Acronym Finder to identify definitions. Snippets and titles are extracted from search engine results and pattern extraction algorithms are used to identify definitions. Over 100 acronym definitions were successfully extracted from the different web resources as an example. The extracted definitions could potentially be applied in areas like information retrieval and question answering systems.
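As a hedged illustration of pattern-based extraction, the sketch below applies two common surface patterns to snippet text; the patterns and the sanity check are generic examples, not the paper's actual rules:

```python
# Sketch of pattern-based acronym definition extraction from snippet text.
import re

PATTERNS = [
    re.compile(r"([A-Z][\w-]+(?:\s+[\w-]+){0,6})\s*\(\s*([A-Z]{2,10})\s*\)"),  # Def (ACRO)
    re.compile(r"\b([A-Z]{2,10})\s*\(\s*([^)]+?)\s*\)"),                        # ACRO (Def)
]

def extract(snippet):
    found = []
    for pat in PATTERNS:
        for m in pat.finditer(snippet):
            a, b = m.groups()
            acro, definition = (a, b) if a.isupper() else (b, a)
            # crude sanity check: acronym should echo the definition's initials
            initials = "".join(w[0] for w in definition.split()).upper()
            if acro.upper() in initials or initials.startswith(acro[0].upper()):
                found.append((acro, definition))
    return found

print(extract("Natural Language Processing (NLP) is used in QA (question answering)."))
```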
This document presents an approach for extracting ontologies from heterogeneous documents. It discusses how ontologies play an important role in the semantic web for knowledge management and interoperability. The authors describe a clustering algorithm that identifies concepts and relationships by processing sentences from input documents. Key steps include marking the first word of each sentence as a parent concept and subsequent words as child concepts. They also describe a harmonization process to integrate extracted ontologies with existing knowledge bases by matching and merging corresponding concepts and relations. The authors applied their approach to documents in text, document and PDF formats, and were able to extract concept hierarchies and relationships from the input files.
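The parent/child step as described can be sketched directly; this is a bare-bones reading of it, without the lemmatization and filtering a real system would need:

```python
# Direct reading of the described step: the first word of each sentence becomes
# a parent concept and the remaining words its children.
import re
from collections import defaultdict

def extract_hierarchy(text):
    hierarchy = defaultdict(set)
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        words = re.findall(r"[A-Za-z]+", sentence)
        if len(words) >= 2:
            parent, children = words[0].lower(), words[1:]
            hierarchy[parent].update(w.lower() for w in children)
    return dict(hierarchy)

print(extract_hierarchy("Ontologies describe concepts. Ontologies enable interoperability."))
# {'ontologies': {'describe', 'concepts', 'enable', 'interoperability'}}
```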
This document summarizes a research paper that evaluates different classification methods for detecting spam users in social bookmarking systems. It tests naive Bayes and k-nearest neighbor classifiers on user data represented using three information retrieval models: Boolean, bag-of-words, and TFIDF. The best results were achieved using naive Bayes with a Boolean user representation, accurately classifying 97.5% of users. K-nearest neighbor worked best across all three representations, with over 96% accuracy using TFIDF. The study aims to automatically detect spam users through supervised machine learning techniques.
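A small sketch of the two best-performing setups, using scikit-learn and toy data in place of real bookmarking users (BernoulliNB stands in for naive Bayes over Boolean features):

```python
# Sketch of the compared setups: Naive Bayes on a Boolean user representation
# versus k-NN on TF-IDF. Toy documents stand in for users' posted terms.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.naive_bayes import BernoulliNB
from sklearn.neighbors import KNeighborsClassifier

users = ["cheap pills buy now", "semantic web ontology", "win money now", "python tutorial"]
labels = [1, 0, 1, 0]  # 1 = spammer

bool_X = CountVectorizer(binary=True).fit_transform(users)   # Boolean model
tfidf_X = TfidfVectorizer().fit_transform(users)             # TF-IDF model

nb = BernoulliNB().fit(bool_X, labels)          # suits binary features
knn = KNeighborsClassifier(n_neighbors=1).fit(tfidf_X, labels)
print(nb.predict(bool_X), knn.predict(tfidf_X))
```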
Studying user footprints in different online social networks (IIIT Hyderabad)
This document describes research on linking a user's accounts across multiple online social networks. It discusses the challenges in linking accounts as usernames and profiles can differ across networks. Existing techniques for linking are reviewed, along with their limitations. The paper then presents a new supervised learning approach to link Twitter and LinkedIn accounts based on similarity metrics for different profile fields. Evaluation shows the approach can accurately match accounts with 98% accuracy and discover new candidate matches for a given user profile.
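The field-similarity idea can be sketched as a feature extractor; the field names and the SequenceMatcher metric below are illustrative assumptions, not the paper's actual metrics:

```python
# Sketch of field-wise similarity features for deciding whether two profiles
# belong to the same user; a supervised classifier would train on such vectors.
from difflib import SequenceMatcher

def field_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() if a and b else 0.0

def profile_features(p1, p2):
    return [field_sim(p1.get(f, ""), p2.get(f, ""))
            for f in ("username", "name", "location", "description")]

twitter = {"username": "jdoe42", "name": "Jane Doe", "location": "Berlin"}
linkedin = {"username": "jane-doe", "name": "Jane Doe", "location": "Berlin, Germany"}
print(profile_features(twitter, linkedin))  # feature vector for a matcher
```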
The project lets students ask college-related queries and get responses through a chatbot, an artificial conversational entity. The system is a web application that provides answers to students' queries. Students just have to query through the bot used for chatting, and they can chat in any format; there is no specific format the user has to follow. This system helps students stay updated about college activities.
Resume Parsing with Named Entity Clustering Algorithm (Swapnil Sonar)
The paper gives an overview of an ongoing project that deploys information extraction techniques to turn resumes into compact, highly structured data. This online tool has removed much of the burden from the shoulders of recruitment agency staff. The Resume Parser automatically segregates information by various fields and parameters, such as name and phone/mobile numbers; a huge volume of resumes is no problem for this system, and all work is done automatically, without any human intervention.
The resume extraction process consists of four phases. In the first phase, a resume is segmented into blocks according to their information types. In the second phase, named entities are found using specialized chunkers for each information type. In the third phase, the named entities found are clustered according to their distance in the text and their information type. In the fourth phase, normalization methods are applied to the text.
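A skeleton of those four phases might look like the following; the regex "chunkers", the 80-character clustering threshold, and the normalization rule are all invented simplifications of what the paper describes:

```python
# Skeleton of the four described phases: block segmentation, type-specific
# chunking, distance-based entity clustering, and normalization.
import re

PHONE = re.compile(r"\+?\d[\d\s/-]{7,}\d")
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def extract(resume_text):
    # Phase 1: segment into blocks on blank lines, keeping global offsets
    blocks, pos = [], 0
    for part in re.split(r"(\n\s*\n)", resume_text):
        if part.strip():
            blocks.append((pos, part))
        pos += len(part)
    entities = []
    for offset, block in blocks:
        # Phase 2: type-specific "chunkers" (here: simple regexes)
        for kind, pat in (("phone", PHONE), ("email", EMAIL)):
            for m in pat.finditer(block):
                entities.append((kind, m.group(), offset + m.start()))
    # Phase 3: cluster entities that are close together in the text
    entities.sort(key=lambda e: e[2])
    clusters, last_pos = [], -10**9
    for kind, value, p in entities:
        if p - last_pos > 80 or not clusters:
            clusters.append([])
        clusters[-1].append((kind, value))
        last_pos = p
    # Phase 4: normalization (e.g. strip phone separators)
    return [[(k, re.sub(r"[\s/-]", "", v) if k == "phone" else v)
             for k, v in c] for c in clusters]

print(extract("Jane Doe\nphone: 012-345 6789\n\nContact: jane@example.com"))
```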
FEATURE LEVEL FUSION USING FACE AND PALMPRINT BIOMETRICS FOR SECURED AUTHENTI... (pharmaindexing)
This document discusses feature level fusion of biometric modalities for secure authentication. It explores fusion at the feature level of 1) PCA and LDA coefficients of face images, 2) LDA coefficients of RGB color channels of face images, and 3) face and hand modalities. Preliminary results show fusion at the feature level outperforms fusion at the match score level in some cases, though further analysis is needed to understand why performance varies across datasets. The paper introduces using threshold absolute distance along with Euclidean distance to better distinguish genuine from imposter feature vector matches in the critical region of score distributions.
IRJET - Development of College Enquiry Chatbot using Snatchbot (IRJET Journal)
This document describes the development of a college enquiry chatbot using SnatchBot. The chatbot was developed to provide information to users about college activities and ease the workload of office staff. It uses natural language processing and a keyword matching algorithm to match user queries to responses from its knowledge base. If no match is found, the user is provided a default message. The chatbot has a user-friendly GUI and is accessible anytime via web. It also allows users to provide feedback if answers are invalid, which is sent to the admin for knowledge base updates.
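A minimal sketch of keyword matching with a default fallback, with an invented knowledge base standing in for the chatbot's real one:

```python
# Minimal keyword-matching chatbot with a default message when nothing matches.
import re

KNOWLEDGE_BASE = {
    frozenset({"fee", "fees", "tuition"}): "Semester fees are published on the accounts page.",
    frozenset({"admission", "admissions", "apply"}): "Admissions open in June; apply via the portal.",
    frozenset({"exam", "exams", "timetable"}): "The exam timetable is posted two weeks in advance.",
}
DEFAULT = "Sorry, I don't have an answer for that. Your query was forwarded as feedback."

def reply(query):
    words = set(re.findall(r"[a-z]+", query.lower()))
    best, best_overlap = DEFAULT, 0
    for keywords, answer in KNOWLEDGE_BASE.items():
        overlap = len(words & keywords)
        if overlap > best_overlap:
            best, best_overlap = answer, overlap
    return best

print(reply("When do admissions open?"))   # keyword match
print(reply("Is there a gym?"))            # falls back to the default message
```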
This document describes a college enquiry chatbot that was developed to provide students with a way to get information about their college without having to visit in person. The chatbot uses algorithms to analyze user queries and respond to common questions about things like fees, admission processes, exams, and other college activities. It was created to reduce the time and effort spent by students and parents in obtaining information from the college. The chatbot system includes a database to store question and answer pairs, and an admin interface to update responses for questions not currently in the database.
Fundamentals of Database Systems questions and answers with explanations, for freshers and experienced candidates, for interviews, competitive examinations, and entrance tests.
Convincing a customer is always a challenging task in every business, but in online business this task becomes even more difficult. Online retailers try everything possible to gain the customer's trust. One solution is to provide an area where existing users can leave their comments. This service can effectively build customer trust; however, customers normally comment on the product in their native language using Roman script. When there are hundreds of comments, even native customers have difficulty making a buying decision. This research proposes a system that extracts comments posted in Roman Urdu, translates them, finds their polarity, and then produces a rating for the product. This rating will help both native and non-native customers make buying decisions efficiently from comments posted in Roman Urdu.
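The proposed flow can be sketched end to end with stubbed translation and polarity lookups; the word list, polarity scores, and 1-5 star mapping below are invented placeholders for real translation and sentiment models:

```python
# Sketch of the proposed flow: translate Roman Urdu comments, score polarity,
# and aggregate into a product rating. Lookups are toy stand-ins for models.
LEXICON = {"acha": "good", "bura": "bad", "zabardast": "excellent"}
POLARITY = {"good": 1, "excellent": 2, "bad": -1}

def comment_score(comment):
    translated = [LEXICON.get(w, w) for w in comment.lower().split()]
    return sum(POLARITY.get(w, 0) for w in translated)

def product_rating(comments):
    scores = [comment_score(c) for c in comments]
    positive = sum(s > 0 for s in scores)
    return round(1 + 4 * positive / len(scores), 1)  # share of positives -> 1-5 stars

print(product_rating(["ye phone acha hai", "battery bura hai", "camera zabardast"]))
```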
TOWARDS MAKING SENSE OF ONLINE REVIEWS BASED ON STATEMENT EXTRACTION (cscpconf)
Product reviews are a valuable resource for information seeking and decision making. Products such as smartphones are discussed in terms of their aspects, e.g. battery life, screen quality, etc. Knowing user statements about aspects is relevant, as it will guide other users in their buying process. In this paper, we automatically extract user statements about aspects for a given product. Our extraction method is based on dependency parse information from individual reviews. The parse information is used to learn patterns, which are then used to determine the user statements for a given aspect. Our results show that our methods are able to extract potentially useful statements for given aspects.
Text preprocessing and document classification play a vital role in web services discovery. Nearest centroid classifiers have mostly been employed in high-dimensional applications, including genomics. Feature selection is a major problem in all classifiers, and in this paper we propose an effective feature selection procedure followed by web services discovery through a centroid classifier algorithm. The task in this problem statement is to effectively assign a document to one or more classes. Despite being simple and robust, the centroid classifier is not effectively used for document classification due to its computational complexity and larger memory requirements. We address these problems through dimensionality reduction and effective feature set selection before training and testing the classifier. Our preliminary experimentation and results show that the proposed method outperforms other algorithms mentioned in the literature, including k-nearest neighbors, the naive Bayes classifier, and support vector machines.
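Using scikit-learn equivalents, the proposed combination might be sketched like this; the chi-squared selector, the k value, and the toy corpus are assumptions for illustration, not the paper's procedure:

```python
# Sketch: feature selection (dimensionality reduction) before a centroid
# classifier, applied to toy web service descriptions.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline

docs = ["weather forecast service", "stock quote service",
        "rain temperature api", "share price market api"]
labels = ["weather", "finance", "weather", "finance"]

model = make_pipeline(
    TfidfVectorizer(),
    SelectKBest(chi2, k=8),     # keep only the most discriminative features
    NearestCentroid(),          # classify by distance to class centroids
)
model.fit(docs, labels)
print(model.predict(["temperature and rain api"]))  # -> ['weather']
```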
This paper presents a natural language processing based automated system called DrawPlus for generating UML diagrams, user scenarios and test cases after analyzing a given business requirement specification written in natural language. DrawPlus analyzes the natural language text and extracts the relevant and required information from the business requirement specification provided by the user. The user writes the requirement specification in simple English, and the system analyzes it using core natural language processing techniques together with the authors' own well-defined algorithms. After compound analysis and extraction of the associated information, DrawPlus draws the use case diagram, user scenarios and system-level high-level test case descriptions. DrawPlus provides a more convenient and reliable way of generating use cases, user scenarios and test cases, reducing the time and cost of the software development process while accelerating about 70% of the work in the software design and testing phases. Janani Tharmaseelan, "Cohesive Software Design", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-3, April 2019, URL: https://www.ijtsrd.com/papers/ijtsrd22900.pdf
Paper URL: https://www.ijtsrd.com/computer-science/other/22900/cohesive-software-design/janani-tharmaseelan
Design, analysis and implementation of geolocation based emotion detection te... (eSAT Journals)
Abstract
It has been a topic of utmost importance to researchers that the emotions of the public have a direct impact on various social science problems, such as politics and online business. With emotion analysis, we can bring sensitivity to analytics and stay attuned to the feelings of customers during chat sessions, track social media reactions to press releases, or gauge the public outlook on financial news. To meet these needs, we create a system for analyzing the moods of tweets on any topic trending on twitter.com. We collected 1.3 × 10³ emotional tweets, which were then annotated for emotion and geographic location. A Bayes classifier was used for the analysis.
Keywords: Emotion Analysis, Twitter, Geographic Distribution
IRJET - Comparative Study of Classification Algorithms for Sentiment Analy... (IRJET Journal)
This document provides a comparative study of classification algorithms for sentiment analysis on Twitter data. It discusses Naive Bayes, Random Forest and Support Vector Machine (SVM) algorithms. For each algorithm, it describes the basic theory, common uses, and pros and cons. It also outlines the process used for sentiment analysis, including data collection from Twitter, preprocessing, feature extraction and classification. The goal is to evaluate which algorithm performs best for sentiment classification of tweets.
Advanced Question Paper Generator Implemented using Fuzzy Logic (IRJET Journal)
This document describes an advanced question paper generator system implemented using fuzzy logic. The system allows professors to generate question papers automatically by selecting the difficulty level and pattern. It uses fuzzy logic to determine the difficulty level of questions based on their analytical and descriptive quotients stored in the database. The system provides authentication and authorization for professors and admin. Professors can add, update and delete questions for the subjects allocated to them. The admin can manage user accounts and subject allocations. The generated question papers are in PDF format for ease of use and security.
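Since the paper's fuzzy sets are not given here, the sketch below uses invented triangular memberships over the two stored quotients to show the general mechanics of fuzzy difficulty grading:

```python
# Sketch of fuzzy difficulty grading from two stored quotients. Membership
# functions and weights are invented for illustration.
def tri(x, a, b, c):
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def difficulty(analytical, descriptive):
    """Both quotients in [0, 1]; returns fuzzy memberships and a crisp label."""
    x = 0.6 * analytical + 0.4 * descriptive   # weighted blend of the quotients
    grades = {
        "easy": tri(x, -0.01, 0.0, 0.45),
        "medium": tri(x, 0.25, 0.5, 0.75),
        "hard": tri(x, 0.55, 1.0, 1.01),
    }
    return grades, max(grades, key=grades.get)

print(difficulty(analytical=0.8, descriptive=0.6))  # -> mostly 'hard'
```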
The document discusses how the Internet allows computers to connect locally, regionally, nationally, and internationally. It explains that the World Wide Web, email, FTP, chat/instant messaging, listservs, and telnet are part of the Internet. The World Wide Web is accessed through browsers and allows users to view webpages from websites. Search engines like Google help users find information on the visible web, while directories and databases contain specialized content. Website addresses provide information on their domain and country of origin. The Internet is not supervised and search engines cannot find all of its content.
The world is witnessing an unprecedented information revolution and great speed in the growth of databases in all aspects. Databases interconnect through their content and schema but use different elements and structures to express the same concepts and relations, which may cause semantic and structural conflicts. This paper proposes a new technique, named XDEHD, for integrating heterogeneous eXtensible Markup Language (XML) schemas. The returned mediated schema contains all concepts and relations of the sources without duplication. The technique divides into three steps. First, it extracts all subschemas from the sources by decomposing the source schemas; each subschema contains three levels: ancestor, root, and leaf. Second, it matches and compares the subschemas and returns the related candidate subschemas; a semantic closeness function measures how similarly the concepts of the subschemas are modelled in the sources. Finally, it creates the mediated schema by integrating the candidate subschemas, obtaining a minimal and complete unified schema; an association strength function computes the closeness of each pair in the candidate subschemas across all data sources, and an elements repetition function calculates how many times each element is repeated between the candidate subschemas.
This document provides guidance on developing an effective search strategy to find information on a topic. It demonstrates identifying keywords and subject headings for the sample topic "What is the danger of pesticide residue on produce for the health of children?". Keywords, synonyms, and Boolean logic are used to construct search strings. Tips are given to refine searches, including starting broad and narrowing, considering word variations, and prioritizing essential keywords.
This document provides instructions for an HTML introduction assignment. It tells the student to open the INTRODUCTIONS forum and compose a message, and that HTML tags can be used directly in the discussion dialog box without the HTML editor. It then provides two examples of formatting text using HTML tags - using <h1> tags to define the largest header, and using <strong> tags to bold text.
This PowerPoint class overview outlines what students will learn over the semester to become expert PowerPoint users. Students will create presentations and infographics, learn how to share presentations and use PowerPoint beyond traditional presentations. They will also communicate with a graphic designer about layouts and walk away with a new skill that can be used to teach, present, and communicate information visually.
This document provides tips for using PowerPoint for the first time or with some experience by suggesting the reader try opening PowerPoint, adding new slides, changing the background and fonts, and playing with WordArt to have fun completing an assignment.
Quality Assurance. Quality Assurance Approach. White Box (Kimberly Jones)
The document discusses using the Unified Modeling Language (UML) to model database systems and computer applications. It describes how UML diagrams like use case diagrams, class diagrams, sequence diagrams, and deployment diagrams can be used at different stages of the software development process. The paper examines how these UML diagrams integrate with various programming methodologies and how they provide a standardized way to visually define and model the design and structure of software systems, including defining objects in an object-oriented design approach.
This document presents a framework for reusing existing software agents through ontological engineering. The framework includes components like a user interface agent, query processor, mapping agent, transfer agent, wrapper agent, and remote agents containing ontologies. The query processor reformulates the user's query, the mapping agent identifies relevant ontologies, and the transfer agent sends the query to remote agents. The remote agents provide ontologies as output, which are then integrated/merged and presented back to the user interface agent. The goal is to enable reuse of heterogeneous agents across different development environments through a standardized ontology representation.
ONTOLOGY VISUALIZATION PROTÉGÉ TOOLS – A REVIEW (ijait)
The document discusses ontology visualization tools in Protégé. It reviews four main visualization methods used in Protégé tools: indented list, node-link and tree, zoomable, and focus+context. It then examines specific Protégé tools that use each method, including their key features and limitations. The tools discussed are Protégé Class Browser (indented list), Protégé OntoViz and OntoSphere (node-link and tree), Jambalaya (zoomable), and Protégé TGVizTab (focus+context). The document aims to categorize the characteristics of existing Protégé visualization tools to assist in method selection and promote future research.
A SOFTWARE REQUIREMENT ENGINEERING TECHNIQUE USING OOADA-RE AND CSC FOR IOT B... (ijseajournal)
The Internet of Things (IoT) is one of the most trending technologies, with a wide range of applications. Here we focus on medical and healthcare applications of IoT. Such IoT applications are generally very complex, comprising many different modules, so a lot of care has to be taken during the requirement engineering of IoT applications. Requirement engineering is the process of structuring all the requirements of the users. It is the base phase of software development and greatly affects the remaining phases; if the effort falls short here, the quality of the end product suffers. In this study we present an approach to improve the requirements engineering phase of IoT application development by using the Object Oriented Analysis and Design Approach (OOADA) along with Constraints Story Card (CSC) templates.
IRJET - Deep Collaborative Filtering with Aspect Information (IRJET Journal)
This document discusses a proposed system for deep collaborative filtering with aspect information. The system aims to help web users efficiently locate relevant information on unfamiliar topics to increase their knowledge. It utilizes techniques like multi-keyword search, synonym matching, and ontology mapping to return relevant web links, images, and news articles to the user based on their search terms. The proposed system architecture includes an index structure to efficiently search and rank results based on similarity to the search query terms. The implementation and evaluation of the proposed system are also discussed.
This document is a project report submitted by D.Surya Teja to fulfill requirements for the CS 361 Mini Project Lab at Acharya Nagarjuna University. The report describes the development of a Placement Management System to manage student and company information for university career services. It identifies key actors like students, recruiters, and administrators. Several use cases are defined including registration, validation, and other interactions between actors and the system. The document also covers analysis diagrams, class diagrams, relationships between classes, and system deployment.
This document describes an algorithm visualizer application that was created to help students learn algorithms. The application visually demonstrates the steps and processes of various pathfinding and sorting algorithms. It uses interactive graphics and animations to illustrate how the algorithms work in an engaging way. The developers used React.js for the framework and JavaScript as the primary language. Research shows that visualization helps most students learn algorithms better than traditional teaching methods. The application is intended to make algorithm learning less burdensome and more enjoyable through an interactive visual approach.
Local Service Search Engine Management System LSSEMS (YogeshIJTSRD)
Local Services Search Engine Management System (LSSEMS) is a web-based application that helps users find servicemen in a local area, such as maids, tuition teachers, and plumbers, and it stores data on such servicemen. The main purpose of LSSEMS is to systematically record, store, and update serviceman records. Kaushik Mishra | Aditya Sharma | Mohak Gund, "Local Service Search Engine Management System (LSSEMS)", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Special Issue | International Conference on Advances in Engineering, Science and Technology - 2021, May 2021, URL: https://www.ijtsrd.com/papers/ijtsrd42462.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/42462/local-service-search-engine-management-system-lssems/kaushik-mishra
1) The document proposes an approach to automatically match web entities described using schema.org annotations to potential actions they could support.
2) It outlines difficulties using the new schema.org actions dimension due to its flexibility and need for understanding the actions hierarchy.
3) The approach aims to address these obstacles by suggesting potential actions for entities based on mappings between entity and action properties and ranges.
Image Based Tool for Level 1 and Level 2 Autistic People (IRJET Journal)
This document proposes an image-based assistive tool for people with level 1 and level 2 autism. It uses natural language processing to analyze input text and return relevant images from a database using cosine similarity. The tool has four main components: a graphical user interface, an NLP unit to analyze input and perform semantic processing, a query function to search the image database, and the image database itself. It is intended to help autistic individuals associate words and concepts by providing visual representations.
Multiagent-based methodologies have become an important subject of research in advanced software engineering. Several methodologies have been proposed, as a theoretical approach, to facilitate and support the development of complex distributed systems. An important question when facing the construction of agent applications is deciding which methodology to follow. Trying to answer this question, a framework with several criteria is applied in this paper for the comparative analysis of existing multiagent system methodologies. The results of the comparison of two of them conclude that those methodologies have not reached a sufficient maturity level to be used by the software industry. The framework has also proved its utility for the evaluation of any kind of multiagent-based software engineering methodology.
The document describes an ontology evolution process for classifying web services. It uses three techniques - TF/IDF, web context extraction, and free text descriptor verification - to analyze web service descriptions and automatically generate concepts and relationships for the ontology. TF/IDF and web context extraction are used to identify significant concepts from the descriptions. The free text descriptor is then used to validate these concepts and resolve any conflicts with the existing ontology. The combined approach aims to accurately define and evolve the ontology over time as new web services are added.
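The TF/IDF step can be illustrated with scikit-learn; the service descriptions and the top-3 cutoff below are invented for the example, and the web-context and free-text verification stages are not reproduced:

```python
# Sketch of the TF/IDF step: rank terms in each web service description
# against the rest of the corpus; top-scoring terms become concept candidates.
from sklearn.feature_extraction.text import TfidfVectorizer

descriptions = [
    "returns current weather conditions and forecasts for a city",
    "converts an amount between two currencies using live exchange rates",
    "geocodes a street address to latitude and longitude coordinates",
]

vec = TfidfVectorizer(stop_words="english")
X = vec.fit_transform(descriptions)
terms = vec.get_feature_names_out()

for i, desc in enumerate(descriptions):
    row = X[i].toarray().ravel()
    top = sorted(zip(row, terms), reverse=True)[:3]
    print([t for score, t in top if score > 0])  # concept candidates per service
```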
Text Summarization and Conversion of Speech to Text (IRJET Journal)
This document discusses text summarization and speech to text conversion using deep learning algorithms. It describes how recurrent neural networks can be used for text summarization by identifying key information and semantic meaning from text. Speech recognition uses similar deep learning methods to convert spoken audio to text. The document also provides an overview of the text summarization process, including segmentation, normalization, feature extraction, and modeling steps. It concludes that these models can generate summarized text from extensive documents and meetings.
IRJET - A Detailed Analysis on Windows Event Log Viewer for Faster Root Ca... (IRJET Journal)
This document summarizes research on analyzing Windows event logs to identify the root causes of defects in software. It discusses using machine learning algorithms and pattern recognition techniques on event log data to detect defect root causes. Specifically, it proposes developing an efficient algorithm based on pattern recognition to accurately detect defect root causes. The algorithm would analyze past event logs and defect resolution methods to improve prediction capability and accuracy over traditional approaches. It also reviews literature on using clustering, classification, and other machine learning methods on event logs to identify patterns and anomalies.
IRJET- Opinion Mining and Sentiment Analysis for Online ReviewIRJET Journal
This document summarizes a research paper that proposes a system for conducting sentiment analysis on online product reviews. The system uses a dual sentiment analysis approach that trains a classifier on both original reviews and sentiment-reversed reviews to address issues with polarity shifts. It generates random keys for users to access the review system and uses clustering algorithms to differentiate positive and negative words in reviews and provide an overall product rating. The goal is to help users make more informed purchasing decisions based on genuine reviews by preventing fake reviews from improperly influencing ratings.
Here are the key points about using content-based filtering techniques:
- Content-based filtering relies on analyzing the content or description of items to recommend items similar to what the user has liked in the past. It looks for patterns and regularities in item attributes/descriptions to distinguish highly rated items.
- The item content/descriptions are analyzed automatically by extracting information from sources like web pages, or entered manually from product databases.
- It focuses on objective attributes about items that can be extracted algorithmically, like text analysis of documents.
- However, personal preferences and what makes an item appealing are often subjective qualities not easily extracted algorithmically, like writing style or taste.
- So while content-based filtering can
IRJET- Plug-In based System for Data VisualizationIRJET Journal
This document describes a plug-in based system for data visualization. The system allows users to upload different file types like Excel, HTML, CSV and visualize the data through interactive visualizations. The system uses a plug-in architecture that allows new plug-ins to be added to support additional file formats. Each plug-in implements a reader interface to extract data from its file type and output it as JSON. The system then hosts the JSON and provides various visualization patterns for users to analyze and report on the data. The plug-in based design makes the system flexible and adaptable to future changes and additions of new plug-in types.
This document describes a name entity recognition (NER) system that classifies entities like locations, people, organizations, dates, and money amounts from free text. It discusses the main modules of the system - data selection, algorithm application, entity identification and classification, and result display. Visual Studio, SQL Server, and .NET are proposed as tools for the system's development, with .NET used for the front-end interface, SQL Server as the backend database, and Visual Studio for compiling programs. The system is aimed at helping users find useful information from texts.
WEB-BASED ONTOLOGY EDITOR ENHANCED BY PROPERTY VALUE EXTRACTION
International Journal of Web & Semantic Technology (IJWesT) Vol.4, No.3, July 2013
DOI : 10.5121/ijwest.2013.4301
WEB-BASED ONTOLOGY EDITOR ENHANCED BY PROPERTY VALUE EXTRACTION

Takahiro Kawamura, I Shin and Akihiko Ohsuga

Graduate School of Information Systems, University of Electro-Communications, 1-5-1 Chofugaoka, Chofu-shi, Tokyo 182-8585, Japan
ABSTRACT
Linked Open Data and its consuming services are increasing in the areas of electronic government, bio-science research and smart community projects. Therefore, the lightweight ontologies used in those areas are also becoming important. This paper proposes a web-based ontology editor, an ontology building service especially for lightweight ontologies. It is designed not only for ontology experts, but also for users and/or developers of consumer services, that is, non-experts in ontologies. Thus we focused on ease of use, no installation, and a cooperative work environment, and also provided sample applications to keep users motivated. Furthermore, the editor offers a function that extracts pairs of <property, value> belonging to a certain instance of a class. First, we introduce the design and the implementation of our ontology editor, and then present the extraction mechanism for <property, value> pairs and confirm its accuracy in experimental evaluations.
KEYWORDS
Ontology Editor, Property Value, Bootstrapping, Dependency Parsing
1. INTRODUCTION
Recently, Linked Open Data (LOD) and services using LOD have increased in the areas of electronic government, bio-science research and smart community projects. In those areas, most of the ontologies used in the LOD have simpler structures in comparison with traditional ontologies for manufacturing design, medical diagnosis, and so forth. Faltings [1] reported that the "is-a" relation makes up 80-90% of all the relations in such lightweight ontologies. However, ontology building tools such as Protégé [2] are mainly for ontology experts, and few of them focus on developers and users of consumer services. Therefore, we have developed a web-based ontology editor, an ontology building service aiming to offer an environment in which non-experts are able to build the ontologies necessary for their purposes. The target users of this editor are people who have little expertise about what an ontology is, its schema, organization, and technical terms, as well as people who have some knowledge about ontologies but no experience in building them. Our editor tries to solve the following three problems that the target users may encounter when using ontology tools. We held several interviews with non-ontology developers and programmers and summarized the problems as follows:
1. Necessary tool preparation and its usage are unclear because of many technical terms and functions.
2. It is hard to know what terms should be input as classes, instances, and properties, since the users have never thought about things according to the ontological method.
3. It is problematic to register a large number of instances and property values (this problem is shared by experts and non-experts).
To solve these problems, we took the following approaches:
A. Easy preparation for introduction. A web-based editor does not need the installation of any tools, and can be used from a web browser.
B. Improvement of usability. We limited the use of technical terms, and narrowed the functions down to browsing and editing of the lightweight ontology. Also, we tried to keep usability even in a browser application.
C. Support of term registration. It would be almost impossible for a single person to register every instance and property in a large domain. Therefore, our editor extracts candidates for the property values from the Web, and recommends them to the user.
D. Keeping of motivation. The editor also offers sample applications to show what services can be made by using ontologies. Moreover, it opens web service APIs to access the stored ontologies for use by external services.
E. Support of collaborative work. Multiple access to a single ontology is possible, thus an ontology can be built by a team.
The above three problems cannot be solved by any single approach, and have 1-to-n relations to the five approaches. Fig. 1 shows the relations between the problems and the approaches we have assumed. Furthermore, the effect of each approach on each problem cannot necessarily be verified quantitatively, because it depends largely on the user's feelings and sensitivity. Therefore, this paper first describes an overview of our service, and then focuses on the approach "C. Support of term registration" with details of the implementation and the evaluation. Although approach C corresponds to problems 2 and 3, we believe that the accuracy of the recommended terms can be regarded as a metric to measure the degree of user support for these problems.
[Figure 1 (diagram): the three problems (1. tool preparation and usage unclear; 2. hard to know what terms to input as classes, instances, and properties; 3. difficult to register a large amount of terms) mapped to the five approaches (A. easy preparation for introduction; B. improvement of usability / design of user interface; C. support of term registration; D. keeping of motivation; E. support of collaborative work).]
Figure 1. Problems and Approaches
The outline of this paper is as follows. We first describe the service overview with its interface and functionality in section 2. Then section 3 shows the <property, value> extraction mechanism and its evaluation. In section 4, we discuss the limitations and possible improvements of our proposed approaches. Finally, we show the related work in section 5, and conclude this paper in section 6.
2. DESIGN AND IMPLEMENTATION OF ONTOLOGY EDITOR
The service logically consists of a front-end (Flash website) and a back-end (web services) (Fig. 2). The front-end has an ontology editing function, the recommendation function, and two sample applications. The back-end provides the APIs to access the ontologies built by the service so that other systems can use them. We connected the front- and back-end by asynchronous Flash calls, and realized high responsiveness and operability as a browser application. This section describes the editing function, the sample applications and implementation issues.
[Figure 2 (diagram): the three-tier architecture. The Client Layer (front-end) is a web browser running Flex with an RPC component, the edit function, the recommendation function, and the search application. The Server Layer (back-end) is a server hosting the web service with an RPC component, an ontology processing engine based on Jena, and an XML engine. The Resource Layer holds the DB and XML files. The ontology developer works through the front-end; the service developer exchanges requests and responses with the web service.]
Figure 2. Service Architecture
2.1. Editing Function
The editing function is for the lightweight ontology, and has two roles: browsing the major components of the ontology such as the classes, instances and properties, and basic operations such as addition, editing, and deletion of them.
We provide three ways of browsing the ontology. The first one is a Graph view to intuitively figure out the ontology structure. Fig. 3 (top) shows how Flash and Spring Graph, an automatic graph layout library based on a spring model, visualize a hierarchical structure composed of the classes, subclasses and instances. It can adjust, for example, the distance between an instance and a class. In addition, if the user moves a class, the instances of the class automatically follow and are re-arranged at the pre-defined distance. Also, if the user double-clicks a class, its instances are hidden, and vice versa.
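The spring model behind such automatic layout can be sketched in a few lines: linked nodes are pulled toward a rest distance while all nodes repel each other, which is why instances settle at the pre-defined distance from their class when the class is moved. The following is a generic, self-contained illustration of the technique, not Spring Graph's actual code; all constants are invented for the example.

public class SpringLayoutSketch {
    // Toy force-directed ("spring model") layout for a tiny class/instance graph.
    public static void main(String[] args) {
        double[][] pos = {{0, 0}, {1, 0}, {0.5, 1}};   // class, subclass, instance
        int[][] edges = {{0, 1}, {1, 2}};              // hierarchy links
        double rest = 0.8, kSpring = 0.1, kRepel = 0.02;

        for (int iter = 0; iter < 200; iter++) {
            double[][] force = new double[pos.length][2];
            // Springs pull linked nodes toward the rest length.
            for (int[] e : edges) {
                double dx = pos[e[1]][0] - pos[e[0]][0], dy = pos[e[1]][1] - pos[e[0]][1];
                double d = Math.max(1e-9, Math.hypot(dx, dy));
                double f = kSpring * (d - rest);
                force[e[0]][0] += f * dx / d; force[e[0]][1] += f * dy / d;
                force[e[1]][0] -= f * dx / d; force[e[1]][1] -= f * dy / d;
            }
            // All node pairs repel, spreading the graph out.
            for (int i = 0; i < pos.length; i++)
                for (int j = i + 1; j < pos.length; j++) {
                    double dx = pos[j][0] - pos[i][0], dy = pos[j][1] - pos[i][1];
                    double d2 = Math.max(1e-9, dx * dx + dy * dy);
                    force[i][0] -= kRepel * dx / d2; force[i][1] -= kRepel * dy / d2;
                    force[j][0] += kRepel * dx / d2; force[j][1] += kRepel * dy / d2;
                }
            for (int i = 0; i < pos.length; i++) {
                pos[i][0] += force[i][0]; pos[i][1] += force[i][1];
            }
        }
        for (double[] p : pos) System.out.printf("(%.2f, %.2f)%n", p[0], p[1]);
    }
}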
[Figure 3 (screenshots): top, the Graph view with the Spring Graph pane and a Tree pane, tabs for Graph / XML / List / Search, and controls showing the selected node and the distance of nodes, with classes and instances as nodes; middle, the List view with the class list, property list, and instance list; bottom, the right-click popup menu on the Graph view for adding a class or an instance together with its property values.]
Figure 3. User Interface
On the other hand, the Graph view is unsuited to reading through a list of the instances and checking the details of their definitions. Therefore, we added a List view and an XML view. The List view (Fig. 3, middle) has three lists for the classes, properties, and instances. The XML view shows the OWL [3] data of the ontology as follows:
<rdf:RDF xmlns:mobilephone="http://www.myontology.co.jp/mobilephone/#"
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:owl="http://www.w3.org/2002/07/owl#"
  xmlns:daml="http://www.daml.org/2001/03/daml+oil#">
  <owl:Ontology>
    <rdfs:comment>MobilePhone OWL Ontology</rdfs:comment>
  </owl:Ontology>
  <owl:Class rdf:about="http://www.myontology.co.jp/mobilephone/#softbank">
    <rdfs:subClassOf>
      <owl:Class rdf:about="http://www.myontology.co.jp/mobilephone/#mobile_phone"/>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
Ontology editing starts with a popup menu that appears on a right click in the Graph view (Fig. 3, bottom). If the user right-clicks on a class or instance, a popup menu with "add", "edit", and "delete" will appear. Furthermore, the ontology is stored in the DB in the back-end, and can be exported as OWL files or accessed by other services.
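Because the exported ontology is standard OWL, any RDF toolkit can consume it. Below is a minimal sketch that loads such a file and lists the declared classes, using the Apache Jena API (the same framework as the server layer, here with current package names); the file name mobilephone.owl is a hypothetical example, and this is our illustration rather than the editor's actual code.

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.ResIterator;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.OWL;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class OntologyReader {
    public static void main(String[] args) {
        // Load the OWL file exported by the editor (hypothetical file name).
        Model model = ModelFactory.createDefaultModel();
        model.read("mobilephone.owl");

        // List every declared class and, if present, its superclass.
        ResIterator classes = model.listSubjectsWithProperty(RDF.type, OWL.Class);
        while (classes.hasNext()) {
            Resource cls = classes.next();
            System.out.println("Class: " + cls.getURI());
            if (cls.hasProperty(RDFS.subClassOf)) {
                System.out.println("  subClassOf: "
                        + cls.getProperty(RDFS.subClassOf).getResource().getURI());
            }
        }
    }
}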
2.2. Sample Applications
The service also provides sample applications, one of which is a product search application. In this application, the user can find products that have specified properties. We prepared product ontologies for three domains in advance: mobile phones, digital cameras, and media players such as iPod. If the user searches for products with a common property like "SD card", the products with that property will be found across the three domains. It is also possible to search with multiple properties, such as hours of continuous operation and TV function (1seg); in that case the ontologies of the mobile phones and the media players will be searched.
Moreover, a blog search application shows blogs related to the product found by the above application, by using Google blog search, since the latest word-of-mouth information about the product would be useful for the user.
2.3. Implementation
Our editor is a three-tier system (Fig. 2), where the client layer provides a flexible user interface using Adobe Flex [4]. The server layer has an ontology processing engine built with Jena [5], and the resource layer stores the ontology data in an XML format in MySQL. Asynchronous calls between the client layer and the server layer are realized by Flex components such as Flex RemoteObject and Flex Data Services. When the user operates the views in Flash, the client layer sends a request to the server layer, and the Flex component calls the corresponding APIs. The result is returned to the client layer in an XML format for the Graph or the List view, and the XML engine then transforms it for display in the Flash view.
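Since the back-end also exposes its APIs as web services returning XML, an external service can fetch an ontology over plain HTTP. A minimal sketch using Java's standard HttpClient follows; the endpoint URL is a hypothetical example, as the paper does not specify the actual API paths.

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class OntologyApiClient {
    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint; the actual API paths are not given in the paper.
        String endpoint = "http://example.org/ontology-editor/api/ontology?name=mobilephone";

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint))
                .header("Accept", "application/xml")
                .GET()
                .build();

        // The back-end returns the requested ontology (or a view of it) as XML.
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}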
3. PROPERTY VALUE EXTRACTION
3.1. Extraction Mechanism from the Web
In the practical work of instance addition, it would be difficult to register every detail of each instance without any help. In particular, although the instance names can be collected from a list on any summary site, and the necessary property names can be defined by the users based on their service requirements, filling in the values of all the properties for each instance needs the help of the service. Therefore, we developed a function to extract the values of each property of the instances from the Web, whereupon the user finally selects and adds some of them as new correct <property, value> pairs. This function involves a bootstrapping method [6] and dependency parsing based on [7].
The process of the extraction is as follows (Fig. 4). First of all, we make a keyword list that includes an instance name and a logical disjunction of property names, then search on Google and receive the result list, which includes more than 100 web pages. Next, we collect the pages, excluding PDF files, and also check the Google PageRank of each page.
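For illustration, building the seed query for one instance might look as follows; this is a sketch, and the instance, property names, and exact query syntax are invented examples rather than values from the paper.

import java.util.List;

public class SeedQueryBuilder {
    // Build a query of the form: instance (prop1 OR prop2 OR ...)
    static String buildQuery(String instance, List<String> propertyNames) {
        String disjunction = String.join(" OR ", propertyNames);
        return instance + " (" + disjunction + ")";
    }

    public static void main(String[] args) {
        // Example seeds for a garden-plant instance (illustrative values).
        String query = buildQuery("begonia", List.of("color", "light", "water", "price"));
        System.out.println(query);  // begonia (color OR light OR water OR price)
    }
}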
As the bootstrapping method, we first extract specific patterns of the DOM tree from a web page based on keys that are the property names (and their synonyms), and then apply those patterns to other web pages to extract the values of the other properties. This method is mainly used for the extraction of <property, value> pairs from the structured parts of a page, such as tables and lists (Fig. 4, left).
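As a rough sketch of this structured-part extraction, the following uses the jsoup HTML parser to find a table row keyed by a known property name and then harvests the sibling rows that share the same DOM pattern. This is our illustration of the idea, not the authors' implementation; the HTML fragment echoes the begonia example of Fig. 5 and the seed property set is an assumption.

import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class TableBootstrapper {
    public static void main(String[] args) {
        // Example page fragment (cf. the begonia example in Fig. 5).
        String html = "<h1>begonia</h1><table>"
                + "<tr><td>Color</td><td>red</td></tr>"
                + "<tr><td>Light</td><td>Part Shade</td></tr>"
                + "<tr><td>Water</td><td>Normal</td></tr></table>";
        Set<String> seedProperties = Set.of("color");  // known property names (seeds)

        Document doc = Jsoup.parse(html);
        Map<String, String> pairs = new LinkedHashMap<>();
        for (Element table : doc.select("table")) {
            // Does any row of this table match a seed property name?
            boolean matched = table.select("tr").stream()
                    .anyMatch(tr -> tr.select("td").size() == 2
                            && seedProperties.contains(tr.select("td").get(0).text().toLowerCase()));
            if (!matched) continue;
            // The DOM pattern (two-cell rows) matched: extract the other pairs too.
            for (Element tr : table.select("tr")) {
                if (tr.select("td").size() == 2) {
                    pairs.put(tr.select("td").get(0).text(), tr.select("td").get(1).text());
                }
            }
        }
        System.out.println(pairs);  // {Color=red, Light=Part Shade, Water=Normal}
    }
}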
However, we found that there are many unstructured sites explaining the content only in plain text. Therefore, we developed an extraction method using dependency parsing, since a triple <instance, property, value> corresponds to <subject, verb, object> in some cases. It first follows the modification relations in a sentence from a seed term, which is an instance name or a property name (or one of their synonyms), and then extracts the triple, or a triple like <-, property, value> in the case of a sentence with no subject (Fig. 4, right).
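The walk over modification relations can be sketched as follows. To keep the example self-contained, we hand-write a toy dependency graph for the sentence from Fig. 5 instead of calling a real parser; a production system would obtain these edges from a dependency parser, and the relation labels here are our own simplification.

import java.util.List;
import java.util.Map;

public class DependencyWalk {
    // A dependency edge: the dependent word modifies the head word with some relation.
    record Edge(String head, String dependent, String relation) {}

    public static void main(String[] args) {
        // Toy parse of: "Begonias will bloom in the spring, the light should be partial."
        List<Edge> edges = List.of(
                new Edge("bloom", "Begonias", "subject"),
                new Edge("bloom", "spring", "object"),
                new Edge("partial", "light", "subject"));
        Map<String, String> seeds = Map.of("Begonias", "instance", "light", "property");

        // Follow the edges starting from each seed term and emit a triple.
        for (Edge e : edges) {
            String role = seeds.get(e.dependent());
            if (role == null) continue;
            if (role.equals("instance")) {
                // <instance, verb (property), object (value)>
                String value = edges.stream()
                        .filter(o -> o.head().equals(e.head()) && o.relation().equals("object"))
                        .map(Edge::dependent).findFirst().orElse("-");
                System.out.println("<" + e.dependent() + ", " + e.head() + ", " + value + ">");
            } else {
                // Property seed in a sentence with no subject: <-, property, value>
                System.out.println("<-, " + e.dependent() + ", " + e.head() + ">");
            }
        }
        // Prints: <Begonias, bloom, spring> and <-, light, partial>
    }
}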
We then combine all the property values obtained by the above methods, and select the values that match co-occurrence strings for the corresponding property names. A set of co-occurrence strings is prepared in advance; for example, the property "price" must obviously co-occur with a string such as JPY or USD. Then, we form clusters of equivalent property values for each property name based on the LCS (Longest Common Substring). Furthermore, to correct errors, which include not only extraction errors but also wrong information in the source pages, we sum up the Google PageRank of the source pages for each cluster to determine the best possible property value and the second best. Finally, the user chooses the correct value from the two proposed candidates.
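The filtering, clustering and ranking steps might look like the sketch below. The LCS routine is the standard dynamic program; the co-occurrence strings, the similarity threshold, the candidate values, and the way a PageRank is attached to each value are all illustrative assumptions, not the authors' code.

import java.util.ArrayList;
import java.util.List;

public class ValueClustering {
    // A candidate value together with the PageRank of its source page.
    record Candidate(String value, double pageRank) {}

    // Length of the Longest Common Substring (classic DP).
    static int lcs(String a, String b) {
        int[][] dp = new int[a.length() + 1][b.length() + 1];
        int best = 0;
        for (int i = 1; i <= a.length(); i++)
            for (int j = 1; j <= b.length(); j++)
                if (a.charAt(i - 1) == b.charAt(j - 1))
                    best = Math.max(best, dp[i][j] = dp[i - 1][j - 1] + 1);
        return best;
    }

    public static void main(String[] args) {
        // Candidates for the property "price" (illustrative values and ranks).
        List<Candidate> raw = List.of(
                new Candidate("1,980 JPY", 4.0), new Candidate("1980 JPY", 3.0),
                new Candidate("partial shade", 5.0), new Candidate("2,480 JPY", 2.0));

        // 1. Co-occurrence filter: "price" values must contain a currency string.
        List<Candidate> filtered = raw.stream()
                .filter(c -> c.value().contains("JPY") || c.value().contains("USD")).toList();

        // 2. Greedy LCS clustering: join a cluster whose representative shares a
        //    long-enough common substring (threshold 7 is an assumption).
        List<List<Candidate>> clusters = new ArrayList<>();
        for (Candidate c : filtered) {
            List<Candidate> home = clusters.stream()
                    .filter(cl -> lcs(cl.get(0).value(), c.value()) >= 7)
                    .findFirst().orElse(null);
            if (home == null) clusters.add(home = new ArrayList<>());
            home.add(c);
        }

        // 3. Rank clusters by summed PageRank; the top two give the 1- and 2-best.
        clusters.sort((x, y) -> Double.compare(
                y.stream().mapToDouble(Candidate::pageRank).sum(),
                x.stream().mapToDouble(Candidate::pageRank).sum()));
        clusters.forEach(cl -> System.out.println(cl.get(0).value()
                + "  (score " + cl.stream().mapToDouble(Candidate::pageRank).sum() + ")"));
    }
}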
[Figure 4 (flowchart): the extraction pipeline.
1. Input seeds: "instance name (property_name1 OR property_name2 OR ...)".
2. Get the list of web pages (about 100) from the search engine (Google).
3. Get the Google PageRank of each page.
4. Crawl the content of all pages in text format.
5. Extract (property, value) pairs by bootstrapping from the structured parts of a page (tables and lists), and <instance, property, value> or <"", property, value> triples by dependency parsing from the unstructured parts (plain text); the figure also labels the navigation, header and footer areas of a page.
6. Filter the values with the co-occurrence patterns of the property.
7. Make clusters of values per property, and select a typical value for each cluster.
8. Order the clusters by the sum of PageRanks, and fix the 1- and 2-best for each property.
9. Output CSV and RDF files.]
Figure 4. Process of LOD content generation
[Figure 5 (examples): left, bootstrapping on a structured page; right, dependency parsing on plain text.]

Bootstrapping example. Source HTML:

<body>
<h1>begonia</h1>
<table>
<caption>Characteristics</caption>
<tr><td>Color</td><td>red</td></tr>
<tr><td>Light</td><td>Part Shade</td></tr>
<tr><td>Water</td><td>Normal</td></tr>
</table>
</body>

1. A key (property name) matches the original structure.
2. The DOM pattern is extracted from the matched part.
3. The other (property, value) pairs are extracted with the pattern: (Light, Part Shade), (Water, Normal).

Dependency parsing example. Source sentence: "Begonias will bloom in the spring, the light should be partial."

1. A seed matches the original sentence.
2. The sentence is dependency-parsed.
3. Triples <plant name, property, value> are extracted: <Begonia, bloom, spring>, <-, Light, partial>.

Figure 5. Examples of bootstrapping and dependency parsing
3.2. Evaluation of Property Value Extraction
[Table 2. Extraction accuracy: the table itself is not reproduced here; its main figures are quoted in the text below.]
We applied this <property, value> extraction mechanism to collect the values of 13 properties for 90 products (garden plants). The result shown in Table 2 includes the average precision and recall of the best possible value (1-best) obtained through the whole process, through the bootstrapping method only, and through the dependency parsing only, and then those of the second possible value (2-best). It should be noted that we collected 100 web pages for each product, but reasons such as DOM parse errors and differences of file types reduced the amount to about 60%. Properties like the product description, for which it is not clear whether a value is true or not, were out of the scope of this evaluation. If there are two or more clusters whose sums of PageRank are the same, we regarded them all as being in the first position. The accuracy is calculated in units of clusters instead of each extracted value. That is, in the case of 1-best, the cluster with the biggest PageRank sum is taken as the answer for the property. In the case of 2-best, two clusters are compared with the correct value, and if either one of the two answers is correct, then it is regarded as correct (thus, it is slightly different from the standard average precision):
AP(q) = (1/|Dq|) Σ_k ( P@k × r(k) )

where |Dq| is the number of correct answers for question q, P@k is the precision over the top k items, and r(k) is an indicator function equaling 1 if the item at rank k is correct, and zero otherwise. "The bootstrapping method only" and "the dependency parsing only" mean forming the clusters out of the values extracted only by the bootstrapping and the dependency parsing, respectively. A cluster consists of the extracted values for a property which seem identical according to the LCS, but the number of values in a cluster may vary from more than 10 down to 1. Finally, if there were multiple correct values for a property, we selected the most dominant one.
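As a small worked example of this metric, the sketch below computes AP(q) for one ranked candidate list; the relevance flags are invented for illustration.

public class AveragePrecision {
    // AP(q) = (1/|Dq|) * sum over ranks k of P@k * r(k).
    static double averagePrecision(boolean[] correctAtRank, int totalCorrect) {
        double sum = 0;
        int correctSoFar = 0;
        for (int k = 1; k <= correctAtRank.length; k++) {
            if (correctAtRank[k - 1]) {
                correctSoFar++;
                sum += (double) correctSoFar / k;  // P@k counted only where r(k) = 1
            }
        }
        return sum / totalCorrect;
    }

    public static void main(String[] args) {
        // Invented example: the 1st and 3rd ranked clusters are correct, |Dq| = 2.
        boolean[] flags = {true, false, true, false};
        System.out.println(averagePrecision(flags, 2));  // (1/2) * (1/1 + 2/3) ≈ 0.833
    }
}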
The best possible values (1-best) achieved an average precision of 85% and an average recall of 77%, while the 2-best achieved an average precision of 97% and an average recall of 87%. Therefore, if we are permitted to show a binary choice to the user, it is possible to present the correct answer among the two in many cases. The accuracy of fully automatic generation will not reach 100% overall, so a human check is necessary at the final step; the binary choice is thus a realistic option.
In detail, the bootstrapping collects a smaller number of values (11%), so its recall is substantially lower (46%) than that of the dependency parsing, although its precision is higher (89%). This is because data written in tables can be extracted correctly, but lacks diversity of properties. Semantic drift of the values caused by overly generic patterns, a well-known problem of the bootstrapping method, did not happen in this case, because the target sources are at most the top 100 pages of the Google result, and the values are sorted by PageRank at the end.
On the other hand, the dependency parsing collects a large number of values (89%), although it is a mixture of correct data and noise. The total accuracy is therefore dominated by the dependency parsing, because the cluster with the biggest PageRank sum is composed mainly of values extracted by it. Hence, we need to consider putting more weight on the values extracted by the bootstrapping.
4. DISCUSSION
In terms of "C. Support of term registration" in section 1, we presented the <property, value> recommendation mechanism to reduce the user's burden. As a result, it achieved high precision, although the recall is relatively low. Extracting only the correct terms from a large set of terms would be a difficult task for the target users of this editor, who are non-experts in ontologies (although, of course, it depends on their domain knowledge). In the case of low recall, on the other hand, the only thing the user needs to do is repeat the extraction process with different seeds, which is obviously a simpler task.
In terms of the other approaches mentioned in section 1, for "A. Easy preparation for introduction" and "B. Improvement of usability" we eliminated tool installation by making the editor a browser application, and realized operability like a desktop application by using Adobe Flex. Also, the Spring Graph provides visibility of the whole ontology. In terms of "D. Keeping of motivation", we tried to motivate the user by presenting the sample applications and opening web APIs, and for "E. Support of collaborative work", exclusive access control of the ontology DB enables multiple developers to build an ontology at the same time. For these approaches, however, there are other possible methods, such as improved documentation, more innovative user interfaces, a point system for contributions, and so forth. In the future, we will compare several methods and discuss their effects.
5. RELATED WORK
We compare our editor with well-known ontology building tools: Protégé, Hozo, and KiWi. Protégé [2] is the most famous ontology editor, mainly used by ontology researchers for more than ten years. It has several advanced features such as ontology inference. Also, it publishes its plug-in specification, and now has more than 70 plug-ins on its official site, for data import from RDBs, term extraction by text mining, etc. Hozo [8] is a tool with a unique feature for handling the Role concept. It also has a distributed development environment that keeps data consistent by checking the differences between ontologies edited by multiple users.
However, those are heavyweight ontology tools mainly for ontology experts. In the future, we will refer to them for an ontology visualization plug-in, a mechanism to keep data consistent, and so forth. KiWi (Knowledge in A Wiki) [9], in contrast, focuses on the lightweight ontology and extends a Wiki with semantic web technology. It enables the user to edit content through the same interface as the Wiki, so the user can register instances and properties without any special knowledge of ontologies or the tool. It takes a different approach from our editor, while sharing the same goals of easy introduction for non-experts, improved usability, and collaborative work. In the future, we would also like to incorporate a Wiki-based change history and difference management mechanism.
Moreover, there have been several research efforts to extract information from textual documents on the Web by combining Natural Language Processing (NLP) mechanisms and semantic resources such as ontologies. Our extraction mechanism is similar to NELL [10], sharing its "ontology-driven" and "macro-reading" features: the input is a large text collection and the desired output is a large collection of facts, obtained using machine learning methods. However, instead of NELL's Coupled Pattern Learner, we used morphological analysis and dependency parsing. Moreover, the clustering of values using the LCS and PageRank is our own method. A further key strategic difference is the target domain of the extraction. NELL targets the whole world, so the granularity of its properties is coarse and their number is limited. For example, "agricultural product growing in state or province" is a comparatively fine-grained property in NELL, yet only 10 instances have it, and properties account for only 5% of the total extracted terms. By restricting the domain of interest, in contrast, our mechanism can construct the set of co-occurrence strings for the predefined property names. This simple heuristic effectively selects candidates for the property values, which raises the accuracy of the extraction while keeping the variety of the properties.
Furthermore, the NERD (Named Entity Recognition and Disambiguation) framework [11] has proposed an RDF/OWL-based NLP Interchange Format (NIF) and an API to unify various tools for qualitative comparison. LODifier [12] has recently proposed translating textual information in its entirety into structured RDF in open-domain scenarios with no predefined schema. In the future, we would like to use these techniques for further improvement.
6. CONCLUSIONS
This paper presented the development of a web-based ontology editor for non-experts. First, we raised three problems with such editors and attempted to solve them with five approaches. In particular, we described the mechanism of <property, value> extraction from the Web that reduces the workload of term registration, and confirmed the feasibility of showing a binary choice through the evaluation. We will continue to offer the ontology tool to developers and users of consumer services, incorporating a diverse range of their opinions.
REFERENCES
[1] B. Faltings, V. Schickel-Zuber, 2007. OSS: A semantic similarity function based on hierarchical ontologies. Proceedings of the 20th Int. Joint Conf. on Artificial Intelligence (IJCAI 2007), pp. 551-556.
[2] The Protégé Ontology Editor and Knowledge Acquisition System, <http://protege.stanford.edu/>.
[3] OWL, Web Ontology Language, <http://www.w3.org/TR/owl-features/>.
[4] Adobe Flex, <http://www.adobe.com/products/flex/>.
[5] Jena - A Semantic Web Framework for Java, <http://jena.sourceforge.net/>.
[6] S. Brin: "Extracting patterns and relations from the world wide web", Proc. of WebDB Workshop at 6th International Conference on Extended Database Technology, 1998.
[7] T. Kawamura, S. Nagano, M. Inaba, Y. Mizoguchi: Mobile Service for Reputation Extraction from Weblogs - Public Experiment and Evaluation, Proc. of the Twenty-Second Conference on Artificial Intelligence (AAAI), 2007.
[8] Hozo, <http://www.hozo.jp/>.
[9] KiWi - Knowledge In A Wiki, <http://www.kiwi-project.eu/>.
[10] A. Carlson, J. Betteridge, B. Kisiel, B. Settles, E.R. Hruschka Jr. and T.M. Mitchell: Toward an Architecture for Never-Ending Language Learning, Proc. of Conference on Artificial Intelligence (AAAI), 2010.
[11] G. Rizzo, R. Troncy, S. Hellmann, M. Bruemmer: NERD meets NIF: Lifting NLP Extraction Results to the Linked Data Cloud, Proc. of 5th Workshop on Linked Data on the Web (LDOW), 2012.
[12] I. Augenstein, S. Pado, and S. Rudolph: LODifier: Generating Linked Data from Unstructured Text, Proc. of 9th Extended Semantic Web Conference (ESWC), 2012.
Authors
Takahiro Kawamura is a Senior Research Scientist at the Corporate Research and Development Center, Toshiba Corp., and also an Associate Professor at the Graduate School of Information Systems, the University of Electro-Communications, Japan.
I Shin received an M.S. degree at the Graduate School of Information Systems, the University of Electro-Communications, Japan, and joined NTT DATA Corporation in 2010.
Akihiko Ohsuga is a Professor at the Graduate School of Information Systems, the University of Electro-Communications. He is currently the Chair of the IEEE Computer Society Japan Chapter.