Active Learning in Collaborative Filtering Recommender Systems: A Survey (University of Bergen)
In collaborative filtering recommender systems, users' preferences are expressed as ratings for items, and each additional rating extends the knowledge of the system and affects the system's recommendation accuracy. In general, the more ratings are elicited from the users, the more effective the recommendations are. However, the usefulness of each rating may vary significantly, i.e., different ratings may bring a different amount and type of information about the user's tastes. Hence, specific techniques, defined as "active learning strategies", can be used to selectively choose the items to be presented to the user for rating. In fact, an active learning strategy identifies and adopts criteria for obtaining data that better reflects users' preferences and enables the system to generate better recommendations.
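As a toy illustration of one such elicitation criterion (the scoring rule and all data here are invented for illustration, not taken from the survey), a system might ask users to rate items that are both frequently rated and divisive, since those ratings tend to be the most informative about taste:

```python
import statistics

def elicitation_scores(ratings_by_item):
    """Score each item for rating elicitation: items that are frequently
    rated AND divisive (high rating variance) tend to be informative."""
    scores = {}
    for item, ratings in ratings_by_item.items():
        if len(ratings) < 2:
            continue  # variance is undefined for a single rating
        scores[item] = len(ratings) * statistics.pvariance(ratings)
    return scores

# Hypothetical rating data on a 1-5 scale.
ratings = {
    "A": [5, 5, 5, 5],   # popular but uncontroversial -> score 0
    "B": [1, 5, 2, 5],   # popular and divisive -> high score
    "C": [3],            # too few ratings to judge -> skipped
}
scores = elicitation_scores(ratings)
best = max(scores, key=scores.get)  # item to present for rating first
```

Real strategies surveyed in the paper are more sophisticated (entropy-based, model-change-based, and so on), but they share this shape: rank unrated items by an informativeness score and ask about the top ones.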
Novel Algorithms for Ranking and Suggesting True Popular Items (IJMER)
International Journal of Modern Engineering Research (IJMER) is a peer-reviewed online journal. It serves as an international archival forum of scholarly research related to engineering and science education.
International Journal of Modern Engineering Research (IJMER) covers all the fields of engineering and science: Electrical Engineering, Mechanical Engineering, Civil Engineering, Chemical Engineering, Computer Engineering, Agricultural Engineering, Aerospace Engineering, Thermodynamics, Structural Engineering, Control Engineering, Robotics, Mechatronics, Fluid Mechanics, Nanotechnology, Simulators, Web-based Learning, Remote Laboratories, Engineering Design Methods, Education Research, Students' Satisfaction and Motivation, Global Projects, and Assessment, among many others.
Recommender systems have grown into a critical research subject since the first paper on collaborative filtering appeared in the 1990s. Although academic research on recommender systems has expanded extensively over the last ten years, the literature still lacks a comprehensive review and classification of that work. For this reason, we reviewed articles on recommender systems and classified them with respect to sentiment analysis. The articles are categorized into three recommender-system techniques: collaborative filtering (CF), content-based, and context-based. We identified the research papers related to sentiment-analysis-based recommender systems and, to classify the work done in this field, present the different approaches in tables. Our study provides statistics on trends in recommender-systems research, and gives practitioners and researchers insight into, and future directions for, recommender systems that use sentiment analysis. We hope this paper provides anyone interested in recommender-systems research with useful insight for the future.
A Hybrid Approach for Personalized Recommender System Using Weighted TFIDF on... (Editor IJCATR)
Recommender systems are gaining great popularity with the emergence of e-commerce and social media on the internet. These recommender systems enable users to access products or services that they would otherwise not be aware of, given the wealth of information on the internet. Two traditional methods used to develop recommender systems are content-based and collaborative filtering. While both methods have their strengths, they also have weaknesses, such as sparsity and the new-item and new-user problems, which lead to poor recommendation quality. Some of these weaknesses can be overcome by combining two or more methods to form a hybrid recommender system. This paper deals with issues related to the design and evaluation of a personalized hybrid recommender system that combines content-based and collaborative filtering methods to improve the precision of recommendation. Experiments on the MovieLens dataset show that the personalized hybrid recommender system outperforms the two traditional methods implemented separately.
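The simplest way to combine the two methods is a weighted blend of the content-based and collaborative scores. A minimal sketch of that idea follows; the item names, scores, and the choice of a linear blend with weight `alpha` are illustrative assumptions, not the paper's actual weighted-TFIDF formulation:

```python
def hybrid_score(content_score, cf_score, alpha=0.5):
    """Weighted linear blend of a content-based score and a collaborative
    score; alpha controls how much the content side contributes."""
    return alpha * content_score + (1 - alpha) * cf_score

# Hypothetical candidates: item -> (content score, collaborative score).
candidates = {"m1": (0.9, 0.2), "m2": (0.4, 0.8), "m3": (0.6, 0.4)}
ranked = sorted(candidates,
                key=lambda m: hybrid_score(*candidates[m], alpha=0.5),
                reverse=True)  # best blended candidate first
```

A blend like this lets the content side compensate when collaborative data is sparse (the new-item case) and vice versa, which is the motivation the abstract gives for hybridization.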
Recommender systems give suggestions according to user preferences. The number of items and books in a university-sized library is enormous and growing faster than ever, and readers find it extremely difficult to locate their favorite books. Even when a reader manages to find a preferred book, finding another book similar to that first preferred book can feel like finding a needle in the ocean, because the second preferred book might sit at the very end of the long tail. A recommender system is therefore often a requirement in a library and should be considered in order to make such similarity-based discovery possible. Recommender systems have become fundamental applications in electronic commerce and information retrieval, providing suggestions that effectively prune large information spaces so that users are directed toward those items that best meet their needs and preferences. A variety of techniques have been suggested for performing recommendation, including the collaborative technique and its three methods: Slope One, used for rating prediction; Pearson's correlation, used for finding the similarity between users; and item-to-item similarity. To improve performance, these methods have sometimes been combined in a hybrid recommendation technique.
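Of the three methods listed above, Slope One is the easiest to sketch. A minimal weighted Slope One predictor looks like the following; the user names and ratings are invented toy data, not from the paper:

```python
from collections import defaultdict

def slope_one_predict(ratings, user, target):
    """Weighted Slope One: predict a user's rating for `target` from the
    average rating deviations between `target` and each item the user rated."""
    # diffs[i] / counts[i] = average of (r_u,target - r_u,i) over users
    # who rated both the target and item i.
    diffs, counts = defaultdict(float), defaultdict(int)
    for r in ratings.values():
        if target in r:
            for i, v in r.items():
                if i != target:
                    diffs[i] += r[target] - v
                    counts[i] += 1
    num = den = 0.0
    for i, v in ratings[user].items():
        if i != target and counts[i]:
            num += (v + diffs[i] / counts[i]) * counts[i]  # weight by support
            den += counts[i]
    return num / den if den else None

# Toy rating data: user -> {book: rating}.
ratings = {
    "alice": {"A": 5, "B": 3, "C": 2},
    "bob":   {"A": 3, "B": 4},
    "carol": {"B": 2, "C": 5},
}
pred = slope_one_predict(ratings, "bob", "C")
```

The prediction for bob on book C combines each co-rated item's average deviation, weighted by how many users support it, which is what makes Slope One cheap yet reasonably robust on sparse library data.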
A Study of Neural Network Learning-Based Recommender System (theijes)
A recommender system sorts and recommends the information that meets personal preferences from among the huge amount of data provided by e-commerce. In particular, collaborative filtering (CF) is the most widely used technique in these recommendation systems. This method finds neighboring users who have preferences similar to a given user's and recommends the items preferred by those neighbors. This study proposes a neural network learning model as a new technique for finding neighboring users within the collaborative filtering method. The neural network learning model mitigates the sparseness problem during the analysis of users related to the target user. The proposed method was tested on the MovieLens data sets, and the results showed that precision improved by 6.7%.
COMPARISON OF COLLABORATIVE FILTERING ALGORITHMS WITH VARIOUS SIMILARITY MEAS... (IJCSEA Journal)
Collaborative filtering is generally used as a recommender system. There is enormous growth in the amount of data on the web, and these recommender systems help users select the products on the web that are most suitable for them. Collaborative filtering systems collect users' previous information about items such as movies, music, ideas, and so on. For recommending the best item there are many algorithms based on different approaches; the best known are user-based and item-based algorithms, and experiments show that item-based algorithms give better results than user-based algorithms. The aim of this paper is to compare user-based and item-based collaborative filtering algorithms, with many different similarity indexes, in terms of their accuracy and performance. We provide an approach to determine the best algorithm, i.e., the one that gives the most accurate recommendations according to statistical accuracy metrics. The user-based and item-based algorithms are compared on a movie-recommendation data set.
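To make the comparison concrete, here is a minimal user-based predictor with cosine similarity, one of the similarity indexes such papers typically evaluate (the ratings below are invented toy data, and this is only the user-based half of the comparison):

```python
import math

def cosine(a, b):
    """Cosine similarity between two sparse rating dicts; the dot product
    runs over co-rated items, the norms over each user's full profile."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[k] * b[k] for k in common)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb)

def user_based_predict(ratings, user, item):
    """Predict a rating as the similarity-weighted average of the ratings
    given to `item` by the other users."""
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        s = cosine(ratings[user], r)
        num += s * r[item]
        den += abs(s)
    return num / den if den else None

ratings = {
    "u1": {"A": 4, "B": 2},
    "u2": {"A": 5, "B": 1, "C": 4},
    "u3": {"A": 1, "B": 5, "C": 2},
}
pred = user_based_predict(ratings, "u1", "C")
```

The item-based variant is symmetric: it computes similarities between item columns instead of user rows, then averages the target user's own ratings of similar items, which is why it tends to be more stable when there are far more users than items.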
Political prediction analysis using text mining and deep learning (Vishwambhar Deshpande)
We have proposed a system to determine current sentiment on Twitter, using the open-access Twitter API, from opinions in different content structures such as the latest news, audits, articles, and social media posts, together with a deep learning method that studies historic data to predict future results. We utilized Naive Bayes and dictionary-based algorithms to predict the sentiment of live Twitter data.
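A dictionary-based sentiment algorithm, in its simplest form, sums per-word scores from a lexicon and labels the text by the sign of the total. The tiny lexicon and example tweet below are invented for illustration; real systems use large curated lexicons and handle negation:

```python
# Toy lexicon: word -> sentiment weight (hypothetical values).
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}

def dictionary_sentiment(text):
    """Sum the lexicon scores of the words in a tweet; the sign of the
    total decides the sentiment label."""
    score = sum(LEXICON.get(w, 0) for w in text.lower().split())
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

label = dictionary_sentiment("great speech but terrible turnout good vibes")
```

The Naive Bayes side of such a system instead learns word-given-label probabilities from labeled tweets, so the two approaches can cross-check each other on live data.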
WEB-BASED DATA MINING TOOLS: PERFORMING FEEDBACK ANALYSIS AND ASSOCIATION RU... (IJDKP)
This paper explains web-enabled tools for educational data mining. The proposed web-based tools, developed using the ASP.NET framework and PHP, can help universities or institutions that provide students with elective courses, as well as improve academic activities based on feedback collected from students. The ASP.NET tool performs association rule mining using the Apriori algorithm, whereas the PHP-based Feedback Analytical Tool collects feedback about faculty and institutional infrastructure from students and, based on that feedback, reports the performance of the faculty and the institution. Using that data, it helps management improve in-house training skills and gain knowledge about the educational trends faculty should follow to improve the effectiveness of courses and teaching skills.
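The core of Apriori is counting candidate itemsets against a minimum support threshold. A sketch of that counting step follows; the course-feedback transactions and the threshold are hypothetical, and a full Apriori would additionally prune candidates level by level:

```python
from itertools import combinations
from collections import Counter

def frequent_itemsets(transactions, min_support, k):
    """Count k-item combinations across transactions and keep those meeting
    min_support: the candidate-counting step at the heart of Apriori."""
    counts = Counter()
    for t in transactions:
        for combo in combinations(sorted(t), k):
            counts[combo] += 1
    return {c: n for c, n in counts.items() if n >= min_support}

# Hypothetical elective-course selections, one set per student.
feedback = [{"math", "physics"}, {"math", "physics", "cs"}, {"math", "cs"}]
pairs = frequent_itemsets(feedback, min_support=2, k=2)
```

Frequent pairs like these are then turned into association rules ("students who take math often take physics"), which is the kind of insight the tool surfaces for course planning.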
Detection and Analysis of Twitter Trending Topics via Link-Anomaly Detection (IJERA Editor)
This paper involves two approaches for finding trending topics in social networks: a keyword-based approach and a link-based approach. Conventional keyword-based approaches to topic detection mainly focus on the frequencies of (textual) words. We propose a link-based approach that focuses on posts as reflected in the mentioning behavior of hundreds of users. Anomaly detection on the Twitter data set is carried out by retrieving trending topics from Twitter sequentially through an API, together with the corresponding users for training; the computed anomaly scores are then aggregated across different users. The aggregated anomaly score is fed into change-point analysis or burst detection in order to pinpoint emerging topics. We used a real-time Twitter account, so results vary according to current tweet trends. The experiment shows that the proposed link-based approach performs even better than the keyword-based approach.
Measuring information credibility in social media using combination of user p... (IJECEIAES)
Information credibility in social media is becoming the most important part of information sharing in society. The literature shows that there is no labeling of information credibility based on user competencies and their posted topics. This paper improves information credibility measurement by adding 17 new features for Twitter and 49 for Facebook. In the first step, we perform a labeling process based on user competencies and their posted topics to classify users into two groups, credible and not credible, with respect to their posted topics. These approaches are evaluated on ten thousand samples of real-field data obtained from the Twitter and Facebook networks using Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit), and J48 classifiers. With the proposed new features, the credibility of information provided in social media increases significantly, as indicated by better accuracy compared to the existing technique for all classifiers.
International Journal of Engineering Research and Applications (IJERA) is an open-access, online, peer-reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nanotechnology & Science, Power Electronics, Electronics & Communication Engineering, Computational Mathematics, Image Processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design, etc.
Cold-Start Management with Cross-Domain Collaborative Filtering and Tags (Matthias Braunhofer)
Recommender systems suffer from the new-user problem, i.e., the difficulty of making accurate predictions for users who have rated only a few items. Moreover, they usually compute recommendations for items in just one domain, such as movies, music, or books. In this paper we deal with such a cold-start situation by exploiting cross-domain recommendation techniques, i.e., we suggest items to a user in one target domain by using ratings of other users in a completely disjoint auxiliary domain. We present three rating prediction models that make use of information about how users tag items in an auxiliary domain, and about how these tags correlate with the ratings, to improve the rating prediction task in a different target domain. We show that the proposed techniques can effectively deal with the considered cold-start situation, given that the tags used in the two domains overlap.
video link => http://youtu.be/D9PBX8FmtpQ
A tweets classifier that categorises tweets into these six categories:
Business
Politics
Music
Health
Sports
Technology
SIMILARITY MEASURES FOR RECOMMENDER SYSTEMS: A COMPARATIVE STUDY (Journal For Research)
Recommender systems have the ability to guide users in a personalized way to interesting items in a large space of possible options. They have fundamental applications in e-commerce and information retrieval, providing suggestions that prune large information spaces so that users are directed towards those items that best meet their needs and preferences. A variety of approaches have been proposed, but collaborative filtering has been the most popular and widely used. Collaborative filtering takes user feedback in the form of ratings in an application area and uses various similarity measures to find similarities and differences between user profiles in order to generate recommendations. This paper provides an overview of a few important similarity measures that are currently in use. Different similarity measures produce different results for the same input parameters, so, to understand how various similarity measures behave when put in different contexts with the same input, a few observations are made. The paper also provides a comparison graph to help interpret the results of the different similarity measures.
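The point that different measures disagree on the same input is easy to demonstrate. The sketch below computes three commonly compared measures on one pair of toy rating vectors (the vectors are invented; zero means "not rated" for the Jaccard set view):

```python
import math

def pearson(a, b):
    """Pearson correlation: cosine of the mean-centered vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    return cov / (sa * sb)

def cosine(a, b):
    """Cosine similarity on the raw rating vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(y * y for y in b)))

def jaccard(a, b):
    """Jaccard similarity over the sets of rated (nonzero) positions;
    it ignores the rating values entirely."""
    sa = {i for i, x in enumerate(a) if x}
    sb = {i for i, y in enumerate(b) if y}
    return len(sa & sb) / len(sa | sb)

u, v = [5, 3, 0, 1], [4, 0, 0, 1]  # two users' ratings, 0 = unrated
results = {"pearson": pearson(u, v),
           "cosine": cosine(u, v),
           "jaccard": jaccard(u, v)}
```

On this single input the three measures already yield three different values, which is exactly the behavior the comparative study examines across contexts.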
In recent years a huge amount of structured and unstructured data, big data, has been generated from social networks, and the valuable information in this social big data needs to be extracted. Traditional analytic platforms must be scaled up to analyze social big data efficiently and in a timely manner. Sentiment analysis of social big data helps organizations by providing business insights into public opinion. Sentiment analysis based on a multi-class classification scheme aims to classify text into more detailed sentiment labels. Multi-class classification with a single-tier architecture, where a single model is developed and trained on the entire labeled data, can increase classification complexity. In this paper, a multi-tier sentiment analysis system on a big data analytics platform (MSABDP) is proposed to reduce multi-class classification complexity and efficiently analyze large-scale data sets. Hadoop is built for big data analytics and is a good platform for managing large data at scale; it improves scalability and efficiency through a distributed processing environment, being implemented on the MapReduce framework and the Hadoop distributed storage (HDFS). MSABDP combines the SentiStrength lexicon and a learning-based classification scheme in a multi-tier architecture and runs on the big data analytics platform in order to manage large data at scale. The proposed system collected a large amount of real Twitter data using Apache Flume, and this data was used for evaluation. The evaluation results show that the proposed multi-class classification system with a multi-tier architecture improves classification accuracy by 7% over multi-class classification based on a single-tier architecture.
In this talk at the GDG DataFest event, I presented a practical introduction to the main recommender-system techniques, including recent Deep Learning-based architectures. Examples using Python, TensorFlow, and Google ML Engine were presented, and datasets were provided so we could work through an article- and news-recommendation scenario.
COMPARISON OF COLLABORATIVE FILTERING ALGORITHMS WITH VARIOUS SIMILARITY MEAS...IJCSEA Journal
Collaborative Filtering is generally used as a recommender system. There is enormous growth in the
amount of data in web. These recommender systems help users to select products on the web, which is the
most suitable for them. Collaborative filtering-systems collect user’s previous information about an item
such as movies, music, ideas, and so on. For recommending the best item, there are many algorithms,
which are based on different approaches. The most known algorithms are User-based and Item-based
algorithms. Experiments show that Item-based algorithms give better results than User-based algorithms.
The aim of this paper isto compare User-based and Item-based Collaborative Filtering Algorithms with
many different similarity indexes with their accuracy and performance. We provide an approach to
determine the best algorithm, which give the most accurate recommendation by using statistical accuracy
metrics. The results are compared the User-based and Item-based algorithms with movie recommendation
data set.
Political prediction analysis using text mining and deep learningVishwambhar Deshpande
We have proposed a system to determine current sentiment on twitter using Twit-
ter API for open access which includes opinions from dierent content structures like
latest news, audits, articles and social media posts. and Deep Learning method to
study Historic Data for predicting future results. we utilized Naive Bayes and dictio-
nary based algorithms to predict the sentiment on Live Twitter Data.
WEB-BASED DATA MINING TOOLS : PERFORMING FEEDBACK ANALYSIS AND ASSOCIATION RU...IJDKP
This paper aims to explain the web-enabled tools for educational data mining. The proposed web-based
tool developed using Asp.Net framework and php can be helpful for universities or institutions providing
the students with elective courses as well improving academic activities based on feedback collected from
students. In Asp.Net tool, association rule mining using Apriori algorithm is used whereas in php based
Feedback Analytical Tool, feedback related to faculty and institutional infrastructure is collected from
students and based on that Feedback it shows performance of faculty and institution. Using that data, it
helps management to improve in-house training skills and gains knowledge about educational trends which
is to be followed by faculty to improve the effectiveness of the course and teaching skills.
Detection and Analysis of Twitter Trending Topics via Link-Anomaly DetectionIJERA Editor
This paper involves two approaches for finding the trending topics in social networks that is key-based approach and link-based approach. In conventional key-based approach for topics detection have mainly focus on frequencies of (textual) words. We propose a link-based approach which focuses on posts reflected in the mentioning behavior of hundreds users. The anomaly detection in the twitter data set is carried out by retrieving the trend topics from the twitter in a sequential manner by using some API and corresponding user for training, then computed anomaly score is aggregated from different users. Further the aggregated anomaly score will be feed into change-point analysis or burst detection at the pinpoint, in order to detect the emerging topics. We have used the real time twitter account, so results are vary according to the tweet trends made. The experiment shows that proposed link-based approach performs even better than the keyword-based approach.
Measuring information credibility in social media using combination of user p...IJECEIAES
Information credibility in social media is becoming the most important part of information sharing in the society. The literatures have shown that there is no labeling information credibility based on user competencies and their posted topics. This paper increases the information credibility by adding new 17 features for Twitter and 49 features for Facebook. In the first step, we perform a labeling process based on user competencies and their posted topic to classify the users into two groups, credible and not credible users, regarding their posted topics. These approaches are evaluated over ten thousand samples of real-field data obtained from Twitter and Facebook networks using classification of Naive Bayes (NB), Support Vector Machine (SVM), Logistic Regression (Logit) and J48 Algorithm (J48). With the proposed new features, the credibility of information provided in social media is increasing significantly indicated by better accuracy compared to the existing technique for all classifiers.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Cold-Start Management with Cross-Domain Collaborative Filtering and TagsMatthias Braunhofer
Recommender systems suffer from the new user problem, i.e., the difficulty to make accurate predictions for users that have rated only few items. Moreover, they usually compute recommendations for items just in one domain, such as movies, music, or books. In this paper we deal with such a cold-start situation exploiting cross-domain recommendation techniques, i.e., we suggest items to a user in one target domain by using ratings of other users in a, completely disjoint, auxiliary domain. We present three rating prediction models that make use of information about how users tag items in an auxiliary domain, and how these tags correlate with the ratings to improve the rating prediction task in a different target domain. We show that the proposed techniques can effectively deal with the considered cold-start situation, given that the tags used in the two domains overlap.
video link => http://youtu.be/D9PBX8FmtpQ
Tweets Classifier which categorises tweets into these 6 categories:
Business
Politics
Music
Health
Sports
Technology
SIMILARITY MEASURES FOR RECOMMENDER SYSTEMS: A COMPARATIVE STUDYJournal For Research
Recommender Systems have the ability to guide the users in a personalized way to interesting items in a large space of possible options. They have fundamental applications in e-commerce and information retrieval, providing suggestion that prune large information spaces so that users are directed towards those items that best meets the needs and preferences. A variety of approaches have been proposed but collaborative filtering has been the most popular and widely used which makes use of various similarity measures to calculate the similarity. Collaborative Filtering takes the user feedback in the form of ratings in an application area and uses it to find similarities and differences between user profiles to generate recommendations. Collaborative Filtering makes use of various similarity measures to calculate the similarity or difference between the users. This paper provides an overview on few important similarity measures that are currently being used. Different similarity measures provide different results against same input parameters. So, to understand how various similarity measures behave when they are put in different contexts but with same input, few observations are made. This paper also provides a comparison graph to help understand the results of different similarity measures.
Over recent years, big data, a huge amount of structured and unstructured data is generated from social Network. There needs to extract the valulable information from the social big data. The traditional analytic platform needs to be scaled up for analyzing social big data in an efficient and timely manner. Sentiment Analysis of social big data helps the organizations by providing business insights with public opinion. Sentiment analysis based on multi-class classification scheme is oriented towards classification of text into more detailed sentiment labels. Multi-class classification with single tier architecture where single model is developed and entire labeled data is trained may increase the classification complexity. In this paper, multi-tier sentiment analysis system on big data analytics platform (MSABDP) is proposed to reduce the multi class classification complexity and efficiently analyze large scale data set. Hadoop is built for big data analytics and it is a good platform for being able to manage large data at scale and which can improve scalability and efficiency by adopting distributed processing environment since they have been implemented using a MapReduce framework and a Hadoop distributed storage (HDFS). The MSABDP is implemented by combining SentiStrength lexicon and learning based classification scheme with multi-tier architecture and run on big data analytics platform for being able to manage large data at scale. The proposed system collects a large amount of real Twitter data by using Apache Flume and the data was used for evaluation. The evaluation results have shown that the proposed multi class classification system with multi-tier architecture is able to significantly improve the classification accuracy over multi class classification based on single-tier architecture by 7%.
In this talk at the GDG DataFest event, I gave a practical introduction to the main recommender system techniques, including recent Deep Learning-based architectures. Examples using Python, TensorFlow and Google ML Engine were presented, and datasets were provided so we could work through an article and news recommendation scenario.
Prediction of Reaction towards Textual Posts in Social NetworksMohamed El-Geish
Posting on social networks can be a gratifying or a terrifying experience, depending on the reaction the post and, by association, its author receive from readers. To better understand what makes a post popular, this project inquires into the factors that determine the number of likes, comments, and shares a textual post gets on LinkedIn, and finds a predictor function that can estimate those quantitative social gestures.
Advances in Exploratory Data Analysis, Visualisation and Quality for Data Cen...Hima Patel
It is widely accepted that data preparation is one of the most time-consuming steps of the machine learning (ML) lifecycle. It is also one of the most important steps, as the quality of data directly influences the quality of a model. In this session, we will discuss the importance and role of exploratory data analysis (EDA) and data visualisation techniques in finding data quality issues and in data preparation, relevant to building ML pipelines. We will also discuss the latest advances in these fields and highlight areas that need innovation. Finally, we will discuss the challenges posed by industry workloads and the gaps to be addressed to make data-centric AI real in industry settings.
Question 1 Some agile and incremental methods- like extreme programmi.pdfPhilzIGHudsonl
Question 1
Some agile and incremental methods, like extreme programming, claim that they don't need
high-level designs.
Group of answer choices
True
False
Question 2
It is impossible, and not practical, to remove all bugs from a reasonably large system.
Group of answer choices
True
False
Question 3
A successful software system will require zero maintenance.
Group of answer choices
True
False
Question 4
Git belongs in which category of version control systems (VCS)?
Group of answer choices
Local VCS
Centralized VCS
Distributed VCS
Locking VCS
Question 5
Git thinks about its data more like a stream of snapshots as opposed to a collection of file
differences (or deltas).
Group of answer choices
True
False
Question 6
If two team members disagree about an estimate, which of the following can help find a
compromise?
Group of answer choices
Wideband Delphi
discussing assumptions
WBS
a Scrum meeting
Question 7
Which planning method is used by eXtreme Programming (XP)?
Group of answer choices
PROBE
COCOMO II
The Planning Game
Wideband Delphi
Question 8
In the Wideband Delphi process, the project manager would make a good moderator.
Group of answer choices
True
False
Question 9
Which of the following are true of Wideband Delphi? (select all that apply)
Group of answer choices
requires the entire team to correct one another
requires the creation of a WBS
requires a daily stand-up meeting
was developed at the Rand Corporation in the 1940s
involves an estimation team with 3 to 7 members
Flag question: Question 10
Question 10
Which of the following should you have before you begin the Wideband Delphi process? (select
all that apply)
Group of answer choices
WBS
Vision document
Scope document
a Scrum meeting
Flag question: Question 11
Question 11
Which of the following relationships is also known as "generalization"?
Group of answer choices
Has-a
Creates
Is-a
Knows about
Flag question: Question 12
Question 12
The domain model is a dynamic model that captures the behavior of the system.
Group of answer choices
True
False
Flag question: Question 13
Question 13
External systems are always modeled as actors.
Group of answer choices
True
False
Flag question: Question 14
Question 14
Matching
Group of answer choices
validation
[ Choose ] Are we doing the things right? Are we doing the right things?
verification
[ Choose ] Are we doing the things right? Are we doing the right things?
Flag question: Question 15
Question 15
Which of the following rules are part of Osborn's method?
Group of answer choices
solicit user stories
focus on quantity
withhold criticism
break rules
encourage unusual ideas
combine and improve ideas
Flag question: Question 16
Question 16
Which of the following are parts of a use case?
Group of answer choices
title
main success scenario
extensions
user stories
Flag question: Question 17
Question 17
A domain model is: (check all that apply)
Group of answer choices
a use case
a graphic that shows relationships
a project glossary
a dictionary of terms
A Robust Cybersecurity Topic Classification ToolIJNSA Journal
In this research, we use user-defined labels from three internet text sources (Reddit, StackExchange, Arxiv) to train 21 different machine learning models for the topic classification task of detecting cybersecurity discussions in natural English text. We analyze the false positive and false negative rates of each of the 21 models in cross-validation experiments. Then we present a Cybersecurity Topic Classification (CTC) tool, which takes the majority vote of the 21 trained machine learning models as the decision mechanism for detecting cybersecurity-related text. We also show that the majority-vote mechanism of the CTC tool yields lower false negative and false positive rates on average than any of the 21 individual models. We show that the CTC tool scales to hundreds of thousands of documents with a wall-clock time on the order of hours.
Recommendation based on Clustering and Association RulesIJARIIE JOURNAL
Recommender systems play an important role in filtering and customizing desired information. Recommender systems are divided into three categories, i.e., collaborative filtering, content-based filtering, and hybrid filtering, which are the most widely adopted techniques in recommender systems. This paper mainly describes the issues of recommendation systems. Its main aim is to recommend suitable items to the user; recommending suitable items requires better rule extraction, for which association mining is applied. A clustering method is also applied to cluster the data by similar characteristics. The proposed methods try to eliminate certain problems such as sparsity and the cold-start problem, so association mining over clustering is used to overcome them.
A Proposed Method to Develop Shared Papers for Researchers at Conferenceiosrjce
In conferences, the topics of interest for papers cover a variety of subjects. If a researcher wants to write a shared research paper on a specific subject with another researcher who is interested in the same subject and wants to participate in the same conference, a problem arises, especially when the number of topics of interest and researchers becomes large. The aim of this paper is to solve that problem by finding a suitable representation of researchers' topics of interest that can be easily encoded and then used to find researchers who share the same topics of interest. Two proposed system algorithms are implemented to find the shared researchers at a conference, giving an easy and efficient implementation.
Similar to Tag-based Approaches to Sharing Background Information regarding Social Problems towards Facilitating Public Collaboration (20)
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
Tag-based Approaches to Sharing Background Information regarding Social Problems towards Facilitating Public Collaboration
1. Tag-based Approaches to Sharing Background Information regarding Social Problems towards Facilitating Public Collaboration
Masaru Watanabe,
Shun Shiramatsu, Yasuaki Goto
Nagoya Institute of Technology
2. Outline
1. Background and Goal
2. Automatic annotation
1. Generate tags
1. Filtering
2. SVM
2. Annotate tags
1. TF-IDF
2. Paragraph Vector
3. Prototype of API
3. Systems for sharing collaborative activities
4. Conclusion
3. Background
CivicTech: citizens and IT engineers cooperate to solve social problems.
Hackathons are frequently held.
When participants discuss solutions to social problems, they need to share background information regarding the problems.
Goal: Sharing background information about social problems
4. Our Approaches
1. Automatic annotation of web articles with social problem tags
If articles have social problem tags, these articles can be found easily as background information on the problems.
2. Systems for sharing collaborative activities
By publishing the activities within an organization as open data, citizen collaboration across organizations is promoted.
Goal: Sharing background information about social problems
5. Outline
1. Background and Goal
2. Automatic annotation
1. Generate tags
1. Filtering
2. SVM
2. Annotate tags
1. TF-IDF
2. Paragraph Vector
3. Prototype of API
3. Systems for sharing collaborative activities
4. Conclusion
7. Knowledge Connector
Site where users can share works such as ideas, applications, and data.
Tag-based search is supported.
Problems: users often forget to annotate or do not understand the necessity of annotation, and orthographical variants of tags occur.
http://idea.linkdata.org/
8. Our Solution
Problem: users often forget to annotate or do not understand the necessity of annotation. Solution: automatic annotation with social problem tags.
Problem: orthographic variants of tags. Solution: automatic generation of a tagset in advance.
11. Generate Tags
Requirements for generating tags:
A hierarchical structure, because exploratory browsing of related problems promotes understanding of background information.
A sufficient amount of social problem tags.
Source: the "Social problem" category of DBpedia Japanese, a well-known linked open dataset converted from Wikipedia.
However, some articles in the category are unrelated to "social problem".
12. Filtering to exclude inappropriate tags
Extract page titles from the "Social Problem" category and its subcategories (within n hierarchical levels).
Filter out noisy resources by tracing other particular categories.
13. Categories used for filtering
Filter A: Stub Category, Computer Science, Judgment, Work, Social Movement Organization, People, Biology Field, Criminal Studies, Crime Type, Peace Studies, Logic
Filter B: almost the same as Filter A, except that "Biology Field" is excluded.
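The two-step tag generation above (collect page titles under the "Social problem" category and its subcategories, then drop pages also reachable from a filter category) can be sketched as a breadth-first traversal of the category graph. This is a minimal sketch with an invented toy graph, not the actual DBpedia Japanese data:

```python
from collections import deque

def collect_pages(graph, pages, root, max_depth):
    """BFS the subcategory graph from `root` down to `max_depth` levels,
    returning the titles of all pages found in those categories."""
    titles, seen = set(), {root}
    queue = deque([(root, 0)])
    while queue:
        cat, depth = queue.popleft()
        titles.update(pages.get(cat, []))
        if depth < max_depth:
            for sub in graph.get(cat, []):
                if sub not in seen:
                    seen.add(sub)
                    queue.append((sub, depth + 1))
    return titles

def filter_tags(candidates, graph, pages, filter_cats, max_depth):
    """Remove candidate tags that also appear under any filter category."""
    noisy = set()
    for cat in filter_cats:
        noisy |= collect_pages(graph, pages, cat, max_depth)
    return candidates - noisy

# Toy data (illustrative only): subcategory links and category -> page titles.
graph = {"Social problem": ["Poverty issues", "Crime type"],
         "Crime type": []}
pages = {"Social problem": ["Hunger"],
         "Poverty issues": ["Homelessness"],
         "Crime type": ["Burglary"]}

candidates = collect_pages(graph, pages, "Social problem", max_depth=2)
tags = filter_tags(candidates, graph, pages, {"Crime type"}, max_depth=2)
# "Burglary" is dropped because it is reachable from the filter category.
```

The same traversal serves both steps; only the starting category and the direction of use (collect vs. exclude) differ.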
14. Evaluation method (Filtering)
Recall: six participants selected 102 pages related to social problems from Japanese Wikipedia; we calculated the percentage of these items that were included in the tag list.
Precision: 100 tags were selected randomly from the tag list; 25 participants evaluated whether each was a social problem on a five-point scale, and we calculated the percentage of tags regarded as social problems (rated more than three on the scale).
15. Evaluation (Tag Generation by Filtering)
The method with Filter B and 2 hierarchical levels has the best balance.
Recall: 43%
Precision: 49%
16. Filter based on SVM
Dataset: pages belonging to a lower category within three hierarchical levels of "Category: Social problem"
Feature vectors used:
a. Category pages reachable within 5 hierarchical levels from any of the acquired pages, with an occurrence frequency of 9 or more
b. Sum of the distributed representation vectors of the words (word2vec) in each page title
c. Distributed representation vector of the full text of each page (doc2vec)
d. Combination of a. and c.
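Feature vector a. can be read, following the editor's notes, as one value per frequent category: the number of hierarchy levels from the page up to that category, with 6 as a sentinel for categories not reachable within 5 levels. A minimal sketch with an invented toy graph of parent-category links:

```python
from collections import deque

UNREACHABLE = 6  # sentinel for categories not reachable within 5 levels

def category_depths(parents, page, feature_cats, max_depth=5):
    """Breadth-first search upward through parent categories; the feature
    value for each category is the number of levels needed to reach it."""
    depth_of = {page: 0}
    queue = deque([page])
    while queue:
        node = queue.popleft()
        if depth_of[node] >= max_depth:
            continue
        for parent in parents.get(node, []):
            if parent not in depth_of:
                depth_of[parent] = depth_of[node] + 1
                queue.append(parent)
    return [depth_of.get(cat, UNREACHABLE) for cat in feature_cats]

# Toy parent-category links (illustrative, not real DBpedia data).
parents = {"Hunger": ["Food security"],
           "Food security": ["Social problem"]}
features = category_depths(parents, "Hunger",
                           ["Food security", "Social problem", "Logic"])
# "Logic" is not reachable from "Hunger", so it gets the sentinel value 6.
```

Vectors like this, one slot per frequent category, would then be fed to the SVM as input.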
17. Evaluation method (SVM)
10-fold cross-validation test.
Both positive and negative examples used 120 cases, taken from the results obtained when evaluating the precision of the filtering method.
Recall: percentage of positive examples that were categorized into the positive class.
Precision: percentage of examples categorized into the positive class that were positive examples.
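The recall and precision defined on this slide can be computed directly from the sets of predicted and true positives. A minimal sketch with hypothetical labels (the page names are invented for illustration):

```python
def precision_recall(predicted, actual_positive):
    """Precision: share of items predicted positive that are truly positive.
    Recall: share of truly positive items that were predicted positive."""
    tp = len(predicted & actual_positive)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual_positive) if actual_positive else 0.0
    return precision, recall

# Hypothetical labels: 4 pages judged social problems, 5 predicted positive.
actual = {"Hunger", "Bullying", "Poverty", "Pollution"}
predicted = {"Hunger", "Bullying", "Poverty", "Logic", "Chess"}
p, r = precision_recall(predicted, actual)
# precision = 3/5 = 0.6, recall = 3/4 = 0.75
```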
20. Annotate Tags
Calculate the cosine similarity between the target article and every Wikipedia article whose title is a tag name.
When the similarity is equal to or higher than the threshold, the title is attached as a tag.
Two methods are used for vector generation:
1. TF-IDF
2. Paragraph Vector
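The TF-IDF variant of this annotation step can be sketched in plain Python: vectorize the target article and the tag-name articles, then keep the tags whose cosine similarity clears the threshold. The toy corpus below is invented for illustration; the threshold of 0.2 is the one the evaluation later found workable for TF-IDF.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Build a TF-IDF vector (term -> weight) for each tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: tf[t] * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    dot = sum(u[t] * v.get(t, 0.0) for t in u)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

# Toy corpus: the target article plus Wikipedia articles named after tags.
target = "rising food prices cause hunger".split()
tag_articles = {"Hunger": "hunger follows rising food prices".split(),
                "Logic": "formal logic studies valid inference".split()}

THRESHOLD = 0.2
vecs = tfidf_vectors([target] + list(tag_articles.values()))
tags = [name for name, v in zip(tag_articles, vecs[1:])
        if cosine(vecs[0], v) >= THRESHOLD]
```

The Paragraph Vector variant only swaps the vectorizer; the threshold comparison is the same.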
21. Evaluation method (Annotation)
Measure the cosine similarity with each method for 10 articles on social problems collected in advance.
Show 25 participants up to ten tags annotated to each article, plus three randomly extracted tags, and have them evaluate the validity of the tags on a seven-point scale.
Calculate the correlation coefficient and accuracy based on these evaluations.
22. Evaluation (Tags Annotated by TF-IDF)
Correlation coefficient: 0.732
Accuracy rate at threshold 0.2: 0.812
Tags with similarity of 0.2 or more: 37/85
23. Examples of false recognition: the evaluation value given by the system differs from the evaluation value given by humans
In the article on Hunger, "Food crisis" received Human: 7 (very high), System: 0.154 (low).
In the article on Bullying, "Social isolation" received Human: 5 (high), System: 0.152 (low).
Similarity assessment via related terms could not be taken into account.
Note: these tags are translated from Japanese.
24. Evaluation (Tags Annotated by Paragraph Vector)
Correlation coefficient: 0.346
Accuracy rate at threshold 0.35: 0.824
Tags with similarity of 0.35 or more: 8/102
26. Prototype of API
Input: http://foo-bar.net/tag-recom/[Target page URL]
Output: JSON containing the annotated tags together with their similarity scores.
Note: these tags are translated from Japanese.
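The exact JSON schema of the prototype's output appears only as a figure in the original slides, so the field names below are assumptions; this sketch merely illustrates returning annotated tags with similarity scores as JSON:

```python
import json

def annotate(url):
    """Placeholder for the tag-annotation pipeline; the real service would
    fetch the page at `url` and score it against the tag articles."""
    return {"Food crisis": 0.35, "Hunger": 0.27}

def tag_recom(target_url):
    # Assumed response shape: the target URL plus a similarity-sorted tag list.
    scores = annotate(target_url)
    return json.dumps({"url": target_url,
                       "tags": [{"tag": t, "similarity": s}
                                for t, s in sorted(scores.items(),
                                                   key=lambda x: -x[1])]},
                      ensure_ascii=False)

response = tag_recom("http://example.org/article-on-hunger")
```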
27. Outline
1. Background and Goal
2. Automatic annotation
1. Generate tags
1. Filtering
2. SVM
2. Annotate tags
1. TF-IDF
2. Paragraph Vector
3. Prototype of API
3. Systems for sharing collaborative activities
4. Conclusion
28. Knowledge Connector (repeated)
Site where users can share works such as ideas, applications, and data.
We aim to solve these problems by developing MissionForest:
Users often forget to annotate or do not understand the necessity of annotation.
Orthographic variants of tags.
Lack of a task management function.
29. MissionForest
Web system for sharing social activities and research activities.
Manages tasks in a tree structure, like a Work Breakdown Structure.
Activity data is published as linked open data.
30. Benefits of linked open data
You can discover information about social problems from tags.
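Publishing task data as linked open data amounts to emitting triples that link each task to its social problem tags. A minimal N-Triples sketch; the URIs are invented for illustration (only the Dublin Core property names are real), since the slides do not describe MissionForest's actual vocabulary:

```python
# Emit a task and its social problem tags as N-Triples.
# All subject/object URIs here are hypothetical examples.
def task_triples(task_uri, title, tag_uris):
    triples = [f'<{task_uri}> <http://purl.org/dc/terms/title> "{title}" .']
    for tag in tag_uris:
        triples.append(
            f'<{task_uri}> <http://purl.org/dc/terms/subject> <{tag}> .')
    return "\n".join(triples)

nt = task_triples("http://example.org/task/1",
                  "Survey on food support",
                  ["http://ja.dbpedia.org/resource/Hunger"])
```

Because the tags are shared URIs, a consumer can follow the `subject` links to find articles and other organizations' tasks about the same social problem.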
31. Future work for MissionForest
• Annotate each task with social problem tags that can be used for exploratory browsing of social activities; browsing other organizations' solutions is helpful for discussing one's own problems.
(Example tags: Environmental destruction, Global warming)
32. Outline
1. Background and Goal
2. Automatic annotation
1. Generate tags
1. Filtering
2. SVM
2. Annotate tags
1. TF-IDF
2. Paragraph Vector
3. Prototype of API
3. Systems for sharing collaborative activities
4. Conclusion
33. Conclusion
Automatic annotation
• The filtering method based on SVM can generate a sufficient tag set from DBpedia Japanese.
• The TF-IDF method can tag articles with reasonable precision.
Systems for sharing collaborative activities
• We are developing MissionForest to connect collaboration within the university laboratory with cross-organization collaboration.
Editor's Notes
Thank you chair.
I'll talk about "Tag-based Approaches to Sharing Background Information regarding Social Problems towards Facilitating Public Collaboration."
This is a brief outline of our presentation.
Firstly, Background and Goal; secondly, Automatic Annotation; thirdly, Systems for Sharing Collaborative Activities; and finally, we conclude our presentation.
In recent years, CivicTech has become active.
CivicTech refers to activities to solve social problems through collaboration between citizens and IT engineers.
Many CivicTech hackathons are being held.
In Civictech, participants discuss the solutions of social problems.
In order to do that, it is necessary to share background knowledge about the problem.
Therefore, we set our goal as "Sharing background information about social problems."
We chose two approaches to achieve our goal.
The first approach is to automatically annotate web articles with social problem tags.
If articles have tags of social problems, these articles can be found easily as background information of the problems.
The second approach is to develop "systems for sharing collaborative activities."
By making the activities in the organization open data, citizen collaboration across the organization is promoted.
Next, I'll talk about Automatic annotation.
If tags are attached to articles on social problems, you can easily investigate social problems.
This is an example of using tags when discussing "global warming".
In this discussion, let's assume that an article with the tag "global warming" is shared.
By clicking the tag "global warming", you can get a list of other articles on "global warming".
An example of a site that actually uses such tag-based search is Knowledge Connector.
Knowledge Connector is a site where users can share works such as ideas, applications and data.
In Knowledge Connector, tags are annotated by the users.
Therefore, some problems arise from this.
For example, users often forget to annotate the tags, some users do not understand the necessity of annotation, and orthographical variants occur.
These are our solution for the problems.
The solution to forget to annotate tags is to annotate tags automatically.
The solution to orthographic variants is to generate a tag set automatically.
This is architecture of our automatic tag annotation systems.
First, we generate tags and use the results to annotate articles on social problems on the web.
First, we will talk about how to generate the tag set automatically.
The purpose of the tags here is that exploratory browsing of related problems promotes understanding of background information.
Therefore, there should be a hierarchical structure between the tags.
Furthermore, it is also important to have a sufficient amount of social problem tags.
In this research, we selected the "social problem" category of the Japanese version of DBpedia as the source for tag extraction.
DBpedia is a project to extract information from Wikipedia and publish it as linked open data.
We tried to extract tag candidates from DBpedia.
However, if the category is extracted as-is, some of its articles are unrelated to "social problem."
To solve this problem, we decided to filter pages by which parent categories they have.
First, we obtain the pages belonging to the social problem category and its lower categories, and extract the page titles as tag names.
Next, we remove pages belonging to other specific categories and their lower categories from the list of extracted pages.
In this research, two kinds of filters are prepared.
Filter A was designed based on common points observed among tags in the unfiltered tag list that clearly have no relation to social problems.
Filter B is the same as Filter A but does not use the category "Biology Field."
That category was filtering out not only unrelated tags but also many related tags.
In the evaluation of filtering, the recall and precision were calculated.
This is how calculate recall.
First, six participants selected 102 pages related to social problems from Japanese Wikipedia.
Then, we calculated the percentage of these items that were included in the tag list.
This is how calculate precision.
First, select 100 tags randomly from the tag list.
Then, we asked 25 participants to evaluate whether these data were social problems on a five-point scale.
Finally, we calculated the percentage of tags rated more than three on the scale.
This is evaluation result.
Precision drops considerably at 3 levels, regardless of whether a filter is used.
This shows that elements unrelated to social problems increase explosively at 3 levels or deeper.
In addition, we found that the recall at 1 level without a filter, i.e., the list obtained from pages belonging directly to the social problem category, is quite low.
We should also review the point of using only DBpedia for tag generation.
Therefore, we remove items that are not social problems from the list by binary classification using a support vector machine.
We used Wikipedia pages belonging to a lower category within 3 levels of "Category: Social problem" as the dataset.
Four kinds of feature vectors are prepared as support vector machine input vectors.
The first uses a corpus of the category pages reachable within 5 levels from any of the acquired pages.
Among them, we decided to use those with an appearance frequency of 9 or more overall.
The number of hierarchy levels from the page to the category page is taken as the feature value.
However, if the category page cannot be reached within 5 levels, the value is 6.
The second one is a corpus of the whole wikipedia sentence, and the total value of the distributed representation vectors of words constituting each page title is taken as a vector.
The third one is a corpus of the whole wikipedia sentence, and the total value of the distributed representation vectors of words constituting each page article is taken as a vector.
The fourth one is combination of the first and third vectors.
Based on the above vectors, we performed a 10-fold cross validation test on the support vector machine to calculate the recall and precision.
We used the questionnaire results gathered when measuring the precision of filtering as the data for the positive examples and the negative examples used for the cross validation test.
We calculate recall and precision.
In this evaluation, recall means the proportion of positive examples that are categorized into the positive class, and precision means the proportion of examples categorized into the positive class that are actually positive.
This is evaluation result.
In the method using word vectors, recall is high and precision is low.
This seems to be because most elements were judged to be positive.
Except for recall of word vectors, the method using only category pages shows better performance than others.
So, when using support vector machine, it is effective to use only category information.
Because the denominators of the ratios differ from those used to evaluate the filtering methods, the values cannot be compared directly.
But these results indicate a great improvement over the filtering methods.
Next, we'll talk about automatic tag annotation.
For automatic annotation, we use the Wikipedia article whose title is the same as the tag candidate.
We calculate the cosine similarity between the article to be annotated and the Wikipedia article with the same name as the tag candidate, and attach the tags whose similarity is equal to or higher than the threshold.
To create the vectors for calculating cosine similarity, we used two methods: TF-IDF and Paragraph Vector.
In the evaluation of automatic annotation, we calculated correlation coefficient and accuracy.
We gave ten articles to the tag annotation system built with each method and calculated the cosine similarities.
Tags whose cosine similarity was within the top 10 and equal to or above the threshold, plus 3 randomly selected tags, were shown to 25 participants and evaluated on a seven-point scale.
We then calculated the correlation coefficient and accuracy based on these evaluations.
This is evaluation result.
The correlation coefficient shows a strong correlation.
The system evaluation values of elements whose questionnaire evaluation value is 7 are widely dispersed.
We think this is due to a characteristic of TF-IDF: it cannot handle related words.
When the threshold was set to 0.2, the accuracy rate and the number of tags given were sufficient.
We think this is useful for supporting semi-automatic annotation in actual use.
These are examples where the system evaluation value differs from the questionnaire evaluation value.
In the article on hunger, the tag "food crisis" was judged an appropriate tag by humans but an inappropriate tag by the system.
Also, in the article on bullying, the tag "social isolation" was judged an appropriate tag by humans but an inappropriate tag by the system.
When I looked at these articles, these tags were not mentioned in the articles themselves.
It seems that the questionnaire evaluation values were high due to the strong relevance between the words themselves: "hunger" and "food crisis," "bullying" and "social isolation."
We think that these problems can be solved by introducing a method that can handle related words.
This is the evaluation of the Paragraph Vector method.
The correlation coefficient shows a weak correlation.
In this research, we used only the 102 Wikipedia articles that are tag candidates as the corpus for the paragraph vectors.
We think a correlation was not achieved because the number of documents in the corpus was insufficient.
Also, since the Paragraph Vector algorithm considers word order, stylistic differences between documents may have had an effect.
When the threshold was set to 0.35, the accuracy rate was sufficient, but the number of tags given was not.
These results indicate that TF-IDF is more suitable for actual use than Paragraph Vector.
We provide the system for automatic tag generation and automatic annotation as an API.
By passing a URL containing an article on a social problem to the API, the tags the system judges appropriate for the article are returned together with their similarities.
Currently, JSON in the format shown in the figure is output.
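For illustration, a response of the kind shown in the figure might look like this (the field names and values are invented placeholders, not the API's actual schema):

```python
import json

# Hypothetical response shape; the actual field names of the API may differ.
response = {
    "url": "http://example.org/articles/hunger",
    "tags": [
        {"tag": "food crisis", "similarity": 0.42},
        {"tag": "poverty", "similarity": 0.27},
    ],
}
print(json.dumps(response, ensure_ascii=False, indent=2))
```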
Now let's talk about systems for sharing collaborative activities.
An example of a system that shares collaborative work is the Knowledge Connector mentioned earlier.
However, in addition to the problem above, the Knowledge Connector lacks a task management function.
We aim to solve these problems by developing MissionForest.
This is the User Interface of MissionForest.
MissionForest is a web system for sharing social activities and research activities.
This system can manage tasks in a tree structure, like a Work Breakdown Structure, and publishes activity data as linked open data.
Linked open data is open data that expresses each piece of data, and the relationships between data, using URIs. By publishing data in this format, users can effectively utilize related information by following the connections between the data. If the tags described above are among those connections, they will lead to the discovery of articles and data on social problems related to a given mission or task.
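As a toy illustration of publishing a tagged task as linked open data (only the Dublin Core title predicate is a real vocabulary term; every other URI below is a hypothetical placeholder, and MissionForest's real vocabulary may differ):

```python
# A task described as subject-predicate-object triples.
# Only http://purl.org/dc/terms/title is a real vocabulary term;
# every other URI here is a made-up placeholder.
TASK = "http://example.org/missionforest/task/42"
triples = [
    (TASK, "http://purl.org/dc/terms/title", "Survey local food banks"),
    (TASK, "http://example.org/vocab/partOf",
     "http://example.org/missionforest/mission/7"),
    (TASK, "http://example.org/vocab/socialProblemTag",
     "http://example.org/tag/food-crisis"),
]
for s, p, o in triples:
    print(s, p, o)
```

The third triple is where a social problem tag would link a task to related articles and data.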
As future work, we are thinking of incorporating the tag system described above.
Browsing other organizations' solutions is helpful when discussing one's own problems.
So we think that annotating each task with social problem tags can enable exploratory browsing of social activities.
Conclusion.
First, we talked about automatic tag annotation.
The SVM-based filtering method can generate a sufficient tag set from DBpedia Japanese.
The TF-IDF method can tag articles with reasonable precision.
Next, we talked about systems for sharing collaborative activities.
We have been developing MissionForest to connect collaboration within the university laboratory with cross-organizational collaboration.
That's all. Thank you.
This is an example of a tag that could not be filtered out by Filter B.
The Goller family, a typical example of abusive intergenerational chaining, was judged a social problem by the system, but participants judged it to have no relation to social problems.
Definitions of key words to be used as social problem tags should be considered.
Keywords from many fields, such as myths and feelings, remained.
Eleven categories have already been selected for filtering.
Adding new categories to filter out these tags is not a realistic approach.