This document summarizes an algorithm for efficiently refining why-not questions on top-k queries. It begins by executing a top-k query and identifying a missing object m. It then samples potential replacements for m from a restricted sample space and computes m's new ranking if modified to each sample value. The refined query with the smallest penalty, which returns m in the results, is returned as the answer. The algorithm improves efficiency by skipping unnecessary ranking computations and is tested on a basketball player database, demonstrating effectiveness. It answers why-not questions faster than without optimization techniques.
Enhancing Keyword Query Results Over Database for Improving User Satisfaction ijmpict
Storing data in relational databases is widely increasing to support keyword queries but search results does not gives effective answers to keyword query and hence it is inflexible from user perspective. It would be helpful to recognize such type of queries which gives results with low ranking. Here we estimate prediction of query performance to find out effectiveness of a search performed in response to query and features of such hard queries is studied by taking into account contents of the database and result list. One relevant problem of database is the presence of missing data and it can be handled by imputation. Here an inTeractive Retrieving-Inferring data imputation method (TRIP) is used which achieves retrieving and inferring alternately to fill the missing attribute values in the database. So by considering both the prediction of hard queries and imputation over the database, we can get better keyword search results.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online
Not Good Enough but Try Again! Mitigating the Impact of Rejections on New Con...Aleksi Aaltonen
Presentation at the University of Miami on 3 December 2021 on how Stack Overflow improved the retention of new contributors whose initial question is rejected (closed) as substandard. The presentation is based on a paper coauthored with Sunil Wattal.
Enhancing Keyword Query Results Over Database for Improving User Satisfaction ijmpict
Storing data in relational databases is widely increasing to support keyword queries but search results does not gives effective answers to keyword query and hence it is inflexible from user perspective. It would be helpful to recognize such type of queries which gives results with low ranking. Here we estimate prediction of query performance to find out effectiveness of a search performed in response to query and features of such hard queries is studied by taking into account contents of the database and result list. One relevant problem of database is the presence of missing data and it can be handled by imputation. Here an inTeractive Retrieving-Inferring data imputation method (TRIP) is used which achieves retrieving and inferring alternately to fill the missing attribute values in the database. So by considering both the prediction of hard queries and imputation over the database, we can get better keyword search results.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
International Journal of Engineering and Science Invention (IJESI)inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online
Not Good Enough but Try Again! Mitigating the Impact of Rejections on New Con...Aleksi Aaltonen
Presentation at the University of Miami on 3 December 2021 on how Stack Overflow improved the retention of new contributors whose initial question is rejected (closed) as substandard. The presentation is based on a paper coauthored with Sunil Wattal.
bulk ieee projects in pondicherry,ieee projects in pondicherry,final year ieee projects in pondicherry
Nexgen Technology Address:
Nexgen Technology
No :66,4th cross,Venkata nagar,
Near SBI ATM,
Puducherry.
Email Id: praveen@nexgenproject.com.
www.nexgenproject.com
Mobile: 9751442511,9791938249
Telephone: 0413-2211159.
NEXGEN TECHNOLOGY as an efficient Software Training Center located at Pondicherry with IT Training on IEEE Projects in Android,IEEE IT B.Tech Student Projects, Android Projects Training with Placements Pondicherry, IEEE projects in pondicherry, final IEEE Projects in Pondicherry , MCA, BTech, BCA Projects in Pondicherry, Bulk IEEE PROJECTS IN Pondicherry.So far we have reached almost all engineering colleges located in Pondicherry and around 90km
Query Aware Determinization of Uncertain Objects1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
We are the company providing Complete Solution for all Academic Final Year/Semester Student Projects. Our projects are
suitable for B.E (CSE,IT,ECE,EEE), B.Tech (CSE,IT,ECE,EEE),M.Tech (CSE,IT,ECE,EEE) B.sc (IT & CSE), M.sc (IT & CSE),
MCA, and many more..... We are specialized on Java,Dot Net ,PHP & Andirod technologies. Each Project listed comes with
the following deliverable: 1. Project Abstract 2. Complete functional code 3. Complete Project report with diagrams 4.
Database 5. Screen-shots 6. Video File
SERVICE AT CLOUDTECHNOLOGIES
IEEE, WEB, WINDOWS PROJECTS ON DOT NET, JAVA& ANDROID TECHNOLOGIES,EMBEDDED SYSTEMS,MAT LAB,VLSI DESIGN.
ME, M-TECH PAPER PUBLISHING
COLLEGE TRAINING
Thanks&Regards
cloudtechnologies
# 304, Siri Towers,Behind Prime Hospitals
Maitrivanam, Ameerpet.
Contact:-8121953811,8522991105.040-65511811
cloudtechnologiesprojects@gmail.com
http://cloudstechnologies.in/
In March of 2015 I'm invited to give a presentation at Data Science program at the College of William and Mary: http://jxshix.people.wm.edu/Math410-2015/index.html. This talk is hence prepared to introduce data science to college students studying mathematics. Nonetheless I hope it is useful to a general public.
Architecture of an ontology based domain-specific natural language question a...IJwest
Question answering (QA) system aims at retrieving precise information from a large collection of
documents against a query. This paper describes the architecture of a Natural Language Question
Answering (NLQA) system for a specifi
c domain based on the ontological information, a step towards
semantic web question answering. The proposed architecture defines four basic modules suitable for
enhancing current QA capabilities with the ability of processing complex questions. The first m
odule was
the question processing, which analyses and classifies the question and also reformulates the user query.
The second module allows the process of retrieving the relevant documents. The next module processes the
retrieved documents, and the last m
odule performs the extraction and generation of a response. Natural
language processing techniques are used for processing the question and documents and also for answer
extraction. Ontology and domain knowledge are used for reformulating queries and ident
ifying the
relations. The aim of the system is to generate short and specific answer to the question that is asked in the
natural language in a specific domain. We have achieved 94 % accuracy of natural language question
answering in our implementation
Feature selection, optimization and clustering strategies of text documentsIJECEIAES
Clustering is one of the most researched areas of data mining applications in the contemporary literature. The need for efficient clustering is observed across wide sectors including consumer segmentation, categorization, shared filtering, document management, and indexing. The research of clustering task is to be performed prior to its adaptation in the text environment. Conventional approaches typically emphasized on the quantitative information where the selected features are numbers. Efforts also have been put forward for achieving efficient clustering in the context of categorical information where the selected features can assume nominal values. This manuscript presents an in-depth analysis of challenges of clustering in the text environment. Further, this paper also details prominent models proposed for clustering along with the pros and cons of each model. In addition, it also focuses on various latest developments in the clustering task in the social network and associated environments.
On the benefit of logic-based machine learning to learn pairwise comparisonsjournalBEEI
In recent years, many daily processes such as internet web searching, e-mail filter-ing, social media services, e-commerce have benefited from machine learning tech-niques (ML). The implementation of ML techniques has been largely focused on blackbox methods where the general conclusions are not easily interpretable. Hence, theelaboration with other declarative software models to identify the correctness and com-pleteness of the models is not easy to perform. On the other hand, the emerge of somelogic-based machine learning techniques with their advantage of white box approachhave been proven to be well-suited for many software engineering tasks. In this paper,we propose the use of a logic-based approach to learn user preference in the form ofpairwise comparisons. APARELL as a novel approach of inductive learning is able tomodel the user’s preferences in description logic representation. This offers a rich, re-lational representation which is then can be used to produce a set of recommendations.A user study has been performed in our experiment to evaluate the implementation ofpairwise preference recommender system when compared to a standard list interface.The result of the experiment shows that the pairwise interface was significantly betterthan the other interface in many ways.
Semantic Based Model for Text Document Clustering with IdiomsWaqas Tariq
Text document clustering has become an increasingly important problem in recent years because of the tremendous amount of unstructured data which is available in various forms in online forums such as the web, social networks, and other information networks. Clustering is a very powerful data mining technique to organize the large amount of information on the web. Traditionally, document clustering methods do not consider the semantic structure of the document. This paper addresses the task of developing an effective and efficient method to improve the semantic structure of the text documents. A method has been developed that performs the following: tag the documents for parsing, replacement of idioms with their original meaning, semantic weights calculation for document words and apply semantic grammar. The similarity measure is obtained between the documents and then the documents are clustered using Hierarchical clustering algorithm. The method adopted in this work is evaluated on different data sets with standard performance measures and the effectiveness of the method to develop in meaningful clusters has been proved.
professional fuzzy type-ahead rummage around in xml type-ahead search techni...Kumar Goud
Abstract – It is a research venture on the new information-access standard called type-ahead search, in which systems discover responds to a keyword query on-the-fly as users type in the uncertainty. In this paper we learn how to support fuzzy type-ahead search in XML. Underneath fuzzy search is important when users have limited knowledge about the exact representation of the entities they are looking for, such as people records in an online directory. We have developed and deployed several such systems, some of which have been used by many people on a daily basis. The systems received overwhelmingly positive feedbacks from users due to their friendly interfaces with the fuzzy-search feature. We describe the design and implementation of the systems, and demonstrate several such systems. We show that our efficient techniques can indeed allow this search paradigm to scale on large amounts of data.
Index Terms - type-ahead, large data set, server side, online directory, search technique.
Context Sensitive Search String Composition Algorithm using User Intention to...IJECEIAES
Finding the required URL among the first few result pages of a search engine is still a challenging task. This may require number of reformulations of the search string thus adversely affecting user's search time. Query ambiguity and polysemy are major reasons for not obtaining relevant results in the top few result pages. Efficient query composition and data organization are necessary for getting effective results. Context of the information need and the user intent may improve the autocomplete feature of existing search engines. This research proposes a Funnel Mesh-5 algorithm (FM5) to construct a search string taking into account context of information need and user intention with three main steps 1) Predict user intention with user profiles and the past searches via weighted mesh structure 2) Resolve ambiguity and polysemy of search strings with context and user intention 3) Generate a personalized disambiguated search string by query expansion encompassing user intention and predicted query. Experimental results for the proposed approach and a comparison with direct use of search engine are presented. A comparison of FM5 algorithm with K Nearest Neighbor algorithm for user intention identification is also presented. The proposed system provides better precision for search results for ambiguous search strings with improved identification of the user intention. Results are presented for English language dataset as well as Marathi (an Indian language) dataset of ambiguous search strings.
2. an efficient approach for web query preprocessing edit satIAESIJEECS
The emergence of the Web technology generated a massive amount of raw data by enabling Internet users to post their opinions, comments, and reviews on the web. To extract useful information from this raw data can be a very challenging task. Search engines play a critical role in these circumstances. User queries are becoming main issues for the search engines. Therefore a preprocessing operation is essential. In this paper, we present a framework for natural language preprocessing for efficient data retrieval and some of the required processing for effective retrieval such as elongated word handling, stop word removal, stemming, etc. This manuscript starts by building a manually annotated dataset and then takes the reader through the detailed steps of process. Experiments are conducted for special stages of this process to examine the accuracy of the system.
Convolutional recurrent neural network with template based representation for...IJECEIAES
Complex Question answering system is developed to answer different types of questions accurately. Initially the question from the natural language is transformed to an internal representation which captures the semantics and intent of the question. In the proposed work, internal representation is provided with templates instead of using synonyms or keywords. Then for each internal representation, it is mapped to relevant query against the knowledge base. In present work, the Template representation based Convolutional Recurrent Neural Network (T-CRNN) is proposed for selecting answer in Complex Question Answering (CQA) framework. Recurrent neural network is used to obtain the exact correlation between answers and questions and the semantic matching among the collection of answers. Initially, the process of learning is accomplished through Convolutional Neural Network (CNN) which represents the questions and answers separately. Then the representation with fixed length is produced for each question with the help of fully connected neural network. In order to design the semantic matching between the answers, the representation of Question Answer (QA) pair is given into the Recurrent Neural Network (RNN). Finally, for the given question, the correctly correlated answers are identified with the softmax classifier.
bulk ieee projects in pondicherry,ieee projects in pondicherry,final year ieee projects in pondicherry
Nexgen Technology Address:
Nexgen Technology
No :66,4th cross,Venkata nagar,
Near SBI ATM,
Puducherry.
Email Id: praveen@nexgenproject.com.
www.nexgenproject.com
Mobile: 9751442511,9791938249
Telephone: 0413-2211159.
NEXGEN TECHNOLOGY as an efficient Software Training Center located at Pondicherry with IT Training on IEEE Projects in Android,IEEE IT B.Tech Student Projects, Android Projects Training with Placements Pondicherry, IEEE projects in pondicherry, final IEEE Projects in Pondicherry , MCA, BTech, BCA Projects in Pondicherry, Bulk IEEE PROJECTS IN Pondicherry.So far we have reached almost all engineering colleges located in Pondicherry and around 90km
Query Aware Determinization of Uncertain Objects1crore projects
IEEE PROJECTS 2015
1 crore projects is a leading Guide for ieee Projects and real time projects Works Provider.
It has been provided Lot of Guidance for Thousands of Students & made them more beneficial in all Technology Training.
Dot Net
DOTNET Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
Java Project Domain list 2015
1. IEEE based on datamining and knowledge engineering
2. IEEE based on mobile computing
3. IEEE based on networking
4. IEEE based on Image processing
5. IEEE based on Multimedia
6. IEEE based on Network security
7. IEEE based on parallel and distributed systems
ECE IEEE Projects 2015
1. Matlab project
2. Ns2 project
3. Embedded project
4. Robotics project
Eligibility
Final Year students of
1. BSc (C.S)
2. BCA/B.E(C.S)
3. B.Tech IT
4. BE (C.S)
5. MSc (C.S)
6. MSc (IT)
7. MCA
8. MS (IT)
9. ME(ALL)
10. BE(ECE)(EEE)(E&I)
TECHNOLOGY USED AND FOR TRAINING IN
1. DOT NET
2. C sharp
3. ASP
4. VB
5. SQL SERVER
6. JAVA
7. J2EE
8. STRINGS
9. ORACLE
10. VB dotNET
11. EMBEDDED
12. MAT LAB
13. LAB VIEW
14. Multi Sim
CONTACT US
1 CRORE PROJECTS
Door No: 214/215,2nd Floor,
No. 172, Raahat Plaza, (Shopping Mall) ,Arcot Road, Vadapalani, Chennai,
Tamin Nadu, INDIA - 600 026
Email id: 1croreprojects@gmail.com
website:1croreprojects.com
Phone : +91 97518 00789 / +91 72999 51536
We are the company providing Complete Solution for all Academic Final Year/Semester Student Projects. Our projects are
suitable for B.E (CSE,IT,ECE,EEE), B.Tech (CSE,IT,ECE,EEE),M.Tech (CSE,IT,ECE,EEE) B.sc (IT & CSE), M.sc (IT & CSE),
MCA, and many more..... We are specialized on Java,Dot Net ,PHP & Andirod technologies. Each Project listed comes with
the following deliverable: 1. Project Abstract 2. Complete functional code 3. Complete Project report with diagrams 4.
Database 5. Screen-shots 6. Video File
SERVICE AT CLOUDTECHNOLOGIES
IEEE, WEB, WINDOWS PROJECTS ON DOT NET, JAVA& ANDROID TECHNOLOGIES,EMBEDDED SYSTEMS,MAT LAB,VLSI DESIGN.
ME, M-TECH PAPER PUBLISHING
COLLEGE TRAINING
Thanks&Regards
cloudtechnologies
# 304, Siri Towers,Behind Prime Hospitals
Maitrivanam, Ameerpet.
Contact:-8121953811,8522991105.040-65511811
cloudtechnologiesprojects@gmail.com
http://cloudstechnologies.in/
In March of 2015 I'm invited to give a presentation at Data Science program at the College of William and Mary: http://jxshix.people.wm.edu/Math410-2015/index.html. This talk is hence prepared to introduce data science to college students studying mathematics. Nonetheless I hope it is useful to a general public.
Architecture of an ontology based domain-specific natural language question a...IJwest
Question answering (QA) system aims at retrieving precise information from a large collection of
documents against a query. This paper describes the architecture of a Natural Language Question
Answering (NLQA) system for a specifi
c domain based on the ontological information, a step towards
semantic web question answering. The proposed architecture defines four basic modules suitable for
enhancing current QA capabilities with the ability of processing complex questions. The first m
odule was
the question processing, which analyses and classifies the question and also reformulates the user query.
The second module allows the process of retrieving the relevant documents. The next module processes the
retrieved documents, and the last m
odule performs the extraction and generation of a response. Natural
language processing techniques are used for processing the question and documents and also for answer
extraction. Ontology and domain knowledge are used for reformulating queries and ident
ifying the
relations. The aim of the system is to generate short and specific answer to the question that is asked in the
natural language in a specific domain. We have achieved 94 % accuracy of natural language question
answering in our implementation
Feature selection, optimization and clustering strategies of text documentsIJECEIAES
Clustering is one of the most researched areas of data mining applications in the contemporary literature. The need for efficient clustering is observed across wide sectors including consumer segmentation, categorization, shared filtering, document management, and indexing. The research of clustering task is to be performed prior to its adaptation in the text environment. Conventional approaches typically emphasized on the quantitative information where the selected features are numbers. Efforts also have been put forward for achieving efficient clustering in the context of categorical information where the selected features can assume nominal values. This manuscript presents an in-depth analysis of challenges of clustering in the text environment. Further, this paper also details prominent models proposed for clustering along with the pros and cons of each model. In addition, it also focuses on various latest developments in the clustering task in the social network and associated environments.
On the benefit of logic-based machine learning to learn pairwise comparisonsjournalBEEI
In recent years, many daily processes such as internet web searching, e-mail filter-ing, social media services, e-commerce have benefited from machine learning tech-niques (ML). The implementation of ML techniques has been largely focused on blackbox methods where the general conclusions are not easily interpretable. Hence, theelaboration with other declarative software models to identify the correctness and com-pleteness of the models is not easy to perform. On the other hand, the emerge of somelogic-based machine learning techniques with their advantage of white box approachhave been proven to be well-suited for many software engineering tasks. In this paper,we propose the use of a logic-based approach to learn user preference in the form ofpairwise comparisons. APARELL as a novel approach of inductive learning is able tomodel the user’s preferences in description logic representation. This offers a rich, re-lational representation which is then can be used to produce a set of recommendations.A user study has been performed in our experiment to evaluate the implementation ofpairwise preference recommender system when compared to a standard list interface.The result of the experiment shows that the pairwise interface was significantly betterthan the other interface in many ways.
Semantic Based Model for Text Document Clustering with IdiomsWaqas Tariq
Text document clustering has become an increasingly important problem in recent years because of the tremendous amount of unstructured data which is available in various forms in online forums such as the web, social networks, and other information networks. Clustering is a very powerful data mining technique to organize the large amount of information on the web. Traditionally, document clustering methods do not consider the semantic structure of the document. This paper addresses the task of developing an effective and efficient method to improve the semantic structure of the text documents. A method has been developed that performs the following: tag the documents for parsing, replacement of idioms with their original meaning, semantic weights calculation for document words and apply semantic grammar. The similarity measure is obtained between the documents and then the documents are clustered using Hierarchical clustering algorithm. The method adopted in this work is evaluated on different data sets with standard performance measures and the effectiveness of the method to develop in meaningful clusters has been proved.
professional fuzzy type-ahead rummage around in xml type-ahead search techni...Kumar Goud
Abstract – It is a research venture on the new information-access standard called type-ahead search, in which systems discover responds to a keyword query on-the-fly as users type in the uncertainty. In this paper we learn how to support fuzzy type-ahead search in XML. Underneath fuzzy search is important when users have limited knowledge about the exact representation of the entities they are looking for, such as people records in an online directory. We have developed and deployed several such systems, some of which have been used by many people on a daily basis. The systems received overwhelmingly positive feedbacks from users due to their friendly interfaces with the fuzzy-search feature. We describe the design and implementation of the systems, and demonstrate several such systems. We show that our efficient techniques can indeed allow this search paradigm to scale on large amounts of data.
Index Terms - type-ahead, large data set, server side, online directory, search technique.
Context Sensitive Search String Composition Algorithm using User Intention to...IJECEIAES
Finding the required URL among the first few result pages of a search engine is still a challenging task. This may require number of reformulations of the search string thus adversely affecting user's search time. Query ambiguity and polysemy are major reasons for not obtaining relevant results in the top few result pages. Efficient query composition and data organization are necessary for getting effective results. Context of the information need and the user intent may improve the autocomplete feature of existing search engines. This research proposes a Funnel Mesh-5 algorithm (FM5) to construct a search string taking into account context of information need and user intention with three main steps 1) Predict user intention with user profiles and the past searches via weighted mesh structure 2) Resolve ambiguity and polysemy of search strings with context and user intention 3) Generate a personalized disambiguated search string by query expansion encompassing user intention and predicted query. Experimental results for the proposed approach and a comparison with direct use of search engine are presented. A comparison of FM5 algorithm with K Nearest Neighbor algorithm for user intention identification is also presented. The proposed system provides better precision for search results for ambiguous search strings with improved identification of the user intention. Results are presented for English language dataset as well as Marathi (an Indian language) dataset of ambiguous search strings.
2. an efficient approach for web query preprocessing edit satIAESIJEECS
The emergence of the Web technology generated a massive amount of raw data by enabling Internet users to post their opinions, comments, and reviews on the web. To extract useful information from this raw data can be a very challenging task. Search engines play a critical role in these circumstances. User queries are becoming main issues for the search engines. Therefore a preprocessing operation is essential. In this paper, we present a framework for natural language preprocessing for efficient data retrieval and some of the required processing for effective retrieval such as elongated word handling, stop word removal, stemming, etc. This manuscript starts by building a manually annotated dataset and then takes the reader through the detailed steps of process. Experiments are conducted for special stages of this process to examine the accuracy of the system.
Convolutional recurrent neural network with template based representation for...IJECEIAES
Complex Question answering system is developed to answer different types of questions accurately. Initially the question from the natural language is transformed to an internal representation which captures the semantics and intent of the question. In the proposed work, internal representation is provided with templates instead of using synonyms or keywords. Then for each internal representation, it is mapped to relevant query against the knowledge base. In present work, the Template representation based Convolutional Recurrent Neural Network (T-CRNN) is proposed for selecting answer in Complex Question Answering (CQA) framework. Recurrent neural network is used to obtain the exact correlation between answers and questions and the semantic matching among the collection of answers. Initially, the process of learning is accomplished through Convolutional Neural Network (CNN) which represents the questions and answers separately. Then the representation with fixed length is produced for each question with the help of fully connected neural network. In order to design the semantic matching between the answers, the representation of Question Answer (QA) pair is given into the Recurrent Neural Network (RNN). Finally, for the given question, the correctly correlated answers are identified with the softmax classifier.
IOSR Journal of Applied Physics (IOSR-JAP) is an open access international journal that provides rapid publication (within a month) of articles in all areas of physics and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in applied physics. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
International Journal of Engineering and Science Invention (IJESI) inventionjournals
International Journal of Engineering and Science Invention (IJESI) is an international journal intended for professionals and researchers in all fields of computer science and electronics. IJESI publishes research articles and reviews within the whole field Engineering Science and Technology, new teaching methods, assessment, validation and the impact of new technologies and it will continue to provide information on the latest trends and developments in this ever-expanding subject. The publications of papers are selected through double peer reviewed to ensure originality, relevance, and readability. The articles published in our journal can be accessed online
Application of hidden markov model in question answering systemsijcsa
By the increase of the volume of the saved information on web, Question Answering (QA) systems have been very important for Information Retrieval (IR). QA systems are a specialized form of information retrieval. Given a collection of documents, a Question Answering system attempts to retrieve correct answers to questions posed in natural language. Web QA system is a sample of QA systems that in this system answers retrieval from web environment doing. In contrast to the databases, the saved information on web does not follow a distinct structure and are not generally defined. Web QA systems is the task of automatically answering a question posed in Natural Language Processing (NLP). NLP techniques are used in applications that make queries to databases, extract information from text, retrieve relevant documents from a collection, translate from one language to another, generate text responses, or recognize spoken words converting them into text. To find the needed information on a mass of the non-structured information we have to use techniques in which the accuracy and retrieval factors are implemented well. In this paper in order to well IR in web environment, The QA system in designed and also implemented based on the Hidden Markov Model (HMM)
Pattern based approach for Natural Language Interface to DatabaseIJERA Editor
Natural Language Interface to Database (NLIDB) is an interesting and widely applicable research field. As the name suggests an NLIDB allows a naive user to ask query to database in natural language. This paper presents an NLIDB namely Pattern based Natural Language Interface to Database (PBNLIDB) in which patterns for simple query, aggregate function, relational operator, short-circuit logical operator and join are defined. The patterns are categorized into valid and invalid. Valid patterns are directly used to translate natural language query into Structured Query Language (SQL) query whereas an invalid pattern assists the query authoring service in generating options for user so that the query could be framed correctly. The system takes an English language query as input, recognizes pattern in the query, selects one of the before mentioned features of SQL based on the pattern, prepares an SQL statement, fires it on database and displays the result.
With quick advancement of investigative databases and web data databases are turning out to be exceptionally colossal in size and complex in nature. These databases hold extensive and heterogeneous information, with huge number of relations and qualities. So it is exceptionally hard to outline an arrangement of static inquiry structures to answer different specially appointed database inquirieson these cutting edge databases. Along these lines there is need of such framework which create Query Forms powerfully as indicated by the clients need at run time. The proposed framework Dynamic Query Form i.e.DQF framework going to give an answer by the inquiry interface in extensive and complex databases. In proposed framework, the center idea is to catch client intrigues all through client associations and to adjust the inquiry sort iteratively. Each cycle comprises of 2 sorts of client collaborations: Query Form Enrichment and Query Execution. In Query Form Enrichment DQF would prescribe a positioned rundown of question structure part to client so he/she can choose sought structure segments into current inquiry structure. In Query Execution client fills current inquiry shape and submit question, DQF going to show result and take input from client on gave question results. A client would have office to fill the inquiry frame and submit questions to see the inquiry result at every cycle. So that a question structure could be progressively refined till the client fulfills with the inquiry results.
This presentation discusses the following topics:
Introduction to Query Processing
Need for Query processing
Architecture of Query Processing
Query Processing Steps
Phases in a typical query processing
Represented in relational structures
Translating SQL Queries into Relational Algebra
Query Optimization
Importance of Query Optimization
Actions of Query Optimization
Dear students get fully solved assignments
Send your semester & Specialization name to our mail id :
“ help.mbaassignments@gmail.com ”
or
Call us at : 08263069601
In this research work we have develop a new scoring mathematical model that works on the five types of questions. The question text failures are first extracted and a score is found based on its structure with respect to its template structure and then answer score is calculated again the question as well as paragraph. Text to finally reach at the index of the most probable answer with respect to question.
Top 30 Data Analyst Interview Questions.pdfShaikSikindar1
Data Analytics has emerged has one of the central aspects of business operations. Consequently, the quest to grab professional positions within the Data Analytics domain has assumed unimaginable proportions. So if you too happen to be someone who is desirous of making through a Data Analyst .
Profile Analysis of Users in Data Analytics DomainDrjabez
Data Analytics and Data Science is in the fast forward
mode recently. We see a lot of companies hiring people for data
analysis and data science, especially in India. Also, many
recruiting firms use stackoverflow to fish their potential
candidates. The industry has also started to recruit people based
on the shapes of expertise. Expertise of a personal is
metaphorically outlined by shapes of letters like I, T, M and
hyphen betting on her experiencein a section (depth) and
therefore the variety of areas of interest (width).This proposal
builds upon the work of mining shapes of user expertise in a
typical online social Question and Answer (Q&A) community
where expert users often answer questions posed by other
users.We have dealt with the temporal analysis of the expertise
among the Q&A community users in terms how the user/ expert
have evolved over time.
Keywords— Shapes of expertise, Graph communities, Expertise
evolution, Q&A community
Question Answering has been a well-researched NLP area over recent years. It has become necessary for
users to be able to query through the variety of information available - be it structured or unstructured. In
this paper, we propose a Question Answering module which a) can consume a variety of data formats - a
heterogeneous data pipeline, which ingests data from product manuals, technical data forums, internal
discussion forums, groups, etc. b) addresses practical challenges faced in real-life situations by pointing to
the exact segment of the manual or chat threads which can solve a user query c) provides segments of texts
when deemed relevant, based on user query and business context. Our solution provides a comprehensive
and detailed pipeline that is composed of elaborate data ingestion, data parsing, indexing, and querying
modules. Our solution is capable of handling a plethora of data sources such as text, images, tables,
community forums, and flow charts. Our studies performed on a variety of business-specific datasets
represent the necessity of custom pipelines like the proposed one to solve several real-world document
question-answering
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Let's dive deeper into the world of ODC! Ricardo Alves (OutSystems) will join us to tell all about the new Data Fabric. After that, Sezen de Bruijn (OutSystems) will get into the details on how to best design a sturdy architecture within ODC.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
B017350710
1. IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 3, Ver. V (May – Jun. 2015), PP 07-10
www.iosrjournals.org
DOI: 10.9790/0661-17350710 www.iosrjournals.org 7 | Page
Efficient Refining Of Why-Not Questions on Top-K Queries
P. Haripriya1
, J. Jegan Amarnath2
1
P.G student, Sri Sairam Engineering College, Chennai.
2
Assistant Professor, Sri Sairam Engineering College, Chennai,
Abstract: After decades of effort working on database performance, the quality and the usability of database
systems have received more attention in recent years. In particular, answering the why-not questions after a
search is made has become more important. In this project, the problem of answering why-not questions on top-
k queries and refining the user query is solved. Generally many users love to pose those kinds of queries when
they are making multi-criteria decisions. However, they would also want to know why their expected answers
do not show up in the query results. The different algorithms are developed to answer such why-not questions
efficiently. Top-K dominating questions are those which have more than one or two results. When this case
occurs, the result is ordered according to highest ranking among the records. A search is made and result is
displayed, if the expected tuple does not appear then user raises a why-not query. This query is refined using
algorithm and then the result is calculated. A penalty function is added such that the result can be returned
efficiently and without any fault.
Keyword: why-not questions, Top-K and Dominating queries, penalty.
I. Introduction
Database technology has made great strides in the past decades. Today, we are able to process ever
larger numbers of ever more complex queries on ever more humongous data sets efficiently. Internet search
engines have popularized keyword based search. Users provide keywords to the user interface and a ranked list
of documents is displayed to the user. A why-not question is being posed when a user wants to know why her
expected tuples do not show up in the query result. A certain effort has worked on answering why-not
questions on traditional relational or the SQL queries. But none of those can answer why-not questions on
preference queries like top-k queries yet. Answering the why-not questions gives the purpose of using the data
mining algorithms. The main goal is to find a refined top-k query that include non-empty set of missing objects
and the user’s initial query. A non-empty set includes all the contents and the set is found to be not empty.
When the user provides a query with the count or search for a top-k query then the process analyses and
produces the result without the non-empty set.
For example, a user of DBLife may be surprised to find out that the system believes that a person was
not on the program committee of conference of another. In fact he may have actually been on the program
committee, but this fact does not appear in the extracted data, perhaps due to bugs in extractors, or in accuracies
in sources, or incomplete coverage of sources. Therefore, it is important to help developers debug the system
and to help users understand why they got the result they did. On the other hand, if the result really shouldn’t be
in the result, it is a must to explain to the user why this is the case so that they can gain confidence in the non-
answer. [1][2].Top-k dominating queries, or just dominating queries, is a form of top-k query that users may
pose why-not questions on. While a top-k dominating query frees users from specifying the set of weightings by
ranking the objects based on the number of (other) objects that they could dominate.
Both the why-not and Dominating Top-k queries are explained with two algorithms to provide the
reason for the missing records. The main goals can be explained as the problem formulation, the problem
analysis, and the algorithms of answering why-not questions on top-k queries and dominating queries. Also
given thing is that there are an infinite number of points (weightings) in the weighting space we should put
limited amount of the records into S in order to obtain a good approximation of the answer. Searching for the
particular keyword through traditional information retrieval techniques for enabling keyword search in
document collections use data structures such as inverted lists that efficiently identify documents containing a
query keyword is another method. A straight forward mapping of this idea to databases is a symbol table that
stores information at row level granularity that is we keep the list of rows that contains the keyword. Alternative
symbol table designs are possible where we can leverage the physical design of the database. For example, if a
column has an index then we only need column level granularity. For this purpose we only store the list of
columns for each keyword where they occur.
2. Efficient Refining Of Why-Not Questions On Top-K Queries
DOI: 10.9790/0661-17350710 www.iosrjournals.org 8 | Page
II. Related Work
The concept of why-not is first discussed in [6]. This work answers a user’s why-not question on
Select-Project-Join(SPJ) queries by telling her which query operator(s) eliminated her desired answers.
After that, this line of work is gradually expanded. In [7] and [8], the missing answers of SPJ [7] and
SPJUA (SPJ + Union + Aggregation) queries are explained by a data-refinement approach, i.e., it tells the user
how the data should be modified if user wants the missing answer back to the result.
Answering why-not for a Top-k query was explained by Zhian He and Eric Lo [1]. An algorithm is
discussed for answering the queries posed by the user on a Top-K query. Also a defined dimensional disk space
has been provided for the records or the tuples on the Top-k rankings and positions. The main goal is about the
basic top-k query where users need to specify the set of weightings and the query where users do not need to
specify the set of weightings because the ranking function ranks an object higher if it can dominate more
objects. The target is focussed mainly to give an explanation to a user who is wondering why her expected
answers are missing in the query result. Since the problems are non-identical, a different explanation models for
top-k queries and top-k dominating queries is given.
Islam, M.S.,Rui Zhou, Chengfei Liu has proposed a method for answering the why-not question on
Reverse Skyline queries. This query recovers all data points whose dynamic skylines contain the query point.
The benefit and the semantics of answering why-not questions in reverse skyline queries are defined. A
technique to modify the why-not point and the query point to include the why-not point in the reverse skyline of
the query point is given. This point can be placed anywhere within a region safely without losing any of
the existing reverse skyline points. Considering the safe region of the query point answering a why-not question
is done. The procedure also efficiently combines both query point and data point modification techniques to
produce meaningful answers.
Vermeulen, Vanderhulst, Luyten, Coninx, Karin started a method of answering the why-not question
through the pervasive crystal. The condition becomes distressed when they are unable to understand and control
a pervasive computing environment. Also the other works have shown that allowing users to pose why and why
not questions about context-aware applications resulted in better understanding and stronger feelings of trust.
Though why-not questions have been used before to aid in debugging and to clarify graphical user interfaces, it
is not clear how they can be integrated into pervasive computing systems. In existing framework with support
for why and why-not question is extended for the search of missing keywords. So a new method called
Pervasive Crystal which is a system for asking and answering why and why-not questions in pervasive
computing environments was derived.
Islam M.S has proposed a related process where a database is efficiently used in this process without
wasting the sample space. There is a growing interest in allowing users to ask questions on received results in
the hope of improving the usability of database systems. Islam M.S. has proposed this approach which aims
at answering the so called why and why-not questions on received results with respect to different query
settings in databases. The goals of this research can be explained as studying the problem of answering the why
and the why-not questions in relative databases, explain the efficient strategies for answering these questions in
terms of different settings and finally developing a framework that can take advantage of the existing data
indexing and query evaluation techniques for the purpose of answering such questions in the databases. The
progressed research work contributes completely towards improving the usability of traditional database
systems. The similarity between the current and the related work is that an algorithm for refining those why-not
questions is given and it is efficient in time. Analyzing and answering a dominating Top-k query is not an easy
task and it is also solved by giving different weightings to the set of the records.
III. Problem And Analysis
A table is called a trusted table if it is assumed to be correct and complete, so we do not have to
consider updates or insertions to it when computing the provenance of non-answers. An attribute is called a
trusted attribute if its values in existing tuples are correct and therefore updates to them can be ignored. But that
new values can appear in trusted attributes when new tuples are inserted. The user must either choose to trust
tables or individual attributes that appear in a database else the corresponding objects. In a database, each object
p with d attribute values can be represented as a point p = |p[1] p[2] ...p[d]| in a d-dimensional data space Rd.
Now we assume that all attribute values are numeric and a smaller value means a better score for simplicity. A
top-k query is composed of a scoring function which gives a result set size r, and a weighting vector w = w [1]
w[2]. The scoring function as any monotonic function is accepted and the weighting space subject to the
constraints w[i] = 1 and 0 ≤ w[i] ≤ 1 is assumed. The query result would then be a set of k objects whose scores
are the smallest.
The penalty function is developed in case of interruption of the process. Nevertheless, the solution
works for all kinds of monotonic penalty functions. A technique to skip many of those progressive top-k
operations so as to improve the algorithm’s efficiency is also presented. A much more aggressive and effective
3. Efficient Refining Of Why-Not Questions On Top-K Queries
DOI: 10.9790/0661-17350710 www.iosrjournals.org 9 | Page
stopping condition that makes most of those operations stop is also presented. Two techniques together can
significantly reduce the overall running time of the algorithm. Since general QP solver requires the solution
space be convex, first divide Wri into Cnj −1. Each convex coordinate corresponds to a quadratic programming
problem. After solving all these quadratic programming problems, the best wri would be identified. For all
rankings to be considered there are n+1 j=1 Cnj −1 = 2n (n is the number of incomparable objects with m)
quadratic programming problems in the worst case.
An approach for finding out the multiple missing objects is proposed. The main goal is considered
through varying the data size, query dimension, count or the number of missing objects, performance. A
sampling-based algorithm that finds the best approximate answer is proposed. A progressive top-k query q
based on the weighting vector w in the user’s original query q is posed using any progressive top-k query
evaluation algorithm, and stop when m comes forth to the result set with a ranking ro. If m does not appear in
the query result, then report to the user that m does not exist in the database and the process terminates. If m
exists in the database, then randomly sample a list of weighting vectors S = [w1, w2 . . . ws] from the weighting
space.
Fig: 1. Restricted Sample space
IV. Answering Dominating Top-K Query
The basic idea for refining why-not top-k dominating queries is similar to the idea of answering top-k
why-not questions. First in the case where there is only one missing object m execute a top-k dominating query
q_o using a progressive top-k dominating query evaluation algorithm and stop when m comes forth to the result
set with a ranking ro. If m does not appear in the query result, inform the user that m does not exist in the
database and the process terminates. tie at rank k-th, only one of them is returned). Initially, a user poses a top-k
dominating query qo(ko).After she gets the result, she may pose a why-not question on qo with a set of missing
objects M = {m1, . . . , mj}. By using only the query-refinement approach here, we can only modify the value of
k in order to make M appear in the result. That may result in a refined query whose k’s value is increased
significantly if there are some missing objects that are actually dominated by many points. As such, we also use
the data-refinement approach [7], [8] here. That is, we may either adjust the value of k, the values of m1, . . . ,
mj, or both.
Now, formally, the problem is: Given a why-not question {M, qo(ko)}, where M is a non-empty set of
missing objects, qo(ko) is the user’s initial top-k dominating query, the goal is to find a new value k_ and a
value replacement M_ for M, such that all the objects in M_ appear in the result of refined dominating query
q(k) with the smallest penalty based on the weightings.
V. Algorithm
[PHASE-1] The algorithm first executes a progressive top-k dominating query evaluation algorithm to
locate the list of objects, together with their scores, and the list is given as L with rank 1, 2, 3,... until the
missing object m shows up in the result in rank rth. Now denote that operation as (L, ro) =
DOMINATING(UNTIL-SEE- _m). After that, it samples data values _x1, _x2, . . . , _xs from the restricted
sample space Rs and adds them into S.
[PHASE-2] Next, for some data value sample _xi of S, modify m’s values to be xi and then determine
the ranking ri of m after the value modification. Note that the ranking ri basically can be determined by
executing a progressive top-k dominating algorithm once again on the database. The Technique is given as (a)
below to illustrate a much efficient way to determine the ranking ri, without actually invoking the progressive
top-k dominating algorithm. Therefore the technique in why-not top-k processing can be applied here to skip
ranking calculations for some data value samples.
4. Efficient Refining Of Why-Not Questions On Top-K Queries
DOI: 10.9790/0661-17350710 www.iosrjournals.org 10 | Page
[PHASE-3] After PHASE-2, we should have s + 1 ―refined queries and modified values‖ pairs:
<q>o(ro), m = m,<q>1(r1), m = <x1>, . . . , <q>s+1(rs+1), m = <xs+1>. The pair with the least penalty is
returned to the user as the answer.
Technique —Efficient ranking computation for a sample point
This method describes how to efficiently compute the ranking ri of m if setting m’s value to _xi. First,
we compute the new score of m which is the number of objects dominated by m, when its values equal to
sample _xi. This technique can be easily done by any skyline-related algorithm or by posing a simple range
query on an R-tree. Next update the scores of all objects in L (stored in PHASE-1) as the value of m is changed
to _xi. do not update the scores of objects not in L because they were either dominated by _m or incomparable
with m. So, their scores would not get changed. For the objects in L that do not dominate _m, their scores are
unchanged because if they did not dominate _m before, they also cannot dominate m now (because _m gets a
better value _xi). Only for those objects in L that dominate _m, we check whether every such object dominates
_xi (which is _m’s new value), if yes, its score is unchanged; otherwise its score is reduced by one. With all the
updated scores in place, we can easily determine the new ranking ri of _m. We represent this operation as:
ri = COMPUTE-RANK( _m, _xi).
A case study is done for a NBA database selecting top-k players for center, guard and other positions.
The search is made according to their rankings and the why-not questions posed by the users are answered
respectively based on their weightings. The technique proposed in the algorithm is tested and experimented with
a sample database containing the records. The effectiveness of techniques are also very promising. Without
using any optimization technique, the algorithm requires about 1500 seconds and 400 seconds on uniform
dataset and anti-correlated dataset, respectively. But when optimization techniques are enabled, the algorithm
runs about two orders of magnitude faster — it requires only about 10 seconds and 2 seconds on uniform
dataset and anti-correlated dataset, respectively.
VI. Conclusion
The refining of why-not questions on Top-k queries is studied. There are different techniques for
answering a why-not questions but this algorithm helps users to get a efficient answer to their questions. While
a search is made the why-not query must be refined such that the errors will be avoided at an initial state. The
basic top-k query where users need to specify the set of weightings, and the top-k dominating query where users
do not need to specify the set of weightings because the ranking function ranks an object higher if it can
dominate more objects. The target is to give an explanation to a user who is wondering why her expected
answers are missing in the query result. Since the problems are different, so a different explanation models for
top-k queries and top-k dominating queries is used . For the former, the user gets a refined query with
approximately minimal changes to the k value and their weightings. For the latter, user gets a refined query
with approximately minimal changes to the k value and the missing objects’ data values. In the future work the
case of non-numeric attributes will be studied.
References
[1]. Sanjay Agrawal, Surajit Chaudhuri, Gautam Das, ―DBXplorer: A System for Keyword-Based Search over Relational Databases‖
2010.
[2]. Zhian He, Eric Lo, ―Answering Why-Not Questions on Top-K Queries‖, IEEE transactions on knowledge and data engineering,
vol. 26, no. 6, june 2014.
[3]. Islam , M.S ; Rui Zhou ; Chengfei Liu, ―On answering why-not questions in reverse skyline queries‖, IEEE transactions on Mining
techniques, April 2013.
[4]. Islam, M.S., ―On answering why and why-not questions in Databases‖, Data Engineering Workshop (ICDEW),IEEE international
conference,2013.
[5]. Vermeulen,J; Vanderhulst, G. ; Luyten,K. ; Conix, karin ,― PervasiveCrystal: Asking and Answering Why and Why Not
Questions about Pervasive Computing Applications‖, IEEE Conference publications, 2010.
[6]. Jiajun Gu ; Kitagawa, H. ,― Extending Keyword Search to Metadata on Relational Databases‖ , Information-Explosion and Next
Generation search (IENGS),. 2008.
[7]. Melanie Herschel, MauricioA.Hern´andez, ―Explaining Missing Answers to SPJUA Queries‖, IEEE conference publications, 2008
[8]. E. Tiakas, A. N. Papadopoulos, and Y. Manolopoulos, ―Progressive processing of subspace dominating queries,‖ VLDB J., vol. 20,
no. 6, pp. 921–948, 2011.
[9]. A. Vlachou, C. Doulkeridis, Y. Kotidis, and K. Nørvåg, ―Reverse top-k queries,‖ in Proc. ICDE, Long Beach, CA, USA, 2010, pp.
365–376.
[10]. M. L. Yiu and N. Mamoulis, ―Efficent processing of top-k dominating queries on multi-dimensional data,‖ in Proc. VLDB, Vienna,
Austria, 2007, pp. 541–552.
[11]. S. Borzsonyi,D. Kossmann, and K. Stocker, ―The skyline operator,‖ACM Trans. Database Syst., vol. 25, no. 2, pp. 129–178, 2000.
[12]. A. Motro, ―Query generalization: A method for interpreting null answers,‖ in Proc. Expert Database Workshop, 1984, pp. 597–616.