The document describes a system that can automatically generate SQL queries from natural language input. It discusses how the system works in multiple phases: it first acquires text input, then analyzes the text to comprehend it and extract necessary information like table names and conditions. It then generates the SQL queries based on this extracted information and predefined rules. The system was tested on sample queries and showed 80-90% accuracy in generating simple and complex queries for operations like select, insert and delete. The authors note that the algorithms could still be improved to push accuracy above 85% and to handle more types of queries.
IRJET- Querying Database using Natural Language Interface | IRJET Journal
This document presents a proposed natural language interface system to allow users to query a database using English queries instead of SQL. The system aims to make database access easier for non-technical users. It discusses the architecture of the system, which includes modules for natural language processing, query translation to SQL, and speech conversion. It also reviews related work and discusses advantages and disadvantages of natural language interfaces for databases. The proposed system uses techniques like tokenization, parsing, and semantic analysis to understand queries and map them to equivalent SQL queries to retrieve results from the database.
This document is a mini project report submitted in partial fulfillment of the requirements for a Bachelor of Technology degree in Computer Science and Engineering. It describes a project to create a "College Phone Book" application, with the goal of storing contact information for students and faculty at the college. The report includes sections on introduction, literature survey, requirements analysis, implementation, system design, coding, system testing, screenshots, limitations and future enhancements, and conclusion. It was created by four students under the guidance of an associate professor.
Intelligent query converter a domain independent interface for conversion | IAEME Publication
This document describes an "Intelligent Query Converter" (IQC) system that converts natural language queries in English to SQL queries. IQC uses semantic matching techniques and WordNet to match queries to the database schema without requiring domain-specific configuration. The system classifies queries based on the presence of value keywords and conjunctive clauses. Experiments show IQC correctly answered 64.5-87.9% of queries for different databases, outperforming Microsoft's English Query system, which answered 29.0-51.8% correctly. IQC provides a domain-independent natural language interface to databases using semantic analysis techniques.
This presentation summarizes a simple phone book program using linked lists and file handling data structures. Key points include:
- Linked lists and files were used to store and manage contact data in a dynamic way without pre-allocating memory.
- Files allow data to be stored in a non-volatile, reusable form that is portable between systems.
- The program includes functions for loading data from a file into a linked list, validating user input, adding/finding/modifying/deleting records, and writing the linked list data back to the file.
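The functions listed above can be sketched as follows. This is a minimal illustration, not the presentation's actual program: the `Node` and `PhoneBook` names, the comma-separated file layout, and the digits-only validation rule are all assumptions made for the example.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One contact in a singly linked list (hypothetical layout)."""
    name: str
    phone: str
    next: Optional["Node"] = None


class PhoneBook:
    def __init__(self) -> None:
        self.head: Optional[Node] = None

    def add(self, name: str, phone: str) -> None:
        # Assumed validation rule: non-empty name, digits-only phone number.
        if not name or not phone.isdigit():
            raise ValueError("invalid contact")
        node = Node(name, phone)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:  # walk to the tail and append
            cur = cur.next
        cur.next = node

    def find(self, name: str) -> Optional[Node]:
        cur = self.head
        while cur is not None:
            if cur.name == name:
                return cur
            cur = cur.next
        return None

    def delete(self, name: str) -> bool:
        prev, cur = None, self.head
        while cur is not None:
            if cur.name == name:
                if prev is None:
                    self.head = cur.next  # removing the first node
                else:
                    prev.next = cur.next  # unlink an interior or last node
                return True
            prev, cur = cur, cur.next
        return False

    def save(self, path: str) -> None:
        # Write one "name,phone" line per node (assumed file format).
        with open(path, "w", encoding="utf-8") as f:
            cur = self.head
            while cur is not None:
                f.write(f"{cur.name},{cur.phone}\n")
                cur = cur.next

    @classmethod
    def load(cls, path: str) -> "PhoneBook":
        book = cls()
        with open(path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if line:
                    name, phone = line.rsplit(",", 1)
                    book.add(name, phone)
        return book
```

Because nodes are allocated one at a time, the list grows only as contacts are added, which is the "no pre-allocated memory" property the summary mentions; the file gives the list a persistent form between runs.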
The document provides information about EduProz, an education center that offers distance learning courses for management and IT streams. It offers classroom facilities free of cost with experienced faculty and infrastructure to ensure a comfortable learning environment. EduProz helps students prepare for interviews and land jobs through its career counseling and placement cells. It is conveniently located in Dwarka, Delhi near metro stations.
Domain Specific Terminology Extraction (ICICT 2006) | IT Industry
Imran Sarwar Bajwa, M. Imran Siddique, M. Abbas Choudhary, [2006], "Automatic Domain Specific Terminology Extraction using a Decision Support System", in IEEE 4th International Conference on Information and Communication Technology (ICICT 2006), Cairo, Egypt. pp:651-659
INTELLIGENT-MULTIDIMENSIONAL-DATABASE-INTERFACE | Mohamed Reda
The document describes an intelligent multidimensional database interface system that allows users to query the database using natural language instead of SQL. The system works by parsing the user's natural language query, filling a semantic dictionary with words from the query and a lexical dictionary with terms from the database schema. It then maps words between the two dictionaries to generate a SQL query, which is executed on the database to return results to the user. The system aims to provide a more user-friendly search experience for non-expert users compared to traditional SQL queries.
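The dictionary-mapping idea can be illustrated with a small sketch. It assumes a toy schema with one `students` table; the `LEXICAL` dictionary, the `build_sql` function, and the restriction to plain `SELECT` statements are hypothetical simplifications, not the system described in the document.

```python
import re

# Hypothetical lexical dictionary: words that name schema elements,
# mapped to (kind, SQL identifier). A real system would derive this
# from the database schema.
LEXICAL = {
    "students": ("table", "students"),
    "student": ("table", "students"),
    "name": ("column", "name"),
    "names": ("column", "name"),
    "city": ("column", "city"),
    "age": ("column", "age"),
}


def build_sql(question: str) -> str:
    # Semantic dictionary: the words that occur in the user's question.
    words = re.findall(r"[a-z]+", question.lower())
    tables, columns = [], []
    for word in words:
        kind, ident = LEXICAL.get(word, (None, None))
        if kind == "table" and ident not in tables:
            tables.append(ident)
        elif kind == "column" and ident not in columns:
            columns.append(ident)
    if not tables:
        raise ValueError("no table recognised in question")
    select = ", ".join(columns) if columns else "*"
    return f"SELECT {select} FROM {tables[0]}"


print(build_sql("Show the name and city of all students"))
```

Words with no entry in the lexical dictionary ("show", "the", "all") are simply ignored, so only identifiers taken from the dictionary ever reach the generated SQL.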
The document is a curriculum vitae for Md. Hasanuzzaman that outlines his professional experience, education, skills, and personal details. It summarizes his experience working in IT roles for several companies, including as an IT officer and senior officer, and lists his educational background including a BSc in computer science and engineering. It also provides details on his computer skills, training, extracurricular activities, and references.
NL based Object Oriented modeling - EJSR 35(1) | IT Industry
Imran Sarwar Bajwa, Shahzad Mumtaz, Ali Samad [2009], "Object Oriented Software Modeling using NLP Based Knowledge Extraction", European Journal of Scientific Research, Aug 2009, Vol. 35 No. 01, pp:22-33
Automated Java Code Generation (ICDIM 2006) | IT Industry
Rule based Production Systems for Automatic Code Generation in Java
This document describes a system that can automatically generate UML diagrams and code from natural language requirements. It analyzes text written in English to extract objects, classes, attributes and methods. It then generates UML class, activity, sequence and use case diagrams based on the analysis. Finally, it produces code in languages like Java, C# and VB.NET corresponding to the diagrams. The system aims to save time and costs by automating modeling and a significant portion of coding based on input requirements. It provides a more efficient alternative to traditional manual modeling and coding approaches.
This document provides guidance on how to become a good software engineer. It discusses what programming is, different types of computer languages, popular programming majors, and how to learn software programming. The document recommends starting with online courses to learn programming basics and the integrated development environment. It also advises supplementing courses with books for a more comprehensive understanding and to develop an open mind. Community websites are recommended for discussions and problem solving help. The overall guidance is that both courses and books are important for learning, but courses are best to start as a beginner.
The document describes a Java programming assignment that involves designing and implementing a student record keeping system. It includes:
1) An introduction to Java and how it has become a popular programming language due to its flexibility across desktop and server environments.
2) Details on designing classes like Person, Player, and Goalkeeper to represent different types of students as part of the record keeping system.
3) Implementation of the design in Java, including defining relationships between objects, implementing behaviors and error handling, and using an integrated development environment.
4) Testing and documenting the Java solution, including creating user and technical documentation.
Boya Ramesh is seeking a challenging position in software that leverages his Oracle Database Administration skills. He has an MCA degree with 76% aggregate from JNTUA University and a BSc in Computers from SK University. His skills include SQL, PL/SQL, Oracle 10g/11g administration, Linux, and tools like RMAN and OEM. His academic project involved rate limiting to defend against flood attacks in disruption tolerant networks using Java, Oracle 10g, and Windows XP. He has participated in technical model contests and enjoys cricket and movies in his free time.
Imran Sarwar Bajwa, M. Abbas Choudhary [2006], "Natural Language Processing based Automated System for UML Diagrams Generation", in Saudi 18th National Conference on Computer Application, 2006, (18th NCCA) Riyadh, Kingdom of Saudi Arabia pp:171-176
Manjunath Shivashimpiger has over 11.5 years of experience developing Java/J2EE applications. He has extensive experience with technologies like Java, Spring, Struts, EJB, JSP, and databases like Oracle, MySQL, and SQL Server. He has worked on projects for clients such as Apple, CoreLogic, and Huawei, developing applications for domains including telecom, insurance, and recruitment. His roles have included requirement analysis, coding, testing, and communication with clients and teammates.
1) UCD-Generator is an application that uses natural language processing to automatically generate use case diagrams from plain English requirements.
2) It analyzes the text using LESSA, which performs tokenization, part-of-speech tagging, and meaning understanding to extract actors, actions, and objects.
3) It then generates use case diagrams in two phases - first extracting information, then drawing the diagrams based on that information. Experiments showed it could accurately generate diagrams for 85% of scenarios.
Pattern based approach for Natural Language Interface to Database | IJERA Editor
Natural Language Interface to Database (NLIDB) is an interesting and widely applicable research field. As the name suggests, an NLIDB allows a naive user to query a database in natural language. This paper presents an NLIDB, namely Pattern based Natural Language Interface to Database (PBNLIDB), in which patterns for simple query, aggregate function, relational operator, short-circuit logical operator and join are defined. The patterns are categorized into valid and invalid. Valid patterns are directly used to translate a natural language query into a Structured Query Language (SQL) query, whereas an invalid pattern assists the query authoring service in generating options for the user so that the query can be framed correctly. The system takes an English language query as input, recognizes the pattern in the query, selects one of the aforementioned features of SQL based on the pattern, prepares an SQL statement, runs it against the database and displays the result.
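A pattern-based translation step of this kind might look like the sketch below. The three regular expressions, the SQL templates, and the `translate` function are illustrative assumptions; the paper's actual pattern set and its query authoring service are not reproduced here, and an unrecognised question simply yields `None`.

```python
import re
from typing import Optional

# Hypothetical "valid" patterns, each paired with an SQL template.
# Groups are restricted to \w+ and \d+ so that only identifier-like
# words and numbers are substituted into the template.
PATTERNS = [
    # Aggregate function: "how many <table>"
    (re.compile(r"^how many (\w+)$"), "SELECT COUNT(*) FROM {0}"),
    # Relational operator: "<table> with <column> greater than <number>"
    (re.compile(r"^(\w+) with (\w+) greater than (\d+)$"),
     "SELECT * FROM {0} WHERE {1} > {2}"),
    # Simple query: "show all <table>"
    (re.compile(r"^show all (\w+)$"), "SELECT * FROM {0}"),
]


def translate(question: str) -> Optional[str]:
    """Return SQL for the first matching pattern, or None if none match."""
    text = question.lower().rstrip("?").strip()
    for pattern, template in PATTERNS:
        match = pattern.match(text)
        if match:
            return template.format(*match.groups())
    return None


print(translate("How many employees?"))
print(translate("employees with salary greater than 5000"))
```

In a fuller system, the `None` branch is where an invalid pattern would be handed to a component that proposes rephrasings to the user.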
E learning resource Locator Project Report (J2EE) | Chiranjeevi Adi
This document provides an overview of an e-learning resource locator project being developed by students at Shri Dharmasthala Manjunatheshwar College of Engineering & Technology. The proposed system will allow students and professors within the Computer Science department to access and share learning materials online. Students will be able to view and download notes, presentations, and other resources. Professors can upload materials and answer student questions on discussion forums. The system is designed to make educational resources more conveniently accessible for remote learning. It will be developed using technologies like Java, J2EE, DB2 database, and NetBeans IDE.
IRJET- Natural Language Query Processing | IRJET Journal
The document discusses the development of a natural language query processing system that allows users to retrieve data from a database using simple English statements rather than SQL queries. It proposes a system that takes an English query as input, analyzes it to extract keywords, uses those keywords to generate an equivalent SQL query, executes the SQL query on the database, and returns the results to the user. The system is meant to make accessing database information easier for non-technical users by allowing them to use natural language instead of SQL.
IRJET - Voice based Natural Language Query Processing | IRJET Journal
This document describes a voice-based natural language query processing system that allows non-expert users to interact with a database using natural language queries. The system takes a user's spoken query as input, converts it to text using speech recognition, analyzes the text to generate a SQL query, executes the SQL query against the database, and displays the results in a table. The system addresses challenges like ambiguity through techniques such as tokenization, lexical analysis, syntactic analysis, and semantic analysis to map the natural language query to a valid SQL query.
The document describes the need for and objectives of developing a paperless SQL-based examination system. Currently, paperless exam systems mainly focus on objective questions and cannot adequately evaluate subjective questions involving SQL programming. The proposed system aims to analyze SQL queries and programming questions in real-time to provide prompt feedback to students. It will use a dynamic algorithm to interpret queries and compare student responses to standard outputs. The system will be developed using J2EE and follow the MVC pattern, with a practice test facility and functions for query analysis, reporting, and administration. Its goals are to reduce grading workload, promote learning, and comprehensively evaluate students' SQL skills.
This document describes a natural language interface for accessing databases. It discusses how natural language processing can be used to allow users to query databases using their own language instead of a specialized query language. It proposes an approach that uses techniques like tokenization, parsing, semantic analysis and query generation to take a natural language query, analyze it, generate a corresponding SQL query, execute it against the database and return results to the user in their own language. The document provides details on the architecture and components of such a natural language interface system and the techniques that can be used to develop it, including pattern matching, syntax-based and semantic-based approaches.
Hindi language as a graphical user interface to relational database for tran... | IRJET Journal
This document describes a proposed system to develop a Hindi language graphical user interface for a relational database using natural language processing. The system would allow users to query the database using Hindi language queries and receive results back in Hindi as well, without requiring knowledge of database query languages like SQL. It involves developing a Hindi language compiler to tokenize Hindi queries, map the tokens to equivalent English words, generate corresponding SQL queries, execute them against the database, and return results in Hindi. The proposed system aims to provide easy database access to non-technical users in their native Hindi language. It uses a transport database as a case study.
IRJET- An Efficient Way to Querying XML Database using Natural Language | IRJET Journal
This document discusses an efficient way to query XML databases using natural language. It proposes a framework that can accept English language queries and translate them into XQuery or SQL expressions to retrieve data from an XML database. The system performs linguistic processing to map tokens in the natural language query to XQuery fragments, then executes the translated query against the database. Existing approaches are discussed that typically use semantic and syntactic analysis to represent the query logically before translation, but have limitations in handling ambiguity. The proposed system aims to improve query translation accuracy by leveraging token relationships and classifications determined from natural language parsing.
PL/SQL is a standard and portable language for Oracle Database development. If you develop a program that executes on an Oracle Database, you can quickly move it to another compatible Oracle Database without any changes. PL/SQL is an embedded language: it can execute only in an Oracle Database.
The document describes an intelligent query processing system for the Malayalam language. It presents a model for developing such a system, focusing on time inquiries for different transportation modes. The system performs shallow syntactic and semantic analysis of queries. It determines the query type and required result slots. SQL queries are generated to retrieve answers from the database. The system architecture includes morphological analysis, shallow parsing, query frame identification, SQL generation, and answer retrieval. It was evaluated on 70 queries with 87.5% precision.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. The journal brings together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Coverage-Criteria-for-Testing-SQL-QueriesMohamed Reda
This document discusses testing SQL queries by defining coverage criteria. It introduces coverage criteria for evaluating how well a test suite exercises different situations that could affect the data retrieved by an SQL query. These include criteria related to query clauses like selection, joining, grouping and having. The document also discusses representing SQL queries as control flow graphs and applying criteria like condition coverage to account for all possible combinations of condition evaluations. Automatic test case generation and population of test databases is discussed to evaluate coverage based on the criteria.
NLIDB(Natural Language Interface to DataBases)Swetha Pallati
The document discusses natural language interfaces to databases (NLIDB). It describes the purpose of NLIDB as allowing users to compose queries in natural language rather than SQL. The architecture of an NLIDB system includes components like a lexicon, tokenizer, matcher, query generator, and equivalence checker. NLIDB systems have advantages like being more user-friendly than SQL, but also have disadvantages like not having perfect linguistic coverage.
This document discusses different types of database interfaces. It introduces interfaces as programs that allow users to query databases without writing code. The main types discussed are form-based interfaces, menu-based interfaces, graphical user interfaces, natural language interfaces, speech-based interfaces, and interfaces for database administrators (DBAs). Examples are provided for each type to illustrate how they work.
This tutorial teaches the basics of SQL and how to write SQL queries. SQL (Structured Query Language) is used to modify and access data from a database. It was first developed by IBM in the 1970s and has become a standard language used by most relational database management systems like Oracle, Microsoft SQL Server, and Sybase. Some common SQL commands include SELECT, UPDATE, INSERT, DELETE, WHERE, and ORDER BY clauses.
This document outlines the syllabus for a course on Object Oriented Programming in Java. The course objectives are to familiarize students with OOP concepts and reinforce them through Java implementation. Topics covered include language components, object modeling, OOP basics, methods, arrays, strings, encapsulation, inheritance, polymorphism, exceptions, files, graphics, and database programming. Students will complete assignments and a group project. The assignments have strict deadlines and plagiarized work will not be accepted.
hExarAbax makkAmasjix samayaM anni rojulu 5:00 am - 9:00 pm
(Mecca Masjid timings in Hyderabad - All days 5:00 am - 9:00 pm)
User query: makkAmasjix PIju eVMwa?
(What is the fee for Mecca Masjid?)
POS-tagger: makkAmasjix PIju/WQ eVMwa
Replace with root word: makkAmasjix PIju/WQ eMwa
Context Handler: Updates context to 'makkAmasjix'
Advanced Filter: Keywords - makkAmas
The document discusses the methods and resources used in the dictionary writing process. It describes the stages of analysis, transfer, and synthesis used to compile dictionary entries from a corpus. Specific software tools are used at each stage, including a corpus query system to analyze sample text, and a dictionary writing system for editors to compile and edit entries stored in a database. The advantages of these tools include guiding the analysis process, creating a comprehensive record of words, and streamlining the editorial workflow.
This document presents a voice-based billing system that takes voice input from customers on purchased items and quantities and generates an itemized bill. It consists of three main modules: 1) A speech-to-text module that converts voice input into text using Google APIs. 2) A word tokenization module that splits the text into individual words using NLTK. 3) A bill generation module that takes the tokenized words as input to calculate the total bill amount. The system was tested on purchasing various fruits and achieved 90% accuracy compared to existing systems. It aims to reduce time complexity for billing compared to manual entry.
2. Database Interfacing using Natural Language Processing 845
interface is provided by technical languages. These languages are called query languages and consist
of the database commands typically used to put questions to a particular database and get the intended
response. SQL [3] (Structured Query Language) is the most popular query language and is the de facto
language of databases today. SQL is the standard tool for database querying. Different database
management systems implement this standardized language with minor alterations and adjustments.
However, in spite of these proprietary extensions by the vendors, the core of this querying language
is the same in all environments.
From an application programmer's point of view, the major novelty of the relational database is
that one uses a declarative query language, SQL. Most computer languages are procedural: the
programmer tells the computer what to do, step by step, specifying a procedure. Using the SQL
interface, the programmer states his requirements and questions, and the RDBMS query planner figures
out how to answer them [5]. There are two advantages to using a declarative language. The first is
that queries no longer depend on the data representation; the RDBMS is free to store data according
to its own design requirements [6]. The second is improved software dependability. Generic SQL is
used in various web-based and stand-alone applications to keep things simple and straightforward.
Despite these advantages, SQL's technical and exacting interface makes the language tedious and
difficult to learn and use. It is hard to remember the SQL commands and to use them accurately and
precisely.
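The contrast between procedural and declarative styles can be illustrated with a small sketch. It uses Python's standard `sqlite3` module as a stand-in RDBMS; the `Students` table layout and the three sample rows are assumptions made for illustration and do not come from the paper.

```python
import sqlite3

# Hypothetical sample data: (roll number, name, age).
rows = [(1, "Ali", 22), (2, "Sara", 27), (3, "Omar", 30)]

# Procedural style: the programmer spells out how to scan and filter.
older_procedural = []
for roll_no, name, age in rows:
    if age > 25:
        older_procedural.append(name)

# Declarative style: the query states what is wanted and the engine
# decides how to retrieve it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (RollNo INTEGER, Name TEXT, Age INTEGER)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", rows)
cursor = conn.execute("SELECT Name FROM Students WHERE Age > 25 ORDER BY RollNo")
older_declarative = [record[0] for record in cursor]
conn.close()

print(older_procedural == older_declarative)
```

Both approaches return the same names; only the declarative one leaves the storage layout and the search strategy to the database.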
To resolve these issues, automated software is needed that assists both users and software engineers.
With such software, the time needed to explore all its facilities and services should be well under a
minute, which is useful for users.
2.0. Problem Description
Modern software engineering requires quick, automated solutions that can create accurate and precise
SQL queries automatically. For complex queries, even an expert programmer needs assistance in the
form of automatic query generation. He can use the automatically generated queries after making
appropriate adjustments and alterations, with less effort and in less time than with traditional
approaches.
The task of the novice user can be simplified by providing an easy interface that is more familiar
to that user. The user writes the requirements in simple English in a few statements, and the designed
system analyzes the given text. After analysis and extraction of the associated information, the
designed system generates the intended SQL queries, which can be run directly. The designed system
creates this code automatically, without any external environment. It provides a quick and reliable
way to generate SQL queries and saves the time and budget of both the user and the system analyst.
3.0. Used Methodology
The understanding and multi-aspect processing of natural languages, also termed "speech
languages", is a topic of great interest in the field of artificial intelligence [8]. Natural
languages are irregular and asymmetrical. Traditionally, natural languages are based on informal
grammars. Geographical, psychological and sociological factors influence the behaviour of natural
languages [12]. Their vocabularies are open-ended, and they vary from area to area and from time to
time. Because of these variations and inconsistencies, natural languages come in different flavours;
English, for example, has more than half a dozen well-known flavours around the world [14]. These
flavours differ in accent, vocabulary and phonology. Such discrepancies and inconsistencies make
natural languages harder to process than formal languages [13].
3. 846 Imran Sarwar Bajwa, Shahzad Mumtaz and M. Shahid Naweed
English language statements are converted into a SQL query by using the newly designed rule-based
algorithm. The SELECT query is the common query used to choose a set of values from a table [4]. A
college database has been used as the example in this research. Students' data is retrieved, inserted
and deleted by queries generated automatically from simple English text.
3.1. SELECT Query
First of all, the ‘SELECT’ query has been processed. A ‘SELECT’ query has four parts, as follows:
SELECT     *              FROM       Students
Keyword    Required Set   Keyword    Table Name
A ‘SELECT’ query can easily be generated from the provided input string, as two of its parts are the
fixed keywords ‘SELECT’ and ‘FROM’. The other two required values are ‘Required Set’ and ‘Table Name’.
To process the speech language text and find ‘Required Set’ and ‘Table Name’, the conventional norms
and grammatical rules of the English language are used. The conventional structure of a simple
English sentence is the key to comprehending and analyzing natural language text [13], as in the
following example:
“I need names of all students.”
Following is the complete analysis of this simple sentence.
Table 01: Generating SELECT Query from text
Lexicons    Phase-I         Phase-II
I           Noun            ----------
need        Verb            ----------
names       Noun            Field Name
of          Preposition     ----------
all         Noun            *
students    Noun            Table Name
In this example the ‘Required Set’ field is filled by the ‘Field Name’ attribute and the ‘Table
Name’ field is filled by the ‘Table Name’ attribute, as follows:
Select * from Students
Here the table name is searched for in the array of all tables available in the database. From the
available tables, the nearest table name is picked, which is ‘Students’ in this example.
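The table lookup and query assembly described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the paper does not say how the "nearest" table name is measured, so the sketch assumes string similarity via Python's `difflib.get_close_matches`, and the table list, function names and the 0.6 cutoff are hypothetical.

```python
import difflib

# Hypothetical array of tables available in the database.
AVAILABLE_TABLES = ["Students", "Teachers", "Courses"]

def nearest_table(word, tables=AVAILABLE_TABLES):
    """Return the table whose name is closest to `word`, or None."""
    by_lower = {t.lower(): t for t in tables}
    # Assumption: "nearest" means highest string similarity above a cutoff.
    match = difflib.get_close_matches(word.lower(), list(by_lower), n=1, cutoff=0.6)
    return by_lower[match[0]] if match else None

def build_select(tokens):
    """Build a SELECT query from the word tokens of a request."""
    # Table 01 maps the lexicon 'all' to '*'; this sketch handles only that case.
    if "all" not in [t.lower() for t in tokens]:
        raise ValueError("only requests containing 'all' are handled in this sketch")
    for word in tokens:
        table = nearest_table(word)
        if table is not None:
            return f"SELECT * FROM {table}"
    raise ValueError("no table name recognised")

print(build_select(["I", "need", "names", "of", "all", "students"]))
# SELECT * FROM Students
```

Matching by similarity rather than equality lets a singular form such as "student" still resolve to the `Students` table.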
3.2. INSERT Query
After the ‘SELECT’ query, the ‘INSERT’ query has been processed. An ‘INSERT’ query has five fragments,
as follows:
INSERT     INTO       Students     VALUES     (5, 'Ali')
Keyword    Keyword    Table Name   Keyword    Record
An ‘INSERT’ query can also be produced from the given statement, as three of its fragments are the
fixed keywords ‘INSERT’, ‘INTO’ and ‘VALUES’ [6]. The other two required parameters are ‘Table Name’
and ‘Record’. ‘Table Name’ and ‘Record’ are extracted using the same rule-based algorithm, as in the
following example:
“I want to insert a student whose Roll No. is 5 and Name is Ali.”
Following is the complete analysis of this simple sentence.
Table 02: Generating INSERT Query from text
Lexicons    Phase-I         Phase-II
I           Noun            ----------
want        Verb            ----------
to          Preposition     ----------
insert      Verb            Action
a           Article         ----------
student     Noun            Table Name
whose       Conjunction     ----------
Roll No     Noun            Attribute
is          Helping Verb    ----------
5           Noun            Value
and         Conjunction     ----------
Name        Noun            Attribute
is          Helping Verb    ----------
Ali         Noun            Value
In this example the ‘Record’ field is filled by the ‘Value’ entries, taken in order, and the ‘Table
Name’ field is filled by the ‘Table Name’ attribute. Here the table name is searched for in the array
of all tables available in the database. From the available tables, the nearest table name is picked,
which is ‘Students’ in this example.
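A minimal sketch of how the Phase-II roles of Table 02 could be assembled into the query is given below. The function names, the prefix-based table match and the quoting rule are assumptions made for illustration; the paper does not describe these details.

```python
# Phase-II roles copied from Table 02; None marks lexicons with no role.
TAGGED = [
    ("I", None), ("want", None), ("to", None), ("insert", "Action"),
    ("a", None), ("student", "Table Name"), ("whose", None),
    ("Roll No", "Attribute"), ("is", None), ("5", "Value"),
    ("and", None), ("Name", "Attribute"), ("is", None), ("Ali", "Value"),
]

def sql_literal(value):
    """Render digit strings as numbers and everything else as a quoted string."""
    if value.isdigit():
        return value
    return "'" + value.replace("'", "''") + "'"

def build_insert(tagged, table_names):
    """Build an INSERT query from (lexicon, role) pairs."""
    table_word = next(lex for lex, role in tagged if role == "Table Name")
    # Assumption: a table matches when its name starts with the lexicon.
    table = next(t for t in table_names if t.lower().startswith(table_word.lower()))
    values = [sql_literal(lex) for lex, role in tagged if role == "Value"]
    return f"INSERT INTO {table} VALUES ({', '.join(values)})"

print(build_insert(TAGGED, ["Students", "Teachers"]))
# INSERT INTO Students VALUES (5, 'Ali')
```

The sketch builds the query as text because that is the output the paper describes; code that executes such queries would normally pass the values as bound parameters instead.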
3.3. DELETE Query
Like the ‘SELECT’ and ‘INSERT’ queries, the ‘DELETE’ query can also be processed easily. A ‘DELETE’
query has five parts, as follows:
DELETE     FROM       Students     WHERE      Age > 25
Keyword    Keyword    Table Name   Keyword    Condition
The ‘DELETE’ query typically contains three keywords: ‘DELETE’, ‘FROM’ and ‘WHERE’. The other two
required values are ‘Table Name’ and ‘Condition’. To find the ‘Table Name’ and ‘Condition’
parameters, the grammatical rules of the English language are used, as in the following example:
“I want to delete the students more than 25 years age.”
Following is the complete analysis of this simple sentence.
Table 03: Generating DELETE Query from text
Lexicons    Phase-I         Phase-II
I           Noun            ----------
want        Verb            ----------
to          Preposition     ----------
delete      Verb            Action
the         Article         ----------
students    Noun            Table Name
more        Preposition     Condition
than        Noun            ----------
25          Noun            Value
years       Noun            ----------
age         Noun            Parameter
For the ‘DELETE’ query, the condition is defined first. In this example the Parameter (‘age’) and
the Value (25) are combined with the Condition lexicon (‘more’, read as ‘>’) to give ‘Age > 25’. The
table name is again retrieved from the array of all tables available in the database.
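The way Parameter, Condition and Value combine into a WHERE clause can be sketched as below. The operator mapping is hypothetical beyond the single ‘more’ case shown in the paper, and capitalising the parameter to obtain the column name is likewise an assumption.

```python
# Hypothetical mapping from Condition lexicons to SQL operators;
# the paper's example covers only 'more' (as in 'more than').
OPERATORS = {"more": ">", "less": "<"}

def build_delete(roles, table_names):
    """Build a DELETE query from Phase-II roles such as those in Table 03."""
    wanted = roles["Table Name"].lower()
    table = next(t for t in table_names if t.lower() == wanted)
    operator = OPERATORS[roles["Condition"]]
    # Assumption: the column name is the capitalised Parameter lexicon.
    condition = f"{roles['Parameter'].capitalize()} {operator} {roles['Value']}"
    return f"DELETE FROM {table} WHERE {condition}"

roles = {"Table Name": "students", "Condition": "more", "Value": "25", "Parameter": "age"}
print(build_delete(roles, ["Students", "Teachers"]))
# DELETE FROM Students WHERE Age > 25
```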
4.0. Work Flow of Designed System
The designed system, “Computational Linguistics based System for Automatic Database Query
Generation”, generates queries automatically. It works as a multi-phase procedure with four modules:
text input acquisition, text comprehension, information retrieval and, finally, generation of SQL
queries. A brief description of each phase follows.
4.1. Text input Acquisition
This module acquires the input text scenario. The user provides the business scenario in the form of
text strings. The module reads the input text character by character and builds words by
concatenating the input characters. This module implements the lexical phase: lexicons and tokens
are generated here. Once the lexicons have been generated, further processing can be performed on the
input text.
Figure 01: Lexical analysis of input text string
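The character-by-character word building described in this module can be sketched as follows. The function name and the rule that any non-alphanumeric character ends a word are assumptions for illustration.

```python
def lexemes(text):
    """Group the characters of `text` into words, one character at a time."""
    words, current = [], ""
    for ch in text:
        if ch.isalnum():
            # Still inside a word: concatenate the character.
            current += ch
        else:
            # A space or punctuation mark ends the current word, if any.
            if current:
                words.append(current)
            current = ""
    if current:
        words.append(current)
    return words

print(lexemes("I need names of all students."))
# ['I', 'need', 'names', 'of', 'all', 'students']
```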
4.2. Text Comprehension
This module reads the output of the first module in the form of words or lexicons. These words are
categorized into classes such as verbs, helping verbs, nouns, pronouns, adjectives, prepositions and
conjunctions. These classes are then used to understand the various parts of the given sentence.
Figure 02: Parts of speech tagging of input text
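A lookup-based tagger is one simple way to realise this step. The sketch below is an assumption, not the paper's tagger: the small lexicon is hand-written from the Phase-I columns of Tables 01 to 03, and treating every unknown word as a noun is a simplification chosen for illustration.

```python
# Hand-written lexicon; tags follow the Phase-I columns of Tables 01 to 03.
LEXICON = {
    "need": "Verb", "want": "Verb", "insert": "Verb", "delete": "Verb",
    "of": "Preposition", "to": "Preposition",
    "a": "Article", "the": "Article",
    "is": "Helping Verb",
    "and": "Conjunction", "whose": "Conjunction",
}

def tag(words):
    """Pair each word with a part-of-speech class."""
    # Assumption: words missing from the lexicon default to 'Noun'.
    return [(word, LEXICON.get(word.lower(), "Noun")) for word in words]

print(tag(["I", "need", "names", "of", "all", "students"]))
```

For the SELECT example this reproduces the Phase-I column of Table 01, including the paper's choice to tag ‘all’ as a noun.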
4.3. Information Retrieval
This module extracts the keywords of the SQL queries, such as Select, Insert, Delete, From, Into and
Where. Keywords are found by matching the tokens against the given array of all possible keywords.
These keywords are then used to generate the respective queries. Information such as the table name,
field name, number of attributes and logical conditions is also extracted in this phase.
Figure 03: Query information extraction
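Matching tokens against an array of keywords can be sketched as below. The keyword list repeats the six keywords named in the text; the function name and the case-insensitive comparison are assumptions.

```python
# The keywords named in the text; a real system would list every supported keyword.
SQL_KEYWORDS = ["SELECT", "INSERT", "DELETE", "FROM", "INTO", "WHERE"]

def find_keywords(tokens):
    """Return the SQL keywords that appear among the tokens, in token order."""
    found = []
    for token in tokens:
        candidate = token.upper()
        if candidate in SQL_KEYWORDS and candidate not in found:
            found.append(candidate)
    return found

print(find_keywords(["I", "want", "to", "delete", "the", "students"]))
# ['DELETE']
```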
4.4. SQL Queries generation
This module combines the keywords and the other required parameters for a particular query. The SQL
query is finally generated here according to the rules of the designed algorithm. Because each type
of query is given its own scenario, a separate function has been implemented for each query type.
Figure 04: Generation of SQL Query
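One way to organise a separate function per query type is a dispatch table, sketched below. The function names and the dictionary keys (`required_set`, `table`, `record`, `condition`) are hypothetical; they mirror the query parts named in Section 3 rather than any interface given in the paper.

```python
def select_query(info):
    return f"SELECT {info['required_set']} FROM {info['table']}"

def insert_query(info):
    return f"INSERT INTO {info['table']} VALUES ({', '.join(info['record'])})"

def delete_query(info):
    return f"DELETE FROM {info['table']} WHERE {info['condition']}"

# One generator function per supported query type.
GENERATORS = {"SELECT": select_query, "INSERT": insert_query, "DELETE": delete_query}

def generate(action, info):
    """Pick the generator for `action` and apply it to the extracted information."""
    try:
        builder = GENERATORS[action.upper()]
    except KeyError:
        raise ValueError(f"unsupported query type: {action}") from None
    return builder(info)

print(generate("delete", {"table": "Students", "condition": "Age > 25"}))
# DELETE FROM Students WHERE Age > 25
```

Supporting a further query type then means adding one function and one dictionary entry, without touching the existing generators.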
5.0. Results and Analysis
After the query generating system was designed and coded, its accuracy and efficiency were tested.
To test the designed system, queries at simple and complex levels were generated, and each query in
each category (Select, Insert, Delete) was checked.
15 sample queries were generated for each query type at each level of complexity, and the results
are shown in the following table.
Table 04: Accuracy ratio of various types of queries
Types      Simple     Complex    Total
SELECT     14         13         90%
INSERT     13         11         80%
DELETE     14         12         87%
Total Accuracy = 86%
A matrix representing the results of the query generation test for simple and complex queries has
been constructed. Overall accuracy for all types of queries is determined by adding the total accuracy
of all categories and taking the average, which is 86% in this case.
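The percentages in Table 04 can be reproduced with a short calculation. The sketch assumes that each cell of the table counts correct queries out of 15 samples, which is an inference from the reported totals rather than something the table states.

```python
# Assumption: 15 sample queries per query type at each complexity level.
SAMPLES_PER_LEVEL = 15

# Correctly generated queries (simple, complex), copied from Table 04.
correct = {"SELECT": (14, 13), "INSERT": (13, 11), "DELETE": (14, 12)}

def accuracy(simple, complex_):
    """Percentage of correct queries over both complexity levels, rounded."""
    return round(100 * (simple + complex_) / (2 * SAMPLES_PER_LEVEL))

per_type = {name: accuracy(*counts) for name, counts in correct.items()}
overall = round(sum(per_type.values()) / len(per_type))

print(per_type)  # {'SELECT': 90, 'INSERT': 80, 'DELETE': 87}
print(overall)   # 86
```

Under that assumption the per-type figures and the 86% average match the table exactly.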
Figure 05: Graphical representation of the results (bar chart of Simple and Complex query counts,
on a scale of 0 to 14, for SELECT, INSERT and DELETE)
The graph above shows the accuracy of SELECT, INSERT and DELETE queries for the simple and complex
query levels.
6.0. Conclusion
The designed system, “Computational Linguistics based System for Automatic Database Query
Generation”, helps both users and software engineers by generating SQL queries automatically. The
task of the novice user is simplified by an easy interface that is more familiar to that user. The
user writes the requirements in simple English in a few statements, and the designed system analyzes
the given text. After analysis and extraction of the associated information, the designed system
generates the intended SQL queries, which can be run directly. The designed system creates this code
automatically, without any external environment. It provides a quick and reliable way to generate SQL
queries and saves the time and budget of both the user and the system analyst. A graphical user
interface has also been provided for entering the input scenario and generating the SQL queries.
7.0. Future Work
There is still room for improvement in the algorithms for generating the intended SQL queries. The
current accuracy of query generation is about 80% to 85%. It could be raised to as much as 95% by
improving the algorithms and giving the system the ability to learn. In this research only three
types of queries have been addressed: SELECT, INSERT and DELETE. Other types of queries still
require a solution.
References
[1] Allen, J. (1994) Natural Language Understanding. Benjamin-Cummings Publishing Company,
New York.
[2] Biber, D., Conrad, S., & Reppen, R. (1998). Corpus Linguistics: Investigating Language
Structure and Use. Cambridge Univ. Press, Cambridge, U.K.
[3] D. DeHaan, D. Toman, M. P. Consens, and T. Ozsu. (2003) A Comprehensive XQuery to SQL
Translation using Dynamic Interval Encoding. In SIGMOD.
[4] C. A. Thompson, R. J. Mooney and L. R. Tang, Learning to parse natural language database
queries into logical form, in: Workshop on Automata Induction, Grammatical Inference and
Language Acquisition (1997).
[5] Salton, G., & McGill, M. (1983). Introduction to Modern Information Retrieval. McGraw-Hill,
New York.
[6] A. Rosenthal, D. Reiner, Extending the Algebraic Framework of Query Processing to Handle
Outer joins, Proc. VLDB Singapore 1984. pp. 334-343.
[7] Fagan, J. L. (1989). The effectiveness of a non-syntactic approach to automatic phrase indexing
for document retrieval. Journal of the American Society for Information Science, 40(2), 115–132.
[8] J. M. Zelle and R. J. Mooney, Learning semantic grammars with constructive inductive logic
programming, in: Proceedings of the 11th National Conference on Artificial Intelligence
(AAAI Press/MIT Press, Washington, D.C., 1993), pp. 817–822.
[9] Kowalski, G. (1998). Information Retrieval Systems: Theory and Implementation. Kluwer,
Boston.
[10] Krovetz, R., & Croft, W. B. (1992). Lexical ambiguity and information retrieval. ACM
Transactions on Information Systems, 10, 115–141.
[11] Losee, R. M. (1988). Parameter estimation for probabilistic document retrieval models. Journal
of the American Society for Information Science, 39(1), 8–16.
[12] Losee, R. M. (1996a). Learning syntactic rules and tags with genetic algorithms for information
retrieval and filtering: An empirical basis for grammatical rules. Information Processing and
Management, 32(2), 185–197.
[13] Manning, C. D., & Schutze, H. (1999). Foundations of Statistical Natural Language
Processing. MIT Press, Cambridge, Mass.
[14] Partee, B. H., Meulen, A. t., & Wall, R. E. (1990). Mathematical Methods in Linguistics.
Kluwer, Dordrecht, The Netherlands.