Abstract: Agriculture is one of the most important sectors of a nation's economy, and it requires technical breakthroughs in this century. In today's computerized world, agricultural data processing is an increasing need for farmers and decision makers. Agricultural data is diversified, complex, and non-standard, so developing a data warehouse for agriculture is a key challenge for researchers. The objective of this work is to design a data warehouse about crops and their requirements. The proposed data warehouse may be extended to a Decision Support System (DSS) combined with data mining techniques. We propose a multidimensional data warehouse for agriculture that provides solutions for farmers and responds to their ad-hoc queries. This multidimensional schema builds on the star schema and snowflake schema that are commonly used to design data warehouses. In our manuscript, normalization is applied to store the data in the star schema and duplicate values are removed, so that space and time complexities are minimized. Index Terms: Agriculture, Data Warehouse, Multidimensional Schema, Dimensional Modeling, OLAP
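As one rough illustration of the kind of crop-centric star schema the abstract describes (the table and column names below are hypothetical, not taken from the paper), a minimal sketch in Python with SQLite might look like this:

```python
import sqlite3

# Minimal sketch of a crop-centric star schema: one fact table of yield
# measures surrounded by denormalized dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_crop (
    crop_id    INTEGER PRIMARY KEY,
    crop_name  TEXT NOT NULL,
    soil_type  TEXT,            -- hypothetical crop-requirement attributes
    water_need TEXT
);
CREATE TABLE dim_region (
    region_id   INTEGER PRIMARY KEY,
    region_name TEXT NOT NULL
);
CREATE TABLE dim_season (
    season_id   INTEGER PRIMARY KEY,
    season_name TEXT NOT NULL,
    year        INTEGER
);
CREATE TABLE fact_yield (
    crop_id    INTEGER REFERENCES dim_crop(crop_id),
    region_id  INTEGER REFERENCES dim_region(region_id),
    season_id  INTEGER REFERENCES dim_season(season_id),
    area_ha    REAL,            -- measures
    yield_tons REAL
);
""")

# The kind of ad-hoc question a farmer or decision maker might ask:
# total yield per crop per region.
rows = conn.execute("""
    SELECT c.crop_name, r.region_name, SUM(f.yield_tons)
    FROM fact_yield f
    JOIN dim_crop   c ON f.crop_id   = c.crop_id
    JOIN dim_region r ON f.region_id = r.region_id
    GROUP BY c.crop_name, r.region_name
""").fetchall()
```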
This document discusses data warehousing and decision support systems. It defines a data warehouse as a subject-oriented, integrated, time-variant, and non-volatile collection of data used to support management decision making. It describes key features of a data warehouse including being subject-oriented, integrated, time-variant, and non-volatile. The document also discusses the need for decision support systems in business and different architectural styles for data warehousing like OLTP and OLAP.
This document defines a data warehouse as a collection of corporate information derived from operational systems and external sources to support business decisions rather than operations. It discusses the purpose of data warehousing: to realize the value of data and make better decisions. Key components like staging areas, data marts, and operational data stores are described. The document also outlines the evolution of data warehouse architectures and best practices for implementation.
This presentation provides an overview of big data open source technologies. It defines big data as large amounts of data from various sources in different formats that traditional databases cannot handle. It discusses that big data technologies are needed to analyze and extract information from extremely large and complex data sets. The top technologies are divided into data storage, analytics, mining and visualization. Several prominent open source technologies are described for each category, including Apache Hadoop, Cassandra, MongoDB, Apache Spark, Presto and ElasticSearch. The presentation provides details on what each technology is used for and its history.
Data warehousing and online analytical processing (VijayasankariS)
The document discusses data warehousing and online analytical processing (OLAP). It defines a data warehouse as a subject-oriented, integrated, time-variant and non-volatile collection of data used to support management decision making. It describes key concepts such as data warehouse modeling using data cubes and dimensions, extraction, transformation and loading of data, and common OLAP operations. The document also provides examples of star schemas and how they are used to model data warehouses.
The document discusses different types of multidimensional data models (MDDM) used for data warehousing. It describes MDDM as providing both a mechanism for storing data and enabling business analysis. The main types discussed are star schema, snowflake schema, and fact constellation. Star schema has one central fact table connected to multiple dimension tables, resembling a star. Snowflake schema is similar but dimensional tables are normalized into hierarchies. Fact constellation has multiple fact tables sharing some dimensional tables.
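To make the star/snowflake distinction concrete, here is a small hedged sketch (a hypothetical location hierarchy, not taken from the document) showing one denormalized star dimension versus the same hierarchy normalized snowflake-style:

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Star schema style: one denormalized dimension table holding the
# whole village -> district -> state hierarchy in each row.
conn.execute("""
CREATE TABLE dim_location_star (
    location_id INTEGER PRIMARY KEY,
    village     TEXT,
    district    TEXT,
    state       TEXT
)""")

# Snowflake schema style: the same hierarchy normalized into linked tables.
conn.executescript("""
CREATE TABLE dim_state (
    state_id   INTEGER PRIMARY KEY,
    state_name TEXT
);
CREATE TABLE dim_district (
    district_id   INTEGER PRIMARY KEY,
    district_name TEXT,
    state_id      INTEGER REFERENCES dim_state(state_id)
);
CREATE TABLE dim_location_snow (
    location_id INTEGER PRIMARY KEY,
    village     TEXT,
    district_id INTEGER REFERENCES dim_district(district_id)
);
""")
```

The snowflake form removes the duplicated district and state values that normalization targets, at the cost of extra joins at query time.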
You have probably heard about Big Data, but have you ever wondered what exactly it is? And why should you care?
Mobile is playing a large part in driving this explosion in data; data is also created by apps and other services running in the background. As people move toward more digital channels, tons of data are being created. This data can be used in many ways, both personal and professional. Big Data and mobile apps are converging and interacting in the enterprise, transforming the whole mobile ecosystem.
This presentation introduces big data and data mining. Big data refers to extremely large data sets that cannot be processed by traditional software. Data mining is the process of discovering patterns in large data sets using machine learning, statistics, and database systems. The presentation discusses key facts about the growth of data, examples of big data sources like Facebook and Google, challenges of big data like hardware limitations, and solutions like parallel computing. Common big data mining tools are also introduced, including Hadoop, Apache S4, and Storm. Applications of big data and data mining are highlighted in various domains like healthcare, public sector, and biology.
The document discusses data warehousing, including its history, types, security, applications, components, architecture, benefits and problems. A data warehouse is defined as a subject-oriented, integrated, time-variant collection of data to support management decision making. In the 1990s, organizations needed timely data but traditional systems were too slow. Data warehouses now provide competitive advantages through improved decision making and productivity. They integrate data from multiple sources to support applications like customer analysis, stock control and fraud detection.
A data warehouse is a subject-oriented, integrated, time-variant collection of data that supports management's decision-making processes. It contains data extracted from various operational databases and data sources. The data is cleaned, transformed, integrated and loaded into the data warehouse for analysis. A data warehouse uses a multidimensional model with facts and dimensions to allow for complex analytical and ad-hoc queries from multiple perspectives. It is separately administered from operational databases to avoid impacting transaction processing systems and allow optimized access for decision support.
A Data Warehouse is a collection of integrated, subject-oriented databases designed to support decision-making. It contains non-volatile data that is relevant to a point in time. An operational data store feeds the data warehouse with a stream of raw data. Metadata provides information about the data in the warehouse.
This document discusses data warehousing, including its definition, importance, components, strategies, ETL processes, and considerations for success and pitfalls. A data warehouse is a collection of integrated, subject-oriented, non-volatile data used for analysis. It allows more effective decision making through consolidated historical data from multiple sources. Key components include summarized and current detailed data, as well as transformation programs. Common strategies are enterprise-wide and data mart approaches. ETL processes extract, transform and load the data. Clean data and proper implementation, training and maintenance are important for success.
The document provides information about what a data warehouse is and why it is important. A data warehouse is a relational database designed for querying and analysis that contains historical data from transaction systems and other sources. It allows organizations to access, analyze, and report on integrated information to support business processes and decisions.
Big data refers to very large data sets that are analyzed computationally to reveal patterns, trends, and relationships. It is characterized by 3Vs - volume, velocity, and variety. Big data has many applications in recent scenarios including politics, weather, medicine, media, and manufacturing. It is used in politics to analyze voter data beyond basic demographics. In weather, sensor data from devices is used to create more detailed weather maps and forecasts. Medicine uses big data to identify patterns in symptoms that can help predict and prevent diseases like heart failure. Media analyzes data on user behaviors to tailor content instead of relying on traditional formats. Manufacturing leverages big data for increased transparency and insights into performance issues.
This document provides an overview of data warehousing. It defines data warehousing as collecting data from multiple sources into a central repository for analysis and decision making. The document outlines the history of data warehousing and describes its key characteristics like being subject-oriented, integrated, and time-variant. It also discusses the architecture of a data warehouse including sources, transformation, storage, and reporting layers. The document compares data warehousing to traditional DBMS and explains how data warehouses are better suited for analysis versus transaction processing.
Introduction, data mining, why data mining, applications of data mining, steps of data mining, threats of data mining, solutions for data mining, role of data mining, data warehouse, OLTP & OLAP, data mining tools, latest research
This document discusses user interfaces for data warehouses. It explains that a data warehouse integrates data from multiple sources to support analytics and decision making. When developing a user interface for a data warehouse, ease of use and performance should be prioritized. The interface also needs to support different types of users and allow secure access across devices. OLAP is discussed as a technique for enabling fast analysis by pre-calculating and aggregating data. Selecting the right user interface is important to ensure the data warehouse is effective and useful.
This document discusses data warehousing and online analytical processing (OLAP) technology. It defines a data warehouse, compares it to operational databases, and explains how OLAP systems organize and present data for analysis. The document also describes multidimensional data models, common OLAP operations, and the steps to design and construct a data warehouse. Finally, it discusses applications of data warehouses and efficient processing of OLAP queries.
This document provides an overview of data warehousing concepts including:
- Data warehouses store historical data from operational systems for analysis and reporting. The data passes through a staging area and operational data store for cleaning before loading into the data warehouse.
- Common data warehouse architectures include star schemas with fact and dimension tables and snowflake schemas with normalized dimensions. Data marts contain summarized data for specific business questions.
- ETL processes extract, transform, and load the data in three phases. Transformation cleans and prepares the data before loading into dimensional schemas (a minimal sketch follows this list).
- Data warehouses typically contain historical data, derived data generated from existing data, and metadata describing the data and schemas.
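The following is a hedged sketch of those three ETL phases; the file layout, field names, and cleaning rules are illustrative assumptions, not anything specified by these documents:

```python
import csv
import io
import sqlite3

# Hypothetical operational export; in practice this would come from
# source systems or files.
RAW_EXPORT = """crop_name,yield_kg
Wheat ,2500
,900
Rice,1800
"""

def extract(text):
    # Phase 1: pull raw rows out of the operational export.
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    # Phase 2: clean and prepare - drop incomplete rows, normalize units.
    cleaned = []
    for row in rows:
        if not row.get("crop_name") or not row.get("yield_kg"):
            continue  # reject records missing required fields
        cleaned.append({
            "crop_name": row["crop_name"].strip().lower(),
            "yield_tons": float(row["yield_kg"]) / 1000.0,  # kg -> tons
        })
    return cleaned

def load(conn, rows):
    # Phase 3: insert the prepared rows into the warehouse-side table.
    conn.executemany(
        "INSERT INTO staging_yield VALUES (:crop_name, :yield_tons)", rows
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staging_yield (crop_name TEXT, yield_tons REAL)")
load(conn, transform(extract(RAW_EXPORT)))
```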
OLAP (Online Analytical Processing) is a technology that uses a multidimensional view of aggregated data to provide quicker access to strategic information and help with decision making. It has four main characteristics: using multidimensional data analysis techniques, providing advanced database support, offering easy-to-use end user interfaces, and supporting client/server architecture. A key aspect is representing data in a multidimensional structure that allows for consolidation and aggregation of data at different levels.
This document summarizes a seminar presentation on using data mining techniques for telecommunications. It discusses three main types of telecom data: call summary data, network data, and customer data. It then describes using a genetic algorithm approach to mine sequential patterns from telecom databases. The genetic algorithm uses country codes to represent chromosomes and applies genetic operators and fitness functions to iteratively find sequential patterns in the telecom data. The approach provides non-optimal solutions faster than traditional algorithms.
An introduction to Industrie 4.0 (the Internet of Things) and its potential impact on the supply chain. Industrie 4.0 is touted as a game changer and a seed for the next industrial revolution, funded by the German Federal Ministry of Education and Research.
Data mining is the process of automatically discovering useful information from large data sets. It draws from machine learning, statistics, and database systems to analyze data and identify patterns. Common data mining tasks include classification, clustering, association rule mining, and sequential pattern mining. These tasks are used for applications like credit risk assessment, fraud detection, customer segmentation, and market basket analysis. Data mining aims to extract unknown and potentially useful patterns from large data sets.
What is a Data Warehouse? OLTP vs. OLAP, Conceptual Modeling of Data Warehouses, Data Warehousing Components, Building a Data Warehouse, Mapping the Data Warehouse to a Multiprocessor Architecture, Database Architectures for Parallel Processing
Data Warehouse Physical Design, Physical Data Model, Tablespaces, Integrity Constraints, ETL (Extract-Transform-Load), OLAP Server Architectures, MOLAP vs. ROLAP, Distributed Data Warehouse
Case study: Implementation of OLAP operations (chirag patil)
The document discusses different types of online analytical processing (OLAP) servers including relational OLAP (ROLAP), multidimensional OLAP (MOLAP), and hybrid OLAP (HOLAP). It also describes common OLAP operations like roll-up, drill-down, slice, dice, and pivot that allow interactive analysis of multidimensional data by aggregation, navigation, and reorganization.
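As a hedged illustration of those operations, using pandas and made-up sales data rather than anything from the document, the classic OLAP moves map naturally onto DataFrame operations:

```python
import pandas as pd

# Toy multidimensional data: dimensions (year, quarter, region), measure (sales).
df = pd.DataFrame({
    "year":    [2023, 2023, 2023, 2024, 2024, 2024],
    "quarter": ["Q1", "Q2", "Q1", "Q1", "Q2", "Q2"],
    "region":  ["North", "North", "South", "South", "North", "South"],
    "sales":   [100, 120, 80, 90, 130, 95],
})

# Roll-up: aggregate from (year, quarter) up to the year level.
rollup = df.groupby("year")["sales"].sum()

# Drill-down: move to finer granularity, e.g. year -> year and quarter.
drilldown = df.groupby(["year", "quarter"])["sales"].sum()

# Slice: fix one dimension to a single value.
slice_2024 = df[df["year"] == 2024]

# Dice: select a sub-cube by constraining several dimensions at once.
dice = df[(df["year"] == 2023) & (df["region"] == "North")]

# Pivot: reorient the cube, regions as columns against quarters as rows.
pivot = df.pivot_table(index="quarter", columns="region",
                       values="sales", aggfunc="sum")
```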
The key steps in developing a data warehouse can be summarized as:
1. Project initiation and requirements analysis
2. Design of the architecture, databases, and applications
3. Construction by selecting tools, developing data feeds, and building reports
4. Deployment including release and training
5. Ongoing maintenance
All about Big Data components and the best tools to ingest, process, store and visualize the data.
This is a keynote from the series "by Developer for Developers" powered by eSolutionsGrup.
On multi dimensional cubes of census data: designing and querying (Jaspreet Issaj)
The primary focus of this research is to design a data warehouse that specifically targets the OLAP storage, analysis, and querying requirements of multidimensional cubes of census data in an efficient and timely manner.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together scientists, academicians, field engineers, scholars and students of related fields of Engineering and Technology.
Abstract: Learning Analytics by nature relies on computational information processing activities intended to extract from raw data some interesting aspects that can be used to obtain insights into the behaviors of learners, the design of learning experiences, etc. There is a large variety of computational techniques that can be employed, all with interesting properties, but it is the interpretation of their results that really forms the core of the analytics process. As a rising subject, data mining and business intelligence are playing an increasingly important role in the decision support activity of every walk of life. The Variance Rover System (VRS) mainly focuses on the large data sets obtained from online web visits, categorizing them into clusters according to some similarity measure, and on the process of predicting customer behavior and selecting actions to influence that behavior to benefit the company, so as to take optimized and beneficial decisions for business expansion. Keywords: Analytics, Business intelligence, Clustering, Data Mining, Standard K-means, Optimized K-means
Data Mining System and Applications: A Review (ijdpsjournal)
In the Information Technology era, information plays a vital role in every sphere of human life. It is very important to gather data from different data sources, store and maintain the data, generate information and knowledge, and disseminate data, information and knowledge to every stakeholder. Due to the vast use of computers and electronic devices and the tremendous growth in computing power and storage capacity, there has been explosive growth in data collection. Storing the data in a data warehouse enables an entire enterprise to access a reliable, current database. To analyze this vast amount of data and draw fruitful conclusions and inferences, special tools called data mining tools are needed. This paper gives an overview of data mining systems and some of their applications.
CLASSIFICATION ALGORITHM USING RANDOM CONCEPT ON A VERY LARGE DATA SET: A SURVEY (Editor IJMTER)
Data mining environments produce a large amount of data that needs to be analyzed so that patterns can be extracted from it to gain knowledge. In this new period, with the rumble of data both ordered and unordered, it has become difficult to process, manage and analyze patterns using traditional databases and architectures. To gain knowledge about Big Data, a proper architecture should be understood. Classification is an important data mining technique with broad applications, used to classify the various kinds of data found in nearly every field of our life. Classification assigns an item to one of a predefined set of classes according to its features. This paper provides an inclusive survey of different classification algorithms, including J48, C4.5, the k-nearest neighbor classifier, Naive Bayes, SVM, etc., using the random concept.
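As a brief hedged aside (the survey's own datasets and results are not reproduced here), a few of the classifier families it names can be compared in scikit-learn on a stock toy dataset:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# Toy comparison of three surveyed classifier families on the iris dataset.
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for model in (KNeighborsClassifier(n_neighbors=5), GaussianNB(), SVC()):
    model.fit(X_tr, y_tr)                       # train on the held-in split
    print(type(model).__name__, model.score(X_te, y_te))  # test accuracy
```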
Study on Theoretical Aspects of Virtual Data Integration and its Applications (IJERA Editor)
Data integration is the technique of merging data residing in different sources at different locations and providing users with an integrated, reconciled view of these data. Such a unified view is called the global or mediated schema; it represents the intensional level of the integrated and reconciled data. Our area of interest in this paper is a data integration system characterized by an architecture based on a global schema and a set of sources or source schemas. The objective of this paper is to provide a study of the theoretical aspects of data integration systems and to present a comprehensive review of the applications of data integration in various fields, including biomedicine, the environment, and social networks. It also discusses a privacy framework for protecting users' privacy with privacy views and privacy policies.
A SURVEY ON DATA MINING IN STEEL INDUSTRIES (IJCSES Journal)
In industrial environments, huge amounts of data are generated and collected in databases and data warehouses from all involved areas such as planning, process design, materials, assembly, production, quality, process control, scheduling, fault detection, shutdown, customer relationship management, and so on. Data mining has become a useful tool for knowledge acquisition in the industrial process of iron and steel making. Due to the rapid growth of data mining, various industries have started using the technology to search for hidden patterns, which can feed new knowledge back into the system and inform new models that enhance production quality, productivity, cost optimization, maintenance, etc. The continuous improvement of every steel production process, regarding the avoidance of quality deficiencies and the related improvement of production yield, is an essential task for steel producers. Therefore, the zero-defect strategy is popular today, and several quality assurance techniques are used to maintain it. The present report explains the methods of data mining and describes their application in the industrial environment and, especially, in the steel industry.
Characterizing and Processing of Big Data Using Data Mining Techniques (IJTET Journal)
The document discusses big data and techniques for processing it, including data mining. It begins by defining big data and its key characteristics of volume, variety, and velocity. It then discusses various data mining techniques that can be used to process big data, including clustering, classification, and prediction. It introduces the HACE theorem for characterizing big data based on its huge size, heterogeneous and diverse sources, decentralized control, and complex relationships within the data. The document proposes a big data processing model involving data set aggregation, pre-processing, connectivity-based clustering, and subset selection to efficiently retrieve relevant data. It evaluates the performance of subset selection versus deterministic search methods.
IRJET - Improving the Performance of Smart Heterogeneous Big Data (IRJET Journal)
This document discusses improving the performance of smart heterogeneous big data. It begins by defining key concepts like big data, data mining, and the challenges of analyzing large, complex datasets. It then describes two common association rule mining algorithms - Apriori and FP-Growth - that are used to extract patterns from big data. The document proposes using principal component analysis as a feature selection method to improve the performance of these algorithms. It finds that this proposed approach reduces execution time compared to the original algorithms when processing big data.
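A minimal sketch of the PCA-based feature reduction step the summary mentions is shown below; it uses scikit-learn on synthetic data, and the paper's actual pipeline and parameters are assumptions here:

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for a high-dimensional dataset: 200 rows, 50 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))

# Keep as many components as needed to explain 90% of the variance, so the
# downstream mining algorithm works on a smaller input.
pca = PCA(n_components=0.90)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape, pca.explained_variance_ratio_.sum())
```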
A study and survey on various progressive duplicate detection mechanisms (eSAT Journals)
Abstract: One of the serious problems faced in several applications involving personal details management, customer affiliation management, data mining, etc. is duplicate detection. This survey deals with the various duplicate record detection techniques in both small and large datasets. To detect duplicates with less execution time and without disturbing dataset quality, methods like Progressive Blocking and the Progressive Sorted Neighborhood Method are used. The Progressive Sorted Neighborhood Method, also called PSNM, is used in this model for detecting duplicates in a parallel approach, while the Progressive Blocking algorithm works on large datasets where finding duplicates requires immense time. These algorithms are used to enhance the duplicate detection system; the efficiency can be doubled over the conventional duplicate detection method using this algorithm. Severa
Frequent Item set Mining of Big Data for Social Media (IJERA Editor)
Big data is a term for massive data sets with a large, varied, and complex structure that are difficult to store, analyze, and visualize for further processing or results. Big data includes data from email, documents, pictures, audio and video files, and other sources that do not fit into a relational database; this unstructured data brings enormous challenges. The process of researching massive amounts of data to reveal hidden patterns and secret correlations is named big data analytics, so big data implementations need to be analyzed and executed as accurately as possible. The proposed model structures the unstructured data from social media into a structured form so that the data can be queried efficiently using the Hadoop MapReduce framework. Big data mining is essential in order to extract value from massive amounts of data, and MapReduce is a more efficient method for dealing with big data than traditional techniques. The proposed combination of the Knuth-Morris-Pratt linguistic string matching algorithm and the K-Means clustering algorithm provides a proper platform to extract value from massive amounts of data and produce recommendations for the user. Linguistic matching techniques such as the Knuth-Morris-Pratt string matching algorithm are very useful for returning properly matched output for a user query, while the K-Means algorithm clusters data using a vector space model, making it an appropriate method for producing user recommendations.
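For reference, here is a compact standard implementation of the Knuth-Morris-Pratt matcher the abstract names (a textbook version, not code from the paper):

```python
def kmp_search(text, pattern):
    """Return the start indices of every occurrence of pattern in text."""
    if not pattern:
        return []
    # Build the failure function: for each prefix, the length of the longest
    # proper prefix that is also a suffix.
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    # Scan the text once, never re-examining already-matched characters.
    matches, k = [], 0
    for i, ch in enumerate(text):
        while k and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            matches.append(i - k + 1)
            k = fail[k - 1]
    return matches

print(kmp_search("abracadabra", "abra"))  # [0, 7]
```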
An Efficient Approach for Clustering High Dimensional Data (IJSTA)
The document discusses clustering high dimensional data using an efficient approach called "Big Data Clustering using k-Mediods BAT Algorithm" (KMBAT). KMBAT simultaneously considers all data points as potential exemplars and exchanges real-valued messages between data points until a high-quality set of exemplars and corresponding clusters emerges. It is demonstrated on Facebook user profile data stored in an HDInsight Hadoop cluster. KMBAT finds better clustering solutions than other methods in less time for high dimensional big data.
Applying K-Means Clustering Algorithm to Discover Knowledge from Insurance Da... (theijes)
Data mining works to extract previously unknown information from enormous quantities of data, which can lead to knowledge; it provides information that helps in making good decisions. The effectiveness of data mining lies in reaching the knowledge that is its goal: the discovery of the hidden facts contained in databases through the use of multiple technologies. Clustering is organizing data into clusters or groups such that there is high intra-cluster similarity and low inter-cluster similarity. This paper deals with the K-means clustering algorithm, which groups data based on its characteristics and attributes and performs the clustering by reducing the distances between the data and the cluster centers. The algorithm is applied using an open source tool called WEKA, with an insurance dataset as its input.
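A hedged sketch of the K-Means procedure described above, in plain NumPy with random toy points standing in for the insurance dataset:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    rng = np.random.default_rng(seed)
    # Initialize centers by sampling k distinct data points.
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: move each center to the mean of its cluster.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break  # converged: centers stopped moving
        centers = new_centers
    return labels, centers

# Two well-separated toy blobs in place of real insurance records.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels, centers = kmeans(X, k=2)
```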
Abstract: In this paper, the concept of data mining is summarized and the significance of its methodologies is illustrated. Data mining based on Neural Networks and Genetic Algorithms is researched in detail, and the key technologies and ways to achieve such data mining are surveyed. This paper also conducts a formal review of the area of rule extraction from ANNs and GAs. Keywords: Data Mining, Neural Network, Genetic Algorithm, Rule Extraction.
Survey on MapReduce in Big Data Clustering using Machine Learning Algorithms (IRJET Journal)
This document summarizes research on using MapReduce techniques for big data clustering with machine learning algorithms. It discusses how traditional clustering algorithms do not scale well for large datasets. MapReduce allows distributed processing of large datasets in parallel. The document reviews several studies that implemented clustering algorithms like k-means using MapReduce on Hadoop. It found this improved efficiency and reduced complexity compared to traditional approaches. Faster processing of large datasets enables applications in areas like education and healthcare.
This document discusses data structures and provides an introduction and overview. It defines data structures as specialized formats for organizing and storing data to allow efficient access and manipulation. Key points include:
- Data structures include arrays, linked lists, stacks, queues, trees and graphs. They allow efficient handling of data through operations like traversal, insertion, deletion, searching and sorting.
- Linear data structures arrange elements in a sequential order while non-linear structures do not. Common examples are discussed.
- Characteristics of data structures include being static or dynamic, homogeneous or non-homogeneous. Efficiency and complexity are also addressed.
- Basic array operations like traversal, insertion, deletion and searching are demonstrated with pseudocode examples; a brief illustration follows this list.
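A brief illustration of those basic operations, using a Python list as a stand-in for an array (the original pseudocode is not reproduced here):

```python
arr = [3, 7, 1, 9]

# Traversal: visit every element in order.
for value in arr:
    print(value)

# Insertion: place 5 at index 2, shifting later elements right.
arr.insert(2, 5)          # arr is now [3, 7, 5, 1, 9]

# Deletion: remove the element at index 1, shifting later elements left.
del arr[1]                # arr is now [3, 5, 1, 9]

# Linear search: return the index of the first match, or -1 if absent.
def linear_search(a, target):
    for i, v in enumerate(a):
        if v == target:
            return i
    return -1

print(linear_search(arr, 9))  # 3
```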
This document discusses data mining and related topics. It begins by defining data mining as the process of discovering patterns in large datasets using methods from machine learning, statistics, and database systems. The document then discusses data warehouses, how they work, and their role in data mining. It describes different data mining functionalities and tasks such as classification, prediction, and clustering. The document outlines some common data mining applications and issues related to methodology, performance, and diverse data types. Finally, it discusses some social implications of data mining involving privacy, profiling, and unauthorized use of data.
Network) technique with the assistance of Mini Tab- a statistical soft tool. Comparison for predicted, experimental & ANN output is
obtained from linear regression plots. From this research study, it can be concluded that *Cross sectional area of steel tube has most
significant effect on ultimate load carrying capacity, *as length of steel tube increased- load carrying capacity decreased & *ANN
modeling predicted acceptable results. Thus ANN tool can be utilized for predicting ultimate load carrying capacity for composite
columns.
Keywords: Light weight concrete, GFRP, Artificial Neural Network, Linear Regression, Back propagation, orthogonal
Array, Latin Squares
Experimental behavior of circular hsscfrc filled steel tubular columns under ...eSAT Journals
This document summarizes an experimental study that tested circular concrete-filled steel tube columns with varying parameters. 45 specimens were tested with different fiber percentages (0-2%), tube diameter-to-wall-thickness ratios (D/t from 15-25), and length-to-diameter (L/d) ratios (from 2.97-7.04). The results found that columns filled with fiber-reinforced concrete exhibited higher stiffness, equal ductility, and enhanced energy absorption compared to those filled with plain concrete. The load carrying capacity increased with fiber content up to 1.5% but not at 2.0%. The analytical predictions of failure load closely matched the experimental values.
Evaluation of punching shear in flat slabseSAT Journals
Abstract
Flat-slab construction has been widely used in construction today because of many advantages that it offers. The basic philosophy in
the design of flat slab is to consider only gravity forces; this method ignores the effect of punching shear due to unbalanced moments
at the slab column junction which is critical. An attempt has been made to generate generalized design sheets which accounts both
punching shear due to gravity loads and unbalanced moments for cases (a) interior column; (b) edge column (bending perpendicular
to shorter edge); (c) edge column (bending parallel to shorter edge); (d) corner column. These design sheets are prepared as per
codal provisions of IS 456-2000. These design sheets will be helpful in calculating the shear reinforcement to be provided at the
critical section which is ignored in many design offices. Apart from its usefulness in evaluating punching shear and the necessary
shear reinforcement, the design sheets developed will enable the designer to fix the depth of flat slab during the initial phase of the
design.
Keywords: Flat slabs, punching shear, unbalanced moment.
Evaluation of performance of intake tower dam for recent earthquake in indiaeSAT Journals
Abstract
Intake towers are typically tall, hollow, reinforced concrete structures and form entrance to reservoir outlet works. A parametric
study on dynamic behavior of circular cylindrical towers can be carried out to study the effect of depth of submergence, wall thickness
and slenderness ratio, and also effect on tower considering dynamic analysis for time history function of different soil condition and
by Goyal and Chopra accounting interaction effects of added hydrodynamic mass of surrounding and inside water in intake tower of
dam
Key words: Hydrodynamic mass, Depth of submergence, Reservoir, Time history analysis,
Evaluation of operational efficiency of urban road network using travel time ...eSAT Journals
This document evaluates the operational efficiency of an urban road network in Tiruchirappalli, India using travel time reliability measures. Traffic volume and travel times were collected using video data from 8-10 AM on various roads. Average travel times, 95th percentile travel times, and buffer time indexes were calculated to assess reliability. Non-motorized vehicles were found to most impact reliability on one road. A relationship between buffer time index and traffic volume was developed. Finally, a travel time model was created and validated based on length, speed, and volume.
Estimation of surface runoff in nallur amanikere watershed using scs cn methodeSAT Journals
Abstract
The development of watershed aims at productive utilization of all the available natural resources in the entire area extending from
ridge line to stream outlet. The per capita availability of land for cultivation has been decreasing over the years. Therefore, water and
the related land resources must be developed, utilized and managed in an integrated and comprehensive manner. Remote sensing and
GIS techniques are being increasingly used for planning, management and development of natural resources. The study area, Nallur
Amanikere watershed geographically lies between 110 38’ and 110 52’ N latitude and 760 30’ and 760 50’ E longitude with an area of
415.68 Sq. km. The thematic layers such as land use/land cover and soil maps were derived from remotely sensed data and overlayed
through ArcGIS software to assign the curve number on polygon wise. The daily rainfall data of six rain gauge stations in and around
the watershed (2001-2011) was used to estimate the daily runoff from the watershed using Soil Conservation Service - Curve Number
(SCS-CN) method. The runoff estimated from the SCS-CN model was then used to know the variation of runoff potential with different
land use/land cover and with different soil conditions.
Keywords: Watershed, Nallur watershed, Surface runoff, Rainfall-Runoff, SCS-CN, Remote Sensing, GIS.
Estimation of morphometric parameters and runoff using rs & gis techniqueseSAT Journals
This document summarizes a study that used remote sensing and GIS techniques to estimate morphometric parameters and runoff for the Yagachi catchment area in India over a 10-year period. Morphometric analysis was conducted to understand the hydrological response at the micro-watershed level. Daily runoff was estimated using the SCS curve number model. The results showed a positive correlation between rainfall and runoff. Land use/land cover changes between 2001-2010 were found to impact estimated runoff amounts. Remote sensing approaches provided an effective means to model runoff for this large, ungauged area.
Effect of variation of plastic hinge length on the results of non linear anal...eSAT Journals
Abstract The nonlinear Static procedure also well known as pushover analysis is method where in monotonically increasing loads are applied to the structure till the structure is unable to resist any further load. It is a popular tool for seismic performance evaluation of existing and new structures. In literature lot of research has been carried out on conventional pushover analysis and after knowing deficiency efforts have been made to improve it. But actual test results to verify the analytically obtained pushover results are rarely available. It has been found that some amount of variation is always expected to exist in seismic demand prediction of pushover analysis. Initial study is carried out by considering user defined hinge properties and default hinge length. Attempt is being made to assess the variation of pushover analysis results by considering user defined hinge properties and various hinge length formulations available in literature and results compared with experimentally obtained results based on test carried out on a G+2 storied RCC framed structure. For the present study two geometric models viz bare frame and rigid frame model is considered and it is found that the results of pushover analysis are very sensitive to geometric model and hinge length adopted. Keywords: Pushover analysis, Base shear, Displacement, hinge length, moment curvature analysis
Effect of use of recycled materials on indirect tensile strength of asphalt c...eSAT Journals
Abstract
Depletion of natural resources and aggregate quarries for the road construction is a serious problem to procure materials. Hence
recycling or reuse of material is beneficial. On emphasizing development in sustainable construction in the present era, recycling of
asphalt pavements is one of the effective and proven rehabilitation processes. For the laboratory investigations reclaimed asphalt
pavement (RAP) from NH-4 and crumb rubber modified binder (CRMB-55) was used. Foundry waste was used as a replacement to
conventional filler. Laboratory tests were conducted on asphalt concrete mixes with 30, 40, 50, and 60 percent replacement with RAP.
These test results were compared with conventional mixes and asphalt concrete mixes with complete binder extracted RAP
aggregates. Mix design was carried out by Marshall Method. The Marshall Tests indicated highest stability values for asphalt
concrete (AC) mixes with 60% RAP. The optimum binder content (OBC) decreased with increased in RAP in AC mixes. The Indirect
Tensile Strength (ITS) for AC mixes with RAP also was found to be higher when compared to conventional AC mixes at 300C.
Keywords: Reclaimed asphalt pavement, Foundry waste, Recycling, Marshall Stability, Indirect tensile strength.
Null Bangalore | Pentesters Approach to AWS IAMDivyanshu
#Abstract:
- Learn more about the real-world methods for auditing AWS IAM (Identity and Access Management) as a pentester. So let us proceed with a brief discussion of IAM as well as some typical misconfigurations and their potential exploits in order to reinforce the understanding of IAM security best practices.
- Gain actionable insights into AWS IAM policies and roles, using hands on approach.
#Prerequisites:
- Basic understanding of AWS services and architecture
- Familiarity with cloud security concepts
- Experience using the AWS Management Console or AWS CLI.
- For hands on lab create account on [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
# Scenario Covered:
- Basics of IAM in AWS
- Implementing IAM Policies with Least Privilege to Manage S3 Bucket
- Objective: Create an S3 bucket with least privilege IAM policy and validate access.
- Steps:
- Create S3 bucket.
- Attach least privilege policy to IAM user.
- Validate access.
- Exploiting IAM PassRole Misconfiguration
-Allows a user to pass a specific IAM role to an AWS service (ec2), typically used for service access delegation. Then exploit PassRole Misconfiguration granting unauthorized access to sensitive resources.
- Objective: Demonstrate how a PassRole misconfiguration can grant unauthorized access.
- Steps:
- Allow user to pass IAM role to EC2.
- Exploit misconfiguration for unauthorized access.
- Access sensitive resources.
- Exploiting IAM AssumeRole Misconfiguration with Overly Permissive Role
- An overly permissive IAM role configuration can lead to privilege escalation by creating a role with administrative privileges and allow a user to assume this role.
- Objective: Show how overly permissive IAM roles can lead to privilege escalation.
- Steps:
- Create role with administrative privileges.
- Allow user to assume the role.
- Perform administrative actions.
- Differentiation between PassRole vs AssumeRole
Try at [killercoda.com](https://killercoda.com/cloudsecurity-scenario/)
Optimizing Gradle Builds - Gradle DPE Tour Berlin 2024Sinan KOZAK
Sinan from the Delivery Hero mobile infrastructure engineering team shares a deep dive into performance acceleration with Gradle build cache optimizations. Sinan shares their journey into solving complex build-cache problems that affect Gradle builds. By understanding the challenges and solutions found in our journey, we aim to demonstrate the possibilities for faster builds. The case study reveals how overlapping outputs and cache misconfigurations led to significant increases in build times, especially as the project scaled up with numerous modules using Paparazzi tests. The journey from diagnosing to defeating cache issues offers invaluable lessons on maintaining cache integrity without sacrificing functionality.
Advanced control scheme of doubly fed induction generator for wind turbine us...IJECEIAES
This paper describes a speed control device for generating electrical energy on an electricity network based on the doubly fed induction generator (DFIG) used for wind power conversion systems. At first, a double-fed induction generator model was constructed. A control law is formulated to govern the flow of energy between the stator of a DFIG and the energy network using three types of controllers: proportional integral (PI), sliding mode controller (SMC) and second order sliding mode controller (SOSMC). Their different results in terms of power reference tracking, reaction to unexpected speed fluctuations, sensitivity to perturbations, and resilience against machine parameter alterations are compared. MATLAB/Simulink was used to conduct the simulations for the preceding study. Multiple simulations have shown very satisfying results, and the investigations demonstrate the efficacy and power-enhancing capabilities of the suggested control system.
Use PyCharm for remote debugging of WSL on a Windo cf5c162d672e4e58b4dde5d797...shadow0702a
This document serves as a comprehensive step-by-step guide on how to effectively use PyCharm for remote debugging of the Windows Subsystem for Linux (WSL) on a local Windows machine. It meticulously outlines several critical steps in the process, starting with the crucial task of enabling permissions, followed by the installation and configuration of WSL.
The guide then proceeds to explain how to set up the SSH service within the WSL environment, an integral part of the process. Alongside this, it also provides detailed instructions on how to modify the inbound rules of the Windows firewall to facilitate the process, ensuring that there are no connectivity issues that could potentially hinder the debugging process.
The document further emphasizes on the importance of checking the connection between the Windows and WSL environments, providing instructions on how to ensure that the connection is optimal and ready for remote debugging.
It also offers an in-depth guide on how to configure the WSL interpreter and files within the PyCharm environment. This is essential for ensuring that the debugging process is set up correctly and that the program can be run effectively within the WSL terminal.
Additionally, the document provides guidance on how to set up breakpoints for debugging, a fundamental aspect of the debugging process which allows the developer to stop the execution of their code at certain points and inspect their program at those stages.
Finally, the document concludes by providing a link to a reference blog. This blog offers additional information and guidance on configuring the remote Python interpreter in PyCharm, providing the reader with a well-rounded understanding of the process.
Comparative analysis between traditional aquaponics and reconstructed aquapon...bijceesjournal
The aquaponic system of planting is a method that does not require soil usage. It is a method that only needs water, fish, lava rocks (a substitute for soil), and plants. Aquaponic systems are sustainable and environmentally friendly. Its use not only helps to plant in small spaces but also helps reduce artificial chemical use and minimizes excess water use, as aquaponics consumes 90% less water than soil-based gardening. The study applied a descriptive and experimental design to assess and compare conventional and reconstructed aquaponic methods for reproducing tomatoes. The researchers created an observation checklist to determine the significant factors of the study. The study aims to determine the significant difference between traditional aquaponics and reconstructed aquaponics systems propagating tomatoes in terms of height, weight, girth, and number of fruits. The reconstructed aquaponics system’s higher growth yield results in a much more nourished crop than the traditional aquaponics system. It is superior in its number of fruits, height, weight, and girth measurement. Moreover, the reconstructed aquaponics system is proven to eliminate all the hindrances present in the traditional aquaponics system, which are overcrowding of fish, algae growth, pest problems, contaminated water, and dead fish.
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...IJECEIAES
Climate change's impact on the planet forced the United Nations and governments to promote green energies and electric transportation. The deployments of photovoltaic (PV) and electric vehicle (EV) systems gained stronger momentum due to their numerous advantages over fossil fuel types. The advantages go beyond sustainability to reach financial support and stability. The work in this paper introduces the hybrid system between PV and EV to support industrial and commercial plants. This paper covers the theoretical framework of the proposed hybrid system including the required equation to complete the cost analysis when PV and EV are present. In addition, the proposed design diagram which sets the priorities and requirements of the system is presented. The proposed approach allows setup to advance their power stability, especially during power outages. The presented information supports researchers and plant owners to complete the necessary analysis while promoting the deployment of clean energy. The result of a case study that represents a dairy milk farmer supports the theoretical works and highlights its advanced benefits to existing plants. The short return on investment of the proposed approach supports the paper's novelty approach for the sustainable electrical system. In addition, the proposed system allows for an isolated power setup without the need for a transmission line which enhances the safety of the electrical network
Design and optimization of ion propulsion dronebjmsejournal
Electric propulsion technology is widely used in many kinds of vehicles in recent years, and aircrafts are no exception. Technically, UAVs are electrically propelled but tend to produce a significant amount of noise and vibrations. Ion propulsion technology for drones is a potential solution to this problem. Ion propulsion technology is proven to be feasible in the earth’s atmosphere. The study presented in this article shows the design of EHD thrusters and power supply for ion propulsion drones along with performance optimization of high-voltage power supply for endurance in earth’s atmosphere.
IJRET: International Journal of Research in Engineering and Technology | ISSN: 2319-1163 | Volume: 02, Issue: 03 | Mar-2013 | http://www.ijret.org
MULTIDIMENSIONAL SCHEMA FOR AGRICULTURAL DATA WAREHOUSE
Aditya Kumar Gupta¹, Bireshwar Dass Mazumdar²
¹Research Scholar, Department of Computer Science, Sai Nath University, Ranchi, India
²Assistant Professor, Department of Computer Science, School of Management Sciences, Varanasi, India
aditya.guptas@gmail.com, bireshwardm@gmail.com
Abstract
Agriculture is one of the most important contributors to a nation's economy, and it requires technical breakthroughs in this century. In today's computerized world, agricultural data processing is an increasing need for farmers and decision makers. Agricultural data is diversified, complex and non-standard, so developing a data warehouse for agriculture is a key challenge for researchers. The objective of this work is to design a data warehouse about crops and their requirements. The proposed data warehouse may be extended to a Decision Support System (DSS) combined with data mining techniques. We propose a multidimensional data warehouse for agriculture that provides solutions for farmers and responds to their ad-hoc queries. The multidimensional schema is further developed into the star schema and snowflake schema that are commonly used to design data warehouses. In our manuscript, normalization is applied to store the data in the star schema and duplicate values are removed, so that space and time complexities are minimized.
Index Terms: Agriculture, Data Warehouse, Multidimensional Schema, Dimensional Modeling, OLAP
1. INTRODUCTION
The first step in data warehouse design is to organize raw facts into a suitable data model. The requirements definition of a data warehouse completely drives its data design, which consists of putting all the data structures together. Agricultural data contains a large number of raw facts that are interdependent, and these facts require processing to convert operational data into informational data. The raw data is first classified into different attributes and then placed in a logical data model. This involves many data warehouse functions such as cleaning, summarizing and reformatting of attributes. An efficient data warehouse must emphasize information retrieval techniques; here efficiency concerns both the algorithms and the data structures used. Logical data design includes determining the various data elements that are needed for a particular subject. It also establishes the relationships among the various data structures used to store the data.
In our work, multidimensional modeling concepts are used to design the agricultural data warehouse. This is a logical design technique that gives structure to the agricultural dimensions and metrics analyzed as important attributes for agriculture. We explore eight types of land, distinguished by the combined percentages of various fertilizers in the soil. We also classify crops into twelve major classes and propose a classifier for weather based on the months of the calendar. These attributes are placed in a model that systematically evaluates the performance of crops for a particular land, weather and environment domain.
2. LITERATURE REVIEW
In the data warehouse literature, multidimensional cubes and their corresponding star and snowflake schemas are used to design subject-oriented databases. Dimensional modeling is the most suitable approach for designing a data warehouse for the purpose of predictive data mining. According to [Kor99], the objectives of dimensional modeling are: (i) to produce database structures that are easy for end-users to understand and write queries against, and (ii) to maximize the efficiency of queries [7]. One way to look at the multidimensional data model is to view it as a cube; the caveat here is that, as the number of dimensions increases, the number of the cube's cells increases exponentially [13]. On the other hand, the majority of multidimensional queries deal with consolidated, high-level data. Therefore the solution to building an efficient multidimensional database is to consolidate all logical attributes. Consolidation is especially valuable since typical dimensions are hierarchical in nature. We found this approach relevant and useful for our work.
3. DATA MINING MODELS
Data mining involves many different algorithms to accomplish different tasks. The algorithms examine the data and determine a model that is closest to the characteristics of the data being examined. Data mining techniques can be used to build three kinds of models, corresponding to three types of tasks:
descriptive models, profiling models, and predictive models. Hypothesis testing often produces descriptive models. Both profiling and predictive models, on the other hand, have a specific goal in mind when the model is built.
3.1 Descriptive Models
Descriptive models describe what is in the data. The output is one or more charts, numbers or graphics that explain what is going on. A descriptive model identifies patterns or relationships in data; it serves as a way to explore the properties of the data examined, not to predict new properties. Clustering, summarization, association rules, and sequence discovery are usually viewed as descriptive in nature.
3.2 Profiling Models
Profiling models are often based on demographic variables. In profiling models, the target is from the same time frame as the input. Profiling models have serious limitations; one is the inability to distinguish cause and effect, since profiling is based on familiar demographic variables. Profiling need not involve any sophisticated data analysis; surveys are the common method of building profiles. The distinction between profiling and prediction has implications for the modeling methodology, especially the treatment of time in the creation of the model set.
3.3 Predictive Models
A predictive model makes a prediction about values of data using known results found from different data sources. Prediction means finding patterns in data from one period that are capable of explaining outcomes in a later period. Predictive modeling may be based on other historical data, and it predicts what is likely to happen in the future. Building a predictive model requires a separation in time between the model inputs and the model output, the thing to be predicted. Predictive data mining tasks include classification, regression, time series analysis, and prediction. In our study we use the predictive approach to predict the performance of crops at a particular time with a particular type of soil and fertilizer set.
3.4 DIMENSIONAL MODELING
Dimensional modeling is a different way to view and interrogate data in a database. This view may be used in a DSS in conjunction with data mining tasks. Decision support applications often require that information be obtained along many dimensions. For example, a farmer may want to predict the production of a crop in a specific geographic region, for a particular seeding month, and by crop type. This query requires three dimensions: crop-set, seeding month and land type. Each dimension is a collection of logically related attributes and is viewed as an axis for modeling the data. Within each dimension, these entities form levels on which various DSS questions may be asked. The specific data stored are called facts and are usually numeric. Facts consist of measures and context data; the measures are the numeric attributes about the facts that are queried. DSS queries may access the facts from many different dimensions and levels, and the levels in each dimension facilitate the retrieval of facts from different dimensions. The data warehouse should hold a summarized collection of data attributes that makes information retrieval more efficient.
4. MULTIDIMENSIONAL SCHEMA FOR AGRICULTURAL DATA WAREHOUSE
Dimensional models represent data with a "cube" structure [Kim96-1], making the logical data representation more compatible with OLAP data management. The data can be queried directly in any combination of dimensions, bypassing complex database queries. Multidimensional models take advantage of inherent relationships in data to populate multidimensional matrices called data cubes. Query performance in these multidimensional cubes is much better than in the relational data model; the response time of a multidimensional query depends on how many cells have to be added at each dimension. In figure 4.1 we show a three-dimensional data cube that organizes the agricultural attributes crop-set, seeding month and land type. A data hypercube could be produced by including additional dimensions corresponding to attributes such as irrigation level and seed quality.
In our schema the 12 months (Jan to Dec) are the main classifier of weather and are placed along the x-axis. The crop-sets are placed along the y-axis and the land types along the z-axis. Let m be the number of calendar months on the x-axis, c the number of crop-sets on the y-axis, and l the number of land types on the z-axis. Then the total number of cells in the cube is Vn = m × c × l, i.e. 12 × 12 × 8 = 1152. A detailed description of the cube is given in the following sections.
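The cube's aggregates can also be materialized directly in SQL. The following is a minimal sketch, assuming the fact table Decision_Coefficient proposed in section 5.1 has been populated; GROUP BY CUBE is standard SQL (supported, for example, by PostgreSQL and Oracle) and produces one aggregate row for every combination of the three dimensions, including the roll-up totals.

-- Materialize aggregates for every combination of the three cube axes.
-- Decision_Coefficient and its columns follow the star schema of section 5.1.
SELECT Seeding_Month, Crop_Set, Soil_Type,
       AVG(Dec_Coeff_Val) AS Avg_Dec_Coeff
FROM Decision_Coefficient
GROUP BY CUBE (Seeding_Month, Crop_Set, Soil_Type);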
4.1 Calendar for Seeding Crops
Every crop needs particular and specific weather for the best results. In our multidimensional schema the 12-month calendar is used as the main classifier of weather. We have assumed that maximum production of any crop is possible when it is seeded in a suitable month of the calendar. The weather report for each month is based on factors such as cold, rain, humidity and heat, and forecasting the weather depends on all of these factors. Table 4.1 shows the predicted percentages of cold, rain, humidity and heat over a large part of India for each month of the calendar.
4.2 Crop-Sets
There are hundreds of crops whose success depends on attributes such as starting time, ending time, fertilizer requirements and type of seed. We identified 12 major crop-sets corresponding to their seeding months.
Each crop-set consists of a number of crops that require similar weather and environmental conditions. As shown in table 4.2, similar types of crops are put together in a crop-set, identified by the name of the major crop that belongs to it.
Table-4.1: Predicted weather percentages for each calendar month

No.  Month      % Cold  % Rain  % Humidity  % Heat
1    January      60      15       20          5
2    February     60      15       15         10
3    March        50       5       10         35
4    April        25       5       10         60
5    May          15      10        5         70
6    June          5      30        5         60
7    July          5      55       25         15
8    August       10      60       25          5
9    September    10      60       25          5
10   October      15      50       25         10
11   November     35      25       30         10
12   December     50      15       30          5
Table-4.2: Crop-sets with their seeding (starting) and ending months

Crop Set  Crop        Starting Month  Ending Month
I         Wheat       December        March
II        Potato      November        January
III       Mustard     October         March
IV        Oil-seed    September       February
V         Coffee      August          November
VI        Pulses      July            October
VII       Rice        June            October
VIII      Peppermint  May             July
IX        Sunflower   April           June
X         Cash-crops  March           June
XI        Coconut     February        August
XII       Sugarcane   January         September
Fig-4.1: Multidimensional Cube for Agricultural Attributes (x-axis: calendar months Jan-Dec; y-axis: crop-sets I-XII; z-axis: land types)
Table-4.3: Land types by percentage of fertilizers in the fertilizer set

S No. | Land Type | U  | P  | Fe | N  | Z  | S  | Ca | K  | Weight of Fertilizer (Pf)     | Total Weight (Fw)
1     | L1        | 20 | -  | -  | 50 | -  | -  | -  | 10 | U2 + N5 + K1                  | 8
2     | L2        | 20 | 30 | 10 | 10 | 10 | -  | -  | -  | U2 + P3 + Fe1 + N1 + Z1       | 8
3     | L3        | 20 | -  | 10 | 20 | 10 | 20 | -  | -  | U2 + Fe1 + N2 + Z1 + S2       | 8
4     | L4        | -  | 30 | -  | 20 | -  | -  | 10 | 20 | P3 + N2 + Ca1 + K2            | 8
5     | L5        | 10 | -  | -  | 30 | -  | 10 | 20 | 10 | U1 + N3 + S1 + Ca2 + K1       | 8
6     | L6        | 40 | 10 | -  | 10 | -  | 20 | -  | -  | U4 + P1 + N1 + S2             | 8
7     | L7        | -  | -  | 40 | 10 | 20 | -  | 10 | -  | Fe4 + N1 + Z2 + Ca1           | 8
8     | L8        | 10 | 10 | 10 | 20 | -  | 30 | -  | -  | U1 + P1 + Fe1 + N2 + S3       | 8
4.3 Land Type
The impact of fertilizers on the production of any crop is significant and measurable. A particular type of land contains various fertilizers and minerals, and many types of land can be identified based on the combined ratio of fertilizers in the soil. In our study we identified 8 major fertilizers that are naturally available in soil; however, the fertilizer-set can be generalized and may change with geographical region.
1. Urea (U)
2. Phosphorus (P)
3. Iron (Fe)
4. Nitrogen (N)
5. Zinc (Z)
6. Sulfur (S)
7. Calcium (Ca)
8. Potassium (K)
Based on the ratio of the above fertilizers present in the soil, we have identified 8 types of land for our study; table 4.3 shows the land types according to the percentage of each fertilizer available in the soil. In table 4.3 we hypothetically demonstrate 8 different types of land based on the above fertilizers; for generalization purposes, n types of land could be identified. In the model we further assign an 8-unit weight to each type of soil, where each unit of weight corresponds to a 10 percent availability of a particular fertilizer in the fertilizer-set. For example, land type L1 contains 50 percent Nitrogen, 20 percent Urea and 10 percent Potassium (table 4.3); the remaining 20 percent consists of other elements in the soil. We do not include this 20 percent in the study, treating it as non-deterministic minerals, since soil has many other impurities.
4.4 Irrigation
Irrigation is an important factor that strongly affects the production of any crop. In our model the irrigation factor is considered a dimension of the cube (not shown in the figure). We have taken three possible states of irrigation: high, medium and low. Accordingly, we assign a constant value to each of these levels of irrigation for computational purposes. The irrigation factor is also described in the corresponding star schema and further extended in the snowflake schema.
4.5 Seed Quality
With the development of biotechnology, researchers produce high-quality seeds, and the production rate of any crop certainly depends on the quality of the seed. As with irrigation, seed quality is classified into three levels, high, medium and low, and we have assigned a constant value to each level in our model. Like irrigation, seed quality is also considered a dimension of the cube (not shown in the figure). The seed quality factor is likewise described in the corresponding star schema and then in the snowflake schema.
5. ARRANGING DIMENSIONS IN SCHEMA
Specialized schemas have been developed to portray multidimensional data; these include the star schema and the snowflake schema. A star schema shows data as a collection of two types: facts and dimensions. The star schema view can be obtained via a relational system where each dimension is a table and the facts are stored in a fact table.
Unlike a relational schema, which is flat, a star schema gives a graphical view of the data. At the center of the star, the data being examined, the facts, are shown in the fact table. Around the facts, each dimension is shown separately in a dimension table. Descriptive information about the dimensions is stored in the dimension tables, which tend to be small, while the actual data being accessed is stored in the fact table, which tends to be quite large. To ensure efficiency of access, facts may be stored for all possible aggregation levels; the fact data would then be extended to include a level indicator. The snowflake schema is an extension of the star schema that facilitates more complex data views: the aggregation hierarchy is described explicitly in the schema itself. The snowflake schema can be understood as a partially normalized version of the corresponding star schema.
5.1 Star Schema Arrangement
The star schema proposed for the multidimensional agricultural data warehouse starts with a central fact table that holds the facts about agricultural technology. Data in the fact table can be viewed as a regular relation with an attribute for each fact to be stored, the key being the values for each dimension. Each row of the central table contains a combination of keys that makes a fact unique; these keys are called dimensions.
In our model we use the simplest star schema, with one fact table and three dimension tables corresponding to weather, crop-set and land type. Two other dimensions, irrigation and seed quality, are also included in the model for computing the decision coefficient. The central fact table also has other columns that contain information specific to each row, such as irrigation level, crop-set, seeding month and seed quality; for the purpose of computation we have assigned numeric values to all this information. Figure 5.1 shows the star schema arrangement of the multidimensional cube given in the previous section. The fact table contains the major attributes: irrigation level, crop-set, seeding month, soil type and seed quality. A computational model is required to produce the value of the decision coefficient, which is the success ratio of any crop based on the attributes described in the cube. We can compute the decision coefficient for any crop by providing the values of these attributes as input to the computational model. There are four basic approaches to the storage of data in dimension tables [PB99]: the flattened, normalized, expanded and levelized approaches, and each dimension table can be stored in any of these manners. We have adopted the normalized approach to store the data and remove duplicate values. In this approach a table exists for each level in each dimension, and each table has one tuple for every occurrence at that level. As with normalization, each lower-level dimension table has a foreign key pointing to the next higher level. Figure 5.2 provides the relations that implement the star schema for the agricultural data warehouse using the normalized approach.
Fig-5.1: Star Schema for Agricultural Data Warehouse
Fact table: Decision Coefficient (Irrigation Level, Crop Set, Seeding Month, Soil Type, Seed Quality, Dec_Coeff_Val)
Dimension tables: Weather (Seeding Month, Weather Weight); Soil (Soil Type, Soil Weight, Fertilizer Set, Location); Crop (Crop Set, Description, Type, Starting Month, Ending Month); Irrigation (Irrigation Level, Irrigation Weight); Seed (Seed Quality, Seed Weight)
5.2 Snowflake Schema Arrangement
A snowflake schema is a logical arrangement of tables in a multidimensional database in which the entity-relationship diagram resembles a snowflake in shape. The snowflake schema is represented by a centralized fact table connected to multiple dimensions. In a snowflake schema the dimensions are normalized into multiple related tables, whereas in a star schema each dimension is represented by a single de-normalized (flattened) table that stores all of the dimension's attributes. The star schema therefore requires more disk space than a snowflake schema, which is the more normalized approach. The snowflake schema normalizes the dimensions by moving attributes with few distinct values into separate dimension tables that relate to the core dimension table through foreign keys. A complex snowflake shape emerges when the dimensions of a star schema are elaborated into multiple levels of relationships, i.e. the child tables have multiple parent tables. In fig-5.3 we have elaborated the weather dimension into a weather report, irrigation into irrigation points, and seed into seed points. The soil dimension is extended into three relations: land, fertilizer availability and fertilizer weight. Snowflaking is used to improve query performance against low-cardinality attributes, and business intelligence applications that use an OLAP architecture may perform better when the snowflake schema is used for data warehousing.
Decision Coefficient (Irrigation Level, Crop Set, Seeding Month, Soil Type, Seed Quality, Dec_Coeff_Val)
Irrigation (Irrigation Level, Irrigation Weight)
Irrigation Points (Irrigation Weight, Low, Mid, High)
Crop (Crop Set, Description, Type, Starting Month, Ending Month)
Weather (Seeding Month, Weather Weight)
Weather Report (Weather Weight, Cold, Rain, Humidity, Heat)
Soil (Soil Type, Soil Weight, Fertilizer Set, Location)
Fertilizers Availability (Soil Weight, PerAvlF1, PerAvlF2, PerAvlF3, PerAvlF4, PerAvlF5, PerAvlF6, PerAvlF7, PerAvlF8, PerImpurity)
Fertilizer (Fertilizer Set, f1, f2, f3, f4, f5, f6, f7, f8)
Land (Location, State, District, Village)
Seed (Seed Quality, Seed Weight)
Seed Points (Seed Weight, Low, Mid, High)
Fig-5.2: Implementation Details of Star Schema for Agricultural Data Warehouse
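The relations of Fig-5.2 can be created with standard SQL DDL. The following is a minimal sketch of the core star tables under the normalized approach; the column names follow the figure, while all data types, and the use of month names and level names as keys, are assumptions made here for illustration only.

CREATE TABLE Crop (
    Crop_Set       VARCHAR(4) PRIMARY KEY,   -- crop-set identifier I .. XII
    Description    VARCHAR(100),
    Type           VARCHAR(50),
    Starting_Month VARCHAR(10),
    Ending_Month   VARCHAR(10)
);

CREATE TABLE Weather (
    Seeding_Month  VARCHAR(10) PRIMARY KEY,  -- Jan .. Dec
    Weather_Weight INT
);

CREATE TABLE Soil (
    Soil_Type      VARCHAR(4) PRIMARY KEY,   -- land type L1 .. L8
    Soil_Weight    INT,
    Fertilizer_Set VARCHAR(60),
    Location       VARCHAR(60)
);

CREATE TABLE Irrigation (
    Irrigation_Level  VARCHAR(10) PRIMARY KEY,  -- low / medium / high
    Irrigation_Weight INT
);

CREATE TABLE Seed (
    Seed_Quality VARCHAR(10) PRIMARY KEY,       -- low / medium / high
    Seed_Weight  INT
);

-- Central fact table: one row per cube cell, keyed by all five dimensions.
CREATE TABLE Decision_Coefficient (
    Irrigation_Level VARCHAR(10) REFERENCES Irrigation (Irrigation_Level),
    Crop_Set         VARCHAR(4)  REFERENCES Crop (Crop_Set),
    Seeding_Month    VARCHAR(10) REFERENCES Weather (Seeding_Month),
    Soil_Type        VARCHAR(4)  REFERENCES Soil (Soil_Type),
    Seed_Quality     VARCHAR(10) REFERENCES Seed (Seed_Quality),
    Dec_Coeff_Val    DECIMAL(6,3),  -- computed success ratio for the cell
    PRIMARY KEY (Irrigation_Level, Crop_Set, Seeding_Month, Soil_Type, Seed_Quality)
);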
6. OLAP FOR AGRO DATA WAREHOUSE
Online analytical processing (OLAP) systems are designed to provide more complex query results than online transaction processing (OLTP) or relational database systems. An OLAP system provides additional analysis of data and supports queries of a more imprecise nature; this is what basically differentiates an OLAP application from a traditional database or OLTP application. The multidimensional view of data is fundamental to OLAP applications: the complex nature of an OLAP application requires a multidimensional view of data, and the type of data accessed is often a data warehouse. OLAP tools can be classified as relational OLAP (ROLAP) or multidimensional OLAP (MOLAP). Data warehouse systems often contain multiple OLAP cubes, and the power of OLAP arises from the practice of sharing dimensions across different cubes. In a MOLAP system, data are modeled, viewed and physically stored in a multidimensional schema. MOLAP tools are implemented by specialized DBMSs and software systems capable of supporting multidimensional data. With MOLAP the cube view of data is stored as an n-dimensional array; this approach requires extremely high storage space, and indices may be used to speed up processing of the data. Parallel computing can be used to overcome this limitation, so that such a cube requires less processing time.
In our agricultural data warehouse we purposefully use a multidimensional cube to store the data. Every cube is particularly valuable because it makes drill-down and roll-up operations possible with other cubes. To assist roll-up and drill-down operations, aggregation is used, and the pre-computed values are stored in the model.
Fig-5.3: Extension of Star Schema into Snowflake Schema for Agricultural Data Warehouse. In addition to the core star tables (the Decision Coefficient fact table with the Weather, Soil, Crop, Irrigation and Seed dimensions), the snowflake adds the outrigger tables Weather Report (Weather Weight, Cold, Rain, Humidity, Heat), Irrigation Points (Irrigation Weight, Low, Mid, High), Seed Points (Seed Weight, Low, Mid, High), Land (Location, State, District, Village), Fertilizers (Fertilizer Set, f1-f8) and Fertilizer Availability (Soil Weight, PerAvlF1-PerAvlF8, PerImpurity).
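As a concrete illustration of the snowflaking in Fig-5.3, the minimal sketch below moves the weather-report attributes out of the Weather dimension into an outrigger table. Names follow the figure; the data types, and the foreign-key syntax, are assumptions consistent with the DDL sketch given after Fig-5.2.

-- Outrigger table holding the low-cardinality weather-report attributes.
CREATE TABLE Weather_Report (
    Weather_Weight INT PRIMARY KEY,
    Cold     INT,   -- predicted % cold (table 4.1)
    Rain     INT,   -- predicted % rain
    Humidity INT,   -- predicted % humidity
    Heat     INT    -- predicted % heat
);

-- The core Weather dimension now references the outrigger through its key.
ALTER TABLE Weather
    ADD FOREIGN KEY (Weather_Weight) REFERENCES Weather_Report (Weather_Weight);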
In section 5.1 we mentioned the requirement of a computational model that computes the value of the decision coefficient stored in the star schema. A simple query may look at a single cell within the cube; however, queries may also be stated in multidimensional terms. Several types of OLAP operations support answering such complex queries.
Slice: This operation is performed by selecting values on one dimension; for example, the decision coefficient values for all crops in a particular month. The SQL statement may be given as:

SELECT Dec_Coeff_Val FROM Decision_Coefficient
WHERE Seeding_Month = 'July';

Dice: This is performed by slicing on one dimension and then rotating the cube to select on a second dimension, yielding a sub-cube, i.e.:

SELECT Dec_Coeff_Val FROM Decision_Coefficient
WHERE Seeding_Month IN ('June', 'July', 'Aug')
AND Crop_Set IN ('VI', 'VII');

Roll-up: This allows asking queries that move up an aggregation hierarchy; instead of looking at one fact we look at an aggregate over all the facts:

SELECT Seeding_Month, AVG(Dec_Coeff_Val) AS Avg_Dec_Coeff
FROM Decision_Coefficient
GROUP BY Seeding_Month;

Drill-down: This operation allows the user to navigate lower in the aggregation hierarchy and obtain more specific results, such as the decision coefficient for a particular crop in a particular month:

SELECT Dec_Coeff_Val FROM Decision_Coefficient
WHERE Seeding_Month = 'July' AND Crop_Set = 'VII';

The other OLAP operations may be used to support more analytical results over the agricultural facts, and the above operations can be combined to answer more analytical queries.
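For instance, a dice and a roll-up can be combined in a single statement. The following minimal sketch (the month values are illustrative) restricts the cube to the monsoon seeding months and then rolls the decision coefficient up by crop-set:

-- Dice to the monsoon months, then roll up by crop-set.
SELECT Crop_Set, AVG(Dec_Coeff_Val) AS Avg_Dec_Coeff
FROM Decision_Coefficient
WHERE Seeding_Month IN ('June', 'July', 'Aug')
GROUP BY Crop_Set
ORDER BY Avg_Dec_Coeff DESC;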
CONCLUSIONS
In this manuscript we have applied the concepts of dimensional modeling. We identified the major agricultural attributes and placed them in the corresponding dimensions of the cube. We then developed a star schema for the multidimensional data cube and identified the primary keys that connect each dimension to the central fact table of the star schema. The fact table collects information from each of the dimensions; the values of the decision coefficient are then computed and stored in the fact table, corresponding to each cell of the cube. We also extended the star schema into a snowflake schema by applying hierarchical aggregation to each dimension; in the snowflake schema, the microscopic view of the factors affecting each dimension of the cube is clearly explained. In the last section of the work we applied OLAP operations to analyze the facts stored in the multidimensional database. The OLAP tools help farmers and decision makers to predict the production of crops, to find the lack of any fertilizer in the soil, or to estimate the level of irrigation required during the growth of the crop.
The real strength of the paper lies in the adoption of predictive data mining techniques. Many data mining tools and algorithms may be applied to this agro database, which can then respond to more complex queries. For example, a farmer might discover that a particular crop produces better results when it is irrigated at a particular time under particular weather conditions. A decision maker may find patterns in crop sequences that give better results; for example, farmers of Chhattisgarh, Orissa and parts of Bihar could find that a rice-wheat system is more suitable than a rice-pulses system. Similarly, other complex and specific queries can be answered using data mining. This paper strongly advocates the use of predictive data mining techniques to retrieve more sophisticated information from the agricultural data warehouse. We will find our work meaningful if agent-based software could be developed for warehousing and mining agricultural attributes.
REFERENCES
[1] Aditya K. Gupta, M. H. Khan, "Clock Based Model for Cropping System: A Framework for Agricultural Data Warehouse". National Conference, CSI, Delhi Chapter, Feb 2007.
[2] Agricultural Research Data Book (2002), IASRI, New Delhi.
[3] Agricultural Statistics at a Glance (2001), Directorate of Economics and Statistics, Department of Agriculture and Cooperation, Ministry of Agriculture, Govt. of India.
[4] Alex Berson and Stephen J. Smith, Data Warehousing, Data Mining, and OLAP. Tata McGraw-Hill, 2004.
[5] C. Adamson, M. Venerable, Data Warehouse Design Solutions. J. Wiley & Sons, Inc., 1998.
[6] E. Thomsen, OLAP Solutions: Building Multidimensional Information Systems. John Wiley & Sons, Inc., 1997.
[7] R. Kimball, L. Reeves, M. Ross and W. Thornthwaite, The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses. John Wiley & Sons, 1998.
[8] L. Silverston, W. H. Inmon, K. Graziano, The Data Model Resource Book. J. Wiley & Sons, Inc., 1997.
[9] M. Blaschka, FIESTA: A Framework for Schema Evolution in Multidimensional Information Systems. Proc. of 6th CAiSE Doctoral Consortium, 1999, Heidelberg, Germany.
[10] M. Golfarelli, S. Rizzi, A Methodological Framework for Data Warehouse Design. DOLAP 1998.
[11] Margaret H. Dunham, Data Mining: Introductory and Advanced Topics. Pearson Education, 2003.
[12] Michael J. A. Berry, Gordon S. Linoff, Data Mining Techniques for Marketing, Sales and CRM. Wiley Publishing Inc., 2004.
[13] Paulraj Ponniah, Data Warehousing Fundamentals. J. Wiley & Sons, 2005.
[14] R. Agrawal, A. Gupta, S. Sarawagi, Modeling Multidimensional Databases. ICDE 1997.
[15] R. Kimball, The Data Warehouse Lifecycle Toolkit. J. Wiley & Sons, Inc., 1998.
[16] Ramez Elmasri and Shamkant B. Navathe, Fundamentals of Database Systems, 3rd edition. Addison Wesley, 2000.
[17] S. Chaudhuri, U. Dayal, An Overview of Data Warehousing and OLAP Technology. SIGMOD Record 26(1), 1997.
[18] W. H. Inmon, Building the Operational Data Store. John Wiley & Sons Inc., 1996.
BIOGRAPHIES:
Aditya Kumar Gupta has expertise in DBMS and data warehouse technologies. After completing his MCA in 2002, he is presently pursuing a Ph.D. in computer science; the objective of his research is to design a data warehouse for Indian agriculture. He has more than eight years of rich experience in academia and industry. He has presented his work at a national conference held in New Delhi, organized by the Department of Science and Technology, Government of India, and the Computer Society of India.
Bireshwar Dass Mazumdar earned his Ph.D. from IIT-BHU, Varanasi, and completed his MCA in 2004 from UP Technical University. Presently he is working as Assistant Professor in the School of Management Sciences, Varanasi. Dr. Mazumdar has seven years of core teaching experience in reputed academic institutions. He has expert knowledge of subjects such as multi-agent systems, data warehousing, data mining, artificial intelligence, and software quality engineering. He has published research in reputed journals, with seven international and three national publications to his credit.