This document discusses analyzing wine quality using chemical properties data from red wine varieties in Portugal. The objectives are to predict quality ranking from chemical properties and provide guidance to vineyards without relying on wine tasters. Various analyses are performed, including basic statistics, histograms, correlation matrices, scatter plots, and hypothesis testing. Linear regression models find alcohol content is highly correlated with quality.
Lush is a cosmetics company founded in the UK in 1994 that sells fresh handmade cosmetic products. They have over 820 stores globally. The document discusses Lush's marketing strategy for entering the Malaysian market. It analyzes the political, economic, social, technological, environmental and cultural factors in Malaysia and discusses Lush's values and competitive advantages. It recommends that Lush globalize further through international expansion.
The marketing team conducted a blind test of Smirnoff's current vodka blend against two new test blends among regular vodka drinkers. Data analysis showed that taste and mouthfeel were the key drivers of overall preference. Test Blend 1 scored higher than the current blend on these attributes. The report recommends replacing the current blend with Test Blend 1, and focusing future product development on aroma, taste, and mouthfeel. It also notes some anomalies in the data that require further investigation to ensure validity.
Red Bull has built brand equity through non-traditional marketing strategies like sponsoring extreme sports events. It targets niche groups like nightclub attendees and athletes. Red Bull communicates its message of providing energy simply through its slogan "Red Bull gives you wings". The company leverages event sponsorships and athlete endorsements to supplement word-of-mouth marketing. Red Bull's unique blue-silver cans are also easily recognizable. However, its association with late-night alcohol consumption could undermine its message as a stimulant for any time of day. To maintain momentum, Red Bull needs to continue targeting youth and investing in event sponsorships tied to physical endurance.
The document discusses analyzing wine quality prediction using machine learning models. It aims to predict wine quality, which is measured on an ordinal scale of 3 to 9, based on various predictor factors about the wines. The document performs data cleaning and preprocessing steps like handling missing data through mean imputation and normalizing variables. It analyzes the distributions of the predictor variables which are found to mostly follow a normal distribution. The ranges of the predictor variables are also examined and found to make sense. The objective is to apply ML models to predict wine quality and use autoML and SHAP to analyze model performance and feature importance.
An analysis of red wine, programmed in R, that identifies the chemical properties contributing to the quality and alcohol content of red wine.
Wine Taste Preference Modeling Based On Physicochemical Tests_ShuaiWei (Shuai Wei)
This document discusses modeling wine taste preferences based on physicochemical tests of wine samples. It analyzes data from over 6,000 white and red wine samples, measuring 11 chemical properties and sensory quality. Multiple regression techniques are applied to explore relationships between properties and quality ratings, including linear regression, Lasso, regression trees, and random forests. The best models explained around 30% of the variation in quality and had good predictive performance on test data.
Leeder Analytical provides wine testing services including analysis of organic acids, sugars, metals, alcohol content, and other chemicals. They test for various organic acids like tartaric acid, lactic acid and citric acid. They also test for sugars like glucose, fructose, and sucrose. Additionally, they analyze metals in wine including calcium, sodium, copper, iron, and magnesium. Their services help wine makers understand the chemical composition of grapes and wine.
The document analyzes the relationship between physicochemical elements of Vinho Verde white wine and human sensory grades. It uses data on 4898 wine samples to build a neural network model. The model predicts grades best for medium-quality wines and worst for high-quality wines. Analysis found that acidity attributes contribute to freshness, while multiple attributes complexly impact taste balance. Key physicochemical indicators like alcohol content were identified. Practical suggestions could help the growing wine industry control quality and improve prices.
Without any prior knowledge of wine or wine quality, and purely out of curiosity, I worked on the well-known wine dataset and found some relationships between the composition of the wine and the quality ratings given by individuals.
Kindly go through the report and share your comments and suggestions.
Assignment - 03
Model Building, Selection, & Prediction
Question 1:
1. Predicting the Output Variable Y – Energy Production Prediction
a) Importing the data from a CSV file and splitting it into training and test data:
Using the read.csv() function, we can import the data into R.
INPUT:
OUTPUT:
INPUT:
OUTPUT:
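The original INPUT and OUTPUT listings for this step did not survive, and the assignment itself was written in R. As a rough illustration of the import-and-split step only, here is a minimal Python sketch using the standard library. The column names, the sample values, the 70/30 split, and the seed are all assumptions made for illustration and are not taken from the assignment.

```python
import csv
import io
import random

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows reproducibly and return (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical stand-in for the assignment's CSV file (values are made up).
sample_csv = io.StringIO(
    "energy,pressure,wind\n"
    "10.1,1012,3.2\n"
    "11.4,1015,4.1\n"
    "9.8,1009,2.7\n"
    "12.0,1018,5.0\n"
    "10.7,1013,3.9\n"
    "11.1,1014,4.4\n"
    "9.5,1008,2.2\n"
    "12.3,1019,5.3\n"
    "10.3,1011,3.0\n"
    "11.8,1017,4.8\n"
)

rows = list(csv.DictReader(sample_csv))
train, test = train_test_split(rows, test_fraction=0.3)
print(len(train), "training rows,", len(test), "test rows")
```

Fixing the seed keeps the split reproducible, so the fit statistics reported for later models are comparable between runs.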
b) Fitting a Linear Regression Model:
Running the Linear Regression Model with all the Variables
INPUT:
OUTPUT:
The Adjusted R-Squared value is found to be 0.2366.
From the output it can be seen that only Pressure and Wind are significant.
So, we run the model with only the Wind and Pressure variables.
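For readers unfamiliar with the statistic quoted above, the standard definition of Adjusted R-Squared can be sketched as follows. This is an illustrative Python helper, not the author's R code; the function name and the example numbers are hypothetical.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Adjusted R-squared: penalises R-squared for the number of predictors.

    r2           -- ordinary R-squared of the fitted model
    n_obs        -- number of observations used in the fit
    n_predictors -- number of predictors, excluding the intercept
    """
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Made-up example: R-squared of 0.5 from 11 observations and 2 predictors.
example = adjusted_r_squared(0.5, n_obs=11, n_predictors=2)
print(example)
```

Because the penalty grows with the number of predictors, a model with more variables can have a higher R-squared but a lower Adjusted R-Squared, which is why the assignment compares models on the adjusted value.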
Reduced Regression Model (Wind and Pressure variables only)
INPUT:
OUTPUT:
The Wind variable is removed because the Adjusted R-Squared value is only 0.0229. We now run the regression using only the Pressure variable.
Running the regression model with only the Pressure variable:
INPUT:
OUTPUT:
The Adjusted R-Squared value is found to be 0.219, which is less than the previous regression models.
An ANOVA test is conducted to compare the model that includes all variables with the reduced model and the Pressure-only model.
INPUT:
OUTPUT:
Between the all-variable model and the reduced model, the p-value is found to be 0.2578, so we do not reject the null hypothesis and we use the reduced model.
Between the Pressure model and the reduced model, the p-value is found to be 0.0768, so we do not reject the null hypothesis and we use the Pressure model.
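The comparisons above rest on the F-test for nested linear models. A minimal Python sketch of the test statistic is given below; it is an illustration under stated assumptions rather than the assignment's own R code. The function name and the residual sums of squares in the example are hypothetical, and the sketch stops at the F statistic: converting it to a p-value needs the F distribution, which the Python standard library does not provide.

```python
def nested_f_statistic(rss_reduced, rss_full, df_reduced, df_full):
    """F statistic for comparing a reduced model against a fuller nested model.

    rss_*  -- residual sum of squares of each model
    df_*   -- residual degrees of freedom (observations minus fitted parameters)
    """
    extra_params = df_reduced - df_full
    numerator = (rss_reduced - rss_full) / extra_params
    denominator = rss_full / df_full
    return numerator / denominator

# Made-up example: the full model has two more parameters than the reduced one.
f_value = nested_f_statistic(rss_reduced=120.0, rss_full=100.0,
                             df_reduced=47, df_full=45)
print(f_value)
```

A small F value (and hence a large p-value) means the extra variables do not reduce the residual error enough to justify keeping them, which is the reasoning behind preferring the smaller model in both comparisons.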
Running Best Subset to find the model:
Best subset selection computes the fit statistics for the candidate models and prints them for comparison, so that we can select the appropriate variables.
INPUT:
OUTPUT:
The RSS value decreases as the number of variables increases.
The model with 5 variables has the highest Adjusted R-Squared.
The model with 3 variables has the smallest AIC (or Cp).
The model with 8 variables has the smallest BIC.
Since the best subset approach does not point to a single model, we check the predicted R-squared and use the model with the highest R-squared and the lowest RMSE.
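To make the criteria in this comparison concrete, the sketch below computes AIC and BIC from a model's RSS using the usual Gaussian linear-model forms, up to an additive constant that is the same for every model fitted to the same data. It is written in Python for illustration; the function names and example numbers are hypothetical, and the values will not necessarily match the scale printed by a particular R package.

```python
import math

def aic_from_rss(rss, n_obs, n_params):
    """AIC (up to a constant) for a Gaussian linear model: n*ln(RSS/n) + 2k."""
    return n_obs * math.log(rss / n_obs) + 2 * n_params

def bic_from_rss(rss, n_obs, n_params):
    """BIC (up to a constant) for a Gaussian linear model: n*ln(RSS/n) + k*ln(n)."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)

# Made-up example: 100 observations, RSS of 100, 3 fitted parameters.
print(aic_from_rss(100.0, 100, 3))
print(bic_from_rss(100.0, 100, 3))
```

Both criteria reward a lower RSS and penalise extra parameters, but BIC's penalty grows with the sample size, so the two can disagree about the best model size, as they do in the output summarised above.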
R square and RMSE Prediction:
For the model with all variables:
INPUT:
OUTPUT:
For the Reduced Model with Pressure and Wind Variables:
INPUT:
OUTPUT:
Single-variable model with Pressure as the predictor:
INPUT:
OUTPUT:
Summary:
From the analysis we can conclude that the model with Pressure as the only predictor is better than the other models. Its Adjusted R-Squared value of 0.31 is the best, and its RMSE is also the lowest.
Based on the Adjusted R-Squared value, we conclude that the Pressure model is the best; it explains about 31% of the variance in the energy production rate.
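The two measures used in this comparison, RMSE and R-squared on held-out data, can be sketched in Python as follows. This is an illustration only: the function names and the sample values are hypothetical and are not the assignment's data or its R code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between observed and predicted values."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def r_squared(actual, predicted):
    """Share of the variance in the observed values explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Made-up test-set values and model predictions.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(rmse(actual, predicted), r_squared(actual, predicted))
```

Computing both on the test set rather than the training set is what makes them a check on predictive performance rather than on goodness of fit.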
c) Backward Selection Approach:
Regression Model using all the variables:
INPUT:
OUTPUT:
Conclusion:
The backward stepwise AIC function gives a slightly different result than the models generated above. However, when we create the regression model we see a lower R2 value than our single mod.
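The backward selection idea can be sketched as follows: start from the model with all variables and repeatedly drop the variable whose removal lowers AIC the most, stopping when no removal helps. The Python below is a self-contained illustration under stated assumptions, not the assignment's R code. The variable names and data are made up, the AIC is the Gaussian form n*ln(RSS/n) + 2k, and the least-squares fit uses plain normal equations, which is adequate only for small, well-conditioned examples.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def rss_of_fit(columns, y):
    """Fit y on an intercept plus the given columns; return the residual sum of squares."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    xtx = [[sum(row[a] * row[b] for row in design) for b in range(p)] for a in range(p)]
    xty = [sum(design[i][a] * y[i] for i in range(n)) for a in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(bj * xj for bj, xj in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))

def aic(rss, n, k):
    """Gaussian AIC up to a constant, with k counting the intercept."""
    return n * math.log(rss / n) + 2 * k

def backward_select(data, y):
    """data maps variable name -> values. Drop variables one at a time while AIC improves."""
    selected = list(data)
    n = len(y)
    best = aic(rss_of_fit([data[v] for v in selected], y), n, len(selected) + 1)
    while selected:
        candidates = []
        for v in selected:
            rest = [u for u in selected if u != v]
            score = aic(rss_of_fit([data[u] for u in rest], y), n, len(rest) + 1)
            candidates.append((score, v))
        score, drop = min(candidates)
        if score >= best:
            break  # no single removal improves AIC
        best = score
        selected.remove(drop)
    return selected

# Made-up data: y depends on "pressure" only; "wind" is unrelated by construction.
data = {
    "pressure": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "wind": [1.0, -1.0, -1.0, 1.0, 1.0, -1.0, -1.0, 1.0],
}
noise = [0.5, -0.5, 0.5, -0.5, -0.5, 0.5, -0.5, 0.5]
y = [2.0 * p + e for p, e in zip(data["pressure"], noise)]
print(backward_select(data, y))
```

In this constructed example the unrelated variable contributes nothing to the fit, so removing it lowers AIC by the penalty alone, while removing the real predictor would raise the RSS sharply and is rejected.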
Beer is an incredibly complex beverage containing more than 3000 different compounds, including carbohydrates, proteins, ions, microbes, organic acids, and polyphenols, among others. Some of the analytical methods used for quality control are presented.
Practical White Wine Production: Theory and Practice (Sabrina Lueck)
The document discusses white wine making theory and practice at the Walla Walla Center for Enology and Viticulture, including details on their 2013 and 2012 wine awards. It provides a two-part format covering chemical parameters and processes in winemaking, and how to identify issues and apply concepts to make easier and better wines. The Center thanks those who support their work empowering them to be better winemakers and educators.
This document analyzes a dataset containing information about 1000 red wines and 1000 white wines. It aims to understand the characteristics of these wines and how different ingredients affect quality. The dataset includes 13 attributes for each wine like acidity, sugar, and alcohol content. The analysis includes descriptive statistics by wine type and quality, decision trees to predict type and quality based on ingredients, and clustering to group wines. Key findings are that chlorides, sulfates and acidity help predict wine type, while alcohol and density most influence quality.
Determination of Wine Color and Total Phenol Content using the LAMBDA PDA UV/... (PerkinElmer, Inc.)
Historically, the earliest evidence of viniculture dates to approximately 8,000 years ago, and it has become increasingly prevalent worldwide in recent years. The expansion of markets and producers has resulted in an escalation in the methods used to guarantee the safety and quality of wine.
Wine contains over 600 nutritional substances, including vitamins, organic acids and, more importantly, polyphenols. The seeds and skin of the grape provide a valuable source of polyphenols, and with increasing interest in their health-enhancing properties as antioxidants, research has gathered pace over the last 15 years. The key benefits found have been helping to counter aging and cardiovascular disease by preventing the oxidation of Low Density Lipoprotein (LDL).1
The versatility of PerkinElmer’s LAMBDA™ 265 and LAMBDA 465 PDA UV/Visible Spectrophotometers allows quantification of the total phenol content in the wines, and also wine color to be measured to determine quality and any potential contamination.
This document examines quality properties in a wine dataset from a Portuguese vineyard called Vinho Verde. The dataset contains over 1000 observations of red wines rated by experts on quality. Quality ratings are associated with 12 quantitative characteristics that may influence quality. Three models are created to predict quality:
1. The first model uses all 12 characteristics to predict quality but has a low adjusted R-square. Seven characteristics are significant predictors of quality.
2. Stepwise regression identifies the same seven predictors as the first model.
3. A third model attempts to predict high quality wines rated 7 or above but fails due to a low adjusted R-square and only alcohol being a marginally significant predictor. The model is rejected due to
This document provides information about Vaikash Exims, an Indian exporter of various chemicals and minerals. It introduces the company and its owner, K.K. Kumar, who has over 20 years of experience in export. The company exports products to destinations around the world, especially in Asia, Africa, and the Middle East. The document lists the company's certifications and awards and provides details on its product offerings, including chemical analysis reports and specifications for various chemicals and minerals. It describes common uses of the chemicals in different industries.
2018 Oregon Wine Symposium | Understanding Control Points from Crush Pad to B... (Oregon Wine Board)
Sydney Morgan presented her research on the influence of decisions made at crush on yeast populations and wine sensory profiles. She studied the effects of uninoculated fermentations, sulfur dioxide additions, and pied de cuve inoculations using pinot gris and chardonnay grapes. Her results showed that uninoculated fermentations with low or no sulfur dioxide increased yeast diversity and introduced indigenous yeasts, producing wines with more tropical and fruit characteristics. Pied de cuve inoculations required a higher yeast count to be effective. Future work includes identifying indigenous yeasts in vineyards and evaluating their fermentation potential to improve complexity in minimal intervention wines.
Target variable: Quality
Parameters associated: Alcohol, pH, Acidity, Volatility
The following Quality can be achieved
Pricing based on the chemical and physiometric properties.
Segmentation: Defining new markets.
DFV Wines is steadfastly committed to crafting and representing wines of the highest quality, produced in accordance with sustainable wine-growing practices and using data mining.
Analysis of fermentation products of (2) (1) (prakash64742)
The document provides information on the analysis of fermentation products of spirits. It discusses the key constituents typically found in spirits like brandy, gin, rum, whisky and vodka. These include alcohols, acids, esters, aldehydes and others. The document also describes analytical tests prescribed for analyzing spirits, such as tests for alcohol content, total solids, acidity, esters and others. Distillation is used to produce spirits from fermented liquors resulting in products high in alcohol that do not spoil microbially.
The document presents analytical testing methods for alcoholic beverages as established by the Food Safety and Standards Authority of India (FSSAI). It begins with an introduction to FSSAI's regulations for alcoholic beverages and lists common types of alcoholic beverages. The main part of the document outlines 11 specific analytical testing methods prescribed by FSSAI to test alcoholic beverages for various quality parameters like ethyl alcohol content, residue, total acidity, volatile acids, esters, and contaminants. Each testing method is described in detail including required equipment, reagents, procedures, and calculations.
Presentation of CDR WineLab®, Wine Analysis System (CDR S.r.l.)
CDR conducts business in various sectors including food and beverage analysis. The document discusses CDR WineLab, an analyzer used for wine quality control and analysis. It can test for various parameters in grapes, must, wine and bottled wine like sugars, acids, yeast nutrients, and sulfur levels. The analyzer is easy to use with pre-filled reagents and provides fast, accurate results to help monitor the winemaking process from grapes to finished wine.
This document discusses ways that caustic cleaning chemicals can accidentally be introduced into beer during the cleaning-in-place (CIP) process. It provides details on how automated brewery systems work and how errors can occur, leading to chemical contamination. The document recommends monitoring rinse water pH and sodium levels in beer to detect contamination. It provides an example calculation for determining the volume of caustic introduced based on sodium concentration differences between a contaminated and control beer sample.
CDR WineLab®: monitoring, intervening in, and improving winemaking in the cellar (CDR S.r.l.)
Wine analyses for process control in red and white winemaking with CDR WineLab®, the simple system for your quality control.
Presentation of CDR WineLab®, Wine Analysis SystemCDR S.r.l.
CDR conducts business in various sectors including food and beverage analysis. The document discusses CDR WineLab, an analyzer used for wine quality control and analysis. It can test for various parameters in grapes, must, wine and bottled wine like sugars, acids, yeast nutrients, and sulfur levels. The analyzer is easy to use with pre-filled reagents and provides fast, accurate results to help monitor the winemaking process from grapes to finished wine.
This document discusses ways that caustic cleaning chemicals can accidentally be introduced into beer during the cleaning-in-place (CIP) process. It provides details on how automated brewery systems work and how errors can occur, leading to chemical contamination. The document recommends monitoring rinse water pH and sodium levels in beer to detect contamination. It provides an example calculation for determining the volume of caustic introduced based on sodium concentration differences between a contaminated and control beer sample.
CDR WineLab®: controllare, intervenire e migliorare la vinificazione in cantinaCDR S.r.l.
Le analisi del vino per il controllo del processo della vinificazione in Rosso e in Bianco con CDR WineLab®, il Sistema semplice per il tuo controllo qualità.
Capstone Project- Wine Quality Analysis
#ALL LINES IN THIS COLOUR THROUGHOUT THE REPORT ARE INFERENCES FROM THE ANALYSIS DONE ABOVE THAT LINE IN THE
REPORT
Overview
We consider a set of observations on a number of red wine varieties, covering their chemical properties and
their ranking by tasters. The wine industry has grown recently as social drinking has risen. The price of
wine depends partly on a rather abstract concept, wine appreciation by tasters, whose opinions may vary
widely, so pricing rests to some extent on a volatile factor. Another key factor in wine certification and
quality assessment is physicochemical testing, which is laboratory-based and takes into account factors
such as acidity, pH level, sugar content and other chemical properties. For the wine market, it would be of
interest if the human perception of quality could be related to the chemical properties of wine, so that
the certification, quality assessment and assurance process is more controlled.
Introduction
The Red Wine dataset contains 1599 red wine samples. All wines are produced in a particular area of
Portugal. Data are collected on 12 different properties of the wines, one of which is Quality, based on sensory
data; the rest are chemical properties of the wines, including density, acidity, alcohol content etc. All
chemical properties of the wines are continuous variables. Quality is an ordinal variable with possible rankings
from 0 (worst) to 10 (best). Each sample of wine is tasted by three independent tasters and the final rank
assigned is the median rank given by the tasters.
Objectives of the Analysis
The objective is to predict the Quality ranking from the chemical properties of the wines. A predictive model
developed on this data is expected to give vineyards guidance on the quality and price
to expect for their produce without heavy reliance on the volatility of wine tasters.
List of Attributes in Data
1. Fixed acidity: most acids involved with wine are fixed or non-volatile (they do not evaporate
readily)
2. Volatile acidity: the amount of acetic acid in wine, which at too high a level can lead to an
unpleasant, vinegar taste
3. Citric acid: found in small quantities, citric acid can add ‘freshness’ and flavour to wines
4. Residual sugar: the amount of sugar remaining after fermentation stops; it’s rare to find wines with
less than 1 gram/litre, and wines with greater than 45 grams/litre are considered sweet
5. Chlorides: the amount of salt in the wine
6. Free sulphur dioxide: the free form of SO2 exists in equilibrium between molecular SO2 (as a
dissolved gas) and bisulphite ion; it prevents microbial growth and the oxidation of wine
7. Total sulphur dioxide: the amount of free and bound forms of SO2; in low concentrations, SO2 is mostly
undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the
nose and taste of wine
8. Density: the density of wine is close to that of water, depending on the percent alcohol and sugar
content
9. pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most
wines are between 3-4 on the pH scale
10. Sulphates: a wine additive which can contribute to sulphur dioxide gas (SO2) levels, which acts as an
antimicrobial and antioxidant
11. Alcohol: the percent alcohol content of the wine
12. Quality: output variable (based on sensory data, score between 0 and 10)
Analysis of Data
1. Basic Statistics
summary(wine.df)
##  fixed.acidity   volatile.acidity  citric.acid     residual.sugar
##  Min.   : 4.60   Min.   :0.1200    Min.   :0.000   Min.   : 0.900
##  1st Qu.: 7.10   1st Qu.:0.3900    1st Qu.:0.090   1st Qu.: 1.900
##  Median : 7.90   Median :0.5200    Median :0.260   Median : 2.200
##  Mean   : 8.32   Mean   :0.5278    Mean   :0.271   Mean   : 2.539
##  3rd Qu.: 9.20   3rd Qu.:0.6400    3rd Qu.:0.420   3rd Qu.: 2.600
##  Max.   :15.90   Max.   :1.5800    Max.   :1.000   Max.   :15.500
##    chlorides       free.sulfur.dioxide  total.sulfur.dioxide
##  Min.   :0.01200   Min.   : 1.00        Min.   :  6.00
##  1st Qu.:0.07000   1st Qu.: 7.00        1st Qu.: 22.00
##  Median :0.07900   Median :14.00        Median : 38.00
##  Mean   :0.08747   Mean   :15.87        Mean   : 46.47
##  3rd Qu.:0.09000   3rd Qu.:21.00        3rd Qu.: 62.00
##  Max.   :0.61100   Max.   :72.00        Max.   :289.00
##     density            pH           sulphates         alcohol
##  Min.   :0.9901   Min.   :2.740   Min.   :0.3300   Min.   : 8.40
##  1st Qu.:0.9956   1st Qu.:3.210   1st Qu.:0.5500   1st Qu.: 9.50
##  Median :0.9968   Median :3.310   Median :0.6200   Median :10.20
##  Mean   :0.9967   Mean   :3.311   Mean   :0.6581   Mean   :10.42
##  3rd Qu.:0.9978   3rd Qu.:3.400   3rd Qu.:0.7300   3rd Qu.:11.10
##  Max.   :1.0037   Max.   :4.010   Max.   :2.0000   Max.   :14.90
##     quality
##  Min.   :3.000
##  1st Qu.:5.000
##  Median :6.000
##  Mean   :5.636
##  3rd Qu.:6.000
##  Max.   :8.000
1. The alcohol content varies from 8.40 to 14.90 across the samples in the dataset.
2. The quality of the samples ranges from 3 to 8, with a median of 6.
3. The range for fixed acidity is quite wide, with a minimum of 4.6 and a maximum of 15.9.
4. The pH value varies from 2.740 to 4.010, with a median of 3.310.
2. Histogram Plot Analysis
1. The spread of quality for red wine exhibits a peak at a rating of approximately 5.
2. The pH values seem to display a normal distribution, with most samples exhibiting values between 3.0 and 3.6.
3. The free sulfur dioxide seems to lie between 1 and 60, peaking around the 10 mark.
4. The total sulfur dioxide seems to have a spread between 0 and 300, with a peak around 50.
5. The alcohol content seems to vary from 8 to 14, with a major peak around 9.
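The plotting code behind these histograms is not shown in the report. A minimal sketch of how such histograms could be drawn in base R follows. The data frame `wine.sim` and its values are simulated stand-ins for `wine.df` (an assumption made only so the block runs on its own), so the shapes will not match the real plots.

```r
# Simulated stand-in for wine.df (hypothetical values, not the real data)
set.seed(1)
wine.sim <- data.frame(
  pH      = rnorm(200, mean = 3.3, sd = 0.15),
  alcohol = runif(200, min = 8.4, max = 14.9),
  quality = sample(3:8, 200, replace = TRUE)
)

# Draw one histogram per column, side by side
op <- par(mfrow = c(1, ncol(wine.sim)))
hists <- lapply(names(wine.sim), function(v) {
  hist(wine.sim[[v]], main = paste("Histogram of", v), xlab = v)
})
par(op)
names(hists) <- names(wine.sim)

# Bin with the highest count for each variable (the visible peak)
sapply(hists, function(h) h$mids[which.max(h$counts)])
```

With the real data, replacing `wine.sim` by `wine.df` should reproduce the kind of peaks described in the list above.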
3. Correlation Matrix, Correlogram and Covariance
#Correlation Matrix
cor(wine.df)
#Correlogram
library("corrgram", lib.loc="/Library/Frameworks/R.framework/Versions/3.4/Resources/library")
corrgram(wine.df, order=TRUE, lower.panel=panel.shade, upper.panel=panel.pie, text.panel=panel.txt, main="Red Wine Quality")
1. Free SO2: noticeable positive correlation with total SO2 and residual sugar; negative correlation
with pH and alcohol.
2. Total SO2: positive correlation with free SO2 and residual sugar; negative correlation with
alcohol.
3. pH: positive correlation with alcohol and volatile acidity; negative correlation with total and free
SO2, residual sugar and citric acid.
4. Alcohol: positive correlation with pH and quality; negative correlation with density, total and free
SO2 and chlorides.
5. Quality: positive correlation with alcohol; negative correlation with density, chlorides and volatile acidity.
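Reading signs off a correlogram can be cross-checked numerically by pulling the `quality` column out of the correlation matrix and sorting it. The sketch below does this on simulated data. `wine.sim` and the effect sizes used to generate it are assumptions for illustration, not values from the report.

```r
# Simulated stand-in for wine.df (hypothetical values, not the real data)
set.seed(2)
n <- 300
alcohol <- runif(n, 8.4, 14.9)
volatile.acidity <- runif(n, 0.12, 1.58)
wine.sim <- data.frame(
  alcohol          = alcohol,
  volatile.acidity = volatile.acidity,
  pH               = rnorm(n, 3.3, 0.15),
  quality          = round(5.6 + 0.35 * (alcohol - 10.4) -
                           1.0 * (volatile.acidity - 0.53) + rnorm(n, sd = 0.6))
)

# Correlation of every variable with quality, strongest positive first
cors <- cor(wine.sim)[, "quality"]
cors <- sort(cors[names(cors) != "quality"], decreasing = TRUE)
print(round(cors, 2))
```

Applied to `wine.df`, the same two lines give a ranked list that can be compared against inference 5 above.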
Alcohol Analysis - Scatter Plots
1. There seems to be no significant bias in the alcohol content, even though there are samples with higher
alcohol content for red wine.
2. The pH scatter plot indicates an interesting observation: pH and alcohol share a strong correlation.
3. The total SO2 content decreases with alcohol content for wine.
4. The free SO2 content decreases as the alcohol content increases for wine.
pH Analysis - Scatter Plots
1. No clear relation is established between quality and pH.
2. The relation between pH and total sulphur dioxide is dispersed, with the SO2
maximum ranging around 150.
3. The relation between pH and free sulphur dioxide is likewise dispersed.
Hypothesis Testing
##Hypothesis 1
#A higher alcohol content and lower fixed acidity tend to equal a higher
#quality wine. Why is this?
Sol.
I will use heatmaps and Chi-Square tests to assess this hypothesis.
#HeatMap
#Chi-Sq on Quality and Alcohol
chisq.test(quality, alcohol)
## data: quality and alcohol
## X-squared = 1124.5, df = 320, p-value < 2.2e-16
#Chi-Sq on Quality and Fixed Acidity
chisq.test(quality, fixed.acidity)
## data: quality and fixed.acidity
## X-squared = 736.08, df = 475, p-value = 1.416e-13
This hypothesis comes out to be correct, as the Chi-Sq tests confirm that there exists a significant
relation and the heatmap shows the distribution.
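A Chi-Square test applied to continuous measurements treats every distinct value as its own category and does not indicate the direction of the relation. One possible complement, which is not part of the original analysis, is a Spearman rank correlation test, whose sign shows whether quality rises or falls with the variable. The sketch below uses simulated data; the vectors `alcohol` and `quality` and the slope used to generate them are assumptions for illustration.

```r
# Simulated data (hypothetical values, not the real wine measurements)
set.seed(3)
n <- 300
alcohol <- runif(n, 8.4, 14.9)
quality <- pmin(8, pmax(3, round(2 + 0.36 * alcohol + rnorm(n, sd = 0.7))))

# Rank-based test: suited to an ordinal response such as quality
res <- cor.test(quality, alcohol, method = "spearman", exact = FALSE)

res$estimate  # Spearman's rho; its sign gives the direction of the relation
res$p.value   # significance of the monotonic association
```

On the real data the same call could be repeated with `fixed.acidity` to check the second half of the hypothesis.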
##Hypothesis 2
#Higher quality wine tends to have a lower residual sugar and lower citric
#acid. Why is this?
#HeatMap
#Chi-Sq on Quality and Citric Acid
chisq.test(quality, citric.acid)
## data: quality and citric.acid
## X-squared = 695.82, df = 395, p-value < 2.2e-16
#Chi-Sq on Quality and Residual Sugar
chisq.test(quality, residual.sugar)
## data: quality and residual.sugar
## X-squared = 864.79, df = 450, p-value < 2.2e-16
This hypothesis is wrong considering the observations from the heatmap.
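The heatmap code itself is not included in the report. A minimal sketch of one way to build such a heatmap is given here: bin the continuous variable, cross-tabulate it against quality, and plot the row proportions. The data frame `wine.sim`, the quartile binning and the plot labels are all assumptions for illustration and may differ from what was actually used.

```r
# Simulated stand-in for wine.df (hypothetical values, not the real data)
set.seed(4)
n <- 300
wine.sim <- data.frame(
  residual.sugar = rexp(n, rate = 1 / 1.6) + 0.9,
  quality        = sample(3:8, n, replace = TRUE)
)

# Bin the continuous variable into quartiles so each cell holds several samples
sugar.bin <- cut(wine.sim$residual.sugar,
                 breaks = quantile(wine.sim$residual.sugar, probs = seq(0, 1, 0.25)),
                 include.lowest = TRUE)
counts <- table(quality = wine.sim$quality, sugar = sugar.bin)

# Row proportions: share of each quality level that falls in each sugar bin
props <- prop.table(counts, margin = 1)
props <- matrix(as.numeric(props), nrow = nrow(props), dimnames = dimnames(props))

# Heatmap without clustering or rescaling, so cells stay in table order
heatmap(props, Rowv = NA, Colv = NA, scale = "none",
        xlab = "Residual sugar (quartile bin)", ylab = "Quality")
```

If higher quality went with lower residual sugar, the top quality rows would be darkest in the lowest sugar bins.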
##Hypothesis 3
#Does lower sulfur content make wine higher quality?
#ScatterPlot
plot(sulphates, quality, ylab="Quality", xlab="Sulphates", main="Quality vs Sulphates")
#Chi-Sq on Quality and Sulphates
chisq.test(quality, sulphates)
## data: quality and sulphates
## X-squared = 925.78, df = 475, p-value < 2.2e-16
Yes, this hypothesis holds, as most of the samples with higher quality
tend to have lower sulphate contents.
Linear Regression Models and Testing
#Test Model 1
model1 <- lm( quality ~ alcohol)
summary(model1)
##
## Call:
## lm(formula = quality ~ alcohol)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.8442 -0.4112 -0.1690 0.5166 2.5888
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 1.87497 0.17471 10.73 <2e-16 ***
## alcohol 0.36084 0.01668 21.64 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.7104 on 1597 degrees of freedom
## Multiple R-squared: 0.2267, Adjusted R-squared: 0.2263
## F-statistic: 468.3 on 1 and 1597 DF, p-value: < 2.2e-16
The p-value and the significance stars confirm that alcohol is a significant factor.
add1(model1, scope = wine.df, test = 'F')
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): the response appeared on the right-hand side and was dropped
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): problem with term 11 in model.matrix: no columns are assigned
## Single term additions
##
## Model:
## quality ~ alcohol
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 805.87 -1091.7
## volatile.acidity 1 94.074 711.80 -1288.1 210.9346 < 2.2e-16 ***
## citric.acid 1 31.953 773.92 -1154.3 65.8949 9.408e-16 ***
## residual.sugar 1 0.041 805.83 -1089.7 0.0822 0.774437
## chlorides 1 0.611 805.26 -1090.9 1.2103 0.271443
## free.sulfur.dioxide 1 0.325 805.55 -1090.3 0.6431 0.422696
## total.sulfur.dioxide 1 8.270 797.60 -1106.2 16.5475 4.976e-05 ***
## density 1 5.203 800.67 -1100.0 10.3708 0.001306 **
## pH 1 26.362 779.51 -1142.8 53.9749 3.226e-13 ***
## sulphates 1 44.977 760.89 -1181.5 94.3399 < 2.2e-16 ***
## quality 0 0.000 805.87 -1091.7
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Using the F-tests above, we check which of the other factors would be significant if added
to this model.
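The two warnings in the output above seem to come from passing the whole data frame as `scope`, which puts `quality` among the candidate terms; `fixed.acidity` is also absent from the table. A sketch of the same forward step with an explicit scope formula that excludes the response is shown below. It runs on simulated data, and `wine.sim`, its three predictors and the generating coefficients are assumptions for illustration.

```r
# Simulated stand-in for wine.df (hypothetical values, not the real data)
set.seed(5)
n <- 300
wine.sim <- data.frame(
  alcohol          = runif(n, 8.4, 14.9),
  volatile.acidity = runif(n, 0.12, 1.58),
  residual.sugar   = runif(n, 0.9, 15.5)
)
wine.sim$quality <- 2.6 + 0.3 * wine.sim$alcohol -
  1.0 * wine.sim$volatile.acidity + rnorm(n, sd = 0.65)

model1 <- lm(quality ~ alcohol, data = wine.sim)

# Candidate terms: every column except the response
candidates <- setdiff(names(wine.sim), "quality")
scope <- reformulate(candidates)  # one-sided formula listing all predictors

# F-test for adding each candidate that is not yet in the model
steps <- add1(model1, scope = scope, test = "F")
print(steps)
```

Built this way, the table lists only genuine candidates, so there is no `quality` row with 0 degrees of freedom.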
#Test Model 7
model7 <- lm( quality ~ alcohol + pH + total.sulfur.dioxide + citric.acid + chlorides + sulphates + volatile.acidity)
summary(model7)
##
## Call:
## lm(formula = quality ~ alcohol + pH + total.sulfur.dioxide +
## citric.acid + chlorides + sulphates + volatile.acidity)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.58632 -0.36679 -0.04584 0.45297 1.95470
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.6134833 0.4607493 10.013 < 2e-16 ***
## alcohol 0.2951742 0.0171178 17.244 < 2e-16 ***
## pH -0.5247565 0.1328432 -3.950 8.15e-05 ***
## total.sulfur.dioxide -0.0023114 0.0005082 -4.549 5.81e-06 ***
## citric.acid -0.1670682 0.1207391 -1.384 0.167
## chlorides -1.9153285 0.4028925 -4.754 2.17e-06 ***
## sulphates 0.8994970 0.1102877 8.156 6.96e-16 ***
## volatile.acidity -1.1146326 0.1145923 -9.727 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6485 on 1591 degrees of freedom
## Multiple R-squared: 0.3579, Adjusted R-squared: 0.3551
## F-statistic: 126.7 on 7 and 1591 DF, p-value: < 2.2e-16
add1(model7, scope = wine.df, test = 'F')
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): the response appeared on the right-hand side and was dropped
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): problem with term 11 in model.matrix: no columns are assigned
## Single term additions
##
## Model:
## quality ~ alcohol + pH + total.sulfur.dioxide + citric.acid +
## chlorides + sulphates + volatile.acidity
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 669.13 -1377.0
## residual.sugar 1 0.41979 668.71 -1376.0 0.9982 0.3179
## free.sulfur.dioxide 1 2.06369 667.06 -1379.9 4.9190 0.0267 *
## density 1 0.05573 669.07 -1375.1 0.1324 0.7160
## quality 0 0.00000 669.13 -1377.0
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#Test Model 8
model8 <- lm( quality ~ alcohol + pH + total.sulfur.dioxide + chlorides + sulphates + volatile.acidity)
summary(model8)
This is the final model.
##
## Call:
## lm(formula = quality ~ alcohol + pH + total.sulfur.dioxide +
## chlorides + sulphates + volatile.acidity)
##
## Residuals:
## Min 1Q Median 3Q Max
## -2.60575 -0.35883 -0.04806 0.46079 1.95643
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 4.2957316 0.3995603 10.751 < 2e-16 ***
## alcohol 0.2906738 0.0168108 17.291 < 2e-16 ***
## pH -0.4351830 0.1160368 -3.750 0.000183 ***
## total.sulfur.dioxide -0.0023721 0.0005064 -4.684 3.05e-06 ***
## chlorides -2.0022839 0.3980757 -5.030 5.46e-07 ***
## sulphates 0.8886802 0.1100419 8.076 1.31e-15 ***
## volatile.acidity -1.0381945 0.1004270 -10.338 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.6487 on 1592 degrees of freedom
## Multiple R-squared: 0.3572, Adjusted R-squared: 0.3548
## F-statistic: 147.4 on 6 and 1592 DF, p-value: < 2.2e-16
In this final model, all the factors come out as significant.
add1(model8, scope = wine.df, test = 'F')
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): the response appeared on the right-hand side and was dropped
## Warning in model.matrix.default(Terms, m, contrasts.arg = object
## $contrasts): problem with term 11 in model.matrix: no columns are assigned
## Single term additions
##
## Model:
## quality ~ alcohol + pH + total.sulfur.dioxide + chlorides + sulphates +
## volatile.acidity
## Df Sum of Sq RSS AIC F value Pr(>F)
## <none> 669.93 -1377.1
## citric.acid 1 0.80525 669.13 -1377.0 1.9147 0.16664
## residual.sugar 1 0.28390 669.65 -1375.7 0.6745 0.41161
## free.sulfur.dioxide 1 2.39413 667.54 -1380.8 5.7061 0.01702 *
## density 1 0.04468 669.89 -1375.2 0.1061 0.74465
## quality 0 0.00000 669.93 -1377.1
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
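Model 8 differs from Model 7 only by dropping citric.acid, which had p = 0.167 in Model 7. The add1 table above reports the matching F-test (F = 1.9147, p = 0.16664) for putting it back. The same nested-model comparison can be written directly with `anova()`, as sketched below on simulated data. `wine.sim`, the reduced set of predictors and the generating coefficients are assumptions for illustration.

```r
# Simulated stand-in for wine.df (hypothetical values, not the real data)
set.seed(6)
n <- 400
wine.sim <- data.frame(
  alcohol          = runif(n, 8.4, 14.9),
  volatile.acidity = runif(n, 0.12, 1.58),
  citric.acid      = runif(n, 0, 1)   # generated with no effect on quality
)
wine.sim$quality <- 2.6 + 0.3 * wine.sim$alcohol -
  1.0 * wine.sim$volatile.acidity + rnorm(n, sd = 0.65)

full    <- lm(quality ~ alcohol + volatile.acidity + citric.acid, data = wine.sim)
reduced <- update(full, . ~ . - citric.acid)

# Partial F-test: does the extra term reduce the residual sum of squares enough?
cmp <- anova(reduced, full)
print(cmp)
```

A large p-value in the second row supports keeping the smaller model, which is the reasoning behind preferring Model 8.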
Conclusion
A limitation of the current analysis is that the data consists of samples collected
from a specific region of Portugal. It would be interesting to obtain datasets across various wine-making
regions to eliminate any bias created by specific qualities of the product.
Regression Equation
Quality = 4.30 + (0.29)*alcohol + (0.89)*sulphates - { (0.44)*pH + (0.0024)*tot.SO2 +
(2.00)*chlorides + (1.04)*vol.acidity }
Hence quality depends on alcohol and sulphates in a positive relation
and on pH, total SO2, chlorides and volatile acidity in a negative relation.
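As a worked example of the regression equation, the sketch below plugs the median values from the Basic Statistics summary into the Model 8 coefficients. The coefficients and medians are copied from the outputs in this report. The vector names `coefs` and `median.wine` are illustrative, and a wine with all six medians at once is a hypothetical sample.

```r
# Coefficients copied from the summary(model8) output in this report
coefs <- c(
  "(Intercept)"        =  4.2957316,
  alcohol              =  0.2906738,
  pH                   = -0.4351830,
  total.sulfur.dioxide = -0.0023721,
  chlorides            = -2.0022839,
  sulphates            =  0.8886802,
  volatile.acidity     = -1.0381945
)

# Median of each predictor, from summary(wine.df); a hypothetical "typical" wine
median.wine <- c(
  alcohol              = 10.20,
  pH                   = 3.310,
  total.sulfur.dioxide = 38,
  chlorides            = 0.079,
  sulphates            = 0.62,
  volatile.acidity     = 0.52
)

# Intercept plus the sum of coefficient * value, matched by name
predicted <- coefs[["(Intercept)"]] + sum(coefs[names(median.wine)] * median.wine)
round(predicted, 2)
```

The result is about 5.58, close to the mean quality of 5.636 reported in the summary.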