UNIVERSITY OF KABIANGA 
SCHOOL OF SCIENCE AND TECHNOLOGY 
DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE 
THE STUDY ON THE FACTORS AFFECTING THE DIFFICULTY OF A SUDOKU 
PUZZLE 
A PROJECT REPORT SUBMITTED IN PARTIAL FULFILMENT OF THE 
REQUIREMENTS FOR THE AWARD OF DEGREE OF BACHELOR OF SCIENCE IN 
APPLIED STATISTICS WITH COMPUTING OF UNIVERSITY OF KABIANGA 
BY 
OKOYO COLLINS OMONDI 
AST/0037/09 
SUPERVISORS: MR. RUEBEN C. LANGA’T 
MR. TONUI B. 
APRIL 2013
DECLARATION 
I, Okoyo Collins Omondi, do hereby declare that this project report is my original work and has 
not been presented for an award of degree in any other university. 
Sign:……………………………. Date:……………………………… 
This project has been submitted for examination with our approval as University supervisors. 
Supervisors: 
1. Mr. Rueben C. Langa’t 
Department of Mathematics and Computer Science 
University of Kabianga 
Signature:…………………. Date:…………………….. 
2. Mr. Tonui B. 
Department of Mathematics and Computer Science 
University of Kabianga 
Signature:…………………. Date:……………………..
DEDICATION 
This project is dedicated to all the Sudoku players and hobbyists. 
I dedicate this work too to my beloved parents Charles Okoyo and Lucy Okoyo for their 
unconditional material and financial support; my siblings Everlyne, Evans, Basil, Sheilah and 
Oliver for their overwhelming social support. 
ACRONYMS 
SAS: Statistical Analysis System. 
SPSS: Statistical Package for the Social Sciences. 
DOE: Design of Experiments. 
DEFINITION OF TERMS 
Box: A 3 by 3 grid inside the Sudoku puzzle. It works the same as rows and 
columns, meaning it must contain the digits 1 – 9. 
Region: This refers to a row, column or box. 
Candidate: An empty square in a Sudoku puzzle has a certain set of numbers that 
do not conflict with the row, column and box it is in. Those numbers are 
called candidates or candidate numbers. 
Given: A given is a number in the original Sudoku puzzle. A Sudoku puzzle 
starts with a certain number of such clues, which are then used to 
fill in new squares. A number filled in by the solver is not 
regarded as a given. 
Complete block: This is where each treatment appears in each block. 
Response: The process output. 
Factor: Uncontrolled or controlled variable whose influence is being studied. 
Level: Setting of a factor (+, -, 1, -1, high, low, alpha, numeric). 
Run: A treatment combination; a setting of all factors used to obtain a response. 
Replicate: Number of times a treatment combination is run (usually randomized). 
Repeat: Non – randomized replicate. 
Inference space: Operating range of factors under study. 
Design Expert: Software used to design experiments.
ABSTRACT 
This project demonstrates how to apply design of experiments, in particular the full factorial 
design, to gain insight into real, everyday statistical problems and situations. The design of this 
project ultimately results in an intuitive understanding of the statistical procedures and strategies 
most often used by practicing statisticians and scientists. 
The study of the factors affecting the difficulty of a Sudoku puzzle was chosen because it 
provides a real statistical problem for which different experiments can be designed. 
TABLE OF CONTENTS 
DECLARATION 
DEDICATION 
ACRONYMS 
DEFINITION OF TERMS 
ABSTRACT 
CHAPTER ONE 
1.0 BACKGROUND 
1.1 PROBLEM STATEMENT 
1.2 STUDY PURPOSE 
1.3 STUDY OBJECTIVES 
CHAPTER TWO 
2.0 LITERATURE REVIEW 
2.1 INTRODUCTION 
2.2 HOW TO PLAY 
2.3 OPERATIONALIZATION OF VARIABLES 
2.3.1 Response Variable 
2.3.2 Control Variables 
CHAPTER THREE 
3.0 EXPERIMENTAL DESIGN 
3.1 PERFORMING THE EXPERIMENT 
3.2 STATISTICAL ANALYSIS 
CHAPTER FOUR 
4.0 RESULTS AND ANALYSIS OF DATA 
4.1 INTRODUCTION 
4.2 SUMMARY OF THE EXPERIMENTS 
4.3 ANALYSIS OF SUDOKU PUZZLES WITH EASY DIFFICULTY RATING 
4.3.1 Factors Effects 
4.3.2 The Analysis of Variance (ANOVA) 
4.3.3 Model Prediction 
4.4 ANALYSIS OF SUDOKU PUZZLES WITH MEDIUM DIFFICULTY RATING 
4.4.1 Factors Effects 
4.4.2 The Analysis of Variance (ANOVA) 
4.4.3 Model Prediction 
CHAPTER FIVE 
5.0 CONCLUSION AND RECOMMENDATIONS 
5.1 CONCLUSION 
5.2 RECOMMENDATIONS 
APPENDICES 
1.0 Data Collection Tool for Easy Rating Sudoku Puzzles 
2.0 Data Collection Tool for Medium Rating Sudoku Puzzles 
3.0 Experiment data for Easy Sudoku puzzles 
4.0 Experiment data for Medium Sudoku puzzles 
REFERENCES 
LIST OF FIGURES 
Figure 1: Sudoku Grid with Row, Column and Box Names 
Figure 2: General view of Sudoku game environment 
Figure 3: Main Effects plot 
Figure 4: Interaction Plots 
Figure 9: Half – Normal Plot 
Figure 10: Pareto Plot 
Figure 11: Residual Plots 
Figure 11: Main Effect Plots 
Figure 12: Interaction Plots 
Figure 13: Half – Normal Plot 
Figure 14: Pareto Plot 
Figure 15: Residual Plots 
LIST OF TABLES 
Table 1: The amount ranges of givens in each difficult level 
Table 2: Variation of the Number of Givens. 
Table 3: Variation of the Distribution of Givens. 
Table 4: Variation of the Redundant Numbers. 
Table 5: The design matrix for each experiment in coded values. 
Table 6: The design matrix in statistics values for the Easy experiment. 
Table 7: The design matrix in statistics values for the Medium experiment. 
Table 8: Effects List 
Table 9: ANOVA FOR EASY EXPERIMENT 
Table 12: Fit Statistics for Y1 
Table 13: Effects List 
Table 14: ANOVA FOR MEDIUM EXPERIMENT 
Table 15: Fit Statistics for Y1 
CHAPTER ONE 
1.0 BACKGROUND 
In recent years Sudoku puzzles have become an increasingly popular pastime. Sudoku’s simple 
set of rules and multiple levels of puzzle difficulty attract hobbyists with varying skills and 
experience. Typically, Sudoku puzzles are categorized by level of difficulty, e.g. Easy, Medium, 
Hard, Killer, etc. Yet, it is not uncommon for the perceived difficulty of puzzles to vary greatly, 
even within a single difficulty rating. 
The goal of this project is to determine whether additional factors, other than the published 
difficulty rating, have an effect on the difficulty of a Sudoku puzzle. In an attempt to keep the 
experimental results practical for the casual Sudoku hobbyist, factors that can be easily estimated 
by visual inspection of the puzzle were chosen. The factors are: 
• Number of givens – The number of initial givens provided in the puzzle. 
• Distribution of givens – The relative placement of initial givens in the puzzle. 
• Redundant numbers – The repetition of specific numbers in a puzzle’s set of initial givens. 
The interest is in the effects of these factors on puzzles of a single published difficulty level. 
For this project a full 2^3 factorial experiment was performed on a set of eight “Easy” 
puzzles. The experiment was then repeated on eight “Medium” puzzles. 
The results of the two experiments were used to show which of the above-mentioned factors has 
the largest effect on puzzle difficulty. Several other effects and their significance were also 
determined. In addition, a general model equation was determined that approximates, as 
accurately as possible, the time a Sudoku hobbyist is expected to take to solve a puzzle, taking 
the above factors into account and holding other factors constant. 
The rest of the project report is organized as follows: Chapter 2 provides a brief description of 
the objective of the two experiments and gives a quick background to the problem. Chapter 3 
describes the experimental design used for the two experiments and how the experiments 
were run. Chapter 4 provides the results and an analysis of the data. Conclusions and 
recommendations are given in Chapter 5.
1.1 PROBLEM STATEMENT 
Multiple factors affect the difficulty of a Sudoku puzzle apart from the published difficulty 
rating (extremely easy, easy, medium, difficult, evil, etc.). This project 
is limited to studying only three such factors: 
1. Number of the initial givens. 
2. The distribution of the givens. 
3. The redundancy of the individual givens. 
It is expected that knowledge of the effects of these factors would reduce the time 
taken by Sudoku hobbyists in playing the game, thereby making the game more enjoyable and 
interesting. 
1.2 STUDY PURPOSE 
Sudoku is today a popular game throughout the world and it appears in multiple media, 
including websites, newspapers and books. As a result, it is of interest to find the factors 
affecting the difficulty of a Sudoku puzzle besides the published difficulty rating of extremely 
easy, easy, medium, difficult, evil, etc. 
Another goal of this study is to contribute to the knowledge and 
comprehension of Latin square designs, factorial designs and the design of experiments in 
general, as the analysis of the factors affecting the difficulty of Sudoku puzzles makes use 
of these designs. 
1.3 STUDY OBJECTIVES 
The broad objective of this experiment is to quantify the effects of a Sudoku puzzle’s initial 
structure and set of givens on the expected time required to complete the puzzle. 
In particular, the experiment seeks to answer the following questions: 
1. What are the effects of the number of givens, distribution of the givens and the 
redundancy of the specific givens on the difficulty of a Sudoku puzzle? 
2. Are the effects of each factor consistent across levels of each factor? 
CHAPTER TWO 
2.0 LITERATURE REVIEW 
2.1 INTRODUCTION 
The Sudoku puzzle, a widely popular intellectual game in recent years, originated in Switzerland 
in the 18th century and was later developed extensively in Japan over the past decades. The 
name Sudoku derives from Japanese and means “number place” [1]. Thanks to its simple, 
beginner-friendly rules and the appeal of its intellectual challenge, Sudoku has recently become 
popular with players of various ages. A Sudoku puzzle can be solved 
without any mathematical knowledge. 
A Sudoku puzzle consists of a table with nine rows and nine columns. The squares (i.e. cells in 
the table) are grouped in sets of nine which we will call boxes. For clarity we will call the rows 
r1, r2, … r9, the columns c1, c2, … c9, and the boxes b1, b2, … b9. Figure 1 provides a diagram 
showing a sample Sudoku grid with row, column, and box names. The squares are named sij 
where i is the row number and j is the column number. 
c1 c2 c3 c4 c5 c6 c7 c8 c9 
r1 b1 b2 b3 
r2 1 
r3 
r4 b4 5 b5 2 b6 
r5 2 9 
r6 6 
r7 b7 b8 b9 
r8 
r9 
Figure 1: Sudoku Grid with Row, Column and Box Names 
Source: www.onlinegames.com/sudokugame.
2.2 HOW TO PLAY 
How is the Sudoku game played? “You only need to know where you play the game and what 
your goal is. The simple aspects that help you join the game are specified as follows”[2]: 
Game Environment: you may first get a general overview of the game board as shown below: 
Figure 2: General view of Sudoku game environment 
Several basic components of the board are defined as Figure 2 illustrates. The whole board is 
a 9-by-9 grid made of nine smaller 3-by-3 grids called blocks. The smallest unit square 
is called a cell, which has two possible states: empty, or confirmed by a digit from 1 through 9. 
Rows and columns are numbered from the top-left corner. 
Goal of the Game: generally, a Sudoku game starts with a grid in which some of 
the cells have already been confirmed by digits known as givens. The task for Sudoku players is 
to place a digit from 1 to 9 into each empty cell of the grid such that each digit is used 
exactly once in each row, each column and each block. Consequently, each of the nine rows, nine 
columns and nine blocks contains all the digits from 1 through 9. 
These restrictions on placing digits are respectively called the row constraint, 
column constraint and block constraint. 
Based on the rules mentioned above, Sudoku players aim to complete 
the placement of digits into all empty cells as quickly as possible, using various techniques. 
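The three constraints can be checked mechanically. The following is a minimal Python sketch, not part of the original study; the function name `is_valid_solution` and the sample grid are illustrative assumptions.

```python
def is_valid_solution(grid):
    """Return True if a completed 9x9 grid satisfies the row, column
    and block constraints (each region holds the digits 1-9 exactly once)."""
    digits = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [set(grid[r][c] for r in range(9)) for c in range(9)]
    blocks = [
        set(grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3))
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    return all(region == digits for region in rows + cols + blocks)


# Hypothetical completed grid built from a simple shifting pattern
# (for illustration only, not one of the puzzles used in the experiments).
solved = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
print(is_valid_solution(solved))
```

A grid in which any digit is duplicated within a row, column or block fails the check.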
2.3 OPERATIONALIZATION OF VARIABLES 
2.3.1 Response Variable 
The time required to complete a puzzle was the single response variable for this experiment. 
Typical values for this variable range from a few minutes to over half an hour. The response 
is analyzed in seconds (see Section 4.2). There is no practical limit on the range over which this 
response can be measured. The project also sought to develop a model approximating the average 
time a Sudoku hobbyist is expected to take to fill in the empty cells 
correctly. 
The following model was used to approximate the response variable: 
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk}$$ 
Where: $Y_{ijk}$: Observation with factor A at level i, B at level j and C at level k. 
$\mu$: Mean response. 
$\alpha_i$: Effect of factor A at level i. 
$\beta_j$: Effect of factor B at level j. 
$\gamma_k$: Effect of factor C at level k. 
$\varepsilon_{ijk}$: Error term. 
2.3.2 Control Variables 
(i) Number of givens 
As the first factor affecting the level estimation, the total number of given cells in an initial 
Sudoku puzzle can significantly reduce the potential choices of digits in each cell through the three 
constraints in the game rules. In general, it is reasonable to argue that the more empty cells 
there are at the start of a Sudoku game, the higher the level at which the puzzle is graded. The 
ranges of the number of givens for each difficulty level are scaled as shown below: 
Table 1: The amount ranges of givens in each difficult level 
Level                 Givens Amount    Scores 
1 (Extremely easy)    more than 50     1 
2 (Easy)              36-49            2 
3 (Medium)            32-35            3 
4 (Difficult)         28-31            4 
5 (Evil)              22-27            5 
Possibly the most obvious measurement of a puzzle’s difficulty is the number of givens provided 
in the initial grid. Typical values range from 50 givens for easier puzzles to 20 givens for highly 
difficult puzzles. For the two experiments, the control variable was varied as follows: 
Table 2: Variation of the Number of Givens. 
Easy Experiment    Medium Experiment 
Min = 36           Min = 32 
Max = 49           Max = 35 
It was expected that the difficulty of a puzzle would increase as the number of initial givens 
decreases. 
(ii) Distribution of givens 
Another possible variable is the distribution of the initial givens in the grid. One can imagine that 
a puzzle with all, or most, of the givens crowded in one section of the grid may be different in 
difficulty than a puzzle with the givens spread evenly around the grid. Formally, this geometric 
property can be viewed as the variance of row, column, and box densities. 
The density of a row, column or box is defined to be the number of givens provided in that row, 
column or box. For example, row r4 in Figure 1 has a density of two, Column c3 has a density of 
one, and box b5 has a density of three. The mean density is then 
$$\mu_{density} = \frac{\sum_{i=1}^{9} c_i + \sum_{i=1}^{9} r_i + \sum_{i=1}^{9} b_i}{27}$$ 
and the variance is 
$$\sigma^2_{density} = \frac{\sum_{i=1}^{9} (c_i - \mu_{density})^2 + \sum_{i=1}^{9} (r_i - \mu_{density})^2 + \sum_{i=1}^{9} (b_i - \mu_{density})^2}{27}.$$ 
The normal range for $\sigma^2_{density}$ in the sample set that was conducted was found to be between 0 and 
3.333. Hence, for these two experiments, $\sigma^2_{density}$ was varied as follows: 
Table 3: Variation of the Distribution of Givens. 
Easy Experiment        Medium Experiment 
1.19 <= Min <= 1.70    0.59 <= Min <= 0.96 
2.15 <= Max <= 2.67    1.26 <= Max <= 2.81 
It is expected that the difficulty of a Sudoku puzzle would decrease as $\sigma^2_{density}$ increases. 
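As an illustration of the density variance defined above, the following Python sketch computes it for a grid of givens. It is an assumed re-implementation for clarity, not the Perl script used in the study; the function name `density_variance`, the convention that 0 marks an empty cell, and the sample grid are all hypothetical.

```python
def density_variance(grid):
    """Population variance of the 27 region densities (9 rows, 9 columns,
    9 boxes) of a 9x9 grid. A cell value of 0 is assumed to mean 'empty'."""
    rows = [sum(1 for v in row if v) for row in grid]
    cols = [sum(1 for r in range(9) if grid[r][c]) for c in range(9)]
    boxes = [
        sum(1 for r in range(br, br + 3) for c in range(bc, bc + 3) if grid[r][c])
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    densities = rows + cols + boxes
    mean = sum(densities) / 27
    return sum((d - mean) ** 2 for d in densities) / 27


# Hypothetical grid with one given in every diagonal cell: every row and
# column has density 1, three boxes have density 3 and six boxes have density 0.
diagonal_grid = [[(r + 1) if r == c else 0 for c in range(9)] for r in range(9)]
print(round(density_variance(diagonal_grid), 4))  # 18/27, i.e. about 0.6667
```

Givens spread perfectly evenly over all regions would give a variance of 0, the lower end of the range reported for the sample set.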
(iii) Redundant Numbers in the Initial Grid 
This variable measures the variance, $\sigma^2_{degree}$, in the number of times a specific number is repeated 
in the initial Sudoku grid. Here, the degree of a number, deg(i), is defined to be the number of 
times the number i appears in the initial grid. So the mean degree is 
$$\mu_{degree} = \frac{\sum_{i=1}^{9} \deg(i)}{9}$$ 
and the variance of the degree is 
$$\sigma^2_{degree} = \frac{\sum_{i=1}^{9} (\deg(i) - \mu_{degree})^2}{9}.$$ 
The normal range for $\sigma^2_{degree}$ in the sample set was found to be between 0.5432 and 3.7778. For 
these two experiments, $\sigma^2_{degree}$ was varied as follows. 
Table 4: Variation of the Redundant Numbers. 
Easy Experiment        Medium Experiment 
0.89 <= Min <= 1.33    0.89 <= Min <= 1.56 
2.89 <= Max <= 3.56    2.00 <= Max <= 3.11 
It was less expected that $\sigma^2_{degree}$ would be a significant factor in determining the 
difficulty of the Sudoku puzzle.
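The degree variance can be sketched in the same way. This Python snippet is an illustrative assumption rather than the study's own Perl code; `degree_variance`, the use of 0 for empty cells, and the sample grid are hypothetical.

```python
def degree_variance(grid):
    """Population variance of deg(1)..deg(9), where deg(i) is the number
    of times digit i appears among the givens (0 is assumed to mean 'empty')."""
    degrees = [sum(row.count(d) for row in grid) for d in range(1, 10)]
    mean = sum(degrees) / 9
    return sum((x - mean) ** 2 for x in degrees) / 9


# Hypothetical, deliberately extreme grid: the digit 1 is given nine times
# (once per diagonal cell) and no other digit is given.
# Degrees are [9, 0, 0, 0, 0, 0, 0, 0, 0], the mean is 1, the variance is 72/9.
ones_on_diagonal = [[1 if r == c else 0 for c in range(9)] for r in range(9)]
print(degree_variance(ones_on_diagonal))  # 8.0
```

A grid in which every digit is given equally often has a degree variance of 0.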
CHAPTER THREE 
3.0 EXPERIMENTAL DESIGN 
These experiments were run using two 2^3 full factorial designs. The original intention was to 
perform a single 2^4 full factorial design with the additional factor being the published rating for 
the puzzle. Unfortunately, a strong correlation between the published rating and the typical 
values for the other factors was found. The correlation was so strong that Easy and Medium 
puzzles with matching high and low values for the other factors could not be found. After 
measuring the typical values for the factors on over 120 puzzles, it was decided to run two 
individual experiments. The first was run on a sample of puzzles with an Easy difficulty rating. 
The second was run on a sample of puzzles with a Medium difficulty rating. 
Table 5: The design matrix for each experiment in coded values. 
Run    Number of givens    Distribution of givens    Redundant Numbers 
       (Factor C)          (Factor B)                (Factor A) 
(1)    -1                  -1                        -1 
a      -1                  -1                         1 
b      -1                   1                        -1 
ab     -1                   1                         1 
c       1                  -1                        -1 
ac      1                  -1                         1 
bc      1                   1                        -1 
abc     1                   1                         1
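The coded design matrix of a 2^3 full factorial can be generated programmatically. The Python sketch below is illustrative only (the study set up its designs in Design Expert and SAS); the function name `design_matrix` is an assumption. It reproduces the run labels and the column order C, B, A used in Table 5.

```python
from itertools import product


def design_matrix():
    """Return the 2^3 full factorial runs as (label, C, B, A) tuples in
    coded units, with factor A varying fastest."""
    runs = []
    for c, b, a in product((-1, 1), repeat=3):
        # A run is labelled by the lower-case letters of the factors at +1;
        # the run with every factor at -1 is labelled "(1)".
        label = "".join(
            name for name, level in (("a", a), ("b", b), ("c", c)) if level == 1
        ) or "(1)"
        runs.append((label, c, b, a))
    return runs


for label, c, b, a in design_matrix():
    print(f"{label:4s} {c:3d} {b:3d} {a:3d}")
```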
Table 6: The design matrix in statistics values for the Easy experiment. 
Run    Number of givens    Distribution of givens    Redundant Numbers 
       (Factor C)          (Factor B)                (Factor A) 
(1)    36                  1.56                      0.89 
a      36                  1.70                      3.11 
b      36                  2.67                      0.89 
ab     36                  2.15                      3.56 
c      49                  1.19                      1.33 
ac     49                  1.26                      2.89 
bc     49                  2.15                      1.11 
abc    49                  2.15                      3.33 
Table 7: The design matrix in statistics values for the Medium experiment. 
Run    Number of givens    Distribution of givens    Redundant Numbers 
       (Factor C)          (Factor B)                (Factor A) 
(1)    32                  0.59                      1.11 
a      32                  0.59                      2.00 
b      32                  1.56                      0.89 
ab     32                  1.26                      2.22 
c      35                  0.89                      1.33 
ac     35                  0.96                      3.11 
bc     35                  2.81                      1.56 
abc    35                  2.81                      2.67
3.1 PERFORMING THE EXPERIMENT 
The two experiments were each replicated 16 times. The replicates were blocked, and each 
block was run by a single individual. The random run order for each block was determined by 
SAS Software, and the result is attached in the appendix of the report. 
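The idea of an independent random run order within each block can be sketched as follows. This is a Python stand-in written for illustration; the study itself used SAS, and the function name `randomized_run_orders` and the seed value are assumptions.

```python
import random

TREATMENTS = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]


def randomized_run_orders(n_blocks=16, seed=2013):
    """Return {block number: shuffled list of the eight treatment labels}.
    Each block (one participant) gets its own independent random order.
    The seed is a hypothetical value chosen only to make the sketch repeatable."""
    rng = random.Random(seed)
    orders = {}
    for block in range(1, n_blocks + 1):
        order = list(TREATMENTS)
        rng.shuffle(order)
        orders[block] = order
    return orders


orders = randomized_run_orders()
print(orders[1])
```

Because every block contains all eight treatment combinations, differences between participants are absorbed by the block effect rather than confounded with the factor effects.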
The experiments began with the creation of a Perl script to calculate all the control variables 
outlined in Section 2.3.2. Eight different volumes of Sudoku puzzle books from the same author 
were purchased over the internet (in this case, Sudoku puzzles appearing in the daily Standard 
and Nation newspapers were used), and the puzzles were transferred from the books to the Perl 
script. From the Perl script output, the high and low values for each of the variables being 
analyzed were set. Once the high and low values were set, eight of the Easy rating puzzles 
and eight of the Medium rating puzzles that fit well with the high and low values set for each 
variable were identified. 
Once the puzzles were chosen, the experiment designs were set up in Design Expert 
software. Copies of the puzzles were then made and stapled in their individual blocks based on the run 
order provided by the software. 
Each block was then distributed to one of the 16 willing participants, who were asked to solve 
the puzzles in the order in which they were stapled. The participants were required to record 
the start time, finish time, and delta time for each of the puzzles. 
Once the completed puzzles were returned, the delta times (the response variable) were added 
to the run order table in the SAS Software and the data were analyzed. 
3.2 STATISTICAL ANALYSIS 
The data from the two experiments were then analyzed individually in two subsections. SAS 
Software and Design Expert were both used in the analysis, and their outputs were compared. 
CHAPTER FOUR 
4.0 RESULTS AND ANALYSIS OF DATA 
4.1 INTRODUCTION 
In this chapter, the experiment results are shown as given by the SAS software and Design 
Expert analysis tools, and the results are then explained. 
The data used for the analysis of the two experiments, easy and medium, are 
found in appendices 3.0 and 4.0 respectively. 
The analysis is divided into two parts: the first part covers the easy experiment, 
while the second covers the medium experiment. 
Out of the 16 individuals tasked with solving the Sudoku puzzles with easy difficulty rating, only 
13 returned the puzzles in time to be analyzed in this report. This represents a return 
rate of 81%. Fortunately, since the design was blocked on replicates, this had little to 
no effect on the analysis of the results in this report. 
For the Sudoku puzzles with medium difficulty rating, only 11 out of 16 individuals 
returned their puzzles in time for analysis. This translates to a return rate of 69%. 
4.2 SUMMARY OF THE EXPERIMENTS 
DESIGN DETAILS 
Design type: Two-level 
Design description: Repeated 
Number of factors: 3 
Number of runs: 128 
Resolution: Full 
Number of blocks: 16 
FACTORS 
Factors and Levels: 
__________________________________________________ 
Factor Label Low Center High 
__________________________________________________ 
C Number of Givens -1 0 1 
B Distribution of Givens -1 0 1 
A Redundancy of Givens -1 0 1 
__________________________________________________
RESPONSE 
_________________________ 
Response Label Units 
_________________________ 
Y1 Time seconds 
_________________________ 
BLOCK INFORMATION 
Block name: BLOCK 
Block label: INDIVIDUAL 
Number of blocks: 16 
4.3 ANALYSIS OF SUDOKU PUZZLES WITH EASY DIFFICULTY RATING 
4.3.1 Factors Effects 
The analysis was started by looking at the effects of various factors as presented by the SAS 
software. 
To maintain a hierarchical model, the model was screened by omitting only 
the three-factor interaction ABC. ABC was therefore omitted throughout the analysis. 
Main Factors and Interaction Plots 
Figure 3: Main Effects plot 
The main effects plot above shows that, for factor C, the mean of all response values for which 
C=-1 is 600, while that for which C=+1 is 500. The interpretation is similar for factor A, while 
for factor B the mean response values for B=-1 and B=+1 were the same.
Figure 4: Interaction Plots 
From the interaction plots above, there is an apparent interaction between factors B and C, and 
between factors B and A, because the lines cross. However, the plots alone do not establish 
whether one or both of these interactions are statistically 
significant, even though they strongly suggest an effect. 
To determine which factor(s) or interaction(s) have a significant effect, other tools were used, 
such as the effects list and half – normal plots. 
Table 8: Effects List 
From Table 8 above, factors C and A and the interaction BA were significant, while factor B and 
the interactions CB and CA were not significant since they had a p – value greater than 0.05. 
Factors C and A exhibit negative effects while the others contribute positive effects.
Figure 9: Half – Normal Plot 
The half – normal plot was then used to analyze the same information with a more visual tool. 
The plot identified variable C as the most significant factor. The other factors fall on or 
near the line of insignificant effects, as shown in Figure 9 above. 
Figure 10: Pareto Plot 
To give the % effect contribution of each factor, a more appropriate plot (Pareto plot) was used 
as shown in figure 10 above. Once again, factor C has the highest contribution while factor B has 
the least contribution.
4.3.2 The Analysis of Variance (ANOVA) 
Table 9: ANOVA FOR EASY EXPERIMENT 
ANOVA for Y1 
Master Model 
Source    DF     SS          MS          F           Pr > F 
C         1      457788.5    457788.5    11.18521    0.001228 
B         1      2803.846    2803.846    0.068507    0.794157 
A         1      381634.6    381634.6    9.324534    0.003019 
C*B       1      70096.15    70096.15    1.71267     0.194167 
C*A       1      146250      146250      3.573348    0.062122 
B*A       1      174496.2    174496.2    4.26349     0.041987 
Model     18     8966008     498111.5    12.17043    0.0001 
Error     85     3478881     40928.01 
Total     103    12444888 
The ANOVA was run on the full model, and the p – values were compared against α = 0.05 to 
identify the significant terms. Variables C and A and the interaction BA 
were found to be significant. The above table shows the details. 
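The mean squares and F ratios in the ANOVA table follow directly from the sums of squares and degrees of freedom. The Python sketch below recomputes them from the values in Table 9 as a check; it is illustrative only, the variable names are assumptions, and the p-values (which require the F distribution) are not recomputed here.

```python
# Sums of squares for the six one-degree-of-freedom effects, from Table 9.
ss_effects = {
    "C": 457788.5,
    "B": 2803.846,
    "A": 381634.6,
    "C*B": 70096.15,
    "C*A": 146250.0,
    "B*A": 174496.2,
}
ss_error, df_error = 3478881.0, 85

ms_error = ss_error / df_error  # mean square error
# Each effect has DF = 1, so its mean square equals its sum of squares.
f_ratios = {term: ss / ms_error for term, ss in ss_effects.items()}

print(f"MS error = {ms_error:.2f}")
for term, f in f_ratios.items():
    print(f"{term:4s} F = {f:.4f}")
```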
Figure 11: Residual Plots 
Panels: Normal Plot of Residuals; Residual vs. Predicted
The next step was to assess model adequacy via the residual plots shown in the figure 
above. The plots indicated that the normality and equal variance 
assumptions were adequately met. 
The normal probability plot passed the “fat pencil” test 
and the residual vs. predicted plot showed no patterns. 
Since the model appeared adequate, there was no need to transform the data to find a more 
accurate model. 
In summary, when considering Easy Level Sudoku puzzles, the following factors are significant. 
• C – Number of Givens (negative) 
• A – Repetition of Numbers (negative) 
• BA – interaction effect (positive) 
Based on the percentage contribution as projected by Pareto plot, it is clear that the Number of 
Givens is far and away the most significant factor. 
4.3.3 Model Prediction 
Predictive Model for Y1 
Uncoded Levels: 
Y1 = 300 + 22.5*(BLOCK='10') + 465*(BLOCK='11') + 750*(BLOCK='12') 
+ 637.5*(BLOCK='13') + 472.5*(BLOCK='16') - 45*(BLOCK='2') 
+ 90*(BLOCK='3') - 97.5*(BLOCK='4') + 90*(BLOCK='5') + 22.5*(BLOCK='6') 
+ 465*(BLOCK='7') + 270*(BLOCK='8') - 66.34615*C - 60.57692*A 
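The predictive equation above can be evaluated directly. The sketch below is only an illustration of how to read it, under two assumptions that the report does not state explicitly: block 9 (the only block with data that has no term in the equation) is treated as the reference block with an offset of zero, and C and A take the coded levels -1 and +1 used in the appendix data. The function name `predict_y1` is hypothetical.

```python
# Block offsets copied from the predictive model for Y1.
BLOCK_OFFSETS = {
    "2": -45.0, "3": 90.0, "4": -97.5, "5": 90.0, "6": 22.5, "7": 465.0,
    "8": 270.0, "10": 22.5, "11": 465.0, "12": 750.0, "13": 637.5,
    "16": 472.5,
    "9": 0.0,  # assumption: block 9 is the reference level (no term in the equation)
}
INTERCEPT = 300.0
COEF_C = -66.34615   # Number of Givens
COEF_A = -60.57692   # Repetition of Numbers


def predict_y1(block, c, a):
    """Predicted completion time in seconds for a block and coded levels of C and A."""
    if block not in BLOCK_OFFSETS:
        raise KeyError(f"block {block!r} has no data in the easy experiment")
    return INTERCEPT + BLOCK_OFFSETS[block] + COEF_C * c + COEF_A * a


print(predict_y1("9", 0, 0))     # reference block at the centre of the design
print(predict_y1("12", -1, -1))  # slowest block, few givens, low repetition
print(predict_y1("4", 1, 1))     # fastest block, many givens, high repetition
```

Under these assumptions the intercept of 300 seconds is the prediction for the reference block at the centre of the design, so predictions for other participants shift by their block offset.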
Table 12: Fit Statistics for Y1 
____________________________________________ 
                 Master Model   Predictive Model 
____________________________________________ 
Mean             541.7308       541.7308 
R-square         72.05%         68.88% 
Adj. R-square    66.13%         63.99% 
RMSE             202.3067       208.5942 
CV               37.34451       38.50514 
____________________________________________ 
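The Master Model column of the fit statistics follows from the ANOVA in Table 9. The sketch below recomputes it with the standard definitions (R-square = SS_model / SS_total, adjusted R-square = 1 - MS_error / (SS_total / DF_total), RMSE = sqrt(MS_error), CV = 100 * RMSE / mean). The function name `fit_statistics` is illustrative, and the Predictive Model column is not reproduced because its ANOVA is not listed in the report.

```python
import math


def fit_statistics(ss_model, ss_error, df_error, df_total, mean_response):
    """Recompute R-square, adjusted R-square, RMSE and CV from ANOVA totals."""
    ss_total = ss_model + ss_error
    ms_error = ss_error / df_error
    rmse = math.sqrt(ms_error)
    return {
        "r_square": ss_model / ss_total,
        "adj_r_square": 1.0 - ms_error / (ss_total / df_total),
        "rmse": rmse,
        "cv": 100.0 * rmse / mean_response,
    }


# Model and Error rows of Table 9, and the mean response from Table 12.
easy = fit_statistics(
    ss_model=8966008.0,
    ss_error=3478881.0,
    df_error=85,
    df_total=103,
    mean_response=541.7308,
)
for name, value in easy.items():
    print(f"{name:13s} {value:.4f}")
```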
Response(s): 
___________________________________ 
Response Est. Value 
___________________________________ 
Y1 300 [147.6909, 452.3091] 
___________________________________ 
From the SAS output above, the regression equation indicates that a person solving an easy 
rating Sudoku puzzle would take approximately 300 seconds to complete it correctly, with a 
confidence interval of [147.6909, 452.3091] seconds at the α = 0.05 level of significance. 
4.4 ANALYSIS OF SUDOKU PUZZLES WITH MEDIUM DIFFICULTY RATING 
4.4.1 Factors Effects 
The analysis began by examining the effects of the various factors as reported by the SAS 
software. 
To maintain a hierarchical model, the model was screened by omitting only the three-factor 
interaction ABC. The interaction ABC was therefore omitted from the whole analysis. 
Main Factors and Interaction Plots 
Figure 11: Main Effect Plots 
The main effects plot above shows that, for factor C, the mean response values at C=-1 and 
C=+1 were the same. The same holds for factor A. For factor B, the mean of all response 
values for which B=-1 is 625, while the mean for which B=+1 is 675.
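The level means that a main effects plot displays can be computed directly from the design points. The sketch below is a minimal illustration on a subset of the data only: it uses the eight runs of block 8 from the medium experiment (Appendix 4.0), so its numbers are not expected to match the overall means of 625 and 675 quoted above. The helper names `level_means` and `main_effect` are hypothetical.

```python
from statistics import mean

# (C, B, A, Y) for block 8 of the medium experiment, copied from Appendix 4.0.
BLOCK_8_RUNS = [
    (-1, -1,  1,  962),
    (-1, -1, -1,  802),
    (-1,  1,  1,  637),
    ( 1, -1, -1,  651),
    (-1,  1, -1,  991),
    ( 1, -1,  1,  582),
    ( 1,  1, -1,  583),
    ( 1,  1,  1, 1040),
]
FACTOR_INDEX = {"C": 0, "B": 1, "A": 2}


def level_means(runs, factor):
    """Mean response at the low (-1) and high (+1) level of one factor."""
    idx = FACTOR_INDEX[factor]
    low = mean(run[3] for run in runs if run[idx] == -1)
    high = mean(run[3] for run in runs if run[idx] == 1)
    return low, high


def main_effect(runs, factor):
    """Main effect = mean response at +1 minus mean response at -1."""
    low, high = level_means(runs, factor)
    return high - low


for name in ("C", "B", "A"):
    low, high = level_means(BLOCK_8_RUNS, name)
    print(f"{name}: mean(-1) = {low}, mean(+1) = {high}, "
          f"effect = {main_effect(BLOCK_8_RUNS, name)}")
```

Running the same functions over all completed runs, instead of a single block, would give the points plotted in the main effects figure.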
Figure 12: Interaction Plots 
From the plots, there are possible interaction effects between factors A and C, between B and C, 
and between A and B, because the lines in each plot cross. 
Crossed lines make it difficult to determine from the plots alone whether one or both of the 
factors involved are significant, even though the interaction plot strongly suggests a 
significant effect. 
To determine which factors or interactions have a significant effect, other tools were used, 
including the effects list and the half-normal plot. 
Table 13: Effects List 
From the effects list above, none of the main factors or their interactions appears to be 
significant, since they all have p-values greater than 0.05 at the α = 0.05 level 
of significance.
To confirm the results of the effects list, a half-normal plot was used, as shown below. 
Figure 13: Half – Normal Plot 
Using a more visual tool (the half-normal plot), it was further confirmed that no factor 
is significant: some effects fall on the Lenth’s PSE line while the rest lie between the 
Lenth’s PSE line and the RMSE line. 
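Lenth's pseudo standard error (PSE), which gives the reference line in the half-normal plot, can be sketched from the sums of squares in Table 14 (Section 4.4.2). The code below is a rough illustration and not the SAS computation. It assumes the design is balanced, so that |effect| = 2 * sqrt(SS / N) with N = 88 runs; it applies Lenth's method to the six listed effects only; and it uses a hard-coded t quantile of 4.303 (two-sided 95%, 2 degrees of freedom, from m / 3 with m = 6 effects) for the margin of error. All names are hypothetical.

```python
import math
from statistics import median

# Sums of squares (1 DF each) copied from Table 14 (medium experiment).
SS_MEDIUM = {
    "C": 63.92045,
    "B": 38097.28,
    "A": 3108.284,
    "C*B": 2967.284,
    "C*A": 13975.92,
    "B*A": 122478.3,
}
N_RUNS = 88  # Total DF (87) + 1

# Assumption: t quantile for a two-sided 95% limit with m / 3 = 2 degrees of freedom.
T_975_DF2 = 4.303


def effect_magnitudes(ss_by_term, n_runs):
    """|effect| = 2 * sqrt(SS / N) for a balanced two-level design."""
    return {term: 2.0 * math.sqrt(ss / n_runs) for term, ss in ss_by_term.items()}


def lenth_pse(abs_effects):
    """Lenth's pseudo standard error of a collection of absolute effects."""
    abs_effects = list(abs_effects)
    s0 = 1.5 * median(abs_effects)
    trimmed = [e for e in abs_effects if e < 2.5 * s0]
    return 1.5 * median(trimmed)


effects = effect_magnitudes(SS_MEDIUM, N_RUNS)
pse = lenth_pse(effects.values())
margin_of_error = T_975_DF2 * pse

for term, size in sorted(effects.items(), key=lambda kv: -kv[1]):
    flag = "active" if size > margin_of_error else "inactive"
    print(f"{term:4s} |effect| = {size:7.2f}  ({flag})")
print(f"PSE = {pse:.2f}, margin of error = {margin_of_error:.2f}")
```

Under these assumptions every effect, including B*A, falls below the margin of error, which agrees with the reading of the plot that no factor is significant. As a check on the balance assumption, the same relation applied to the easy experiment (sqrt(457788.5 / 104)) recovers the magnitude of the coefficient of C in the predictive model, 66.346.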
Figure 14: Pareto Plot 
To show the percentage contribution of each factor, a Pareto plot was used, as shown in 
Figure 14 above. Interaction B*A has the highest contribution while factor C has the 
least contribution, even though none of these factors is significant.
4.4.2 The Analysis of Variance (ANOVA) 
Table 14: ANOVA FOR MEDIUM EXPERIMENT 
ANOVA for Y1 
Master Model 
____________________________________________________________ 
Source  DF   SS         MS         F          Pr > F 
____________________________________________________________ 
C       1    63.92045   63.92045   0.001401   0.970243 
B       1    38097.28   38097.28   0.835226   0.363859 
A       1    3108.284   3108.284   0.068144   0.794814 
C*B     1    2967.284   2967.284   0.065053   0.799418 
C*A     1    13975.92   13975.92   0.306401   0.581636 
B*A     1    122478.3   122478.3   2.685152   0.105712 
Model   16   3959457    247466.1   5.425321   0.0001 
Error   71   3238535    45613.17 
Total   87   7197992 
____________________________________________________________ 
When the ANOVA was run using the full model and the p-values were compared against 
α = 0.05, none of the variables had a p-value less than 0.05, confirming that no factor 
was significant. The above table shows the details. 
Figure 15: Residual Plots 
The next step was to check model adequacy using the residual plots shown in the figure 
above. The plots indicated that the normality and equal-variance assumptions were reasonably 
satisfied: the normal probability plot passed the “fat pencil” test and the residual vs. 
predicted plot showed no pattern. 
Since the model appeared adequate, it was unnecessary to transform the data to find a more 
accurate model. 
In summary, none of the hypothesized factors (Number of givens, Distribution of the givens 
and Redundancy of the givens) or their interactions had a significant influence on the 
difficulty of medium rating Sudoku puzzles. 
4.4.3 Model Prediction 
Table 15: Fit Statistics for Y1 
____________________________________________ 
                 Master Model   Predictive Model 
____________________________________________ 
Mean             644.0795       644.0795 
R-square         55.01%         55.01% 
Adj. R-square    44.87%         44.87% 
RMSE             213.5724       213.5724 
CV               33.15932       33.15932 
____________________________________________ 
Response(s): 
______________________________________ 
Response Est. Value 
______________________________________ 
Y1 464.75 [314.1888, 615.3112] 
______________________________________ 
From the SAS output above, the estimation table indicates that a person solving a medium 
rating Sudoku puzzle would take approximately 464.75 seconds to complete it correctly, 
with a confidence interval of [314.1888, 615.3112] seconds at the α = 0.05 level of 
significance. 
CHAPTER FIVE 
5.0 CONCLUSION AND RECOMMENDATIONS 
5.1 CONCLUSION 
Based on the two experiments discussed in this study, it is clear that the number of givens (i.e. 
factor C) had the greatest effect on the difficulty of an Easy rating Sudoku puzzle. This factor 
appeared to have only a minimal effect on the Medium rating Sudoku puzzles and failed to reach 
significance, as shown by its p-value (0.970243) in ANOVA Table 14. 
In addition, the Redundancy (repetition) of the givens (i.e. factor A) had a significant 
effect on the difficulty of an Easy rating Sudoku puzzle, but this factor also failed to reach 
significance in the Medium rating experiment. 
Furthermore, the interaction BA had a significant influence in the Easy rating 
experiment. 
In the Medium rating experiment, none of the hypothesized factors (Number of givens, 
Distribution of the givens and Redundancy of the givens) or their interactions had a 
significant influence on the difficulty of the puzzles. 
The Adjusted R² values reported for the Easy and Medium experiments were both low: 
0.6613 (66.13%) and 0.4487 (44.87%) respectively. Normally, this may be a cause for 
concern. In these particular experiments, the ANOVA shows that the block effects were 
highly significant. When the block effects are that significant, it becomes difficult to 
pick a single model that accurately describes the general behavior of the response. 
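The size of the block effects can be gauged from the ANOVA tables, even though they list no separate row for blocks. The sketch below assumes that the Model row consists of the block terms plus the six listed factor terms (which agrees with the degrees of freedom: 18 = 12 + 6 for the easy experiment and 16 = 10 + 6 for the medium experiment), so the block sum of squares is obtained by subtraction. The function name `block_share` is illustrative.

```python
def block_share(model_ss, total_ss, factor_ss):
    """Return (block SS, block SS as a fraction of total SS).

    Assumes Model SS = block SS + sum of the listed factor and interaction SS.
    """
    block_ss = model_ss - sum(factor_ss)
    return block_ss, block_ss / total_ss


# Values copied from Table 9 (easy) and Table 14 (medium).
easy_block_ss, easy_share = block_share(
    8966008.0, 12444888.0,
    [457788.5, 2803.846, 381634.6, 70096.15, 146250.0, 174496.2],
)
medium_block_ss, medium_share = block_share(
    3959457.0, 7197992.0,
    [63.92045, 38097.28, 3108.284, 2967.284, 13975.92, 122478.3],
)

print(f"Easy:   blocks explain {easy_share:.1%} of the total variation")
print(f"Medium: blocks explain {medium_share:.1%} of the total variation")
```

On this reading, differences between participants account for more than half of the total variation in both experiments, which is why a single equation describes an individual participant's completion time poorly.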
Since the Adjusted R² values for these experiments are small, the regression model is not 
expected to accurately predict the completion time for any puzzle. Nevertheless, according to 
these experiments the roughly projected completion times are 300 seconds and 464.75 seconds 
for the easy and medium experiments respectively. 
5.2 RECOMMENDATIONS 
Following the completion of the experiments, the investigator makes the following 
recommendations: 
1. Due to the inaccurate prediction of the completion time, further experimentation is 
required to accurately model the completion time for a specific person and specific 
Sudoku puzzle. 
2. During similar studies, participants (blocks) should be properly trained first to reduce the 
significance of their effects. 
3. Since, in this study, the investigator found no significant effect of the hypothesized 
factors in the medium experiment, different factors and/or careful analysis of these 
factors should be considered in future studies.
APPENDICES 
1.0 Data Collection Tool for Easy Rating Sudoku Puzzles 
Kindly complete the Sudoku puzzles below and provide the start time, end time and the delta 
time (i.e. the difference between the start and end times). 
Provide the delta time in seconds. 
1. 
9 7 5 3 
1 2 4 8 
3 1 8 6 
5 6 2 
2 3 4 9 1 5 
3 6 7 
6 4 8 9 
2. 
9 2 7 1 
5 1 4 2 
9 8 6 7 1 5 
6 8 2 3 4 
7 5 1 4 9 2 
8 2 4 9 
2 4 9 5 3 8 1 
5 9 3 8 6 
9 1 7 3 4 
7 8 2 1 5 
4 3 8 1 6 7 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
3. 
8 2 6 5 3 
5 9 2 4 
1 7 4 9 5 
9 5 2 1 3 4 
8 7 9 
3 8 9 6 1 
6 3 9 1 7 5 
1 8 4 6 
7 5 8 2 
4. 
6 3 4 9 1 
5 6 2 8 
8 1 7 6 
4 1 2 6 7 
9 7 5 4 
3 8 7 
4 3 1 9 
1 6 5 4 
7 2 8 3 5 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
5. 
8 3 
9 1 2 7 4 
2 8 6 7 9 
5 9 8 2 
8 6 7 4 3 
2 3 9 8 4 
4 1 3 5 7 
7 8 9 5 1 
9 7 8 6 
6. 
9 2 5 7 8 6 
5 7 4 1 3 2 
8 3 2 9 6 
2 1 5 9 4 6 
4 1 6 3 2 
6 8 2 1 3 
3 7 9 5 4 
2 4 1 6 7 
4 7 5 6 1 9 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
7. 
4 9 7 6 
1 7 4 8 5 
8 9 5 4 7 
4 9 7 8 2 5 
2 5 3 4 8 6 
8 6 1 4 7 
5 6 4 2 3 
4 6 5 7 9 
2 1 3 7 5 4 
8. 
4 7 5 3 1 
7 6 1 9 
1 2 8 3 
1 5 7 4 3 
9 8 1 
5 2 1 6 7 
8 4 3 
3 7 2 1 5 
7 9 6 1 2 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
2.0 Data Collection Tool for Medium Rating Sudoku Puzzles 
Kindly complete the Sudoku puzzles below and provide the start time, end time and the delta 
time (i.e. the difference between the start and end times). 
Provide the delta time in seconds. 
1. 
5 3 9 
6 5 7 2 
4 7 2 6 5 
1 7 
9 2 7 1 
1 9 
7 3 6 1 5 
3 4 1 8 
8 5 3 
2. 
5 2 9 4 
1 7 
7 4 6 1 9 
7 8 4 6 3 5 
4 1 9 
6 1 3 7 4 
2 7 4 5 
5 3 
1 3 5 8 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
3. 
1 6 2 5 
9 8 1 6 
2 7 6 1 
6 1 2 9 
3 5 4 1 
4 1 3 5 8 
1 7 9 
1 4 3 
9 8 6 1 
4. 
2 9 5 
7 8 1 6 
6 8 4 1 
1 9 3 7 
5 7 8 
8 2 3 9 
4 7 5 1 2 
6 4 5 3 
3 5 7 8 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
5. 
5 6 
3 1 6 5 4 
7 8 3 9 2 
5 8 3 
5 3 
6 7 2 
9 4 5 6 1 2 
6 3 2 7 9 
7 2 9 3 
6. 
3 5 8 
2 4 9 
2 7 8 
3 8 2 
6 1 2 7 9 5 
4 5 6 
9 1 7 6 2 
8 1 3 9 
4 5 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
7. 
3 1 7 8 6 4 
4 1 
5 4 
8 9 5 3 
9 4 
7 6 3 5 
8 6 5 
9 5 8 
6 8 1 5 2 4 9 
8. 
9 8 4 1 
1 7 9 3 6 8 
6 2 1 8 4 
9 8 2 
8 7 6 5 
7 1 4 5 8 
8 
5 8 7 2 
Start time:…………………….. 
End time:………………………. 
Delta time:……………………. 
Start time:…………………….. 
End time:………………………. 
Delta time:…………………….
3.0 Experiment data for Easy Sudoku puzzles 
DESIGN POINTS (Uncoded) 
_________________________________________________ 
RUN BLOCK C B A Y1 
_________________________________________________ 
61 8 -1 -1 1 360 
57 8 -1 -1 -1 600 
63 8 -1 1 1 600 
58 8 1 -1 -1 480 
59 8 -1 1 -1 540 
62 8 1 -1 1 600 
60 8 1 1 -1 660 
64 8 1 1 1 720 
42 6 1 -1 -1 300 
46 6 1 -1 1 180 
48 6 1 1 1 480 
44 6 1 1 -1 120 
47 6 -1 1 1 240 
43 6 -1 1 -1 300 
41 6 -1 -1 -1 540 
45 6 -1 -1 1 420 
75 10 -1 1 -1 360 
77 10 -1 -1 1 360 
73 10 -1 -1 -1 540 
76 10 1 1 -1 240 
79 10 -1 1 1 300 
74 10 1 -1 -1 300 
80 10 1 1 1 180 
78 10 1 -1 1 300 
3 1 -1 1 -1 . 
5 1 -1 -1 1 . 
4 1 1 1 -1 . 
7 1 -1 1 1 . 
1 1 -1 -1 -1 . 
6 1 1 -1 1 . 
8 1 1 1 1 . 
2 1 1 -1 -1 . 
37 5 -1 -1 1 420 
40 5 1 1 1 300 
36 5 1 1 -1 300 
35 5 -1 1 -1 300 
33 5 -1 -1 -1 780 
34 5 1 -1 -1 360 
38 5 1 -1 1 240 
39 5 -1 1 1 420 
10 2 1 -1 -1 300 
11 2 -1 1 -1 120 
14 2 1 -1 1 180 
15 2 -1 1 1 480 
9 2 -1 -1 -1 360 
16 2 1 1 1 300
12 2 1 1 -1 120 
13 2 -1 -1 1 180 
104 13 1 1 1 960 
100 13 1 1 -1 840 
99 13 -1 1 -1 1200 
101 13 -1 -1 1 780 
98 13 1 -1 -1 600 
103 13 -1 1 1 1140 
97 13 -1 -1 -1 1200 
102 13 1 -1 1 780 
84 11 1 1 -1 960 
83 11 -1 1 -1 840 
86 11 1 -1 1 420 
81 11 -1 -1 -1 660 
88 11 1 1 1 960 
85 11 -1 -1 1 600 
82 11 1 -1 -1 780 
87 11 -1 1 1 900 
92 12 1 1 -1 780 
90 12 1 -1 -1 1140 
91 12 -1 1 -1 900 
94 12 1 -1 1 660 
93 12 -1 -1 1 1320 
95 12 -1 1 1 540 
89 12 -1 -1 -1 2100 
96 12 1 1 1 960 
125 16 -1 -1 1 660 
121 16 -1 -1 -1 1020 
124 16 1 1 -1 720 
128 16 1 1 1 660 
127 16 -1 1 1 780 
126 16 1 -1 1 600 
122 16 1 -1 -1 960 
123 16 -1 1 -1 780 
22 3 1 -1 1 240 
19 3 -1 1 -1 900 
17 3 -1 -1 -1 600 
20 3 1 1 -1 300 
23 3 -1 1 1 480 
21 3 -1 -1 1 300 
24 3 1 1 1 120 
18 3 1 -1 -1 180 
67 9 -1 1 -1 240 
71 9 -1 1 1 240 
68 9 1 1 -1 240 
70 9 1 -1 1 240 
69 9 -1 -1 1 240 
66 9 1 -1 -1 240 
72 9 1 1 1 240 
65 9 -1 -1 -1 720 
54 7 1 -1 1 420 
49 7 -1 -1 -1 720 
52 7 1 1 -1 720 
55 7 -1 1 1 420 
51 7 -1 1 -1 1560 
50 7 1 -1 -1 960 
56 7 1 1 1 600 
53 7 -1 -1 1 720 
29 4 -1 -1 1 180 
28 4 1 1 -1 180 
32 4 1 1 1 240 
30 4 1 -1 1 180 
26 4 1 -1 -1 180 
27 4 -1 1 -1 240 
25 4 -1 -1 -1 240 
31 4 -1 1 1 180 
111 14 -1 1 1 . 
107 14 -1 1 -1 . 
110 14 1 -1 1 . 
105 14 -1 -1 -1 . 
112 14 1 1 1 . 
109 14 -1 -1 1 . 
108 14 1 1 -1 . 
106 14 1 -1 -1 . 
113 15 -1 -1 -1 . 
118 15 1 -1 1 . 
120 15 1 1 1 . 
114 15 1 -1 -1 . 
115 15 -1 1 -1 . 
116 15 1 1 -1 . 
119 15 -1 1 1 . 
117 15 -1 -1 1 . 
_________________________________________________ 
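Rows of the design-point listing above can be read programmatically, with "." treated as a missing response. The following is a minimal sketch of one way to do this, not the procedure used in the study. The sample text holds the rows for blocks 9 and 10 plus two unanswered rows from block 1, copied from the table, and the helper names `parse_rows` and `block_means` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

SAMPLE = """\
67 9 -1 1 -1 240
71 9 -1 1 1 240
68 9 1 1 -1 240
70 9 1 -1 1 240
69 9 -1 -1 1 240
66 9 1 -1 -1 240
72 9 1 1 1 240
65 9 -1 -1 -1 720
75 10 -1 1 -1 360
77 10 -1 -1 1 360
73 10 -1 -1 -1 540
76 10 1 1 -1 240
79 10 -1 1 1 300
74 10 1 -1 -1 300
80 10 1 1 1 180
78 10 1 -1 1 300
3 1 -1 1 -1 .
5 1 -1 -1 1 .
"""


def parse_rows(text):
    """Parse 'RUN BLOCK C B A Y1' rows; a '.' response becomes None."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 6:
            continue  # skip headers, rules and blank lines
        run, block, c, b, a = (int(p) for p in parts[:5])
        y = None if parts[5] == "." else float(parts[5])
        rows.append({"run": run, "block": block, "C": c, "B": b, "A": a, "Y1": y})
    return rows


def block_means(rows):
    """Mean completion time per block, ignoring missing responses."""
    by_block = defaultdict(list)
    for row in rows:
        if row["Y1"] is not None:
            by_block[row["block"]].append(row["Y1"])
    return {block: mean(values) for block, values in by_block.items()}


rows = parse_rows(SAMPLE)
means = block_means(rows)
print(means)
```

The block 9 mean equals the intercept of the predictive model for Y1 (300), and the block 10 mean exceeds it by 22.5, the coefficient of BLOCK='10' in that equation.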
4.0 Experiment data for Medium Sudoku puzzles 
DESIGN POINTS (Uncoded) 
_________________________________________________ 
RUN BLOCK C B A Y 
_________________________________________________ 
61 8 -1 -1 1 962 
57 8 -1 -1 -1 802 
63 8 -1 1 1 637 
58 8 1 -1 -1 651 
59 8 -1 1 -1 991 
62 8 1 -1 1 582 
60 8 1 1 -1 583 
64 8 1 1 1 1040 
42 6 1 -1 -1 480 
46 6 1 -1 1 660 
48 6 1 1 1 582 
44 6 1 1 -1 1045 
47 6 -1 1 1 370 
43 6 -1 1 -1 750 
41 6 -1 -1 -1 490 
45 6 -1 -1 1 495 
75 10 -1 1 -1 1191 
77 10 -1 -1 1 1161 
73 10 -1 -1 -1 469 
76 10 1 1 -1 1333 
79 10 -1 1 1 928 
74 10 1 -1 -1 984 
80 10 1 1 1 715 
78 10 1 -1 1 1269 
3 1 -1 1 -1 277 
5 1 -1 -1 1 373 
4 1 1 1 -1 449 
7 1 -1 1 1 544 
1 1 -1 -1 -1 540 
6 1 1 -1 1 500 
8 1 1 1 1 608 
2 1 1 -1 -1 427 
37 5 -1 -1 1 600 
40 5 1 1 1 540 
36 5 1 1 -1 540 
35 5 -1 1 -1 540 
33 5 -1 -1 -1 840 
34 5 1 -1 -1 600 
38 5 1 -1 1 720 
39 5 -1 1 1 900 
10 2 1 -1 -1 . 
11 2 -1 1 -1 . 
14 2 1 -1 1 . 
15 2 -1 1 1 . 
9 2 -1 -1 -1 . 
16 2 1 1 1 .
12 2 1 1 -1 . 
13 2 -1 -1 1 . 
104 13 1 1 1 358 
100 13 1 1 -1 267 
99 13 -1 1 -1 269 
101 13 -1 -1 1 297 
98 13 1 -1 -1 420 
103 13 -1 1 1 337 
97 13 -1 -1 -1 361 
102 13 1 -1 1 540 
84 11 1 1 -1 . 
83 11 -1 1 -1 . 
86 11 1 -1 1 . 
81 11 -1 -1 -1 . 
88 11 1 1 1 . 
85 11 -1 -1 1 . 
82 11 1 -1 -1 . 
87 11 -1 1 1 . 
92 12 1 1 -1 420 
90 12 1 -1 -1 540 
91 12 -1 1 -1 1500 
94 12 1 -1 1 600 
93 12 -1 -1 1 780 
95 12 -1 1 1 780 
89 12 -1 -1 -1 840 
96 12 1 1 1 780 
125 16 -1 -1 1 . 
121 16 -1 -1 -1 . 
124 16 1 1 -1 . 
128 16 1 1 1 . 
127 16 -1 1 1 . 
126 16 1 -1 1 . 
122 16 1 -1 -1 . 
123 16 -1 1 -1 . 
22 3 1 -1 1 285 
19 3 -1 1 -1 180 
17 3 -1 -1 -1 435 
20 3 1 1 -1 270 
23 3 -1 1 1 225 
21 3 -1 -1 1 270 
24 3 1 1 1 570 
18 3 1 -1 -1 315 
67 9 -1 1 -1 590 
71 9 -1 1 1 382 
68 9 1 1 -1 874 
70 9 1 -1 1 366 
69 9 -1 -1 1 682 
66 9 1 -1 -1 304 
72 9 1 1 1 785 
65 9 -1 -1 -1 404 
54 7 1 -1 1 . 
49 7 -1 -1 -1 . 
52 7 1 1 -1 . 
55 7 -1 1 1 . 
51 7 -1 1 -1 . 
50 7 1 -1 -1 . 
56 7 1 1 1 . 
53 7 -1 -1 1 . 
29 4 -1 -1 1 800 
28 4 1 1 -1 720 
32 4 1 1 1 360 
30 4 1 -1 1 480 
26 4 1 -1 -1 960 
27 4 -1 1 -1 420 
25 4 -1 -1 -1 780 
31 4 -1 1 1 660 
111 14 -1 1 1 . 
107 14 -1 1 -1 . 
110 14 1 -1 1 . 
105 14 -1 -1 -1 . 
112 14 1 1 1 . 
109 14 -1 -1 1 . 
108 14 1 1 -1 . 
106 14 1 -1 -1 . 
113 15 -1 -1 -1 735 
118 15 1 -1 1 1275 
120 15 1 1 1 720 
114 15 1 -1 -1 645 
115 15 -1 1 -1 1155 
116 15 1 1 -1 1215 
119 15 -1 1 1 855 
117 15 -1 -1 1 705 
_________________________________________________ 

More Related Content

What's hot

scatter diagram
 scatter diagram scatter diagram
scatter diagramshrey8916
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learningshivani saluja
 
Supervised learning and unsupervised learning
Supervised learning and unsupervised learningSupervised learning and unsupervised learning
Supervised learning and unsupervised learningArunakumariAkula1
 
Methodology of qualitative research.pptx
Methodology of qualitative research.pptxMethodology of qualitative research.pptx
Methodology of qualitative research.pptxDr. Hina Kaynat
 
Exploratory Data Analysis - Satyajit.pdf
Exploratory Data Analysis - Satyajit.pdfExploratory Data Analysis - Satyajit.pdf
Exploratory Data Analysis - Satyajit.pdfAmmarAhmedSiddiqui2
 
Exploratory Data Analysis using Python
Exploratory Data Analysis using PythonExploratory Data Analysis using Python
Exploratory Data Analysis using PythonShirin Mojarad, Ph.D.
 
Lecture 3 Computer Science Research SEM1 22_23 (1).pptx
Lecture 3 Computer Science Research SEM1 22_23 (1).pptxLecture 3 Computer Science Research SEM1 22_23 (1).pptx
Lecture 3 Computer Science Research SEM1 22_23 (1).pptxNabilaHassan13
 
Clinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-StatisticiansClinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-StatisticiansBrook White, PMP
 
1 Introduction to Biostatistics last.pptx
1 Introduction to Biostatistics last.pptx1 Introduction to Biostatistics last.pptx
1 Introduction to Biostatistics last.pptxdebabatolosa
 
Levels of measurement
Levels of measurementLevels of measurement
Levels of measurementBbte Rein
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learningbutest
 
data analysis techniques and statistical softwares
data analysis techniques and statistical softwaresdata analysis techniques and statistical softwares
data analysis techniques and statistical softwaresDr.ammara khakwani
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data SciencePhilip Bourne
 
Chi square tests using SPSS
Chi square tests using SPSSChi square tests using SPSS
Chi square tests using SPSSParag Shah
 
Quantitative research methods in medicine dr. baxi
Quantitative research methods in medicine   dr. baxiQuantitative research methods in medicine   dr. baxi
Quantitative research methods in medicine dr. baxiIndian Health Journal
 
Analysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowAnalysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowStat Analytica
 

What's hot (20)

scatter diagram
 scatter diagram scatter diagram
scatter diagram
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Supervised learning and unsupervised learning
Supervised learning and unsupervised learningSupervised learning and unsupervised learning
Supervised learning and unsupervised learning
 
Methodology of qualitative research.pptx
Methodology of qualitative research.pptxMethodology of qualitative research.pptx
Methodology of qualitative research.pptx
 
Exploratory Data Analysis - Satyajit.pdf
Exploratory Data Analysis - Satyajit.pdfExploratory Data Analysis - Satyajit.pdf
Exploratory Data Analysis - Satyajit.pdf
 
Exploratory Data Analysis using Python
Exploratory Data Analysis using PythonExploratory Data Analysis using Python
Exploratory Data Analysis using Python
 
Statistics Introduction
Statistics IntroductionStatistics Introduction
Statistics Introduction
 
Lecture 3 Computer Science Research SEM1 22_23 (1).pptx
Lecture 3 Computer Science Research SEM1 22_23 (1).pptxLecture 3 Computer Science Research SEM1 22_23 (1).pptx
Lecture 3 Computer Science Research SEM1 22_23 (1).pptx
 
Biases
BiasesBiases
Biases
 
Clinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-StatisticiansClinical Research Statistics for Non-Statisticians
Clinical Research Statistics for Non-Statisticians
 
1 Introduction to Biostatistics last.pptx
1 Introduction to Biostatistics last.pptx1 Introduction to Biostatistics last.pptx
1 Introduction to Biostatistics last.pptx
 
Levels of measurement
Levels of measurementLevels of measurement
Levels of measurement
 
Ethics in research
Ethics in researchEthics in research
Ethics in research
 
Semi-supervised Learning
Semi-supervised LearningSemi-supervised Learning
Semi-supervised Learning
 
data analysis techniques and statistical softwares
data analysis techniques and statistical softwaresdata analysis techniques and statistical softwares
data analysis techniques and statistical softwares
 
Diabetes Data Science
Diabetes Data ScienceDiabetes Data Science
Diabetes Data Science
 
Chi square tests using SPSS
Chi square tests using SPSSChi square tests using SPSS
Chi square tests using SPSS
 
L14. Anomaly Detection
L14. Anomaly DetectionL14. Anomaly Detection
L14. Anomaly Detection
 
Quantitative research methods in medicine dr. baxi
Quantitative research methods in medicine   dr. baxiQuantitative research methods in medicine   dr. baxi
Quantitative research methods in medicine dr. baxi
 
Analysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowAnalysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to know
 

Viewers also liked

Statistics Project
Statistics ProjectStatistics Project
Statistics ProjectRonan Santos
 
Statistical survey project
Statistical survey projectStatistical survey project
Statistical survey projectjep2792
 
Stats survey project
Stats survey projectStats survey project
Stats survey projectdacevedo10
 
Introduction to the statistics project
Introduction to the statistics projectIntroduction to the statistics project
Introduction to the statistics projectpmakunja
 
Statistics student sample project (1)
Statistics student sample project (1)Statistics student sample project (1)
Statistics student sample project (1)Jef Faciol
 
Questionnaire mba project
Questionnaire  mba  projectQuestionnaire  mba  project
Questionnaire mba projectAashi Yadav
 
Brand questionnaire
Brand questionnaireBrand questionnaire
Brand questionnaireyasiniub
 

Viewers also liked (11)

Statistics Project
Statistics ProjectStatistics Project
Statistics Project
 
Statistical survey project
Statistical survey projectStatistical survey project
Statistical survey project
 
Statistical Project
Statistical ProjectStatistical Project
Statistical Project
 
Stats survey project
Stats survey projectStats survey project
Stats survey project
 
Introduction to the statistics project
Introduction to the statistics projectIntroduction to the statistics project
Introduction to the statistics project
 
statistics project
statistics projectstatistics project
statistics project
 
Statistics student sample project (1)
Statistics student sample project (1)Statistics student sample project (1)
Statistics student sample project (1)
 
Project report format
Project report formatProject report format
Project report format
 
Statistical ppt
Statistical pptStatistical ppt
Statistical ppt
 
Questionnaire mba project
Questionnaire  mba  projectQuestionnaire  mba  project
Questionnaire mba project
 
Brand questionnaire
Brand questionnaireBrand questionnaire
Brand questionnaire
 

Similar to BSc Statistical Project

QBD_1464843125535 - Copy
QBD_1464843125535 - CopyQBD_1464843125535 - Copy
QBD_1464843125535 - CopyBhavesh Jangale
 
Nweke digital-forensics-masters-thesis-sapienza-university-italy
Nweke digital-forensics-masters-thesis-sapienza-university-italyNweke digital-forensics-masters-thesis-sapienza-university-italy
Nweke digital-forensics-masters-thesis-sapienza-university-italyAimonJamali
 
A.R.C. Usability Evaluation
A.R.C. Usability EvaluationA.R.C. Usability Evaluation
A.R.C. Usability EvaluationJPC Hanson
 
Smart Speaker as Studying Assistant by Joao Pargana
Smart Speaker as Studying Assistant by Joao ParganaSmart Speaker as Studying Assistant by Joao Pargana
Smart Speaker as Studying Assistant by Joao ParganaHendrik Drachsler
 
Work Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerWork Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerAdel Belasker
 
ImplementationOFDMFPGA
ImplementationOFDMFPGAImplementationOFDMFPGA
ImplementationOFDMFPGANikita Pinto
 
UCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalUCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalGustavo Pabon
 
UCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalUCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalGustavo Pabon
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Cooper Wakefield
 
Electricity and MagnetismSimulation Worksheets and LabsP.docx
Electricity and MagnetismSimulation Worksheets and LabsP.docxElectricity and MagnetismSimulation Worksheets and LabsP.docx
Electricity and MagnetismSimulation Worksheets and LabsP.docxSALU18
 
KurtPortelliMastersDissertation
KurtPortelliMastersDissertationKurtPortelliMastersDissertation
KurtPortelliMastersDissertationKurt Portelli
 
iGUARD: An Intelligent Way To Secure - Report
iGUARD: An Intelligent Way To Secure - ReportiGUARD: An Intelligent Way To Secure - Report
iGUARD: An Intelligent Way To Secure - ReportNandu B Rajan
 
project Report on LAN Security Manager
project Report on LAN Security Managerproject Report on LAN Security Manager
project Report on LAN Security ManagerShahrikh Khan
 
Measuring Aspect-Oriented Software In Practice
Measuring Aspect-Oriented Software In PracticeMeasuring Aspect-Oriented Software In Practice
Measuring Aspect-Oriented Software In PracticeHakan Özler
 
Undergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringUndergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringPriyanka Pandit
 

Similar to BSc Statistical Project (20)

QBD_1464843125535 - Copy
QBD_1464843125535 - CopyQBD_1464843125535 - Copy
QBD_1464843125535 - Copy
 
thesis
thesisthesis
thesis
 
HonsTokelo
HonsTokeloHonsTokelo
HonsTokelo
 
Nweke digital-forensics-masters-thesis-sapienza-university-italy
Nweke digital-forensics-masters-thesis-sapienza-university-italyNweke digital-forensics-masters-thesis-sapienza-university-italy
Nweke digital-forensics-masters-thesis-sapienza-university-italy
 
A.R.C. Usability Evaluation
A.R.C. Usability EvaluationA.R.C. Usability Evaluation
A.R.C. Usability Evaluation
 
Smart Speaker as Studying Assistant by Joao Pargana
Smart Speaker as Studying Assistant by Joao ParganaSmart Speaker as Studying Assistant by Joao Pargana
Smart Speaker as Studying Assistant by Joao Pargana
 
Fulltext02
Fulltext02Fulltext02
Fulltext02
 
Work Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel BelaskerWork Measurement Application - Ghent Internship Report - Adel Belasker
Work Measurement Application - Ghent Internship Report - Adel Belasker
 
thesis
thesisthesis
thesis
 
ImplementationOFDMFPGA
ImplementationOFDMFPGAImplementationOFDMFPGA
ImplementationOFDMFPGA
 
UCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalUCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_final
 
UCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_finalUCHILE_M_Sc_Thesis_final
UCHILE_M_Sc_Thesis_final
 
Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...Im-ception - An exploration into facial PAD through the use of fine tuning de...
Im-ception - An exploration into facial PAD through the use of fine tuning de...
 
Electricity and MagnetismSimulation Worksheets and LabsP.docx
Electricity and MagnetismSimulation Worksheets and LabsP.docxElectricity and MagnetismSimulation Worksheets and LabsP.docx
Electricity and MagnetismSimulation Worksheets and LabsP.docx
 
KurtPortelliMastersDissertation
KurtPortelliMastersDissertationKurtPortelliMastersDissertation
KurtPortelliMastersDissertation
 
iGUARD: An Intelligent Way To Secure - Report
iGUARD: An Intelligent Way To Secure - ReportiGUARD: An Intelligent Way To Secure - Report
iGUARD: An Intelligent Way To Secure - Report
 
project Report on LAN Security Manager
project Report on LAN Security Managerproject Report on LAN Security Manager
project Report on LAN Security Manager
 
Measuring Aspect-Oriented Software In Practice
Measuring Aspect-Oriented Software In PracticeMeasuring Aspect-Oriented Software In Practice
Measuring Aspect-Oriented Software In Practice
 
KHAN_FAHAD_FL14
KHAN_FAHAD_FL14KHAN_FAHAD_FL14
KHAN_FAHAD_FL14
 
Undergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and EngineeringUndergrad Thesis | Information Science and Engineering
Undergrad Thesis | Information Science and Engineering
 

BSc Statistical Project

  • 1. UNIVERSITY OF KABIANGA SCHOOL OF SCIENCE AND TECHNOLOGY DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE THE STUDY ON THE FACTORS AFFECTING THE DIFFICULTY OF A SUDOKU PUZZLE A PROJECT REPORT SUBMITTED IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE AWARD OF DEGREE OF BACHELOR OF SCIENCE IN APPLIED STATISTICS WITH COMPUTING OF UNIVERSITY OF KABIANGA BY OKOYO COLLINS OMONDI AST/0037/09 SUPERVISORS: MR. RUEBEN C. LANGA’T MR. TONUI B. APRIL 2013
  • 2. DECLARATION I, Okoyo Collins Omondi, do hereby declare that this project report is my original work and has not been presented for an award of degree in any other university. Sign:……………………………. Date:……………………………… This project has been submitted for examination with our approval as University supervisors. Supervisors: 1. Mr. Rueben C. Langa’t Department of Mathematics and Computer Science University of Kabianga Signature:…………………. Date:…………………….. 2. Mr. Tonui B. Department of Mathematics and Computer Science University of Kabianga Signature:…………………. Date:……………………..
  • 3. DEDICATION This project is dedicated to all the Sudoku players and hobbyists. I also dedicate this work to my beloved parents Charles Okoyo and Lucy Okoyo for their unconditional material and financial support, and to my siblings Everlyne, Evans, Basil, Sheilah and Oliver for their overwhelming social support.
  • 4. ACRONYMS SAS: Statistical Analysis System. SPSS: Statistical Package for the Social Sciences. DOE: Design of Experiments.
  • 5. DEFINITION OF TERMS
    Box: A 3 by 3 grid inside the Sudoku puzzle. It works the same as rows and columns, meaning it must contain the digits 1 to 9.
    Region: A row, column or box.
    Candidate: An empty square in a Sudoku puzzle has a certain set of numbers that do not conflict with the row, column and box it is in. Those numbers are called candidates or candidate numbers.
    Given: A number in the original Sudoku puzzle. A Sudoku puzzle has a certain number of such clues, which are then used to fill in new squares. A number filled in by the solver is, however, not regarded as a given.
    Complete block: A block in which each treatment appears.
    Response: The process output.
    Factor: Uncontrolled or controlled variable whose influence is being studied.
    Level: Setting of a factor (+, -, 1, -1, high, low, alpha, numeric).
    Run: A treatment combination; setting all factors to obtain a response.
    Replicate: Number of times a treatment combination is run (usually randomized).
    Repeat: Non-randomized replicate.
    Inference space: Operating range of the factors under study.
    Design Expert: Software used to design experiments.
  • 6. ABSTRACT This project demonstrates how to apply design of experiments, in particular the full factorial design, to gain insight into real, everyday statistical problems and situations. The design of this project ultimately results in an intuitive understanding of the statistical procedures and strategies most often used by practicing statisticians and scientists. Hence, it is expected that the choice to study the factors affecting the difficulty of a Sudoku puzzle provides a real statistical problem for designing different experiments.
  • 7. TABLE OF CONTENTS
    DECLARATION ... 2
    DEDICATION ... 3
    ACRONYMS ... 4
    DEFINITION OF TERMS ... 5
    ABSTRACT ... 6
    CHAPTER ONE ... 10
    1.0 BACKGROUND ... 10
    1.1 PROBLEM STATEMENT ... 11
    1.2 STUDY PURPOSE ... 11
    1.3 STUDY OBJECTIVES ... 11
    CHAPTER TWO ... 12
    2.0 LITERATURE REVIEW ... 12
    2.1 INTRODUCTION ... 12
    2.2 HOW TO PLAY ... 13
    2.3 OPERATIONALIZATION OF VARIABLES ... 14
    2.3.1 Response Variable ... 14
    2.3.2 Control Variables ... 15
    CHAPTER THREE ... 18
    3.0 EXPERIMENTAL DESIGN ... 18
    3.1 PERFORMING THE EXPERIMENT ... 20
    3.2 STATISTICAL ANALYSIS ... 20
    CHAPTER FOUR ... 21
    4.0 RESULTS AND ANALYSIS OF DATA ... 21
    4.1 INTRODUCTION ... 21
    4.2 SUMMARY OF THE EXPERIMENTS ... 21
    4.3 ANALYSIS OF SUDOKU PUZZLES WITH EASY DIFFICULTY RATING ... 22
    4.3.1 Factors Effects ... 22
    4.3.2 The Analysis of Variance (ANOVA) ... 25
    4.3.3 Model Prediction ... 26
    4.4 ANALYSIS OF SUDOKU PUZZLES WITH MEDIUM DIFFICULTY RATING ... 27
  • 8. TABLE OF CONTENTS (continued)
    4.4.1 Factors Effects ... 27
    4.4.2 The Analysis of Variance (ANOVA) ... 30
    4.4.3 Model Prediction ... 31
    CHAPTER FIVE ... 32
    5.0 CONCLUSION AND RECOMMENDATIONS ... 32
    5.1 CONCLUSION ... 32
    5.2 RECOMMENDATIONS ... 33
    APPENDICES ... 34
    1.0 Data Collection Tool for Easy Rating Sudoku Puzzles ... 34
    2.0 Data Collection Tool for Medium Rating Sudoku Puzzles ... 38
    3.0 Experiment data for Easy Sudoku puzzles ... 42
    4.0 Experiment data for Medium Sudoku puzzles ... 45
    REFERENCES ... 48
    LIST OF FIGURES
    Figure 1: Sudoku Grid with Row, Column and Box Names ... 12
    Figure 2: General view of Sudoku game environment ... 13
    Figure 3: Main Effects plot ... 22
    Figure 4: Interaction Plots ... 23
    Figure 9: Half-Normal Plot ... 24
    Figure 10: Pareto Plot ... 24
    Figure 11: Residual Plots ... 25
    Figure 11: Main Effect Plots ... 27
    Figure 12: Interaction Plots ... 28
    Figure 13: Half-Normal Plot ... 29
    Figure 14: Pareto Plot ... 29
    Figure 15: Residual Plots ... 30
  • 9. LIST OF TABLES
    Table 1: The amount ranges of givens in each difficult level ... 15
    Table 2: Variation of the Number of Givens ... 15
    Table 3: Variation of the Distribution of Givens ... 16
    Table 4: Variation of the Redundant Numbers ... 17
    Table 5: The design matrix for each experiment in coded values ... 18
    Table 6: The design matrix in statistics values for the Easy experiment ... 19
    Table 7: The design matrix in statistics values for the Medium experiment ... 19
    Table 8: Effects List ... 23
    Table 9: ANOVA FOR EASY EXPERIMENT ... 25
    Table 12: Fit Statistics for Y1 ... 26
    Table 13: Effects List ... 28
    Table 14: ANOVA FOR MEDIUM EXPERIMENT ... 30
    Table 15: Fit Statistics for Y1 ... 31
  • 10. CHAPTER ONE 1.0 BACKGROUND In recent years Sudoku puzzles have become an increasingly popular pastime. Sudoku’s simple set of rules and multiple levels of puzzle difficulty attract hobbyists with varying skills and experience. Typically, Sudoku puzzles are categorized by level of difficulty, i.e. Easy, Medium, Hard, Killer, etc. Yet it is not uncommon for the perceived difficulty of puzzles to vary greatly, even within a single difficulty rating. The goal of this project is to determine whether additional factors, other than the published difficulty rating, have an effect on the difficulty of a Sudoku puzzle. To keep the experimental results practical for the casual Sudoku hobbyist, factors that can be easily estimated by visual inspection of the puzzle were chosen. The factors are:
    - Number of givens: the number of initial givens provided in the puzzle.
    - Distribution of givens: the relative placement of initial givens in the puzzle.
    - Redundant numbers: the repetition of specific numbers in a puzzle’s set of initial givens.
    The interest is in the effects of these factors on puzzles of a single published difficulty level. For this project a full 2^3 factorial experiment was performed on a set of eight “Easy” puzzles. The experiment was then repeated on eight “Medium” puzzles. The results of the two experiments were used to show which of the above-mentioned factors have the largest effect on puzzle difficulty. Several other effects and their significance were also determined. In addition, a general model equation was determined that approximates, as accurately as possible, the time a Sudoku hobbyist is expected to take to solve a puzzle, taking the above factors into account and holding other factors constant. The rest of the project report is organized as follows: Chapter 2 provides a brief description of the objective of the two experiments and gives a quick background to the problem. Chapter 3 describes the experimental design used for the two experiments and how the experiments were run. Chapter 4 provides the results and an analysis of the data. The conclusion and recommendations are given in Chapter 5.
  • 11. 1.1 PROBLEM STATEMENT Multiple factors affect the difficulty of a Sudoku puzzle apart from the published difficulty rating (extremely easy, easy, medium, difficult, evil, etc.). This project is limited to studying only three such factors: 1. The number of initial givens. 2. The distribution of the givens. 3. The redundancy of the individual givens. It is expected that knowledge of the effects of these factors would reduce the time taken by Sudoku hobbyists in playing the game, subsequently making the game more enjoyable and interesting. 1.2 STUDY PURPOSE Sudoku is today a popular game throughout the world and it appears in multiple media, including websites, newspapers and books. As a result, it is of interest to find the factors affecting the difficulty of a Sudoku puzzle besides the published difficulty rating of extremely easy, easy, medium, difficult, evil, etc. Another goal of this study is to contribute to the knowledge and comprehension of Latin square designs, factorial designs and the design of experiments in general, as the analysis of the factors affecting the difficulty of Sudoku puzzles employs these designs. 1.3 STUDY OBJECTIVES The broad objective of this experiment is to quantify the effects of a Sudoku puzzle’s initial structure and set of givens on the expected time required to complete the puzzle. In particular, the experiment seeks to answer the following questions: 1. What are the effects of the number of givens, the distribution of the givens and the redundancy of the specific givens on the difficulty of a Sudoku puzzle? 2. Are the effects of each factor consistent across the levels of each of the other factors?
  • 12. CHAPTER TWO 2.0 LITERATURE REVIEW 2.1 INTRODUCTION The Sudoku puzzle, a widely popular intellectual game in recent years, is commonly traced to the Latin squares studied by the Swiss mathematician Leonhard Euler in the 18th century. It was then developed and popularized in Japan in recent decades. The name Sudoku derives from Japanese and means “number place” [1]. Due to its simple, beginner-friendly rules and the charm of its intellectual challenge, Sudoku has recently become popular with players of various ages. It is even possible to solve a Sudoku puzzle without any mathematical knowledge. A Sudoku puzzle consists of a table with nine rows and nine columns. The squares (i.e. cells in the table) are grouped in sets of nine which we will call boxes. For clarity we will call the rows r1, r2, … r9, the columns c1, c2, … c9, and the boxes b1, b2, … b9. Figure 1 provides a diagram showing a sample Sudoku grid with row, column, and box names. The squares are named s_ij, where i is the row number and j is the column number.
    [Figure 1: Sudoku Grid with Row, Column and Box Names. A 9-by-9 grid with columns c1 to c9, rows r1 to r9 and boxes b1 to b9, containing a few sample givens.] Source: www.onlinegames.com/ sudokugame.
  • 13. 2.2 HOW TO PLAY How is the Sudoku game played? “You only need to know where you play the game and what your goal is. The simple aspects that help you join the game are specified as follows” [2]: Game Environment: a general overview of the game board is shown below. [Figure 2: General view of Sudoku game environment] Several basic components of the board are defined as Figure 2 illustrates. The whole board is a 9-by-9 grid made of nine smaller 3-by-3 grids called blocks. The smallest unit square is called a cell, which has two possible states: empty, or confirmed by a digit from 1 through 9. The rows and columns of the whole grid are numbered from the top-left corner. Goal of the Game: generally, a Sudoku game starts with a grid in which some of the cells have already been confirmed by digits known as givens. The task for Sudoku players is to place a digit from 1 to 9 into each cell of the grid such that each digit is used exactly once in each row, each column and each block. Consequently, each of the nine rows, nine columns and nine blocks contains all the digits from 1 through 9.
  • 14. These limitations on placing digits in the three kinds of region are respectively called the row constraint, the column constraint and the block constraint. Based on the rules mentioned above, Sudoku players aim to complete the placement of digits into all empty cells as quickly as possible, using various techniques. 2.3 OPERATIONALIZATION OF VARIABLES 2.3.1 Response Variable The time required to complete a puzzle was the single response variable for this experiment. The typical level for this variable can range from a few minutes to over half an hour. The response was measured in minutes. There is no practical limit on the range over which this response can be measured. The project also sought to develop a model approximating the average total time a Sudoku hobbyist is expected to take to fill the empty cells correctly. The following formula was used to approximate the response variable:
    \[ \hat{Y}_{ijk} = \mu + \alpha_i + \beta_j + \gamma_k + \varepsilon_{ijk} \]
    Where:
    \hat{Y}_{ijk}: Observation with factor A at level i, B at level j and C at level k.
    \mu: Mean response.
    \alpha_i: Effect of factor A at level i.
    \beta_j: Effect of factor B at level j.
    \gamma_k: Effect of factor C at level k.
    \varepsilon_{ijk}: Error term.
  • 15. 2.3.2 Control Variables (i) Number of givens The first factor affecting the difficulty level is the total number of given cells in the initial Sudoku puzzle, since each given eliminates potential choices of digits in other cells through the three constraints in the game rules. In general, it is reasonable to argue that the more empty cells there are at the start of a Sudoku game, the higher the level at which the puzzle is graded. The ranges of givens for each difficulty level are scaled as shown below.
    Table 1: The amount ranges of givens in each difficult level
    Level                Givens Amount    Score
    1 (Extremely easy)   more than 50     1
    2 (Easy)             36-49            2
    3 (Medium)           32-35            3
    4 (Difficult)        28-31            4
    5 (Evil)             22-27            5
    Possibly the most obvious measurement of a puzzle’s difficulty is the number of givens provided in the initial grid. Typical values range from 50 givens for easier puzzles to 20 givens for highly difficult puzzles. For the two experiments, the control variable was varied as follows:
    Table 2: Variation of the Number of Givens.
           Easy Experiment    Medium Experiment
    Min    36                 32
    Max    49                 35
    It was expected that the difficulty of a puzzle would increase as the number of initial givens decreases. (ii) Distribution of givens Another possible variable is the distribution of the initial givens in the grid. One can imagine that a puzzle with all, or most, of the givens crowded in one section of the grid may be different in
  • 16. difficulty than a puzzle with the givens spread evenly around the grid. Formally, this geometric property can be viewed as the variance of the row, column, and box densities. The density of a row, column or box is defined to be the number of givens provided in that row, column or box. For example, row r4 in Figure 1 has a density of two, column c3 has a density of one, and box b5 has a density of three. Writing r_i, c_i and b_i for the densities of row i, column i and box i, the mean density is then
    \[ \mu_{density} = \frac{\sum_{i=1}^{9} r_i + \sum_{i=1}^{9} c_i + \sum_{i=1}^{9} b_i}{27} \]
    and the variance is
    \[ \sigma^2_{density} = \frac{\sum_{i=1}^{9} (r_i - \mu_{density})^2 + \sum_{i=1}^{9} (c_i - \mu_{density})^2 + \sum_{i=1}^{9} (b_i - \mu_{density})^2}{27} \]
    The normal range for \sigma^2_{density} in the sample set was found to be between 0 and 3.333. Hence, for these two experiments, \sigma^2_{density} was varied as follows:
    Table 3: Variation of the Distribution of Givens.
    Easy Experiment           Medium Experiment
    1.19 <= Min <= 1.70       0.59 <= Min <= 0.96
    2.15 <= Max <= 2.67       1.26 <= Max <= 2.81
    It was expected that the difficulty of a Sudoku puzzle would decrease as \sigma^2_{density} increases.
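The density statistics defined above can be sketched in a few lines of Python. This is only an illustrative sketch, not the Perl script used in the project: the function name `density_variance` and the grid representation (a 9-by-9 list of lists with 0 marking an empty cell) are assumptions made here for the example.

```python
def density_variance(grid):
    """Return (mean, variance) of the 27 row, column and box densities.

    `grid` is assumed to be a 9x9 list of lists in which 0 marks an empty
    cell and 1-9 marks a given (a hypothetical representation).
    """
    # Density of each row: number of givens in that row.
    rows = [sum(1 for v in row if v != 0) for row in grid]
    # Density of each column.
    cols = [sum(1 for r in range(9) if grid[r][c] != 0) for c in range(9)]
    # Density of each 3x3 box, scanned left to right, top to bottom.
    boxes = [
        sum(1 for r in range(br, br + 3) for c in range(bc, bc + 3)
            if grid[r][c] != 0)
        for br in (0, 3, 6) for bc in (0, 3, 6)
    ]
    densities = rows + cols + boxes            # 27 values in total
    mean = sum(densities) / 27
    # Population variance (divide by 27), matching the formula above.
    variance = sum((d - mean) ** 2 for d in densities) / 27
    return mean, variance
```

Because every given lies in exactly one row, one column and one box, the mean density always equals the number of givens divided by nine; only the variance carries information about how the givens are spread.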
  • 17. (iii) Redundant Numbers in the Initial Grid This variable measures the variance, \sigma^2_{degree}, in the number of times a specific number is repeated in the initial Sudoku grid. Here, the degree of a number, deg(i), is defined to be the number of times the number i appears in the initial grid. So the mean degree is
    \[ \mu_{degree} = \frac{\sum_{i=1}^{9} \deg(i)}{9} \]
    and the variance of the degree is
    \[ \sigma^2_{degree} = \frac{\sum_{i=1}^{9} (\deg(i) - \mu_{degree})^2}{9} \]
    The normal range for \sigma^2_{degree} in the sample set was found to be between 0.5432 and 3.7778. For these two experiments, \sigma^2_{degree} was varied as follows:
    Table 4: Variation of the Redundant Numbers.
    Easy Experiment           Medium Experiment
    0.89 <= Min <= 1.33       0.89 <= Min <= 1.56
    2.89 <= Max <= 3.56       2.00 <= Max <= 3.11
    It was not expected that \sigma^2_{degree} would be a significant factor in determining the difficulty of the Sudoku puzzle.
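The degree statistics can be sketched the same way. Again this is a minimal illustration under assumptions: the name `degree_variance` and the 9-by-9 list-of-lists grid with 0 for empty cells are hypothetical and are not taken from the project's Perl script.

```python
def degree_variance(grid):
    """Return (mean, variance) of deg(1), ..., deg(9) for the initial grid.

    `grid` is assumed to be a 9x9 list of lists in which 0 marks an empty
    cell and 1-9 marks a given (a hypothetical representation).
    """
    # deg(i): how many times digit i appears among the givens.
    degrees = [sum(row.count(digit) for row in grid) for digit in range(1, 10)]
    mean = sum(degrees) / 9
    # Population variance (divide by 9), matching the formula above.
    variance = sum((d - mean) ** 2 for d in degrees) / 9
    return mean, variance
```

A variance of zero means every digit appears equally often among the givens; larger values mean some digits are heavily repeated while others are scarce.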
  • 18. CHAPTER THREE 3.0 EXPERIMENTAL DESIGN These experiments were run using two 2^3 full factorial designs. The original intention was to perform a single 2^4 full factorial design with the additional factor being the published rating for the puzzle. Unfortunately, a strong correlation between the published rating and the typical values for the other factors was found. The correlation was so strong that Easy and Medium puzzles with matching high and low values for the other factors could not be found. After measuring the typical values for the factors on over 120 puzzles, it was decided to run two individual experiments. The first was run on a sample of puzzles with an Easy difficulty rating. The second was run on a sample of puzzles with a Medium difficulty rating.
    Table 5: The design matrix for each experiment in coded values.
    Run    Number of givens (Factor C)    Distribution of givens (Factor B)    Redundant Numbers (Factor A)
    (1)    -1                             -1                                   -1
    a      -1                             -1                                    1
    b      -1                              1                                   -1
    ab     -1                              1                                    1
    c       1                             -1                                   -1
    ac      1                             -1                                    1
    bc      1                              1                                   -1
    abc     1                              1                                    1
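The coded design matrix in Table 5 follows the standard ordering of a 2^3 full factorial design and can be generated programmatically. The sketch below is illustrative only; the function name `full_factorial_2x3` is an assumption, and the project itself set up its designs in Design Expert.

```python
from itertools import product


def full_factorial_2x3():
    """Return the eight runs of a 2^3 design as (label, C, B, A) tuples.

    Levels are coded -1 (low) and +1 (high). Factor A varies fastest and
    factor C slowest, which reproduces the row order of Table 5.
    """
    runs = []
    for c, b, a in product((-1, 1), repeat=3):
        # A run is labelled by the lowercase letters of the factors at
        # their high level; the all-low run is written "(1)".
        label = "".join(
            name for name, level in (("a", a), ("b", b), ("c", c)) if level == 1
        ) or "(1)"
        runs.append((label, c, b, a))
    return runs


if __name__ == "__main__":
    for label, c, b, a in full_factorial_2x3():
        print(f"{label:>4}  C={c:+d}  B={b:+d}  A={a:+d}")
```

Each label names the factors held at their high level, so the row "ac" has A and C at +1 and B at -1.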
  • 19. Table 6: The design matrix in statistics values for the Easy experiment.
Treatment   Number of givens   Distribution of givens   Redundant Numbers
            (Factor C)         (Factor B)               (Factor A)
(1)         36                 1.56                     0.89
a           36                 1.70                     3.11
b           36                 2.67                     0.89
ab          36                 2.15                     3.56
c           49                 1.19                     1.33
ac          49                 1.26                     2.89
bc          49                 2.15                     1.11
abc         49                 2.15                     3.33
Table 7: The design matrix in statistics values for the Medium experiment.
Treatment   Number of givens   Distribution of givens   Redundant Numbers
            (Factor C)         (Factor B)               (Factor A)
(1)         32                 0.59                     1.11
a           32                 0.59                     2.00
b           32                 1.56                     0.89
ab          32                 1.26                     2.22
c           35                 0.89                     1.33
ac          35                 0.96                     3.11
bc          35                 2.81                     1.56
abc         35                 2.81                     2.67
  • 20. 3.1 PERFORMING THE EXPERIMENT
The two experiments were replicated 16 times each. The replicates were then blocked, and each block was run by a single individual. The random run order for each block was determined by SAS software, and the result is attached in the appendix of the report.
The experiments began with the creation of a Perl script to calculate all the control variables outlined in section 3.0 above. Eight different volumes of Sudoku puzzle books from the same author were purchased over the internet (in this case, Sudoku puzzles appearing in the daily Standard and Nation newspapers were used), and the puzzles were transferred from the books to the Perl script. From the Perl script output, the high and low values for each of the variables being analyzed were set. Once the high and low values were set, eight of the Easy rating puzzles and eight of the Medium rating puzzles that fit well with the high and low values set for each variable were identified.
Once the puzzles were chosen, the experiment designs were set up in Design Expert software. Copies of the puzzles were then made and stapled into their individual blocks based on the run order provided by the software. Each block was then distributed to one of the 16 willing participants, who were asked to solve the puzzles in the order in which they were stapled. The participants were required to record the start time, finish time, and delta time for each of the puzzles. Once the completed puzzles were returned, the delta times (the response variable) were added to the run order table in the SAS software and the data were analyzed.
3.2 STATISTICAL ANALYSIS
The data from the two experiments were analyzed individually in two subsections. SAS software and Design Expert were both used to aid the analysis of the data.
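The blocking and randomization step described above can be illustrated in Python. This is a stand-in for the SAS procedure used in the study, not a reproduction of it: the seed, the function name and the use of Python's `random` module are assumptions, so the run orders it produces will not match those in the appendix.

```python
import random

TREATMENTS = ["(1)", "a", "b", "ab", "c", "ac", "bc", "abc"]


def randomized_blocks(n_blocks=16, seed=2013):
    """Give each block (participant) all eight treatments in a random order."""
    rng = random.Random(seed)  # fixed seed so the plan can be regenerated
    plan = {}
    for block in range(1, n_blocks + 1):
        order = TREATMENTS[:]
        rng.shuffle(order)
        plan[block] = order
    return plan


plan = randomized_blocks()
total_runs = sum(len(order) for order in plan.values())
```

Each of the 16 blocks contains every treatment exactly once, giving the 128 planned runs per experiment.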
  • 21. CHAPTER FOUR
4.0 RESULTS AND ANALYSIS OF DATA
4.1 INTRODUCTION
In this chapter, the experiment results are presented as given by the SAS software and Design Expert analysis tools, and are then explained. The data used in the analysis of the two experiments (easy and medium) are found in appendices 3.0 and 4.0 respectively. The analysis is divided into two parts: the first covers the easy experiment and the second covers the medium experiment.
Out of the 16 individuals tasked with solving the Sudoku puzzles with an easy difficulty rating, only 13 returned the puzzles in time to be analyzed in this report. This represents a return rate of 81%. Fortunately, since the design was blocked on replicates, this had little to no effect on the analysis of the results in this report. For the Sudoku puzzles with a medium difficulty rating, only 11 out of 16 individuals returned their puzzles in time for analysis. This translates to a return rate of 69%.
4.2 SUMMARY OF THE EXPERIMENTS
DESIGN DETAILS
Design type: Two-level
Design description: Repeated
Number of factors: 3
Number of runs: 128
Resolution: Full
Number of blocks: 16
FACTORS
Factors and Levels:
__________________________________________________
Factor   Label                     Low   Center   High
__________________________________________________
C        Number of Givens          -1    0        1
B        Distribution of Givens    -1    0        1
A        Redundancy of Givens      -1    0        1
__________________________________________________
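The return rates quoted in the introduction above, and the number of observations they leave for analysis, follow from simple arithmetic. A small sketch (the names are hypothetical):

```python
RUNS_PER_BLOCK = 8    # one run per treatment of the 2^3 design
BLOCKS_PLANNED = 16   # one block per participant


def summarize(returned_blocks):
    """Return (return rate in whole percent, number of usable observations)."""
    rate = round(100 * returned_blocks / BLOCKS_PLANNED)
    observations = returned_blocks * RUNS_PER_BLOCK
    return rate, observations


easy = summarize(13)
medium = summarize(11)
```

The resulting 104 and 88 observations agree with the total degrees of freedom (103 and 87) reported in the two ANOVA tables.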
  • 22. RESPONSE
_________________________
Response   Label   Units
_________________________
Y1         Time    seconds
_________________________
BLOCK INFORMATION
Block name: BLOCK
Block label: INDIVIDUAL
Number of blocks: 16
4.3 ANALYSIS OF SUDOKU PUZZLES WITH EASY DIFFICULTY RATING
4.3.1 Factors Effects
The analysis began with the effects of the various factors as presented by the SAS software. To maintain a hierarchical model, the model was screened by omitting only the three-factor interaction ABC. This interaction was therefore omitted throughout the analysis.
Main Factors and Interaction Plots
Figure 3: Main Effects plot
The main effects plot above shows that, for factor C, the mean of all response values for which C = -1 is 600, while that for which C = +1 is 500. The interpretation is similar for factor A, while for factor B the mean response values for B = -1 and B = +1 were the same.
  • 23. Figure 4: Interaction Plots
From the interaction plots above, there is an apparent interaction between factors B and C, and also between B and A, because the lines cross. The plots alone, however, do not establish whether one or both of these interactions are significant, even though they strongly suggest an effect. To determine which factors or interactions have a significant effect, other tools such as the effects list and the half-normal plot were used.
Table 8: Effects List
From Table 8 above, factors C and A and the interaction BA were significant, while factor B and the interactions CB and CA were not significant, since they had p-values greater than 0.05. Factors C and A exhibit negative effects while the others contribute positive effects.
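To make the notion of an effect concrete, the sketch below computes main and two-factor interaction effects as the mean response at the +1 level minus the mean response at the -1 level. It uses only the eight runs of block 6 from the Easy experiment data in Appendix 3.0, so the numbers illustrate the calculation and are not the pooled effects reported in Table 8. The function name is hypothetical.

```python
# (C, B, A, time in seconds) for block 6 of the Easy experiment (Appendix 3.0)
BLOCK6 = [
    (1, -1, -1, 300), (1, -1, 1, 180), (1, 1, 1, 480), (1, 1, -1, 120),
    (-1, 1, 1, 240), (-1, 1, -1, 300), (-1, -1, -1, 540), (-1, -1, 1, 420),
]


def effect(rows, sign):
    """Mean response where sign(c, b, a) is +1 minus mean where it is -1."""
    plus = [y for *x, y in rows if sign(*x) == 1]
    minus = [y for *x, y in rows if sign(*x) == -1]
    return sum(plus) / len(plus) - sum(minus) / len(minus)


effects = {
    "C": effect(BLOCK6, lambda c, b, a: c),
    "B": effect(BLOCK6, lambda c, b, a: b),
    "A": effect(BLOCK6, lambda c, b, a: a),
    "CB": effect(BLOCK6, lambda c, b, a: c * b),
    "CA": effect(BLOCK6, lambda c, b, a: c * a),
    "BA": effect(BLOCK6, lambda c, b, a: b * a),
}
```

For this single participant the effect of C is negative (more givens, shorter time) and the BA interaction is positive, which matches the signs reported for the pooled analysis; the other terms need not agree with the pooled result because one block is a very small sample.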
  • 24. Figure 9: Half-Normal Plot
The half-normal plot was then used to analyze the same information with a more visual tool. The plot identified variable C as the most significant factor. The other factors seem to fall on or near the insignificance line, as shown in Figure 9 above.
Figure 10: Pareto Plot
To give the percentage effect contribution of each factor, a Pareto plot was used, as shown in Figure 10 above. Once again, factor C has the highest contribution while factor B has the least.
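One common way to express percentage contributions is each term's sum of squares as a share of the summed factor sums of squares. The sketch below applies that to the sums of squares reported in Table 9 (the ANOVA for the Easy experiment). Whether the Pareto plot in Figure 10 used exactly this normalization is an assumption.

```python
# Sums of squares for the factor terms, taken from Table 9.
SS = {
    "C": 457788.5, "B": 2803.846, "A": 381634.6,
    "C*B": 70096.15, "C*A": 146250.0, "B*A": 174496.2,
}

total = sum(SS.values())
contribution = {term: 100 * ss / total for term, ss in SS.items()}
ranked = sorted(contribution, key=contribution.get, reverse=True)

for term in ranked:
    print(f"{term:>4} {contribution[term]:6.2f}%")
```

Under this normalization factor C ranks first and factor B last, in line with the description of the Pareto plot.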
  • 25. 4.3.2 The Analysis of Variance (ANOVA)
Table 9: ANOVA FOR EASY EXPERIMENT
ANOVA for Y1
Master Model
Source    DF    SS          MS          F           Pr > F
C          1    457788.5    457788.5    11.18521    0.001228
B          1    2803.846    2803.846    0.068507    0.794157
A          1    381634.6    381634.6    9.324534    0.003019
C*B        1    70096.15    70096.15    1.71267     0.194167
C*A        1    146250      146250      3.573348    0.062122
B*A        1    174496.2    174496.2    4.26349     0.041987
Model     18    8966008     498111.5    12.17043    0.0001
Error     85    3478881     40928.01
Total    103    12444888
When the ANOVA was run using the full model and the p-values were used to identify the variables meeting the α = 0.05 requirement, variables C and A and the interaction BA were found to be significant. The table above shows the details.
Figure 11: Residual Plots (Normal Plot of Residuals; Residual vs Predicted)
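The F ratios and degrees of freedom in Table 9 can be checked directly from the sums of squares. The sketch below assumes that the 18 model degrees of freedom consist of 12 for the 13 returned blocks plus one for each of the six factor terms; the variable names are hypothetical.

```python
N_OBS = 104          # 13 returned blocks x 8 runs
N_BLOCKS = 13
N_TERMS = 6          # C, B, A, C*B, C*A, B*A (ABC omitted)

df_model = (N_BLOCKS - 1) + N_TERMS
df_total = N_OBS - 1
df_error = df_total - df_model

SS_ERROR = 3478881.0
ms_error = SS_ERROR / df_error

# Each factor term has 1 degree of freedom, so MS equals SS.
ss_terms = {"C": 457788.5, "A": 381634.6, "B*A": 174496.2, "B": 2803.846}
f_ratio = {term: ss / ms_error for term, ss in ss_terms.items()}
```

The p-values in the table then come from the F distribution with 1 and 85 degrees of freedom, which this sketch does not evaluate.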
  • 26. The next step was to analyze the model adequacy via the residual plots shown in the figure above. The plots indicated that the normality and equal variance assumptions were adequately met. The model produced the best results, with the normal probability plot passing the “fat pencil” test and the residual vs. predicted plot showing no patterns. Since the model appeared adequate, there was no need to transform the data to find a more accurate model.
In summary, when considering Easy level Sudoku puzzles, the following factors are significant.
- C: Number of Givens (negative)
- A: Repetition of Numbers (negative)
- BA: interaction effect (positive)
Based on the percentage contribution shown by the Pareto plot, it is clear that the Number of Givens is by far the most significant factor.
4.3.3 Model Prediction
Predictive Model for Y1
Uncoded Levels:
Y1 = 300 + 22.5*(BLOCK='10') + 465*(BLOCK='11') + 750*(BLOCK='12') + 637.5*(BLOCK='13')
     + 472.5*(BLOCK='16') - 45*(BLOCK='2') + 90*(BLOCK='3') - 97.5*(BLOCK='4')
     + 90*(BLOCK='5') + 22.5*(BLOCK='6') + 465*(BLOCK='7') + 270*(BLOCK='8')
     - 66.34615*C - 60.57692*A
Table 12: Fit Statistics for Y1
____________________________________________
                 Master Model   Predictive Model
____________________________________________
Mean             541.7308       541.7308
R-square         72.05%         68.88%
Adj. R-square    66.13%         63.99%
RMSE             202.3067       208.5942
CV               37.34451       38.50514
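The predictive model above can be written as a small function. In this sketch, block 9 is treated as the baseline because it has no indicator term in the printed equation; that reading is an assumption, although it is consistent with the eight block 9 times in Appendix 3.0 averaging 300 seconds. The sketch also checks the standard relation for a balanced two-level design with ±1 coding, in which a term's sum of squares equals N times its squared coefficient. The function and variable names are hypothetical.

```python
from math import sqrt

# Block offsets from the printed equation; block 9 is the assumed baseline.
BLOCK_OFFSETS = {
    "9": 0.0, "10": 22.5, "11": 465.0, "12": 750.0, "13": 637.5,
    "16": 472.5, "2": -45.0, "3": 90.0, "4": -97.5, "5": 90.0,
    "6": 22.5, "7": 465.0, "8": 270.0,
}
COEF_C, COEF_A = -66.34615, -60.57692


def predict_easy(block, c, a):
    """Predicted time in seconds for coded levels c and a in {-1, +1}."""
    return 300 + BLOCK_OFFSETS[block] + COEF_C * c + COEF_A * a


# Coefficient magnitudes recovered from the sums of squares in Table 9.
N_OBS = 104
coef_c_from_ss = sqrt(457788.5 / N_OBS)
coef_a_from_ss = sqrt(381634.6 / N_OBS)
```

For example, `predict_easy("12", -1, -1)` gives the predicted time for the participant in block 12 on a puzzle with few givens and low redundancy, which is far above the 300 second baseline because of the large block offset.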
  • 27. Response(s):
___________________________________
Response   Est. Value
___________________________________
Y1         300 [147.6909, 452.3091]
___________________________________
From the SAS output above, the regression equation estimates that a person solving an easy rating Sudoku puzzle would take approximately 300 seconds to complete it accurately, with a confidence interval of [147.6909, 452.3091] seconds at the α = 0.05 level of significance.
4.4 ANALYSIS OF SUDOKU PUZZLES WITH MEDIUM DIFFICULTY RATING
4.4.1 Factors Effects
The analysis began with the effects of the various factors as presented by the SAS software. To maintain a hierarchical model, the model was screened by omitting only the three-factor interaction ABC. This interaction was therefore omitted throughout the analysis.
Main Factors and Interaction Plots
Figure 11: Main Effect Plots
The main effects plot above shows that, for factor C, the mean response values for C = -1 and C = +1 were the same. The interpretation is similar for factor A, while for factor B the mean of all response values for which B = -1 is 625, while that for which B = +1 is 675.
  • 28. Figure 12: Interaction Plots
From the plots, there is a possible interaction between factors C and B, between C and A, and between B and A, because the lines cross. The plots alone, however, do not establish whether any of these interactions are significant, even though they strongly suggest an effect. To determine which factors or interactions have a significant effect, other tools such as the effects list and the half-normal plot were used.
Table 13: Effects List
From the effects list above, all the main factors and their interactions appear to be insignificant, since they all have p-values greater than 0.05 at the α = 0.05 level of significance.
  • 29. To confirm the results of the effects list, a half-normal plot was used as indicated below.
Figure 13: Half-Normal Plot
Using a more visual tool (the half-normal plot), it was further confirmed that no factor is significant, since some fall on the Lenth’s PSE line while the rest lie between the Lenth’s PSE line and the RMSE line.
Figure 14: Pareto Plot
To give the percentage effect contribution of each factor, a Pareto plot was used, as shown in Figure 14 above. Interaction B*A has the highest contribution while factor C has the least, even though none of these factors is significant.
  • 30. 4.4.2 The Analysis of Variance (ANOVA)
Table 14: ANOVA FOR MEDIUM EXPERIMENT
ANOVA for Y1
Master Model
Source    DF    SS          MS          F           Pr > F
C          1    63.92045    63.92045    0.001401    0.970243
B          1    38097.28    38097.28    0.835226    0.363859
A          1    3108.284    3108.284    0.068144    0.794814
C*B        1    2967.284    2967.284    0.065053    0.799418
C*A        1    13975.92    13975.92    0.306401    0.581636
B*A        1    122478.3    122478.3    2.685152    0.105712
Model     16    3959457     247466.1    5.425321    0.0001
Error     71    3238535     45613.17
Total     87    7197992
When the ANOVA was run using the full model and the p-values were used to identify the variables meeting the α = 0.05 requirement, none of the variables had a p-value less than 0.05, confirming that no factor was significant. The table above shows the details.
Figure 15: Residual Plots
The next step was to analyze the model adequacy via the residual plots shown in the figure above. The plots indicated that the normality and equal variance assumptions were adequately met.
  • 31. The model produced the best results, with the normal probability plot passing the “fat pencil” test and the residual vs. predicted plot showing no patterns. Since the model appeared adequate, there was no need to transform the data to find a more accurate model.
In summary, none of the hypothesized factors (Number of givens, Distribution of the givens and Redundancy of the givens) or their interactions was found to have a significant influence on the difficulty of medium rating Sudoku puzzles.
4.4.3 Model Prediction
Table 15: Fit Statistics for Y1
____________________________________________
                 Master Model   Predictive Model
____________________________________________
Mean             644.0795       644.0795
R-square         55.01%         55.01%
Adj. R-square    44.87%         44.87%
RMSE             213.5724       213.5724
CV               33.15932       33.15932
____________________________________________
Response(s):
______________________________________
Response   Est. Value
______________________________________
Y1         464.75 [314.1888, 615.3112]
______________________________________
From the SAS output above, the estimation table shows that a person solving a medium rating Sudoku puzzle would take approximately 464.75 seconds to complete it accurately, with a confidence interval of [314.1888, 615.3112] seconds at the α = 0.05 level of significance.
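The fit statistics of the master models can be recomputed from the ANOVA tables using the usual definitions of R-square, adjusted R-square, RMSE and the coefficient of variation. The sketch below is a consistency check on Tables 9, 12, 14 and 15; the function name is hypothetical, and the definitions used are the textbook ones rather than anything confirmed from the SAS output itself.

```python
from math import sqrt


def fit_statistics(ss_model, ss_error, df_model, df_error, mean):
    """Return (R-square %, adjusted R-square %, RMSE, CV) from ANOVA totals."""
    ss_total = ss_model + ss_error
    df_total = df_model + df_error
    r2 = ss_model / ss_total
    adj_r2 = 1 - (ss_error / df_error) / (ss_total / df_total)
    rmse = sqrt(ss_error / df_error)
    cv = 100 * rmse / mean
    return 100 * r2, 100 * adj_r2, rmse, cv


easy = fit_statistics(8966008, 3478881, 18, 85, 541.7308)
medium = fit_statistics(3959457, 3238535, 16, 71, 644.0795)
```

Up to rounding, the values agree with the Master Model columns of Tables 12 and 15.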
  • 32. CHAPTER FIVE
5.0 CONCLUSION AND RECOMMENDATIONS
5.1 CONCLUSION
Based on the two experiments discussed in this study, it is clear that the number of givens (factor C) had the greatest effect on the difficulty of an Easy rating Sudoku puzzle. This factor seemed to have only a minimal effect on the Medium rating Sudoku puzzles and failed to reach the significance level, as shown by its p-value in ANOVA Table 14.
In addition, the redundancy (repetition) of the givens (factor A) had a significant effect on the difficulty of an easy rating Sudoku puzzle, but this factor also failed to reach the significance level in the medium rating Sudoku puzzles. Furthermore, the interaction BA exhibited a significant influence in the easy rating experiment.
In the medium experiment, none of the hypothesized factors (Number of givens, Distribution of the givens and Redundancy of the givens) or their interactions was found to have a significant influence on the difficulty of medium rating Sudoku puzzles.
The Adjusted R² values reported for the Easy and Medium experiments were both unusually small: 0.6613 (66.13%) and 0.4487 (44.87%) respectively. Normally, this may be a cause for concern. In these particular experiments, the ANOVA for both experiments shows that the block effects were largely significant. When the block effects are that significant, it becomes difficult to pick a single model that accurately describes the general behavior of the response.
Since the Adjusted R² values for these experiments are small, the regression model is not expected to accurately predict the completion time for any puzzle. Nevertheless, according to these experiments the roughly projected completion times are 300 seconds and 464.75 seconds for the easy and medium experiments respectively.
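The size of the block effects mentioned in the conclusion can be gauged from the ANOVA tables. The sketch below assumes that the Model sum of squares is the sum of the block sum of squares and the six factor term sums of squares, which is consistent with the model degrees of freedom (18 = 12 + 6 and 16 = 10 + 6) but is not stated explicitly in the SAS output. The function name is hypothetical.

```python
def block_share(ss_model, ss_total, factor_ss):
    """Return (block sum of squares, its share of the total in percent)."""
    ss_blocks = ss_model - sum(factor_ss)
    return ss_blocks, 100 * ss_blocks / ss_total


# Sums of squares from Table 9 (Easy) and Table 14 (Medium).
easy = block_share(
    8966008, 12444888,
    [457788.5, 2803.846, 381634.6, 70096.15, 146250, 174496.2],
)
medium = block_share(
    3959457, 7197992,
    [63.92045, 38097.28, 3108.284, 2967.284, 13975.92, 122478.3],
)
```

Under this assumption, differences between participants account for roughly 62% of the total variation in the Easy experiment and roughly 52% in the Medium experiment, far more than the puzzle factors themselves.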
  • 33. 5.2 RECOMMENDATIONS
After the successful completion of the experimentation, the investigator makes the following recommendations:
1. Due to the inaccurate prediction of the completion time, further experimentation is required to accurately model the completion time for a specific person and a specific Sudoku puzzle.
2. During similar studies, participants (blocks) should be properly trained first to reduce the significance of their effects.
3. Since the investigator found no significant effect of the hypothesized factors in the medium experiment, different factors and/or a more careful analysis of these factors should be considered in future studies.
  • 34. APPENDICES
1.0 Data Collection Tool for Easy Rating Sudoku Puzzles
Kindly complete the Sudoku puzzles below and provide the start time, end time and the delta time (i.e. the difference between the start and end times). Provide the delta time in seconds.
1. 9 7 5 3 1 2 4 8 3 1 8 6 5 6 2 2 3 4 9 1 5 3 6 7 6 4 8 9
2. 9 2 7 1 5 1 4 2 9 8 6 7 1 5 6 8 2 3 4 7 5 1 4 9 2 8 2 4 9 2 4 9 5 3 8 1 5 9 3 8 6 9 1 7 3 4 7 8 2 1 5 4 3 8 1 6 7
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 35. 3. 8 2 6 5 3 5 9 2 4 1 7 4 9 5 9 5 2 1 3 4 8 7 9 3 8 9 6 1 6 3 9 1 7 5 1 8 4 6 7 5 8 2
4. 6 3 4 9 1 5 6 2 8 8 1 7 6 4 1 2 6 7 9 7 5 4 3 8 7 4 3 1 9 1 6 5 4 7 2 8 3 5
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 36. 5. 8 3 9 1 2 7 4 2 8 6 7 9 5 9 8 2 8 6 7 4 3 2 3 9 8 4 4 1 3 5 7 7 8 9 5 1 9 7 8 6
6. 9 2 5 7 8 6 5 7 4 1 3 2 8 3 2 9 6 2 1 5 9 4 6 4 1 6 3 2 6 8 2 1 3 3 7 9 5 4 2 4 1 6 7 4 7 5 6 1 9
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 37. 7. 4 9 7 6 1 7 4 8 5 8 9 5 4 7 4 9 7 8 2 5 2 5 3 4 8 6 8 6 1 4 7 5 6 4 2 3 4 6 5 7 9 2 1 3 7 5 4
8. 4 7 5 3 1 7 6 1 9 1 2 8 3 1 5 7 4 3 9 8 1 5 2 1 6 7 8 4 3 3 7 2 1 5 7 9 6 1 2
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 38. 2.0 Data Collection Tool for Medium Rating Sudoku Puzzles
Kindly complete the Sudoku puzzles below and provide the start time, end time and the delta time (i.e. the difference between the start and end times). Provide the delta time in seconds.
1. 5 3 9 6 5 7 2 4 7 2 6 5 1 7 9 2 7 1 1 9 7 3 6 1 5 3 4 1 8 8 5 3
2. 5 2 9 4 1 7 7 4 6 1 9 7 8 4 6 3 5 4 1 9 6 1 3 7 4 2 7 4 5 5 3 1 3 5 8
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 39. 3. 1 6 2 5 9 8 1 6 2 7 6 1 6 1 2 9 3 5 4 1 4 1 3 5 8 1 7 9 1 4 3 9 8 6 1
4. 2 9 5 7 8 1 6 6 8 4 1 1 9 3 7 5 7 8 8 2 3 9 4 7 5 1 2 6 4 5 3 3 5 7 8
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 40. 5. 5 6 3 1 6 5 4 7 8 3 9 2 5 8 3 5 3 6 7 2 9 4 5 6 1 2 6 3 2 7 9 7 2 9 3
6. 3 5 8 2 4 9 2 7 8 3 8 2 6 1 2 7 9 5 4 5 6 9 1 7 6 2 8 1 3 9 4 5
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 41. 7. 3 1 7 8 6 4 4 1 5 4 8 9 5 3 9 4 7 6 3 5 8 6 5 9 5 8 6 8 1 5 2 4 9
8. 9 8 4 1 1 7 9 3 6 8 6 2 1 8 4 9 8 2 8 7 6 5 7 1 4 5 8 8 5 8 7 2
Start time:…………………….. End time:………………………. Delta time:…………………….
Start time:…………………….. End time:………………………. Delta time:…………………….
  • 42. 3.0 Experiment data for Easy Sudoku puzzles
DESIGN POINTS (Uncoded)
_________________________________________________
 RUN BLOCK    C    B    A     Y1
_________________________________________________
  61     8   -1   -1    1    360
  57     8   -1   -1   -1    600
  63     8   -1    1    1    600
  58     8    1   -1   -1    480
  59     8   -1    1   -1    540
  62     8    1   -1    1    600
  60     8    1    1   -1    660
  64     8    1    1    1    720
  42     6    1   -1   -1    300
  46     6    1   -1    1    180
  48     6    1    1    1    480
  44     6    1    1   -1    120
  47     6   -1    1    1    240
  43     6   -1    1   -1    300
  41     6   -1   -1   -1    540
  45     6   -1   -1    1    420
  75    10   -1    1   -1    360
  77    10   -1   -1    1    360
  73    10   -1   -1   -1    540
  76    10    1    1   -1    240
  79    10   -1    1    1    300
  74    10    1   -1   -1    300
  80    10    1    1    1    180
  78    10    1   -1    1    300
   3     1   -1    1   -1      .
   5     1   -1   -1    1      .
   4     1    1    1   -1      .
   7     1   -1    1    1      .
   1     1   -1   -1   -1      .
   6     1    1   -1    1      .
   8     1    1    1    1      .
   2     1    1   -1   -1      .
  37     5   -1   -1    1    420
  40     5    1    1    1    300
  36     5    1    1   -1    300
  35     5   -1    1   -1    300
  33     5   -1   -1   -1    780
  34     5    1   -1   -1    360
  38     5    1   -1    1    240
  39     5   -1    1    1    420
  10     2    1   -1   -1    300
  11     2   -1    1   -1    120
  14     2    1   -1    1    180
  15     2   -1    1    1    480
   9     2   -1   -1   -1    360
  16     2    1    1    1    300
  • 43.
  12     2    1    1   -1    120
  13     2   -1   -1    1    180
 104    13    1    1    1    960
 100    13    1    1   -1    840
  99    13   -1    1   -1   1200
 101    13   -1   -1    1    780
  98    13    1   -1   -1    600
 103    13   -1    1    1   1140
  97    13   -1   -1   -1   1200
 102    13    1   -1    1    780
  84    11    1    1   -1    960
  83    11   -1    1   -1    840
  86    11    1   -1    1    420
  81    11   -1   -1   -1    660
  88    11    1    1    1    960
  85    11   -1   -1    1    600
  82    11    1   -1   -1    780
  87    11   -1    1    1    900
  92    12    1    1   -1    780
  90    12    1   -1   -1   1140
  91    12   -1    1   -1    900
  94    12    1   -1    1    660
  93    12   -1   -1    1   1320
  95    12   -1    1    1    540
  89    12   -1   -1   -1   2100
  96    12    1    1    1    960
 125    16   -1   -1    1    660
 121    16   -1   -1   -1   1020
 124    16    1    1   -1    720
 128    16    1    1    1    660
 127    16   -1    1    1    780
 126    16    1   -1    1    600
 122    16    1   -1   -1    960
 123    16   -1    1   -1    780
  22     3    1   -1    1    240
  19     3   -1    1   -1    900
  17     3   -1   -1   -1    600
  20     3    1    1   -1    300
  23     3   -1    1    1    480
  21     3   -1   -1    1    300
  24     3    1    1    1    120
  18     3    1   -1   -1    180
  67     9   -1    1   -1    240
  71     9   -1    1    1    240
  68     9    1    1   -1    240
  70     9    1   -1    1    240
  69     9   -1   -1    1    240
  66     9    1   -1   -1    240
  72     9    1    1    1    240
  65     9   -1   -1   -1    720
  54     7    1   -1    1    420
  49     7   -1   -1   -1    720
  52     7    1    1   -1    720
  55     7   -1    1    1    420
  51     7   -1    1   -1   1560
  • 44.
  50     7    1   -1   -1    960
  56     7    1    1    1    600
  53     7   -1   -1    1    720
  29     4   -1   -1    1    180
  28     4    1    1   -1    180
  32     4    1    1    1    240
  30     4    1   -1    1    180
  26     4    1   -1   -1    180
  27     4   -1    1   -1    240
  25     4   -1   -1   -1    240
  31     4   -1    1    1    180
 111    14   -1    1    1      .
 107    14   -1    1   -1      .
 110    14    1   -1    1      .
 105    14   -1   -1   -1      .
 112    14    1    1    1      .
 109    14   -1   -1    1      .
 108    14    1    1   -1      .
 106    14    1   -1   -1      .
 113    15   -1   -1   -1      .
 118    15    1   -1    1      .
 120    15    1    1    1      .
 114    15    1   -1   -1      .
 115    15   -1    1   -1      .
 116    15    1    1   -1      .
 119    15   -1    1    1      .
 117    15   -1   -1    1      .
_________________________________________________
  • 45. 4.0 Experiment data for Medium Sudoku puzzles
DESIGN POINTS (Uncoded)
_________________________________________________
 RUN BLOCK    C    B    A      Y
_________________________________________________
  61     8   -1   -1    1    962
  57     8   -1   -1   -1    802
  63     8   -1    1    1    637
  58     8    1   -1   -1    651
  59     8   -1    1   -1    991
  62     8    1   -1    1    582
  60     8    1    1   -1    583
  64     8    1    1    1   1040
  42     6    1   -1   -1    480
  46     6    1   -1    1    660
  48     6    1    1    1    582
  44     6    1    1   -1   1045
  47     6   -1    1    1    370
  43     6   -1    1   -1    750
  41     6   -1   -1   -1    490
  45     6   -1   -1    1    495
  75    10   -1    1   -1   1191
  77    10   -1   -1    1   1161
  73    10   -1   -1   -1    469
  76    10    1    1   -1   1333
  79    10   -1    1    1    928
  74    10    1   -1   -1    984
  80    10    1    1    1    715
  78    10    1   -1    1   1269
   3     1   -1    1   -1    277
   5     1   -1   -1    1    373
   4     1    1    1   -1    449
   7     1   -1    1    1    544
   1     1   -1   -1   -1    540
   6     1    1   -1    1    500
   8     1    1    1    1    608
   2     1    1   -1   -1    427
  37     5   -1   -1    1    600
  40     5    1    1    1    540
  36     5    1    1   -1    540
  35     5   -1    1   -1    540
  33     5   -1   -1   -1    840
  34     5    1   -1   -1    600
  38     5    1   -1    1    720
  39     5   -1    1    1    900
  10     2    1   -1   -1      .
  11     2   -1    1   -1      .
  14     2    1   -1    1      .
  15     2   -1    1    1      .
   9     2   -1   -1   -1      .
  16     2    1    1    1      .
  • 46.
  12     2    1    1   -1      .
  13     2   -1   -1    1      .
 104    13    1    1    1    358
 100    13    1    1   -1    267
  99    13   -1    1   -1    269
 101    13   -1   -1    1    297
  98    13    1   -1   -1    420
 103    13   -1    1    1    337
  97    13   -1   -1   -1    361
 102    13    1   -1    1    540
  84    11    1    1   -1      .
  83    11   -1    1   -1      .
  86    11    1   -1    1      .
  81    11   -1   -1   -1      .
  88    11    1    1    1      .
  85    11   -1   -1    1      .
  82    11    1   -1   -1      .
  87    11   -1    1    1      .
  92    12    1    1   -1    420
  90    12    1   -1   -1    540
  91    12   -1    1   -1   1500
  94    12    1   -1    1    600
  93    12   -1   -1    1    780
  95    12   -1    1    1    780
  89    12   -1   -1   -1    840
  96    12    1    1    1    780
 125    16   -1   -1    1      .
 121    16   -1   -1   -1      .
 124    16    1    1   -1      .
 128    16    1    1    1      .
 127    16   -1    1    1      .
 126    16    1   -1    1      .
 122    16    1   -1   -1      .
 123    16   -1    1   -1      .
  22     3    1   -1    1    285
  19     3   -1    1   -1    180
  17     3   -1   -1   -1    435
  20     3    1    1   -1    270
  23     3   -1    1    1    225
  21     3   -1   -1    1    270
  24     3    1    1    1    570
  18     3    1   -1   -1    315
  67     9   -1    1   -1    590
  71     9   -1    1    1    382
  68     9    1    1   -1    874
  70     9    1   -1    1    366
  69     9   -1   -1    1    682
  66     9    1   -1   -1    304
  72     9    1    1    1    785
  65     9   -1   -1   -1    404
  54     7    1   -1    1      .
  49     7   -1   -1   -1      .
  52     7    1    1   -1      .
  55     7   -1    1    1      .
  51     7   -1    1   -1      .
  • 47.
  50     7    1   -1   -1      .
  56     7    1    1    1      .
  53     7   -1   -1    1      .
  29     4   -1   -1    1    800
  28     4    1    1   -1    720
  32     4    1    1    1    360
  30     4    1   -1    1    480
  26     4    1   -1   -1    960
  27     4   -1    1   -1    420
  25     4   -1   -1   -1    780
  31     4   -1    1    1    660
 111    14   -1    1    1      .
 107    14   -1    1   -1      .
 110    14    1   -1    1      .
 105    14   -1   -1   -1      .
 112    14    1    1    1      .
 109    14   -1   -1    1      .
 108    14    1    1   -1      .
 106    14    1   -1   -1      .
 113    15   -1   -1   -1    735
 118    15    1   -1    1   1275
 120    15    1    1    1    720
 114    15    1   -1   -1    645
 115    15   -1    1   -1   1155
 116    15    1    1   -1   1215
 119    15   -1    1    1    855
 117    15   -1   -1    1    705
_________________________________________________
  • 48. REFERENCES
1. Wei-Meng Lee, Programming Sudoku (Technology in Action), Apress, 2006.
2. www.wikipedia.com/sudokugames.
3. Jonathan Lutz, et al. Design of Engineering Experiments: Arizona State University. 2000. Pg 5 – 10.
4. Raj Jain, et al. Two factors; full factorial design without replication. Washington University, USA. Pg 21 – 7.