This is a short presentation on my project, diabetes prediction using the R language. The method used is kNN (k-nearest neighbours), a basic machine learning algorithm.
3. Introduction :
• The project aims to build a model using machine
learning techniques and to automate the diabetes
prediction process.
• The algorithm is applied to a standard dataset
and then trained; the output given depends on the
training and on the accuracy of the algorithm.
4. Dataset :
• The dataset used is
“PimaIndiansDiabetes.csv”.
• There are 9 attributes in total in the dataset.
• Some of them are glucose, BMI, age, insulin, and the
target variable, which is either positive (+ve) or negative (-ve).
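
A minimal sketch of loading the file in R. It assumes the CSV sits in the working directory, has a header row, and stores the target in its last column; check this against your own copy of the file. The helper name load_diabetes is hypothetical and is not part of the original project.

```r
# Hypothetical helper: read the CSV and turn the target column into a factor.
# Assumption: the target (+ve / -ve) is the LAST column of the file.
load_diabetes <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  target <- names(df)[ncol(df)]
  df[[target]] <- factor(df[[target]])
  df
}

path <- "PimaIndiansDiabetes.csv"
if (file.exists(path)) {
  diabetes <- load_diabetes(path)
  str(diabetes)                                  # column names and types
  print(table(diabetes[[ncol(diabetes)]]))       # class balance of the target
}
```

Inspecting the class balance first is useful because accuracy alone can look good on an imbalanced target.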
5. Method :
• The method used is kNN (k-nearest neighbours).
• The accuracy of this method was found to be
78%-82%.
• Prime importance is given to the insulin and
glucose factors.
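
The method can be sketched with the knn() function from the class package. This is an illustration under assumptions, not the project's actual code: the helper knn_accuracy, the 80/20 split, the default k = 5 and the seed are all hypothetical choices. Features are standardised because kNN relies on distances, so a large-range attribute such as insulin would otherwise dominate.

```r
library(class)  # provides knn()

# Hypothetical helper: hold-out accuracy of kNN on a data frame.
# 'features' lets you restrict the model, e.g. to glucose and insulin.
# Assumes all feature columns are numeric and none is constant.
knn_accuracy <- function(df, target, k = 5, train_frac = 0.8, seed = 1,
                         features = setdiff(names(df), target)) {
  set.seed(seed)
  idx <- sample(nrow(df), size = floor(train_frac * nrow(df)))
  x <- df[, features, drop = FALSE]
  y <- factor(df[[target]])

  # Standardise with the training mean/sd, then apply the same to the test rows
  train_x <- scale(x[idx, , drop = FALSE])
  test_x <- scale(x[-idx, , drop = FALSE],
                  center = attr(train_x, "scaled:center"),
                  scale = attr(train_x, "scaled:scale"))

  pred <- knn(train = train_x, test = test_x, cl = y[idx], k = k)
  mean(pred == y[-idx])  # fraction of test rows classified correctly
}
```

Given a loaded data frame and the name of its target column, knn_accuracy(df, target) returns a single number between 0 and 1; the exact value depends on the split and on k.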
6. Values of K :
• As a rule of thumb, the value of K for large datasets is ideally
between 3 and 10.
• For small datasets, the best way to find the value of K is
trial and error.
• This can be done automatically using the caret package, which chooses the
value of K that minimizes the cross-validation error.
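
Both approaches to choosing K can be sketched as follows. The helper names choose_k and tune_with_caret are hypothetical, as are the candidate range 3:10 and the 10 folds. The first helper does trial and error with leave-one-out cross-validation via knn.cv() from the class package; the second hands the same search to caret's train() and needs caret to be installed.

```r
library(class)  # provides knn.cv()

# Trial and error: leave-one-out CV error for each candidate k.
# Simplification: features are standardised once on the full data.
choose_k <- function(x, y, ks = 3:10) {
  x <- scale(x)
  err <- sapply(ks, function(k) mean(knn.cv(train = x, cl = y, k = k) != y))
  names(err) <- ks
  list(best_k = ks[which.min(err)], errors = err)
}

# Automatic search with caret (assumes the caret package is installed).
# x: data frame of numeric features, y: factor target.
tune_with_caret <- function(x, y, ks = 3:10, folds = 10) {
  stopifnot(requireNamespace("caret", quietly = TRUE))
  fit <- caret::train(x = x, y = y, method = "knn",
                      preProcess = c("center", "scale"),
                      trControl = caret::trainControl(method = "cv", number = folds),
                      tuneGrid = data.frame(k = ks))
  fit$bestTune$k  # k with the best cross-validated accuracy
}
```

For classification, caret picks the K with the highest cross-validated accuracy by default, which is the same as the lowest cross-validation error.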