FAIRFIELD INSTITUTE OF MANAGEMENT & TECHNOLOGY
(Affiliated to GGSIPU, an 'A' Grade college by DHE, Govt. of NCT Delhi)
SUBJECT NAME:- MACHINE LEARNING WITH PYTHON LAB FILE
SUBJECT CODE:- BCAP 311
SUBMITTED TO:- MS. ARUNA JOSHI, ASSISTANT PROFESSOR, IT DEPARTMENT
SUBMITTED BY:- NIKHIL KUMAR, 01290102021, B.C.A 5TH SEMESTER
LIST OF PRACTICALS
1. Extract the data from the database using Python.
2. Write a program to implement linear and logistic regression.
3. Write a program to implement the naïve Bayesian classifier for a sample training data set stored as a .CSV file. Compute the accuracy of the classifier, considering a few test data sets.
4. Write a program to implement the k-nearest neighbors (KNN) and Support Vector Machine (SVM) algorithms for classification.
5. Implement classification of a given dataset using random forest.
6. Build an Artificial Neural Network (ANN) by implementing the backpropagation algorithm and test the same using appropriate data sets.
7. Apply the k-Means algorithm to cluster a set of data stored in a .CSV file. Use the same data set for clustering using the k-Means algorithm. Compare the results of these two algorithms and comment on the quality of clustering. You can add Python ML library classes in the program.
8. Write a program to implement a Self-Organizing Map (SOM).
9. Write a program for empirical comparison of different supervised learning algorithms.
10. Write a program for empirical comparison of different unsupervised learning algorithms.
1) Extract the data from the database using python.
Our data is stored in the STUDENTS table of the SampleDB database. The following code extracts it using Python.
CODE:-
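import mysql.connector

# Connect to the local MySQL server and select the SampleDB database
myconn = mysql.connector.connect(host="localhost", user="root",
                                 passwd="test", database="SampleDB")
cur = myconn.cursor()

# Fetch every row of the STUDENTS table
cur.execute("select * from STUDENTS")
result = cur.fetchall()

print("Student Details are :")
for x in result:
    print(x)

# A SELECT changes nothing, so no commit is needed before closing
cur.close()
myconn.close()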
OUTPUT:-
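The rows printed by the program above depend on what is stored in the STUDENTS table. For readers without a MySQL server, the same connect, execute, fetchall pattern can be tried with Python's built-in sqlite3 module. This is only a minimal sketch under stated assumptions: the in-memory database, the column names (id, name, marks) and the three rows are hypothetical, because the source does not show the actual table. Note that sqlite3 uses `?` as the query placeholder, whereas mysql.connector uses `%s`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for the MySQL SampleDB (assumption).
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical schema and rows; the real STUDENTS table is not shown in the source.
cur.execute("CREATE TABLE STUDENTS (id INTEGER PRIMARY KEY, name TEXT, marks INTEGER)")
cur.executemany(
    "INSERT INTO STUDENTS (id, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 82), (2, "Ravi", 67), (3, "Meena", 91)],
)
conn.commit()  # a commit is needed here because rows were inserted

# Parameterized SELECT: the value is passed separately from the SQL text.
cur.execute(
    "SELECT id, name, marks FROM STUDENTS WHERE marks >= ? ORDER BY id",
    (80,),
)
rows = cur.fetchall()

print("Student Details are :")
for row in rows:
    print(row)

conn.close()
```

Passing the filter value as a parameter, rather than formatting it into the SQL string, lets the driver handle quoting and is the usual way to avoid SQL injection.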
2) Write a program to implement linear and logistic regression.
a) Linear regression:-
import numpy as nmp
import matplotlib.pyplot as mtplt

def estimate_coeff(p, q):
    # Number of observations and the means of p and q
    n1 = nmp.size(p)
    m_p = nmp.mean(p)
    m_q = nmp.mean(q)
    # Cross-deviation of p and q, and sum of squared deviations of p
    SS_pq = nmp.sum(q * p) - n1 * m_q * m_p
    SS_pp = nmp.sum(p * p) - n1 * m_p * m_p
    # Slope (b_1) and intercept (b_0) of the least-squares line
    b_1 = SS_pq / SS_pp
    b_0 = m_q - b_1 * m_p
    return (b_0, b_1)

def plot_regression_line(p, q, b):
    # Scatter plot of the observed points
    mtplt.scatter(p, q, color="m", marker="o", s=30)
    # Predicted values on the fitted line
    q_pred = b[0] + b[1] * p
    mtplt.plot(p, q_pred, color="g")
    mtplt.xlabel('p')
    mtplt.ylabel('q')
    mtplt.show()

def main():
    p = nmp.array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
    q = nmp.array([11, 13, 12, 15, 17, 18, 18, 19, 20, 22])
    b = estimate_coeff(p, q)
    print("Estimated coefficients are :\nb_0 = {} \nb_1 = {}".format(b[0], b[1]))
    plot_regression_line(p, q, b)

if __name__ == "__main__":
    main()
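As a sanity check on the closed-form estimate used in estimate_coeff, the coefficients can be compared with a degree-1 fit from numpy.polyfit on the same data. The sketch below is an illustrative cross-check rather than part of the lab program; the variable names slope and intercept are introduced here for the comparison.

```python
import numpy as np

# Same sample data as in the program above
p = np.array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
q = np.array([11, 13, 12, 15, 17, 18, 18, 19, 20, 22])

# Closed-form least squares, written with centered deviations:
#   b_1 = sum((p - mean_p) * (q - mean_q)) / sum((p - mean_p)^2)
#   b_0 = mean_q - b_1 * mean_p
m_p, m_q = p.mean(), q.mean()
b_1 = np.sum((p - m_p) * (q - m_q)) / np.sum((p - m_p) ** 2)
b_0 = m_q - b_1 * m_p

# Reference fit: polyfit returns the highest-degree coefficient first
slope, intercept = np.polyfit(p, q, deg=1)

print("closed form : b_0 = {:.6f}, b_1 = {:.6f}".format(b_0, b_1))
print("np.polyfit  : b_0 = {:.6f}, b_1 = {:.6f}".format(intercept, slope))
```

The centered form of the sums is algebraically the same as the expressions SS_pq and SS_pp in the program, so both approaches should agree up to floating-point rounding.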
b) Logistic regression:-
The code below fits a logistic regression object and uses it to predict whether a tumor is cancerous
based on the tumor size:
CODE:-
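import numpy
from sklearn import linear_model

# Tumor sizes, reshaped into a single-feature column
X = numpy.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65, 4.92, 4.37, 4.96, 4.52, 3.69,
5.88]).reshape(-1,1)
# Class labels: 0 = not cancerous, 1 = cancerous
y = numpy.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = linear_model.LogisticRegression()
logr.fit(X,y)

# Predict the class of a tumor with a size of 3.46
predicted = logr.predict(numpy.array([3.46]).reshape(-1,1))
print(predicted)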
OUTPUT:-
We have predicted that a tumor with a size of 3.46mm will not be cancerous.
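To see how the class decision comes about, the predicted probability can be recomputed by hand from the fitted coefficient and intercept using the sigmoid function, and compared with scikit-learn's predict_proba. The following is a minimal sketch on the same data; the names size, manual_prob and sklearn_prob are introduced here for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same tumor sizes and labels as in the program above
X = np.array([3.78, 2.44, 2.09, 0.14, 1.72, 1.65,
              4.92, 4.37, 4.96, 4.52, 3.69, 5.88]).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = LogisticRegression().fit(X, y)

size = np.array([[3.46]])

# Linear score z = intercept + coefficient * size, then sigmoid(z)
z = logr.intercept_[0] + logr.coef_[0, 0] * size[0, 0]
manual_prob = 1.0 / (1.0 + np.exp(-z))

# Probability of class 1 as reported by scikit-learn
sklearn_prob = logr.predict_proba(size)[0, 1]
predicted = logr.predict(size)

print("P(cancerous) by hand    :", manual_prob)
print("P(cancerous) by sklearn :", sklearn_prob)
print("Predicted class         :", predicted[0])
```

The predicted class is 1 when this probability exceeds 0.5, which is equivalent to the linear score z being positive.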
3) Write a program to implement the naïve Bayesian classifier for a sample
training data set stored as a .CSV file. Compute the accuracy of the classifier,
considering a few test data sets.
CODE:-
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Load the training data; the last column is the class label
data = pd.read_csv('tennisdata.csv')
print("The first 5 values of data is :\n", data.head())

# Features (copied so the encoding below does not modify data) and target
X = data.iloc[:, :-1].copy()
print("\nThe First 5 values of train data is\n", X.head())
y = data.iloc[:, -1]
print("\nThe first 5 values of Train output is\n", y.head())

# Encode each categorical feature column as integers
le_outlook = LabelEncoder()
X.Outlook = le_outlook.fit_transform(X.Outlook)
le_Temperature = LabelEncoder()
X.Temperature = le_Temperature.fit_transform(X.Temperature)
le_Humidity = LabelEncoder()
X.Humidity = le_Humidity.fit_transform(X.Humidity)
le_Windy = LabelEncoder()
X.Windy = le_Windy.fit_transform(X.Windy)
print("\nNow the Train data is :\n", X.head())

# Encode the target labels
le_PlayTennis = LabelEncoder()
y = le_PlayTennis.fit_transform(y)
print("\nNow the Train output is\n", y)

# Hold out 20% of the rows for testing
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20)
classifier = GaussianNB()
classifier.fit(X_train, y_train)

# accuracy_score expects the true labels first, then the predictions
print("Accuracy is:", accuracy_score(y_test, classifier.predict(X_test)))
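The four per-column LabelEncoder steps above can also be written with a single OrdinalEncoder, which encodes all feature columns in one call. The sketch below is a hedged, self-contained variant: because tennisdata.csv is not shown in the source, it builds a stand-in DataFrame with rows in the style of the classic play-tennis example, and the target column name PlayTennis is an assumption. The fixed random_state and the stratify argument are additions made here so that the split is repeatable and keeps both classes in the training set.

```python
import pandas as pd
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Stand-in data (assumption): the actual tennisdata.csv is not shown in the source.
rows = [
    ("Sunny", "Hot", "High", "False", "No"),
    ("Sunny", "Hot", "High", "True", "No"),
    ("Overcast", "Hot", "High", "False", "Yes"),
    ("Rainy", "Mild", "High", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Cool", "Normal", "True", "No"),
    ("Overcast", "Cool", "Normal", "True", "Yes"),
    ("Sunny", "Mild", "High", "False", "No"),
    ("Sunny", "Cool", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "Normal", "False", "Yes"),
    ("Sunny", "Mild", "Normal", "True", "Yes"),
    ("Overcast", "Mild", "High", "True", "Yes"),
    ("Overcast", "Hot", "Normal", "False", "Yes"),
    ("Rainy", "Mild", "High", "True", "No"),
]
columns = ["Outlook", "Temperature", "Humidity", "Windy", "PlayTennis"]
data = pd.DataFrame(rows, columns=columns)

# Encode all feature columns at once; LabelEncoder remains the tool for the target
X = OrdinalEncoder().fit_transform(data[columns[:-1]])
y = LabelEncoder().fit_transform(data["PlayTennis"])

# Stratified 80/20 split with a fixed seed (both are additions for repeatability)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, stratify=y, random_state=0
)

classifier = GaussianNB().fit(X_train, y_train)
accuracy = accuracy_score(y_test, classifier.predict(X_test))
print("Accuracy is:", accuracy)
```

With only 14 rows the test set holds 3 samples, so the reported accuracy can take only a few distinct values and varies strongly with the split.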
7) Apply the k-Means algorithm to cluster a set of data
stored in a .CSV file. Use the same data set for clustering using the k-Means
algorithm. Compare the results of these two algorithms and comment on the
quality of clustering. You can add Python ML library classes in the program.
CODE:-
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.cluster import KMeans
import sklearn.metrics as sm
import pandas as pd
import numpy as np

# Load the iris data into DataFrames with readable column names
iris = datasets.load_iris()
X = pd.DataFrame(iris.data)
X.columns = ['Sepal_Length','Sepal_Width','Petal_Length','Petal_Width']
y = pd.DataFrame(iris.target)
y.columns = ['Targets']

# Cluster the samples into three groups
model = KMeans(n_clusters=3)
model.fit(X)

plt.figure(figsize=(14,7))
colormap = np.array(['red', 'lime', 'black'])

# Plot the real classes
plt.subplot(1, 2, 1)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[y.Targets], s=40)
plt.title('Real Classification')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')

# Plot the model's cluster assignments
plt.subplot(1, 2, 2)
plt.scatter(X.Petal_Length, X.Petal_Width, c=colormap[model.labels_], s=40)
plt.title('K Mean Classification')
plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.show()

print(" ")
# Cluster ids are arbitrary, so this score is only meaningful
# when they happen to coincide with the class ids
print('The accuracy score of K-Mean: ', sm.accuracy_score(y.Targets, model.labels_))
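Because k-Means numbers its clusters arbitrarily, comparing model.labels_ directly with the class ids can give a misleading accuracy. One possible way to comment on the quality of clustering is to first match each cluster to a class and then score the result, or to use a label-permutation-invariant measure such as the adjusted Rand index. The sketch below is one such approach, not the method used in the lab file: the matching via scipy.optimize.linear_sum_assignment, the fixed random_state and n_init=10 are choices made here for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.metrics import accuracy_score, adjusted_rand_score, confusion_matrix

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Fixed seed and n_init are assumptions made here for repeatable results
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
labels = model.labels_

# Accuracy with the raw, arbitrarily numbered cluster ids
raw_acc = accuracy_score(y, labels)

# Confusion matrix: rows are true classes, columns are cluster ids
cm = confusion_matrix(y, labels)

# Find the one-to-one class/cluster pairing with the largest total overlap
class_idx, cluster_idx = linear_sum_assignment(-cm)
mapping = {int(cluster): int(cls) for cls, cluster in zip(class_idx, cluster_idx)}

# Relabel every sample's cluster id with its matched class id
aligned = np.array([mapping[int(c)] for c in labels])
aligned_acc = accuracy_score(y, aligned)

# Adjusted Rand index does not depend on how clusters are numbered
ari = adjusted_rand_score(y, labels)

print("Raw accuracy        :", raw_acc)
print("Aligned accuracy    :", aligned_acc)
print("Adjusted Rand index :", ari)
```

The aligned accuracy is never lower than the raw one, since the identity pairing is among the candidates the matching considers.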