Topic 9
Implementing Machine Learning with Python
Dr. Sunu Wibirama
Module for the Artificial Intelligence Course (Kuliah Kecerdasan Buatan)
Course code: UGMx 001001132012
July 4, 2022
1 Course Learning Outcomes
This topic addresses CLO 5 (CPMK 5): the ability to use the Python programming language to support the development of intelligent systems.
The indicators that this outcome has been achieved are: understanding how to extract a dataset into Python variables, understanding how to use various classifier functions, and understanding how to validate machine learning models.
2 Scope of the Material
This topic covers the following material:
a) Loading machine learning data: covers techniques for downloading data online using Google Colaboratory and Python, as well as the basic statistics that can be used to inspect the characteristics of the data. The dataset used in this hands-on session is the PIMA Indian Dataset.
b) Preparing machine learning data: covers techniques for examining the distribution of, and the correlation between, the attributes in the dataset.
c) Data visualization: covers visualizing data with histograms, density plots, boxplots, correlation matrices, and scatter plots.
d) Data preparation and transformation: covers the essential steps for preparing the data so as to reduce potential errors when the data becomes the input of a machine learning algorithm. This material covers data rescaling, data standardization, and data normalization.
e) Feature selection: covers the techniques needed to select features from the many features available.
f) Performance evaluation of machine learning algorithms: covers techniques that can be used to evaluate and select machine learning models, such as splitting the dataset into training and testing sets, K-fold cross validation, and repeated random test-train splits.
g) Performance metrics of machine learning algorithms: covers the main metrics that can be used to measure the performance of machine learning algorithms, such as classification accuracy, area under the ROC curve, confusion matrix, mean absolute error, mean squared error, and R-squared.
h) Implementation of machine learning and neural networks: covers real examples of implementing machine learning algorithms for classification, algorithm tuning, and a neural network with Keras.
Week 9: Hands-on Machine Learning with Python
Copyright (C) 2022 - Dr. Sunu Wibirama | Universitas Gadjah Mada
Note: This notebook is intended for educational purposes. Distribution of this notebook is limited only to students of Kuliah Kecerdasan Buatan through the ICE Institute Platform. Any redistribution or republication without written permission from Dr. Sunu Wibirama is strictly prohibited and is considered copyright infringement.
In this last lesson (Week 9) of Kuliah Kecerdasan Buatan (Artificial Intelligence Course), we will show how to load a dataset to be processed with a machine learning algorithm. In addition, we will also learn how to visualize the data, how to prepare our data, and how feature selection works. Then, we introduce some metrics to evaluate machine learning algorithms. Finally, we will implement several machine learning algorithms.
9.1 Loading Machine Learning Data
PIMA Indian Dataset
The Pima Indians dataset is used to demonstrate data loading in this lesson. This dataset is originally from the National Institute of Diabetes and Digestive and Kidney Diseases. The objective is to predict, based on diagnostic measurements, whether a patient will develop diabetes within five years. As such, it is a classification problem.
Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage.
[preg] -- Pregnancies: number of times pregnant
[plas] -- Glucose: plasma glucose concentration at 2 hours in an oral glucose tolerance test
[pres] -- BloodPressure: diastolic blood pressure (mm Hg)
[skin] -- SkinThickness: triceps skin fold thickness (mm)
[test] -- Insulin: 2-hour serum insulin (mu U/ml)
[mass] -- BMI: body mass index (weight in kg / (height in m)^2)
[pedi] -- DiabetesPedigreeFunction: diabetes pedigree function
[age] -- Age: age (years)
[class] -- Outcome: class variable, where 0 means no onset of diabetes and 1 means onset of diabetes
It is a good dataset for demonstration because all of the input attributes are numeric and the output variable to be predicted is binary (0 or 1). More detailed information about the dataset can be found here: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database
Task 1: Downloading the dataset from GitHub using the NumPy library
We can load our CSV data using NumPy and the numpy.loadtxt() function. This function assumes no header row and that all data has the same format. The example below loads the file pima-indians-diabetes.data.csv from an online URL. The results are listed in rows then columns. You can see that the dataset has 768 rows and 9 columns.
In [89]:
from numpy import loadtxt

from urllib.request import urlopen



# load data
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

raw_data = urlopen(URL)

dataset = loadtxt(raw_data, delimiter=",") 

print(dataset.shape)
(768, 9)

Task 2: Downloading the dataset from GitHub using the Pandas library
We can also load our CSV data using Pandas and the pandas.read_csv() function. This function is very flexible and is perhaps the most recommended approach for loading our machine learning data. The function returns a pandas.DataFrame that we can immediately start summarizing and plotting. Note that in this example we explicitly specify the names of each attribute for the DataFrame.
(768, 9)

Task 3: Using descriptive statistics to understand your data
There is no substitute for looking at the raw data. Looking at the raw data can reveal insights that we cannot get any other way. It can also plant seeds that may later grow into ideas on how to better pre-process and handle the data for machine learning tasks. We can review the first 20 rows of our data using the head() function on the Pandas data. We can see that the first column lists the row number, which is handy for referencing a specific observation.
preg plas pres skin test mass pedi age class

0 6 148 72 35 0 33.6 0.627 50 1

1 1 85 66 29 0 26.6 0.351 31 0

2 8 183 64 0 0 23.3 0.672 32 1

3 1 89 66 23 94 28.1 0.167 21 0

4 0 137 40 35 168 43.1 2.288 33 1

5 5 116 74 0 0 25.6 0.201 30 0

6 3 78 50 32 88 31.0 0.248 26 1

7 10 115 0 0 0 35.3 0.134 29 0

8 2 197 70 45 543 30.5 0.158 53 1

9 8 125 96 0 0 0.0 0.232 54 1

10 4 110 92 0 0 37.6 0.191 30 0

11 10 168 74 0 0 38.0 0.537 34 1

12 10 139 80 0 0 27.1 1.441 57 0

13 1 189 60 23 846 30.1 0.398 59 1

14 5 166 72 19 175 25.8 0.587 51 1

15 7 100 0 0 0 30.0 0.484 32 1

16 0 118 84 47 230 45.8 0.551 31 1

17 7 107 74 0 0 29.6 0.254 31 1

18 1 103 30 38 83 43.3 0.183 33 0

19 1 115 70 30 96 34.6 0.529 32 1

Task 4: Observing the type of data for each attribute
The type of each attribute is important. Strings may need to be converted to floating point values or integers to represent categorical or ordinal values. We can get an idea of the types of attributes by peeking at the raw data. We can also list the data types to characterize each attribute using the dtypes property.
preg int64

plas int64

pres int64

skin int64

test int64

mass float64

pedi float64

age int64

class int64

dtype: object
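All attributes above are already numeric. If a column had instead been read in as strings (for example, because of a stray header row or textual missing-value markers), it could be coerced to a numeric type before modeling. Below is a minimal sketch, assuming the same data DataFrame loaded in the cells of this lesson; pd.to_numeric with errors='coerce' turns values that cannot be parsed into NaN.

import pandas as pd

# assumption: 'data' is the Pima DataFrame loaded earlier with pd.read_csv(URL, names=names)
# coerce every column to a numeric dtype; values that cannot be parsed become NaN
data_numeric = data.apply(pd.to_numeric, errors='coerce')
print(data_numeric.dtypes)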

Task 5: Descriptive statistics
Descriptive statistics can give us great insight into the properties of each attribute. Often we can create more summaries than we have time to review. The describe() function on the Pandas data lists 8 statistical properties of each attribute:
In [90]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

print(data.shape)

In [91]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

peek = data.head(20)

print(peek)

In [92]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

types = data.dtypes

print(types)
Count.
Mean.
Standard Deviation.
Minimum Value.
25th Percentile.
50th Percentile (Median).
75th Percentile.
Maximum Value.
We will note some calls to pandas.set_option() in the recipe to change the precision of the numbers and the preferred width of the output. This is to make it more readable for this example. Note that the display.width parameter sets the width of the display in characters. In case Python is running in a terminal, this can be set to None and Pandas will correctly auto-detect the width.
When describing our data this way, it is worth taking some time to review observations from the results. This might include the presence of NA values for missing data or surprising distributions for attributes.
preg plas pres skin test mass pedi age class

count 768.000 768.000 768.000 768.000 768.000 768.000 768.000 768.000 768.000

mean 3.845 120.895 69.105 20.536 79.799 31.993 0.472 33.241 0.349

std 3.370 31.973 19.356 15.952 115.244 7.884 0.331 11.760 0.477

min 0.000 0.000 0.000 0.000 0.000 0.000 0.078 21.000 0.000

25% 1.000 99.000 62.000 0.000 0.000 27.300 0.244 24.000 0.000

50% 3.000 117.000 72.000 23.000 30.500 32.000 0.372 29.000 0.000

75% 6.000 140.250 80.000 32.000 127.250 36.600 0.626 41.000 1.000

max 17.000 199.000 122.000 99.000 846.000 67.100 2.420 81.000 1.000

9.2 Preparing Machine Learning Data
Task 1: Checking the distribution of the data
On classification problems we need to know how balanced the class values are. Highly imbalanced problems (a lot more observations for one class than another) are common and may need special handling in the data preparation stage of our project. We can quickly get an idea of the distribution of the class attribute in Pandas. We can see that there are nearly twice as many observations with class 0 (no onset of diabetes) as with class 1 (onset of diabetes).
class
0 500

1 268

dtype: int64
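To make the imbalance more explicit, the same counts can be reported as proportions. A minimal sketch, assuming the data DataFrame loaded above:

# assumption: 'data' is the Pima DataFrame loaded with pd.read_csv(URL, names=names)
# value_counts(normalize=True) returns the fraction of rows belonging to each class
class_proportions = data['class'].value_counts(normalize=True)
print(class_proportions)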

Task 2: Checking the correlation of the data
Correlation refers to the relationship between two variables and how they may or may not change together. The most common method for calculating correlation is Pearson's Correlation Coefficient, which assumes a normal distribution of the attributes involved.
A correlation of -1 or 1 shows a full negative or positive correlation respectively, whereas a value of 0 shows no correlation at all. Some machine learning algorithms like linear and logistic regression can suffer poor performance if there are highly correlated attributes in your dataset. As such, it is a good idea to review all of the pairwise correlations of the attributes in our dataset. We can use the corr() function on the Pandas data to calculate a correlation matrix.
The matrix lists all attributes across the top and down the side, to give the correlation between all pairs of attributes (twice, because the matrix is symmetrical). We can see that the diagonal line through the matrix from the top left to the bottom right corner shows the perfect correlation of each attribute with itself.
In [93]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

pd.set_option('display.width', 100)

pd.set_option('display.precision', 3)

description = data.describe()

print(description)

In [94]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

class_counts = data.groupby('class').size()

print(class_counts)

In [95]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

pd.set_option('display.width', 100)
pd.set_option('display.precision', 3)
correlations = data.corr(method='pearson')
print(correlations)
preg plas pres skin test mass pedi age class

preg 1.000 0.129 0.141 -0.082 -0.074 0.018 -0.034 0.544 0.222

plas 0.129 1.000 0.153 0.057 0.331 0.221 0.137 0.264 0.467

pres 0.141 0.153 1.000 0.207 0.089 0.282 0.041 0.240 0.065

skin -0.082 0.057 0.207 1.000 0.437 0.393 0.184 -0.114 0.075

test -0.074 0.331 0.089 0.437 1.000 0.198 0.185 -0.042 0.131

mass 0.018 0.221 0.282 0.393 0.198 1.000 0.141 0.036 0.293

pedi -0.034 0.137 0.041 0.184 0.185 0.141 1.000 0.034 0.174

age 0.544 0.264 0.240 -0.114 -0.042 0.036 0.034 1.000 0.238

class 0.222 0.467 0.065 0.075 0.131 0.293 0.174 0.238 1.000

Task 3: Skew of univariate distributions
Skew refers to a distribution that is assumed Gaussian (normal or bell curve) being shifted or squashed in one direction or another. Many machine learning algorithms assume a Gaussian distribution. Knowing that an attribute has a skew may allow us to perform data preparation to correct the skew and later improve the accuracy of our models. We can calculate the skew of each attribute using the skew() function on the Pandas data. The skew result shows a positive (right) or negative (left) skew. Values closer to zero show less skew.
preg 0.902

plas 0.174

pres -1.844

skin 0.109

test 2.272

mass -0.429

pedi 1.920

age 1.130

class 0.635

dtype: float64
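As an illustration of correcting skew, a log transform often reduces a strong positive skew such as the one seen for the test (insulin) attribute. Below is a minimal sketch, assuming the data DataFrame loaded above; numpy.log1p is used because the column contains zeros.

import numpy as np

# assumption: 'data' is the Pima DataFrame loaded with pd.read_csv(URL, names=names)
# log1p(x) = log(1 + x), which handles the zero values in the 'test' column
print("skew before:", data['test'].skew())
print("skew after :", np.log1p(data['test']).skew())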

How to use our statistical results
Review the numbers. Generating the summary statistics is not enough. Take a moment to pause, read and really think about the numbers you are seeing.
Ask why. Review your numbers and ask a lot of questions. How and why are you seeing specific values? Think about how the numbers relate to the problem domain in general and to the specific entities that observations relate to.
Write down ideas. Write down your observations and ideas. Keep a small text file or notepad and jot down all of the ideas for how variables may relate, for what the numbers mean, and ideas for techniques to try later. The things you write down now while the data is fresh will be very valuable later when you are trying to think up new things to try.
9.3 Data Visualization (Part 01)
We must understand our data in order to get the best results from machine learning algorithms. The fastest way to learn more about our data is to use data visualization. In this chapter we will discover exactly how we can visualize our machine learning data in Python using Pandas. First, we will learn three univariate plots:
Histograms
Density Plots
Box and Whisker Plots
In the subsequent part, we will learn some multivariate plots, including:
Correlation Matrix Plots
Scatter Plot Matrix
Task 1: Plotting Histograms
A fast way to get an idea of the distribution of each attribute is to look at histograms. Histograms group data into bins and provide us a count of the number of observations in each bin. From the shape of the bins we can quickly get a feeling for whether an attribute is Gaussian, skewed or even has an exponential distribution. It can also help us see possible outliers.
We can see that perhaps the attributes age, pedi and test may have an exponential distribution. We can also see that perhaps the mass, pres and plas attributes may have a Gaussian or nearly Gaussian distribution. This is interesting because many machine learning techniques assume a Gaussian univariate distribution on the input variables.

In [96]:
import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

skew = data.skew()

print(skew)

In [97]:
# Univariate Histograms
import matplotlib.pyplot as plt
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)

fig = plt.figure(figsize=(12,12))
ax = fig.gca()
data.hist(ax=ax)
plt.show()

Task 2: Plotting Density Plots
Density plots are another way of getting a quick idea of the distribution of each attribute. The plots look like an abstracted histogram with a smooth curve drawn through the top of each bin, much like our eye tried to do with the histograms. We can see that the distribution of each attribute is clearer than with the histograms.
In [98]:
# Univariate Density Plots

import matplotlib.pyplot as plt

import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

fig = plt.figure(figsize = (12,12))

ax = fig.gca()

data.plot(kind='density', subplots=True, layout=(3,3), sharex=False, ax=ax) 

plt.show()
Task 3: Plotting Box and Whisker Plots
Another useful way to review the distribution of each attribute is to use Box and Whisker Plots, or boxplots for short. Boxplots summarize the distribution of each attribute, drawing a line for the median (middle value) and a box around the 25th and 75th percentiles (the middle 50% of the data). The whiskers give an idea of the spread of the data, and dots outside of the whiskers show candidate outlier values (values that lie more than 1.5 times the interquartile range beyond the edges of the box).
We can see that the spread of attributes is quite different. Some, like age, test and skin, appear quite skewed towards smaller values.

In [99]:
# Univariate Box and Whisker Plots

import matplotlib.pyplot as plt

import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

fig = plt.figure(figsize = (12,12))

ax = fig.gca()

data.plot(kind='box', subplots=True, layout=(3,3), sharex=False, ax=ax) 

plt.show()
9.4 Data Visualization (Part 02)
This lecture provides examples of two plots that show the interactions between multiple variables in your dataset:
Correlation Matrix Plot.
Scatter Plot Matrix.
Task 1: Plotting a Correlation Matrix Plot
Correlation gives an indication of how related the changes are between two variables. If two variables change in the same direction they are positively correlated. If they change in opposite directions together (one goes up, one goes down), then they are negatively correlated.
We can calculate the correlation between each pair of attributes. This is called a correlation matrix. We can then plot the correlation matrix and get an idea of which variables have a high correlation with each other. This is useful to know, because some machine learning algorithms like linear and logistic regression can have poor performance if there are highly correlated input variables in our data.
We can see that the matrix is symmetrical, i.e. the bottom left of the matrix is the same as the top right. This is useful as we can see two different views on the same data in one plot. We can also see that each variable is perfectly positively correlated with itself (as you would have expected) in the diagonal line from top left to bottom right.
In [100]:
#Correlation Matrix Plot

import matplotlib.pyplot as plt

import pandas as pd

import numpy as np



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

correlations = data.corr()



# plot correlation matrix

fig = plt.figure(figsize = (12,12))

ax = fig.add_subplot(111)

cax = ax.matshow(correlations, vmin=-1, vmax=1) 

fig.colorbar(cax)

ticks = np.arange(0,9,1) 

ax.set_xticks(ticks)

ax.set_yticks(ticks)

ax.set_xticklabels(names) 

ax.set_yticklabels(names)

plt.show()



#print correlation value

pd.set_option('display.width', 100)
pd.set_option('display.precision', 3)
correlations = data.corr(method='pearson')
print(correlations)
preg plas pres skin test mass pedi age class

preg 1.000 0.129 0.141 -0.082 -0.074 0.018 -0.034 0.544 0.222

plas 0.129 1.000 0.153 0.057 0.331 0.221 0.137 0.264 0.467

pres 0.141 0.153 1.000 0.207 0.089 0.282 0.041 0.240 0.065

skin -0.082 0.057 0.207 1.000 0.437 0.393 0.184 -0.114 0.075

test -0.074 0.331 0.089 0.437 1.000 0.198 0.185 -0.042 0.131

mass 0.018 0.221 0.282 0.393 0.198 1.000 0.141 0.036 0.293

pedi -0.034 0.137 0.041 0.184 0.185 0.141 1.000 0.034 0.174

age 0.544 0.264 0.240 -0.114 -0.042 0.036 0.034 1.000 0.238

class 0.222 0.467 0.065 0.075 0.131 0.293 0.174 0.238 1.000

Task 2: Plotting a Scatter Plot Matrix
A scatter plot shows the relationship between two variables as dots in two dimensions, one axis for each attribute. We can create a scatter plot for each pair of attributes in our data. Drawing all these scatter plots together is called a scatter plot matrix.
Scatter plots are useful for spotting structured relationships between variables, like whether we could summarize the relationship between two variables with a line. Attributes with structured relationships may also be correlated and are good candidates for removal from your dataset. Below is a figure showing the shape of the scatter plot for typical correlations between two variables.
Like the Correlation Matrix Plot above, the scatter plot matrix is symmetrical. This is useful to look at the pairwise relationships from different perspectives. Because there is little point in drawing a scatter plot of each variable with itself, the diagonal shows histograms of each attribute.
preg plas pres skin test mass pedi age class

preg 1.000 0.129 0.141 -0.082 -0.074 0.018 -0.034 0.544 0.222

plas 0.129 1.000 0.153 0.057 0.331 0.221 0.137 0.264 0.467

pres 0.141 0.153 1.000 0.207 0.089 0.282 0.041 0.240 0.065

skin -0.082 0.057 0.207 1.000 0.437 0.393 0.184 -0.114 0.075

test -0.074 0.331 0.089 0.437 1.000 0.198 0.185 -0.042 0.131

mass 0.018 0.221 0.282 0.393 0.198 1.000 0.141 0.036 0.293

pedi -0.034 0.137 0.041 0.184 0.185 0.141 1.000 0.034 0.174

age 0.544 0.264 0.240 -0.114 -0.042 0.036 0.034 1.000 0.238

class 0.222 0.467 0.065 0.075 0.131 0.293 0.174 0.238 1.000

9.5 Data Preparation and Transformation
Many machine learning algorithms make assumptions about our data. It is often a very good idea to prepare our data in such a way as to best expose the structure of the problem to the machine learning algorithms that you intend to use. A difficulty is that different algorithms make different assumptions about our data and may require different transforms. Further, even when you follow all of the rules and prepare our data, sometimes algorithms can deliver better results without pre-processing.
Task 1: Rescaling the data
When our data is comprised of attributes with varying scales, many machine learning algorithms can benefit from rescaling the attributes so that they all have the same scale. Often this is referred to as normalization and attributes are often rescaled into the range between 0 and 1. This is useful for optimization algorithms used in the core of machine learning algorithms like gradient descent. It is also useful for algorithms that weight inputs like regression and neural networks, and algorithms that use distance measures like k-Nearest Neighbors. We can rescale our data using scikit-learn with the MinMaxScaler class. After rescaling we can see that all of the values are in the range between 0 and 1.
In [101]:
# Scatter Plot Matrix
import matplotlib.pyplot as plt

import pandas as pd



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

pd.plotting.scatter_matrix(data, figsize=(12,12))

plt.show()



#print correlation value

pd.set_option('display.width', 100)

pd.set_option('display.precision', 3)

correlations = data.corr(method='pearson')

print(correlations)
[[0.353 0.744 0.59 0.354 0. 0.501 0.234 0.483]

[0.059 0.427 0.541 0.293 0. 0.396 0.117 0.167]

[0.471 0.92 0.525 0. 0. 0.347 0.254 0.183]

[0.059 0.447 0.541 0.232 0.111 0.419 0.038 0. ]

[0. 0.688 0.328 0.354 0.199 0.642 0.944 0.2 ]]

Task 2: Standardizing the data
Standardization is a useful technique to transform attributes with a Gaussian distribution and differing means and standard deviations to a standard Gaussian distribution with a mean of 0 and a standard deviation of 1. It is most suitable for techniques that assume a Gaussian distribution in the input variables and work better with rescaled data, such as linear regression, logistic regression and linear discriminant analysis. We can standardize data using scikit-learn with the StandardScaler class.
[[ 0.64 0.848 0.15 0.907 -0.693 0.204 0.468 1.426]

[-0.845 -1.123 -0.161 0.531 -0.693 -0.684 -0.365 -0.191]

[ 1.234 1.944 -0.264 -1.288 -0.693 -1.103 0.604 -0.106]

[-0.845 -0.998 -0.161 0.155 0.123 -0.494 -0.921 -1.042]

[-1.142 0.504 -1.505 0.907 0.766 1.41 5.485 -0.02 ]]

Task 3: Normalizing the data
Normalizing in scikit-learn refers to rescaling each observation (row) to have a length of 1 (called a unit norm, or a vector with a length of 1 in linear algebra). This pre-processing method can be useful for sparse datasets (lots of zeros) with attributes of varying scales when using algorithms that weight input values such as neural networks, and algorithms that use distance measures such as k-Nearest Neighbors. We can normalize data in Python with scikit-learn using the Normalizer class.
In [102]:
# Rescale Data (between 0 and 1)

from sklearn.preprocessing import MinMaxScaler
import pandas as pd

from numpy import set_printoptions



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# rescaling the data

scaler = MinMaxScaler(feature_range=(0, 1))

rescaledX = scaler.fit_transform(X)



# summarize transformed data

set_printoptions(precision=3)

print(rescaledX[0:5,:])
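For reference, MinMaxScaler with feature_range=(0, 1) computes (x - min) / (max - min) column by column. Below is a minimal sketch that reproduces the same result manually, assuming X and rescaledX from the cell above:

import numpy as np

# assumption: X and rescaledX come from the MinMaxScaler cell above
manual = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(np.allclose(manual, rescaledX))  # should print True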

In [ ]:
# Standardize Data (0 mean, 1 stdev)

from sklearn.preprocessing import StandardScaler

import pandas as pd

from numpy import set_printoptions



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# standardize the data

scaler = StandardScaler().fit(X) 

rescaledX = scaler.transform(X)



# summarize transformed data 

set_printoptions(precision=3) 

print(rescaledX[0:5,:])
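Similarly, StandardScaler subtracts the per-column mean and divides by the per-column standard deviation. A minimal sketch, assuming X and rescaledX from the StandardScaler cell above:

import numpy as np

# assumption: X and rescaledX come from the StandardScaler cell above
manual = (X - X.mean(axis=0)) / X.std(axis=0)
print(np.allclose(manual, rescaledX))  # should print True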

In [ ]:
# Normalize Data (length of 1)

from sklearn.preprocessing import Normalizer

import pandas as pd

from numpy import set_printoptions



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# normalize the data

scaler = Normalizer().fit(X)
normalizedX = scaler.transform(X)

# summarize transformed data
set_printoptions(precision=3)
print(normalizedX[0:5,:])
[[0.034 0.828 0.403 0.196 0. 0.188 0.004 0.28 ]

[0.008 0.716 0.556 0.244 0. 0.224 0.003 0.261]

[0.04 0.924 0.323 0. 0. 0.118 0.003 0.162]

[0.007 0.588 0.436 0.152 0.622 0.186 0.001 0.139]

[0. 0.596 0.174 0.152 0.731 0.188 0.01 0.144]]
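For reference, Normalizer (with the default L2 norm) divides each row by its Euclidean length, so every row of normalizedX has unit length. A minimal sketch, assuming X and normalizedX from the Normalizer cell above:

import numpy as np

# assumption: X and normalizedX come from the Normalizer cell above
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
manual = X / row_norms
print(np.allclose(manual, normalizedX))      # should print True
print(np.linalg.norm(normalizedX, axis=1)[:5])  # each row now has length 1.0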

9.6 Feature Selection
The data features that we use to train our machine learning models have a huge influence on the performance we can achieve. In this lesson we will discover automatic feature selection techniques that we can use to prepare our machine learning data in Python with scikit-learn.
Feature selection is a process where you automatically select those features in our data that contribute most to the prediction variable or output in which you are interested. Having irrelevant features in our data can decrease the accuracy of many models, especially linear algorithms like linear and logistic regression. Three benefits of performing feature selection before modeling our data are:
Reduces overfitting: less redundant data means less opportunity to make decisions based on noise.
Improves accuracy: less misleading data means modeling accuracy improves.
Reduces training time: less data means that algorithms train faster.
After completing this lesson you will know how to use:
Univariate Selection.
Recursive Feature Elimination.
Feature Importance.
Task 1: Univariate Selection
Statistical tests can be used to select those features that have the strongest relationship with the output variable. The scikit-learn library provides the SelectKBest class that can be used with a suite of different statistical tests to select a specific number of features.
The example below uses the Chi-Squared (χ²) statistical test for non-negative features to select 4 of the best features from the Pima Indians onset of diabetes dataset.
Selected features with first 5 entries:

plas test mass age

0 148 0 33.6 50

1 85 0 26.6 31

2 183 0 23.3 32

3 89 94 28.1 21

4 137 168 43.1 33

Chi-square scores of the selected features:

Index(['plas', 'test', 'mass', 'age'], dtype='object')

[1411.887 2175.565 127.669 181.304]

Task 2: Recursive Feature Elimination
Recursive Feature Elimination (or RFE) works by recursively removing attributes and building a model on those attributes that remain. It uses the model accuracy to identify which attributes (and combinations of attributes) contribute the most to predicting the target attribute.

In [ ]:
# Feature selection with Univariate Statistical Tests (Chi-squared for classification)

import pandas as pd

from numpy import set_printoptions

from sklearn.feature_selection import SelectKBest, f_classif

from sklearn.feature_selection import chi2



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# select four features and create new pandas dataframe

selector = SelectKBest(score_func=chi2, k=4).fit(X,Y)

f = selector.get_support(1)

data_new = data[data.columns[f]]

print ("Selected features with first 5 entries:")

print(data_new.head(5))



# show selected chi-square scores for selected features

print('\n')

print ("Chi-square scores of the selected features:")

x_new = selector.transform(X) # not needed to get the score

scores = selector.scores_

print (data.columns[f])

print(scores[f])
The example below uses RFE with the logistic regression algorithm to select the top 3 features. The choice of algorithm does not matter too much as long as it is skillful and consistent.
Number of selected features via RFE: 3

Boolean of selected features: [ True False False False False True True False]

Feature ranking: [1 2 4 6 5 1 1 3]

Selected features: 

preg

mass

pedi

Task 3: Feature Importance
Feature importance refers to techniques that assign a score to input features based on how useful they are at predicting a target variable. Most importance scores are calculated by a predictive model that has been fit on the dataset. Inspecting the importance scores provides insight into that specific model and which features are the most and least important to the model when making a prediction. This is a type of model interpretation that can be performed for those models that support it.
Bagged decision trees like Random Forest and Extra Trees can be used to estimate the importance of features. In the example below we construct an ExtraTreesClassifier for the Pima Indians onset of diabetes dataset. We can see that we are given an importance score for each attribute, where the larger the score, the more important the attribute. The scores highlight the importance of plas, age and mass.
Features with importance scores:

preg 0.10917902521438591

plas 0.23778795159254987

pres 0.09677965067606348

skin 0.07938108481610057

test 0.0715765118317984

mass 0.1418165024365237

In [ ]:
# Feature Selection with RFE

import pandas as pd

from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# feature selection

#model = LogisticRegression()

rfe = RFE(estimator=LogisticRegression(solver='lbfgs', max_iter=1000),n_features_to_select=3)

fit = rfe.fit(X, Y)

print("Number of selected features via RFE: %d" % fit.n_features_)
print("Boolean of selected features: %s" % fit.support_)

print("Feature ranking: %s" % fit.ranking_)



print("Selected features: ")

idx = 0
for x in fit.ranking_:
    if x == 1:
        print(data.columns[idx])
    idx += 1

In [62]:
# Feature Importance with Extra Trees Classifier

import pandas as pd

from sklearn.ensemble import ExtraTreesClassifier

import numpy as np

import matplotlib.pyplot as plt



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# feature selection

model = ExtraTreesClassifier()

model.fit(X, Y)

importance_sorted = np.sort(model.feature_importances_)



print("Features with importance scores:")

idx = 0
for x in model.feature_importances_:
    print(data.columns[idx], x)
    idx += 1



#show bar plot

features = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age']

plt.bar(features,model.feature_importances_)

plt.show()
pedi 0.11855796799091706

age 0.14492130544166104

9.7 Performance Evaluation of Machine Learning Algorithms
We need to know how well our algorithms perform on unseen data. The best way to evaluate the performance of an algorithm would be to make predictions for new data to which we already know the answers. The second best way is to use clever techniques from statistics called resampling methods that allow us to make accurate estimates of how well our algorithm will perform on new data.
Task 1: Splitting Data into Train and Test Sets
The simplest method that we can use to evaluate the performance of a machine learning algorithm is to use different training and testing datasets. We can take our original dataset and split it into two parts: train the algorithm on the first part, make predictions on the second part and evaluate the predictions against the expected results. The size of the split can depend on the size and specifics of your dataset, although it is common to use 67% of the data for training and the remaining 33% for testing.
This algorithm evaluation technique is very fast. It is ideal for large datasets (millions of records) where there is strong evidence that both splits of the data are representative of the underlying problem. Because of the speed, it is useful to use this approach when the algorithm we are investigating is slow to train.
A downside of this technique is that it can have a high variance. This means that differences in the training and test datasets can result in meaningful differences in the estimate of accuracy. In the example below we split the Pima Indians dataset into 67%/33% splits for training and test, and evaluate the accuracy of a Logistic Regression model.
Note that in addition to specifying the size of the split, we also specify the random seed. Because the split of the data is random, we want to ensure that the results are reproducible. By specifying the random seed we ensure that we get the same random numbers each time we run the code, and in turn the same split of data. This is important if we want to compare this result to the estimated accuracy of another machine learning algorithm or of the same algorithm with a different configuration. To ensure the comparison is apples-for-apples, we must ensure that they are trained and tested on exactly the same data.
Accuracy: 78.740%

Task 2: K-Fold Cross Validation
Cross-validation is an approach that we can use to estimate the performance of a machine learning algorithm with less variance than a single train-test set split. It works by splitting the dataset into k parts (e.g. k=5 or k=10). Each split of the data is called a fold. The algorithm is trained on k-1 folds with one held back, and tested on the held-back fold. This is repeated so that each fold of the dataset is given a chance to be the held-back test set. After running cross-validation we end up with k different performance scores that we can summarize using a mean and a standard deviation.
The result is a more reliable estimate of the performance of the algorithm on new data. It is more accurate because the algorithm is trained and evaluated multiple times on different data. The choice of k must allow the size of each test partition to be large enough to be a reasonable sample of the problem, whilst allowing enough repetitions of the train-test evaluation of the algorithm to provide a fair estimate of the algorithm's
In [75]:
# Evaluate using a train and a test set

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



test_size = 0.33

seed = 7



X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)

model = LogisticRegression(solver='lbfgs', max_iter=1000) 

model.fit(X_train, Y_train)

result = model.score(X_test, Y_test) 

print("Accuracy: %.3f%%" % (result*100.0))
performance on unseen data. For modest sized datasets in the thousands or tens of thousands of records, k values of 3, 5 and 10 are common. In the example below we use 10-fold cross-validation.
We report both the mean and the standard deviation of the performance measure. When summarizing performance measures, it is a good practice to summarize the distribution of the measures, in this case assuming a Gaussian distribution of performance (a very reasonable assumption) and recording the mean and standard deviation.
Accuracy: 77.604% (5.158%)

Task 3: Repeated Random Test-Train Splits
Another variation on k-fold cross-validation is to create a random split of the data like the train/test split described above, but repeat the process of splitting and evaluating the algorithm multiple times, like cross-validation. This has the speed of using a train/test split and the reduction in variance in the estimated performance of k-fold cross-validation.
We can also repeat the process many more times as needed to improve the accuracy. A downside is that repetitions may include much of the same data in the train or the test split from run to run, introducing redundancy into the evaluation. The example below splits the data into a 67%/33% train/test split and repeats the process 10 times.
Accuracy: 76.535% (2.235%)

Notes to be considered
Generally, k-fold cross-validation is the gold standard for evaluating the performance of a machine learning algorithm on unseen data, with k set to 3, 5, or 10.
Using a train/test split is good for speed when using a slow algorithm and produces performance estimates with lower bias when using large datasets.
Techniques like repeated random splits can be useful intermediates when trying to balance variance in the estimated performance, model training speed and dataset size.
The best advice is to experiment and find a technique for your problem that is fast and produces reasonable estimates of performance that you can use to make decisions. If in doubt, use 10-fold cross-validation.
9.8 Performance Metrics of Machine Learning Algorithms (Part 01)
The metrics that we choose to evaluate our machine learning algorithms are very important. The choice of metrics influences how the performance of machine learning algorithms is measured and compared. They influence how we weight the importance of different characteristics in the results
In [85]:
# Evaluate using cross validation

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LogisticRegression



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



kfold = KFold(n_splits=10, random_state=None)

model = LogisticRegression(solver='lbfgs', max_iter=1000)

results = cross_val_score(model, X, Y, cv=kfold)

print("Accuracy: %.3f%% (%.3f%%)" % (results.mean()*100.0, results.std()*100.0))

In [84]:
# Evaluate using Shuffle Split Cross Validation

import pandas as pd

from sklearn.model_selection import ShuffleSplit

from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LogisticRegression



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



n_splits = 10

test_size = 0.33

seed = 7



kfold = ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=seed)

model = LogisticRegression(solver='lbfgs', max_iter=1000)

results = cross_val_score(model, X, Y, cv=kfold)

print("Accuracy: %.3f%% (%.3f%%)" % (results.mean()*100.0, results.std()*100.0))
and our ultimate choice of which algorithm to choose.
Task 1: Classification Accuracy
Classification accuracy is the number of correct predictions made as a ratio of all predictions made. This is the most common evaluation metric for classification problems; it is also the most misused. It is really only suitable when there are an equal number of observations in each class (which is rarely the case) and when all predictions and prediction errors are equally important, which is often not the case. Below is an example of calculating classification accuracy.
Accuracy: 0.776 (0.052)

Task 2: Area Under the ROC Curve
Area under the ROC Curve (or AUC for short) is a performance metric for binary classification problems. The AUC represents a model's ability to discriminate between positive and negative classes. An area of 1.0 represents a model that made all predictions perfectly. An area of 0.5 represents a model that is as good as random. ROC can be broken down into sensitivity and specificity. A binary classification problem is really a trade-off between sensitivity and specificity.
Sensitivity is the true positive rate, also called the recall. It is the number of instances from the positive (first) class that were actually predicted correctly.
Specificity is also called the true negative rate. It is the number of instances from the negative (second) class that were actually predicted correctly.
From our results below, we can see the AUC is relatively close to 1 and greater than 0.5, suggesting some skill in the predictions.
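Sensitivity and specificity can also be computed directly from a confusion matrix. Below is a minimal sketch, assuming X_train, X_test, Y_train and Y_test from the earlier train_test_split example (these names are assumptions here, not part of the AUC cell below):

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# assumption: X_train, X_test, Y_train, Y_test come from the earlier train_test_split example
clf = LogisticRegression(solver='lbfgs', max_iter=1000)
clf.fit(X_train, Y_train)
tn, fp, fn, tp = confusion_matrix(Y_test, clf.predict(X_test)).ravel()
sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate
print("Sensitivity: %.3f" % sensitivity)
print("Specificity: %.3f" % specificity)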
In [87]:
# Cross Validation Classification Accuracy

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LogisticRegression



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



kfold = KFold(n_splits=10, random_state=None)

model = LogisticRegression(solver='lbfgs', max_iter=1000)

scoring = 'accuracy'

results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring) 

print("Accuracy: %.3f (%.3f)" % (results.mean(), results.std()))

In [88]:
# Cross Validation Classification ROC AUC

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LogisticRegression



# load data
AUC: 0.828 (0.043)

Task 3: Confusion Matrix
The confusion matrix is a handy presentation of the accuracy of a model with two or more classes. The table presents predictions on the x-axis and true outcomes on the y-axis. The cells of the table are the number of predictions made by a machine learning algorithm.
For example, a machine learning algorithm can predict 0 or 1, and each prediction may actually have been a 0 or 1. Predictions for 0 that were actually 0 appear in the cell for prediction = 0 and actual = 0, whereas predictions for 0 that were actually 1 appear in the cell for prediction = 0 and actual = 1. And so on.
Below is an example of calculating a confusion matrix for a set of predictions by a Logistic Regression on the Pima Indians onset of diabetes dataset.
[[142 20]
[ 34 58]]

<matplotlib.axes._subplots.AxesSubplot at 0x7f76eb824a10>
9.9 Performance Metrics of Machine Learning Algorithms (Part 02)
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



kfold = KFold(n_splits=10, random_state=None)

model = LogisticRegression(solver='lbfgs', max_iter=1000)

scoring = 'roc_auc'

results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)

print("AUC: %.3f (%.3f)" % (results.mean(), results.std()))

In [114]:
# Cross Validation Classification Confusion Matrix

import pandas as pd
import numpy as np
import seaborn as sns

from sklearn.model_selection import train_test_split

from sklearn.linear_model import LogisticRegression

from sklearn.metrics import confusion_matrix



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]

test_size = 0.33

seed = 7



X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size,random_state=seed)

model = LogisticRegression(solver='lbfgs', max_iter=1000) 

model.fit(X_train, Y_train)

predicted = model.predict(X_test)

matrix = confusion_matrix(Y_test, predicted) 

print(matrix)



# visualize the results

group_names = ['True Neg','False Pos','False Neg','True Pos']
group_counts = ["{0:0.0f}".format(value) for value in

matrix.flatten()]

group_percentages = ["{0:.2%}".format(value) for value in

matrix.flatten()/np.sum(matrix)]



labels = [f"{v1}\n{v2}\n{v3}" for v1, v2, v3 in

zip(group_names,group_counts,group_percentages)]



labels = np.asarray(labels).reshape(2,2)

sns.heatmap(matrix, annot=labels, fmt='', cmap='Blues')

Out[114]:
In the previous lesson, we learned about performance metrics for classification problems. In this lesson we will review 3 of the most common metrics for evaluating predictions on regression machine learning problems:
Mean Absolute Error
Mean Squared Error
R-Squared
Task 1: Mean Absolute Error
The Mean Absolute Error (or MAE) is the sum of the absolute differences between predictions and actual values. It gives an idea of how wrong the predictions were. The measure gives an idea of the magnitude of the error, but no idea of the direction (e.g. over- or under-predicting). A value of 0 indicates no error or perfect predictions. Like log loss, this metric is inverted by the cross_val_score() function.
The example below demonstrates calculating the mean absolute error on the Boston house price dataset. This dataset was taken from the StatLib library and is maintained by Carnegie Mellon University. This dataset concerns the housing prices in the city of Boston. The dataset provided has 506 instances with 13 features.
MAE: -4.005 (2.084)

In [117]:
# Cross Validation Regression MAE

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LinearRegression



# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO',

'B', 'LSTAT', 'MEDV']

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"

data = pd.read_csv(URL, delim_whitespace=True, names=names) 

array = data.values



# separate array into input and output components

X = array[:,0:13]

Y = array[:,13]



kfold = KFold(n_splits=10, random_state=None)

model = LinearRegression()



scoring = 'neg_mean_absolute_error'

results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring) 

print("MAE: %.3f (%.3f)" % (results.mean(), results.std()))
Task 2: Mean Squared Error
The Mean Squared Error (or MSE) is much like the mean absolute error in that it provides a gross idea of the magnitude of error. Taking the square root of the mean squared error converts the units back to the original units of the output variable and can be meaningful for description and presentation. This is called the Root Mean Squared Error (or RMSE). The example below provides a demonstration of calculating mean squared error.
Note: MSE may be less robust than MAE, since the squaring of the errors will enforce a higher importance on outliers. But when outliers are exponentially rare (like in a bell-shaped curve), the MSE performs very well and is generally preferred.
MSE: -34.705 (45.574)
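The RMSE mentioned above can be obtained by taking the square root of the (negated) MSE scores. A minimal sketch, assuming the results array produced by the neg_mean_squared_error cross-validation cell below:

import numpy as np

# assumption: 'results' holds the neg_mean_squared_error scores from cross_val_score below
rmse_scores = np.sqrt(-results)   # negate first, because scikit-learn reports MSE as a negative score
print("RMSE: %.3f (%.3f)" % (rmse_scores.mean(), rmse_scores.std()))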

Task 3: R-Squared
The R² (or R-Squared) metric provides an indication of the goodness of fit of a set of predictions to the actual values. In statistical literature this measure is called the coefficient of determination. It is a value between 0 and 1 for no fit and perfect fit, respectively. The example below provides a demonstration of calculating the mean R² for a set of predictions. R² is defined as

$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$

where:
$y_i$: value of observation $i$
$\hat{y}_i$: predicted value of $y$ for observation $i$
$\bar{y}$: mean value of $y$
In [118]:
# Cross Validation Regression MSE

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LinearRegression



# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO',

'B', 'LSTAT', 'MEDV']

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"

data = pd.read_csv(URL, delim_whitespace=True, names=names) 

array = data.values



# separate array into input and output components

X = array[:,0:13]

Y = array[:,13]



kfold = KFold(n_splits=10, random_state=None)

model = LinearRegression()



scoring = 'neg_mean_squared_error'

results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring) 

print("MSE: %.3f (%.3f)" % (results.mean(), results.std()))

In [120]:
# Cross Validation Regression R-Squared

import pandas as pd

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LinearRegression



# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO',

'B', 'LSTAT', 'MEDV']

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"

data = pd.read_csv(URL, delim_whitespace=True, names=names) 

array = data.values

# separate array into input and output components
X = array[:,0:13]
Y = array[:,13]

kfold = KFold(n_splits=10, random_state=None)
model = LinearRegression()

scoring = 'r2'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("R-Squared: %.3f (%.3f)" % (results.mean(), results.std()))
R-Squared: 0.203 (0.595)

9.10 Implementing Machine Learning Algorithms
When we work on a machine learning project, we often end up with multiple good models to choose from. Each model will have different performance characteristics. Using methods like cross-validation, we can get an estimate of how accurate each model may be on unseen data. We need to be able to use these estimates to choose one or two best models from the suite of models that we have created. When we have a new dataset, it is a good idea to visualize the data using different techniques in order to look at the data from different perspectives.
The same idea applies to model selection. We should use a number of different ways of looking at the estimated accuracy of our machine learning algorithms in order to choose the one or two algorithms to finalize.
Task 1: Comparing Machine Learning Algorithms
In the example below, five different classification algorithms are compared on a single dataset:
Linear Discriminant Analysis.
k-Nearest Neighbors.
Classification and Regression Trees.
Naive Bayes.
Support Vector Machines.
The dataset is the Pima Indians onset of diabetes problem. The problem has two classes and eight numeric input variables of varying scales. The 10-fold cross-validation procedure is used to evaluate each algorithm, importantly configured with the same random seed to ensure that the same splits of the training data are performed and that each algorithm is evaluated in precisely the same way. Each algorithm is given a short name, useful for summarizing results afterward.

In [128]:
# Compare Algorithms

import pandas as pd

from matplotlib import pyplot

from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score

from sklearn.linear_model import LogisticRegression

from sklearn.tree import DecisionTreeClassifier

from sklearn.neighbors import KNeighborsClassifier

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

from sklearn.naive_bayes import GaussianNB

from sklearn.svm import SVC



# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class'] 

URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"

data = pd.read_csv(URL,names=names)

array = data.values



# separate array into input and output components

X = array[:,0:8]

Y = array[:,8]



# prepare models
models = []

models.append(('LDA', LinearDiscriminantAnalysis()))

models.append(('KNN', KNeighborsClassifier()))

models.append(('CART', DecisionTreeClassifier()))

models.append(('NB', GaussianNB()))

models.append(('SVM', SVC()))



# evaluate each model in turn

results = []

names = []

scoring = 'accuracy'

for name, model in models:
    kfold = KFold(n_splits=10, random_state=None)
    cv_results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)



# boxplot algorithm comparison

fig = pyplot.figure() 

fig.suptitle('Algorithm Comparison') 

ax = fig.add_subplot(111) 

pyplot.boxplot(results) 

ax.set_xticklabels(names) 

pyplot.show()
LDA: 0.773462 (0.051592)

KNN: 0.726555 (0.061821)

CART: 0.691302 (0.072112)

NB: 0.755178 (0.042766)

SVM: 0.760424 (0.052931)

Task 2: Algorithm Tuning
Algorithm tuning is a final step in the process of applied machine learning before finalizing our model. It is sometimes called hyperparameter optimization, where the algorithm parameters are referred to as hyperparameters, whereas the coefficients found by the machine learning algorithm itself are referred to as parameters. Optimization suggests the search nature of the problem. Phrased as a search problem, you can use different search strategies to find a good and robust parameter or set of parameters for an algorithm on a given problem. First, we will show an implementation of a Support Vector Machine without Grid Search.
We will use the breast cancer dataset from the scikit-learn library. This is a binary classification dataset. It has no missing attributes or null values. The class distribution is as follows:
212: malignant
357: benign
More information can be found here: https://scikit-learn.org/stable/datasets/toy_dataset.html#breast-cancer-dataset
In [11]:
# Classification Without Algorithm Optimization

import pandas as pd

import numpy as np

from sklearn.metrics import classification_report, confusion_matrix

from sklearn.model_selection import train_test_split

from sklearn.datasets import load_breast_cancer

from sklearn.svm import SVC



# load data
cancer = load_breast_cancer()

X = pd.DataFrame(cancer['data'],

columns = cancer['feature_names'])



# cancer column is our target

Y = pd.DataFrame(cancer['target'],

columns =['Cancer'])





X_train, X_test, y_train, y_test = train_test_split(X, np.ravel(Y),test_size = 0.30, random_state = 7)



# train the model on train set
model = SVC()
model.fit(X_train, y_train)

# print prediction results
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))
precision recall f1-score support

0 1.00 0.78 0.88 55

1 0.91 1.00 0.95 116

accuracy 0.93 171

macro avg 0.95 0.89 0.91 171

weighted avg 0.94 0.93 0.93 171

Grid Search is an approach to parameter tuning that will methodically build and evaluate a model for each combination of algorithm parameters specified in a grid.
One of the great things about GridSearchCV is that it is a meta-estimator. It takes an estimator like SVC and creates a new estimator that behaves exactly the same, in this case like a classifier. You should add refit=True and set verbose to whatever number you want; the higher the number, the more verbose the output (verbose just means the text output describing the process).
Fitting 5 folds for each of 25 candidates, totalling 125 fits

[CV 1/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ......C=0.1, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .....C=0.1, gamma=0.01, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END .....C=0.1, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END .....C=0.1, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .....C=0.1, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .....C=0.1, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ....C=0.1, gamma=0.001, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ....C=0.1, gamma=0.001, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ....C=0.1, gamma=0.001, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ....C=0.1, gamma=0.001, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ....C=0.1, gamma=0.001, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ...C=0.1, gamma=0.0001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END ...C=0.1, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ...C=0.1, gamma=0.0001, kernel=rbf;, score=0.938 total time= 0.0s

[CV 4/5] END ...C=0.1, gamma=0.0001, kernel=rbf;, score=0.861 total time= 0.0s

[CV 5/5] END ...C=0.1, gamma=0.0001, kernel=rbf;, score=0.962 total time= 0.0s

[CV 1/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s


In [24]:
# Classification Grid Search Optimization

import pandas as pd

import numpy as np

from sklearn.metrics import classification_report, confusion_matrix

from sklearn.model_selection import train_test_split

from sklearn.model_selection import GridSearchCV

from sklearn.datasets import load_breast_cancer

from sklearn.svm import SVC



# load data
cancer = load_breast_cancer()

X = pd.DataFrame(cancer['data'],

columns = cancer['feature_names'])



# cancer column is our target

Y = pd.DataFrame(cancer['target'],

columns =['Cancer'])



X_train, X_test, y_train, y_test = train_test_split(X, np.ravel(Y),test_size = 0.30, random_state = 7)



# defining parameter range

param_grid = {'C': [0.1, 1, 10, 100, 1000],

'gamma': [1, 0.1, 0.01, 0.001, 0.0001],

'kernel': ['rbf']}



svc_grid = GridSearchCV(SVC(), param_grid, verbose = 3)



# fitting the model for grid search

svc_grid.fit(X_train, y_train)



# print best parameter after tuning

print(svc_grid.best_params_)



# print how our model looks after hyper-parameter tuning

print(svc_grid.best_estimator_)


svc_grid_predictions = svc_grid.predict(X_test)



# print classification report

print(classification_report(y_test, svc_grid_predictions))
[CV 4/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ..........C=1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ........C=1, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .......C=1, gamma=0.01, kernel=rbf;, score=0.625 total time= 0.0s

[CV 2/5] END .......C=1, gamma=0.01, kernel=rbf;, score=0.613 total time= 0.0s

[CV 3/5] END .......C=1, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .......C=1, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .......C=1, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ......C=1, gamma=0.001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END ......C=1, gamma=0.001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ......C=1, gamma=0.001, kernel=rbf;, score=0.938 total time= 0.0s

[CV 4/5] END ......C=1, gamma=0.001, kernel=rbf;, score=0.873 total time= 0.0s

[CV 5/5] END ......C=1, gamma=0.001, kernel=rbf;, score=0.962 total time= 0.0s

[CV 1/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 2/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 3/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s

[CV 4/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.911 total time= 0.0s

[CV 5/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s

[CV 1/5] END .........C=10, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END .........C=10, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END .........C=10, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .........C=10, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .........C=10, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .......C=10, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ......C=10, gamma=0.01, kernel=rbf;, score=0.637 total time= 0.0s

[CV 2/5] END ......C=10, gamma=0.01, kernel=rbf;, score=0.613 total time= 0.0s

[CV 3/5] END ......C=10, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ......C=10, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ......C=10, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .....C=10, gamma=0.001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END .....C=10, gamma=0.001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END .....C=10, gamma=0.001, kernel=rbf;, score=0.887 total time= 0.0s

[CV 4/5] END .....C=10, gamma=0.001, kernel=rbf;, score=0.848 total time= 0.0s

[CV 5/5] END .....C=10, gamma=0.001, kernel=rbf;, score=0.962 total time= 0.0s

[CV 1/5] END ....C=10, gamma=0.0001, kernel=rbf;, score=0.900 total time= 0.0s

[CV 2/5] END ....C=10, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ....C=10, gamma=0.0001, kernel=rbf;, score=0.963 total time= 0.0s

[CV 4/5] END ....C=10, gamma=0.0001, kernel=rbf;, score=0.911 total time= 0.0s

[CV 5/5] END ....C=10, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s

[CV 1/5] END ........C=100, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ........C=100, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ........C=100, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ........C=100, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ........C=100, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ......C=100, gamma=0.1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END ......C=100, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END ......C=100, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ......C=100, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ......C=100, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .....C=100, gamma=0.01, kernel=rbf;, score=0.637 total time= 0.0s

[CV 2/5] END .....C=100, gamma=0.01, kernel=rbf;, score=0.613 total time= 0.0s

[CV 3/5] END .....C=100, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .....C=100, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .....C=100, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ....C=100, gamma=0.001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END ....C=100, gamma=0.001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ....C=100, gamma=0.001, kernel=rbf;, score=0.887 total time= 0.0s

[CV 4/5] END ....C=100, gamma=0.001, kernel=rbf;, score=0.848 total time= 0.0s

[CV 5/5] END ....C=100, gamma=0.001, kernel=rbf;, score=0.962 total time= 0.0s

[CV 1/5] END ...C=100, gamma=0.0001, kernel=rbf;, score=0.938 total time= 0.0s

[CV 2/5] END ...C=100, gamma=0.0001, kernel=rbf;, score=0.887 total time= 0.0s

[CV 3/5] END ...C=100, gamma=0.0001, kernel=rbf;, score=0.950 total time= 0.0s

[CV 4/5] END ...C=100, gamma=0.0001, kernel=rbf;, score=0.886 total time= 0.0s

[CV 5/5] END ...C=100, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s

[CV 1/5] END .......C=1000, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END .......C=1000, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END .......C=1000, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .......C=1000, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .......C=1000, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END .....C=1000, gamma=0.1, kernel=rbf;, score=0.613 total time= 0.0s

[CV 2/5] END .....C=1000, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 3/5] END .....C=1000, gamma=0.1, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END .....C=1000, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END .....C=1000, gamma=0.1, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ....C=1000, gamma=0.01, kernel=rbf;, score=0.637 total time= 0.0s

[CV 2/5] END ....C=1000, gamma=0.01, kernel=rbf;, score=0.613 total time= 0.0s

[CV 3/5] END ....C=1000, gamma=0.01, kernel=rbf;, score=0.600 total time= 0.0s

[CV 4/5] END ....C=1000, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 5/5] END ....C=1000, gamma=0.01, kernel=rbf;, score=0.608 total time= 0.0s

[CV 1/5] END ...C=1000, gamma=0.001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END ...C=1000, gamma=0.001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ...C=1000, gamma=0.001, kernel=rbf;, score=0.887 total time= 0.0s

[CV 4/5] END ...C=1000, gamma=0.001, kernel=rbf;, score=0.848 total time= 0.0s

[CV 5/5] END ...C=1000, gamma=0.001, kernel=rbf;, score=0.962 total time= 0.0s

[CV 1/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.925 total time= 0.0s

[CV 2/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s

[CV 3/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.938 total time= 0.0s

[CV 4/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.899 total time= 0.0s
[CV 5/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.962 total time= 0.0s

{'C': 1, 'gamma': 0.0001, 'kernel': 'rbf'}

SVC(C=1, gamma=0.0001)

              precision    recall  f1-score   support

           0       0.93      0.93      0.93        55
           1       0.97      0.97      0.97       116

    accuracy                           0.95       171
   macro avg       0.95      0.95      0.95       171
weighted avg       0.95      0.95      0.95       171
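Beyond best_params_ and best_estimator_, the fitted GridSearchCV object also keeps the full cross-validation results in its cv_results_ attribute. A minimal sketch of how these could be inspected, assuming the svc_grid object fitted above is still in scope:

import pandas as pd

# cv_results_ is a dict of arrays; as a DataFrame it is easy to rank every
# parameter combination by its mean cross-validated test score
cv_results = pd.DataFrame(svc_grid.cv_results_)
print(cv_results[['param_C', 'param_gamma', 'mean_test_score', 'std_test_score']]
      .sort_values('mean_test_score', ascending=False)
      .head())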

9.11 Neural Networks with Keras
This code is a demonstration of a shallow neural network on the MNIST dataset. MNIST is a standard dataset used in most deep learning tutorials. In this code, a three-layer shallow neural network is used for handwritten digit classification. The first layer is the input layer, consisting of 784 nodes. The second layer is a hidden layer with 64 sigmoid neurons. The last layer is an output layer with 10 softmax neurons.
Task 1: Loading Dependencies
Task 2: Loading MNIST Dataset
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz

11493376/11490434 [==============================] - 0s 0us/step

11501568/11490434 [==============================] - 0s 0us/step

In [28]:
import keras

from keras.datasets import mnist

from keras.models import Sequential

from keras.layers import Dense 



#additional code for keras 2.4.0

from keras.utils import np_utils



#from keras.optimizers import SGD #deprecated

from keras.optimizers import gradient_descent_v2 

#then use it : sgd = gradient_descent_v2.SGD(...)



from matplotlib import pyplot as plt

In [29]:
(X_train, y_train), (X_valid, y_valid) = mnist.load_data()

In [30]:
X_train.shape

Out[30]:
(60000, 28, 28)
In [31]:
y_train.shape

Out[31]:
(60000,)
In [32]:
X_valid.shape

Out[32]:
(10000, 28, 28)
In [33]:
y_valid.shape

Out[33]:
(10000,)
In [34]:
y_train[0:12]

Out[34]:
array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5], dtype=uint8)
In [35]:
plt.figure(figsize=(4,4))

for k in range(12):

plt.subplot(3, 4, k+1)

plt.imshow(X_train[k], cmap='Greys')

plt.axis('off')

plt.tight_layout()

plt.show

Out[35]:
<function matplotlib.pyplot.show>

Task 3: Data Preprocessing
Reshaping the data from 2D to 1D
Normalizing the data (to the range 0 to 1)
Converting integer labels to one-hot encoding. We arrange the labels with such one-hot encodings so that they line up with the 10 probabilities being output by the final layer of our artificial neural network. They represent the ideal output that we are striving to attain with our network: if the input image is a handwritten seven, then a perfectly trained network would output a probability of 1.00 that it is a seven and a probability of 0.00 for each of the other nine classes of digits.
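As a concrete illustration of the one-hot encoding described above, the snippet below encodes the single label 7, using the same np_utils import as cell In [28] below; note that this exact import path is an assumption tied to the Keras version used in this notebook.

from keras.utils import np_utils

# the digit 7 becomes a length-10 vector with 1.0 at index 7 and 0.0 elsewhere
print(np_utils.to_categorical(7, 10))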
Task 4: Designing Neural Networks
In the first line of code, we instantiate the simplest type of neural network model object, the Sequential type, and, in a dash of extreme creativity, name the model model.
In the second line, we use the add() method of our model object to specify the attributes of our network's hidden layer (64 sigmoid-type artificial neurons in the general-purpose, fully connected arrangement defined by the Dense() method) as well as the shape of our input layer (a one-dimensional array of length 784).
In the third and final line, we use the add() method again to specify the output layer and its parameters: 10 artificial neurons of the softmax variety, corresponding to the 10 probabilities (one for each of the 10 possible digits) that the network will output when fed a given handwritten image.
Model: "sequential"

_________________________________________________________________

Layer (type) Output Shape Param # 

=================================================================

In [36]:
plt.imshow(X_valid[0], cmap='Greys')

Out[36]:
<matplotlib.image.AxesImage at 0x7fd8a48451d0>
In [37]:
print(y_valid[0])

7

In [38]:
X_train = X_train.reshape(60000, 784).astype('float32')

X_valid = X_valid.reshape(10000, 784).astype('float32')

In [39]:
X_train = X_train/255

X_valid = X_valid/255

In [40]:
n_classes = 10

y_train = keras.utils.np_utils.to_categorical(y_train, n_classes)

y_valid = keras.utils.np_utils.to_categorical(y_valid, n_classes)

In [41]:
model = Sequential()

model.add(Dense(64, activation='sigmoid', input_shape=(784,)))

model.add(Dense(10, activation='softmax'))

In [42]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 50240



dense_1 (Dense) (None, 10) 650 



=================================================================

Total params: 50,890

Trainable params: 50,890

Non-trainable params: 0

_________________________________________________________________

Task 5: Training the Neural Network
val_loss is the value of the cost function on your cross-validation data and loss is the value of the cost function on your training data. When monitoring val_loss (Keras validation loss) and val_acc (Keras validation accuracy), several cases are possible:
val_loss starts increasing, val_acc starts decreasing. This means the model is memorizing values rather than learning.
val_loss starts increasing, val_acc also increases. This could be a case of overfitting, or of diverse probability values in cases where softmax is used in the output layer.
val_loss starts decreasing, val_acc starts increasing. This is fine; it means the model being built is learning and working well.
In [43]:
model.compile(loss='mean_squared_error', optimizer=gradient_descent_v2.SGD(learning_rate=0.01), metrics=['accuracy'])

In [44]:
model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1, validation_data=(X_valid, y_valid))

Epoch 1/200

469/469 [==============================] - 4s 6ms/step - loss: 0.0934 - accuracy: 0.0876 - val_loss: 0.0923 - val_accuracy:
0.0817

Epoch 2/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0918 - accuracy: 0.0729 - val_loss: 0.0914 - val_accuracy:
0.0776

Epoch 3/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0911 - accuracy: 0.0851 - val_loss: 0.0907 - val_accuracy:
0.1058

Epoch 4/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0905 - accuracy: 0.1191 - val_loss: 0.0902 - val_accuracy:
0.1451

Epoch 5/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0901 - accuracy: 0.1545 - val_loss: 0.0898 - val_accuracy:
0.1825

Epoch 6/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0897 - accuracy: 0.1913 - val_loss: 0.0894 - val_accuracy:
0.2185

Epoch 7/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0893 - accuracy: 0.2349 - val_loss: 0.0890 - val_accuracy:
0.2681

Epoch 8/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0889 - accuracy: 0.2873 - val_loss: 0.0887 - val_accuracy:
0.3167

Epoch 9/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0886 - accuracy: 0.3287 - val_loss: 0.0883 - val_accuracy:
0.3528

Epoch 10/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0882 - accuracy: 0.3571 - val_loss: 0.0879 - val_accuracy:
0.3769

Epoch 11/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0878 - accuracy: 0.3771 - val_loss: 0.0875 - val_accuracy:
0.3999

Epoch 12/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0874 - accuracy: 0.3942 - val_loss: 0.0871 - val_accuracy:
0.4152

Epoch 13/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0871 - accuracy: 0.4087 - val_loss: 0.0868 - val_accuracy:
0.4281

Epoch 14/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0867 - accuracy: 0.4207 - val_loss: 0.0864 - val_accuracy:
0.4385

Epoch 15/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0863 - accuracy: 0.4273 - val_loss: 0.0860 - val_accuracy:
0.4460

Epoch 16/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0859 - accuracy: 0.4313 - val_loss: 0.0856 - val_accuracy:
0.4500

Epoch 17/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0855 - accuracy: 0.4340 - val_loss: 0.0852 - val_accuracy:
0.4524

Epoch 18/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0851 - accuracy: 0.4360 - val_loss: 0.0848 - val_accuracy:
0.4534

Epoch 19/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0847 - accuracy: 0.4370 - val_loss: 0.0843 - val_accuracy:
0.4531

Epoch 20/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0843 - accuracy: 0.4366 - val_loss: 0.0839 - val_accuracy:
0.4531

Epoch 21/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0839 - accuracy: 0.4365 - val_loss: 0.0835 - val_accuracy:
0.4535

Epoch 22/200

469/469 [==============================] - 3s 6ms/step - loss: 0.0834 - accuracy: 0.4367 - val_loss: 0.0830 - val_accuracy:
0.4534

Epoch 23/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0830 - accuracy: 0.4383 - val_loss: 0.0825 - val_accuracy:
0.4539

Epoch 24/200

469/469 [==============================] - 3s 5ms/step - loss: 0.0825 - accuracy: 0.4392 - val_loss: 0.0821 - val_accuracy:
0.4560

Epoch 25/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0820 - accuracy: 0.4409 - val_loss: 0.0816 - val_accuracy:
0.4584

Epoch 26/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0816 - accuracy: 0.4436 - val_loss: 0.0811 - val_accuracy:
0.4620

Epoch 27/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0811 - accuracy: 0.4462 - val_loss: 0.0806 - val_accuracy:
0.4651

Epoch 28/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0806 - accuracy: 0.4507 - val_loss: 0.0801 - val_accuracy:
0.4690

Epoch 29/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0801 - accuracy: 0.4549 - val_loss: 0.0796 - val_accuracy:
0.4726

Epoch 30/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0796 - accuracy: 0.4604 - val_loss: 0.0791 - val_accuracy:
0.4784

Epoch 31/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0791 - accuracy: 0.4650 - val_loss: 0.0786 - val_accuracy:
0.4839

Epoch 32/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0786 - accuracy: 0.4707 - val_loss: 0.0781 - val_accuracy:
0.4896

Epoch 33/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0781 - accuracy: 0.4771 - val_loss: 0.0775 - val_accuracy:
0.4965

Epoch 34/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0776 - accuracy: 0.4847 - val_loss: 0.0770 - val_accuracy:
0.5025

Epoch 35/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0770 - accuracy: 0.4903 - val_loss: 0.0765 - val_accuracy:
0.5084

Epoch 36/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0765 - accuracy: 0.4980 - val_loss: 0.0760 - val_accuracy:
0.5149

Epoch 37/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0760 - accuracy: 0.5052 - val_loss: 0.0754 - val_accuracy:
0.5212

Epoch 38/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0755 - accuracy: 0.5124 - val_loss: 0.0749 - val_accuracy:
0.5284

Epoch 39/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0750 - accuracy: 0.5200 - val_loss: 0.0744 - val_accuracy:
0.5350

Epoch 40/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0744 - accuracy: 0.5284 - val_loss: 0.0738 - val_accuracy:
0.5417

Epoch 41/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0739 - accuracy: 0.5342 - val_loss: 0.0733 - val_accuracy:
0.5499

Epoch 42/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0734 - accuracy: 0.5411 - val_loss: 0.0728 - val_accuracy:
0.5573

Epoch 43/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0729 - accuracy: 0.5484 - val_loss: 0.0723 - val_accuracy:
0.5628

Epoch 44/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0724 - accuracy: 0.5551 - val_loss: 0.0717 - val_accuracy:
0.5694

Epoch 45/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0718 - accuracy: 0.5608 - val_loss: 0.0712 - val_accuracy:
0.5759

Epoch 46/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0713 - accuracy: 0.5661 - val_loss: 0.0707 - val_accuracy:
0.5814

Epoch 47/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0708 - accuracy: 0.5719 - val_loss: 0.0702 - val_accuracy:
0.5866

Epoch 48/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0703 - accuracy: 0.5780 - val_loss: 0.0696 - val_accuracy:
0.5914

Epoch 49/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0698 - accuracy: 0.5832 - val_loss: 0.0691 - val_accuracy:
0.5958

Epoch 50/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0693 - accuracy: 0.5873 - val_loss: 0.0686 - val_accuracy:
0.6011

Epoch 51/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0688 - accuracy: 0.5917 - val_loss: 0.0681 - val_accuracy:
0.6055

Epoch 52/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0683 - accuracy: 0.5950 - val_loss: 0.0676 - val_accuracy:
0.6109

Epoch 53/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0677 - accuracy: 0.6004 - val_loss: 0.0670 - val_accuracy:
0.6153

Epoch 54/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0672 - accuracy: 0.6036 - val_loss: 0.0665 - val_accuracy:
0.6183

Epoch 55/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0667 - accuracy: 0.6071 - val_loss: 0.0660 - val_accuracy:
0.6200

Epoch 56/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0662 - accuracy: 0.6110 - val_loss: 0.0655 - val_accuracy:
0.6223

Epoch 57/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0657 - accuracy: 0.6143 - val_loss: 0.0650 - val_accuracy:
0.6255

Epoch 58/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0652 - accuracy: 0.6176 - val_loss: 0.0645 - val_accuracy:
0.6282

Epoch 59/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0647 - accuracy: 0.6208 - val_loss: 0.0640 - val_accuracy:
0.6301

Epoch 60/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0643 - accuracy: 0.6237 - val_loss: 0.0635 - val_accuracy:
0.6325

Epoch 61/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0638 - accuracy: 0.6260 - val_loss: 0.0630 - val_accuracy:
0.6353

Epoch 62/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0633 - accuracy: 0.6295 - val_loss: 0.0625 - val_accuracy:
0.6371

Epoch 63/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0628 - accuracy: 0.6315 - val_loss: 0.0620 - val_accuracy:
0.6398

Epoch 64/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0623 - accuracy: 0.6335 - val_loss: 0.0616 - val_accuracy:
0.6422

Epoch 65/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0619 - accuracy: 0.6364 - val_loss: 0.0611 - val_accuracy:
0.6446

Epoch 66/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0614 - accuracy: 0.6390 - val_loss: 0.0606 - val_accuracy:
0.6461

Epoch 67/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0609 - accuracy: 0.6410 - val_loss: 0.0601 - val_accuracy:
0.6489

Epoch 68/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0605 - accuracy: 0.6435 - val_loss: 0.0597 - val_accuracy:
0.6513

Epoch 69/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0600 - accuracy: 0.6461 - val_loss: 0.0592 - val_accuracy:
0.6527

Epoch 70/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0596 - accuracy: 0.6476 - val_loss: 0.0588 - val_accuracy:
0.6546

Epoch 71/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0591 - accuracy: 0.6501 - val_loss: 0.0583 - val_accuracy:
0.6565

Epoch 72/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0587 - accuracy: 0.6519 - val_loss: 0.0579 - val_accuracy:
0.6577

Epoch 73/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0582 - accuracy: 0.6545 - val_loss: 0.0574 - val_accuracy:
0.6604

Epoch 74/200

469/469 [==============================] - 3s 6ms/step - loss: 0.0578 - accuracy: 0.6563 - val_loss: 0.0570 - val_accuracy:
0.6635

Epoch 75/200

469/469 [==============================] - 3s 6ms/step - loss: 0.0574 - accuracy: 0.6588 - val_loss: 0.0566 - val_accuracy:
0.6652

Epoch 76/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0570 - accuracy: 0.6614 - val_loss: 0.0561 - val_accuracy:
0.6679

Epoch 77/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0565 - accuracy: 0.6633 - val_loss: 0.0557 - val_accuracy:
0.6702

Epoch 78/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0561 - accuracy: 0.6661 - val_loss: 0.0553 - val_accuracy:
0.6723

Epoch 79/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0557 - accuracy: 0.6681 - val_loss: 0.0549 - val_accuracy:
0.6748

Epoch 80/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0553 - accuracy: 0.6708 - val_loss: 0.0545 - val_accuracy:
0.6785

Epoch 81/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0549 - accuracy: 0.6735 - val_loss: 0.0541 - val_accuracy:
0.6810

Epoch 82/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0545 - accuracy: 0.6760 - val_loss: 0.0537 - val_accuracy:
0.6846

Epoch 83/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0541 - accuracy: 0.6787 - val_loss: 0.0533 - val_accuracy:
0.6867

Epoch 84/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0538 - accuracy: 0.6813 - val_loss: 0.0529 - val_accuracy:
0.6890

Epoch 85/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0534 - accuracy: 0.6838 - val_loss: 0.0525 - val_accuracy:
0.6918

Epoch 86/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0530 - accuracy: 0.6859 - val_loss: 0.0521 - val_accuracy:
0.6939

Epoch 87/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0526 - accuracy: 0.6890 - val_loss: 0.0518 - val_accuracy:
0.6965

Epoch 88/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0523 - accuracy: 0.6919 - val_loss: 0.0514 - val_accuracy:
0.6994

Epoch 89/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0519 - accuracy: 0.6943 - val_loss: 0.0510 - val_accuracy:
0.7030

Epoch 90/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0515 - accuracy: 0.6980 - val_loss: 0.0507 - val_accuracy:
0.7066

Epoch 91/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0512 - accuracy: 0.7019 - val_loss: 0.0503 - val_accuracy:
0.7097

Epoch 92/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0508 - accuracy: 0.7043 - val_loss: 0.0500 - val_accuracy:
0.7124

Epoch 93/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0505 - accuracy: 0.7077 - val_loss: 0.0496 - val_accuracy:
0.7159

Epoch 94/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0502 - accuracy: 0.7104 - val_loss: 0.0493 - val_accuracy:
0.7192

Epoch 95/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0498 - accuracy: 0.7129 - val_loss: 0.0489 - val_accuracy:
0.7231

Epoch 96/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0495 - accuracy: 0.7159 - val_loss: 0.0486 - val_accuracy:
0.7253

Epoch 97/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0491 - accuracy: 0.7186 - val_loss: 0.0482 - val_accuracy:
0.7278

Epoch 98/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0488 - accuracy: 0.7212 - val_loss: 0.0479 - val_accuracy:
0.7313

Epoch 99/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0485 - accuracy: 0.7237 - val_loss: 0.0476 - val_accuracy:
0.7338

Epoch 100/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0482 - accuracy: 0.7269 - val_loss: 0.0473 - val_accuracy:
0.7367

Epoch 101/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0479 - accuracy: 0.7291 - val_loss: 0.0469 - val_accuracy:
0.7399

Epoch 102/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0476 - accuracy: 0.7317 - val_loss: 0.0466 - val_accuracy:
0.7426

Epoch 103/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0472 - accuracy: 0.7345 - val_loss: 0.0463 - val_accuracy:
0.7446

Epoch 104/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0469 - accuracy: 0.7366 - val_loss: 0.0460 - val_accuracy:
0.7469

Epoch 105/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0466 - accuracy: 0.7389 - val_loss: 0.0457 - val_accuracy:
0.7504

Epoch 106/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0463 - accuracy: 0.7415 - val_loss: 0.0454 - val_accuracy:
0.7523

Epoch 107/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0460 - accuracy: 0.7440 - val_loss: 0.0451 - val_accuracy:
0.7540

Epoch 108/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0457 - accuracy: 0.7464 - val_loss: 0.0448 - val_accuracy:
0.7565

Epoch 109/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0454 - accuracy: 0.7484 - val_loss: 0.0445 - val_accuracy:
0.7589

Epoch 110/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0452 - accuracy: 0.7505 - val_loss: 0.0442 - val_accuracy:
0.7620

Epoch 111/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0449 - accuracy: 0.7527 - val_loss: 0.0439 - val_accuracy:
0.7644

Epoch 112/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0446 - accuracy: 0.7551 - val_loss: 0.0436 - val_accuracy:
0.7664

Epoch 113/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0443 - accuracy: 0.7572 - val_loss: 0.0434 - val_accuracy:
0.7700

Epoch 114/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0440 - accuracy: 0.7595 - val_loss: 0.0431 - val_accuracy:
0.7724

Epoch 115/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0438 - accuracy: 0.7617 - val_loss: 0.0428 - val_accuracy:
0.7742

Epoch 116/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0435 - accuracy: 0.7635 - val_loss: 0.0425 - val_accuracy:
0.7765

Epoch 117/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0432 - accuracy: 0.7658 - val_loss: 0.0423 - val_accuracy:
0.7784

Epoch 118/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0430 - accuracy: 0.7684 - val_loss: 0.0420 - val_accuracy:
0.7801

Epoch 119/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0427 - accuracy: 0.7704 - val_loss: 0.0417 - val_accuracy:
0.7816

Epoch 120/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0424 - accuracy: 0.7728 - val_loss: 0.0415 - val_accuracy:
0.7846

Epoch 121/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0422 - accuracy: 0.7747 - val_loss: 0.0412 - val_accuracy:
0.7867

Epoch 122/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0419 - accuracy: 0.7769 - val_loss: 0.0409 - val_accuracy:
0.7885

Epoch 123/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0417 - accuracy: 0.7782 - val_loss: 0.0407 - val_accuracy:
0.7905

Epoch 124/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0414 - accuracy: 0.7807 - val_loss: 0.0404 - val_accuracy:
0.7916

Epoch 125/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0412 - accuracy: 0.7829 - val_loss: 0.0402 - val_accuracy:
0.7932

Epoch 126/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0409 - accuracy: 0.7849 - val_loss: 0.0399 - val_accuracy:
0.7949

Epoch 127/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0407 - accuracy: 0.7867 - val_loss: 0.0397 - val_accuracy:
0.7964

Epoch 128/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0404 - accuracy: 0.7886 - val_loss: 0.0395 - val_accuracy:
0.7976

Epoch 129/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0402 - accuracy: 0.7907 - val_loss: 0.0392 - val_accuracy:
0.8000

Epoch 130/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0400 - accuracy: 0.7923 - val_loss: 0.0390 - val_accuracy:
0.8008

Epoch 131/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0397 - accuracy: 0.7939 - val_loss: 0.0387 - val_accuracy:
0.8022

Epoch 132/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0395 - accuracy: 0.7956 - val_loss: 0.0385 - val_accuracy:
0.8047

Epoch 133/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0393 - accuracy: 0.7969 - val_loss: 0.0383 - val_accuracy:
0.8062

Epoch 134/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0390 - accuracy: 0.7987 - val_loss: 0.0380 - val_accuracy:
0.8082

Epoch 135/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0388 - accuracy: 0.8007 - val_loss: 0.0378 - val_accuracy:
0.8098

Epoch 136/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0386 - accuracy: 0.8022 - val_loss: 0.0376 - val_accuracy:
0.8113

Epoch 137/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0384 - accuracy: 0.8038 - val_loss: 0.0374 - val_accuracy:
0.8136

Epoch 138/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0382 - accuracy: 0.8055 - val_loss: 0.0372 - val_accuracy:
0.8152

Epoch 139/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0379 - accuracy: 0.8071 - val_loss: 0.0369 - val_accuracy:
0.8171

Epoch 140/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0377 - accuracy: 0.8087 - val_loss: 0.0367 - val_accuracy:
0.8183

Epoch 141/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0375 - accuracy: 0.8102 - val_loss: 0.0365 - val_accuracy:
0.8201

Epoch 142/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0373 - accuracy: 0.8116 - val_loss: 0.0363 - val_accuracy:
0.8213

Epoch 143/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0371 - accuracy: 0.8130 - val_loss: 0.0361 - val_accuracy:
0.8222

Epoch 144/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0369 - accuracy: 0.8146 - val_loss: 0.0359 - val_accuracy:
0.8240

Epoch 145/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0367 - accuracy: 0.8160 - val_loss: 0.0357 - val_accuracy:
0.8249

Epoch 146/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0365 - accuracy: 0.8171 - val_loss: 0.0355 - val_accuracy:
0.8258

Epoch 147/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0363 - accuracy: 0.8187 - val_loss: 0.0353 - val_accuracy:
0.8270

Epoch 148/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0361 - accuracy: 0.8201 - val_loss: 0.0351 - val_accuracy:
0.8280

Epoch 149/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0359 - accuracy: 0.8212 - val_loss: 0.0349 - val_accuracy:
0.8293

Epoch 150/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0357 - accuracy: 0.8221 - val_loss: 0.0347 - val_accuracy:
0.8309

Epoch 151/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0355 - accuracy: 0.8232 - val_loss: 0.0345 - val_accuracy:
0.8334

Epoch 152/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0353 - accuracy: 0.8246 - val_loss: 0.0343 - val_accuracy:
0.8349

Epoch 153/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0352 - accuracy: 0.8257 - val_loss: 0.0341 - val_accuracy:
0.8358

Epoch 154/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0350 - accuracy: 0.8270 - val_loss: 0.0339 - val_accuracy:
0.8367

Epoch 155/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0348 - accuracy: 0.8281 - val_loss: 0.0338 - val_accuracy:
0.8372

Epoch 156/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0346 - accuracy: 0.8290 - val_loss: 0.0336 - val_accuracy:
0.8391

Epoch 157/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0344 - accuracy: 0.8302 - val_loss: 0.0334 - val_accuracy:
0.8401

Epoch 158/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0342 - accuracy: 0.8312 - val_loss: 0.0332 - val_accuracy:
0.8410

Epoch 159/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0341 - accuracy: 0.8319 - val_loss: 0.0330 - val_accuracy:
0.8423

Epoch 160/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0339 - accuracy: 0.8329 - val_loss: 0.0329 - val_accuracy:
0.8433

Epoch 161/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0337 - accuracy: 0.8337 - val_loss: 0.0327 - val_accuracy:
0.8441

Epoch 162/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0336 - accuracy: 0.8346 - val_loss: 0.0325 - val_accuracy:
0.8452

Epoch 163/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0334 - accuracy: 0.8355 - val_loss: 0.0324 - val_accuracy:
0.8459

Epoch 164/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0332 - accuracy: 0.8364 - val_loss: 0.0322 - val_accuracy:
0.8468

Epoch 165/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0331 - accuracy: 0.8374 - val_loss: 0.0320 - val_accuracy:
0.8476

Epoch 166/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0329 - accuracy: 0.8380 - val_loss: 0.0319 - val_accuracy:
0.8482

Epoch 167/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0327 - accuracy: 0.8389 - val_loss: 0.0317 - val_accuracy:
0.8492

Epoch 168/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0326 - accuracy: 0.8395 - val_loss: 0.0315 - val_accuracy:
0.8503

Epoch 169/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0324 - accuracy: 0.8406 - val_loss: 0.0314 - val_accuracy:
0.8513

Epoch 170/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0323 - accuracy: 0.8412 - val_loss: 0.0312 - val_accuracy:
0.8522

Epoch 171/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0321 - accuracy: 0.8418 - val_loss: 0.0311 - val_accuracy:
0.8530

Epoch 172/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0320 - accuracy: 0.8425 - val_loss: 0.0309 - val_accuracy:
0.8541

Epoch 173/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0318 - accuracy: 0.8430 - val_loss: 0.0308 - val_accuracy:
0.8547

Epoch 174/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0317 - accuracy: 0.8437 - val_loss: 0.0306 - val_accuracy:
0.8556

Epoch 175/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0315 - accuracy: 0.8443 - val_loss: 0.0305 - val_accuracy:
0.8559

Epoch 176/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0314 - accuracy: 0.8451 - val_loss: 0.0303 - val_accuracy:
0.8563

Epoch 177/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0312 - accuracy: 0.8456 - val_loss: 0.0302 - val_accuracy:
0.8573

Epoch 178/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0311 - accuracy: 0.8463 - val_loss: 0.0301 - val_accuracy:
0.8575

Epoch 179/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0310 - accuracy: 0.8470 - val_loss: 0.0299 - val_accuracy:
0.8579

Epoch 180/200

469/469 [==============================] - 2s 4ms/step - loss: 0.0308 - accuracy: 0.8475 - val_loss: 0.0298 - val_accuracy:
0.8583

Epoch 181/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0307 - accuracy: 0.8479 - val_loss: 0.0296 - val_accuracy:
0.8588

Epoch 182/200

469/469 [==============================] - 2s 5ms/step - loss: 0.0305 - accuracy: 0.8485 - val_loss: 0.0295 - val_accuracy:
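The training log above (truncated here) shows loss decreasing and accuracy increasing on both the training and validation sets, which corresponds to the third case described in Task 5. One way to visualize these curves is to capture the History object that model.fit() returns. This is only a sketch, assuming the model, data, and matplotlib import from the cells above are still in scope; the variable name history is illustrative:

from matplotlib import pyplot as plt  # already imported as plt in In [28]

# history.history is a dict with one list per metric:
# 'loss', 'accuracy', 'val_loss', and 'val_accuracy'
history = model.fit(X_train, y_train, batch_size=128, epochs=200,
                    verbose=0, validation_data=(X_valid, y_valid))

plt.plot(history.history['loss'], label='loss')
plt.plot(history.history['val_loss'], label='val_loss')
plt.xlabel('epoch')
plt.ylabel('mean squared error')
plt.legend()
plt.show()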
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
VIP Call Girls Service Hitech City Hyderabad Call +91-8250192130
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile serviceCall Girls Delhi {Jodhpur} 9711199012 high profile service
Call Girls Delhi {Jodhpur} 9711199012 high profile service
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
(MEERA) Dapodi Call Girls Just Call 7001035870 [ Cash on Delivery ] Pune Escorts
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 

Modul Topik 9 - Kecerdasan Buatan

The objective with this dataset is to predict, from diagnostic measurements, whether a patient will develop diabetes within five years; as such, it is a classification problem.

Several constraints were placed on the selection of these instances from a larger database. In particular, all patients here are females at least 21 years old of Pima Indian heritage. The attributes are:

[preg]  -- Pregnancies: number of times pregnant
[plas]  -- Glucose: plasma glucose concentration at 2 hours in an oral glucose tolerance test
[pres]  -- BloodPressure: diastolic blood pressure (mm Hg)
[skin]  -- SkinThickness: triceps skin fold thickness (mm)
[test]  -- Insulin: 2-hour serum insulin (mu U/ml)
[mass]  -- BMI: body mass index (weight in kg/(height in m)^2)
[pedi]  -- DiabetesPedigreeFunction: diabetes pedigree function
[age]   -- Age: age (years)
[class] -- Outcome: class variable, 0 is no onset of diabetes while 1 is onset of diabetes

It is a good dataset for demonstration because all of the input attributes are numeric and the output variable to be predicted is binary (0 or 1). More detailed information about the dataset can be seen here: https://www.kaggle.com/datasets/uciml/pima-indians-diabetes-database

Task 1: Downloading dataset from Github using NumPy Library

We can load our CSV data using NumPy and the numpy.loadtxt() function. This function assumes no header row and that all data has the same format. The example below assumes that the file pima-indians-diabetes.data.csv is loaded from an online URL. The results are listed in rows then columns. You can see that the dataset has 768 rows and 9 columns.

In [89]:
from numpy import loadtxt
from urllib.request import urlopen

# load data
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
raw_data = urlopen(URL)
dataset = loadtxt(raw_data, delimiter=",")
print(dataset.shape)

(768, 9)
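Later cells in this module repeatedly separate the data into input features and the output class. As a quick usage sketch (the variable names X and y are only illustrative, not part of the original notebook), the same split can be done directly on the NumPy array loaded above:

# assuming `dataset` is the (768, 9) array loaded in the cell above
X = dataset[:, 0:8]   # eight diagnostic attributes
y = dataset[:, 8]     # class label: 0 = no onset, 1 = onset of diabetes
print(X.shape, y.shape)   # expected: (768, 8) (768,)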
Task 2: Downloading dataset from Github using Pandas Library

We can also load our CSV data using Pandas and the pandas.read_csv() function. This function is very flexible and is perhaps the most recommended approach for loading our machine learning data. The function returns a pandas.DataFrame that we can immediately start summarizing and plotting. Note that in this example we explicitly specify the names of each attribute for the DataFrame.

In [90]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
print(data.shape)

(768, 9)

Task 3: Using descriptive statistics to understand your data

There is no substitute for looking at the raw data. Looking at the raw data can reveal insights that we cannot get any other way. It can also plant seeds that may later grow into ideas on how to better pre-process and handle the data for machine learning tasks. We can review the first 20 rows of our data using the head() function on the Pandas data. We can see that the first column lists the row number, which is handy for referencing a specific observation.

In [91]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
peek = data.head(20)
print(peek)

    preg  plas  pres  skin  test  mass   pedi  age  class
0      6   148    72    35     0  33.6  0.627   50      1
1      1    85    66    29     0  26.6  0.351   31      0
2      8   183    64     0     0  23.3  0.672   32      1
3      1    89    66    23    94  28.1  0.167   21      0
4      0   137    40    35   168  43.1  2.288   33      1
5      5   116    74     0     0  25.6  0.201   30      0
6      3    78    50    32    88  31.0  0.248   26      1
7     10   115     0     0     0  35.3  0.134   29      0
8      2   197    70    45   543  30.5  0.158   53      1
9      8   125    96     0     0   0.0  0.232   54      1
10     4   110    92     0     0  37.6  0.191   30      0
11    10   168    74     0     0  38.0  0.537   34      1
12    10   139    80     0     0  27.1  1.441   57      0
13     1   189    60    23   846  30.1  0.398   59      1
14     5   166    72    19   175  25.8  0.587   51      1
15     7   100     0     0     0  30.0  0.484   32      1
16     0   118    84    47   230  45.8  0.551   31      1
17     7   107    74     0     0  29.6  0.254   31      1
18     1   103    30    38    83  43.3  0.183   33      0
19     1   115    70    30    96  34.6  0.529   32      1

Task 4: Observing type of data for each attribute

The type of each attribute is important. Strings may need to be converted to floating point values or integers to represent categorical or ordinal values. We can get an idea of the types of attributes by peeking at the raw data. We can also list the data types to characterize each attribute using the dtypes property.

In [92]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
types = data.dtypes
print(types)

preg       int64
plas       int64
pres       int64
skin       int64
test       int64
mass     float64
pedi     float64
age        int64
class      int64
dtype: object
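All Pima columns are already numeric, but Task 4 notes that string columns would need converting before modeling. As a small illustration (the column name below is hypothetical and is not part of the Pima dataset), pandas' astype() can perform such a conversion:

import pandas as pd

# hypothetical example: a column stored as strings that should be numeric
df = pd.DataFrame({'glucose_as_text': ['148', '85', '183']})
df['glucose_as_text'] = df['glucose_as_text'].astype('float64')
print(df.dtypes)   # the column is now float64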
Task 5: Descriptive statistics

Descriptive statistics can give us great insight into the properties of each attribute. Often we can create more summaries than we have time to review. The describe() function on the Pandas data lists 8 statistical properties of each attribute:

- Count
- Mean
- Standard Deviation
- Minimum Value
- 25th Percentile
- 50th Percentile (Median)
- 75th Percentile
- Maximum Value

We will note some calls to pandas.set_option() in the recipe to change the precision of the numbers and the preferred width of the output. This is to make it more readable for this example. Note that the display.width parameter sets the width of the display in characters. In case Python is running in a terminal, this can be set to None and Pandas will correctly auto-detect the width.

When describing our data this way, it is worth taking some time to review observations from the results. This might include the presence of NA values for missing data or surprising distributions for attributes.

In [93]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
pd.set_option('display.width', 100)
pd.set_option('precision', 3)
description = data.describe()
print(description)

          preg     plas     pres     skin     test     mass     pedi      age    class
count  768.000  768.000  768.000  768.000  768.000  768.000  768.000  768.000  768.000
mean     3.845  120.895   69.105   20.536   79.799   31.993    0.472   33.241    0.349
std      3.370   31.973   19.356   15.952  115.244    7.884    0.331   11.760    0.477
min      0.000    0.000    0.000    0.000    0.000    0.000    0.078   21.000    0.000
25%      1.000   99.000   62.000    0.000    0.000   27.300    0.244   24.000    0.000
50%      3.000  117.000   72.000   23.000   30.500   32.000    0.372   29.000    0.000
75%      6.000  140.250   80.000   32.000  127.250   36.600    0.626   41.000    1.000
max     17.000  199.000  122.000   99.000  846.000   67.100    2.420   81.000    1.000

9.2 Preparing Machine Learning Data

Task 1: Checking the distribution of the data

On classification problems we need to know how balanced the class values are. Highly imbalanced problems (a lot more observations for one class than another) are common and may need special handling in the data preparation stage of our project. We can quickly get an idea of the distribution of the class attribute in Pandas. We can see that there are nearly double the number of observations with class 0 (no onset of diabetes) than there are with class 1 (onset of diabetes).

In [94]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
class_counts = data.groupby('class').size()
print(class_counts)

class
0    500
1    268
dtype: int64

Task 2: Checking correlation of data

Correlation refers to the relationship between two variables and how they may or may not change together. The most common method for calculating correlation is Pearson's Correlation Coefficient, which assumes a normal distribution of the attributes involved. A correlation of -1 or 1 shows a full negative or positive correlation respectively, whereas a value of 0 shows no correlation at all. Some machine learning algorithms like linear and logistic regression can suffer poor performance if there are highly correlated attributes in your dataset. As such, it is a good idea to review all of the pairwise correlations of the attributes in our dataset. We can use the corr() function on the Pandas data to calculate a correlation matrix.

The matrix lists all attributes across the top and down the side, to give the correlation between all pairs of attributes (twice, because the matrix is symmetrical). We can see the diagonal line through the matrix from the top-left to the bottom-right corner shows the perfect correlation of each attribute with itself.

In [95]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
pd.set_option('display.width', 100)
pd.set_option('precision', 3)
correlations = data.corr(method='pearson')
print(correlations)

        preg   plas   pres   skin   test   mass   pedi    age  class
preg   1.000  0.129  0.141 -0.082 -0.074  0.018 -0.034  0.544  0.222
plas   0.129  1.000  0.153  0.057  0.331  0.221  0.137  0.264  0.467
pres   0.141  0.153  1.000  0.207  0.089  0.282  0.041  0.240  0.065
skin  -0.082  0.057  0.207  1.000  0.437  0.393  0.184 -0.114  0.075
test  -0.074  0.331  0.089  0.437  1.000  0.198  0.185 -0.042  0.131
mass   0.018  0.221  0.282  0.393  0.198  1.000  0.141  0.036  0.293
pedi  -0.034  0.137  0.041  0.184  0.185  0.141  1.000  0.034  0.174
age    0.544  0.264  0.240 -0.114 -0.042  0.036  0.034  1.000  0.238
class  0.222  0.467  0.065  0.075  0.131  0.293  0.174  0.238  1.000
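Because highly correlated attributes can hurt linear and logistic regression, it can help to list the attribute pairs whose absolute correlation exceeds some threshold. This is a minimal sketch (the 0.5 threshold is an arbitrary illustrative choice, not a value prescribed by this module) that reuses the correlations DataFrame computed above:

# assuming `correlations` is the DataFrame produced by data.corr(method='pearson') above
threshold = 0.5
cols = correlations.columns
for i in range(len(cols)):
    for j in range(i + 1, len(cols)):
        r = correlations.iloc[i, j]
        if abs(r) > threshold:
            # report the pair and its correlation coefficient
            print(cols[i], cols[j], round(r, 3))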
Task 3: Skew of univariate distributions

Skew refers to a distribution that is assumed Gaussian (normal or bell curve) being shifted or squashed in one direction or another. Many machine learning algorithms assume a Gaussian distribution. Knowing that an attribute has a skew may allow us to perform data preparation to correct the skew and later improve the accuracy of our models. We can calculate the skew of each attribute using the skew() function on the Pandas data. The skew results show a positive (right) or negative (left) skew. Values closer to zero show less skew.

In [96]:
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
skew = data.skew()
print(skew)

preg     0.902
plas     0.174
pres    -1.844
skin     0.109
test     2.272
mass    -0.429
pedi     1.920
age      1.130
class    0.635
dtype: float64
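As a follow-up to the skew scores above, one common way to reduce a strong positive skew in a non-negative attribute is a log transform. The sketch below is only illustrative (which attributes to transform, and whether to transform them at all, is a judgment call not prescribed by this module); it applies numpy.log1p to the strongly right-skewed test and pedi columns and reports the skew again:

import numpy as np
import pandas as pd

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)

# log1p keeps zero values valid and usually pulls in a long right tail
for col in ['test', 'pedi']:
    data[col + '_log'] = np.log1p(data[col])
print(data[['test_log', 'pedi_log']].skew())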
How to use our statistical results

- Review the numbers. Generating the summary statistics is not enough. Take a moment to pause, read and really think about the numbers you are seeing.
- Ask why. Review your numbers and ask a lot of questions. How and why are you seeing specific values? Think about how the numbers relate to the problem domain in general and to the specific entities that the observations relate to.
- Write down ideas. Write down your observations and ideas. Keep a small text file or notepad and jot down all of the ideas for how variables may relate, for what the numbers mean, and ideas for techniques to try later. The things you write down now while the data is fresh will be very valuable later when you are trying to think up new things to try.

9.3 Data Visualization (Part 01)

We must understand our data in order to get the best results from machine learning algorithms. The fastest way to learn more about our data is to use data visualization. In this lesson we will discover exactly how we can visualize our machine learning data in Python using Pandas. First, we will learn three univariate plots:

- Histograms
- Density Plots
- Box and Whisker Plots

In the subsequent part, we will learn some multivariate plots, including:

- Correlation Matrix Plots
- Scatter Plot Matrix

Task 1: Plotting Histograms

A fast way to get an idea of the distribution of each attribute is to look at histograms. Histograms group data into bins and provide us a count of the number of observations in each bin. From the shape of the bins we can quickly get a feeling for whether an attribute is Gaussian, skewed or even has an exponential distribution. It can also help us see possible outliers.

We can see that perhaps the attributes age, pedi and test may have an exponential distribution. We can also see that perhaps the mass, pres and plas attributes may have a Gaussian or nearly Gaussian distribution. This is interesting because many machine learning techniques assume a Gaussian univariate distribution on the input variables.

In [97]:
# Univariate Histograms
import matplotlib.pyplot as plt
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
fig = plt.figure(figsize=(12, 12))
ax = fig.gca()
data.hist(ax=ax)
plt.show()

(Running this cell may print a UserWarning saying "To output multiple subplots, the figure containing the passed axes is being cleared"; the warning can be ignored.)

Task 2: Plotting Density Plots

Density plots are another way of getting a quick idea of the distribution of each attribute. The plots look like an abstracted histogram with a smooth curve drawn through the top of each bin, much like our eye tried to do with the histograms. We can see the distribution of each attribute more clearly than with the histograms.

In [98]:
# Univariate Density Plots
import matplotlib.pyplot as plt
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
fig = plt.figure(figsize=(12, 12))
ax = fig.gca()
data.plot(kind='density', subplots=True, layout=(3, 3), sharex=False, ax=ax)
plt.show()

(The same UserWarning about the figure being cleared may appear here as well.)
Task 3: Plotting Box and Whisker Plots

Another useful way to review the distribution of each attribute is to use Box and Whisker Plots, or boxplots for short. Boxplots summarize the distribution of each attribute, drawing a line for the median (middle value) and a box around the 25th and 75th percentiles (the middle 50% of the data). The whiskers give an idea of the spread of the data, and dots outside of the whiskers show candidate outlier values (values that are 1.5 times greater than the size of the spread of the middle 50% of the data).

We can see that the spread of attributes is quite different. Some, like age, test and skin, appear quite skewed towards smaller values.

In [99]:
# Univariate Box and Whisker Plots
import matplotlib.pyplot as plt
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
fig = plt.figure(figsize=(12, 12))
ax = fig.gca()
data.plot(kind='box', subplots=True, layout=(3, 3), sharex=False, ax=ax)
plt.show()

(As before, a UserWarning about the figure being cleared may appear and can be ignored.)
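The box plots flag points beyond 1.5 times the interquartile range (IQR) as candidate outliers. The same rule can be checked numerically; the sketch below simply counts such candidates per attribute, mirroring the 1.5 x IQR convention described above (it is an illustration, not a step in the original notebook):

import pandas as pd

names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)

# quartiles and interquartile range per column
q1 = data.quantile(0.25)
q3 = data.quantile(0.75)
iqr = q3 - q1
# count values falling outside the 1.5 * IQR whiskers for each attribute
outliers = ((data < q1 - 1.5 * iqr) | (data > q3 + 1.5 * iqr)).sum()
print(outliers)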
9.4 Data Visualization (Part 02)

This lecture provides examples of two plots that show the interactions between multiple variables in your dataset:

- Correlation Matrix Plot
- Scatter Plot Matrix

Task 1: Plotting a Correlation Matrix Plot

Correlation gives an indication of how related the changes are between two variables. If two variables change in the same direction they are positively correlated. If they change in opposite directions together (one goes up, one goes down), then they are negatively correlated.

We can calculate the correlation between each pair of attributes. This is called a correlation matrix. We can then plot the correlation matrix and get an idea of which variables have a high correlation with each other. This is useful to know, because some machine learning algorithms like linear and logistic regression can have poor performance if there are highly correlated input variables in our data.

We can see that the matrix is symmetrical, i.e. the bottom left of the matrix is the same as the top right. This is useful as we can see two different views on the same data in one plot. We can also see that each variable is perfectly positively correlated with itself (as you would have expected) in the diagonal line from top left to bottom right.

In [100]:
# Correlation Matrix Plot
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
correlations = data.corr()
# plot correlation matrix
fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111)
cax = ax.matshow(correlations, vmin=-1, vmax=1)
fig.colorbar(cax)
ticks = np.arange(0, 9, 1)
ax.set_xticks(ticks)
ax.set_yticks(ticks)
ax.set_xticklabels(names)
ax.set_yticklabels(names)
plt.show()
# print correlation value
pd.set_option('display.width', 100)
pd.set_option('precision', 3)
correlations = data.corr(method='pearson')
print(correlations)

        preg   plas   pres   skin   test   mass   pedi    age  class
preg   1.000  0.129  0.141 -0.082 -0.074  0.018 -0.034  0.544  0.222
plas   0.129  1.000  0.153  0.057  0.331  0.221  0.137  0.264  0.467
pres   0.141  0.153  1.000  0.207  0.089  0.282  0.041  0.240  0.065
skin  -0.082  0.057  0.207  1.000  0.437  0.393  0.184 -0.114  0.075
test  -0.074  0.331  0.089  0.437  1.000  0.198  0.185 -0.042  0.131
mass   0.018  0.221  0.282  0.393  0.198  1.000  0.141  0.036  0.293
pedi  -0.034  0.137  0.041  0.184  0.185  0.141  1.000  0.034  0.174
age    0.544  0.264  0.240 -0.114 -0.042  0.036  0.034  1.000  0.238
class  0.222  0.467  0.065  0.075  0.131  0.293  0.174  0.238  1.000

Task 2: Plotting a Scatter Plot Matrix

A scatter plot shows the relationship between two variables as dots in two dimensions, one axis for each attribute. We can create a scatter plot for each pair of attributes in our data. Drawing all these scatter plots together is called a scatter plot matrix.

Scatter plots are useful for spotting structured relationships between variables, like whether we could summarize the relationship between two variables with a line. Attributes with structured relationships may also be correlated and good candidates for removal from your dataset. (The original slide shows a figure here illustrating the typical scatter plot shapes for different correlation values between two variables.)
Like the Correlation Matrix Plot above, the scatter plot matrix is symmetrical. This is useful for looking at the pairwise relationships from different perspectives. Because there is little point in drawing a scatter plot of each variable with itself, the diagonal shows histograms of each attribute.

In [101]:
# Scatter Plot Matrix
import matplotlib.pyplot as plt
import pandas as pd

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
pd.plotting.scatter_matrix(data, figsize=(12, 12))
plt.show()
# print correlation value
pd.set_option('display.width', 100)
pd.set_option('precision', 3)
correlations = data.corr(method='pearson')
print(correlations)

(The printed correlation values are the same table as shown in Task 1 above.)

9.5 Data Preparation and Transformation

Many machine learning algorithms make assumptions about our data. It is often a very good idea to prepare our data in such a way as to best expose the structure of the problem to the machine learning algorithms that we intend to use. A difficulty is that different algorithms make different assumptions about our data and may require different transforms. Further, when we follow all of the rules and prepare our data, sometimes algorithms can deliver better results without pre-processing.

Task 1: Rescaling the data

When our data is comprised of attributes with varying scales, many machine learning algorithms can benefit from rescaling the attributes so that they all have the same scale. Often this is referred to as normalization, and attributes are often rescaled into the range between 0 and 1. This is useful for optimization algorithms used in the core of machine learning algorithms, such as gradient descent. It is also useful for algorithms that weight inputs, like regression and neural networks, and algorithms that use distance measures, like k-Nearest Neighbors. We can rescale our data with scikit-learn using the MinMaxScaler class. After rescaling we can see that all of the values are in the range between 0 and 1.
In [102]:
# Rescale Data (between 0 and 1)
from sklearn.preprocessing import MinMaxScaler
import pandas as pd
from numpy import set_printoptions

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# rescaling the data
scaler = MinMaxScaler(feature_range=(0, 1))
rescaledX = scaler.fit_transform(X)
# summarize transformed data
set_printoptions(precision=3)
print(rescaledX[0:5,:])

[[0.353 0.744 0.59  0.354 0.    0.501 0.234 0.483]
 [0.059 0.427 0.541 0.293 0.    0.396 0.117 0.167]
 [0.471 0.92  0.525 0.    0.    0.347 0.254 0.183]
 [0.059 0.447 0.541 0.232 0.111 0.419 0.038 0.   ]
 [0.    0.688 0.328 0.354 0.199 0.642 0.944 0.2  ]]

Task 2: Standardizing the data

Standardization is a useful technique to transform attributes with a Gaussian distribution and differing means and standard deviations to a standard Gaussian distribution with a mean of 0 and a standard deviation of 1. It is most suitable for techniques that assume a Gaussian distribution in the input variables and work better with rescaled data, such as linear regression, logistic regression and linear discriminant analysis. We can standardize data using scikit-learn with the StandardScaler class.

In [ ]:
# Standardize Data (0 mean, 1 stdev)
from sklearn.preprocessing import StandardScaler
import pandas as pd
from numpy import set_printoptions

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# standardize the data
scaler = StandardScaler().fit(X)
rescaledX = scaler.transform(X)
# summarize transformed data
set_printoptions(precision=3)
print(rescaledX[0:5,:])

[[ 0.64   0.848  0.15   0.907 -0.693  0.204  0.468  1.426]
 [-0.845 -1.123 -0.161  0.531 -0.693 -0.684 -0.365 -0.191]
 [ 1.234  1.944 -0.264 -1.288 -0.693 -1.103  0.604 -0.106]
 [-0.845 -0.998 -0.161  0.155  0.123 -0.494 -0.921 -1.042]
 [-1.142  0.504 -1.505  0.907  0.766  1.41   5.485 -0.02 ]]

Task 3: Normalizing the data

Normalizing in scikit-learn refers to rescaling each observation (row) to have a length of 1 (called a unit norm, or a vector with a length of 1 in linear algebra). This pre-processing method can be useful for sparse datasets (lots of zeros) with attributes of varying scales when using algorithms that weight input values, such as neural networks, and algorithms that use distance measures, such as k-Nearest Neighbors. We can normalize data in Python with scikit-learn using the Normalizer class.

In [ ]:
# Normalize Data (length of 1)
from sklearn.preprocessing import Normalizer
import pandas as pd
from numpy import set_printoptions

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# normalize the data
scaler = Normalizer().fit(X)
normalizedX = scaler.transform(X)
# summarize transformed data
set_printoptions(precision=3)
print(normalizedX[0:5,:])

[[0.034 0.828 0.403 0.196 0.    0.188 0.004 0.28 ]
 [0.008 0.716 0.556 0.244 0.    0.224 0.003 0.261]
 [0.04  0.924 0.323 0.    0.    0.118 0.003 0.162]
 [0.007 0.588 0.436 0.152 0.622 0.186 0.001 0.139]
 [0.    0.596 0.174 0.152 0.731 0.188 0.01  0.144]]

9.6 Feature Selection

The data features that we use to train our machine learning models have a huge influence on the performance we can achieve. In this lesson we will discover automatic feature selection techniques that we can use to prepare our machine learning data in Python with scikit-learn.

Feature selection is a process where you automatically select those features in our data that contribute most to the prediction variable or output in which you are interested. Having irrelevant features in our data can decrease the accuracy of many models, especially linear algorithms like linear and logistic regression. Three benefits of performing feature selection before modeling our data are:

- Reduces overfitting: less redundant data means less opportunity to make decisions based on noise.
- Improves accuracy: less misleading data means modeling accuracy improves.
- Reduces training time: less data means that algorithms train faster.

After completing this lesson you will know how to use:

- Univariate Selection
- Recursive Feature Elimination
- Feature Importance

Task 1: Univariate Selection

Statistical tests can be used to select those features that have the strongest relationship with the output variable. The scikit-learn library provides the SelectKBest class that can be used with a suite of different statistical tests to select a specific number of features. The example below uses the Chi-squared (χ²) statistical test for non-negative features to select 4 of the best features from the Pima Indians onset of diabetes dataset.

In [ ]:
# Feature selection with Univariate Statistical Tests (Chi-squared for classification)
import pandas as pd
from numpy import set_printoptions
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.feature_selection import chi2

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# select four features and create new pandas dataframe
selector = SelectKBest(score_func=chi2, k=4).fit(X, Y)
f = selector.get_support(1)
data_new = data[data.columns[f]]
print("Selected features with first 5 entries:")
print(data_new.head(5))
# show chi-square scores for the selected features
print('\n')
print("Chi-square scores of the selected features:")
x_new = selector.transform(X)  # not needed to get the score
scores = selector.scores_
print(data.columns[f])
print(scores[f])

Selected features with first 5 entries:
   plas  test  mass  age
0   148     0  33.6   50
1    85     0  26.6   31
2   183     0  23.3   32
3    89    94  28.1   21
4   137   168  43.1   33

Chi-square scores of the selected features:
Index(['plas', 'test', 'mass', 'age'], dtype='object')
[1411.887 2175.565  127.669  181.304]

Task 2: Recursive Feature Elimination

Recursive Feature Elimination (or RFE) works by recursively removing attributes and building a model on those attributes that remain. It uses the model accuracy to identify which attributes (and combinations of attributes) contribute the most to predicting the target attribute.
The example below uses RFE with the logistic regression algorithm to select the top 3 features. The choice of algorithm does not matter too much as long as it is skillful and consistent.

In [ ]:
# Feature Selection with RFE
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# feature selection
rfe = RFE(estimator=LogisticRegression(solver='lbfgs', max_iter=1000), n_features_to_select=3)
fit = rfe.fit(X, Y)
print("Number of selected features via RFE: %d" % fit.n_features_)
print("Boolean of selected features: %s" % fit.support_)
print("Feature ranking: %s" % fit.ranking_)
print("Selected features: ")
idx = 0
for x in fit.ranking_:
    if x == 1:
        print(data.columns[idx])
    idx += 1

Number of selected features via RFE: 3
Boolean of selected features: [ True False False False False  True  True False]
Feature ranking: [1 2 4 6 5 1 1 3]
Selected features:
preg
mass
pedi

Task 3: Feature Importance

Feature importance refers to techniques that assign a score to input features based on how useful they are at predicting a target variable. Most importance scores are calculated by a predictive model that has been fit on the dataset. Inspecting the importance scores provides insight into that specific model, and into which features are the most and least important to the model when making a prediction. This is a type of model interpretation that can be performed for those models that support it.

Bagged decision trees like Random Forest and Extra Trees can be used to estimate the importance of features. In the example below we construct an ExtraTreesClassifier for the Pima Indians onset of diabetes dataset. We can see that we are given an importance score for each attribute where the larger the score, the more important the attribute. The scores highlight the importance of plas, age and mass.

In [62]:
# Feature Importance with Extra Trees Classifier
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier
import numpy as np
import matplotlib.pyplot as plt

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
# feature selection
model = ExtraTreesClassifier()
model.fit(X, Y)
importance_sorted = np.sort(model.feature_importances_)
print("Features with importance scores:")
idx = 0
for x in model.feature_importances_:
    print(data.columns[idx], x)
    idx += 1
# show bar plot
features = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age']
plt.bar(features, model.feature_importances_)
plt.show()

Features with importance scores:
preg 0.10917902521438591
plas 0.23778795159254987
pres 0.09677965067606348
skin 0.07938108481610057
test 0.0715765118317984
mass 0.1418165024365237
pedi 0.11855796799091706
age 0.14492130544166104
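If we want to act on these importance scores rather than just inspect them, scikit-learn's SelectFromModel can keep only the attributes whose score exceeds a threshold (the mean importance by default). This is a minimal sketch built on the fitted ExtraTreesClassifier above; it is an illustration, not a step required by the module:

from sklearn.feature_selection import SelectFromModel

# assuming `model`, `X` and `names` come from the feature-importance cell above
selector = SelectFromModel(model, prefit=True)   # default threshold: mean importance
X_selected = selector.transform(X)
print("Selected columns:", [names[i] for i in selector.get_support(indices=True)])
print("Reduced shape:", X_selected.shape)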
9.7 Performance Evaluation of Machine Learning Algorithms

We need to know how well our algorithms perform on unseen data. The best way to evaluate the performance of an algorithm would be to make predictions for new data to which we already know the answers. The second best way is to use clever techniques from statistics, called resampling methods, that allow us to make accurate estimates of how well our algorithm will perform on new data.

Task 1: Splitting Data into Train and Test Sets

The simplest method that we can use to evaluate the performance of a machine learning algorithm is to use different training and testing datasets. We can take our original dataset and split it into two parts: train the algorithm on the first part, make predictions on the second part and evaluate the predictions against the expected results. The size of the split can depend on the size and specifics of your dataset, although it is common to use 67% of the data for training and the remaining 33% for testing.

This algorithm evaluation technique is very fast. It is ideal for large datasets (millions of records) where there is strong evidence that both splits of the data are representative of the underlying problem. Because of the speed, it is useful to use this approach when the algorithm we are investigating is slow to train. A downside of this technique is that it can have a high variance. This means that differences in the training and test datasets can result in meaningful differences in the estimate of accuracy. In the example below we split the Pima Indians dataset into 67%/33% splits for training and test and evaluate the accuracy of a Logistic Regression model.

Note that in addition to specifying the size of the split, we also specify the random seed. Because the split of the data is random, we want to ensure that the results are reproducible. By specifying the random seed we ensure that we get the same random numbers each time we run the code, and in turn the same split of data. This is important if we want to compare this result to the estimated accuracy of another machine learning algorithm or the same algorithm with a different configuration. To ensure the comparison is apples-for-apples, we must ensure that they are trained and tested on exactly the same data.

In [75]:
# Evaluate using a train and a test set
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
model.fit(X_train, Y_train)
result = model.score(X_test, Y_test)
print("Accuracy: %.3f%%" % (result*100.0))

Accuracy: 78.740%

Task 2: K-Fold Cross Validation

Cross-validation is an approach that we can use to estimate the performance of a machine learning algorithm with less variance than a single train-test set split. It works by splitting the dataset into k parts (e.g. k=5 or k=10). Each split of the data is called a fold. The algorithm is trained on k-1 folds with one held back, and tested on the held-back fold. This is repeated so that each fold of the dataset is given a chance to be the held-back test set. After running cross-validation we end up with k different performance scores that we can summarize using a mean and a standard deviation.

The result is a more reliable estimate of the performance of the algorithm on new data. It is more accurate because the algorithm is trained and evaluated multiple times on different data. The choice of k must allow the size of each test partition to be large enough to be a reasonable sample of the problem, whilst allowing enough repetitions of the train-test evaluation of the algorithm to provide a fair estimate of the algorithm's performance on unseen data.
For modest sized datasets in the thousands or tens of thousands of records, k values of 3, 5 and 10 are common. In the example below we use 10-fold cross-validation. We report both the mean and the standard deviation of the performance measure. When summarizing performance measures, it is good practice to summarize the distribution of the measures, in this case assuming a Gaussian distribution of performance (a very reasonable assumption) and recording the mean and standard deviation.

In [85]:
# Evaluate using cross validation
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
kfold = KFold(n_splits=10, random_state=None)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
results = cross_val_score(model, X, Y, cv=kfold)
print("Accuracy: %.3f%% (%.3f%%)" % (results.mean()*100.0, results.std()*100.0))

Accuracy: 77.604% (5.158%)

Task 3: Repeated Random Test-Train Splits

Another variation on k-fold cross-validation is to create a random split of the data like the train/test split described above, but to repeat the process of splitting and evaluating the algorithm multiple times, like cross-validation. This has the speed of using a train/test split and the reduction in variance in the estimated performance of k-fold cross-validation.

We can also repeat the process many more times as needed to improve the accuracy. A downside is that repetitions may include much of the same data in the train or the test split from run to run, introducing redundancy into the evaluation. The example below splits the data into a 67%/33% train/test split and repeats the process 10 times.

In [84]:
# Evaluate using Shuffle Split Cross Validation
import pandas as pd
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
n_splits = 10
test_size = 0.33
seed = 7
kfold = ShuffleSplit(n_splits=n_splits, test_size=test_size, random_state=seed)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
results = cross_val_score(model, X, Y, cv=kfold)
print("Accuracy: %.3f%% (%.3f%%)" % (results.mean()*100.0, results.std()*100.0))

Accuracy: 76.535% (2.235%)

Notes to be considered

- Generally, k-fold cross-validation is the gold standard for evaluating the performance of a machine learning algorithm on unseen data, with k set to 3, 5, or 10.
- Using a train/test split is good for speed when using a slow algorithm, and produces performance estimates with lower bias when using large datasets.
- Techniques like repeated random splits can be useful intermediates when trying to balance variance in the estimated performance, model training speed and dataset size.

The best advice is to experiment and find a technique for your problem that is fast and produces reasonable estimates of performance that you can use to make decisions. If in doubt, use 10-fold cross-validation.

9.8 Performance Metrics of Machine Learning Algorithms (Part 01)

The metrics that we choose to evaluate our machine learning algorithms are very important. The choice of metrics influences how the performance of machine learning algorithms is measured and compared. Metrics influence how we weight the importance of different characteristics in the results and our ultimate choice of which algorithm to choose.
Task 1: Classification Accuracy

Classification accuracy is the number of correct predictions made as a ratio of all predictions made. This is the most common evaluation metric for classification problems; it is also the most misused. It is really only suitable when there are an equal number of observations in each class (which is rarely the case) and when all predictions and prediction errors are equally important, which is often not the case. Below is an example of calculating classification accuracy.

In [87]:
# Cross Validation Classification Accuracy
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
kfold = KFold(n_splits=10, random_state=None)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
scoring = 'accuracy'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("Accuracy: %.3f (%.3f)" % (results.mean(), results.std()))

Accuracy: 0.776 (0.052)

Task 2: Area Under ROC Curve

Area under the ROC Curve (or AUC for short) is a performance metric for binary classification problems. The AUC represents a model's ability to discriminate between positive and negative classes. An area of 1.0 represents a model that made all predictions perfectly. An area of 0.5 represents a model that is as good as random. ROC can be broken down into sensitivity and specificity; a binary classification problem is really a trade-off between sensitivity and specificity.

- Sensitivity is the true positive rate, also called the recall. It is the number of instances from the positive (first) class that were actually predicted correctly.
- Specificity is also called the true negative rate. It is the number of instances from the negative (second) class that were actually predicted correctly.

From the results below, we can see that the AUC is relatively close to 1 and greater than 0.5, suggesting some skill in the predictions.

In [88]:
# Cross Validation Classification ROC AUC
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
kfold = KFold(n_splits=10, random_state=None)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
scoring = 'roc_auc'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("AUC: %.3f (%.3f)" % (results.mean(), results.std()))

AUC: 0.828 (0.043)

Task 3: Confusion Matrix

The confusion matrix is a handy presentation of the accuracy of a model with two or more classes. The table presents predictions on the x-axis and true outcomes on the y-axis. The cells of the table are the number of predictions made by a machine learning algorithm.

For example, a machine learning algorithm can predict 0 or 1, and each prediction may actually have been a 0 or a 1. Predictions for 0 that were actually 0 appear in the cell for prediction = 0 and actual = 0, whereas predictions for 0 that were actually 1 appear in the cell for prediction = 0 and actual = 1. And so on.

Below is an example of calculating a confusion matrix for a set of predictions by a Logistic Regression on the Pima Indians onset of diabetes dataset.

In [114]:
# Cross Validation Classification Confusion Matrix
import pandas as pd
import numpy as np
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)
model = LogisticRegression(solver='lbfgs', max_iter=1000)
model.fit(X_train, Y_train)
predicted = model.predict(X_test)
matrix = confusion_matrix(Y_test, predicted)
print(matrix)
# visualize the results as an annotated heatmap
group_names = ['True Neg', 'False Pos', 'False Neg', 'True Pos']
group_counts = ["{0:0.0f}".format(value) for value in matrix.flatten()]
group_percentages = ["{0:.2%}".format(value) for value in matrix.flatten()/np.sum(matrix)]
labels = [f"{v1}\n{v2}\n{v3}" for v1, v2, v3 in zip(group_names, group_counts, group_percentages)]
labels = np.asarray(labels).reshape(2, 2)
sns.heatmap(matrix, annot=labels, fmt='', cmap='Blues')

[[142  20]
 [ 34  58]]

Out[114]: <matplotlib.axes._subplots.AxesSubplot at 0x7f76eb824a10>
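Task 2 defined sensitivity and specificity; both can be read directly off the confusion matrix printed above. A minimal sketch, assuming `matrix` is the 2x2 array from the cell above, with actual classes along the rows (class 0 first) and predicted classes along the columns:

# matrix layout: rows = actual class (0, 1), columns = predicted class (0, 1)
tn, fp = matrix[0]
fn, tp = matrix[1]

sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate
print("Sensitivity: %.3f" % sensitivity)
print("Specificity: %.3f" % specificity)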
9.9 Performance Metrics of Machine Learning Algorithms (Part 02)

In the previous lesson we learned about performance metrics for classification problems. This lesson will review 3 of the most common metrics for evaluating predictions on regression machine learning problems:

- Mean Absolute Error
- Mean Squared Error
- R-Squared

Task 1: Mean Absolute Error

The Mean Absolute Error (or MAE) is the sum of the absolute differences between predictions and actual values. It gives an idea of how wrong the predictions were. The measure gives an idea of the magnitude of the error, but no idea of the direction (e.g. over- or under-predicting). A value of 0 indicates no error, or perfect predictions. Like log loss, this metric is inverted by the cross_val_score() function.

The example below demonstrates calculating mean absolute error on the Boston house price dataset. This dataset was taken from the StatLib library and is maintained by Carnegie Mellon University. The dataset concerns housing prices in the city of Boston. The dataset provided has 506 instances with 13 features.

In [117]:
# Cross Validation Regression MAE
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"
data = pd.read_csv(URL, delim_whitespace=True, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:13]
Y = array[:,13]
kfold = KFold(n_splits=10, random_state=None)
model = LinearRegression()
scoring = 'neg_mean_absolute_error'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MAE: %.3f (%.3f)" % (results.mean(), results.std()))

MAE: -4.005 (2.084)
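Because cross_val_score() reports the negated error, the sign simply needs flipping when quoting MAE as a positive number. A one-line sketch using the `results` array from the cell above:

# results holds neg_mean_absolute_error scores; negate them to report MAE as a positive value
print("MAE: %.3f (%.3f)" % (-results.mean(), results.std()))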
Task 2: Mean Squared Error

The Mean Squared Error (or MSE) is much like the mean absolute error in that it provides a gross idea of the magnitude of error. Taking the square root of the mean squared error converts the units back to the original units of the output variable and can be meaningful for description and presentation. This is called the Root Mean Squared Error (or RMSE). The example below provides a demonstration of calculating mean squared error.

Note: MSE may be less robust than MAE, since the squaring of the errors enforces a higher importance on outliers. But when outliers are exponentially rare (like in a bell-shaped curve), the MSE performs very well and is generally preferred.

In [118]:
# Cross Validation Regression MSE
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"
data = pd.read_csv(URL, delim_whitespace=True, names=names)
array = data.values
# separate array into input and output components
X = array[:,0:13]
Y = array[:,13]
kfold = KFold(n_splits=10, random_state=None)
model = LinearRegression()
scoring = 'neg_mean_squared_error'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("MSE: %.3f (%.3f)" % (results.mean(), results.std()))

MSE: -34.705 (45.574)
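The text above notes that taking the square root of MSE gives the RMSE in the original units of the output variable. A minimal sketch, reusing the `results` array from the MSE cell above (the scores are negated, so the sign is flipped before taking the square root):

from numpy import sqrt

# results holds neg_mean_squared_error scores from cross_val_score
rmse_per_fold = sqrt(-results)
print("RMSE: %.3f (%.3f)" % (rmse_per_fold.mean(), rmse_per_fold.std()))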
Task 3: R-Squared

The $R^2$ (or R-Squared) metric provides an indication of the goodness of fit of a set of predictions to the actual values. In statistical literature this measure is called the coefficient of determination. This is a value between 0 and 1 for no-fit and perfect fit, respectively. The example below provides a demonstration of calculating the mean $R^2$ for a set of predictions.

$$R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2}$$

where:
- $y_i$: value of observation $i$
- $\hat{y}_i$: predicted value of $y$ for observation $i$
- $\bar{y}$: mean value of $y$

In [120]:
# Cross Validation Regression R-Squared
import pandas as pd
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression

# load data
names = ['CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT', 'MEDV']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/housing.csv"
data = pd.read_csv(URL, delim_whitespace=True, names=names)
array = data.values

# separate array into input and output components
X = array[:,0:13]
Y = array[:,13]
kfold = KFold(n_splits=10, random_state=None)
model = LinearRegression()
scoring = 'r2'
results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
print("R-Squared: %.3f (%.3f)" % (results.mean(), results.std()))

R-Squared: 0.203 (0.595)
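To connect the scikit-learn score with the formula above, $R^2$ can also be computed by hand from a set of predictions. The sketch below is illustrative only (not part of the original notebook); it reuses X and Y from the cell above, and the hold-out split parameters are arbitrary choices.

# Illustrative sketch: compute R-Squared manually, following the formula above
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression

# X and Y are assumed to be the Boston housing arrays loaded in the cell above
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, random_state=7)
y_pred = LinearRegression().fit(X_train, y_train).predict(X_test)

ss_res = np.sum((y_test - y_pred) ** 2)         # sum of squared residuals
ss_tot = np.sum((y_test - y_test.mean()) ** 2)  # total sum of squares
print("R-Squared (manual): %.3f" % (1 - ss_res / ss_tot))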
9.10 Implementing Machine Learning Algorithms

When we work on a machine learning project, we often end up with multiple good models to choose from. Each model will have different performance characteristics. Using methods like cross-validation, we can get an estimate of how accurate each model may be on unseen data. We need to be able to use these estimates to choose one or two of the best models from the suite of models that we have created. When we have a new dataset, it is a good idea to visualize the data using different techniques in order to look at the data from different perspectives.

The same idea applies to model selection. We should use a number of different ways of looking at the estimated accuracy of our machine learning algorithms in order to choose the one or two algorithms to finalize.

Task 1: Comparing Machine Learning Algorithms

In the example below, five different classification algorithms are compared on a single dataset:

- Linear Discriminant Analysis
- k-Nearest Neighbors
- Classification and Regression Trees
- Naive Bayes
- Support Vector Machines

The dataset is the Pima Indians onset of diabetes problem. The problem has two classes and eight numeric input variables of varying scales. The 10-fold cross-validation procedure is used to evaluate each algorithm, importantly configured with the same random seed to ensure that the same splits of the training data are performed and that each algorithm is evaluated in precisely the same way. Each algorithm is given a short name, useful for summarizing results afterward.

In [128]:
# Compare Algorithms
import pandas as pd
from matplotlib import pyplot
from sklearn.model_selection import KFold
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

# load data
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
URL = "https://raw.githubusercontent.com/wibirama/Artificial-Intelligence-Course/master/pima-indians-diabetes.data.csv"
data = pd.read_csv(URL, names=names)
array = data.values

# separate array into input and output components
X = array[:,0:8]
Y = array[:,8]

# prepare models
models = []
models.append(('LDA', LinearDiscriminantAnalysis()))
models.append(('KNN', KNeighborsClassifier()))
models.append(('CART', DecisionTreeClassifier()))
models.append(('NB', GaussianNB()))
models.append(('SVM', SVC()))

# evaluate each model in turn
results = []
names = []
scoring = 'accuracy'
for name, model in models:
    kfold = KFold(n_splits=10, random_state=None)
    cv_results = cross_val_score(model, X, Y, cv=kfold, scoring=scoring)
    results.append(cv_results)
    names.append(name)
    msg = "%s: %f (%f)" % (name, cv_results.mean(), cv_results.std())
    print(msg)

# boxplot algorithm comparison
fig = pyplot.figure()
fig.suptitle('Algorithm Comparison')
ax = fig.add_subplot(111)
pyplot.boxplot(results)
ax.set_xticklabels(names)
pyplot.show()

LDA: 0.773462 (0.051592)
KNN: 0.726555 (0.061821)
CART: 0.691302 (0.072112)
NB: 0.755178 (0.042766)
SVM: 0.760424 (0.052931)

Task 2: Algorithm Tuning

Algorithm tuning is a final step in the process of applied machine learning before finalizing our model. It is sometimes called hyperparameter optimization, where the algorithm parameters are referred to as hyperparameters, whereas the coefficients found by the machine learning algorithm itself are referred to as parameters. Optimization suggests the search nature of the problem. Phrased as a search problem, you can use different search strategies to find a good and robust parameter or set of parameters for an algorithm on a given problem. First, we will show an implementation of Support Vector Machine without Grid Search.

We will use the breast cancer dataset from the Scikit-Learn library. This is a binary classification dataset. It has no missing attributes or null values. The class distribution is as follows:

- 212: malignant
- 357: benign

More information can be found here: https://scikit-learn.org/stable/datasets/toy_dataset.html#breast-cancer-dataset

In [11]:
# Classification Without Algorithm Optimization
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

# load data
cancer = load_breast_cancer()
X = pd.DataFrame(cancer['data'], columns = cancer['feature_names'])
# cancer column is our target
Y = pd.DataFrame(cancer['target'], columns =['Cancer'])
X_train, X_test, y_train, y_test = train_test_split(X, np.ravel(Y), test_size = 0.30, random_state = 7)

# train the model on train set
model = SVC()
model.fit(X_train, y_train)

# print prediction results
predictions = model.predict(X_test)
print(classification_report(y_test, predictions))

              precision    recall  f1-score   support

           0       1.00      0.78      0.88        55
           1       0.91      1.00      0.95       116

    accuracy                           0.93       171
   macro avg       0.95      0.89      0.91       171
weighted avg       0.94      0.93      0.93       171
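The cell above imports confusion_matrix but does not use it. As a small illustrative addition (not part of the original notebook), the confusion matrix for the same hold-out predictions can be printed alongside the classification report:

# Illustrative sketch: confusion matrix for the untuned SVC predictions above
from sklearn.metrics import confusion_matrix

# y_test and predictions are assumed to come from the cell above
print(confusion_matrix(y_test, predictions))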
GridSearch is an approach to parameter tuning that methodically builds and evaluates a model for each combination of algorithm parameters specified in a grid.

One of the great things about GridSearchCV is that it is a meta-estimator. It takes an estimator like SVC and creates a new estimator that behaves exactly the same, in this case like a classifier. You should add refit=True and set verbose to whatever number you want; the higher the number, the more verbose the output (verbose just means the text output describing the process).

In [24]:
# Classification Grid Search Optimization
import pandas as pd
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.datasets import load_breast_cancer
from sklearn.svm import SVC

# load data
cancer = load_breast_cancer()
X = pd.DataFrame(cancer['data'], columns = cancer['feature_names'])
# cancer column is our target
Y = pd.DataFrame(cancer['target'], columns =['Cancer'])
X_train, X_test, y_train, y_test = train_test_split(X, np.ravel(Y), test_size = 0.30, random_state = 7)

# defining parameter range
param_grid = {'C': [0.1, 1, 10, 100, 1000],
              'gamma': [1, 0.1, 0.01, 0.001, 0.0001],
              'kernel': ['rbf']}

svc_grid = GridSearchCV(SVC(), param_grid, verbose = 3)

# fitting the model for grid search
svc_grid.fit(X_train, y_train)

# print best parameter after tuning
print(svc_grid.best_params_)

# print how our model looks after hyper-parameter tuning
print(svc_grid.best_estimator_)

svc_grid_predictions = svc_grid.predict(X_test)

# print classification report
print(classification_report(y_test, svc_grid_predictions))
Fitting 5 folds for each of 25 candidates, totalling 125 fits
[CV 1/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.613 total time= 0.0s
[CV 2/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s
[CV 3/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.600 total time= 0.0s
[CV 4/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s
[CV 5/5] END ........C=0.1, gamma=1, kernel=rbf;, score=0.608 total time= 0.0s
...
[CV 1/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s
[CV 2/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.925 total time= 0.0s
[CV 3/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s
[CV 4/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.911 total time= 0.0s
[CV 5/5] END .....C=1, gamma=0.0001, kernel=rbf;, score=0.975 total time= 0.0s
...
[CV 1/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.925 total time= 0.0s
[CV 2/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.912 total time= 0.0s
[CV 3/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.938 total time= 0.0s
[CV 4/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.899 total time= 0.0s
[CV 5/5] END ..C=1000, gamma=0.0001, kernel=rbf;, score=0.962 total time= 0.0s

{'C': 1, 'gamma': 0.0001, 'kernel': 'rbf'}
SVC(C=1, gamma=0.0001)

              precision    recall  f1-score   support

           0       0.93      0.93      0.93        55
           1       0.97      0.97      0.97       116

    accuracy                           0.95       171
   macro avg       0.95      0.95      0.95       171
weighted avg       0.95      0.95      0.95       171

9.11 Neural Networks with Keras

This code is a demonstration of a shallow neural network on the MNIST dataset. MNIST is a standard dataset used in most deep learning tutorials. In this code, a three-layer shallow neural network is used for handwritten digit classification. The first layer is the input layer, consisting of 784 nodes. The second layer is a hidden layer with 64 sigmoid neurons. The last layer is an output layer with 10 softmax neurons.

Task 1: Loading Dependencies

In [28]:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense
# additional code for keras 2.4.0
from keras.utils import np_utils
# from keras.optimizers import SGD  # deprecated
from keras.optimizers import gradient_descent_v2
# then use it: sgd = gradient_descent_v2.SGD(...)
from matplotlib import pyplot as plt

Task 2: Loading MNIST Dataset

In [29]:
(X_train, y_train), (X_valid, y_valid) = mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11493376/11490434 [==============================] - 0s 0us/step
11501568/11490434 [==============================] - 0s 0us/step

In [30]:
X_train.shape
Out[30]: (60000, 28, 28)

In [31]:
y_train.shape
Out[31]: (60000,)

In [32]:
X_valid.shape
Out[32]: (10000, 28, 28)

In [33]:
y_valid.shape
Out[33]: (10000,)

In [34]:
y_train[0:12]
Out[34]: array([5, 0, 4, 1, 9, 2, 1, 3, 1, 4, 3, 5], dtype=uint8)

In [35]:
plt.figure(figsize=(4,4))
for k in range(12):
    plt.subplot(3, 4, k+1)
    plt.imshow(X_train[k], cmap='Greys')
    plt.axis('off')
plt.tight_layout()
plt.show

Out[35]: <function matplotlib.pyplot.show>
In [36]:
plt.imshow(X_valid[0], cmap='Greys')
Out[36]: <matplotlib.image.AxesImage at 0x7fd8a48451d0>

In [37]:
print(y_valid[0])
7

Task 3: Data Preprocessing

- Reshaping the data from 2D to 1D
- Normalizing the data (to be 0 to 1)
- Converting integer labels to one-hot encoding. We arrange the labels with such one-hot encodings so that they line up with the 10 probabilities being output by the final layer of our artificial neural network. They represent the ideal output that we are striving to attain with our network: if the input image is a handwritten seven, then a perfectly trained network would output a probability of 1.00 that it is a seven and a probability of 0.00 for each of the other nine classes of digits.

In [38]:
X_train = X_train.reshape(60000, 784).astype('float32')
X_valid = X_valid.reshape(10000, 784).astype('float32')

In [39]:
X_train = X_train/255
X_valid = X_valid/255

In [40]:
n_classes = 10
y_train = keras.utils.np_utils.to_categorical(y_train, n_classes)
y_valid = keras.utils.np_utils.to_categorical(y_valid, n_classes)

Task 4: Designing Neural Networks

In the first line of code, we instantiate the simplest type of neural network model object, the Sequential type, and (in a dash of extreme creativity) name the model model.

In the second line, we use the add() method of our model object to specify the attributes of our network's hidden layer (64 sigmoid-type artificial neurons in the general-purpose, fully connected arrangement defined by the Dense() method) as well as the shape of our input layer (a one-dimensional array of length 784).

In the third and final line we use the add() method again to specify the output layer and its parameters: 10 artificial neurons of the softmax variety, corresponding to the 10 probabilities (one for each of the 10 possible digits) that the network will output when fed a given handwritten image.

In [41]:
model = Sequential()
model.add(Dense(64, activation='sigmoid', input_shape=(784,)))
model.add(Dense(10, activation='softmax'))

In [42]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense (Dense)               (None, 64)                50240
 dense_1 (Dense)             (None, 10)                650
=================================================================
Total params: 50,890
Trainable params: 50,890
Non-trainable params: 0
_________________________________________________________________
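Where these parameter counts come from: each Dense layer has one weight per input-unit pair plus one bias per unit, so the hidden layer has 784 x 64 + 64 = 50,240 parameters and the output layer has 64 x 10 + 10 = 650 parameters, which together give the 50,890 trainable parameters reported above.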
Task 5: Training Neural Network

val_loss is the value of the cost function for your cross-validation data and loss is the value of the cost function for your training data. With val_loss (Keras validation loss) and val_acc (Keras validation accuracy), several cases are possible:

- val_loss starts increasing, val_acc starts decreasing. This means the model is memorizing values rather than learning.
- val_loss starts increasing, val_acc also increases. This could be a case of overfitting, or of diverse probability values in cases where softmax is being used in the output layer.
- val_loss starts decreasing, val_acc starts increasing. This is fine, as it means the model being built is learning and working.

In [43]:
model.compile(loss='mean_squared_error',
              optimizer=gradient_descent_v2.SGD(learning_rate=0.01),
              metrics=['accuracy'])

In [44]:
model.fit(X_train, y_train, batch_size=128, epochs=200, verbose=1, validation_data=(X_valid, y_valid))
Epoch 1/200
469/469 [==============================] - 4s 6ms/step - loss: 0.0934 - accuracy: 0.0876 - val_loss: 0.0923 - val_accuracy: 0.0817
Epoch 2/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0918 - accuracy: 0.0729 - val_loss: 0.0914 - val_accuracy: 0.0776
...
Epoch 10/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0882 - accuracy: 0.3571 - val_loss: 0.0879 - val_accuracy: 0.3769
...
Epoch 23/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0830 - accuracy: 0.4383 - val_loss: 0.0825 - val_accuracy: 0.4539
...
Epoch 50/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0693 - accuracy: 0.5873 - val_loss: 0.0686 - val_accuracy: 0.6011
...
Epoch 100/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0482 - accuracy: 0.7269 - val_loss: 0.0473 - val_accuracy: 0.7367
...
Epoch 150/200
469/469 [==============================] - 2s 4ms/step - loss: 0.0357 - accuracy: 0.8221 - val_loss: 0.0347 - val_accuracy: 0.8309
...
Epoch 182/200
469/469 [==============================] - 2s 5ms/step - loss: 0.0305 - accuracy: 0.8485 - val_loss: 0.0295 - val_accuracy: ...
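After training, the validation performance can be read back directly rather than from the log above. The sketch below is illustrative (not part of the original notebook); it assumes the model, X_valid, and y_valid defined in the cells above, and uses the standard Keras model.evaluate() call.

# Illustrative sketch: evaluate the trained network on the validation set
# (assumes model, X_valid, y_valid from the cells above)
val_loss, val_accuracy = model.evaluate(X_valid, y_valid, verbose=0)
print("validation loss: %.4f, validation accuracy: %.4f" % (val_loss, val_accuracy))

If the model.fit(...) call above is assigned to a variable, for example history = model.fit(...), the per-epoch curves can also be plotted with matplotlib from history.history['loss'], history.history['val_loss'], and history.history['val_accuracy'].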