Injustice - Developers Among Us (SciFiDevCon 2024)
1. Machine learning
Machine learning is an application of artificial intelligence that provides systems the ability to
automatically learn and improve from experience without being explicitly programmed.
Data preprocessing
Data preprocessing is the process of preparing raw data and making it suitable for a machine
learning model. It is the first and a crucial step in creating a machine learning model.
CSV
CSV stands for "Comma-Separated Values". It is a plain-text file format that stores tabular
data, such as a spreadsheet, with one record per line and commas between the values.
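As a small illustration of the format, the sketch below parses CSV text with Python's standard `csv` module. The column names and records are made up for the example, and the text is read from an in-memory string rather than a real file.

```python
import csv
import io

# Hypothetical CSV content: a header row followed by two records.
raw_text = "name,age,city\nAsha,31,Pune\nBen,27,Leeds\n"

# csv.DictReader maps each record to the column names from the header row.
reader = csv.DictReader(io.StringIO(raw_text))
rows = list(reader)

for row in rows:
    # Every value is read as a string; convert numbers explicitly if needed.
    print(row["name"], int(row["age"]), row["city"])
```

Note that the reader returns every field as text, so numeric columns need an explicit conversion before they can be used in calculations.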
DATASET
The collected data for a particular problem, in a proper format, is known as the dataset.
Preparing a dataset for a machine learning model involves the following steps:
o Getting the dataset
o Importing libraries
o Importing datasets
o Finding Missing Data
o Encoding Categorical Data
o Splitting dataset into training and test set
o Feature scaling
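Two of the steps listed above, encoding categorical data and feature scaling, can be sketched in plain Python. The notes do not say which encoding or scaling method is meant, so one-hot encoding and min-max scaling are assumed here, and the values are invented for illustration.

```python
# Hypothetical records: one categorical column and one numeric column.
countries = ["France", "Spain", "France", "Germany"]
salaries = [72000.0, 48000.0, 54000.0, 61000.0]

# One-hot encoding (assumed method): one 0/1 column per distinct category.
categories = sorted(set(countries))
one_hot = [[1 if c == cat else 0 for cat in categories] for c in countries]

# Min-max scaling (assumed method): rescale the numeric column to [0, 1].
low, high = min(salaries), max(salaries)
scaled = [(s - low) / (high - low) for s in salaries]

print(categories)
print(one_hot)
print(scaled)
```

Scaling matters mainly for models that compare feature magnitudes directly; the choice between min-max scaling and standardization depends on the model and the data.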
LIBRARIES
Numpy: The NumPy library is used for performing mathematical operations in the code,
especially on arrays and matrices.
Matplotlib: The second library is Matplotlib, which is a Python 2D plotting library.
Pandas: The last library is pandas, one of the most widely used Python libraries, which is
used for importing and managing datasets.
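Handling missing data:
o By deleting the particular row that contains the missing value
o By calculating the mean of the column and using it in place of the missing value
o By assigning a unique category to the missing values
o By predicting the missing value
o By using an algorithm which supports missing values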
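The first two strategies above, deleting rows and filling with the mean, can be sketched as follows. This is a minimal example on a single made-up column, where `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical column of ages where None marks a missing value.
ages = [25.0, None, 35.0, 30.0, None]

# Strategy 1: delete the entries (rows) that contain a missing value.
ages_dropped = [a for a in ages if a is not None]

# Strategy 2: replace each missing value with the mean of the observed values.
observed_mean = mean(ages_dropped)
ages_imputed = [observed_mean if a is None else a for a in ages]

print(ages_dropped)
print(ages_imputed)
```

Deleting rows is simple but discards data, while mean imputation keeps every row at the cost of reducing the variance of the column.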
TESTING AND TRAINING
Training set: A subset of the dataset used to train the machine learning model; its
outputs are already known.
Test set: A subset of the dataset used to test the machine learning model; the model
predicts the output for the test set.
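A random split into the two subsets can be sketched as below. The helper `split_dataset` and its parameters are hypothetical names chosen for this example (it is not a library function), and the 80/20 ratio is an assumption rather than something the notes specify.

```python
import random

def split_dataset(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and return (training set, test set)."""
    shuffled = list(rows)
    # A seeded generator makes the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset of (input, output) pairs.
data = [(x, 2 * x) for x in range(10)]
train_set, test_set = split_dataset(data, test_fraction=0.2, seed=42)

print(len(train_set), len(test_set))
```

Shuffling before splitting avoids a test set that only contains the last rows of the file, which matters when the data is sorted in some way.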
FEATURE SELECTION
A feature is an individual measurable property or characteristic of the data.
Feature selection is the process of reducing the number of input variables to your
model by using only relevant data and getting rid of noise in the data.
It improves the machine learning process and can increase the predictive power of
machine learning algorithms by selecting the most important variables and eliminating
redundant and irrelevant features.
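One simple way to drop an irrelevant feature is to remove columns that never change, since they carry no information for the model. The sketch below assumes a variance threshold as the selection rule; the notes do not name a specific method, and `select_by_variance` and the feature names are hypothetical.

```python
from statistics import pvariance

# Hypothetical dataset: each key is a feature, each list holds one value per sample.
features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "constant_flag": [1.0, 1.0, 1.0, 1.0],
    "weight_kg": [50.0, 60.0, 65.0, 80.0],
}

def select_by_variance(columns, threshold=0.0):
    """Keep only the features whose population variance exceeds threshold."""
    return {
        name: values
        for name, values in columns.items()
        if pvariance(values) > threshold
    }

selected = select_by_variance(features)
print(list(selected))
```

Here `constant_flag` is removed because its variance is zero. Other selection methods, such as ranking features by their correlation with the target, follow the same pattern of scoring each feature and keeping the best ones.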
FEATURE EXTRACTION
Feature extraction refers to the process of transforming raw data into numerical features that can
be processed while preserving the information in the original data set. It often yields better
results than applying machine learning directly to the raw data.
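As an illustration, raw text can be turned into numerical features by counting words. The sketch below uses a bag-of-words representation, which is one assumed example of feature extraction and not a method named in these notes; the sentences are invented.

```python
from collections import Counter

# Hypothetical raw data: three short text documents.
documents = ["the cat sat", "the dog sat down", "the cat ran"]

# The vocabulary is the sorted set of all distinct words.
vocabulary = sorted({word for doc in documents for word in doc.split()})

def to_vector(text):
    """Return the count of each vocabulary word in text."""
    counts = Counter(text.split())
    return [counts[word] for word in vocabulary]

vectors = [to_vector(doc) for doc in documents]

print(vocabulary)
for vec in vectors:
    print(vec)
```

Each document becomes a fixed-length list of numbers, which is a form that a machine learning model can process. Word order is lost in this representation.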
XGBOOST = EXTREME GRADIENT BOOSTING
CONDITIONAL PROB, BAYES THEOREM
TYPES OF CROSS VALIDATION