The document describes building a decision tree classification model to predict major depression using gender, divorce status, and firing status as predictor variables. It first prepares the data, splits it into training and test sets, builds the decision tree model on the training set, evaluates it on the test set, and reports an accuracy of 83%. It then interprets the decision tree, describing how it splits the data at each node based on the predictor variables.
[M4A1] Data Analysis and Interpretation Specialization

DATA ANALYSIS COLLECTION ASSIGNMENT
Data Analysis and Interpretation Specialization
Running a classification tree
Andrea Rubio Amorós
June 26, 2017
Module 4, Assignment 1
1 General
In this session, you will learn about decision trees, a type of data mining algorithm that can select, from among a large
number of variables, those variables and interactions that are most important in predicting the target or response variable.
Decision trees create segmentations or subgroups in the data by applying a series of simple rules or criteria over and
over again, choosing at each step the variable constellation that best predicts the target variable.
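As a minimal illustration of this rule-cascade idea (a toy sketch, not the NESARC data; the variable and values here are made up), scikit-learn's export_text shows the learned rules directly:

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# toy data: one predictor, target flips once along its range
X = [[0], [1], [2], [3]]
y = [0, 0, 1, 1]

clf = DecisionTreeClassifier().fit(X, y)

# the fitted tree is a single simple rule, applied to every new case
print(export_text(clf, feature_names=["x"]))
print(clf.predict([[0], [3]]))  # [0 1]
```

On real data the same mechanism simply repeats: each subgroup produced by one rule is split again by the next-best rule.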
Document written in LaTeX (template_version_01.tex)
2 Description
Figure 2.1 Decision tree (value = [no major depression, major depression])

GENDER <= 0.5 | gini = 0.2736 | samples = 25855 | value = [21626, 4229]
├─ True (male): DIVORCED <= 0.5 | gini = 0.2091 | samples = 11122 | value = [9803, 1319]
│  ├─ True: FIRED <= 0.5 | gini = 0.198 | samples = 10422 | value = [9261, 1161]
│  │  ├─ True:  gini = 0.1929 | samples = 9691 | value = [8643, 1048]
│  │  └─ False: gini = 0.2614 | samples = 731 | value = [618, 113]
│  └─ False: FIRED <= 0.5 | gini = 0.3495 | samples = 700 | value = [542, 158]
│     ├─ True:  gini = 0.3399 | samples = 585 | value = [458, 127]
│     └─ False: gini = 0.3938 | samples = 115 | value = [84, 31]
└─ False (female): DIVORCED <= 0.5 | gini = 0.317 | samples = 14733 | value = [11823, 2910]
   ├─ True: FIRED <= 0.5 | gini = 0.3028 | samples = 13713 | value = [11163, 2550]
   │  ├─ True:  gini = 0.296 | samples = 13088 | value = [10724, 2364]
   │  └─ False: gini = 0.4181 | samples = 625 | value = [439, 186]
   └─ False: FIRED <= 0.5 | gini = 0.4567 | samples = 1020 | value = [660, 360]
      ├─ True:  gini = 0.4487 | samples = 877 | value = [579, 298]
      └─ False: gini = 0.4912 | samples = 143 | value = [81, 62]
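The gini impurity shown at each node of Figure 2.1 can be checked directly from that node's class counts; a quick sketch using only the counts printed in the figure:

```python
# Gini impurity of a node: 1 - sum of squared class proportions
def gini(counts):
    n = sum(counts)
    return 1.0 - sum((c / n) ** 2 for c in counts)

# root node of Figure 2.1: value = [21626, 4229]
print(round(gini([21626, 4229]), 4))  # 0.2736, as shown in the figure
```

The same function reproduces every gini value in the tree, e.g. 0.4912 for the bottom-right leaf with value = [81, 62].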
Today, I will build a decision tree (statistical model) to study supervised prediction problems. For that, I will work
with the NESARC data set.
First, I set my explanatory and response variables (binary, categorical).
Explanatory variables (predictors):
• GENDER: Sex (0=Male / 1=Female)
• DIVORCED: got divorced in last 12 months (0=no / 1=yes)
• FIRED: fired in last 12 months (0=no / 1=yes)
Response variable (target):
• MAJORDEPP12: major depression in last 12 months (0=no / 1=yes)
Then I apply the train_test_split function to predictors and target, splitting my data set in two:
train sample
(25855, 3)
test sample
(17238, 3)
• The train sample has 25855 observations or rows, 60% of the original sample, and 3 explanatory variables.
• The test sample has 17238 observations or rows, 40% of the original sample, and again 3 explanatory variables
or columns.
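These shapes are consistent with a 60/40 split of the 43093 cleaned observations: scikit-learn rounds the test-set size up and assigns the remainder to the training set (a quick check, assuming total = train + test):

```python
import math

n_total = 25855 + 17238             # train + test rows reported above
n_test = math.ceil(n_total * 0.4)   # test_size=.4 rounds the test set up
n_train = n_total - n_test

print(n_total, n_train, n_test)  # 43093 25855 17238
```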
Once the training and testing data sets have been created, I initialize the DecisionTreeClassifier from scikit-learn.
Then I apply the predict method and the confusion_matrix function to study the classification accuracy of my
decision tree.
Predict function - show number of true and false negatives and positives
[[14393 0]
[ 2845 0]]
The confusion matrix shows the correct and incorrect classifications of our decision tree. The diagonal entries, 14393 and 0,
are the numbers of true negatives and true positives for major depression, respectively. The 2845
on the bottom left is the number of false negatives: individuals who suffer from major depression
classified as individuals who do not. The 0 on the top right is the number of false positives: individuals who do not suffer
from major depression classified as individuals who do.
Accuracy score function - show classification accuracy as a proportion
0.836291913215
The accuracy_score function reports approximately 0.836, which means the decision tree
model classified about 83% of the test sample correctly. Note, however, that in this run the model predicts the negative
class for every test case, so this accuracy essentially reflects the share of non-depressed individuals in the test sample.
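Both the accuracy and the model's inability to detect positive cases can be read off the confusion matrix above (a quick sketch from the printed counts; the small difference from the reported 0.8363 presumably comes from a different random split):

```python
# scikit-learn confusion matrix layout: rows = actual, cols = predicted
#   [[TN, FP],
#    [FN, TP]]
tn, fp = 14393, 0
fn, tp = 2845, 0

accuracy = (tn + tp) / (tn + fp + fn + tp)
sensitivity = tp / (tp + fn)  # recall for the positive (depressed) class

print(round(accuracy, 3), sensitivity)  # ~0.835 accuracy, 0.0 sensitivity
```

High accuracy with zero sensitivity is the classic majority-class pattern: the tree is right often only because the negative class dominates.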
My decision tree is built with MAJORDEPP12, my binary major depression variable, as the target, and GENDER,
DIVORCED and FIRED as the predictors or explanatory variables.
The resulting tree starts with a root node covering all 25855 observations in the train set; of those,
21626 do not suffer from major depression while 4229 do.
The first split is made on GENDER, our first explanatory variable. Values of GENDER less than 0.5, that is Male, move
to the left side of the split and include 11122 of the 25855 individuals. Values of 0.5 or higher move to the right side
of the split and include 14733 of the 25855 individuals. That means that of the total individuals, 11122 are male
and 14733 are female.
On each side we can also see that:
• Of the male individuals (left node), 9803 do not suffer from major depression while 1319 do.
• Of the female individuals (right node), 11823 do not suffer from major depression while 2910 do.
From these nodes, further splits are made on the variables DIVORCED and FIRED, generating more nodes in the same
way as described before.
Looking at the bottom-left and bottom-right leaf nodes, we can describe the output as:
• Of the 9691 male individuals who did not get divorced and did not get fired, 8643 individuals (89%) do not
suffer from major depression, while 1048 (11%) do.
• Of the 143 female individuals who got divorced and got fired, 81 individuals (57%) do not suffer from major
depression, while 62 (43%) do.
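The leaf percentages follow directly from the value counts in Figure 2.1; a quick check for the two leaves discussed:

```python
# value = [no depression, depression] for the two leaves from Figure 2.1
leaves = {
    "male, not divorced, not fired": [8643, 1048],
    "female, divorced, fired": [81, 62],
}
rates = {}
for label, (neg, pos) in leaves.items():
    rates[label] = round(pos / (neg + pos), 2)  # P(depression | leaf)
print(rates)
```

The rate of major depression in the second leaf is roughly four times that in the first, even though both leaves predict the majority class.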
3 Python Code
import pandas as pd
import numpy as np
# train_test_split moved from sklearn.cross_validation to sklearn.model_selection
# in newer scikit-learn versions
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import sklearn.metrics

working_folder = "./"  # adjust to the folder holding the NESARC files

# saving the python console output as a text file
import sys
sys.stdout = open(working_folder + "M4A1output.txt", "w")

# reading in the data set we want to work with
mydata = pd.read_csv(working_folder + "M4A1data_nesarc_pds.csv", low_memory=False)

# cleaning my data from NaNs
mydata_clean = mydata.dropna()
mydata_clean.dtypes
mydata_clean.describe()

# recode variable observations to 0, 1
def GENDER(x):
    if x['SEX'] == 1:
        return 0
    else:
        return 1
mydata_clean['GENDER'] = mydata_clean.apply(lambda x: GENDER(x), axis=1)

def DIVORCED(x):
    if x['S1Q238'] == 1:
        return 1
    else:
        return 0
mydata_clean['DIVORCED'] = mydata_clean.apply(lambda x: DIVORCED(x), axis=1)

def FIRED(x):
    if x['S1Q234'] == 1:
        return 1
    else:
        return 0
mydata_clean['FIRED'] = mydata_clean.apply(lambda x: FIRED(x), axis=1)

# set explanatory (predictors) and response (target) variables
predictors = mydata_clean[['GENDER', 'DIVORCED', 'FIRED']]
target = mydata_clean.MAJORDEPP12

# split into training and testing sets
pred_train, pred_test, tar_train, tar_test = train_test_split(predictors, target, test_size=.4)
print('train sample')
print(pred_train.shape)
print('test sample')
print(pred_test.shape)
tar_train.shape
tar_test.shape

# build model on training data
classifier = DecisionTreeClassifier()
classifier = classifier.fit(pred_train, tar_train)

# predict for the test values
predictions = classifier.predict(pred_test)

# show number of true and false negatives and positives
print('show number of true and false negatives and positives')
print(sklearn.metrics.confusion_matrix(tar_test, predictions))

# show classification accuracy in percentage
print('show classification accuracy in percentage')
print(sklearn.metrics.accuracy_score(tar_test, predictions))

# display the decision tree
from sklearn import tree
import pydotplus
dot_data = tree.export_graphviz(classifier,
    feature_names=['GENDER', 'DIVORCED', 'FIRED'],
    filled=True, rounded=True, out_file=None)
graph = pydotplus.graph_from_dot_data(dot_data)
graph.write_pdf(working_folder + "M4A1fig1.pdf")
4 Codebook