Teaching Data Science Students
to Write Clean Code
Todd Iverson, Winona State University
Slides/Code: https://bit.ly/2WgFIbI
Episode 37
Doug Wants an A
Doug “Crazy Legs” Ervison “Mean” Dr. Iverson
Hi! I’m
Doug
Grrr
The Hero The Villain
I NEED
an A!
Doug?
An A?
HA!
His code
stinks!
Doug will demonstrate
1. Good names
2. Small functions
3. Unit tests
4. Refactor code, specifically
1. Extract functions
2. Split loops
Opening Scene - The Assignment
https://www.kaggle.com/c/spooky-author-identification/data
Kaggle
is Kool!
Kaggle
assignment!
But Dr. Iverson is
SO mean!
Doug’s Original code
(…this assignment
requires unit tests!…)
Iverson
loves Bag of
Words!
I am going to
get an A for
sure!
(…F!…)
It looks like our hero is doomed
to an F!
Then just in the nick of time …
… Doug remembers unit tests!
You MUST have
unit tests.
What are unit tests?
• Capture/maintain intended behavior
• Helpful when changing code
• Should be automated
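A minimal sketch of what such an automated test might look like. The function `count_words` and the sample text are hypothetical stand-ins, not Doug's actual code, which the slides show only as images.

```python
from collections import Counter


def count_words(text):
    """Return a Counter of the lowercase words in text (hypothetical helper)."""
    return Counter(text.lower().split())


def test_count_words():
    # Small, hand-checked example data and the intended output.
    text = "The raven the Raven"
    expected = Counter({"the": 2, "raven": 2})
    # The assertion captures the intended behavior; it fails loudly
    # if a later change breaks count_words.
    assert count_words(text) == expected
```

A test written this way can be run automatically, for example with pytest or by calling `test_count_words()` directly after every change.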
Doug writes some unit tests
Original behavior
New behavior
That was
easy!
And my code
passed!
Doug’s Original code
(…with names
like that …)
Remembered
the Unit Tests!
I am going to
get an A for
sure!
(…a C at
best…)
Luckily, Doug remembers to think about names
Names are
important!
They should
express
intent!
Good names…
• Reveal intent
• Use the proper parts of speech
• Have the proper length for their scope
• Avoid disinformation and encodings
Data: What is it?
Function: What does it do?
Good names…
• Reveal intent
• Use the proper parts of speech
• Have the proper length for their scope
• Avoid disinformation and encodings
Variable: Noun
Function: Verb
Boolean: Predicate
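The naming guidelines above can be illustrated with a short sketch. All names here are hypothetical examples chosen for this illustration; only `poe_words` appears in the slides.

```python
def count_occurrences(words):
    """Function name is a verb phrase: it says what the function does."""
    word_counts = {}  # variable name is a noun: it says what the data is
    for word in words:
        word_counts[word] = word_counts.get(word, 0) + 1
    return word_counts


def is_punctuation(character):
    """A boolean-returning function is named as a predicate."""
    return character in ".,;:!?\"'"


# A name that reveals intent, instead of an abbreviation that needs decoding.
poe_words = ["once", "upon", "a", "midnight", "dreary", "a"]
poe_word_counts = count_occurrences(poe_words)
```

Short names such as `word` are reasonable inside a three-line loop, while module-level data such as `poe_word_counts` benefits from a longer, more descriptive name.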
These names
are bad.
What’s ews
again??
Doug inspects
his names
ews holds the words
for Edgar Allan Poe
Maybe I
should just use
poe_words
Doug finds some better names
The other
names are
better too!
poe_words is
better than ews
A+ for
sure!
(…see the
bug?…)
(…D at
best…)
It looks like our hero is doomed
to a D!
Then just in the nick of time …
… Doug remembers to test!
Good thing I
ran the unit tests!
Better test!
Oops!
What else
would Iverson
complain
about?
Doug imagines Iverson’s feedback
Clean
functions do
one thing
You need more
functions
and Doug even remembers refactoring!
… and extract
a function!
Find a block
that does
something …
Common Refactoring Technique
Extract Function
The DRY principle
• Don’t repeat yourself!
• Find similar code
• Make it exactly the same
• Extract a function!
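A small sketch of these steps, under the assumption that two authors' texts are processed with nearly identical code. The sample sentences and the name `split_into_words` are invented for illustration.

```python
poe_text = "Once upon a midnight dreary"
shelley_text = "It was on a dreary night"

# Before: similar code, repeated once per author.
#   poe_words = poe_text.lower().split()
#   shelley_words = shelley_text.lower().split()


# After: the repeated steps are made exactly the same and moved
# into one function, which is then called once per author.
def split_into_words(text):
    """Lowercase text and split it on whitespace."""
    return text.lower().split()


poe_words = split_into_words(poe_text)
shelley_words = split_into_words(shelley_text)
```

If the word-splitting rule changes later, only `split_into_words` needs to be edited.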
Extract Functions
Replace hyphens
Remove punctuation
Test after each change!
Extract this
function!
What does
each part do?
Better
test
Extract Another Function
Clean and Split
Any other
functions?
Extract this
too!
Test after each change!
Better
test
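The slide images with Doug's code are not part of this transcript, so the following is only a sketch of how the extracted functions named above might look. The function bodies, and the choice to lowercase the text, are assumptions.

```python
import string


def replace_hyphens(text):
    """Replace each hyphen with a space so joined words are separated."""
    return text.replace("-", " ")


def remove_punctuation(text):
    """Drop every ASCII punctuation character from text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def clean_and_split(text):
    """Clean the text, then split it into lowercase words."""
    # Hyphens are replaced first; otherwise remove_punctuation would
    # delete them and glue the two halves of the word together.
    return remove_punctuation(replace_hyphens(text)).lower().split()


def test_clean_and_split():
    # Re-run after each extraction to confirm the behavior is unchanged.
    assert clean_and_split("Half-formed, half-seen!") == [
        "half", "formed", "half", "seen",
    ]
```

Each function does one thing, and the test gives a quick check that an extraction did not change the result.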
Doug is on a roll now!
There is a LOT
of nesting …
And these blocks
look similar
… a sign of a
function doing
too much
Common Refactoring Technique
Split Loop
1 loop that
does 2 things
2 loops that
each do 1 thing
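A hedged sketch of the split-loop refactoring. The function names and sample sentences are hypothetical, and the author codes "EAP" and "MWS" are assumed labels in the style of the Kaggle data.

```python
def collect_words_one_loop(sentences):
    """Before: a single loop that does two things."""
    poe_words, shelley_words = [], []
    for author, text in sentences:
        if author == "EAP":
            poe_words.extend(text.split())
        if author == "MWS":
            shelley_words.extend(text.split())
    return poe_words, shelley_words


def collect_words_split_loops(sentences):
    """After: two loops that each do one thing."""
    poe_words = []
    for author, text in sentences:
        if author == "EAP":
            poe_words.extend(text.split())

    shelley_words = []
    for author, text in sentences:
        if author == "MWS":
            shelley_words.extend(text.split())
    return poe_words, shelley_words


sentences = [
    ("EAP", "once upon a midnight dreary"),
    ("MWS", "it was on a dreary night"),
    ("EAP", "deep into that darkness"),
]
# The split version must produce exactly the same result as the original.
assert collect_words_one_loop(sentences) == collect_words_split_loops(sentences)
```

After the split, the two loops are identical except for the author code, which makes them natural candidates for a single extracted function that takes the author as a parameter.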
Doug makes the blocks identical …
These are
independent!
And replace
words with a
query
I can separate them!
… carefully splits the loop …
First split the
loop
… and extracts a function
Now extract
a function
and replace
blocks with a call
Tests pass,
but isn’t this
inefficient?
What did Iverson say about efficiency?
97% of your code
doesn’t impact overall
speed
Optimize the other
3% … after profiling
Blah blah …
Donald Knuth
… Blah blah
The real problem is that programmers have spent far too much
time worrying about efficiency in the wrong places and at the
wrong times; premature optimization is the root of all evil (or
at least most of it) in programming.
- Donald Knuth
Iverson is always
talking about Knuth
What a fanboy!
(…he’s not wrong …)
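One way to act on this advice is to measure before worrying. The sketch below uses the standard-library `timeit` module to compare a one-loop and a two-loop version of a simple count. The functions and data are made up for illustration, and the timings will vary by machine, so no particular result is claimed.

```python
import timeit


def count_in_one_loop(authors):
    """One loop that counts both authors at once."""
    poe_count = shelley_count = 0
    for author in authors:
        if author == "EAP":
            poe_count += 1
        if author == "MWS":
            shelley_count += 1
    return poe_count, shelley_count


def count_author(authors, target):
    """One loop that counts a single author."""
    count = 0
    for author in authors:
        if author == target:
            count += 1
    return count


def count_in_two_loops(authors):
    """Split version: two passes over the data, one per author."""
    return count_author(authors, "EAP"), count_author(authors, "MWS")


authors = ["EAP", "MWS", "EAP"] * 10_000
one_loop_seconds = timeit.timeit(lambda: count_in_one_loop(authors), number=20)
two_loop_seconds = timeit.timeit(lambda: count_in_two_loops(authors), number=20)
print(f"one loop:  {one_loop_seconds:.4f} s")
print(f"two loops: {two_loop_seconds:.4f} s")
```

If a difference like this turns out not to matter for the overall run time, the cleaner split version can stay; for larger programs, a profiler such as `cProfile` can show where the time actually goes.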
Doug’s Final Product
Better names!
A+ work for
sure!
Short
functions!
Fast even with
split loops!
Doug’s code is demonstrably better
He clearly took Iverson’s
clean code lectures to heart
His solution consists of
many small functions
with good names
And he even refactored like a pro
So does Doug get the A?
Tune in next week to find out …
(Still too much
nesting…)
(MUHAHAHA…)
.. on the next exciting episode of
Doug Does Data Science
Let’s Review
What are unit tests?
• Capture/maintain intended behavior
• Helpful when changing code
• Should be automated
Good names…
• Reveal intent
• Use the proper parts of speech
• Have the proper length for their scope
• Avoid disinformation and encodings
Data: What is it?
Function: What does it do?
Good names…
• Reveal intent
• Use the proper parts of speech
• Have the proper length for their scope
• Avoid disinformation and encodings
Variable: Noun
Function: Verb
Boolean: Predicate
Refactoring
What is it?
• Reorganize your code
• Break it into different
parts
• Change composition
Why use it?
• Understand the code
• Clean the code
• Allow new features
Common Refactoring Technique
Extract Function
Common Refactoring Technique
Split Loop
1 loop that
does 2 things
2 loops that
each do 1 thing
The DRY principle
• Don’t repeat yourself!
• Find similar code
• Make it exactly the same
• Extract a function!
Advice for teaching clean code
• Require unit tests and good names.
• Don’t just teach it, live it!
• Allow students to see you clean your messy code.
• Teach/reinforce important concepts.
• DRY
• Refactoring
• Efficiency concerns and profiling
• Projects that require hundreds of lines of code.
Clean Code Resources
• These slides: https://bit.ly/2WgFIb
• Clean Code, by Robert Martin
• www.cleancoders.com, videos by Robert Martin
and friends
• Refactoring: Improving the Design of Existing Code, by Martin Fowler

teaching data science students to write clean code

  • 1. Teaching Data Science Students to Write Clean Code Todd Iverson, Winona State University Slides/Code: https://bit.ly/2WgFIbI
  • 3. Doug “Crazy Legs” Ervison “Mean” Dr. Iverson Hi! I’m Doug Grrr The Hero The Villain I NEED an A! Doug? An A? HA! His code stinks!
  • 4. Doug will demonstrate 1. Good names 2. Small functions 3. Unit tests 4. Refactor code, specifically 1. Extract functions 2. Split loops
  • 5. Opening Scene - The Assignment https://www.kaggle.com/c/spooky-author-identification/data Kaggle is Kool! Kaggle assignment! But Dr. Iverson is SO mean!
  • 6. Doug’s Original code (…this assignment require unit tests!…) Iverson loves Bag of Words! I am going to get an A for sure! (…F!…)
  • 7. It looks like our hero is doomed to an F!
  • 8. Then just in the nick of time …
  • 9. … Doug remembers unit tests! You MUST have unit tests.
  • 10. What are unit tests? • Captures/maintain intended behavior • Helpful when changing code • Should be automated
  • 11. Doug writes some unit tests Original behavior New behavior That was easy! And my code passed!
  • 12. Doug’s Original code (…with names like that …) Remembered the Unit Tests! I am going to get an A for sure! (…a C at best…)
  • 13. Luckily, Doug remembers to think about names Names are important! They should express intent!
  • 14. Good names… • Reveal intent • Use the proper parts of speech • Have the proper length for their scope • Avoids disinformation and encodings Data: What is it? Function: What does it do?
  • 15. Good names… • Reveal intent • Use the proper parts of speech • Have the proper length for their scope • Avoids disinformation and encodings Variable: Noun Function: Verb Boolean: Predicate
  • 16. These names are bad. What’ ews again?? Doug inspects his names
  • 17. ews hold the words for Edgar Allen Poe Maybe I should just use poe_words Doug finds some better names
  • 18. The other names are better too! poe_words is better than ews A+ for sure! (…see the bug?…) (…D at best…)
  • 19. It looks like our hero is doomed with D!
  • 20. Then just in the nick of time …
  • 21. … Doug remembers to test! Good thing I ran unit test! Better test! Oops!
  • 23. Doug imagines Iverson’s feedback Clean functions do one thing You need more functions
  • 24. and Doug even remembers refactoring! … and extract a function! Find a block that does something …
  • 26. The DRY principle • Don’t repeat yourself! • Find similar code • Make it exactly the same • Extract a function!
  • 27. Extract Functions Replace hyphens Remove punctuation Test after each change! Extract this function! What does each part do? Better test
  • 28. Extract Another Function Clean and Split Any other functions! Extract this too! Test after each change! Better test
  • 29. Doug is on a roll now! There are a LOT of nesting … And this blocks look similar … a sign of a function doing too much
  • 30. Common Refactoring Technique Split Loop 1 Loop that does 2 things 2 Loop that do 1 thing
  • 31. Doug makes the blocks identical … These are independent! And replace words with a query I can separate them!
  • 32. … carefully splits the loop … First split the loop
  • 33. … and extracts a function Now extract a function and replace blocks with a call Tests pass, But isn’t this inefficient?
  • 34. What did Iverson say about efficiency? 97% of your code doesn’t impact overall speed Optimize the other 3% … after profiling Blah blah … Donald Knuth … Blah blah
  • 35. The real problem is that programmers have spent far too much time worrying about efficiency in the wrong places and at the wrong times; premature optimization is the root of all evil (or at least most of it) in programming. - Donald Knuth Iverson is always talking about Knuth What a fanboy! (…he’s not wrong …)
  • 36. Doug’s Final Product Better names! A+ work for sure! Short functions! Fast even with split loops!
  • 37. Doug’s code is demonstrably better
  • 38. He clearly took Iverson’s clean code lectures to heart
  • 39. His solution consists of many small functions with good names
  • 40. And he even refactored like a pro
  • 41. So does Doug get the A?
  • 42. Tune in next week to find out … (Still too much nesting…) (MUHAHAHA…) .. on the next exciting episode of Doug Does Data Science
  • 44. What are unit tests? • Captures/maintain intended behavior • Helpful when changing code • Should be automated
  • 45. Good names… • Reveal intent • Use the proper parts of speech • Have the proper length for their scope • Avoids disinformation and encodings Data: What is it? Function: What does it do?
  • 46. Good names… • Reveal intent • Use the proper parts of speech • Have the proper length for their scope • Avoids disinformation and encodings Variable: Noun Function: Verb Boolean: Predicate
  • 47. Refactoring What is it? • Reorganize your code • Break it into different parts • Change composition Why use it? • Understand the code • Clean the code • Allow new features
  • 49. Common Refactoring Technique Split Loop 1 Loop that does 2 things 2 Loop that do 1 thing
  • 50. The DRY principle • Don’t repeat yourself! • Find similar code • Make it exactly the same • Extract a function!
  • 51. Advice for teaching clean code • Require unit tests and good names. • Don’t just teach it, live it! • Allow students to see you clean your messy code. • Teach/reinforce important concepts. • DRY • Refactoring • Efficiency concerns and profiling • Projects that require 100’s of lines of code.
  • 52. Clean Code Resources • These slides: https://bit.ly/2WgFIb • Clean Code, by Robert Martin • www.cleancoders.com, videos by Robert Martin and friends • Refactoring Code, by Martin Fowler
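
The split-loop refactoring from slide 49 can be illustrated with a minimal Python sketch. It is not Doug's actual code from the slides: the row layout (author code, text), the author codes, the sample rows, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical sample rows of (author code, text); not the real Kaggle data.
ROWS = [
    ("EAP", "the raven the bells"),
    ("HPL", "the old ones"),
    ("MWS", "the modern prometheus"),
    ("EAP", "nevermore"),
]


def count_words_one_loop(rows):
    """Before: a single loop that does three things (one tally per author)."""
    poe_words, lovecraft_words, shelley_words = Counter(), Counter(), Counter()
    for author, text in rows:
        if author == "EAP":
            poe_words.update(text.split())
        elif author == "HPL":
            lovecraft_words.update(text.split())
        elif author == "MWS":
            shelley_words.update(text.split())
    return poe_words, lovecraft_words, shelley_words


def count_words_for(rows, wanted_author):
    """After: one loop that does one thing, extracted into a function."""
    words = Counter()
    for author, text in rows:
        if author == wanted_author:
            words.update(text.split())
    return words


def count_words_split_loops(rows):
    """Each branch of the old loop becomes its own pass over the data."""
    return (
        count_words_for(rows, "EAP"),
        count_words_for(rows, "HPL"),
        count_words_for(rows, "MWS"),
    )
```

The split version scans the rows three times instead of once. That is the trade-off the Knuth slide addresses: accept the extra passes for readability, and revisit only if measurement shows this code is actually slow.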

Editor's Notes

  1. (click) The hero of our drama, Doug Ervison, budding data science major with a penchant for messy code. (click) The villain in this drama is the mean Dr. Iverson, who always complains about Doug’s code. He sometimes even says it stinks.
  2. Over the last few years, I have been doing some research on software engineering techniques that will help our students. In this talk I will highlight a few; namely picking good names, using small functions that do one thing, using unit tests to ensure our code is correct, and refactoring our code to make it more modular and readable. But more importantly, this talk tells the story of Doug.
  3. Doug has a problem. (click) He was assigned a Kaggle assignment for class and he thinks he has a nice solution, but he knows that Dr. Iverson is going to dock points for messy code.
  4. Doug’s solution is based on the word distributions of each author. (click) Surely this solution will get Doug that elusive A. (click) Unfortunately, Doug forgot to look over the requirements for the assignment, which included unit tests for all functions.
  5. It looks like our hero is doomed to an F!
  6. Then just in the nick of time …
  7. Doug recalls this assignment requires unit tests. What was it that Iverson said in class about unit tests?
  8. Doug looks back at his notes. So tests should be automated and capture the behavior of the code.
  9. Ok, unit tests. First, he makes some example data and the intended output. (click) Then he writes an automated test that checks that his main function works. (click) (click) Finally, he runs the test and makes sure the original function passes.
  10. Initially, Doug is happy with this code. (click) Surely this solution will get Doug that elusive A. (click) But then he remembers losing points for poor names on previous assignments.
  11. He thinks back to a lecture on picking good names, remembering that names should express the intent of your code.
  12. Doug looks over his notes. (click) So data should say what it is (click) and functions should say what they do.
  13. Doug notices that the next slide talks about using the correct parts of speech. (click) Variables are nouns, (click) functions are verbs, (click) and something about Booleans.
  14. Doug looks over his names. What would Iverson say?
  15. He definitely wouldn’t like ews and hws. He decides to use new names based on the authors’ last names.
  16. Doug continues to change names, replacing the name of each variable to better capture the intent of the code. (click) He is now confident of getting an A! (click) Unfortunately, there is a bug in his code, and Iverson gives code that crashes a D.
  17. It looks like our hero is doomed with D!
  18. Then just in the nick of time …
  19. Doug remembers to test. (click) The code fails the test, (click) and he figures out that he forgot to change the “a”s to “author”. (click) He fixes the problem (click) and verifies the code passes his tests.
  20. That was close. So what other changes should he make?
  21. He remembers that Iverson likes programs with many small functions. (click) and he has one large function.
  22. This reminds him of one of his favorite lectures on extracting functions.
  23. He looks over his notes on extracting functions. (click) so he should find a block that does something (click) extract the code into a function with a good name (click) and replace the original block with a function call.
  24. He also sees that this technique is related to the DRY principle. Whatever!
  25. Doug looks at his code, looking for blocks that do something. (click) He finds some code that replaces hyphens with a space, (click) so he extracts that code to a function called replace_hyphen and (click) replaces the original code with a function call. (click) Doug has learned his lesson after almost forgetting to test his name changes. He runs his unit tests. They pass. (click) He also finds some code that removes punctuation, and extracts a function for that as well. Again his code passes the unit tests. (click) Turns out Doug likes to refactor.
  26. This is fun! Doug decides to extract another function. (click) This part cleans and splits each block of text. He extracts this function as well and reruns the tests. (click) (click) (click)
  27. Doug really does like to refactor. (click) What other refactoring can he do? (click) He remembers that nesting is a sign that a function does too much, (click) and that he should look for repeated blocks of code. He remembers something from class about refactoring a loop that does more than one thing.
  28. Doug looks over his notes on splitting a loop. (click) you find a loop that does more than one thing (click) then split it into multiple loops that each do one thing.
  29. Doug applies this technique, (click) changing the if/else to separate if statements (click) and replacing a temporary variable with separate queries.
  30. Then he splits the 1 loop into 3 loops. (click) one for each author.
  31. Now he can extract a loop into a function (click) and replace each loop with a function call. (click) Now that he’s done it, splitting the loop just feels wrong. His old code only passed over the data one time, while the new code scans the data three times. Isn’t this needlessly inefficient?
  32. Doug thinks back to what Iverson said in class on efficiency. So not all parts of your code really matter, and you won’t know which parts matter until after you run your code. He also remembers that Iverson went on and on about some guy named Knuth.
  33. Doug finds that Knuth guy’s quote in his notes. Hmm, “root of all evil”? That IS strong language. Ok, so maybe he shouldn’t worry so much about efficiency until he sees that his code is slow.
  34. Doug looks over his code one more time. Everything looks good and he has to admit that it is clean and easier to read.
  35. Doug’s code is demonstrably better
  36. He clearly took Iverson’s clean code lectures to heart
  37. And his code consists of small functions with good names
  38. He clearly likes to refactor
  39. He clearly likes to refactor
  40. Unfortunately you will have to tune in next week to find out
  41. Doug looks back at his notes. So tests should be automated and capture the behavior of the code.
  42. Doug looks over his notes. (click) So data should say what it is (click) and functions should say what they do.
  43. Doug notices that the next slide talks about using the correct parts of speech. (click) Variables are nouns, (click) functions are verbs, (click) and something about Booleans.
  44. He looks over his notes on extracting functions. (click) so he should find a block that does something (click) extract the code into a function with a good name (click) and replace the original block with a function call.
  45. Doug looks over his notes on splitting a loop. (click) you find a loop that does more than one thing (click) then split it into multiple loops that each do one thing.
  46. He also sees that this technique is related to the DRY principle. Whatever!
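
The extract-function steps and the habit of rerunning tests after every change, both described in the notes above, can be sketched in Python. The name replace_hyphen follows note 25; remove_punctuation, clean_and_split, the lower-casing step, and the sample sentence are hypothetical and are not taken from Doug's slides.

```python
import string


def replace_hyphen(text):
    """Replace each hyphen with a space so hyphenated words split apart."""
    return text.replace("-", " ")


def remove_punctuation(text):
    """Drop every ASCII punctuation character from the text."""
    return text.translate(str.maketrans("", "", string.punctuation))


def clean_and_split(text):
    """Lower-case, clean, and split a block of text into words.

    Hyphens are replaced first; otherwise remove_punctuation would delete
    them and glue the two halves of a hyphenated word together.
    """
    return remove_punctuation(replace_hyphen(text.lower())).split()


def test_clean_and_split():
    """Automated check that captures the intended behavior."""
    assert clean_and_split("Once upon a mid-night, dreary!") == [
        "once", "upon", "a", "mid", "night", "dreary",
    ]
```

A test runner such as pytest would discover and run test_clean_and_split automatically; rerunning it after each extraction is what lets Doug rename and restructure without fear of silently changing behavior.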