Python Clean Code for Machine Learning
Jean Carlo Machado
Data Products
August 2021
Motivation
● Clean ML code is hard
● Fewer surprises
● Fewer incidents & bugs
● Less technical debt
● Easier handover of projects
● More data science, less operations
● Consistently ship products faster
Outline
1. The problem
2. Large Scale Clean Code
3. Small Scale Clean Code
What is clean code?
“You know you are working on clean
code when each routine you read turns
out to be pretty much what you
expected.”
Ward Cunningham
ML Debt > Software Debt
Clean Code related
● Glue code
● Pipeline jungles
● Configuration debt
● Experimental code paths
Not Clean Code related
● Entanglement
● Hidden feedback loops
● Static analysis of data dependencies
● Correlations drift
D. Sculley et al. (2014)
Small Scale Clean Code
Readability “Feature Importance”
- Size & complexity of each line
- Indentation level
+ Average comments
+ Spacing & blank lines
(- lowers readability, + raises it)
Buse and Weimer (2008)
Decorators
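import functools

def track_execution(func):
    # Wrap func so the messages print on every call, not once at decoration time.
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        print("Started " + func.__name__)
        result = func(*args, **kwargs)
        print("Finished " + func.__name__)
        return result
    return wrapper

@track_execution
def train():
    print("Training")

train()

$ python decorator.py
Started train
Training
Finished train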
Add pre/post behaviour to functions.
List Comprehension
Reduces indentation, does not invite adding complexity, pythonic
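A minimal sketch of this point, using made-up `scores` data and a hypothetical 0.5 threshold: the same filter-and-transform step written first as a loop and then as a comprehension.

```python
# Hypothetical model scores; the names and the threshold are illustrative only.
scores = {"model_a": 0.91, "model_b": 0.42, "model_c": 0.77}

# Loop version: two levels of indentation and a mutable accumulator.
good_models_loop = []
for name, score in scores.items():
    if score > 0.5:
        good_models_loop.append(name.upper())

# Comprehension version: one expression, no accumulator to mutate.
good_models = [name.upper() for name, score in scores.items() if score > 0.5]

print(good_models)  # ['MODEL_A', 'MODEL_C']
```

Both versions produce the same list; the comprehension leaves less room for unrelated statements to creep into the loop body.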
Avoid Else, Early Return Instead
def with_else():
    # ...
    if df_historic_performance_aggregated is None:
        df_historic_performance_aggregated = df_aggregation
    else:
        # ...
        df_historic_performance_aggregated = (
            # ...
        )
    return df_historic_performance_aggregated

def without_else():
    # ...
    if df_historic_performance_aggregated is None:
        return df_aggregation
    # ...
    df_historic_performance_aggregated = (
        # ...
    )
    return df_historic_performance_aggregated
Reduce indentation and perceived complexity.
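To make the pattern runnable, here is a small sketch with plain dictionaries standing in for the data frames. The function `merge_aggregation` and its merge rule (summing counts per key) are assumptions for illustration, not the logic elided on the slide.

```python
from typing import Dict, Optional


def merge_aggregation(
    historic: Optional[Dict[str, int]], aggregation: Dict[str, int]
) -> Dict[str, int]:
    """Combine a new aggregation with the historic one, if there is one."""
    # Guard clause: nothing to merge with, so return right away.
    if historic is None:
        return aggregation

    # The main path stays at the top indentation level.
    merged = dict(historic)
    for key, count in aggregation.items():
        merged[key] = merged.get(key, 0) + count
    return merged


print(merge_aggregation(None, {"ios": 2}))                   # {'ios': 2}
print(merge_aggregation({"ios": 1, "mweb": 4}, {"ios": 2}))  # {'ios': 3, 'mweb': 4}
```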
Other Dos and Don’ts
● Metaphor journal
● import *
● assert outside of tests
● Default values for functions
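One possible reading of the `assert` bullet, offered as an assumption since the slide does not elaborate: `assert` statements are skipped when Python runs with `-O`, so validation in production code is safer as an explicit exception. The helper `split_ratio_checked` below is hypothetical.

```python
from typing import Tuple


def split_ratio_checked(train_fraction: float) -> Tuple[float, float]:
    """Return (train, test) fractions; a hypothetical helper for illustration."""
    # An explicit raise still runs under `python -O`; an `assert` would not.
    if not 0.0 < train_fraction < 1.0:
        raise ValueError(f"train_fraction must be in (0, 1), got {train_fraction}")
    return train_fraction, 1.0 - train_fraction


print(split_ratio_checked(0.75))  # (0.75, 0.25)
```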
Type-Systems
from enum import Enum
from dataclasses import dataclass

class Platform(str, Enum):
    mweb = 'mweb'
    ios = 'ios'

@dataclass
class HeaderMypyChecked:
    platform: Platform

HeaderMypyChecked(
    platform="123")

$ mypy type_system.py
type_system.py:15: error: Argument "platform" to "HeaderMypyChecked" has incompatible type "str"; expected "Platform"
Found 1 error in 1 file (checked 1 source file)

from enum import Enum
from pydantic import BaseModel

class Platform(str, Enum):
    mweb = 'mweb'
    ios = 'ios'

class HeaderRuntimeChecked(BaseModel):
    platform: Platform

HeaderRuntimeChecked(
    platform="123")

$ python type_system.py
pydantic.error_wrappers.ValidationError: 1 validation error for HeaderRuntimeChecked
platform
  value is not a valid enumeration member; permitted: 'mweb', 'ios' (type=type_error.enum; enum_values=[<Platform.mweb: 'mweb'>, <Platform.ios: 'ios'>])

15% reduction of software bugs
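Where adding pydantic is not an option, a similar runtime check can be sketched with the standard library alone. `HeaderStdlibChecked` is a hypothetical name, and coercing the value in `__post_init__` is one possible design, not something shown on the slide.

```python
from dataclasses import dataclass
from enum import Enum


class Platform(str, Enum):
    mweb = "mweb"
    ios = "ios"


@dataclass
class HeaderStdlibChecked:
    platform: Platform

    def __post_init__(self) -> None:
        # Platform(value) raises ValueError for anything that is not a member value.
        self.platform = Platform(self.platform)


header = HeaderStdlibChecked(platform=Platform.ios)
print(header.platform is Platform.ios)  # True

try:
    HeaderStdlibChecked(platform="123")
except ValueError as error:
    print("rejected:", error)
```

This only covers the runtime side; a static checker such as mypy is still what catches a plain string argument before the code runs.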
Large Scale Clean Code
Domain Driven Design
1. Inside bounded contexts the same language is spoken
2. Clean and stable contracts between contexts
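As a rough sketch of the second point, a contract between two contexts can be an immutable, typed object that is the only thing crossing the boundary. The training and serving contexts, `ModelRelease`, and both functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRelease:
    """Contract between the (hypothetical) training and serving contexts."""

    model_name: str
    version: int
    artifact_uri: str


def publish_release(internal_run: dict) -> ModelRelease:
    # Training context: translate its internal vocabulary into the contract.
    return ModelRelease(
        model_name=internal_run["experiment"],
        version=internal_run["run_id"],
        artifact_uri=internal_run["output_path"],
    )


def describe_deployment(release: ModelRelease) -> str:
    # Serving context: depends only on the contract, not on training internals.
    return f"deploy {release.model_name} v{release.version} from {release.artifact_uri}"


run = {"experiment": "ranker", "run_id": 7, "output_path": "s3://example-bucket/ranker/7"}
print(describe_deployment(publish_release(run)))
# deploy ranker v7 from s3://example-bucket/ranker/7
```

Because the dataclass is frozen, the serving side cannot mutate what it received, which helps keep the contract stable while each context evolves its own internals.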
Side-effects
Closing Notes
Much more...
1. DRY
2. KISS
3. YAGNI
“Relatively simple things can tolerate a certain
level of disorganization. However, as
complexity increases, disorganization becomes
suicidal.”
Robert Martin
Books
