Avik Basu
Mono-repositories in Python
What, When & How?
About Me
• Staff Data Scientist at Intuit
• Engineering + Data Science
• Love PS5 games
• Soccer + Tennis
• Driving is therapy!
What is a mono-repo?
What is a mono-repository?
• a.k.a. Monolith
• A single code base for multiple projects
• Projects can be
• Libraries
• Microservices
• Jupyter notebooks/explorations
• Projects can be in more than 1 language
When does it make sense to have a mono-repo?
Mono-repos
When does it make sense?
• Inter-related projects with a common goal
• Small to medium-sized codebase
• Early stage projects
• Better development velocity
• Easier promotion
• Different components have similar change rates
• System-level extensions, e.g. C/C++/Rust
Python Mono-repos
Advantages
• One place to look through everything
• Can improve development velocity (sometimes)
• Easier promotion of open-source projects (think GitHub ⭐)
• Better management of other language extensions, e.g. C/C++/Rust
• Popular mono-repositories
• pytorch-lightning
• azure-sdk-for-python
Python Mono-repos
Disadvantages
• Versioning decisions
• CI/CD complexity
• Merging and conflicts
• Code ownership issues
• Need for specialized tools
• Unintended code breakage
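A minimal sketch of how CI complexity and unintended breakage are often contained: run tests only for the projects a change touches, plus everything that depends on them. The project names, the one-directory-per-project layout, and the dependency graph below are all hypothetical, chosen to resemble a small data science repo.

```python
from pathlib import PurePosixPath

# Hypothetical dependency graph: project -> projects that depend on it.
# Assumes each project lives in a top-level directory of the same name.
DEPENDENTS = {
    "connectors-lib": {"core-lib"},
    "core-lib": {"inference-service", "notebooks"},
    "inference-service": set(),
    "notebooks": set(),
}


def affected_projects(changed_files):
    """Return the projects touched by a change, plus all downstream projects."""
    # A project is directly touched when a changed file sits under its directory.
    pending = []
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if parts and parts[0] in DEPENDENTS:
            pending.append(parts[0])

    # Walk the graph so that dependents of dependents are included too.
    affected = set()
    while pending:
        project = pending.pop()
        if project in affected:
            continue
        affected.add(project)
        pending.extend(DEPENDENTS[project])
    return affected
```

A CI job could feed this function the output of `git diff --name-only` and run only the test suites of the returned projects. Files outside any project directory (a root README, for example) map to no project in this sketch.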
Sample project
Project Scenario
Data Science
• 2 libraries
• 1 core library containing ML models and related tools
• 1 supporting library for connecting to different data sources
• 1 microservice
• FastAPI app with a docker container
• Serves real-time inference
• 1 notebook project
• Contains training jobs in Jupyter notebooks
Open-source tools
For managing mono-repos
• uv workspace
• Lets you manage multiple projects in one repository
• As a bonus, uv is a great package management tool
• Poetry monoranger plugin
• Early stage
• nx
• Language agnostic
• Pants
• Multi-language
Version management
For multiple libraries in the repo
A. One version for the whole repo
‣ Simpler to manage
‣ Easier to keep track
‣ Unnecessary updates for libraries with no changes
B. Each library has its own version
‣ No unnecessary updates
‣ Difficult to maintain
‣ Essential to have a compatibility matrix
‣ Needs git submodules
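For option B, a compatibility matrix can be as simple as a lookup table checked in CI or at import time. The sketch below is one possible shape, not a prescribed format; the library names and every version number in it are invented for illustration.

```python
# Hypothetical compatibility matrix:
# core-lib version -> connectors-lib versions it is known to work with.
COMPATIBLE = {
    "1.0": {"0.3", "0.4"},
    "1.1": {"0.4"},
    "2.0": {"0.5"},
}


def is_compatible(core_version, connectors_version):
    """Return True if the pair of versions is listed in the matrix."""
    # Unknown core versions are treated as incompatible with everything.
    return connectors_version in COMPATIBLE.get(core_version, set())


def supported_connectors(core_version):
    """Return the connectors-lib versions allowed for a core-lib version, sorted."""
    return sorted(COMPATIBLE.get(core_version, set()))
```

With one version for the whole repo (option A) this table is unnecessary, since every library at a given version is tested together by construction.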
Thank You! 🙏
GitHub Repo
Connect on LinkedIn
