This research has been partly funded by the EU H2020 Programs
H2020-EU.2.1.1-CyberSec4Europe (Grant No. 830929), AssureMoss (Grant
No. 952647) and SPARTA project (Grant No. 830892).
LastPyMile: a lightweight approach for securing Python ecosystem from software supply chain attacks
D.L. Vu (1), F. Massacci (1,2), I. Pashchenko (1), H. Plate (3), A. Sabetta (3)
(1) University of Trento, IT
(2) Vrije Universiteit Amsterdam, NL
(3) SAP Security Research, FR
In FOSS, we all go and use the Source, don’t we?
• Well, mostly not: we all download a binary (for Linux) or a package (for Python) from a repository (e.g., PyPI)
• What if the package in PyPI is NOT the “same” as the source in GitHub?
• In the top 2.5K packages in PyPI (around 10 MLoC), 65% have subtle differences and 5% have changes in Python code
• Malicious code, anyone?
• How to check this AT SCALE?
• PyPI’s ~10 maintainers must “shepherd” 400K package owners
• In 2020 → 1.9K updates per day, with an average of 3.5K files, by 77K developers
• Reproducible builds? A non-starter → too many developers, too many diffs
• Git log? Malware check on all lines? A non-starter → even 1 false positive × #updates ...
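A back-of-the-envelope calculation shows why even a single false positive per update does not scale. The sketch below uses only the figures quoted above; splitting the alerts evenly across the maintainers is an assumption made purely for illustration, and the function name is hypothetical.

```python
# Figures quoted on the slide; the even split across maintainers is an assumption.
UPDATES_PER_DAY = 1_900   # package updates per day in 2020
MAINTAINERS = 10          # approximate number of PyPI maintainers


def alerts_per_maintainer(false_positives_per_update: float) -> float:
    """Daily false alerts each maintainer would have to triage."""
    total_alerts = UPDATES_PER_DAY * false_positives_per_update
    return total_alerts / MAINTAINERS


if __name__ == "__main__":
    for rate in (1.0, 0.1, 0.01):
        per_person = alerts_per_maintainer(rate)
        print(f"{rate} FP per update -> {per_person:.1f} alerts per maintainer per day")
```

At one false positive per update, each maintainer would face about 190 false alerts every day, before looking at a single true one.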
A PyPI Journey: From Source to Package
Start releasing
Development
- Source code repo (e.g., GitHub)
Build and Publication (build systems: CI cloud services such as Travis CI, developers’ workstations, etc.)
- Build and additional code generation (e.g., Swagger, Codegen), yielding release-ready code
  config info: CI credentials, source code repo credentials
- Generated artifacts, possibly: OS binaries, test coverage log, documentation, automatically generated code
- Upload to package repo (e.g., by twine)
  config info: package repo credentials
Consumption
- Package repository: generated artifacts in package repo (PyPI), possibly: OS binaries, test coverage log, documentation, automatically generated code
End releasing
Current Package Scanners
Get a Package, then:
PyPI
- Select files for review → selected files (e.g., setup.py) → automated check for malware (e.g., PyPI Malware Check) → output review
- Limitation: malicious code can be embedded into a different file
FOSS Alternative
- Get an artifact → files and resources → Bandit checks → output review
- Limitation: all files/lines => many FP
Comprehensive Security Review
- Identify source code repo → source code repository → source code scanners → output review
- Limitation: all files/lines + resource intensive + human effort => many FP + not possible to perform on-the-fly
LastPyMile to Identify Code Injection
LastPyMile
- Step 1: Identify source code repo
- Step 2: Hash files & collect lines from repo
- Step 3: Get an artifact, then hash files & collect lines from artifact
- Step 4: Identify phantom files and lines (present in the package but not in the source code repository)
- Filter Input / Filter Output: use the phantom files and lines to narrow what the scanners below read or report
Existing scanners (Get a Package)
- PyPI: select files for review → selected files (e.g., setup.py) → automated check for malware (e.g., PyPI Malware Check) → output review
- FOSS Alternative: get an artifact → files and resources → Bandit checks → output review
- Comprehensive Security Review: identify source code repo → source code repository → source code scanners → output review
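Steps 2 to 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: files are passed in as in-memory bytes, SHA-256 is chosen as the hash, lines are compared after stripping whitespace, and the names `file_hash`, `index_repo` and `find_phantoms` are hypothetical.

```python
import hashlib
from typing import Dict, Iterable, List, Set, Tuple


def file_hash(content: bytes) -> str:
    """Hash a whole file; SHA-256 is an assumption of this sketch."""
    return hashlib.sha256(content).hexdigest()


def index_repo(
    versions: Dict[str, Iterable[bytes]],
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Step 2: hash every known version of each repo file and collect its lines.

    `versions` maps a file path to every content that path had in the history.
    """
    hashes: Set[str] = set()
    lines: Dict[str, Set[str]] = {}
    for path, contents in versions.items():
        known = lines.setdefault(path, set())
        for content in contents:
            hashes.add(file_hash(content))
            text = content.decode("utf-8", errors="replace")
            known.update(line.strip() for line in text.splitlines())
    return hashes, lines


def find_phantoms(
    artifact: Dict[str, bytes],
    repo_hashes: Set[str],
    repo_lines: Dict[str, Set[str]],
) -> Tuple[List[str], Dict[str, List[Tuple[int, str]]]]:
    """Steps 3 and 4: hash artifact files, then report what the repo never had."""
    phantom_files: List[str] = []
    phantom_lines: Dict[str, List[Tuple[int, str]]] = {}
    for path in sorted(artifact):
        content = artifact[path]
        if file_hash(content) in repo_hashes:
            continue  # byte-identical to some version in the repo
        if path not in repo_lines:
            phantom_files.append(path)  # the repo never contained this path
            continue
        text = content.decode("utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            stripped = line.strip()
            if stripped and stripped not in repo_lines[path]:
                phantom_lines.setdefault(path, []).append((number, stripped))
    return phantom_files, phantom_lines


if __name__ == "__main__":
    # Toy data: one repo file, and an artifact with an extra line and an extra file.
    repo = {"setup.py": [b"from setuptools import setup\nsetup(name='demo')\n"]}
    artifact = {
        "setup.py": b"from setuptools import setup\nimport os\nsetup(name='demo')\n",
        "demo/extra.py": b"print('hi')\n",
    }
    repo_hashes, repo_lines = index_repo(repo)
    files, lines = find_phantoms(artifact, repo_hashes, repo_lines)
    print(files)  # ['demo/extra.py']
    print(lines)  # {'setup.py': [(2, 'import os')]}
```

Comparing hashes first keeps the common case cheap: a file that is byte-identical to some repository version is skipped without any line-level work, and only the differing files are inspected line by line.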
LastPyMile Combined with other Package Scanners
                                       Problem Size                                              Malware Checks Alerts            Suspicious Bandit Alerts
Artifact            Type       #Files  #LoCs (all files)  #LoCs (setup.py)  Coverage (setup.py)  setup.py  whole pkg  LastPyMile  setup.py  whole pkg  LastPyMile  In setup.py
urllib3-1.26.3      Benign     80      25,348             97                0.4%                 1         260        0           8         1,398      0           -
requests-2.25.1     Benign     32      9,325              112               1.2%                 3         57         0           9         505        0           -
setuptools-53.0.0   Benign     244     70,794             162               0.2%                 4         2,932      0           5         762        0           -
urlib3-1.21.1       Malicious  72      20,448             197               1%                   4         177        3           20        1,044      12          Y
request-1.0.117     Malicious  3       166                52                31.3%                2         8          2           5         27         20          N
setup-tools-36.0.1  Malicious  112     31,245             304               1%                   8         1,289      3           21        489        12          Y
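The gap between the "whole pkg" and "LastPyMile" columns presumably comes from keeping only the alerts that land on code missing from the source repository. A minimal sketch of such an output filter follows. The alert records are hypothetical dictionaries whose `filename` and `line_number` keys are modelled on a scanner's JSON report and should be treated as an assumption, as should the function name `filter_alerts`.

```python
from typing import Dict, List, Set


def filter_alerts(
    alerts: List[dict],
    phantom_files: Set[str],
    phantom_lines: Dict[str, Set[int]],
) -> List[dict]:
    """Keep only alerts raised on a phantom file or on a phantom line."""
    kept = []
    for alert in alerts:
        path = alert["filename"]
        on_phantom_line = alert["line_number"] in phantom_lines.get(path, set())
        if path in phantom_files or on_phantom_line:
            kept.append(alert)
    return kept


if __name__ == "__main__":
    # Hypothetical scanner output: three alerts, only two on phantom code.
    alerts = [
        {"filename": "setup.py", "line_number": 2, "issue": "suspicious import"},
        {"filename": "setup.py", "line_number": 40, "issue": "subprocess call"},
        {"filename": "demo/extra.py", "line_number": 1, "issue": "network access"},
    ]
    kept = filter_alerts(alerts, {"demo/extra.py"}, {"setup.py": {2}})
    print(len(alerts), "->", len(kept))  # 3 -> 2
```

An alert on a line that also exists in the repository is dropped, on the reasoning that it is already visible to anyone reviewing the source; that design choice is what keeps the alert count for benign packages near zero.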
LastPyMile
LastPyMile is a simple yet efficient methodology for identifying discrepancies between packages and their sources.
Combined with other package scanners, LastPyMile is a promising way to check injected code for maliciousness.
Future plans:
• The code is under review by a company - stay tuned…
• We plan to submit LastPyMile to PyPA as a new check for uploaded packages.
Check out our replication package: https://zenodo.org/record/4899935
Want to know more about LastPyMile? Have ideas? Want to contribute?
Feel free to reach out to us!

SFScon 21 - Duc Ly Vu - LastPyMile: a lightweight approach for securing Python ecosystem from software supply chain attacks
