BU_DEMO

By: Kostas, Jeremy, Renqing, Rahul
Mentor: Sastry S Duri (IBM Research)

What is it?
OVERVIEW
Container Safety Determination (CSD) is a
scanning and monitoring tool that lets
engineers examine the safety state of their
containers.
The tool works for both images and
containers, and can be configured to work
without user intervention.
WORKING
CSD works by detecting suspicious files. It
compares all the files of a given image with a
database of known malicious and non-malicious
binaries in order to determine how safe an image
is. The security engineer reviews the feedback
received for a particular image and takes action
accordingly.
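The lookup described above can be sketched roughly as follows. This is an illustration only, not the CSD implementation: exact SHA-256 matching stands in for the sdhash similarity comparison the tool uses, and the names classify_files, image_is_suspicious, known_bad and known_good are hypothetical.

```python
import hashlib
from typing import Dict, Set


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(data).hexdigest()


def classify_files(
    files: Dict[str, bytes],
    known_bad: Set[str],
    known_good: Set[str],
) -> Dict[str, str]:
    """Label each file as "malicious", "known-good" or "unknown".

    Assumption: the reference sets hold exact digests. CSD itself
    compares sdhash similarity digests, which this sketch does not do.
    """
    verdicts = {}
    for path, content in files.items():
        digest = sha256_hex(content)
        if digest in known_bad:
            verdicts[path] = "malicious"
        elif digest in known_good:
            verdicts[path] = "known-good"
        else:
            verdicts[path] = "unknown"
    return verdicts


def image_is_suspicious(verdicts: Dict[str, str]) -> bool:
    """An image is flagged as soon as one file matches the malicious set."""
    return any(label == "malicious" for label in verdicts.values())
```

Files labelled "unknown" are the ones a security engineer would still need to review by hand.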

Agile Development
- Sprint planning before start of sprint
- Break down requirements into simple tasks that can be completed during sprint
- Prioritize and assign tasks
- Status meetings every other day to address blocking issues
- Weekly meetings with mentor to review progress and design
- Sprint review at end of sprint
- Discuss lessons learned and how to improve in next sprint
- Used Trello for sprint planning
- Used Slack for team communication and Hangouts for meetings

Sprint Planning
[Figure: timeline of Sprint 1 through Sprint 6 showing the work items below]
- Analysis & Design
- Registry
- sdhash
- Usecase 1
- Usecase 2
- Action plan
- Elasticsearch
- Endpoints
- Docker crawl
- Docker UI
- Scalability
- Documentation
- UI and video

1. DOCKER REGISTRY
Let's start with the Docker registry

Typical Deployment Model
[Figure: a Developer, a Docker Public Registry, a Docker Private Registry, and several PRODUCTION hosts]

Usecases
USECASE 1
What if the image used to launch a container
contains suspicious files?
The deployment engineer should have some
confidence that the image is free of
suspicious elements.
USECASE 2
What if the container was launched from a safe
image but was compromised afterwards?
There should be a way to determine how safe a
container is at regular intervals. Rescanning
at a fixed interval can help identify
compromised containers.

Areas of Interest
[Figure: the typical deployment model, annotated with the two areas below]
- Scan images stored in the registry
- Scan running containers

SDHASH
sdhash is a tool that compares two blobs of data and scores their similarity, from 0 to 100,
based on the byte sequences they have in common. It gives quick results, which helps in the
initial triage and investigation of files. The digest it produces is only about 2-3% of the
size of the original file.
Example: sdhash digests for two text files (~14KB) with mostly the same content, except for
a few sentences deleted from file2.
[Figure: file1 hash and file2 hash]
sdhash similarity index: 96
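As a rough sketch, similarity results like the one above could be read into Python as follows. It assumes the comparison output has one result per line in the form name1|name2|score, which should be checked against the sdhash version in use. parse_sdhash_matches and Match are hypothetical names.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    first: str
    second: str
    score: int  # similarity score, 0 to 100


def parse_sdhash_matches(output: str, threshold: int = 0) -> List[Match]:
    """Parse lines of the assumed form 'name1|name2|score'.

    Lines that do not fit that shape are skipped, and only matches
    scoring at least `threshold` are returned.
    """
    matches = []
    for line in output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split from the right so a '|' inside the first name survives.
        parts = line.rsplit("|", 2)
        if len(parts) != 3:
            continue
        first, second, raw_score = parts
        try:
            score = int(raw_score)
        except ValueError:
            continue
        if score >= threshold:
            matches.append(Match(first, second, score))
    return matches
```

A threshold lets the caller discard weak matches, which matters because low scores on small files are more likely to be noise.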

REFERENCE DATASETS
- Official images available on Docker Hub
- NSRL dataset
- Self-evaluated dataset
- Datasets by third parties: ClamAV, VirusShare.com

Flow of image upload
[Figure: flow diagram with the following labels]
- # docker push 10.10.10.10:5000/xyz:latest
- Registry Instance
- Broadcaster
- ENDPOINT
- Rabbitmq
- xyz:latest
- Find base
- centos:7
- Compare: centos:7 vs xyz:latest
- Index different files
- Calculate sdhash
- Elasticsearch sdhash index
- Scan different files
- Scanner
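The "Compare centos:7 vs xyz:latest" step can be illustrated with a minimal sketch. It assumes each image has already been reduced to a mapping from file path to content digest. How CSD builds that mapping is not shown in the slides, and changed_files is a hypothetical helper.

```python
from typing import Dict, List


def changed_files(base: Dict[str, str], image: Dict[str, str]) -> List[str]:
    """Return paths in `image` that are new or differ from `base`.

    Both arguments map file path -> content digest (assumed format).
    Files deleted from the base are ignored: there is nothing to scan.
    """
    return sorted(
        path for path, digest in image.items() if base.get(path) != digest
    )


# Example: only the modified and the added file need scanning.
base_image = {"/bin/ls": "aa11", "/etc/hosts": "bb22"}
pushed_image = {"/bin/ls": "aa11", "/etc/hosts": "cc33", "/opt/app": "dd44"}
to_scan = changed_files(base_image, pushed_image)
```

Scanning only the files that differ from the base image keeps the work proportional to what the developer added, not to the size of the whole image.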

Container Scan
[Figure: flow diagram with the following labels]
- scan ae3894fea89
- DOCKER HOST
- get containers
- Get container diff and find files
- centos:7
- Compare
- Scan different files
- Elasticsearch sdhash index
- Scanner
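One way to sketch the "get container diff" step is to parse the output of the docker diff command, which prints one change per line prefixed with A (added), C (changed) or D (deleted). Whether CSD uses the CLI or the Docker API is not stated in the slides, so container_diff and paths_to_scan are hypothetical helpers.

```python
import subprocess
from typing import List


def container_diff(container_id: str) -> str:
    """Run `docker diff` for one container and return its raw output.

    Requires the docker CLI and access to the Docker daemon.
    """
    result = subprocess.run(
        ["docker", "diff", container_id],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def paths_to_scan(diff_output: str) -> List[str]:
    """Keep added (A) and changed (C) paths and drop deleted (D) ones.

    Changed directories are listed too, so a real scanner would still
    need to filter out non-files on the host.
    """
    paths = []
    for line in diff_output.splitlines():
        kind, _, path = line.strip().partition(" ")
        if kind in ("A", "C") and path:
            paths.append(path)
    return paths
```

For the container in the figure, the call would look like paths_to_scan(container_diff("ae3894fea89")).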

Lessons learnt
- Working as a team in an agile environment
- Working with technologies such as Docker, sdhash, Elasticsearch, and RabbitMQ
- Internals of Docker and docker-registry
- Working with cloud platforms
- Configuring and packaging code within containers, and distributing containers

Limitations and future plans
LIMITATIONS
- sdhash is not ideal for comparing small files, which can result in false positives
- The tool indicates whether an image is safe or potentially unsafe only for known files. It can be improved to provide more conclusive verdicts on image safety
- sdhash does not work well with binary files
- The current reference dataset is very small and relies on the assumption that the official images are correct. We need a bigger dataset of malicious files
FUTURE PLANS
- Enable a plugin architecture for adding new modules to detect vulnerabilities, so that developers can integrate their own detection engines easily and we get better results
- Enable a master-slave model where the master can spin up containers as the load increases
- Add a tool for comparing binary files
- Add a tool for comparing small files

Thanks!
Any questions?
You can find us at:
gladius@bu.edu
jmwenda@bu.edu
konpap94@bu.edu
rahuls@ccs.neu.edu
