2. Valerio Schiavoni - University of Neuchâtel - UFSM - 02/02/2018 - SafeFS
But first… where is Neuchâtel?
3. Cloud Storage
• Data is growing at an unprecedented rate
• Cloud storage is the de facto choice for millions of users and enterprises
  • reduced costs
  • availability
  • ease of use
4. Cloud Storage
• Heterogeneous interfaces for applications
• Data control belongs to the cloud
  • according to a European study conducted in 2015:
    • 67% of the population is concerned about data privacy
    • only 15% of users feel in control of their data
• Cloud data is vulnerable to
  • hackers, storage providers, governmental agencies
  • other (possibly unknown) threats
Slide annotation: not in this talk
5. Current Solutions
• Abstract third-party interfaces
  • e.g., multi-cloud file system
• Support data processing on the client's premises before uploading data to cloud services
  • data encryption
  • replication, deduplication, caching
6. Challenges
• Traditional filesystems follow a monolithic design
  • examples shown on the slide: ext3, ext4, EncFS, CryFS
• Different applications have specific requirements
  • performance
  • dependability
  • security
  • → different storage features
7. Challenges
• Stackable file system solutions improve flexibility
• Their design is still limited:
  • focused on the modularity of a specific feature
  • decisions (kernel vs. user-space)
Image source: hypem.com
8. Contributions
• SafeFS: a modular user-space secure file system
  • layered design with two-dimensional modularity
  • self-contained, stackable and reusable layers
  • easy implementation and reuse of layers
  • support for single and multiple storage backends
  • adaptability to different application workloads
  • transparency for applications
9. The rest of this talk
• Architecture
• Life of a SafeFS operation
• Some implementation details
• Some evaluation results
• Conclusion
10. Architecture
• Layers
  • processing vs storage
  • stackable
  • common API (FUSE)
• Drivers
  • extended flexibility
  • common API
[Figure: SafeFS architecture. In kernel space, the Virtual Filesystem and the FUSE kernel module sit below the user application. In user space, the FUSE user-space library hands requests to SafeFS, which stacks layers 0 to N: processing layers above storage layers, each exposing the FUSE API. One layer is expanded as an example: a Privacy-Preserving layer with AES and DET drivers. Arrows mark the request and reply paths.]
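The stacking idea can be pictured with a small sketch. It is written in Python for brevity (SafeFS itself is implemented in C), and everything in it is an assumption made for illustration: the class names `Layer`, `MaskLayer` and `MemoryStorage` are hypothetical, and the two-call `write`/`read` interface is only a stand-in for the much larger FUSE API that real layers share.

```python
from typing import Dict, Optional


class Layer:
    """Common interface shared by all layers (stand-in for the FUSE API).

    The base class forwards every call to the layer below it,
    so on its own it behaves like an identity layer.
    """

    def __init__(self, below: Optional["Layer"] = None) -> None:
        self.below = below

    def write(self, path: str, data: bytes) -> int:
        return self.below.write(path, data)

    def read(self, path: str) -> bytes:
        return self.below.read(path)


class MaskLayer(Layer):
    """Toy processing layer: XORs every byte with a constant.

    This is NOT encryption; it only shows where a privacy-preserving
    layer would transform data on the way down and back up.
    """

    def __init__(self, below: Layer, mask: int = 0x5A) -> None:
        super().__init__(below)
        self.mask = mask

    def _apply(self, data: bytes) -> bytes:
        return bytes(b ^ self.mask for b in data)

    def write(self, path: str, data: bytes) -> int:
        return self.below.write(path, self._apply(data))

    def read(self, path: str) -> bytes:
        return self._apply(self.below.read(path))


class MemoryStorage(Layer):
    """Toy storage layer: bottom of the stack, keeps files in a dict."""

    def __init__(self) -> None:
        super().__init__(None)
        self.files: Dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> int:
        self.files[path] = data
        return len(data)

    def read(self, path: str) -> bytes:
        return self.files[path]


# Layer 0 (identity) -> layer 1 (processing) -> layer 2 (storage).
storage = MemoryStorage()
stack = Layer(MaskLayer(storage))
stack.write("/notes.txt", b"hello")
print(stack.read("/notes.txt"))  # b'hello'
```

Because every layer exposes the same two calls, layers can be reordered or dropped without touching the others, which is the property the common API is meant to provide.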
11. Storage requests flow
[Figure: numbered path of a storage request and its reply. The request leaves the user application, crosses the Virtual Filesystem and the FUSE kernel module (kernel space), reaches the FUSE user-space library, and then traverses the SafeFS processing and storage layers (user space), each exposing the FUSE API. The reply follows the same path in reverse.]
12. SafeFS - Implementation
• 3 supported layers
  • Granularity-Oriented
  • Privacy-Preserving
  • Multiple-Backend
• Layers and drivers chosen at mount time
• Implemented in C
[Figure: SafeFS stack. Privacy-Preserving layer (drivers: AES, DET, ...), Granularity-Oriented layer (drivers: Block, ID), Multiple-Backend layer (drivers: REP, XOR, ER). Storage backends: NFS, Dropbox, other storage, ... Layers are connected through the FUSE API.]
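Choosing layers and drivers at mount time amounts to validating a requested stack against what is available. The sketch below is a hypothetical illustration in Python, not the SafeFS configuration format: the function `resolve_stack`, the layer keys and the lower-case driver names are assumptions, chosen to mirror the layer and driver names used in this talk.

```python
from typing import List, Tuple

# Layers and drivers named in the talk; the lower-case keys are
# hypothetical identifiers invented for this sketch.
SUPPORTED_DRIVERS = {
    "granularity": {"block", "id"},
    "privacy": {"aes", "det", "id"},
    "multiple_backend": {"rep", "xor", "erasure"},
}


def resolve_stack(config: List[Tuple[str, str]]) -> List[str]:
    """Validate (layer, driver) pairs listed from top to bottom of the stack.

    Returns "layer:driver" labels in stack order, or raises ValueError
    if a layer or driver is unknown or a layer is listed twice.
    """
    seen = set()
    resolved = []
    for layer, driver in config:
        if layer not in SUPPORTED_DRIVERS:
            raise ValueError(f"unknown layer: {layer}")
        if driver not in SUPPORTED_DRIVERS[layer]:
            raise ValueError(f"unknown driver for {layer}: {driver}")
        if layer in seen:
            raise ValueError(f"layer listed twice: {layer}")
        seen.add(layer)
        resolved.append(f"{layer}:{driver}")
    return resolved


# A stack resembling the "AES" setup: block granularity, AES, one backend.
aes_stack = resolve_stack([
    ("granularity", "block"),
    ("privacy", "aes"),
    ("multiple_backend", "rep"),
])
print(aes_stack)
```

Rejecting an invalid combination before mounting keeps configuration errors out of the request path.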
13. SafeFS - Configuration
• Possible combinations of layers and drivers
• Each offering different guarantees in terms of
  • security
  • dependability
  • performance

                        Granularity-Oriented   Privacy-Preserving   Multiple-Backend
Group       Stack       Block   Id             AES   Det   Id       Simple   XOR     Erasure
Baseline    FUSE        -       -              -     -     -        ✓ (1)    -       -
Baseline    Identity    -       ✓              -     -     ✓        ✓ (1)    -       -
Privacy     AES         ✓       -              ✓     -     -        ✓ (1)    -       -
Privacy     Det         ✓       -              -     ✓     -        ✓ (1)    -       -
Privacy     XOR         -       -              -     -     -        -        ✓ (3)   -
Redundancy  Rep         -       -              -     -     -        ✓ (3)    -       -
Redundancy  Erasure     ✓       -              -     -     -        -        -       ✓ (3)

Table 2: The different SafeFS stacks deployed in the evaluation. Stacks are divided in three distinct groups: Baseline, Privacy, Redundancy. The table header holds the three SafeFS layers. Below each layer we show the respective drivers. For each stack, we indicate the active drivers (the ✓ symbol). Layers without any active drivers are not used in the stack. The indices for Multiple-Backend drivers indicate the number of storage backends used to write data.

Paper excerpts shown on the slide (truncated at the edges):
[...]tively to a standard and a deterministic encryption mechanism. The AES stack is expected to be less efficient than Det as it generates a different IV for each block. However, Det has the weakest security guarantee. The third stack, named XOR, considers a different trust model where no single storage location is trusted with the totality of the ciphered data. Data is stored across distinct storage back-ends in such a way that unless an attacker gains access simultaneously to [...]
We ran several workloads for each considered file system (4 third-party file systems and 7 SafeFS stacks). The results have been grouped according to the workloads. First, we present the results of using db_bench, then filebench and finally, we describe the results of running latency analysis for SafeFS layers.
Microbenchmark: db_bench. We first present the results obtained with db_bench. We pick 7 workloads, each [...]
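The trust model behind the XOR stack, where no single storage location holds the whole ciphered data, can be illustrated with textbook XOR secret sharing. The Python sketch below is an assumption about the general technique, not a description of the SafeFS XOR driver itself; the function names `xor_split` and `xor_join` are hypothetical, and the default of three shares mirrors the three backends listed for this stack.

```python
import secrets
from typing import List


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def xor_split(data: bytes, n_backends: int = 3) -> List[bytes]:
    """Split data into n shares, one per backend.

    The first n-1 shares are random; the last is the data XORed with all
    of them. Any subset of fewer than n shares is indistinguishable from
    random bytes.
    """
    if n_backends < 2:
        raise ValueError("need at least two backends")
    shares = [secrets.token_bytes(len(data)) for _ in range(n_backends - 1)]
    last = data
    for share in shares:
        last = xor_bytes(last, share)
    return shares + [last]


def xor_join(shares: List[bytes]) -> bytes:
    """Recombine all shares; every share is required."""
    out = bytes(len(shares[0]))  # all-zero buffer of the right length
    for share in shares:
        out = xor_bytes(out, share)
    return out


block = b"secret block"
parts = xor_split(block, 3)
restored = xor_join(parts)
```

The cost of this scheme is that each write stores as many bytes on every backend as the original block, and losing any one backend loses the data, which is why it sits in the Privacy group rather than the Redundancy group.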
14. Experimental Evaluation
• Multiple benchmarks and workloads
  • filebench
  • db_bench
• Third-party filesystems and SafeFS configurations
  • 7 SafeFS setups
  • 4 filesystems (CryFS, LessFS, MetFS and eCryptFS)
• Experimental setup
  • Virtual machines with 4 cores, 4 GB RAM and HDD drives
15. Filebench results
[Figure 5: Relative performance of filebench workloads against native. Y axis: ratio against native (ext4). Workloads: File-server, Mail-server, Web-server, filemicro_rread_4K, filemicro_rwrite_4K, filemicro_seqread_4K, filemicro_seqwrite_4K. Series: SafeFS AES, Det, Erasure, FUSE, Identity, Rep, XOR.]
[Figure 6: Execution time breakdown for different SafeFS stacks. Y axis: execution time (%). Workloads: fill100K, fillrandom, fillseq, overwrite, readrandom, readreverse, readseq. Stacks: AES, Det, Erasure, FUSE, Identity, Rep, XOR. Series: multi_write, sfuse_write, align_write, multi_read, sfuse_read, align_read.]
As expected, the time spent in each layer varies according to the tasks performed by the layers.
• Evaluation of SafeFS setups with 7 filebench workloads
• Throughput compared against ext4
  • red (below 25%)
  • orange (up to 75%)
  • yellow (up to 95%)
  • green (>= 95%)
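The colour bands used to read the filebench plot can be written as a small classifier over the throughput ratio against ext4. This is a Python sketch under stated assumptions: the function names are hypothetical, and the treatment of the exact boundary values (a ratio of exactly 0.25 counts as orange, exactly 0.75 as yellow) is a guess, since the slide only gives the ranges informally.

```python
def throughput_ratio(measured: float, native: float) -> float:
    """Throughput of a setup relative to the native (ext4) baseline."""
    if native <= 0:
        raise ValueError("native throughput must be positive")
    return measured / native


def classify_ratio(ratio: float) -> str:
    """Map a ratio against native to the colour bands from the slide."""
    if ratio < 0:
        raise ValueError("ratio cannot be negative")
    if ratio < 0.25:
        return "red"      # below 25% of native
    if ratio < 0.75:
        return "orange"   # up to 75% (boundary handling assumed)
    if ratio < 0.95:
        return "yellow"   # up to 95% (boundary handling assumed)
    return "green"        # 95% of native or better


# Hypothetical numbers, for illustration only (not measured results).
example = classify_ratio(throughput_ratio(measured=180.0, native=200.0))
print(example)  # yellow
```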
16. Filebench results
[Figure 5 (repeated, with a zoomed detail): Relative performance of filebench workloads against native. Y axis: ratio against native (ext4). The zoomed legend also lists eCryptFS, EncFS and MetFS alongside the SafeFS stacks (AES, Det, Erasure, FUSE, Identity, Rep, XOR).]
• Evaluation of SafeFS setups with 7 filebench workloads
• Throughput compared with ext4
17. Other results
• db_bench experiments
  • significant overhead in write requests
  • read request performance close to ext4
  • uniform results across SafeFS and other filesystems
• Time spent in each SafeFS layer
  • Setups using encryption or erasure coding require significant processing time and CPU in the respective layers
  • The Granularity-Oriented layer is time-demanding, especially for write requests
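One plausible reason a block-granularity layer costs more on writes is that a write which only partly covers a block forces the existing block to be read first. The Python sketch below illustrates that reasoning only; it is an assumption, not a description of the SafeFS Block driver, and the 4096-byte block size and the function names are hypothetical.

```python
BLOCK_SIZE = 4096  # assumed block size, for illustration only


def touched_blocks(offset: int, size: int, block_size: int = BLOCK_SIZE) -> range:
    """Indices of the fixed-size blocks covered by [offset, offset + size)."""
    if size <= 0:
        return range(0)
    first = offset // block_size
    last = (offset + size - 1) // block_size
    return range(first, last + 1)


def needs_read_before_write(offset: int, size: int,
                            block_size: int = BLOCK_SIZE) -> bool:
    """True if the write covers its first or last block only partially.

    In that case the untouched bytes of the block must be fetched before
    the whole block can be processed and written back.
    """
    if size <= 0:
        return False
    return offset % block_size != 0 or (offset + size) % block_size != 0


# A 200-byte write starting at offset 4000 straddles blocks 0 and 1.
blocks = list(touched_blocks(4000, 200))
partial = needs_read_before_write(4000, 200)
```

A block-aligned 4 KB write touches a single block and needs no prior read, whereas the small unaligned write above touches two blocks and requires both to be read first.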
18. Conclusion /1
• Fixed combinations of storage features cannot fulfil the requirements of distinct applications
• SafeFS addresses this challenge with
  • a modular layer and driver design
  • a common API for easily stacking layers
• This makes it possible to
  • create combinations of storage features based on application requirements
  • reduce the cost and complexity of reusing existing layers or implementing new ones
19. Conclusion /2
• Our experiments show that
  • different SafeFS setups are easily deployable
  • a layered approach has similar performance to other monolithic privacy-preserving filesystems
• Future work
  • Workload-aware and automatic configuration of layers
  • Run-time configuration of layers and drivers
  • Encryption key management and access control
Open source, available at https://github.com/safecloud-project/SafeFS