Storage Policies can be set by unprivileged users.
HDFS also supports quotas on storage media, which are set by the administrator.
Memory-mapped files are another option. They work well for reads but do not work well with the existing HDFS write pipeline.
Cache pools are analogous to HDFS Quotas, but not quite the same.
Cache pools allow administrators to control which users can use memory resources
These two problems are relatively easy to solve.
We don’t want to indiscriminately target all input or output data to memory.
Frameworks lack application context, such as which data will be accessed often or the expected output size of a given job.
Let’s say we have a hypothetical file system called memfs which performs caching I/O on both the read and write paths.
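To make the memfs idea concrete, here is a minimal Python sketch of a file system that caches on both paths. Everything in it is an assumption made for illustration: the `MemFS` class, its `write`/`read`/`flush` methods, and the plain dict standing in for durable storage are hypothetical and are not part of HDFS.

```python
class MemFS:
    """Toy model of the hypothetical memfs: memory in front of a slower store."""

    def __init__(self, backing_store):
        self.backing = backing_store  # stands in for durable storage (a dict here)
        self.cache = {}               # path -> bytes currently held in memory
        self.dirty = set()            # paths written to memory but not yet persisted

    def write(self, path, data):
        # Write path: data lands in memory first and is persisted later.
        self.cache[path] = data
        self.dirty.add(path)

    def read(self, path):
        # Read path: serve from memory if present, otherwise load and keep a copy.
        if path not in self.cache:
            self.cache[path] = self.backing[path]
        return self.cache[path]

    def flush(self):
        # Persist every dirty path to the backing store.
        for path in sorted(self.dirty):
            self.backing[path] = self.cache[path]
        self.dirty.clear()


disk = {"/in/a": b"input"}
fs = MemFS(disk)
fs.write("/out/b", b"result")
on_disk_before_flush = "/out/b" in disk  # the write is only in memory so far
fs.flush()
on_disk_after_flush = "/out/b" in disk   # the write has now been persisted
```

Note that this sketch caches every read and every write with no size limit or eviction, which is the indiscriminate use of memory that we want to avoid. A real design needs some way to decide which data deserves memory.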