Computing-intensive experiments in modern sciences have become increasingly data-driven, perfectly illustrating the challenges of the Big-Data era. These experiments are usually specified and enacted in the form of workflows that need to manage (i.e.,~read, write, store, and retrieve) sensitive data such as persons' past diseases and treatments. While there is an active body of research on how to protect sensitive data by, for instance, anonymizing datasets, few approaches assist scientists in identifying which datasets generated by a workflow need to be anonymized, and in setting the degree of anonymization that must be met. In this paper, we present a preliminary approach for setting and inferring the anonymization requirements of datasets used and generated by a workflow execution. The approach was implemented and showcased using a concrete example, and its efficiency was assessed through validation exercises.