This document proposes a new distributed learning method called p-FeSTA (Federated Split Task-Agnostic learning with permutating pure Vision Transformer) that addresses limitations of existing methods. p-FeSTA uses a pure Vision Transformer architecture with a patch permutation module to better exploit multi-task learning while reducing communication costs and strengthening privacy preservation relative to federated learning, split learning, and FeSTA. Evaluation on classification, severity prediction, and segmentation tasks across multiple datasets and clients shows that p-FeSTA achieves significantly lower communication overhead and better performance than existing methods. Ablation studies confirm that the permutation module is essential for privacy preservation.
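The core idea of the permutation module can be illustrated with a minimal sketch: before patch embeddings leave the client, their order is shuffled, which hides spatial layout from the server while leaving a permutation-invariant Transformer body (one without positional embeddings) unaffected. The function and variable names below are hypothetical illustrations, not the paper's actual implementation.

```python
import numpy as np

def permute_patches(patch_embeddings, seed=None):
    """Randomly permute the patch (token) dimension of ViT embeddings.

    patch_embeddings: array of shape (num_patches, embed_dim).
    Returns the shuffled embeddings and the permutation used, so the
    client can undo the shuffle locally after the server round-trip.
    (Hypothetical sketch; not the paper's code.)
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(patch_embeddings.shape[0])
    return patch_embeddings[perm], perm

# Toy example: 4 "patches" with 2-dim embeddings.
emb = np.arange(8, dtype=float).reshape(4, 2)
shuffled, perm = permute_patches(emb, seed=0)

# The multiset of patch vectors is unchanged; only their order differs.
# The client can invert the permutation to restore the original order.
restored = np.empty_like(shuffled)
restored[perm] = shuffled
assert np.allclose(restored, emb)
```

The design point this sketch captures is that self-attention without positional embeddings treats patches as a set, so shuffling the token order changes what an eavesdropper or the server can reconstruct spatially without changing the computation the shared body performs.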