Parallelformers is a toolkit for efficiently parallelizing large language models across multiple GPUs. It was created to address the challenge that deploying and using very large models normally requires extensive engineering and expensive hardware. Parallelformers applies model-parallelism techniques inspired by Megatron-LM to split a model across GPUs for efficient distributed processing and inference. Its key design principles are efficient model parallelism, scalability to support many model architectures, simplicity of use, and easy deployment of large models.
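To make the usage model concrete, here is a minimal sketch of how a Hugging Face model might be parallelized with the library's `parallelize` entry point. The model checkpoint, GPU count, and precision setting shown are illustrative assumptions, not a prescribed configuration.

```python
# Minimal sketch (illustrative): splitting a Hugging Face model across GPUs.
# The checkpoint name, num_gpus, and fp16 flag are assumptions for the example.
from transformers import AutoModelForCausalLM, AutoTokenizer
from parallelformers import parallelize

model_name = "EleutherAI/gpt-neo-2.7B"  # assumed example checkpoint
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Partition the model's layers across 2 GPUs; half precision lowers per-GPU memory.
parallelize(model, num_gpus=2, fp16=True)

# Inference proceeds through the usual generate() call; device placement is
# handled by the parallelized model.
inputs = tokenizer("Parallelformers makes large models easier to serve because",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

The appeal of this interface is that the parallelization step is a single call layered on top of the standard Hugging Face workflow, rather than a rewrite of the model or the serving code.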