R is a popular statistical programming language used for data analysis and machine learning. It has over 3 million users and is taught widely in universities. While powerful, R has some scaling limitations for big data. Several Apache Spark integrations with R like SparkR and sparklyr enable distributed, parallel processing of large datasets using R on Spark clusters. Other options for scaling R include H2O for in-memory analytics, Microsoft ML Server for on-premises scaling, and ScaleR for portable parallel processing across platforms. These solutions allow R programs and models to be trained on large datasets and deployed for operational use on big data in various cloud and on-premises environments.