Apache Spark is a general-purpose cluster computing framework that provides APIs in Java, Scala, and Python. It supports distributed computing through Resilient Distributed Datasets (RDDs) and, by caching working data in memory, can run many large-scale workloads, iterative algorithms in particular, significantly faster than Hadoop MapReduce. Spark also bundles higher-level libraries for SQL queries (Spark SQL), stream processing (Spark Streaming), machine learning (MLlib), and graph processing (GraphX). RDDs are immutable, partitioned collections of elements that can be operated on in parallel. They are fault tolerant by design: each RDD records the lineage of transformations that produced it, so a lost partition can be recomputed from its inputs rather than restored from replicas.
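
A minimal Scala sketch of the RDD model, assuming Spark runs locally; the application name, partition count, and data are illustrative choices, not prescribed by the text. Transformations such as map and filter return new immutable RDDs and are evaluated lazily; an action such as reduce triggers parallel execution across partitions.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddSketch {
  def main(args: Array[String]): Unit = {
    // local[*] runs Spark on all local cores; a cluster deployment
    // would pass a real master URL instead.
    val conf = new SparkConf().setAppName("rdd-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // parallelize splits a local collection into a partitioned RDD
    // (4 partitions here) whose elements can be processed in parallel.
    val numbers = sc.parallelize(1 to 100, numSlices = 4)

    // Transformations return new immutable RDDs; nothing runs yet.
    val squares = numbers.map(n => n.toLong * n)
    val evens   = squares.filter(_ % 2 == 0)

    // An action forces execution; reduce combines results per partition,
    // then merges them on the driver.
    val total = evens.reduce(_ + _)
    println(s"Sum of even squares: $total")

    sc.stop()
  }
}
```

Because squares and evens are new RDDs rather than mutations of numbers, Spark can rebuild either one from its recorded lineage if an executor is lost mid-job.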