Spark uses Resilient Distributed Datasets (RDDs) as its fundamental data structure. RDDs are immutable, lazily evaluated collections of data that can be operated on in parallel. Because transformations are lazy, Spark records them as a lineage graph and only computes results when an action is called, which lets it combine and pipeline the work for better performance. RDDs are also resilient (a lost partition can be recomputed from its lineage) and can be cached in memory to avoid recomputation. Spark programs run by submitting an application to a cluster manager such as YARN, Mesos, or Spark's standalone manager, which allocates executors on worker nodes; the Spark driver then schedules tasks on those executors, where the deferred transformations are finally executed.
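
As a minimal sketch of how these pieces fit together, the Scala snippet below builds an RDD, chains two transformations (which only record lineage), marks the result for caching, and then triggers computation with actions. The `local[*]` master and the `RddLazinessDemo` name are illustrative assumptions; a real job would be packaged and submitted to a cluster manager with `spark-submit`.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RddLazinessDemo {
  def main(args: Array[String]): Unit = {
    // local[*] is for illustration only; a deployed job would target a
    // cluster manager (YARN, Mesos, or Spark standalone) via spark-submit.
    val conf = new SparkConf().setAppName("RddLazinessDemo").setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Create an RDD from an in-memory collection, split into partitions.
    val numbers = sc.parallelize(1 to 1000000)

    // Transformations are lazy: these lines only record the lineage
    // (parallelize -> map -> filter); nothing is computed yet.
    val squares = numbers.map(n => n.toLong * n)
    val evenSquares = squares.filter(_ % 2 == 0)

    // Mark the RDD for in-memory caching so repeated actions can reuse it.
    evenSquares.cache()

    // Actions trigger a job: the driver turns the lineage into stages and
    // tasks that run in parallel on the executors.
    val count = evenSquares.count()       // first action: computes and caches
    val total = evenSquares.reduce(_ + _) // second action: served from the cache

    println(s"count = $count, sum = $total")
    sc.stop()
  }
}
```

Note that `cache()` itself does nothing immediately; the data is materialized only when the first action (`count` here) runs, and subsequent actions such as `reduce` then read the cached partitions instead of recomputing the lineage.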