This document discusses the history and evolution of supercomputer architectures from the 1960s to the present. Early supercomputers relied on compact designs and local parallelism. Starting in the 1990s, massively parallel systems with thousands of processors became common. Modern supercomputers can use over 100,000 processors connected by fast interconnects, and may use GPUs, computer clusters, or distributed computing networks to achieve petaflop-scale performance. The document also discusses vector processing, an important technique that many historical supercomputers used to improve performance.