1. SAS processes a DATA step in two phases: compilation and execution. During compilation, SAS checks the syntax and creates the program data vector (PDV) and the descriptor portion of the output dataset. During execution, the statements run in order once per iteration, and by default one observation is written to the output dataset at the end of each iteration.
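2. The PDV is created during compilation and used throughout execution. It holds the current value of every variable in the step, including the automatic variables _N_ and _ERROR_ (which are not written to the output dataset), and it is where SAS builds each observation before writing it out.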
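3. Understanding how the PDV is populated and reset during execution is important for effective DATA step programming and variable manipulation. At the start of each iteration, variables created by INPUT or assignment statements are reset to missing, whereas variables read with SET, MERGE, or UPDATE, or named in a RETAIN statement, keep their values.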
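
One way to watch the PDV change is to write it to the log with PUT _ALL_ at different points in the step. The following is a minimal sketch, not a definitive example; the dataset name work.pdv_demo and the variables x, y, and total are hypothetical and chosen only for illustration.

```sas
/* Sketch: inspect the PDV before and after INPUT on each iteration.   */
/* work.pdv_demo, x, y, and total are hypothetical names.              */
data work.pdv_demo;
    /* Top of the iteration: x, y, and total have been reset to missing */
    put 'Before INPUT: ' _all_;

    input x y;            /* reads one data line into x and y in the PDV */
    total = x + y;        /* the assignment fills total in the PDV       */

    /* The PDV now holds the observation that is about to be written */
    put 'After INPUT:  ' _all_;

    /* Implicit OUTPUT at the end of the step writes x, y, and total; */
    /* the automatic variables _N_ and _ERROR_ are not written.       */
    datalines;
1 2
3 4
;
run;
```

In the log, each "Before INPUT" line should show x, y, and total as missing (.), while each "After INPUT" line shows the values just read and computed. A third "Before INPUT" line appears with _N_=3, because the step begins another iteration and stops only when INPUT finds no more data lines.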