Extract camera motion clips from YouTube videos and use them to train a neural network to generate a Multiplane Image (MPI) scene representation from narrow-baseline stereo image pairs. The inferred MPI representation can then be used to synthesize novel views of the scene, including ones that extrapolate significantly beyond the input baseline. (Video stills in this and other figures are used under Creative-Commons license from YouTube user SonaVisual.)