This document discusses the automated extraction of deuterated drug structures from patents into PubChem and the challenges it presents. Around 80,000 deuterated compounds in PubChem originate predominantly from patent images, extracted through the efforts of SCRIPDB, IBM, SureChEMBL, and Thomson Pharma. Text-based extraction of deuterium works reasonably well, but image-only extraction often fails. As a result, most of the 25,000 deuterated derivatives of around 500 drugs in PubChem are "virtual" compounds that do not actually exist. Users therefore face a dilemma: these structures may carry intellectual property significance, yet they lack any linked bioactivity data.
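
As a rough illustration of how deuterated derivatives might be recognised and grouped under a parent drug, the sketch below works on SMILES strings, where deuterium is conventionally written as the isotope-labelled atom `[2H]`. It is not the method used by PubChem or by any of the patent sources named above. The function names `count_deuterium` and `parent_smiles` are hypothetical, and the approach is plain string matching rather than real structure parsing, so the "parent" it returns still carries explicit `[H]` atoms and would need canonicalising in a cheminformatics toolkit before comparison.

```python
import re

# Matches the start of a deuterium atom in SMILES, e.g. "[2H]" or "[2H+]".
# The negative lookahead skips two-letter elements such as "[2Hg]".
# Assumption: input uses the isotope form "[2H]", not the nonstandard "[D]".
DEUTERIUM = re.compile(r"\[2H(?![a-z])")


def count_deuterium(smiles: str) -> int:
    """Return the number of explicit deuterium atoms in a SMILES string."""
    return len(DEUTERIUM.findall(smiles))


def parent_smiles(smiles: str) -> str:
    """Replace each deuterium with ordinary hydrogen (string-level only)."""
    return DEUTERIUM.sub("[H", smiles)


if __name__ == "__main__":
    # Illustrative inputs: chloroform-d, methanol-d4, and plain ethanol.
    for smi in ["[2H]C(Cl)(Cl)Cl", "[2H]OC([2H])([2H])[2H]", "CCO"]:
        print(smi, count_deuterium(smi), parent_smiles(smi))
```

Grouping records by the canonical form of `parent_smiles` would be one way to count how many deuterated derivatives hang off each drug. It cannot tell whether a derivative was ever synthesised.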