The document discusses a neural network system that achieves near-human performance on the SQuAD question answering dataset. However, the document argues this result requires context: SQuAD is a span-selection task in which the answer is guaranteed to appear in the given paragraph, and many questions can be answered from superficial lexical cues without complex reasoning. Moreover, although the system scores 84% F1 against a reported human baseline of 91%, that baseline likely understates true human ability, since the human annotators were incentivized to answer quickly. The document concludes that such neural networks perform clever pattern matching rather than genuine reasoning, and can be easily fooled by examples that fall outside their training distribution.
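To make the 84% vs. 91% comparison concrete, the sketch below shows a simplified version of the token-overlap F1 used in SQuAD evaluation: precision and recall are computed over shared tokens between the predicted and gold answer spans, then combined harmonically. (The official script additionally normalizes text by stripping punctuation and articles and takes the maximum over multiple gold answers; those details are omitted here.)

```python
from collections import Counter

def squad_f1(prediction: str, gold: str) -> float:
    """Simplified token-overlap F1 between a predicted and a gold answer span."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both strings.
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

# A partially matching span earns partial credit rather than zero:
print(squad_f1("the Eiffel Tower", "Eiffel Tower"))  # ~0.8
```

This partial-credit scoring is one reason F1 numbers on SQuAD can look high even when a system's selected span is only approximately right.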