Chain-of-thought prompting decomposes complex reasoning tasks into intermediate natural-language steps, which the model is prompted to produce before stating its final answer. It has been shown to improve arithmetic word-problem solving: few-shot exemplars demonstrate the steps and equations used to reach the answer, and the model imitates that structure on new problems. An ablation study found that including the intermediate reasoning steps outperformed showing only the equation or only the final computed answer. While promising for improving reasoning ability, chain-of-thought prompting may not truly elicit human-like reasoning, and it can be costly to apply, both in the annotation effort needed to write step-by-step exemplars and in the large model sizes required for the benefits to appear.
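To make the technique concrete, the sketch below contrasts a standard few-shot prompt with a chain-of-thought prompt for an arithmetic word problem. This is a minimal illustration, not the original paper's exact setup: the `query_model` stub is a hypothetical stand-in for any text-completion API, and the exemplar wording is illustrative.

```python
# Minimal sketch of chain-of-thought (CoT) prompting for arithmetic
# word problems. `query_model` is a hypothetical placeholder for a
# real large-language-model completion API.

def query_model(prompt: str) -> str:
    """Hypothetical LLM call; wire this to an actual model endpoint."""
    raise NotImplementedError

# Standard few-shot exemplar: the worked example shows only the
# final answer, with no intermediate reasoning.
STANDARD_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls "
    "each. How many tennis balls does he have now?\n"
    "A: The answer is 11.\n"
)

# Chain-of-thought exemplar: the same problem, but the answer spells
# out the intermediate steps and equations before the final result.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls "
    "each. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is "
    "2 * 3 = 6 balls. 5 + 6 = 11. The answer is 11.\n"
)

def build_prompt(exemplar: str, question: str) -> str:
    # Few-shot format: worked exemplar(s) followed by the new question.
    return f"{exemplar}\nQ: {question}\nA:"

if __name__ == "__main__":
    question = ("The cafeteria had 23 apples. It used 20 for lunch "
                "and bought 6 more. How many apples does it have?")
    # The CoT prompt leads the model to emit its own reasoning steps
    # before the answer; the standard prompt elicits only an answer.
    print(build_prompt(COT_EXEMPLAR, question))
```

The only difference between the two conditions is the exemplar text; the ablation described above amounts to swapping `COT_EXEMPLAR` for variants that keep only the equation (`2 * 3 = 6, 5 + 6 = 11`) or only the final answer, and comparing accuracy.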