Background Pain is a feature of approximately 70% of all Emergency Department (ED) presentations. It has been demonstrated that mandating the recording of a patient’s self-reported pain can improve service delivery for ED patients. However, for a substantial group of patients (approximately 21% of ED visits in our 12-month sample), the pain score is inconsistent with the Australian Triage Scale (ATS) score assigned by the nurse: the patient reports high levels of pain but is assigned a lower-urgency triage category. It has been unclear until now whether this “inconsistent” group of patients has been receiving optimal care. Methods To better understand the characteristics of this inconsistent group, we performed topic modeling of the clinical notes collected during ED triage assessments. We divided the notes into two subgroups, according to whether or not the patient’s self-reported level of pain was consistent with the triage urgency recorded in the ATS score. We performed topic modeling of these two subgroups separately, using the implementation of Latent Dirichlet Allocation (LDA) in the Mallet toolkit. We experimented with several representations of the notes, including unigrams (tokens), bigrams, and the medical concepts contained in each note, as determined with the MetaMap medical concept recognition tool. An ED nurse reviewed the topics generated in each case and assigned a descriptive label to each. Results When considering the token-based representation of the notes, the labels in the consistent group are related to road trauma, cardiac pain, change of consciousness, ongoing chest pain, limb injury, renal illness and pain due to illness. In the inconsistent group, we find topics related to ongoing conditions (including postoperative complications or worsening abdominal pain), urinary and respiratory problems, infections and injury-related complications.
When considering the concept-based representation of the notes, the labels in the consistent set denote gastrointestinal diseases, neurological illness, dizziness, chest pain, testicular pain, shortness of breath and trauma. The labels in the inconsistent set denote different issues caused by trauma and distress due to pain, infection and urinary conditions. These include injuries to several body parts, such as the limbs and back. This latter topic, which covers body parts, appears to have been enabled by the abstraction of individual terms into concepts. Conclusions Topic modeling of Emergency Department data shows substantial promise for helping to characterise particular subpopulations of interest, and incorporating pre-processing of clinical notes to capture variation in clinical terminology appears to have value. While this initial work has focused on pain-related chief complaints, we have also recently begun to explore temporal characteristics of the data through analysis of how derived topics change ove