This document presents methods for collecting and generating training data for a free-input classification-based dialogue system aimed at cultural competency training, particularly for US military officers interacting with their Chinese counterparts. It discusses various data crowdsourcing prompt types, a binary category system for accurate labeling, and a proposed data generation algorithm. The goal is to replace traditional multiple-choice dialogue systems with more realistic free-response interactions to increase training effectiveness in cross-cultural communication.