This document describes an experimental protocol for collecting and analyzing social interactions during collaborative problem solving. Three collaborative games were designed and played remotely by pairs of participants via video conferencing. The interactions were recorded and then annotated by three raters using two scales: the Social Performance Rating Scale and a scale assessing social and cognitive skills in collaborative problem solving. Inter-rater reliability was moderate to excellent for most items on the scales. These results provide initial evidence that the scales can reliably measure social skills. The collected dataset will be used to further study social interactions and to inform the design of virtual agents for social skills training.