Summary
Data coding, analysis, archiving, and sharing for open collaboration
Richard Aslin
University of Rochester
1. What is your hypothesis?
• 9/11 occurred because the intelligence community suffered from a “failure of imagination”
  – Bottom-up data mining (“connecting the dots”)
  – Top-down predictions (“what are vulnerabilities??”)
• Clearly, you need both
• Must apply approaches iteratively and repeatedly
2. Observations are DVs
• Are the patterns you “see” the ones that are “relevant” or causal?
• Problem of data sparsity and false correlations
• Hypothesis testing requires an experiment (manipulating an IV)
• Tension between “ecology” and “control of variables” (sociology of preferred methods)
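
The sparsity problem can be illustrated with a small simulation. This is a minimal sketch, not part of the original slides: the function names, sample sizes, and seed are all assumptions chosen for illustration. It correlates one random DV with many unrelated random variables and reports the largest correlation found by chance alone.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def max_spurious_r(n_observations, n_variables, seed=0):
    """Largest |r| between a random DV and unrelated random variables.

    Every variable is independent noise, so any correlation found
    here is false by construction.
    """
    rng = random.Random(seed)
    dv = [rng.gauss(0, 1) for _ in range(n_observations)]
    best = 0.0
    for _ in range(n_variables):
        candidate = [rng.gauss(0, 1) for _ in range(n_observations)]
        best = max(best, abs(pearson_r(dv, candidate)))
    return best


if __name__ == "__main__":
    # Hypothetical sizes: 200 candidate variables, sparse vs. dense sampling.
    sparse = max_spurious_r(n_observations=10, n_variables=200)
    dense = max_spurious_r(n_observations=1000, n_variables=200)
    print(f"max |r| with 10 observations:   {sparse:.2f}")
    print(f"max |r| with 1000 observations: {dense:.2f}")
```

With few observations and many candidate variables, the strongest chance correlation tends to be large. With many observations it shrinks toward zero. Neither case involves a manipulated IV, so neither can establish cause.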
3. How to expand the hypothesis space?
• If large/standard datasets, then evaluation becomes stagnant (only evaluated with that dataset)
• If evaluation only uses standard (statistical) tools, same problem of stagnation
• Is clever visualization the key to hypothesis formation, even if “simple” variables?
  – TED talk by Deb Roy from MIT
4. When do you give up?
• Reliance on visual pattern recognition by human coder may not reveal relevant (informative) features (sound spectrogram cannot be “read”)
• Failure at macro level prompts search for info at micro level (fMRI univariate vs. multivariate analysis): need to “drill down”
• Failure at micro level may indicate indeterminacy of causal hierarchy (Fodor)
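
The univariate vs. multivariate contrast can be sketched with simulated data. This toy example is an assumption for illustration, not an analysis from the talk: the voxel count, effect size, and nearest-centroid classifier are all hypothetical choices. Half the voxels respond more to condition A and half respond more to condition B, so the region average (macro level) shows no difference while the voxel pattern (micro level) separates the conditions.

```python
import random

N_VOXELS = 20   # hypothetical region of interest
EFFECT = 0.5    # hypothetical signal size per voxel (noise SD is 1)


def simulate_trial(condition, rng):
    """One trial: half the voxels go up, half go down, flipped by condition."""
    sign = 1.0 if condition == "A" else -1.0
    half = N_VOXELS // 2
    signal = [sign * EFFECT] * half + [-sign * EFFECT] * half
    return [s + rng.gauss(0, 1) for s in signal]


def simulate(n_trials, rng):
    """Return {condition: list of trials} with n_trials per condition."""
    return {c: [simulate_trial(c, rng) for _ in range(n_trials)] for c in "AB"}


def roi_mean_difference(data):
    """Macro level: difference in region-average activation, A minus B."""
    def grand_mean(trials):
        return sum(sum(t) / len(t) for t in trials) / len(trials)
    return grand_mean(data["A"]) - grand_mean(data["B"])


def centroid(trials):
    """Mean activation pattern across trials, one value per voxel."""
    return [sum(col) / len(trials) for col in zip(*trials)]


def pattern_accuracy(train, test):
    """Micro level: nearest-centroid classification of held-out trials."""
    centroids = {c: centroid(trials) for c, trials in train.items()}

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    correct = total = 0
    for label, trials in test.items():
        for trial in trials:
            guess = min(centroids, key=lambda c: sq_dist(trial, centroids[c]))
            correct += guess == label
            total += 1
    return correct / total


def run(seed=0):
    """Return (region-average difference, pattern classification accuracy)."""
    rng = random.Random(seed)
    train, test = simulate(40, rng), simulate(40, rng)
    return roi_mean_difference(test), pattern_accuracy(train, test)


if __name__ == "__main__":
    diff, acc = run()
    print(f"region-average difference (A - B): {diff:.3f}")
    print(f"pattern classification accuracy:   {acc:.2f}")
```

In this setup the opposing voxel responses cancel in the region average, so the univariate measure stays near zero, while the classifier trained on separate trials labels held-out trials well above the 50% chance level.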
5. Rules of sharing
• When does “your” data become accessible to:
  – Your collaborators
  – Friends who ask
  – Strangers
  – Anyone
• Who gets credit?
• How should junior researchers “share”? Especially with senior labs that have $$$.