Identifying causal effects is an integral part of scientific inquiry, spanning questions as varied as understanding behavior in online systems, the effects of social policies, and risk factors for disease. In the absence of a randomized experiment, however, traditional methods such as matching or instrumental variables fail to provide robust estimates because they depend on strong assumptions that are never tested. My research shows that many of these assumptions are, in fact, testable. This leads to a data mining framework for causal inference from observational data: instead of relying on untestable assumptions, we develop tests for valid, experiment-like data (a "natural" experiment) and estimate causal effects only from the subsets of data that pass those tests. I present two such methods. The first uses auxiliary data from large-scale systems to automate the search for natural experiments. Applying it to estimate the additional activity caused by Amazon's recommendation system, I find over 20,000 natural experiments, an order of magnitude more than in past work. These experiments indicate that less than half of the click-throughs typically attributed to the recommendation system are causal; the rest would have happened anyway. The second is a general Bayesian test for validating natural experiments in any dataset. Applying it, I find that a majority of the natural experiments used in recent studies from a premier economics journal are likely invalid. More generally, the proposed framework offers a viable way of doing causal inference on large-scale datasets with minimal assumptions.
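
To make the "test, then estimate" idea concrete, the following is a minimal Python sketch of the framework's general shape, not the actual methods from the two papers: each candidate natural experiment is screened with a validity test (here, a placeholder independence check against a hypothetical confound proxy), and effects are estimated only on the candidates that pass. All names (`candidates`, `treatment`, `outcome`, `confound_proxy`) and the specific test are illustrative assumptions.

```python
# Illustrative sketch (not the author's implementation): a generic
# "test, then estimate" loop. Each candidate natural experiment
# supplies a treatment assignment, an outcome, and a proxy variable
# that should be independent of treatment if the experiment is valid.
import numpy as np
from scipy import stats

def is_valid_natural_experiment(treatment, confound_proxy, alpha=0.05):
    """Crude validity check: keep a candidate only if we fail to reject
    independence between treatment variation and a confound proxy
    (a stand-in for the framework's validity tests)."""
    _, p_value = stats.pearsonr(treatment, confound_proxy)
    return p_value > alpha

def estimate_effect(treatment, outcome):
    """Simple difference-in-means estimate on a passing subset."""
    return outcome[treatment == 1].mean() - outcome[treatment == 0].mean()

def effects_from_natural_experiments(candidates):
    """Estimate causal effects only from candidates that pass the test."""
    return np.array([
        estimate_effect(c["treatment"], c["outcome"])
        for c in candidates
        if is_valid_natural_experiment(c["treatment"], c["confound_proxy"])
    ])

# Toy usage with synthetic data.
rng = np.random.default_rng(0)
candidates = []
for _ in range(100):
    treatment = rng.integers(0, 2, size=200)
    confound_proxy = rng.normal(size=200)            # independent of treatment
    outcome = 0.5 * treatment + rng.normal(size=200) # true effect = 0.5
    candidates.append({"treatment": treatment,
                       "confound_proxy": confound_proxy,
                       "outcome": outcome})

effects = effects_from_natural_experiments(candidates)
print(f"{len(effects)} candidates passed; mean estimated effect = {effects.mean():.2f}")
```

In the actual methods, the validity check would be replaced by the auxiliary-data search or the Bayesian test described above; the sketch only illustrates the shared structure of filtering for natural experiments before estimation.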