Inferring causality usually requires observing a treatment group and a control group. In other words, we need to observe two populations under a distribution shift and make sure that the shifts occur only as a result of some interventions.
What if the shifts occur naturally between different observational datasets? Rephrasing natural experiments in the language of SCMs, we show that, for acyclic causal models, only three observational datasets with invariant causal mechanisms are enough to:
Uniquely infer the causal structure (first figure: causal discovery with the invariance principle in Acemoglu et al., 2001).
Identify counterfactuals (second figure: causal inference with the invariance principle in Card and Krueger, 1994).
We formulate causal discovery as a simplified version of the well-known ICA problem. We show that given a structural causal model with arbitrary bijective mechanisms and Gaussian noise, three environments with invariant causal mechanisms suffice to identify graphs of arbitrary size.
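To make the setting concrete, here is a minimal sketch of three environments that share the same causal mechanisms and differ only in their Gaussian noise scales. The three-node chain, the mechanisms `f2` and `f3`, and the noise scales are hypothetical choices made for illustration; they are not taken from the paper, and the sketch only generates data rather than running the identification procedure.

```python
import numpy as np

# Hypothetical invariant mechanisms for a chain X1 -> X2 -> X3.
# Each one is invertible in its noise argument once the parent is fixed.
def f2(x1, n2):
    return np.tanh(x1) + n2

def f3(x2, n3):
    return 0.5 * x2 + np.sin(x2) + n3

def sample_environment(n_samples, noise_scales, rng):
    """Sample one environment: same mechanisms, environment-specific noise scales."""
    s1, s2, s3 = noise_scales
    n1 = rng.normal(0.0, s1, size=n_samples)
    n2 = rng.normal(0.0, s2, size=n_samples)
    n3 = rng.normal(0.0, s3, size=n_samples)
    x1 = n1
    x2 = f2(x1, n2)
    x3 = f3(x2, n3)
    return np.column_stack([x1, x2, x3])

def residual_stds(x):
    """Invert the (known) mechanisms to recover the noise and report its scale."""
    r1 = x[:, 0]
    r2 = x[:, 1] - f2(x[:, 0], 0.0)
    r3 = x[:, 2] - f3(x[:, 1], 0.0)
    return np.array([r1.std(), r2.std(), r3.std()])

rng = np.random.default_rng(0)
# Assumed noise scales: only these change across the three environments.
env_scales = [(1.0, 1.0, 1.0), (2.0, 0.5, 1.0), (0.5, 1.5, 2.0)]
environments = [sample_environment(20_000, s, rng) for s in env_scales]

for scales, data in zip(env_scales, environments):
    print(scales, residual_stds(data).round(2))
```

Because the mechanisms are shared, inverting them recovers noise whose scale matches each environment's setting. In this toy example the distribution shift lives entirely in the noise.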
We show theoretically and empirically that sparse autoencoders face a three-way tradeoff: they can be accurate, sparse, or monosemantic, but improving all three at once is generally impossible. This implies that polysemantic features are often not an accident, but an efficient response to concepts that frequently co-occur in the data.
This sheds light on phenomena such as feature hedging and splitting in mechanistic interpretability, which are predictable consequences of the data-generating process.
causally: a Python library for sampling data from structural causal models under realistic assumptions.
I implemented several causal discovery algorithms for dodiscover, a PyWhy project.
TLDR: I developed a branch of causal discovery that connects causal graph inference with score-matching estimation techniques. Papers:
TLDR: We stress-test traditional and state-of-the-art causal discovery methods under challenging conditions, such as non-i.i.d. data, latent confounders, faithfulness violations, and violations of parametric assumptions, among others.
TLDR: We use identifiability theory to show the limitations of supervised learning-based causal discovery.