How tech companies think about causality, and what it means for ML

One of several NeurIPS 2019 takeaways

AltDeep is a newsletter focused on mental models and microtrend-spotting in machine learning and AI.

NeurIPS 2019 showed that the ML community has accepted causal reasoning as a major challenge problem. The current deep learning state of the art struggles with causal reasoning, as it does with several other tasks. However, causal reasoning stands out among these tasks for its well-developed literature and its cadre of seasoned experts in academia and industry.

Judging from the conversations I am having, researchers and engineers are trying to hash out a list of action items for incorporating causal reasoning into machine learning.

Applied causal inference vs. machine learning 

One source of confusion is that we expect big tech companies to be at the cutting edge of machine learning, yet they are also at the cutting edge of applied causal inference. The applied causal inference people, however, are typically on experimentation teams, which are separate from the groups that build and train machine learning algorithms. Specifically, the applied causal inference use case in this setting is causal effect estimation.

Suppose Netflix shows you the trailer for The Witcher at the top of your Netflix dashboard, and then you watch the show two days later. Netflix wants to know the causal effect that serving you that trailer had on your viewing the show. They can run an A/B test, where you are randomized into a “trailer” or “no-trailer” group. The goal of randomization is to break the statistical influence of non-trailer-related factors that might lead you to watch or not watch. But randomization doesn’t prevent influence from events that occur post-randomization. For example, after watching the trailer, you might read a review of the show, see it trend on social media, or hear your coworker’s recommendation to view the show. Causal effect estimation refers to methods for statistically distilling the causal effect of the trailer from those other sources of statistical association.
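To make the A/B test idea concrete, here is a minimal sketch in Python using only the standard library. Everything in it is an assumption for illustration: the function names, the base viewing rate, the size of the trailer's lift, and the simulated users are all hypothetical, and nothing here reflects how Netflix actually runs experiments. It covers only the simplest case, a difference in means under randomization, and does not try to handle the post-randomization influences described above.

```python
import random
import statistics


def simulate_ab_test(n_users, base_rate, trailer_lift, seed=0):
    """Simulate a randomized trailer experiment (all numbers are hypothetical)."""
    rng = random.Random(seed)
    records = []
    for _ in range(n_users):
        saw_trailer = rng.random() < 0.5  # coin-flip assignment to a group
        p_watch = base_rate + (trailer_lift if saw_trailer else 0.0)
        watched = 1 if rng.random() < p_watch else 0
        records.append((saw_trailer, watched))
    return records


def difference_in_means(records):
    """Estimate the average effect of the trailer with a rough 95% interval."""
    treated = [w for t, w in records if t]
    control = [w for t, w in records if not t]
    effect = statistics.mean(treated) - statistics.mean(control)
    # Standard error of a difference between two independent sample means.
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return effect, (effect - 1.96 * se, effect + 1.96 * se)


records = simulate_ab_test(n_users=20_000, base_rate=0.10, trailer_lift=0.05)
effect, (low, high) = difference_in_means(records)
print(f"estimated effect: {effect:.3f} (95% CI {low:.3f} to {high:.3f})")
```

Because assignment is a coin flip, the two groups should look alike on everything except the trailer, so the gap in viewing rates can be read as an estimate of the trailer's average effect. Real experimentation teams layer far more on top of this.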

So causal effect estimation has a distinctly different flavor from what we typically think of when we say “machine learning.”

When you squint at applied causal inference, it starts to look like reinforcement learning.

That said, if you squint at applied causal effect estimation, it starts to look like reinforcement learning. Reinforcement learning is the class of ML methods much hyped in recent years for beating humans at board games and video games; see this Forbes explainer to learn more.

Here is my reasoning:

  1. One definition of AI is the automation of decision-making under uncertainty.

  2. The goal of an experiment is to make a decision about a hypothesis based on data. This hypothesis could be causal, e.g., trailers have a large causal impact on viewership.

  3. One can automate experiments to run in sequence, e.g., if the estimated causal effect is greater than a threshold, do this experiment next; otherwise, do that one.

  4. Suppose we wanted to use empirical experience to learn how to select the best experiment to run next. Reinforcement learning is precisely this task. In this setting, we would require the reinforcement learning algorithm to use a causal model.
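
Steps 3 and 4 can be sketched in a few lines of Python. This is a toy under loud assumptions: the experiment names, payoffs, and threshold are hypothetical, and the learner is an epsilon-greedy bandit, a much simpler stand-in for full reinforcement learning. In particular, it does not use a causal model, which is the part the argument above says a real system would need.

```python
import random


def next_experiment_by_rule(estimated_effect, threshold=0.02):
    """Step 3: a fixed, hand-written rule for picking the follow-up experiment."""
    return "experiment_a" if estimated_effect > threshold else "experiment_b"


def choose_experiments(true_payoffs, rounds=2000, epsilon=0.1, seed=0):
    """Step 4 (simplified): learn from experience which experiment pays off most."""
    rng = random.Random(seed)
    names = list(true_payoffs)
    counts = {name: 0 for name in names}
    values = {name: 0.0 for name in names}  # running average payoff per experiment
    for _ in range(rounds):
        if rng.random() < epsilon:
            name = rng.choice(names)  # explore: try a random experiment
        else:
            name = max(names, key=values.get)  # exploit: best estimate so far
        # Hypothetical noisy payoff from running the chosen experiment.
        reward = rng.gauss(true_payoffs[name], 0.5)
        counts[name] += 1
        values[name] += (reward - values[name]) / counts[name]
    return values, counts


# Hypothetical experiments and their (unknown to the learner) average payoffs.
payoffs = {"trailer_test": 0.2, "thumbnail_test": 1.0, "email_test": 0.5}
values, counts = choose_experiments(payoffs)
best = max(values, key=values.get)
print(best, counts)
```

The first function is the hard-coded threshold rule from step 3. The second replaces that rule with a policy learned from past payoffs, which is the move from automated experimentation toward reinforcement learning.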