Thoughts after one month at Microsoft Research

Plus a paradox and an internship opportunity.

I’m looking for an intern to work on causal machine learning with my team and me at MSR!

If you’re interested, please apply and then reach out to me personally. If you know anyone who would be, please share this post.


I’m coming up on a month at MSR. My research agenda is starting to coalesce and I’m excited to get started.

One thing I can talk about publicly is the mind melds I’ve had with Adith Swaminathan on algorithmic counterfactual reasoning and agent modeling. I’ve been a fan of this guy’s work since his dissertation, and he was a major reason I joined.

In flavors of agent modeling ranging from game theory to reinforcement learning, we tend to think of an agent as having a fixed utility or reward function. Most practical algorithms either assume this function or try to learn it. Adith has convinced me that those algorithms affect the agent in ways that change its reward function, which is a big problem. You can view the smartphone addiction of teenagers or the erosion of our political culture as consequences of this big algorithmic fail.

I put together some initial ideas in a public Google Colab tutorial on Newcomb’s Paradox.

Wikipedia has a good summary of the paradox. It was constructed as a toy problem to contrast two criteria for optimal decision-making. But I see it as a realistic example of a modern predicament: we have to make decisions based not just on data but on how algorithms think we are going to make decisions based on data. It requires an agent to reason about how other agents will reason about it. That sort of recursive theory of mind reminds me of Vizzini’s battle of wits scene in The Princess Bride.
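
To make the contrast between the two criteria concrete, here is a minimal Python sketch. It is not taken from the notebook. The payoffs ($1,000 in the transparent box, $1,000,000 in the opaque one) follow the usual statement of the paradox, and the function names and the `accuracy` and `prob_full` parameters are hypothetical choices for illustration. The first function treats your choice as evidence about what the predictor foresaw; the second treats the box contents as already fixed when you choose.

```python
# Illustrative sketch only: payoffs and parameter names are assumptions.
SMALL = 1_000      # transparent box, always contains this amount
BIG = 1_000_000    # opaque box, filled only if one-boxing was predicted


def evidential_values(accuracy):
    """Expected payoffs when the choice is evidence about the prediction.

    `accuracy` is the assumed probability that the predictor guessed
    the agent's actual choice correctly.
    """
    return {
        "one_box": accuracy * BIG,
        "two_box": (1 - accuracy) * BIG + SMALL,
    }


def causal_values(prob_full):
    """Expected payoffs when the box contents are fixed before the choice.

    `prob_full` is the agent's assumed belief that the opaque box is
    already full; the choice cannot change it.
    """
    return {
        "one_box": prob_full * BIG,
        "two_box": prob_full * BIG + SMALL,
    }


def best(values):
    """Return the action with the highest expected payoff."""
    return max(values, key=values.get)


if __name__ == "__main__":
    print("evidential:", best(evidential_values(accuracy=0.9)))
    print("causal:", best(causal_values(prob_full=0.5)))
```

Under these assumptions the evidential calculation favors taking one box whenever the predictor is even slightly better than a coin flip, while the causal calculation favors taking both boxes no matter what you believe about the contents. That disagreement is the paradox.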

The notebook hints at some ideas for how one would model this. Perhaps in the future, Cortana could tell you which goblet to drink from?