Hey VCs: Not all problems require deep learning
Andreessen Horowitz's point misses alternative approaches to AI
Two fundamental trends have shaped software: cloud services and software as a service (SaaS). SaaS is the primary mechanism for delivering software solutions to clients. Cloud services enable software companies to scale by removing the friction of adding human and computational resources to match demand.
The assumption was that artificial intelligence (AI) would follow a similar pattern. It turns out this has not been the case.
Computing costs are high. The cost of talent is inflated by competition from large tech companies that pay absurdly high salaries, and evidence shows much of that talent is underutilized (e.g., working on projects that do not connect to the main product).
Andreessen Horowitz recently published a brilliant essay outlining some of the challenges of scaling AI startups.
Its points, however, apply only to deep learning methods. The following bullet points are direct quotes from the article.
Training a single AI model can cost hundreds of thousands of dollars (or more) in compute resources.
Model inference (the process of generating predictions in production) is also more computationally complex than operating traditional software.
We’ve had AI companies tell us that cloud operations can be more complex and costly than traditional approaches, particularly because there aren’t good tools to scale AI models globally.
In many problem domains, exponentially more processing and data are needed to get incrementally more accuracy. This means – as we’ve noted before – that model complexity is growing at an incredible rate, and it’s unlikely processors will be able to keep up. Moore’s Law is not enough.
All of these points are true, but only for complex models that have an extremely large number of parameters to train and that cannot be separated into modules and later recomposed, i.e., deep learning models.
They also make the following point.
AI applications are more likely than traditional software to operate on rich media like images, audio, or video. These types of data consume higher than usual storage resources, are expensive to process, and often suffer from region of interest issues – an application may need to process a large file to find a small, relevant snippet.
This is true, but only because deep learning handles that type of data better than previous software approaches. AI, however, is fundamentally about automating decision-making given noisy data. Let’s grant that you need a complex model to process rich media. You can’t tell me that all the commercial value that can be realized with automated decision-making involves rich media.
My point is simple. Not all problems require complex deep learning models.
In a later post, I will reflect on a class of sophisticated ML modeling approaches that can be a source of valuable IP for a company but do not fall prey to the issues described in the Andreessen Horowitz article. I will also discuss paths to an AI startup that sidestep these issues.
If you are not signed up, sign up. Or forward this to colleagues who might be interested.