OpenAI looks like another "hypademic" AI company
Or; my 7-day fast made me cranky
AltDeep is a newsletter focused on microtrend-spotting in data and decision science, machine learning, and AI. It is authored by Robert Osazuwa Ness, a PhD machine learning engineer at an AI startup and adjunct professor at Northeastern University.
I went on a 7-day fast this week. Perhaps the lack of essential vitamins and minerals explains why I’m feeling gloomy about ML. Or maybe it’s the dismal state of global politics or all the articles about dying rain-forests and coral reefs. So here’s me whining a bit:
I am irritated that VC money forces people like myself to work on efforts not anchored to any transaction in the marketplace, but to VC milestones for the next round of funding. It is an absurd distortion of reality.
I am irritated by how ML forums online are obsessed with nuances of neural net architectures that take hundreds of thousands of dollars to train, and how data science forums are just entry-level career tips and debates over whether R beats Python.
I am irritated by how every ML or DS-related product showing up on Product Hunt is either frivolous or a poorly executed analytics dashboard.
Ear to the ground
OpenAI looks like another hypademic AI company
Let’s coin a new term, the hypademic model, defined as follows:
Create a research lab. Generate hype by getting big academic personalities to sign on.
Have those personalities differentiate the company by articulating some AI philosophy, and arguing about it on Twitter.
Use hype and social media clout to bring in investors, who have nothing else to go on. Ideally these investors should themselves be celebrities and/or billionaires, thus generating more hype.
Use the cash to hire as many research engineers as you can, pay them a lot, and have them work on projects (not products).
Have at least one project beat some benchmark, generating more hype.
Cash out, usually through acquisition.
I use hype in a pejorative sense. I am not saying that the hypademic model doesn’t generate any machine learning research of substance; publishing good papers is a necessary but insufficient condition in this model. The downside of the hype is that it skews the research landscape, biasing it towards achievements that require compute that only investor money can finance. Hype steers talent into fundamentally non-productive endeavors. Hype encourages new learners to build skill-sets geared toward those methods; for example, think of the GitHub-project data scientist who can train a cutting-edge neural network architecture but can’t explain logistic regression. It raises the costs for any enterprise that wants to hire from this pool of talent to build products (as opposed to projects).
OpenAI looks exactly like this model
I find myself wondering whether OpenAI is just another example of this type. I wonder specifically whether the “Open” and the “non-profit” components of the company are just a form of marketing/hype.
Game show trivia questions that stump AI but that humans ace
I’m a fan of this branch of research because if deep learning is a kind of brute-forcing of machine learning, then there is much to be learned about ML from problems that cannot be brute-forced.
The researchers developed these questions using an interesting adversarial approach: a computer interface shows a human writer what the algorithm is “thinking” as the writer types a question, and the writer then edits the question to maximize the algorithm’s confusion while keeping it answerable by humans.
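The loop described above can be sketched in a few lines. This is a toy illustration only: the “QA model” here is a hypothetical stand-in (a keyword-overlap scorer over a tiny answer bank), not the actual system or interface from the paper.

```python
import re

# Hypothetical stand-in for a QA model's knowledge: answers mapped to
# trigger keywords. The real system in the paper uses a trained model.
KNOWLEDGE = {
    "Ebola virus": {"african", "virus", "mortality", "hemorrhagic"},
    "Turing test": {"imitation", "game", "judge", "computer", "british"},
    "Weak interaction": {"electromagnetic", "force", "beta", "decay"},
}

def model_guess(question: str):
    """Return the model's guesses ranked by keyword overlap with the question."""
    words = set(re.sub(r"[^\w\s]", "", question.lower()).split())
    scores = {ans: len(words & kw) for ans, kw in KNOWLEDGE.items()}
    return sorted(scores.items(), key=lambda kv: -kv[1])

# The interface shows the writer what the model is "thinking"...
q1 = "Name this African virus with incredibly high mortality rates."
print(model_guess(q1))  # top guess: Ebola virus, via surface keywords

# ...so the writer rephrases to drop the trigger keywords while keeping
# the question easy for a knowledgeable human, then re-checks the model.
q2 = "Name this filovirus, the subject of the book The Hot Zone."
print(model_guess(q2))  # keyword overlap collapses; the model is lost
```

The point of the exercise is that the edited question is adversarial for the machine but not for people: a human quiz player still answers "Ebola virus" easily, because humans don't rely on surface keyword matching.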
Here is a selection:
Name this African virus with incredibly high mortality rates.
ANSWER: Ebola virus
Name this mountain range of South America that played a role in the independence of Chile.
ANSWER: Andes
Reverse transcriptase inhibitors and antiretrovirals are commonly used to treat what sexually transmitted disease?
ANSWER: HIV/AIDS
The electromagnetic force was unified with what fundamental force that causes beta decay?
ANSWER: Weak interaction
Name this thought experiment derived from "the imitation game" that asks a judge to determine whether a conversational partner is human or computer named for a British computer scientist and that, for ten points, is said to determine when a computer is intelligent.
ANSWER: Turing test
Wallace, Eric, et al. "Trick Me If You Can: Human-in-the-loop Generation of Adversarial Question Answering Examples." Transactions of ACL 10 (2019).
AI Long Tail
Check out this landing page for a collaborative tool for labeling image data.
Information products, such as online courses, are a significant and dynamically evolving way for content creators to earn revenue online. Recently, Udacity published a paper on auto-generating course videos from recorded audio narration. How might this impact info-product creation?