AI and America’s Racial "One Country, Two Systems" Policy
Plus, recent research on transformer networks and generative ML.
Happy Thursday. This week’s edition comes to you during a time of public outcry and protests in the U.S. precipitated by the recent killings of George Floyd, Breonna Taylor, Ahmaud Arbery, and Sean Reed. If you are a researcher and were working towards the NeurIPS 2020 deadline this week while peaceful protests, police clashes, and/or rioting were happening down the block, you might have felt like that dog-in-a-burning-house meme. You weren’t alone.
America’s One Country, Two Systems policy
With Hong Kong protests back in the news, and as an avid China watcher, I’m reminded of the phrase “One Country, Two Systems”, the term the CCP uses to normalize the apparent contradiction of Hong Kong and Macau having drastically different governance relative to Mainland China, particularly when it comes to rule-of-law and civil liberties.
Recent events reveal the U.S. has its own race-based “One Country, Two Systems” policy. The videos shine a light on the contradiction inherent in such a policy. They also bring to the surface the pent-up frustration at being gaslit about the existence of that contradiction despite a decade of viral video evidence and pesky things like American history.
Tech and ML look inward
For the tech, data science, and machine learning communities, there was a wake-up call in Amy Cooper’s use of the age-old trope of black male aggression against helpless white femininity to weaponize the police against the bird watcher who wanted her to leash her dog. Amy Cooper is an investment banker. Like members of our community, she is well-educated and numerate. She is an elite, and she i̶s̶ was working at an elite institution that hires data scientists. When she pulled racial rank on that bird watcher in Central Park, that rank was part of that elite identity and role that she and many others play in elite quantitation-heavy institutions.
This is forcing us to hold a mirror to our own institutions, as well as ask ourselves how the tools we are building might contribute to institutional racism more broadly. I’m sharing some links this week, new and old, that connect along those lines. Of course, some members of the community have been sounding the alarm on these issues for a while. Hopefully, this time we might hit a tipping point for those ideas leading to actual institutional change.
Racial disparities in tech
TODAY! 3pm EST 12pm PDT, We Won’t Wait, a virtual event by BLCK VC highlighting the issues faced by black founders, investors, and employees in the tech community.
- Ah, fuck it, make it 1000$ donated to for every person getting help. I reserve the right to set a cap if this gets out of control...
Nicolas Le Roux (@le_roux_nicolas): To encourage people to reach out to mentors, I'll donate 100$ to @black_in_ai for every person who gets help on their submission. Send me an email with your mentor in Cc so that I can keep the count.
Facial Recognition and Race
Face Recognition and AI ethics — Benedict Evans (If you only read one thing, read this)
AI and the Far Right: A History We Can’t Ignore — AI Now Institute
Wang, Cunrui, et al. "Facial feature discovery for ethnicity recognition." Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery 9.1 (2019): e1278. (A controversial paper where Chinese researchers trained an algo to classify Uighur faces, using data of questionable origin.)
Policing and Data
Attending a Protest — EFF’s Surveillance Self-Defense Guide
How we Analyzed the COMPAS Recidivism Algorithm — ProPublica, has link to original COMPAS article
New subreddit dedicated to documenting instances of police brutality during protests (potentially useful for future data analysis)
Help me get the word out. Share this post if you think it's useful.
Ear to the Ground — Recent advances in Transformer Networks
Transformers are the most significant advancement in deep learning in the last four years. In that time, they have emerged as the architecture of key models such as OpenAI’s GPT, Microsoft’s Turing-NLG, and Google’s BERT.
That success has led to a race for size. Larger models have consistently outperformed smaller alternatives across many tasks. Turing-NLG used 17 billion parameters; OpenAI’s new GPT-3 model uses 175 billion.
This publication has argued that there are some bad consequences of this trend.
It biases the articulation of problems toward their amenability to solving with deep learning. If it is infeasible to solve a problem this way (e.g., it is not feasible to get the data within a problem domain curated, structured, and labeled enough to train 175 billion parameters), that problem will be ignored.
It increases the gravitational pull of the few corporations that can afford to train such models, sucking in greater amounts of research resources, IP, and talent. These are the same organizations that we worry have too much control over our culture, political processes, and privacy.
On the flip side, I love the idea of 175 billion parameters. It’s brute-forcing a problem, showing the frontiers of what can be done with raw compute. Once we’ve found that frontier, we can fiddle with the black box and learn how it works. It jibes with my intuition as an engineer: get something that solves the problem, optimize later.
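For a rough sense of the scale involved, here is a back-of-the-envelope sketch using the two parameter counts quoted above. The figure of 2 bytes per parameter (16-bit floats) is my assumption, not something taken from either model's documentation, and it counts only the raw weights, not optimizer state, activations, or training data.

```python
# Parameter counts quoted in the text above.
PARAMS = {"Turing-NLG": 17e9, "GPT-3": 175e9}

# Assumption: 2 bytes per parameter (16-bit floats), weights only.
BYTES_PER_PARAM = 2


def weight_gigabytes(n_params, bytes_per_param=BYTES_PER_PARAM):
    """Storage needed for the raw weights, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9


for name, n in PARAMS.items():
    print(f"{name}: {n / 1e9:.0f}B parameters, ~{weight_gigabytes(n):.0f} GB of weights")

# How much bigger is GPT-3 than Turing-NLG by parameter count?
ratio = PARAMS["GPT-3"] / PARAMS["Turing-NLG"]
print(f"GPT-3 is ~{ratio:.1f}x the size of Turing-NLG")
```

Under that assumption the weights alone come to hundreds of gigabytes for the larger model, which is one way to see why only a handful of organizations can play at this scale.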
Brown, Tom B., Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, Prafulla Dhariwal, Arvind Neelakantan et al. "Language Models are Few-Shot Learners." arXiv preprint arXiv:2005.14165 (2020).
Evaluating Natural Language Generation with BLEURT — Google AI Blog
End-to-end Object Detection with Transformers — Facebook AI Blog
GPT-3 and Scaling Trends — Trees are Harlequins, Words are Harlequins
People still don’t know what to generate
With all the advances in generative ML, people seem not to know what to generate aside from Democracy-threatening deep fakes and sexy people who don’t exist.
I’ve seen some noble efforts. For example, some have tried to use GANs and other approaches to generate novel molecules.
My experience in materials science and drug discovery has taught me that the bottleneck in those fields is not generating new molecules. The bottleneck is figuring out which molecule will be the next blockbuster. How would this molecule behave in a polymer, or in an adhesive used in humid environments? What would this compound’s binding affinity be to a particular cellular receptor? Would it even make it to the cell if orally ingested? This is the real prediction challenge. Generating new things is useless if we can’t predict how useful they’d be.
This summer I’m working through the generative ML literature, mostly VAEs and GANs. I’m particularly interested in models where the latent representation is semantically meaningful, not merely a collection of curves and ovoids (to use computer vision as an example).
If you have any suggestions or want to do some readings together, please reach out.