This week’s post isn’t ready yet. I know you couldn’t think about anything else till I posted, so I’ll put you out of your misery. I’m writing this quick update post, and I’ll publish another one on Friday.
Which leads to my first update: I’m switching from ‘every other Wednesday’ to ‘every other Friday.’ While I still need accountability for my writing, a Friday release gives me more time during the week to think carefully about the ideas. Moreover, I write the kind of content I would personally consume on a Sunday morning, or the weekend more generally, so Friday is great timing.
The second update, and you could say the ‘focus’ of this post, is that I’ve been learning more about specific technologies that are state-of-the-art in artificial intelligence: GPT-3, Stable Diffusion, Jasper, and Whisper. These technologies aren’t at the level we see in sci-fi movies and books like The Matrix, but they’re better than you think.
Here are five AI use cases I have tried that impressed me in the last six months:
Writing Blog Posts and Marketing Copy
I heard about Jasper.ai from an Instagram ad. The pitch was that Jasper, an AI, could be my partner in the content creation process. It offers various options for the tone and style of the text, making it versatile for writing ad copy, marketing briefs, or blog posts. I recently started research with an AI professor at Penn, and I write a newsletter, so of course I tried out Jasper.
I found the product user-friendly; it was easy to tell Jasper what I wanted. However, I felt that AI adds limited value when you don’t have original ideas for a blog post to begin with. I guess creativity isn’t fully commoditized yet. Jasper just raised a $125M Series A last week!
Summarizing Long Pieces of Text
GPT-3, created by OpenAI, is a neural network model trained on internet data to generate text.
The idea is simple: you give GPT-3 some text, called the prompt, and, based on your settings, GPT-3 outputs a guess at the next chunk of text, the completion. Many say it’s getting close to artificial general intelligence (that is, indistinguishable from a human). It’s impressive, but it’s definitely not there yet, so don’t get sold snake oil.
The research team I mentioned is building on a project summarizing lecture transcripts using GPT-3. The cool part of this project was that the authors designed a way to systematically circumvent GPT-3’s limit of about 1500 words across the prompt and completion. Anyone can sign up to use GPT-3 now!
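The paper’s exact method isn’t reproduced here, but the general idea behind working around a prompt-plus-completion word budget is easy to sketch: split the transcript into chunks that fit under the limit, summarize each chunk, then summarize the concatenated summaries. Below is a minimal Python sketch of that pattern; the `summarize` stub (which just truncates) stands in for a real GPT-3 API call, and all names and the specific chunk size are my illustration, not the research team’s actual code.

```python
# Sketch: fitting a long transcript under a ~1500-word prompt+completion
# budget by chunking, summarizing each chunk, then summarizing the
# combined partial summaries. summarize() is a placeholder for GPT-3.

def summarize(text: str, max_words: int = 100) -> str:
    """Placeholder for a GPT-3 completion call; here it just truncates."""
    return " ".join(text.split()[:max_words])

def chunk_words(text: str, chunk_size: int = 1200) -> list[str]:
    """Split text into chunks of at most chunk_size words, leaving
    headroom in the budget for the completion."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

def summarize_long(text: str) -> str:
    # Map: summarize each chunk independently.
    partials = [summarize(chunk) for chunk in chunk_words(text)]
    # Reduce: summarize the concatenation of the partial summaries.
    return summarize(" ".join(partials))

transcript = "word " * 5000  # stand-in for a long lecture transcript
final_summary = summarize_long(transcript)
```

With a real model call in place of the stub, each request stays under the word limit while the final summary still reflects the whole transcript.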
This is all the more interesting because these developments are new and feel cutting-edge once you work closely with them. Just one month ago, researchers at The University of Texas at Austin published a paper showing that GPT-3 produced best-in-class summaries and is vastly underrated by the evaluation metrics we have today.
Language Translation
Language translation isn’t intrinsically impressive; it feels like a problem that has been solved for a while now. What impresses me is that the same GPT-3 model that can summarize a piece of text, or answer open-ended questions, can also translate from English to French.
Prompt: “Translate this to French for me: The blue pen was seized by the angry teacher.” Completion: “Le stylo bleu a été saisi par le professeur en colère.”
Converting Speech to Text
Another technology the research group used, Whisper, is also created by OpenAI. Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. It converts speech to text.
What’s truly impressive about Whisper is that it can still transcribe noisy, poor-quality audio accurately. Listen to the Speed-talking sample on Whisper’s homepage to hear just how accurate it is.
Also, you may or may not be impressed that Whisper can not only transcribe speech but also translate other languages into English. The translation isn’t handed off to GPT-3; Whisper itself was trained on multilingual translation data alongside transcription, so a single model handles both tasks.
Making Art
By far, the most prominent AI news today is the question of compensating artists whose works have been used to train models that generate art and charge users for it. I mention this context because I believe it is symbolic of the fairness concerns behind “AI taking our jobs.”
That said, these text-to-image AI tools can create truly stunning images. OpenAI’s model is called DALL-E. My friends Rish and Daniel preach using Stable Diffusion, which is free. There’s even an AI-generated podcast conversation between Joe Rogan and Steve Jobs.
To dramatize how much AI is the future, I’m going to end this post with a bunch of AI-generated images.
But before that, in preparation for Friday’s post, if you have any thoughts about the streaming services industry, specifically Netflix, Disney, or HBO Max, let me know!
Thanks to Anushka Aggarwal for reading drafts of this.