
ChatGPT and FOMO

I was born in 1993. I was too young to witness the internet revolution firsthand, but I still remember the first time I connected with a 56k dial-up modem. If you belong to the Gen Z crew, just know this:

  • you could either be connected to the internet or use the phone
  • the modem made strange sounds while establishing the connection
  • download duration estimates could range from tens of minutes to years.

I wasn’t too young to enjoy the mobile revolution, though: I had a Symbian-powered Nokia phone (remember those? Resistive touchscreen and all of that), and when Steve showed the world the first iPod, phone and internet communicator, you could sense it was different.

AI

Nowadays I’m definitely not too young to witness what feels like an AI revolution. I’m not that good at making predictions, but it really feels like we are on the verge of something big:

  • the rate of improvement in the field is astonishing, even compared with other technologies. Do you remember what dictating a message to your phone felt like 10 years ago? Basically unusable. What about using a web app? Or Googling something? Practically the same as today.
  • the expected value of what could be built on top of this technology is huge: this is one of the few cases where the too-often-abused word disruptive actually applies. Frankly, I don’t think I fully grasp how big of a change this could be.

Am I overreacting? I don’t think so.

Why does this time feel different?

I got interested in AI around 2015. The field was just starting to heat up again after a stagnation period (look up “AI winter” if you’re interested in reading more) thanks to some advancements in computer vision.

Since then, progress has skyrocketed. There have been plenty of breakthroughs: think of when AlphaFold cracked the protein structure prediction problem, or when a deep learning model beat the world champion at Go, a game previously considered intractable for computers.

But this just feels different. Yes, ChatGPT is just a wrapper around a very large (at least by today’s standards) language model. But interacting with it is a new kind of experience: I suppose this is what using the first graphical user interfaces must have felt like. AlphaGo felt like Deep Blue on steroids, despite being a completely novel approach. Advances in understanding protein folding felt more like a scientific discovery than technological progress.

So, as you have probably figured out by now, I can’t fully articulate why this feels different: I just have a gut feeling. After studying, experimenting and working with AI for many years, you develop some taste for which ideas are novel but incremental (residual connections, I’m looking at you) and which are a discontinuity1.
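To make the "incremental" part concrete: a residual connection is essentially one extra addition, letting a block’s input skip past it. A minimal PyTorch-style sketch, purely illustrative and not tied to any project mentioned here:

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Tiny residual block: output = f(x) + x."""
        def __init__(self, dim: int):
            super().__init__()
            # An ordinary pair of linear layers...
            self.f = nn.Sequential(
                nn.Linear(dim, dim),
                nn.ReLU(),
                nn.Linear(dim, dim),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # ...plus one extra addition: a small tweak that makes very deep networks trainable.
            return self.f(x) + x

    x = torch.randn(8, 64)
    y = ResidualBlock(64)(x)  # same shape as the input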

It’s like lighting a room with a small candle: you can sense how big it is, even if you can’t see all of it. And I suspect I’m underestimating how big this room actually is by at least an order of magnitude.

This doesn’t apply only to language models. Generative models in general give me this feeling: have you ever seen what Stable Diffusion can generate? Once again, GANs - the models traditionally used to generate synthetic images - have existed for years. We trained quite a few of those2 and had a blast: but this just feels different to someone who has lost way too many hours of his life debugging an upscaling network.

FOMO

Up to this point, I haven’t said anything particularly original or surprising. Technology moves in trends, and this is the next major one.

Here’s the catch, though: I’m perfectly positioned to ride the next wave, yet here I am, waiting. Thinking about those previous revolutions, I - a bit arrogantly - used to tell myself “man, if only I had been there at the time…”: those felt like easier times to create something new.

The fear of missing out on a once-in-a-lifetime opportunity is ramping up.

But if I’ve learned one thing in my career, it’s that technology by itself can only get as far as the cool-demo level. You shouldn’t go around hitting every problem - even non-existent ones - with a GPT hammer just because it’s fun3.

The question you should ask yourself is always the same, especially when facing a revolutionary technology: what does it enable? What problem is now tractable? How can I use it to make something people want? Right now, I have no answers for ChatGPT and all that jazz.

Sure, you could build a coding or writing assistant. You could disrupt all the stock-photo services, generate marketing campaigns, create a new kind of photo-editing experience, replace every voice-over professional in the world: but what would your competitive advantage be? It all comes down to brains, data and compute: take your pick.

If you’re excited about this new thing, just get started and learn. Avoid jumping on the bandwagon of those who fill their mouths with “AI” without even knowing what being a universal approximator means: it may be difficult at first, but you’ll be training yourself to see through smokescreens.

The primary function of this post is just to remind myself it’s fine not to swing at every ball that gets thrown at me. I’m just going to keep building stuff that solves problems through technology.

But it’s so damn hard to stick to it this time 😂


  1. Sure, discontinuities are oftentimes the result of stacking together many incremental improvements. This is not to downplay incremental advancements in any way: it’s just something different.
  2. It’s a technique called dataset augmentation: you don’t have enough images to train a network on, so you generate additional ones. In the old days this was done by scaling and rotating the images in the original dataset (a rough sketch of that classic approach follows these notes), but in industrial scenarios it is often useless. That’s another story, though.
  3. Yes, I do understand there are many people with a different definition for fun.
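
For the curious, footnote 2 in practice: classic dataset augmentation just derives extra training samples from the images you already have. A minimal sketch using Pillow, with angles and scale factors picked arbitrarily for illustration:

    from PIL import Image

    def augment(image: Image.Image) -> list[Image.Image]:
        """Derive extra training samples from a single image (classic augmentation)."""
        variants = []
        # Rotated copies
        for angle in (90, 180, 270):
            variants.append(image.rotate(angle, expand=True))
        # Rescaled copies
        w, h = image.size
        for scale in (0.75, 1.25):
            variants.append(image.resize((int(w * scale), int(h * scale))))
        return variants

    # Usage: dataset.extend(augment(Image.open("sample.png")))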

Published Mar 10, 2023

Mechatronics Engineer, machine learning enthusiast, busy building Compiuta.