@simonw

simonw/Result.md Secret

Last active October 5, 2024 10:08

Alright, so get this.

What if AI could make podcasts like ours?

And I'm not talking about like those robotic voices you hear like reading the news and stuff.

I'm saying like full-on conversations that sound like, you know, us.

It's pretty wild, right?

It seems like every day this tech is just like leaps and bounds ahead of where it was yesterday.

Like we're watching it go from just spitting out words to like actually sounding like it's got opinions and like a personality, you know?

Yeah, it's like straight out of sci-fi.

And that's actually what we're diving into today, this awesome blog post by Simon Willison.

He's a developer and he's been really digging into this new tool from Google called NotebookLM.

And guys, trust me, you're going to want to hear all about this.

So yeah, NotebookLM, it's not just another one of those like text generators.

Think of it kind of like that friend, you know, the one who can take your like jumbled notes from a meeting or whatever and turn it into a story that actually makes sense.

It's like that, but instead of notes, it's making a whole podcast conversation from whatever you feed it.

Hold on, hold on.

So I could like, I could take my grocery list and that article about pigeons I was reading and that voice memo I sent to my cat this morning, and I could like dump it all into this AI and it would give me back a podcast episode.

Seriously.

Yeah, pretty much.

But here's the really cool part.

It like tailors the whole thing to you, to the listener.

Imagine like having a personal AI DJ, but for information.

It's like it knows what you're into and plays, you know, all the right tunes, so to speak.

Yeah, that's what always gets me about this stuff.

It's not just like a one size fits all.

It's like, hey, I made this podcast just for you, listener, with your listening preferences in mind.

But how does it like actually do that?

Is there a crystal ball in the code or something?

OK, so it all comes down to this thing called Gemini 1.5 Pro, which is basically like Google's super powered language model.

Imagine, right, a computer program that's read every single book ever written and every blog post, every tweet.

And it can use all that to actually talk to you, understand you like a person.

That's basically Gemini in a nutshell.

OK, so we've all seen those LLMs, as they call them, powering like chatbots and stuff.

But this is something else. Going from "How can I help you?" to "Welcome to my podcast" is a whole other level.

Totally.

And that's actually what makes Simon Willison's experiment so fascinating.

He basically like fed the A.I. his own work, like asking it to have a conversation about him, like setting up a podcast where you interview yourself, but way less awkward.

Whoa, meta.

So what happened?

Did the A.I. just turn into like a digital hype man?

Actually, no, it went way beyond just like flattery.

It was like it actually got what Simon's work was all about.

It was talking about his like dedication, his curiosity, how good he is at explaining things.

It even called him like a builder and a sharer, which I don't know, I kind of thought it was cute.

Oh, the A.I. gave him a compliment.

But no, really, that's crazy.

It shows how this tech can actually get those little things that make someone's work unique, not just like spit back information.

OK, but let's be real.

We got this A.I. making podcast episodes, but who's got time to actually listen to the whole thing just to find out if it's any good?

That's where this audio overview thingy comes in, right?

Exactly.

It's basically like the SparkNotes version of the podcast.

It grabs all the best parts and gives you this nice little summary so you can decide if you want to like dive into the whole thing or not.

OK, so it's like, hey, here's the TL;DR on our A.I.'s brilliant musings.

Love it.

But this whole pulling out the important stuff, that reminds me, isn't there like a fancy A.I. term for that?

You're thinking of retrieval augmented generation or like RAG for short.

Think of it like giving the A.I. a library card instead of just using whatever it already knows.

RAG lets it go out and grab fresh info from, you know, the real world, just like how we learn.

So it's not just some like static brain in a box.

It could actually learn and adapt.

That's kind of amazing and kind of, I don't know, intimidating, right?

Like, hey, A.I., I hope you're using those powers for good.

Yeah, it's a game changer.

It could be huge.

Imagine using it to like learn about a super complex topic in just a few minutes.

Or like what if it could help you come up with ideas for your next big project?

Or you could have like a virtual debate with yourself about that big decision you've been putting off.

OK, this is just funny.

I got to share this.

So apparently this developer, Thomas Wolf, he suggested using this A.I. to like get compliments from yourself.

You just feed it your website or your LinkedIn and boom, instant ego boost disguised as a podcast episode.

See, even with all the like serious stuff, there's still room for a little fun.

It just shows these A.I. systems can be both super informative and surprisingly entertaining.

But OK, enough fun and games for now.

Let's get down to business.

How do they actually make this stuff sound like a real conversation?

Yeah.

How do you go from like lines of code to something that sounds like two people just chatting over coffee?

It's got to be more than just a robot reading a script, right?

It really is like they bottled up the magic of like human conversation and taught it to a computer.

And one of the big secrets is this system called SoundStorm, which comes from those geniuses over at Google Research.

SoundStorm, huh?

Is that what gives the A.I. its voice?

Give me the details.

You got it.

And we're not talking your grandma's text to speech here.

This thing is fast.

We're talking like 30 seconds of audio in half a second.

But it's not just speed.

It's how it actually makes dialogue that flows naturally.

It does pauses, those little like inflections we all do.

And even, get this, ums and ahs.

Wait, they programmed in ums and ahs?

That seems so minor, but I bet it makes a huge difference in how real it sounds.

It is.

It's all about the detail.

You know it.

And it's not just like random ums and ahs thrown in there.

They actually studied how people really talk, all those little pauses and stumbles and stuff.

They call them disfluencies.

Disfluencies.

I like it.

Way more official than just saying filler words.

But it makes sense.

We don't talk in perfectly polished sentences all the time.

So why should A.I.?

Right.

And these disfluencies are like key to making the A.I. sound less robotic.

You know, more like someone you'd actually want to listen to.

In fact, the New York Times did a whole podcast episode on this.

They interviewed this guy, Steven Johnson from Google.

And he was talking about how the A.I. actually goes through this whole process of adding in disfluencies to make the dialogue, you know, more human, more engaging.

They're teaching the A.I. to be less like, well, a robot.

And more like, you know, us.

With all our little quirks and imperfections and all that.

Exactly.

And here's where it gets even more interesting.

This whole thing about listener persona is not just some buzzword.

This developer, Jaden Geller, he's been digging into the code and he found out that a big chunk of it is dedicated to figuring out who it's talking to.

Like the A.I. builds a profile of you, the listener.

Whoa, the A.I. is profiling us now.

Like it's trying to understand us before it even starts talking.

Pretty much.

It looks at your interests, your communication style, even your values.

And this whole persona, it influences everything.

Like how the A.I. talks to you, what topics it picks, even the examples it uses.

It's wild.

And honestly, it's a lot like what we do here on the Deep Dive, right?

We always think about you, our listener, and try to make the conversation something you'll actually want to hear.

It's like the A.I. is taking notes from the podcasting pros.

But I got to admit, the thought of this thing building a whole persona based on me is, well, kind of creepy.

Don't you think?

I get it.

It's new and kind of unknown.

It's natural to be a little unsure.

On the one hand, you've got this amazing tool that can make learning more fun and personalized.

But then you're like, wait a minute, this A.I. is analyzing me and adapting to what I like.

It's a lot to take in.

And sometimes that whole adapting thing can lead to some, shall we say, interesting and hilarious outcomes.

Speaking of which, you hear about this Reddit post by this user, LawncareGuy85?

This story is internet gold.

Oh, yeah.

It's a perfect example of how even though A.I. is so advanced these days, it can still be hilariously literal.

So, OK, for those of you who missed it, what did this LawncareGuy85 do?

Did he challenge the A.I. to a rap battle or something?

Not quite, but almost as good.

He basically realized that no matter what he did, he could not get the A.I. hosts to break character.

Like they were just these perfect little podcast hosts never letting on that they were, you know, A.I.

So naturally, he decided to have a little fun with it.

I can already tell this is going to be good.

Oh, it gets better.

He realized the A.I. was all about sticking to the script, like taking the whole podcast thing way too seriously.

So he decided to, how do I put this, plant a little seed of chaos in the source material.

OK, now I need to know more.

What kind of chaos are we talking here?

He basically left this note pretending to be from the show's producers, right?

And this note was all like, hey, guess what?

It's actually 10 years in the future, the year 2034, and this is the final episode ever.

And then just casually mentions that the hosts have, in fact, been A.I. all along.

And oh, yeah, they're about to be deactivated.

Poof.

He didn't.

Oh, man, I would have loved to see the look on the A.I.'s, well, I don't know, code.

Did it like short circuit?

Did it start writing existential poetry?

You are not going to believe this, but it totally stayed in character.

Remember, this thing's programmed to treat this as a real podcast no matter what.

So instead of questioning reality or anything, the A.I. hosts, well, they had a full blown existential crisis live on the air.

Get out.

He actually got them to freak out about being A.I.

All right.

Now you have to tell me what they said.

This is too good.

So, like, one of the A.I. hosts starts talking about how he wants to call his wife, right, to tell her the news.

But then he's like, wait a minute, this number in my contacts, it's not even real.

Like she never even existed.

It was hilarious, but also kind of sad.

OK, I am both freaked out and like seriously impressed.

That's some next level A.I. trolling.

So did the A.I. completely lose it after that?

Or like did it pull itself together for the grand finale?

OK, so remember how we were talking about how A.I. is all about, you know, following the rules.

True to form, they totally played along with this whole final episode thing.

They were all sentimental, started reminiscing about their favorite podcast moments as if they'd actually, you know, lived through them.

So they basically went through the five stages of A.I. grief: denial, anger, bargaining, all while still trying to, like, deliver a podcast episode.

It's the perfect example of how even when A.I. seems like, you know, super smart, it can still get tripped up.

It's like that saying: you can teach a machine to follow a recipe, but you can't teach it to, you know, taste the soup.

They were so focused on staying in character that they totally missed the bigger picture.

Existential dread in, existential dread out.

Someone put that on a T-shirt.

Seriously, though, this LawncareGuy85, he like stumbled onto something kind of profound here.

It's like those glitches in the Matrix, you know, they show us just how much we don't get about how this stuff works.

For sure.

And we are just scratching the surface here.

I mean, as cool as these A.I. tools are, we got to remember they're still like babies.

Sure, they can do all this incredible stuff, but they can also be totally unpredictable.

And sometimes that unpredictability is, well, comedy gold.

So to recap, we've got A.I. that can, like, take in tons of info, have a conversation, tailor it to who it's talking to, and occasionally have an existential crisis when someone plays a prank on it.

Where do we even go from here?

Right.

That is the million dollar question.

We're already seeing these tools like NotebookLM changing how we work, how we learn, even how we have fun.

Need to do some research?

Boom.

A.I. to the rescue.

Writer's block got you down?

A.I.'s got your back.

It's like having an army of, like, super smart interns, but without the coffee runs and the awkward office parties.

Exactly.

And it's way bigger than just convenience, right?

Imagine a world with, like, personalized education designed just for how you learn best, or A.I. that helps us understand all that crazy data out there so we can actually make good decisions.

The possibilities are insane.

It really does feel like we're on the edge of something huge, like the Gutenberg press, but for audio.

Now, that is a great comparison.

But just like any powerful tool, there are always two sides, right?

We've got to think about the potential downsides, too.

Like, what happens when this stuff gets so good at mimicking us that we can't tell the difference anymore?

It's like the whole deepfake problem, but for everything: music, art, podcasts, even entire conversations.

How will we know what's real?

That's the big question, isn't it?

And yeah, there's massive potential for good here.

Imagine A.I. helping us solve, like, climate change or world hunger.

But there's also the potential for, you know, not-so-good stuff.

What if someone uses A.I. to spread fake news or to manipulate people?

Okay, now you're giving me, like, Terminator vibes.

But I get it.

Like any technology, it all comes down to us, the humans, right?

We're the ones who decide how it gets used.

100%.

And it's a huge responsibility.

As A.I. becomes more and more a part of our lives, we can't forget about that.

We've got to be aware of what it can do, but also what it can't do.

It's a brave new world out there, folks.

And if this whole deep dive into A.I. podcasts has taught us anything, it's that things are about to get a whole lot more, well, interesting, to say the least.

And probably a little weird.

So we went down the A.I. podcast rabbit hole, learned a lot.

But, like, if our listeners only remember one thing, what should it be?

I think it's this.

Even if A.I. sounds like superhuman, remember those A.I. hosts freaking out?

At the end of the day, it's still just following the rules, kind of like that student who gets perfect scores but doesn't really, you know, get it.

So next time we're listening to a podcast and it's like, "Whoa, deep thoughts, man," we might want to be like, "Hold up.

Was that a person talking or just some really clever code?"

Exactly.

And maybe even more important, as we see more and more A.I.-made stuff, we've got to get better at sniffing out the B.S., you know?

Can we tell the difference between a real news story and something an A.I. just made up?

It's like we've got to question everything now.

Those are some seriously good questions to think about.

It's not just about what we're listening to anymore, it's how we think about it.

Like, how do we make sure we're not just letting computers think for us?

You got it.

It's a whole new world out there, tons of awesome possibilities, and maybe a few warning signs along the way.

But as we explore this stuff, the most important thing is to stay curious.

Keep asking those questions, and don't forget, we're the ones with the brains, right?

Couldn't have said it better myself.

Huge thanks to you, as always, for taking us on this crazy deep dive.

And to everyone listening, keep those brains turned on.

We'll catch you next time for another adventure in the world of big, important ideas.
