ElevenLabs V3 Alpha review: Is it worth the hype?

Transcript

0:00

[Music]

0:05

Hello everyone. This is Professor

0:06

Patterns and in this video we're going

0:07

to be covering the ElevenLabs V3 Alpha

0:09

model. Now, a lot of people have been

0:11

talking about this one. Uh I have some

0:13

people commenting in some of my earlier

0:15

videos. Um I will say I've never been a

0:17

huge fan of ElevenLabs before just because

0:19

it's a paid service, especially for

0:20

text to speech and there's so many like

0:22

free services such as Kokoro and Orpheus

0:25

text to speech available. Uh, but I thought

0:27

might as well at least try it out. um

0:29

see what all of the hype is all about.

0:31

So the ElevenLabs V3 Alpha model, um, they

0:35

did also share a sheet of different

0:39

voice-related, um, things like panicked,

0:42

tired, shouting and stuff. There's some

0:44

sound effects like gunshots, rainfall

0:46

and then some unique ones like strong X

0:49

accent. So if I want like maybe a strong

0:51

Italian accent or something I guess I

0:52

can put that. Um I basically just put

0:55

that into ChatGPT. I said, "Give me

0:57

four sentences using some voice related

0:59

sound effects." And uh it gave me some

1:01

sentences here. So, let me just copy the

1:03

first one. And I'm going to paste that

1:06

right here. There are a couple of

1:08

different voices that you can choose,

1:09

but they had some that are like they

1:11

call them the best voices for V3. Um so,

1:15

we'll actually try some of these out.

1:17

We've got some James. I must not

1:19

fear. Fear is the mind killer. Hey

1:23

everybody, this is Juniper. Hey, are you

1:26

looking for a fresh and engaging voice

1:27

for your podcast or social media? Then

1:30

I'm the voice for you. Okay, that's the

1:32

one I selected. And uh it's hope, it's

1:35

upbeat, it's clear, uh, it makes sense.

1:38

Let's try that voice. Turn up the volume

1:40

a little bit. And uh, what's the sentence?

1:42

Excited. You won the competition.

1:43

Giggles. That's amazing. Seriously, I

1:45

didn't think you'd actually pull it off.

1:46

Applause. Um, I started off with 103,306

1:50

credits and it says it costs 25 credits to

1:53

make. Okay. So, that's not bad. I will

1:56

say let's generate this speech and see

1:58

how good it is.

2:02

You won the competition. That's amazing.

2:05

Seriously, I didn't think you'd actually

2:07

pull it off.

2:10

Okay, that it was it sounded good. Not

2:14

better than like Orpheus, maybe. Uh

2:18

let's try the other one that generated.

2:21

You won the competition.

2:24

That's amazing. Seriously, I didn't

2:26

think you'd actually pull it off.

2:30

That was the saddest applause. Um, but

2:34

this one I think this the second one was

2:36

a little bit better for sure. Um, it had

2:38

the giggle. It had the half kind of

2:41

laugh into the sentence like this is

2:44

what I'm talking about. Seriously, it's

2:47

amazing.

2:48

Serious. That's amazing. Yeah, that I

2:51

don't know if that was intentional or

2:52

not, but that was pretty good. Uh, let's

2:55

try the next sentence. So, this one

2:57

says, "Curious. Wait, are you saying

2:59

the treasure is hidden under the old lighthouse?"

3:01

Dramatic pause, thunder. Okay, so maybe I

3:03

want something a little bit more

3:04

dramatic for this voice. Um, let's

3:08

try Liam. Life isn't about finding

3:11

yourself. Life is about creating

3:13

yourself. A single rose can be Friends

3:16

show their love in times of trouble.

3:17

Life without love is like a tree without

3:20

blossoms or fruit. Yeah, I think Harry

3:21

makes sense. So, I'm going to pick

3:23

Harry. And this one costs 28 credits.

3:26

Okay, just

3:30

generate.

3:31

Curious. Wait, are you saying the

3:34

treasure is hidden under the old

3:37

lighthouse? That's actually kind of

3:40

epic.

3:42

All right, a couple of things that went

3:44

wrong. First, it read out curious for

3:46

some reason.

3:47

Um, I feel like the voice kind of

3:49

changed. It read this. It did the pause,

3:52

the thunder, which is okay. And then

3:54

this was in a completely different

3:57

voice. Or am I just tripping? Let me try

3:58

the other one that I generated.

4:00

Curious. Curious. Wait, are you saying the

4:03

treasure is hidden under the old

4:06

lighthouse? That's actually kind of

4:08

epic. I still feel like it's changing

4:11

its voice halfway through the thunder.

4:15

Or am I am I wrong? Let me know in the

4:17

comments if you think so as well. Let's

4:20

try the third sentence. Okay. Starts

4:22

laughing. Oh no, not again. Snorts. You

4:24

and raccoons have the weirdest beef.

4:25

There's a fart. Uh, what voice did I

4:28

choose for this one? Uh,

4:30

maybe by repeating what students say.

4:33

Hey, you're not asleep yet, are you? Oh,

4:36

come on. You think I don't see what's

4:38

happening here? Please. I was two steps

4:40

ahead before you even laced up. All

4:42

right, I'm going to pick Priyanka Sogum.

4:44

Late night radio, neutral accent. Um,

4:47

let's generate speech. How much is this?

4:50

Does it matter? Uh, okay. 19 credits. So,

4:53

the farts are cheaper. So, that's

4:58

great. Oh, no. Not

5:02

again. You and raccoons have the

5:05

weirdest

5:07

beef. Okay. Uh, the fart was a little

5:10

bit underwhelming. Maybe let's go back

5:12

to Blondie. I think Blondie was

5:14

good. And let's generate again.

5:22

Oh no, not

5:24

again. You and raccoons have the

5:27

weirdest beef.

5:31

I didn't expect the explo the explosive

5:34

part um on there.

5:39

Oh no, not again.

5:41

[Music]

5:48

10 on 10. 10 on 10. I think it's

5:52

following

5:58

us. Don't make a

6:02

sound. Wow, that was good. That was

6:07

really good. The heartbeat, the dramatic

6:09

effect, the underwater, the don't make a

6:13

sound. I mean, if I'm writing like a

6:15

horror kind of audiobook, I feel like

6:17

this is a great voice. But the good

6:19

thing is that you can actually download

6:21

um the entire voice file. And what is

6:24

it? It's an MP3 file. Okay, nice. So,

6:27

you can actually download the entire MP3

6:28

file. Great. Let's try a full

6:30

conversation. Maybe like multiple

6:32

speakers and let's see how that goes. Oh

6:34

no, please someone save me. I am in

6:39

trouble. Um, help

6:42

someone please. And then here, let's add

6:46

a villain. So maybe for this villain

6:48

voice, I'm going to pick something

6:50

like Reginald. Intense villain.

6:54

Um, no one

6:57

can save you here. Let's let's add a

7:00

fart in there. Um, and then maybe an

7:04

evil laugh. What other tags are there?

7:07

Explosion. Yeah, let's do that. Um,

7:11

haha,

7:13

explosion. And let me add another tag.

7:15

Maybe something

7:18

like applause. Um, you are in severe

7:25

danger.

7:26

Applause. Let's add another. And this

7:29

can be a follow-up by

7:31

Blondie. Please, someone save me

7:34

again. And now we can have a hero come

7:37

in. Maybe the hero could be

7:43

Kuan. How about this one? Now, if you're

7:44

ever down this way, don't be shy. Yeah,

7:46

this is the one. We southern folk love

7:48

having folks over. Um, stop right

7:53

there. And then maybe a

7:55

gunshot. Um, you sir are under

8:00

arrest. Um, I'm going to take you back

8:04

to the sheriff or I guess I'm the

8:08

sheriff. I'm going, I'm going to take

8:10

you back to downtown to the town to to

8:15

the place in where they go um to prison

8:21

um and then add a speaker. And in this

8:24

case, Reginald

8:26

responds, "No, you will

8:28

not." And then let's make him fart

8:31

again. Um, all right. Let's see how much

8:33

this costs to make. 60 credits. Okay, so

8:36

it does start racking up, but maybe it's

8:38

still not like a huge amount or

8:39

anything. Uh, I think the subscription

8:41

that I had was

8:44

the Creator subscription. I don't

8:48

have the Pro one. I have the Creator

8:49

one. And this gives me 100,000 credits

8:52

per month. If I want more

8:55

credits, is that... okay. So, that's 30

8:57

cents for 1,000 credits. So, that's the

9:00

overall cost. Let's go back here and

9:02

let's generate the

9:06

speech. Oh, no. Please, someone save me.

9:10

I'm in trouble. Help. Someone, please.

9:14

No one can save you here.

9:20

You are in severe danger.

9:24

Please, someone save me.

9:28

Stop right there. You, sir, are under

9:30

arrest. I'm going to take you back to

9:32

prison.

9:35

No, you will not.

9:39

I don't know what it is about that fart

9:41

noise, but that one was okay. Oh no,

9:43

please, someone save me. I am in

9:46

trouble. Help someone. What's with this

9:48

song in the background? What? Oh no.

9:51

Please, someone save me. I am in

9:53

trouble. Help someone. Please. No one

9:56

can save you here.

9:59

[Music]

10:00

You are in severe danger. Please,

10:04

someone save me. Stop right there. You,

10:08

sir, are under arrest. I'm going to take

10:11

Why did that gunshot sound like a fart

10:13

noise? Um, you back to prison. No, you

10:17

will not.

10:20

Um, okay. I I like some aspects. I like

10:24

the fact that the applause that's here,

10:27

it carries onto the second part of the

10:29

conversation. So, it's not like applause

10:31

and then end and then it goes into the

10:33

next part. So, I like some of those

10:36

aspects.

10:38

Um, what does Enhance do? Adds

10:41

audio tags to help guide the delivery.

10:47

What do you

10:49

mean? Oh, it adds the tags so you don't

10:53

have to do it.

10:56

Okay. Wait, it removed all of my

10:59

farts. Oh, no. Please, someone save me.

11:02

Fart. I'm in trouble. Please, someone

11:05

help. Um, let's let's see what other

11:07

tags there are. Um, there is a woo.

11:11

Let's go with the woo. I am in trouble.

11:13

Someone save me, please.

11:16

Woo. And then there

11:19

is echoes. So, I'm going to put that in

11:22

here. No one can save you here. You

11:25

maybe this can be an echo for

11:27

sure. And then what about ASMR

11:31

mode? I think that would be cool.

11:34

Um, please someone ASMR mode save me. I

11:38

don't know how that's going to go, but

11:40

it'll be interesting. And then lastly,

11:42

maybe like

11:43

a

11:45

gulp that can come in after. No gulp.

11:50

You will not. How much does this

11:53

cost me? 77 credits. Okay, so I'm

11:55

starting to rack up now. Oh no. Please,

11:58

someone save me. I am in trouble. Help.

12:01

Someone, please.

12:05

No one can save you here.

12:10

You are in severe danger.

12:14

Please, someone save me. Stop right

12:18

there. You, sir, are under arrest. I'm

12:21

going to take you back to prison.

12:24

No, you will not.

12:28

That one was amazing. Minus the random

12:31

gulp in there. But besides

12:34

that,

12:36

wow, that was that was good.

12:40

um for an audiobook kind of material or

12:43

something.

12:45

I am a little bit impressed. I don't

12:48

know if there is an open-source

12:49

solution that comes close. The cost

12:53

is weirdly or surprisingly not that bad.

12:59

There has to be a catch though, right?

13:03

Like, oh, it's got an 80% discount. No

13:07

wonder. Okay, because this this was good

13:11

for how much I'm paying. Um, this was

13:14

really good. But if it's at an 80%

13:17

discount, okay, that's going to get a

13:19

lot more expensive. Um, June 2025. What

13:23

do... that's a month. Okay, so after a

13:26

month, this model gets extremely

13:28

expensive. Um, but you have a month to

13:31

at least try it out on this discount.

13:33

And honestly, not that bad. I mean, if

13:35

you want to maybe create an audiobook

13:37

in the month, um, go for it. Uh, but

13:39

that's pretty much it for this video.

13:41

Overall, not terrible. Still a paid

13:44

solution, um, but not bad, honestly. All

13:48

right, that's it for this video. Thank

13:49

you all for watching. I'll see you in

13:50

the next one. Goodbye.
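
The credit figures quoted in the video can be turned into a rough cost estimate. The sketch below is only back-of-the-envelope arithmetic. The per-generation credit counts, the 30 cents per 1,000 extra credits, and the 80% discount come from the video. The function names, the informal labels, and the assumption that the discount applies directly to the credits charged per generation are illustrative and not confirmed by the video.

```python
# Rough cost arithmetic using figures quoted in the video.
# Assumption (not confirmed in the video): the 80% discount applies directly
# to the credits charged per generation, so full price = discounted / 0.2.

USD_PER_1000_CREDITS = 0.30  # quoted price for additional credits
DISCOUNT = 0.80              # promotional discount mentioned near the end

# Credits charged for each test in the video (labels are informal).
GENERATIONS = {
    "competition line": 25,
    "lighthouse line": 28,
    "raccoon line": 19,
    "multi-speaker scene": 60,
    "multi-speaker scene, more tags": 77,
}


def credits_to_usd(credits: float) -> float:
    """Convert credits to dollars at the quoted extra-credit rate."""
    return credits * USD_PER_1000_CREDITS / 1000


def undiscounted_credits(discounted: float, discount: float = DISCOUNT) -> float:
    """Estimate the credit cost once the discount ends (see assumption above)."""
    return discounted / (1 - discount)


if __name__ == "__main__":
    for label, credits in GENERATIONS.items():
        full = undiscounted_credits(credits)
        print(
            f"{label}: {credits} credits (~${credits_to_usd(credits):.4f}), "
            f"~{full:.0f} credits (~${credits_to_usd(full):.4f}) without the discount"
        )
```

Under these assumptions, the 77-credit scene comes to roughly two cents at the discounted rate and roughly twelve cents without it, which fits the presenter's remark that the cost is low now but will rise once the discount ends.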

Interactive Summary

The video reviews the ElevenLabs V3 Alpha text-to-speech model. The presenter, initially skeptical because the service is paid while free alternatives such as Kokoro and Orpheus exist, explores the model's capabilities. V3 Alpha supports audio tags for emotional delivery (excited, curious), sound effects (applause, thunder, gunshot), and accents. The presenter tests several sentences with different voices and tags, noting the credit cost of each generation (19 to 77 credits). Some outputs are impressive, such as the horror-style narration; others have issues, such as the "curious" tag being read aloud as a word or the voice seeming to change partway through a line. The presenter also downloads the output as an MP3 file and builds multi-speaker conversations, highlighting both successes and failures in the generated audio. The video concludes with pricing: an 80% discount makes the model affordable for a limited time, after which it becomes considerably more expensive.
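
The multi-speaker test described above can be sketched as plain data. This is a minimal illustration of how tagged lines might be assembled before pasting them into the editor, not the service's API: the `Line` class, the `render` helper, and the "Hero" label are hypothetical, and the sketch assumes audio tags are written in square brackets around the tag name. The voice names Blondie and Reginald and the tag names are taken from the video.

```python
from dataclasses import dataclass, field


@dataclass
class Line:
    """One speaker turn: a voice name, the spoken text, and optional audio tags."""
    voice: str
    text: str
    lead_tags: list[str] = field(default_factory=list)   # tags placed before the text
    trail_tags: list[str] = field(default_factory=list)  # tags placed after the text


def render(line: Line) -> str:
    """Render a turn as tagged text (assumes tags go in square brackets)."""
    parts = (
        [f"[{tag}]" for tag in line.lead_tags]
        + [line.text]
        + [f"[{tag}]" for tag in line.trail_tags]
    )
    return " ".join(parts)


# A shortened version of the scene from the video; "Hero" is a placeholder label.
scene = [
    Line("Blondie", "Oh no, please, someone save me. I am in trouble."),
    Line("Reginald", "No one can save you here.", trail_tags=["evil laugh", "explosion"]),
    Line("Hero", "Stop right there. You, sir, are under arrest.", trail_tags=["gunshot"]),
]

if __name__ == "__main__":
    for turn in scene:
        print(f"{turn.voice}: {render(turn)}")
```

Keeping tags separate from the spoken text makes it easy to try the same line with and without a given tag, which is essentially what the presenter does by hand when swapping voices and tags between generations.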
