AI Fails 96% of Times, Study Says

Transcript

0:00

In recent years, artificial intelligence

0:02

has been marketed as the technology that

0:04

will replace humans. But today, this is

0:06

still far from happening.

0:07

>> Look, I think what it's going to do is

0:09

completely transform the industry. It's

0:11

going to change the way content is

0:13

created, produced, and distributed. But

0:16

that doesn't mean there isn't going to

0:17

be a role for the great artists who are

0:20

in roles today.

0:21

>> Well, a recent study reveals that AI

0:23

fails in 96% of the real

0:26

professional tasks it performs.

0:27

>> But this was just a reaction to the

0:29

headline, because as you dig deeper into

0:31

what was actually said in the report,

0:33

the picture isn't as clear-cut as AI is

0:35

simply failing everywhere.

0:37

>> In addition, it makes constant

0:39

errors, such as producing corrupt files

0:41

or incomplete work that forces people to

0:43

redo them from scratch.

0:45

>> But the real problem is you don't know

0:46

what it's made of, and you don't know

0:47

what's accurate. So you or your staff

0:50

manually have to go back and check

0:52

everything.

0:53

>> For this reason, AI cannot work on its

0:55

own, since it does not understand rules

0:57

and does not know what it is doing. It

0:59

only imitates language without having

1:01

real logic.

1:01

>> But sometimes, that answer is unwieldy

1:04

or not what you wanted.

1:05

>> Despite this, many companies are laying

1:07

off people, while one in five AI systems

1:09

makes critical errors in sectors such as

1:11

medicine or engineering. Even in the

1:13

medical field, Reuters just reported

1:15

that the FDA has received 100 reports of

1:18

AI malfunctions, botched surgeries, and

1:20

misidentified body parts.

1:21

>> Thus, what seemed to be the future of

1:23

work could become a problem with much

1:25

more serious consequences than expected.

1:27

So, why is AI failing so much? In recent

1:30

years, artificial intelligence has gone

1:32

from being a technological promise to a

1:34

tool present in daily life. More than

1:36

300 million people use AI in their work,

1:39

which has accelerated the debate about

1:41

its impact on employment. However, a

1:43

recent study shows that AI fails in 96%

1:47

of professional tasks, which raises

1:48

doubts about its ability to replace

1:50

humans anytime soon. As companies invest

1:53

large amounts of money in this

1:54

technology, expectations also grow. The

1:56

six US firms pouring the most cash into

1:59

AI are projected to spend over 750

2:02

billion dollars on it this year alone,

2:04

more than the entire GDP of Ireland.

2:07

>> It is expected to take on complex

2:09

functions autonomously. More than 400

2:11

million dollars are allocated each year

2:13

to the development of new models,

2:15

reflecting market confidence in its

2:17

potential. It is estimated that AI

2:19

investment up to 2025 shows steady

2:22

growth, reaching 126 trillion dollars.

2:25

However, there is a clear gap between

2:27

what is expected from these systems and

2:29

what they can actually deliver with

2:30

professional quality. This gap raises a

2:32

clear issue regarding the true scope of

2:34

artificial intelligence in today's labor

2:36

market.

2:37

>> It's only honest for people to speak

2:40

frankly about the fact this will have a

2:42

big effect on the job market.

2:44

>> One of the main problems is the low

2:45

effectiveness of AI in real tasks. In

2:48

tests carried out in areas such as

2:50

design, programming, and writing, more

2:52

than 500 different professional tasks

2:55

were evaluated. Most systems failed to

2:57

complete them correctly. This shows that

2:59

although AI can generate content, it

3:01

still does not reach the level of

3:02

precision required for human work. In

3:04

addition, only 12% of the results were

3:07

considered acceptable without

3:08

modifications. This limitation becomes

3:11

even more evident when results are

3:12

compared with human performance under

3:14

similar conditions. In work carried out

3:16

by professionals, eight out of 10 tasks

3:19

meet standards from the start. This

3:21

shows a clear difference in consistency

3:23

and reliability. AI can produce fast

3:25

responses, but the final quality still

3:27

depends on human intervention.

3:28

Furthermore, more than 200 million work

3:31

tasks have been analyzed across

3:33

different sectors to measure AI

3:34

performance. Despite this volume of

3:36

testing, results continue to show that

3:39

the technology does not fully adapt to the

3:41

demands of professional work. In one

3:43

study, GenAI failed in 95% of cases

3:46

where companies tried implementing it.

3:48

In another study of 25,000 Danish

3:51

workers, introducing AI meant more work

3:54

for about 8% of people.

3:55

>> Added to this limitation are the

3:57

frequent technical errors these systems

3:59

present. In many cases, AI generates

4:01

corrupted or incomplete files that

4:03

cannot be used directly in a

4:04

professional environment. This not only

4:06

affects work quality, but also causes

4:08

delays in processes. For example, in

4:11

programming tests, around 35% of

4:13

solutions had failures that prevented

4:15

execution. This type of error forces a

4:17

human to review and correct the work

4:19

from scratch, reducing the expected

4:21

benefit of automation. Additionally, it

4:24

is estimated that more than 230 million

4:26

AI-generated files require some type of

4:29

correction before they can be used. On

4:31

top of that, around 35% of results

4:33

contain formatting errors that make

4:35

immediate use difficult. This shows that

4:37

the technology is still not fully

4:39

reliable in producing technical content.

4:41

>> And like we told you at the beginning of

4:42

this, we did use AI to make part of this

4:44

segment, and when we got back from the

4:46

prompt that we put in, there were

4:48

actually a lot of mistakes.

4:49

>> On the other hand, it is important to

4:51

understand that AI works better as a

4:53

tool rather than a complete replacement

4:55

for human work. Its real value lies in

4:58

its ability to support processes and

5:00

improve efficiency in specific tasks. In

5:02

simple activities such as summarizing

5:04

information or generating drafts, it can

5:06

reduce work time by approximately 40%.

5:09

This allows professionals to focus on

5:11

more complex tasks. However, these

5:13

benefits largely depend on constant

5:15

human supervision. When AI is used

5:17

without control, results tend to lose

5:19

quality. In many cases, errors go

5:21

unnoticed without proper review. In

5:22

fact, teams that combine AI with

5:24

supervision achieve better results,

5:26

where seven out of 10 tasks reach an

5:28

acceptable level. Additionally, more

5:30

than 150 million workers use AI as

5:33

support in their daily activities. This

5:35

shows that its current role is more

5:37

related to assistance than full

5:38

replacement.

5:39

>> Sure, we're going to have some jobs that

5:41

change, but really it's going to be

5:42

about working hand in hand with AI,

5:44

using AI as our co-worker.

5:46

>> Despite these limitations, many

5:48

companies have begun making rushed

5:49

decisions regarding the use of AI. In

5:52

some sectors, human teams have been

5:53

reduced by 15% with the expectation that

5:56

technology can cover those functions

5:58

without issues. However, results have

6:00

not been as expected. In several cases,

6:02

staff reduction has led to a decrease in

6:04

work quality and an increase in

6:06

operational errors. This situation has

6:08

forced many organizations to rethink

6:10

their strategies. It is estimated that

6:12

more than 110 million workers could be

6:14

affected by similar decisions in the

6:16

coming years. However, it is also

6:18

expected that three out of 10 companies

6:20

will need to rehire staff to recover

6:22

lost quality levels. Additionally,

6:24

around 45% of companies that implemented

6:27

AI rapidly have reported problems in

6:29

their internal processes. This

6:31

highlights poorly planned adoption. The

6:33

CEO recently walked that policy back and

6:36

started hiring people again. Not out of

6:38

the goodness of his heart, but because

6:39

AI just couldn't cut it, and he's not

6:42

alone. Another key aspect is the lack of

6:44

real logic in these systems. Although

6:46

they can generate coherent text, they do

6:48

not truly understand the content they

6:50

produce. This limits their ability to

6:52

make correct decisions in complex

6:54

situations. In specific tests, more than

6:56

60% of decisions made by these systems

6:59

were incorrect when practical rule

7:01

application was required. This shows

7:03

that AI does not possess a deep

7:05

understanding of the world. This

7:06

limitation is also reflected in the fact

7:08

that five out of 10 responses contain

7:11

reasoning errors in complex contexts.

7:13

Although the language may appear

7:14

correct, the content can have

7:16

significant flaws. Additionally, more

7:18

than 170 million analyzed interactions

7:21

show that AI tends to repeat patterns

7:23

without evaluating whether they are

7:25

appropriate for each situation. All of

7:27

these disasters make sense. LLMs predict

7:30

the next word statistically, but they'll

7:32

never tell you when they don't know the

7:33

answer or if they can't understand

7:36

something. Finally, risks increase when

7:38

AI is used in critical areas where

7:40

errors can have serious consequences. In

7:42

sectors such as medicine or engineering,

7:44

precision is essential. A 25% increase

7:47

has been observed in incidents related

7:49

to the improper use of automated systems

7:52

in technical environments. This data

7:53

reflects the importance of maintaining

7:55

constant human oversight. Additionally,

7:58

more than 60 million automated decisions

8:00

per year require human review to avoid

8:02

significant errors. This shows that

8:04

technology still cannot assume full

8:06

responsibility in these fields. Compared

8:08

to processes controlled by

8:09

professionals, two out of five automated

8:12

results present risks that cannot be

8:13

ignored. And in another study focusing

8:16

on programmers specifically, AI made

8:18

coding take 19% longer on average.

8:22

Despite the advancement of artificial

8:23

intelligence, its inability to perform

8:25

complex tasks with precision shows that

8:28

it is still not ready to replace humans.

8:30

The gap between expectations and reality

8:32

remains wide, confirming that its role

8:34

should remain as support rather than a

8:37

substitute.

8:41

At Economy Media, your opinion matters

8:43

to us. Subscribe and let us know what

8:45

you think in the comments below.

Interactive Summary

This video examines the current limitations of artificial intelligence in a professional setting, challenging the narrative that it will soon replace human workers. While massive investments are being made in AI development, research indicates that it still struggles with accuracy, technical errors, and complex logic, failing a significant percentage of professional tasks. The consensus presented is that rather than a replacement, AI should be viewed as a supportive tool that requires constant human supervision to be effective and reliable.