Could We Have, Maybe, Too Much AI?

This is an interesting problem.

Could the glut of AI-generated content completely pollute LLMs as they ingest AI-generated garbage? Some of that content may even be published by our adversaries for the sole purpose of polluting our AI world. This does not mean that all AI-generated content is trash, but the trick for the vacuum cleaners that are sucking up as much data as they can is to distinguish between accurate content and rubbish. This might not be so simple.

If you are a private company building your own LLM or augmenting a commercial LLM with your own corporate data, you probably stand a pretty good chance of ingesting only valid data.

But as more and more commercial software includes AI features, how well are the vendors validating the content that they are ingesting? That is hard to tell.

As the need for more content accelerates, many creators may go for the low-hanging fruit: data that is easy to collect and not behind a paywall.

The basic idea is that as signal quality degrades over time because of junk training data, models can remain fluent and fully interactive while becoming less reliable, all without you realizing it.

Gartner says that 84 percent of companies that responded to a recent survey expect to increase spending on generative AI this year. Depending on how thoughtful those companies are, they could be creating AI hallucination heaven.

If you need assistance, please contact us. Credit: Dark Reading
