The emergence of the web data infrastructure layer for AI


The next frontier in AI may depend on a new web data infrastructure layer that can enable models to discover and map this ever-expanding digital realm. This layer must be able to navigate hundreds of millions of existing web domains and billions of new URLs created each week, delivering real-time information and overcoming technical barriers.

“The data suggests there’s far more data out there,” says Or Lenchner, CEO of Bright Data, a web data collection platform. “Think of the universe: It’s out there, but you don’t know what you don’t know.”

Enabling access to fresh, relevant, and trustworthy data

While early AI breakthroughs were driven by scaling training data and model size, organizations are now encountering a fundamental bottleneck: They need to keep pace with the dynamic, unstructured, and constantly evolving nature of web data in order to ground outputs in current and verifiable information. AI performance increasingly depends not just on model architecture but on a system’s compute, networking, retrieval, and data engineering capabilities—that is, the system’s ability to quickly and reliably retrieve data that is fresh, relevant, and trustworthy.

Traditional model training relies on snapshots of information collected at a particular point in time. Training AI on such static data is no longer sufficient. To track fluctuations such as competitor pricing, consumer sentiment, and market trends, companies need a constant feed of new information, pulling data in real time along with relevant context. Their infrastructure must therefore be able to handle millions of simultaneous interactions across websites that vary by geography, language, format, and access rules.

“If it can’t retrieve real-time information, it lacks context,” Lenchner says. “In a business setting, that’s not acceptable anymore. Stale answers lead to bad decisions and disappointed consumers.”

Speed is not merely a matter of convenience; it’s a matter of necessity. Today’s organizations operate in environments where prices, inventory, markets, security threats, and customer behavior change continuously. Delayed data retrieval can reduce the usefulness of an otherwise sophisticated model.

Using live, high-quality web data can also reduce AI hallucinations because the model has a more relevant knowledge base. This builds user trust. In fact, one survey found that 56% of AI practitioners said businesses need access to real-time web data to improve trust in AI outputs. To ensure the model runs efficiently and effectively, the information must also be pared down to the appropriate essentials. 

Despite the introduction of retrieval-augmented generation (RAG), where models pull in external data at the moment of a query, many AI systems still struggle to deliver outputs that are current, contextually relevant, and trustworthy in operational settings. According to Gartner, 60% of AI projects that are not supported by AI-ready data—accurate, structured, organized, and contextualized—will be abandoned by the end of the year. 



Source link

  • Related Posts

    European Space Agency’s Euclid Captures The Star-Filled Center Of The Milky Way

    NASA will begin mapping the galactic bulge with a mission later this summer. ESA/Euclid/Euclid Consortium/NASA, CFHT, image processing by J.-C. Cuillandre and E. Bertin (CEA Paris-Saclay) A large…

    Sharing a love for calculus

    As our graduates know better than anyone, preparation in calculus is effectively an admissions requirement at a place like MIT—which means that students in schools with no calculus classes are…

    Leave a Reply

    Your email address will not be published. Required fields are marked *

    You Missed

    Bosnia win 3-2, knock out Qatar to keep alive hopes of World Cup round of 32 | World Cup 2026

    Bosnia win 3-2, knock out Qatar to keep alive hopes of World Cup round of 32 | World Cup 2026

    Bosnia responde con autoridad y derrota 3-1 a Qatar en un partido decisivo.

    Bosnia responde con autoridad y derrota 3-1 a Qatar en un partido decisivo.

    10 American Brands Making Denim & Jeans In The USA

    10 American Brands Making Denim & Jeans In The USA

    Here’s how former baseball star Alex Rodriguez is investing amid stock market volatility

    Here’s how former baseball star Alex Rodriguez is investing amid stock market volatility

    N.L. Crown prosecutor charged with smuggling, prohibited weapon possession

    N.L. Crown prosecutor charged with smuggling, prohibited weapon possession

    European Space Agency’s Euclid Captures The Star-Filled Center Of The Milky Way

    European Space Agency’s Euclid Captures The Star-Filled Center Of The Milky Way