In a growing clash between AI companies and web security platforms, Cloudflare has raised serious allegations against Perplexity, an emerging AI-powered answer engine. According to Cloudflare, Perplexity has been bypassing standard web rules and crawling websites without permission, using tactics designed to evade detection.
Cloudflare’s research points to a pattern of what it calls stealth crawling. Instead of following established web conventions and obeying the directives in robots.txt files, which tell bots which parts of a site they may and may not crawl, Perplexity has allegedly been ignoring those boundaries. Even more concerning, Cloudflare claims the company is going to great lengths to cover its tracks.
When a website blocks Perplexity’s declared bots, such as PerplexityBot or Perplexity-User, the company reportedly switches tactics. It changes its user agent to impersonate a regular browser such as Google Chrome and rotates its IP addresses to avoid being flagged. According to Cloudflare, this lets the AI system access data from websites that had intentionally opted out of being scraped.
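To see why blocking by user agent alone is easy to sidestep, consider a minimal sketch of such a filter. The function name and both sample user-agent strings are hypothetical, chosen for illustration; only the bot names PerplexityBot and Perplexity-User come from the reporting above, and this is not how Cloudflare's detection works.

```python
# Declared crawler names mentioned in the article.
DECLARED_BOTS = ("PerplexityBot", "Perplexity-User")


def is_blocked_by_user_agent(user_agent: str) -> bool:
    """Return True when the User-Agent header names a declared bot.

    This is a naive filter: it trusts whatever the client claims to be.
    """
    ua = user_agent.lower()
    return any(bot.lower() in ua for bot in DECLARED_BOTS)


# Hypothetical header values, for illustration only.
declared_ua = "Mozilla/5.0 (compatible; PerplexityBot/1.0)"
browser_like_ua = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)

for ua in (declared_ua, browser_like_ua):
    verdict = "blocked" if is_blocked_by_user_agent(ua) else "allowed"
    print(f"{verdict}: {ua}")
```

The self-identified request is blocked, while the same client sending a browser-like header passes through, which is why bot management typically has to look at signals beyond the header itself.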
To test these claims, Cloudflare set up a series of new websites that were not indexed by any search engine and had robots.txt files that disallowed all bots. These test sites were effectively undiscoverable on the open web. However, when Cloudflare asked Perplexity questions about these test sites, it still provided surprisingly accurate and detailed responses. Cloudflare took this as strong evidence that Perplexity had crawled and cached the content despite the sites being unlisted and closed to bots.
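For readers unfamiliar with robots.txt, the sketch below shows what a disallow-all file looks like and how a well-behaved crawler would check it before fetching a page, using Python's standard urllib.robotparser module. The helper name may_fetch and the example.test URL are assumptions made for illustration; the exact contents of Cloudflare's test files are not given here.

```python
from urllib.robotparser import RobotFileParser

# A robots.txt that asks every crawler to stay away from the entire site.
DISALLOW_ALL = """\
User-agent: *
Disallow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = DISALLOW_ALL) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL; a compliant crawler skips the request when this is False.
url = "https://example.test/private-page"
for agent in ("PerplexityBot", "Perplexity-User", "SomeOtherBot"):
    print(agent, "may fetch:", may_fetch(agent, url))
```

Note that robots.txt is purely advisory: nothing in the protocol enforces the answer, so compliance depends on the crawler choosing to run a check like this and honor the result.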
Cloudflare explained that trust between websites and crawlers is built on clear rules. Bots are expected to identify themselves properly, respect site directives, and operate transparently. Perplexity, according to Cloudflare, is breaking that trust by disguising its bots and circumventing these basic rules of engagement.
What makes this situation more striking is the apparent effect on answer quality. Cloudflare noted that once it successfully blocked Perplexity’s undeclared bots, the quality of Perplexity’s responses dropped significantly. If that observation holds, it suggests that at least some of the system’s real-time accuracy relies on stealth data access rather than on licensed or openly permitted sources.
To counter these actions, Cloudflare has now updated its bot management system. It can detect Perplexity’s hidden crawlers and block their activity automatically, even on free user plans. This new defensive layer is designed to give website owners more control over who accesses their content, especially in an era where AI models are rapidly scaling and sourcing data from across the internet.
The debate around how AI companies collect data is becoming more urgent. As models grow smarter and more integrated into our digital lives, questions around consent, copyright, and transparency are growing with them. Perplexity has not yet issued a public response to Cloudflare’s claims, but the spotlight is now firmly on how AI platforms gather the information that powers their answers.