Tobold's Blog
Thursday, July 06, 2023
 
Dead Internet Theory, APIs and AIs

According to the "dead internet theory", most traffic on the internet comes from bots, not humans. Bots supposedly overtook humans as internet users around 2016, and supposedly now make up 64% of internet traffic. And that was before the meteoric rise of ChatGPT put AI at the forefront of the discussion about the future of the internet. AI can not only produce a lot of bogus content, with bots replacing humans as internet content creators; it also relies on bots that read information from the internet for its training.

A current trend is sites like Reddit and Twitter introducing or massively increasing prices for the use of their APIs. API stands for Application Programming Interface, which is one of the main ways in which a bot would read information from another site. For Reddit the discussion was mostly about whether it was okay for another company to make a user interface for Reddit content and make money off that, while denying advertising revenue to Reddit; but it was then revealed that the protesters' Reddit blackout had made Google search results worse. Websites generally don't mind being "read" by Google search bots, because that tends to lead to people searching for information getting a link back to the original website. But AI is obviously going to fundamentally change how search engines work, and the new way won't necessarily work with links.

So, charging a bot money for reading a lot of information from your website via API also seems a rather logical step to prevent revenue loss from somebody training an artificial intelligence on all of the information on your platform without giving anything back to you. It might prove more effective than trying to sue the AI company after it has downloaded 12 million images "without permission or compensation". The nature of generative AI is that it uses fragments of all the sources that were used for training, which makes a small fee per source used a lot more practical than copyright law.
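To make the idea of a small fee per API call concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration only: the `ApiMeter` class, the free allowance and the price are made up and are not the actual pricing or implementation of Reddit, Twitter or any other site.

```python
from collections import defaultdict

# Hypothetical pricing, chosen only for illustration.
FREE_CALLS_PER_MONTH = 1_000   # small allowance for casual, human-scale use
PRICE_PER_1000_CALLS = 0.25    # dollars per 1,000 calls beyond the allowance


class ApiMeter:
    """Counts API calls per client key and computes a monthly bill."""

    def __init__(self) -> None:
        # Maps an API key to the number of calls made this month.
        self.calls = defaultdict(int)

    def record_call(self, api_key: str) -> None:
        """Register one API call made with the given key."""
        self.calls[api_key] += 1

    def bill(self, api_key: str) -> float:
        """Return the amount owed for calls beyond the free allowance."""
        used = self.calls.get(api_key, 0)
        billable = max(0, used - FREE_CALLS_PER_MONTH)
        return billable / 1000 * PRICE_PER_1000_CALLS


if __name__ == "__main__":
    meter = ApiMeter()
    for _ in range(500):
        meter.record_call("casual-user")
    for _ in range(5_000):
        meter.record_call("training-bot")
    print(meter.bill("casual-user"))
    print(meter.bill("training-bot"))
```

Under these made-up numbers a casual user stays within the free allowance, while a bot reading the site in bulk pays in proportion to how much it reads.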

The result of all this might actually be an internet that is less friendly towards bots, and thus an increase in the percentage of users that are actually human. The alternative is an internet where search bots and AI training bots take up an ever-increasing part of the traffic, until advertising companies realize that having an ad "seen" by a thousand bots isn't doing anything for sales, and stop paying for that. Or the few remaining humans on the internet develop some NI, natural intelligence, and refuse to watch boring derivative content created by AI, artificial intelligence.

Comments:
Companies restricting access to their APIs in order to prevent data from being taken will actually have the opposite effect. Bots built purely to siphon data will simply revert to scraping data off sites like reddit, which is very inefficient and causes even more load on websites.

Of course I don't believe that's the reason reddit did this. They likely did it because 3rd party apps don't generate any revenue for them while also being used by their most costly users, power users and mods. This move forces these people to either quit using the site or move to the official app or site where reddit can make money off them. Makes complete business sense.
 
Look at Twitter’s new “rate limit exceeded” error: Platforms can prevent scraping as well as access via API.
 
There are ways around that as well. It all depends on how much effort someone is willing to put into scraping your site. Stuff like Captcha can be circumvented as well.

And seeing as how Twitter was apparently DDoSing itself when it implemented the login and rate limits, I don't believe they have engineering staff competent enough to win an arms race versus dedicated scrapers.

But this is all what-ifs, because I frankly have no idea how popular a target Twitter is for bot scrapers.
 
> This move forces these people to either quit using the site or move to the official app or site where reddit can make money off them.

How? Not much from advertising. Power users and mods are the most likely to use ad blockers religiously. I've been on Reddit for 17 years and have basically never seen an ad there. At best they can sell my anonymous redditing habits.
 
Most reddit users use an app, be it the official one or 3rd party. The idea is to force them to the official one where they would be served ads.

I guess users could opt for something like Firefox mobile with ublock on their phones though.
 