News Aggregators As Denial of Service Clients
Every once in a while I see a news aggregator developer who
decides to add a 'feature' that needlessly chews through a web
server's bandwidth in a manner one could classify as rude.
The first I remember was Syndirella, which had
a feature that let you syndicate an HTML page and then specify
regular expressions for which parts of the page you wanted it to
treat as titles and content. There are three reasons I consider
this rude:
- If a site hasn't put up an RSS feed, it may be because they
don't want to deal with the bandwidth costs of clients repeatedly
hitting their site on behalf of a few users.
- An HTML page is often larger than the corresponding RSS feed.
The Slashdot RSS
feed is about 2K, while just the raw HTML of the front page of
Slashdot is about 40K.
- An HTML page can change a lot more often than the RSS feed
[e.g. rotating ads, trackback links in blogs, etc.], so a client
sees the page as changed in situations where the RSS feed would not be.
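
To put rough numbers on the size difference, here is a back-of-the-envelope sketch in Python. The 2K and 40K sizes are the approximate Slashdot figures above; the hourly polling interval and the 1,000 clients are assumptions made up purely for illustration, not measurements of any real aggregator.

```python
# Approximate payload sizes quoted above for Slashdot.
RSS_BYTES = 2 * 1024    # about 2K for the RSS feed
HTML_BYTES = 40 * 1024  # about 40K for the raw front-page HTML

# Hypothetical values, assumed only for this illustration.
POLLS_PER_DAY = 24      # each client polls once an hour
CLIENTS = 1000          # number of aggregator clients


def daily_bytes(payload_bytes, polls_per_day, clients):
    """Bytes served per day if every poll downloads the full payload."""
    return payload_bytes * polls_per_day * clients


rss_total = daily_bytes(RSS_BYTES, POLLS_PER_DAY, CLIENTS)
html_total = daily_bytes(HTML_BYTES, POLLS_PER_DAY, CLIENTS)

print(f"RSS:  {rss_total / 2**20:.1f} MiB per day")
print(f"HTML: {html_total / 2**20:.1f} MiB per day")
print(f"Scraping the HTML costs {html_total // rss_total}x as much")
```

Whatever polling rate and audience size you plug in, the ratio stays at about 20 to 1, and that is before counting the extra fetches caused by the page changing more often than the feed.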