Top 100 Websites Illustrate Long Tail
The top 100 sites on the Internet get as many visitors as the next 900 combined. And the falloff gets much steeper after that.
Looking at data provided by Google on the top 1000 sites, the gang at Pingdom found:
- To become a top 1,000 website you need at least 4.1 million visitors per month.
- To become a top 500 website you need at least 7.4 million visitors per month.
- To become a top 100 website you need at least 22 million visitors per month.
- To become a top 50 website you need at least 41 million visitors per month.
- To become a top 10 website you need at least 230 million visitors per month.
- To become the number 1 website in the world? Then you need more than 540 million visitors per month.
- The top 10 websites get 42% of the visitors to the top 100, and 21% of the visitors to the top 1,000.
- The top 100 gets 50% of the visitors to the top 1,000. I.e. the top 100 together get as many visitors as the following 900 websites counted together.
They’ve illustrated this with some interesting charts. Here are the top 100 sites:
And here’s the top 1000:
The highest ranking site even in the neighborhood of being a blog is HuffPo, which is at 187. Only a handful of blogs likely crack the top 1000.
The bottom graphic actually misleads about the parity of the sites, because the top 30 or so sites force the use of a huge scale. The graph is divided into increments of 50 million unique visitors, but the 32nd-ranked site already falls below that and the 101st site is at less than half that. The 283rd site is below 10 million and the 756th is below 5 million. Most blogs you read are below 1 million.
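For anyone who wants to check share figures like the ones in Pingdom's list against a ranked set of visitor counts, here is a minimal sketch in Python. The function name top_share and the toy list are my own for illustration. The toy data is a synthetic Zipf-like curve (visitors proportional to 1/rank), not the Google numbers, so its percentages will not match the ones quoted above.

```python
def top_share(visitors, k, n):
    """Return the fraction of the top-n sites' total visitors
    that goes to the top-k sites (hypothetical helper)."""
    ranked = sorted(visitors, reverse=True)
    if not 0 < k <= n <= len(ranked):
        raise ValueError("need 0 < k <= n <= len(visitors)")
    return sum(ranked[:k]) / sum(ranked[:n])


# Synthetic, illustrative data only: the site at rank r gets
# visitors proportional to 1/r. This is an assumption made for
# the sketch, NOT the data Google published.
toy = [1.0 / rank for rank in range(1, 1001)]

print(f"top 10 as a share of the top 100:    {top_share(toy, 10, 100):.0%}")
print(f"top 10 as a share of the top 1,000:  {top_share(toy, 10, 1000):.0%}")
print(f"top 100 as a share of the top 1,000: {top_share(toy, 100, 1000):.0%}")
```

Swapping the real monthly visitor counts in for toy would give the corresponding shares for the actual list.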
The list uses a concept of “site” which is useful to service providers, but not so much so for anyone further down the food-chain.
I mean, “facebook” is in there. As an installation, as an infrastructure problem, it is huge. But it certainly doesn’t mean that any one person in particular is driving top-1000 traffic with their facebook page.
Similarly “wordpress” is in there. As far as I know wordpress the organization drives essentially zero traffic. They provide an environment, for some successful and some unsuccessful people.
To make this interesting to anyone other than massive-scale web admins, the concept of “site” has to be pushed down.
There are a couple of interesting things about this. First, most of the top 100 sites aren’t content providers. They’re portals. Just as in the Gold Rush it isn’t the prospectors who get rich, it’s the guys who sell tools.
Second, I don’t have hard proof of this but to my eye the long tail result appears to be robust. That is to say that, if you consider the individual categories of site, within the categories the same long tail behavior is evident.
John: But most top websites aren’t one-man shows but rather conglomerates. It gets murky in cases like The Atlantic, where Andrew Sullivan drives a huge chunk of their traffic, but it otherwise works fine. Facebook is simply a more dominant Web presence — by far — than any individual site.
Dave: Yup, I think those observations are both right. There are a tiny number of dominant players in any given field and the rest of us are fighting over the crumbs.
We need regulations to ensure that visitorship is equal among websites instead of the top 10% hogging all of the visitors…
I think, James, we need a distinction between sites that do some kind of editorial direction and those that are more like providers, or carriers.
That’s why wordpress.com having such a high rating is a good case in point. As far as I know they do zero content direction or management across their sites, but their numbers are reported with wikipedia who does have a site-wide content system and management.
Right now it’s like comparing Time magazine with a commercial printing company, in “page renditions.”
Pareto rules, so what’s new?
“numbers are reported with wikipedia who does have a site-wide content system and management.”
But this example demonstrates why such distinctions are folly: I’d consider Wikipedia no different than Facebook. Both are essentially portals with user-generated content.
“Pareto rules, so what’s new?”
I don’t know that anything’s “new” so much as misunderstood. I’m flummoxed, for example, by the continuing flood of articles about blogs that seem not to understand that some yahoo with 27 pageviews a month isn’t comparable to one with 2.7 million — much less one with that much every few hours. “Blogs” as a phenomenon vs. the handful or so that really influence the debate within a given niche are too frequently conflated.
James Joyner says:
Tuesday, July 6, 2010 at 13:51
Jim, the problem with this list is that it mixes portals and genuine blogs that originate material, even if in the process they use raw material from other sources within the blogosphere or traditional media. That said, if they separated the list into categories like blogs and portals, or even subdivided blogs into, say, politics, fashion, ballroom dancing, etc., the same phenomenon would occur and the 80/20 rule would apply. I don’t personally conflate high traffic and low traffic blogs or pay much attention to articles that do. In fact I’m very sceptical of the influence of obviously high traffic blogs like Huffington Post or Politico, which are largely bs aimed at generating headlines. I rather like your blog because it’s what I’d categorise as sane conservative, which is not a common phenomenon.
Perhaps repetition, with expansion, will help:
Right, I get your point. It’s obvious that Facebook and Yahoo are something different from Huffington Post, which is in turn different from Daily Kos or TPM or InstaPundit. But I’m not sure where you draw the line between, say, Facebook and Wikipedia, which are both contributor sites not intended to be viewed as a unit.
And if you can figure out how to differentiate those, then there will likely need to be dozens of other categories. It’s silly to compare a blog that’s mostly original content with one that’s mostly aggregation of others’ work. Or political commentary and captioned cats.
If we are looking at it from a sys admin perspective, it’s all about page accesses to a top-level domain. That’s fine for them.
But if “publishers” are looking to compare with one another, I’d say you need some list of domains with unified editorial management. That doesn’t fall out of domain name or registration rules. Such a list may not exist, hence the apples-and-oranges lists.
(Possibly, if wordpress has such a high rank, google is currently reducing top-level domains hosted on wordpress to “be” wordpress.)
“Huffington Post, which is in turn different from Daily Kos or TPM or InstaPundit.”
Apart from being much better financed I don’t see a lot of difference between Huffington and the others, they are left leaning, have a stable of writers, and let people post comments. Kos has a somewhat more open architecture but they are all fundamentally similar to each other and indeed to OTB in structure (if not politics) and most routine blogs on subjects as diverse as men’s fashions and the Hapsburg Monarchy.
“Apart from being much better financed I don’t see a lot of difference between Huffington and the others, they are left leaning, have a stable of writers, and let people post comments.”
But even most newspaper and magazine sites allow comments now. HuffPo has a huge paid staff as well as an even larger band of unpaid contributors plus aggregates tons of external content. It’s almost completely different from Kos, which has a handful of top level authors and tons of unpaid “diarists” generating personalized content. OTB has a much smaller staff still and no diarists.
James Joyner says:
“HuffPo has a huge paid staff as well as an even larger band of unpaid contributors plus aggregates tons of external content. It’s almost completely different from Kos,”
But Jim, this is just a question of scale because Huffington is so much better financed. It’s like comparing the Waldorf and a Red Roof Inn. They are both in the hotel business but one offers many more services. You could be a center right Huffington if you could get someone to sign the checks!