
Measuring Typosquatting Perpetrators and Funders

Benjamin Edelman

Co-authored by Tyler Moore (Harvard School of Engineering and Applied Sciences) and Benjamin Edelman (Harvard Business School).

For more than a decade, aggressive website registrants have been engaged in 'typosquatting' — the intentional registration of misspellings of popular website addresses. Uses for the diverted traffic have evolved over time, ranging from hosting sexually-explicit content to phishing. Several countermeasures have been implemented, including outlawing the practice and developing policies for resolving disputes. Despite these efforts, typosquatting remains rife.

But just how prevalent is typosquatting today, and why is it so pervasive? We set out to answer exactly these questions. In Measuring the Perpetrators and Funders of Typosquatting (appearing at the Financial Cryptography conference), we estimate that at least 938,000 typosquatting domains target the top 3,264 .com sites, and we crawl more than 285,000 of these domains to analyze their revenue sources.
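To make the idea of a "typo domain" concrete, here is a minimal Python sketch that enumerates single-character misspellings of a domain. It is an illustration only: this post does not describe how the paper generated its candidates, and the function name `typo_candidates` and the choice of edit types (deletion, substitution, transposition, insertion) are assumptions made here.

```python
import string

# Characters allowed in a hostname label (letters, digits, hyphen).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def typo_candidates(domain):
    """Return the set of domains one character edit away from `domain`.

    Hypothetical helper: only the label before the first dot is edited,
    and the rest of the name (for example ".com") is kept unchanged.
    """
    label, dot, rest = domain.lower().partition(".")
    variants = set()

    for i in range(len(label)):
        # Deletion: drop the character at position i.
        variants.add(label[:i] + label[i + 1:])
        # Substitution: replace the character at position i.
        for c in ALPHABET:
            variants.add(label[:i] + c + label[i + 1:])

    for i in range(len(label) - 1):
        # Transposition: swap two adjacent characters.
        variants.add(label[:i] + label[i + 1] + label[i] + label[i + 2:])

    for i in range(len(label) + 1):
        # Insertion: add one character at position i.
        for c in ALPHABET:
            variants.add(label[:i] + c + label[i:])

    # Remove the correct spelling and labels that cannot be registered.
    variants.discard(label)
    valid = {
        v for v in variants
        if v and not v.startswith("-") and not v.endswith("-")
    }
    return {v + dot + rest for v in valid}
```

A list like this only produces candidates. Deciding which candidates are registered, and which of those are actually typosquatting, still requires checking each domain, as the crawl described above does.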

We find that 80% of typo domains are supported by pay-per-click ads. Often, the typo domains show ads that promote the correctly spelled site, along with the site's competitors. Screenshots of selected examples.

Another 20% of typo domains include static redirects to other sites. For example, 156 misspellings of yellowpages.com redirect to the competing website yellowpagesoftheworld.com. We devised an automated technique that uncovered 75 otherwise legitimate websites which benefited from direct links and redirects from thousands of misspellings of competing websites.
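One simple way to surface sites that benefit from redirects is to group typo domains by the host they finally land on and discard those that lead back to the intended site. The sketch below assumes the crawl results are already available as two dictionaries. The names `redirect_beneficiaries`, `final_urls`, and `intended_site` are hypothetical, and this is not necessarily the automated technique used in the paper.

```python
from collections import defaultdict
from urllib.parse import urlparse


def _host(url):
    """Extract the hostname from a URL and drop a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def redirect_beneficiaries(final_urls, intended_site):
    """Group typo domains by the third-party host they redirect to.

    final_urls:    dict mapping typo domain -> final URL after redirects
    intended_site: dict mapping typo domain -> the site the user meant
    Both inputs are assumed to come from an earlier crawl.
    """
    groups = defaultdict(list)
    for typo, url in final_urls.items():
        dest = _host(url)
        # Skip typos that stay put or send the user to the intended site.
        if dest and dest != typo and dest != intended_site[typo]:
            groups[dest].append(typo)
    return {dest: sorted(typos) for dest, typos in groups.items()}
```

Destinations that collect many typo domains, as in the yellowpages.com example above, are the ones worth reviewing by hand.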

So what's the harm in typosquatting? First, typosquatting confuses consumers, causing them to visit sites different from the ones they intended to visit. Second, site operators must pay large sums of money to ad platforms such as Google AdWords in order to reach the users who specifically requested the corresponding sites. Third, we found evidence that ad platforms exacerbate typosquatting. Using regression analysis, we determined that websites in categories with higher pay-per-click ad prices face more typosquatting than websites whose keywords fetch lower ad prices.

Just how much revenue comes from ads on typo sites? It is difficult to know for certain, since Google and others do not disclose revenue figures at the granularity of particular advertising programs such as AdSense for Domains. We attempt a back-of-the-envelope estimate using Alexa reports of website popularity. We estimate that typo domains matching the top 100,000 websites collectively receive at least 68.2 million daily visitors. If these typo domains were treated as a single website, that site would be ranked by Alexa as the 10th most popular website in the world. It would be more popular, in unique daily visitors, than twitter.com, myspace.com, or amazon.com!

According to our analysis, 57% of typo sites include Google pay-per-click ads. Combining our observations with financial reports and others' estimates, we conclude that Google's revenue from typosquatting on the top 100,000 sites is $497 million per year. This is significant, and not only for the advertisers who are losing out by paying to get their ads placed on typo sites. It matters also because Google's competitors rely on typosquatting to a much smaller extent: In our testing, Yahoo's ads appear on 21% of typo sites, and we did not find a single Microsoft ad on any typosquatting site. Looking at Google's ever-growing share of online search and search advertising, we are struck by the role of typosquatting — making Google look that much larger, to advertisers and to analysts, when in fact this typosquatting traffic is entirely ill-gotten.

However, other findings leave us optimistic about the feasibility of significantly reducing typosquatting. Google's ad click links indicate which Google partner is paid for clicks at a given typo domain. We found high concentration among Google partners engaged in typosquatting: Of typo domains showing Google ads, 63% use one of five Google advertising IDs. So while the sheer number of typo sites remains high, the number of key perpetrators is small.
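The concentration figure above is a top-k share: the fraction of typo domains whose ads are tied to the k most frequent advertising IDs. A minimal sketch of that calculation follows, assuming one advertising ID has already been extracted per typo domain. The function name `top_k_share` and the input format are assumptions made for illustration.

```python
from collections import Counter


def top_k_share(ad_ids, k=5):
    """Return the fraction of domains covered by the k most common ad IDs.

    ad_ids: iterable with one advertising ID per typo domain showing ads.
    """
    counts = Counter(ad_ids)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(n for _, n in counts.most_common(k))
    return covered / total
```

A high share for a small k means that action against a handful of accounts would affect most of the typo domains in the sample.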

Our web appendix details many specific typosquatting domains — including the registrars and hosting companies who support those domains and, crucially, the ad networks whose payments put the system in motion.

Our full posting: Measuring the Perpetrators and Funders of Typosquatting and web appendix.

By Benjamin Edelman, Assistant Professor, Harvard Business School

Share your comments

Why do brands ignore the revenue? Antony Van Couvering  –  Feb 17, 2010 12:15 PM PDT

Nice report, Ben, as usual.

It leads me to ask a question of you. Presumably typosquatters don't just register names for fun, but for money. Why don't brands register their own typos and profit from them?

You'd think they'd be keenly registering every typo out there and collectively being the 10th most popular site on the Internet.

There seems to be some stubborn "principle" that they are standing on, that says something like: "We shouldn't have to register typos of our own name, even though we would profit handsomely.  Therefore we will bite off our nose to spite our face."

Every "start your own business" guide on the web suggests registering your name in multiple TLDs as well as obvious typos. This advice is not framed as a protection mechanism, but rather as a way to capture valuable traffic. The more doors that your customers can walk through, the easier you can make it for them, the better.

Laws are one thing, business reality is another. I certainly don't condone typosquatting, but a prudent business person will adapt to the environment as necessary.

It seems that brands (large brands, smaller ones seem to "get it") just want the web to be "their way" or "no way." Brands could easily capture all that traffic, and the associated profits. They would just need to hire an ex-typosquatter or two to show them how, just as they hired ex-hackers to show them how to secure their Internet operations.

To me, the mystery is not that there is typosquatting, but rather why brands aren't profiting from the popularity of their own brands.

Antony

Because the only reason to register those typos is defensive against typosquatters glomming on them Suresh Ramasubramanian  –  Feb 18, 2010 5:35 PM PDT

You don't make money from typosquatting by PPC or driving traffic to the websites.

You gain money by trying to sell these off to the company, the company's competitor, a blackhat SEO who wants to use these domains and pump up his Google rankings in a search for the typosquatted brand's domain, etc.

The PPC isn't worth even chickenfeed, in comparison.

Nice report Ben and Tyler. This David A. Ulevitch  –  Feb 17, 2010 12:22 PM PDT

Nice report Ben and Tyler.  This is a fairly damning article, and it'll be interesting to see how Google responds. 

I never had any solid estimates of how big the business was, but I had my guesses — and your numbers, while larger, aren't that far off from my guesses based on traffic I see.

The corollary to your study Antony Van Couvering  –  Feb 17, 2010 12:46 PM PDT

Ben,

If I'm reading this right, it means that brands are leaving $497 M on the table every year by not registering their own typos.

Antony

Curious: Would typosquatting be considered a form Chr1s Shea7s  –  Feb 17, 2010 2:02 PM PDT

Curious: Would typosquatting be considered a form of social engineering, for the purposes of user training in any given business IT policy program? It requires a user's ignorance and capitalizes on said mistakes for the benefit of the cybersquatter. Taking into consideration the Wikipedia article on typosquatting, is cybersquatting the higher-level problem? And in retrospect, maybe cybersquatting is a form of social engineering? Overall, how might a business best model/write policy/educate its employees about said dangers?

This is a great article Benjamin. Are Constantine Roussos  –  Feb 17, 2010 4:05 PM PDT

This is a great article Benjamin. Are you at Harvard Business School right now? Would love to talk to you about some of these issues face-to-face. I am at the Harvard Business School OPM by the way until this Friday. Actually had a chat with Deepak about some underlying issues. I am sure you know him.

Great work!

Regards,

Constantine Roussos
.music

A bad side effect to a greater good? Christopher Parente  –  Feb 18, 2010 6:36 PM PDT

Thanks for the time and effort creating this comprehensive look at something going on for many years, Ben.

Big old Google sure doesn't need me defending them, but here's a philosophical question. Google did an undeniable good by creating a new source of income for online publishers. AdSense probably keeps thousands of small publishers afloat. But to err is human, and people will always make mistakes typing in domains, at least until the utopia of true semantic search arrives.

So smart people learn to game the system, as smart people game all systems. That makes typosquatting the lesser evil to a greater good. And Google didn't design the overall system - actually no one did, although ICANN tries to manage it now.

Antony, totally agree the big brands were asleep at the switch but isn't it too late, with .com being first come, first served? Or are you saying they could/should wrest control of their typo domains away from squatters?

@Christopher Parente - .com is first-come, first-served Antony Van Couvering  –  Feb 18, 2010 6:56 PM PDT

@Christopher Parente - .com is first-come, first-served but if a typo is so similar to a brand that it confuses consumers, it's recoverable through UDRP.  So if brands don't act to recover it, then either (a) it's not worth much in their opinion, or (b) it isn't close enough to really be a "typo."

Thanks Antony Christopher Parente  –  Feb 19, 2010 9:09 AM PDT

OK, got it. The report states that despite UDRP the practice continues. You're saying brands just aren't using UDRP enough.

Does seem amazing the biggest brands are leaving this money on the table.

That isn't money on the table Suresh Ramasubramanian  –  Feb 21, 2010 7:09 AM PDT

It is mostly useless domains on the table.

Typo-Domains are profitable en masse Enrico Schaefer  –  Feb 26, 2010 9:41 AM PDT

The argument that trademark holders should try and defensively register every possible variation of their marks as domains, across all TLDs, is a common one. But it is really like saying that it's ok to rob someone's house because they left the window unlocked. There really is no way to register all typo variations across all TLDs. More importantly, these typos are profitable for typosquatters because they typically own thousands, tens of thousands or hundreds of thousands of typos across lots of brands.

So trademark owners are not 'leaving money on the table.' And the laws should continue to focus on making unlawful behavior unlawful, and not shift the burden on legitimate businesses to protect themselves or suffer the consequence.
