Website Usage Analysis in the New gTLDs

Andy Simpson

A recent study by EURid and the Leuven Statistics Research Centre set out to better understand the most common uses of websites linked to domain names, and we thought it would be an interesting exercise to extend a similar analysis to the new gTLD market. So, we analyzed all second-level domains registered in new gTLDs according to the zone files published on June 29, 2014. Verisign utilizes our own proprietary process for classifying websites, which results in similar classifications to those by EURid. The primary difference is that the Verisign classification method is machine-based and is evaluated for each domain independently, while the EURid approach leveraged samples that humans classified.

One key finding from our June 29, 2014, analysis is that 3 percent of domain names registered in new gTLDs contain business websites. Here we define "business" as a website that shows commercial activity. This definition is slightly broader than EURid's "business" classification, which requires a website that clearly shows commercial activity and is designed for customer interaction. By comparison, EURid's usage statistics for eight established TLDs found that 30.5 percent of domain names in those TLDs contained business websites.

Our analysis also found the most common use of domain names in the new gTLD space is Pay-Per-Click (PPC), with roughly 41 percent of all new gTLD domain names serving up PPC websites. A PPC website contains little user-generated content and almost exclusively advertising links.

The prevalence of PPC websites in the new gTLD space can likely be attributed to:

  • Heightened speculation in the new gTLD space;
  • The practice of several new gTLD registries of registering domains to themselves that technically remain available at premium retail pricing;
  • Several campaigns that provide new gTLD domains at little or no cost to end users (some reportedly without their prior consent), including at least one campaign that automatically creates PPC websites on those domains; and,
  • The EURid approach classifies domains which immediately redirect or "forward" according to their ultimate destination. In contrast, the Verisign method for website classification identifies domains that forward to an alternative destination as a "Redirect." When the Verisign approach is augmented to classify domain usage according to the ultimate destination, the most notable change is that an additional 2 percent of new gTLD registrations are linked to websites serving up PPC content.

Figure 1 – Source Verisign (June 2014)
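
The two ways of treating forwarding domains can be illustrated with a small sketch. This is not Verisign's proprietary process or EURid's method: the domain names, the labels assigned to them and the CRAWL table are all made up for illustration, and a real survey would fill such a table from a crawler. The sketch only shows how the same set of domains yields different category counts depending on whether a forward is labeled "Redirect" or followed to its final destination.

```python
from collections import Counter

# Hypothetical crawl results (not data from the study):
# domain -> (label of the page served at that domain, forward target or None)
CRAWL = {
    "example-a.xyz": ("Redirect", "parking.example"),
    "example-b.xyz": ("PPC", None),
    "example-c.club": ("Redirect", "shop.example"),
    "example-d.guru": ("Error", None),
    "parking.example": ("PPC", None),
    "shop.example": ("Business", None),
}


def classify_direct(domain):
    """Label the domain by what it serves itself; forwards stay 'Redirect'."""
    return CRAWL[domain][0]


def classify_by_destination(domain, max_hops=5):
    """Follow forwards and return the label of the final destination."""
    seen = set()
    while domain not in seen and len(seen) < max_hops:
        seen.add(domain)
        label, target = CRAWL.get(domain, ("Error", None))
        if target is None:
            return label
        domain = target
    return "Error"  # forwarding loop or too many hops


new_gtld_domains = ["example-a.xyz", "example-b.xyz",
                    "example-c.club", "example-d.guru"]

direct = Counter(classify_direct(d) for d in new_gtld_domains)
by_destination = Counter(classify_by_destination(d) for d in new_gtld_domains)

print("Forwards labeled as Redirect:", dict(direct))
print("Forwards followed to destination:", dict(by_destination))
```

In this made-up sample the "Redirect" category disappears once forwards are followed, and its domains are reassigned to the categories of their destinations, which is the kind of shift the last bullet above describes.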

Finally, Verisign's study also found that each of the new gTLDs has a personality of its own, with very different usage distributions. Two examples:

  • Dot Chinese Online (.在线/.xn--3ds443g) has 91 percent of the registered base serving up "Error" websites. This usage spike likely correlates with the registry's unique distribution model, under which it agreed to assign a significant portion of its new names to the Chinese central government. All of the names that are presumably part of that deal fail to return websites when users from the United States attempt to access them. The usage distribution of the remaining top 10 TLDs can be seen in Figure 2.
  • XYZ.COM LLC (.xyz) has a high concentration of PPC websites as a result of a campaign that reportedly automatically registered XYZ domains to domain registrants in other TLDs unless they opted out of receiving the free domain name. After registration, these free names forward to a PPC site unless reconfigured by the end user registrant.

Figure 2 – Source Verisign (June 2014)

While it is still early days for new gTLDs, this analysis offers an interesting snapshot of the first few months of new gTLD general availability. It will be interesting to see how website usage evolves over the next year as more gTLDs become available for registration.

By Andy Simpson, Data Scientist at Verisign
Comments

Alex Tajirian – Aug 16, 2014 11:08 AM PDT

Hi Andy,

"The primary difference is that the Verisign classification method is machine-based and is evaluated for each domain independently, while the EURid approach leveraged samples that humans classified."

I don’t see a reference for the technical analysis of your study. Thus, I am not sure what statistical classification model, if any, you used.

I am a strong advocate of machine-based analyses for domain names, but you have not made a convincing argument as to why, for this study, machine-based classification is better than human.

"Verisign utilizes our own proprietary process for classifying websites"

So the statistical methodology is not proprietary but the process is? Why the secrecy?

"The EURid approach classifies domains which immediately redirect or 'forward' according to their ultimate destination. In contrast, the Verisign method for website classification identifies domains that forward to an alternative destination as a 'Redirect.'"

I am not sure I understand the difference between the two studies.

"Heightened speculation in the new gTLD space"

Is the speculation about domain name prices, about the viability of new gTLD parking, or both? Is either of them bad in your opinion? Why?

How sensitive are your results to .xyz’s alleged campaign to automatically register XYZ domains?

John McCormac – Aug 20, 2014 3:32 AM PDT

Eurid's approach and categories are not good references. It was a manual categorisation of a sample of approximately 5,000 domains on a ccTLD with approximately 3.7 million domains. As a sample, the 5,000 domains might show coarse trends, but important and subtle trends such as duplicate content, clone websites, framed websites and domains for sale are missed. These trends are difficult to detect efficiently with manual processing. With an automated full TLD survey, these categories are somewhat easier to detect.

The .EU market is essentially a set of country level markets overlaid with a smaller regional (EU-wide) market rather than a global market. The results for the .EU ccTLD when analysed at a country level in conjunction with the set of major TLDs and ccTLDs active in that market can display very different characteristics to a small sample of the whole ccTLD considered in isolation. Some EU countries such as Germany would have over a million .EU domains registered, whereas others might only have ten thousand or fewer. In terms of a country level market's domain footprint, the .EU ccTLD registrations in that country might only occupy a few percent of the footprint while the country's ccTLD and .COM would have over 80%. The new gTLDs, apart from the regional ones, are typically global TLDs targeting a global market and they are often very small players in country level markets.

Grouping the results of a number of new gTLDs together like this can be somewhat misleading in that while all new gTLDs are in their first year of operation, some of the new gTLDs are at different phases in their lifecycle. And a few are developing at a faster pace than others.

The level of PPC parking in .XYZ was 90.27% according to the .XYZ web usage survey that HosterStats.com ran on 26th June as part of a web usage survey of the top ten new gTLDs. Because .XYZ is much larger than the other individual new gTLDs, using cumulative numbers (all new gTLDs) for each category will skew the percentages. The differences between the new gTLDs and more mature TLDs are clear from the 110,000 domain .COM web usage survey and other TLD web usage surveys that were run in July 2014. Some new gTLDs are beginning to take on the early characteristics of a Truckstop TLD where users go before being redirected to the registrant's website in another TLD. Part of this is down to Search Engine Optimisation where a registrant will register keywords in the new gTLD and redirect traffic to their main website. The registrant will maintain their primary brand website in .COM or a ccTLD. This keyword redirection strategy was most notable in the relaunch of the .CO ccTLD as a global TLD. It is somewhat different to simple brand protection redirects.

Brand protection redirects are often exact match redirects to websites using the same domain name string in another TLD. Brand protection registration patterns in new gTLDs are also slightly different to those in legacy gTLDs. The typical Intellectual Property/brand owner approach prior to the launch of the new gTLDs was defensive in nature. An IP owner would register the brand in the main gTLDs and the ccTLDs in which it wanted to do business. With so many new gTLDs hitting the market at the same time, the costs of a defensive strategy registering brands in each new gTLD have risen. Brand protection has drifted from defense towards enforcement and this has reduced the level of brand protection registration activity in some new gTLDs.

End user speculation in the new gTLDs is not as prominent as it was in the launch of previous large TLDs (.EU, .MOBI, .ASIA, .CO). Some of the new gTLDs have had large numbers of premium domain names reserved by the registries or registry affiliated companies. This dampened speculation in those new gTLDs. The registration fee in some new gTLDs is a multiple of the .COM registration fee and this also reduces speculation. The new gTLDs generally have a percentage of their registered domains for sale. Some are premium domains being offered for sale by registries and registry affiliated companies but there are also end users who have put domain names up for sale. It could be argued that premium domain name reservations by the registries and affiliated companies constitute a form of speculation but the general intention of the registries seems to be that they will be drip fed into the market to maintain end user interest in the new gTLD.

On the PPC angle, many registrars now automatically park undeveloped domain names on PPC landing pages. This is, to some extent, temporary PPC while a website is being developed. While there is an element of PPC as a business, PPC in newly launched TLDs tends to have different characteristics to that of PPC in mature TLDs. In terms of direct navigation traffic, .COM is still the winner and the level of public awareness of the new gTLDs means that direct navigation traffic will be far lower than that of .COM or ccTLDs.

The Verisign results are more reliable than those of Eurid as they are full TLD surveys rather than samples. The categorisations need some refinement. The Business category is somewhat nebulous. The level of e-commerce enabled sites in a TLD is always going to be a small fraction of number of business websites due to the brochureware nature of most business websites.

Alex Tajirian – Aug 20, 2014 11:31 AM PDT

"The Verisign results are more reliable than those of Eurid as they are full TLD surveys rather than samples."

More reliable, yes! But, what about classifying a few million domains? Didn’t automating classification lead to Google’s dominance over Yahoo?

John McCormac – Aug 21, 2014 1:09 AM PDT

A good categorisation process will scale, Alex,
Google's dominance was due to identifying authority sites. Both Google and Yahoo were already automated in terms of their search engines. Yahoo had its own pay for inclusion web directory and I think it used to pay people to write a few lines about the paid submissions.

Eurid's surveys were, I think, a response to the multi-million domain surveys of .EU that I ran periodically. In addition to other larger web usage surveys, I run a monthly survey of approximately 10K .EU domains. The October 2011 .EU web usage survey covered 2.3 million domains. When running surveys on this scale, automation is essential unless you have access to a large number of students willing to sit in front of monitors all day. Verisign also runs large surveys and has, as Andy mentioned, an automated process.

Eurid's approach owes more to Psephology than genuine website usage analysis. This is why, with its 5K domain sample, its results become increasingly unreliable as the size of the TLD being surveyed increases.

The most effective approach to large surveys is an automated and algorithmically based process. That's why the results of the Verisign and HosterStats.com surveys are in broad agreement. The small size of Eurid's samples means that the results become more unreliable as the size of the TLD being surveyed increases. A 5K domain survey on a TLD with 100K registrations does not have the same accuracy as a 5K domain survey on a TLD with 3.7M registrations or a 5K domain survey on a TLD with 114M registrations. Thus the most accurate survey is a full TLD survey. It is possible to use a statistical sampling approach to web usage analysis but the Margin of Error and sample sizes are related to the size of the TLD being surveyed. One cannot use the same small sample size on TLDs with vastly differing registration volumes and expect all results to be equally reliable.

To extend the Google/Yahoo comparison, what Verisign and HosterStats.com are doing is akin to search engine index development. Eurid is using the approach of small web directory developers. A good web usage analysis is a precursor to a viable search engine index.
