WHOIS Record Redaction and GDPR: What's the Evolution Post-2018?

We all use the Internet daily. Practically every element of our reality has its equal in the virtual realm. Friends turn into social media contacts, retail establishments to e-commerce shops, and so on.

We can't deny that the Internet as originally designed differs greatly from what it has become. One example that we'll tackle in this post is the seeming loss of connection between domains and their identifiable owners. While the original motivation for linking the two was technical, it has since shifted to something more important: keeping the digital world and its users safe from threats that may result from anonymity.

The modified goal gave birth to the WHOIS protocol — a standard that allowed users to learn who is responsible for a high-level Internet domain. But the protocol and related policies on its use have also changed throughout history. What hasn't is the fact that it still provides a unique link that ties what's virtual to what's real.

Despite WHOIS's relevance, however, the arrival of new data regulations like the General Data Protection Regulation (GDPR) has seemingly severed the ties that bind domains to their owners. While it's true that the data contained in WHOIS records can be partly personal, it serves a crucial purpose when determining who or what outfit could be responsible for a cyber attack.

We sought to quantify the extent of WHOIS record redaction in this report and summarize our key findings here.

The GDPR's Effects on WHOIS Records

We used data from WhoisXML API's daily data feeds and archives. These data sets comprise daily collections of domain WHOIS records. We sought to determine the ratio of unredacted to redacted WHOIS records since the GDPR took effect. An unredacted record shows the registrant's name or organization, while a redacted one fills those fields with placeholder values such as "REDACTED FOR PRIVACY" or "GDPR masked."
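The distinction between redacted and unredacted records can be sketched in a few lines of Python. This is only an illustration of the definition given above, not the code used for the report. The `WhoisRecord` class, its field names, and the helper functions are hypothetical. The marker list contains only the two placeholder strings quoted in the text, and treating an empty field as redacted is an assumption.

```python
from dataclasses import dataclass

# Placeholder strings quoted in the text; real feeds likely contain more variants.
REDACTION_MARKERS = ("redacted for privacy", "gdpr masked")


@dataclass
class WhoisRecord:
    # Hypothetical, simplified record; real WHOIS data has many more fields.
    domain: str
    registrant_name: str = ""
    registrant_organization: str = ""


def is_redacted_value(value: str) -> bool:
    """Return True if a field is empty (an assumption) or holds a known placeholder."""
    text = value.strip().lower()
    return text == "" or any(marker in text for marker in REDACTION_MARKERS)


def is_unredacted(record: WhoisRecord) -> bool:
    """A record counts as unredacted if it shows a registrant name or organization."""
    return not (
        is_redacted_value(record.registrant_name)
        and is_redacted_value(record.registrant_organization)
    )


def unredacted_share(records) -> float:
    """Fraction of records that are unredacted (0.0 for an empty collection)."""
    records = list(records)
    if not records:
        return 0.0
    return sum(is_unredacted(r) for r in records) / len(records)


# Toy records for illustration only; they are not taken from the report's data.
records = [
    WhoisRecord("example.com", "REDACTED FOR PRIVACY", "REDACTED FOR PRIVACY"),
    WhoisRecord("example.org", "Jane Doe", ""),
    WhoisRecord("example.net", "GDPR Masked", "Example Ltd"),
    WhoisRecord("example.co.uk", "", ""),
]
print(f"{unredacted_share(records):.0%}")  # prints "50%"
```

Matching on substrings rather than exact values is a deliberate simplification: it catches variants such as "Redacted for Privacy Purposes", at the risk of misclassifying a legitimate name that happens to contain a marker.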

We specifically obtained the said ratios for two top-level domains (TLDs) — .uk and .com. Our findings are shown in greater detail in the following sections.

.uk Domains

We chose .uk domains for the following reasons:

  • Until not long ago, it was a European country-code TLD (ccTLD).
  • The domains' WHOIS records use English, easing our search.
  • U.K.-based domain owners typically produce more informative records than owners of, say, .de or .hu domains.
  • The U.K. was still a member of the European Union (EU) when the analyses were done.

Our analysis of data from January 2018 to 31 October 2020 revealed a sharp drop in the volume of unredacted WHOIS records as soon as the GDPR took effect.

Figure 1: Volume of WHOIS record redaction for .uk domains (2018 – 2020)

The purple line shows the number of domains registered, the green line shows the number of domains with unredacted WHOIS records, and the red line marks when the GDPR took effect. Note that smoothing has been applied to the curves to show the trends more clearly.
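The post does not say which smoothing method was applied to the curves. As one common possibility, a centered moving average over daily counts can be sketched as follows. The function name, the window size, and the sample numbers are all assumptions made for illustration.

```python
def moving_average(values, window=7):
    """Centered moving average over a list of numbers.

    The window shrinks near the edges so the output has the same length
    as the input. An odd window keeps the average symmetric around each point.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Toy daily counts with one spike; these numbers are not from the report.
daily_counts = [10, 12, 50, 11, 9, 10, 13]
smoothed = moving_average(daily_counts, window=3)
print(smoothed)
```

A wider window flattens single-day spikes more aggressively but also blurs the timing of abrupt changes, such as a sudden drop on a specific date.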

.com Domains

We chose .com because it remains the most used generic TLD (gTLD) to date. Our analysis of data from 1 January 2015 to 31 July 2020 showed a trend similar to that of .uk domains.

Figure 2: Volume of WHOIS record redaction for .com domains (2015 – 2020)

We will limit our observations to the same elements as those used to analyze the .uk domains.

Focusing on the green line and two dates (14 April 2016, when companies began adopting the GDPR, and 25 May 2018, when it took full effect), we can see that the share of unredacted records, originally at 80%, decreased gradually during the GDPR adoption period. By the time the GDPR was fully enforced, that share had dropped to 20%. The chart thus quantifies the negative effect of the GDPR on WHOIS records.

The Way Forward

The examples featured in this post showed quantitatively how the enactment of the GDPR might have undermined the transparency of WHOIS records. The impact is significant and has implications for scientific research, cybersecurity, and legal investigations, among others.

By Jonathan Zhang, Founder and CEO of WhoisXMLAPI & ThreatIntelligencePlatform.com

Comments

By Michele Neylon  –  Jan 29, 2021 10:34 am PST

As usual your article is light on facts, high on opinion and totally lacking in any clear logical argument.

Whois was not designed for your purposes. In fact that has been the core of many of the debates around whois that have been raging for at least 20 years. Data collection and processing has to have clear legal purposes.
You talk about GDPR, but you make absolutely no attempt to understand it. It's pretty clear from both this article and others that you've posted here over the last couple of years that you have no respect for privacy.
Also, if .uk is no longer a European ccTLD what is it now?
