Languages in the Root: A TLD Launch Strategy Based on ISO 639

TLD registrations in the Internet’s root-zone file are currently divided into two broad classifications: generic and country-code top-level domains. With respect to the latter, no new “strategy” is required to add further ccTLDs, as a relatively well-working process is already in place to integrate the occasional new country-code top-level domain. With one of these two classifications under reasonably sound management, it is understandable that the ICANN organization views its obligation to “Define and implement a predictable strategy for selecting new TLDs” as a mandate “to begin the process of allocating and implementing new gTLDs”. The flaw in this conclusion, however, stems from the presumption that the Internet’s taxonomy must contain only the two broad classifications mentioned above. I am proposing a third TLD classification, one based on languages.

As our objective is to maximize the public benefit derived from the Internet’s system of unique identifiers, our focus must be upon utility—that which serves the greatest good for the greatest number of people worldwide. A series of top-level domains based on language identifiers would satisfy that goal; it would promote on a global scale commercial and civil/social opportunities that would necessarily result in the opening of new markets for domain registration services world-wide.

As noted in the 16 April 2002 “Discussion Paper on Non-ASCII Top-Level Domain Policy Issues”, “A language-associated TLD string may assist in the development of global language-based Internet communities, particularly where the language speakers are widely distributed around the world, for example, the various Cambodian-speaking communities.”

Imagine for a moment a future in which a young businessman whose native language is Wolof (a language understood by over 8 million Senegalese as well as a language spoken by significant populations in The Gambia, Mauritania, Côte d’Ivoire, Mali, France, Italy, and Spain), can sit by a computer and access websites written in Wolof simply by using a search engine to sift through records found in the .wol domain (a string derived from the ISO 639 list). Before long, he has found trading partners both locally and abroad, a wealth of opportunities and a host of valuable services all provided in his own native language.

It is said that “The Internet is for everyone”—this is a way to make it so, by giving each language grouping its own top-level domain. The significant value of language-based TLDs is to make the Internet more fully accessible to the 92% of the world’s population that does not speak English.

Questions, Issues & Answers:

1. Language

One might reasonably ask how ICANN will know exactly what constitutes a “language”. The answer lies in recourse to an ISO (International Organization for Standardization) list. ICANN uses the ISO 3166 list to determine what constitutes a ccTLD. In similar fashion, the ISO 639 list of three-letter language codes can be used to definitively establish an acceptable list of languages.

2. IDNs

Does this proposal require the creation of non-ASCII TLDs? In the spirit of “keeping-it-simple”, this proposal only calls for the use of the ASCII three-letter codes as established in ISO 639. One hopes that after an initial proof-of-concept stage is evaluated, possibilities will later emerge to allow for an ultimate migration to non-ASCII representations under ICANN’s guidance.

3. Quantity of TLDs

Just how many language-associated TLD strings (L-TLDs) are being proposed? The ISO 639 list of three-letter codes contains about 400 entries. While some of these listings (such as “peo”, the code for “Persian, Old [ca 600 - 400 B.C.]”) can safely be edited out of the list, I believe that we can still talk in terms of round numbers and use 400 entries as the approximate value under discussion. Please note that there are 241 currently active ccTLD registrations in the Internet’s root-zone file.

4. Phased roll-out of TLDs

I envision a phased introduction of the language-associated TLD strings over ten-plus years, with a launch every eighteen months (an interval that should allow for the necessary review mechanisms):

  • First group—12 language-associated TLDs
  • Second group (eighteen months later)—24 language-associated TLDs
  • Third group (year # 3)—36 language-associated TLDs
  • Fourth group (eighteen months later)—48 language-associated TLDs
  • Fifth group (year #6)—60 language-associated TLDs
  • Sixth group (eighteen months later)—72 language-associated TLDs
  • Seventh group (year #9)—84 language-associated TLDs
  • Eighth group (eighteen months later)—remainder of language-associated TLDs
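The arithmetic behind this schedule can be checked with a short sketch. It assumes the round figure of 400 strings from section 3 and treats the first launch as year 0; the function and constant names are illustrative only and are not part of the proposal.

```python
# Rough check of the phased roll-out, under stated assumptions:
# about 400 strings in total (the round number used in section 3),
# one launch every eighteen months, first launch counted as year 0.

TOTAL_STRINGS = 400                          # approximate, per section 3
CYCLE_MONTHS = 18                            # launch periodicity
GROUP_SIZES = [12, 24, 36, 48, 60, 72, 84]   # first seven groups as listed


def rollout_schedule(total=TOTAL_STRINGS):
    """Return (group number, years after first launch, group size) tuples.

    The eighth and final group takes whatever remains of the total.
    """
    schedule = []
    launched = 0
    for index, size in enumerate(GROUP_SIZES):
        years = index * CYCLE_MONTHS / 12
        schedule.append((index + 1, years, size))
        launched += size
    final_years = len(GROUP_SIZES) * CYCLE_MONTHS / 12
    schedule.append((len(GROUP_SIZES) + 1, final_years, total - launched))
    return schedule


if __name__ == "__main__":
    for group, years, size in rollout_schedule():
        print(f"group {group}: year {years:g}, {size} strings")
```

Under these assumptions the first seven groups account for 336 strings, so the eighth group would launch 10.5 years after the first and carry the remaining 64. Any strings edited out of the list or withdrawn for political reasons would shrink that remainder.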


5. Politics and the Selection process

Attempting to create TLDs semantically linked to languages might well raise a number of extremely delicate political problems (consider the prospect of selecting a registry operator for a language group that includes hundreds of millions of people and spans a number of nation-states). Therefore, prior to the start of each selection cycle, deference will be made to governmental entities that oppose participation in this selection process (through some type of diplomatically appropriate method, their language-associated strings will be removed from the group of potential candidates for inclusion into the Internet’s root-zone).

6. The Selection methodology

After necessary exclusions that result from the political process, a computer will be used to randomly select the strings that will be launched in each given cycle.
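One way such a draw might look is sketched below. This is only an illustration under assumptions: the candidate and exclusion sets are made up for the example, the function name `select_cycle` is hypothetical, and the proposal does not specify which random source or tooling would actually be used.

```python
import random


def select_cycle(candidates, excluded, group_size, rng=None):
    """Randomly pick the strings to launch in one cycle.

    candidates: ISO 639 three-letter codes not yet launched
    excluded:   codes withdrawn through the political process
    group_size: number of strings planned for this cycle
    rng:        optional random.Random instance; defaults to the
                operating system's entropy source (an assumption,
                the proposal only says "a computer")
    """
    rng = rng or random.SystemRandom()
    # Sort so that the result depends only on the random source,
    # not on the iteration order of the input collections.
    eligible = sorted(set(candidates) - set(excluded))
    count = min(group_size, len(eligible))
    return rng.sample(eligible, count)


if __name__ == "__main__":
    # Illustrative inputs only; not a real candidate list.
    pool = ["wol", "cat", "zha", "eng", "pro"]
    withdrawn = ["eng"]
    print(select_cycle(pool, withdrawn, group_size=3))
```

Passing a seeded `random.Random` instead of the default would make a draw repeatable for audit purposes, at the cost of making it predictable to anyone who knows the seed.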

7. Choosing the registry operators

It is my belief that a process should be put into place to pre-certify registry operators. Once registry operators are accredited entities, they may choose to be considered as candidates in a random draw process. Just as the TLD strings will be randomly selected, so too shall the accredited registry operators be randomly chosen to operate the language-associated TLDs.
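The operator draw could be sketched in the same spirit. The proposal does not say whether one operator may receive more than one string per cycle; the sketch below assumes at most one each, and the operator names and the function name `assign_operators` are hypothetical.

```python
import random


def assign_operators(strings, operators, rng=None):
    """Randomly pair each selected string with an accredited operator.

    Assumption (not stated in the proposal): each operator receives
    at most one string per cycle, so there must be at least as many
    volunteering operators as strings.
    """
    rng = rng or random.SystemRandom()
    strings = list(strings)
    operators = list(operators)
    if len(operators) < len(strings):
        raise ValueError("not enough accredited operators for this cycle")
    drawn = rng.sample(operators, len(strings))
    return dict(zip(strings, drawn))


if __name__ == "__main__":
    # Placeholder names for pre-certified operators.
    accredited = ["Operator A", "Operator B", "Operator C", "Operator D"]
    print(assign_operators(["wol", "zha", "cat"], accredited))
```

If operators were allowed to run several L-TLDs at once, `rng.choice` per string would replace the single `rng.sample` call.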

8. Communications

Each registry operator selected to operate these L-TLDs will conduct its communications with the public and with the registrar community in the language-group that is under its management. Accordingly, all registry operators accredited to operate L-TLDs will warrant that they will secure an appropriate level of staff with fluency in whatever language-group they are selected to handle.

Final Thoughts:

In the last round of TLD selections ICANN chose both small communities (such as .museum) and potentially large communities (such as .info) to be awarded a presence in the Internet’s root-zone file. It is my expectation that a random selection process for L-TLDs will result in a similar mix: some small language communities and some large language groups. Whatever the outcome, we can expect the expansion of competition in the domain name registration business as firms with relevant language proficiency and technical skills vie for registrar accreditation in this new TLD environment.

Consumers world-wide will benefit from increased choice and the innovations that will accompany the launch of language-based TLDs, and ICANN will have proven that it is truly an international organization that is committed to the needs of the global community of Internet users.

Comments

Jothan Frakes  –  Oct 6, 2004 5:42 AM

This is definitely an application of creativity to new TLDs.

With all of the effort that went into achieving what standards are in place to get us to where we are now today, and watching that process unfold and evolve over a course of many, many years, who is to say that the current measures in place for IDN are perfect?

I had one of (if not THE) first ccTLDs doing browser and operating system agnostic IDN operational in web browsers back in late July of 2000.

The resolution work and DNS side of the process was centrally managed, and DNS servers were already capable of more than 7bit names at the time.  The stage functioned absolutely well.

The audience, however, the End User systems, Operating Systems, and legacy applications, and other such technology were another story.

It is only now, since all of the efforts of the IDNA Working Groups and developers have come to fruition, that we are seeing applications that can actually work with Punycode conversions as part of their actual function.

I am quite pleased to see how far standards have come and how many gTLDs and ccTLDs have launched IDN solutions since 2000.

We have seen so many vast improvements in language based navigation over the past 4 years, and I have certainly seen quite a number of people pour endless and thankless hours into where we have evolved to.

Setting all of that aside, and acknowledging that this L-TLD concept is an idea that inspires fresh thinking, I really like the concept of utilizing language-based TLDs to help create segments of the net that one might be able to best use in their native language.

Still, I have a challenge in seeing how creating such a large quantity of TLDs in the root level could avoid creating confusion if it were mixed with the other TLDs.

I also wonder, after watching the past two TLD assignment rounds of interested parties vying for TLDs and personally attending the selection ‘process’ [ahem] for the first in November of 2000, how this great concept of language set TLDs could actually have a chance.

Also, after watching the number of like TLDs and proposals occur in each round, I would be concerned that there would be potential to collide with those other interested parties.

Example: Feline Society of Toronto might argue that .CAT should be a registry for their favorite pets; in walks Catalog Registrars JV, which comes along with its proposal for .CAT being a catalog UDDI repository TLD… Ugh… Big mess.

Now add multiple interested parties who compete for the .ZHA or .ENG registries and attempt to enforce standard practices in all these companies, consortiums, and ‘internet societies’ that would achieve these delegations. 

I like to think I am optimistic about human nature, yet I have a challenge seeing this happening, and with ICANN facing such challenges as even proving itself a legitimate entity, not to mention the pace at which their approvals and coordination of current IDN went, I cannot see where enforcement could originate there.

So, back towards making this feasible…

I would definitely consolidate the number of TLDs down to one if this concept were to be feasible at the ICANN level.  Make this something that realistically works within their foreseeable capacity, mandate, and competence.

What if such a language TLD were created, like .ISO639 or .UTF, and delegated to the ISO or the Unicode Consortium, so that they could be the coordinator of the language ‘stub’ TLDs within that TLD?  .ZHA.UTF could be delegated to an appropriate administrator, .WOL.UTF gets assigned to another appropriate party, etc.

As far as making such a technology actually gain traction, work with the IETF standards bodies to add this payload to the existing standards or license New.Net’s brilliant and automagical technology process.

I certainly hope, for these future Wolof-language Internet users, that the IDN standards in place today continue at or exceed their current adoption pace.

That would really be a great thing.  I hope not to have come across as a naysayer.  I really believe that with some refinements, this could possibly work!

Please keep up with these innovative ideas, and thank you for the countless other excellent ideas that you have contributed to help make today’s internet a better place!

-J

The Famous Brett Watson  –  Oct 6, 2004 5:01 PM

I agree with Jothan that languages would be better as delegations under a single TLD, rather than each being a TLD in its own right. It’s true that countries get slightly preferential treatment in this matter, being at the root, but I think this is appropriate. Countries are political entities and therefore have (in principle, at least) much clearer lines of delegation. To a first approximation, I’d argue that the root should be reserved for two things: large intrinsically delegable domains (such as the ccTLDs), and generic groupings—not that ICANN seems to share my thoughts, judging by “.aero” and “.museum”, which are, I think, way too specific for the root.

Placing the languages in a second level domain is also just good namespace management: future additions to the root domain and ISO 639 should not have to worry about stepping on each other’s toes. I note that “.pro” already collides with “Provencal, Old”, for what that’s worth. All two letter codes are, to the best of my knowledge, reserved in the root to guard against such collisions with respect to ISO 3166 country codes. This is the kind of thing that namespace management must consider over and above prestige and aesthetic appeal.

A second level domain would also encourage use for the intended purpose. Contrast this with the use of country domains such as “.tv” and others. There’s a case to be made that a country may exploit its own domain name as it pleases, but we wouldn’t want to invite non-language-based use of language-specific domains, since that would undermine their intended purpose. Take a good look at that list of language codes and pick the domains that will have a million non-language-related registrations on the first day. Regulate it, you say? Lower its desirability a notch by making it *look* language specific, I say.

The first objection that sprang to my mind was the messiness of delegation. Countries have governments, and thus a “natural” path of delegation, but languages? I can imagine France wanting authority over French, and a bunch of non-French Francophones being less than happy about it. These problems are recognised in the proposal. But despite the more explosive scenarios, this may not be any worse in general than what we face with “.com” on a daily basis. It may be the kind of problem that we ought to anticipate without letting it deter us.

Finally, I note that the sales-pitch for language-based domains given here is somewhat web-centred. This is a bad idea, because it can be shown that language-specific domains are simply not needed for the web. We already have well-developed ways of marking the language of web pages, and all good search engines use them. It’s no coincidence that the cited list of language codes is hosted at the W3C. If this is a web-related problem, then domain names are not the place to be looking for an answer.

Domain names are not intended to categorise the Internet in general, or the web in particular; nor do they do a very good job of it if you try to use them that way. In general, life can’t be broken down into a neat categorical hierarchy, but one hierarchy is all the DNS gives you. The native speaker of Wolof given in the example would be much better served by searching for web pages written in Wolof *regardless of domain*. After all, if Wolof is common in Senegal, then I’d expect to find a significant number of Wolof pages in the “.sn” domain.

In my opinion, a much better case must be made as to why language-specific domains would be a good thing for the Internet in general. The given example is, frankly, out of touch with reality. The idea that it “may assist in the development of global language-based Internet communities” is a good start, but I’m not entirely persuaded that it will be as great a boon to speakers of particular languages as you suggest. What’s so great about a language-specific domain name?

Milton Mueller  –  Oct 13, 2004 5:59 AM

Danny:
Well, at least someone is thinking seriously about a process. ICANN certainly isn’t.

Supposing for the moment that ICANN were serious about defining a process, I would have to take exception to the following: “deference will be made to governmental entities that oppose participation in this selection process (through some type of diplomatically appropriate method, their language-associated strings will be removed from the group of potential candidates for inclusion into the Internet’s root-zone).”

“Deference?” Sorry, the world of governments and politics doesn’t work on the basis of deference. Either they have the power to block it or they don’t. And in the real world of international relations, you can’t just hand wave about “appropriate” processes, you have to specify and negotiate a process and get 200 governments to agree to it.

Worse, what you are proposing is basically to give governments the power to decide who gets to represent a language group on the Internet. That’s wrong, because national governments have almost nothing to do with linguistic groupings. Who gets the veto power over English, to use an obvious example? The USA? Great Britain? New Zealand? Any one of them?
Why should governments have anything at all to say about this? Also, remember that some governments are particularly interested in and adept at suppressing linguistic diversity for political purposes.

If anyone tried to implement your proposal, it would make the ccTLD delegation wars look like a garden party.

No thanks. Let ICANN concentrate on gTLDs (and let it consider IDN TLDs to be a species of gTLD). And gTLDs can best be added through the process we proposed here:
http://dcc.syr.edu/miscarticles/NewTLDs2-MM-LM.pdf

Daniel R. Tobias  –  Oct 13, 2004 3:29 PM

I’m not sure the DNS is the proper place to encode language information… the HTTP protocol already has a method (though underutilized) to do this through the “Content-Language” header and content negotiation, and HTML also has means of identifying languages with the “lang” attribute.

terastra  –  Oct 15, 2004 2:45 AM

Language-linked TLDs at first blush look like a good idea.
Deeper down it is also quite a subversive idea that will be welcomed as such, especially in Africa but also in countries like the Philippines and Indonesia, and even in countries like France and Spain in Europe.

Danny recognizes the political sensitivity. He tames his proposal by giving governments veto rights. Is there a better way? Consensus by the WSIS nations?

Milton Mueller  –  Oct 15, 2004 4:49 AM

I didn’t make this clear enough the first time around. The most important reason Danny’s idea is a bad one is that it would create an officially sanctioned monopoly over the language code, just as ccTLDs create an officially sanctioned monopoly over the country code. For every language code, there would be one and only one TLD. That sets in motion an ugly, zero-sum political game over who gets it - even if the TLD itself turns out to be useless. The potential for symbolic conflict is enormous. That’s why I said that this proposal would make the ccTLD delegation wars look tame. At least with most ccTLDs, there is already in place a single political authority. That is NOT the case with languages, nor can language groups be mapped onto political entities without causing numerous fights. It’s a crazy idea. Creating ISO-3166 TLDs was one of Postel’s biggest mistakes. This would be a worse one, by far.
