The Inextricable Issue of Internationalized Domain Names

ICANN has embarked on the IDN boat at the same time as it wants to introduce DNSSEC and new gTLDs. This promises lots of fun. Or grey hair, depending on how you look at it.

First is the issue of country code IDNs. The ISO 3166 table, based on two-letter codes, is a Western convention. Some cultures do not use abbreviations or acronyms. Some do not write with an alphabet, but with a syllabic script. Hence, the next logical step would be to represent the full country name in local script, rather than a transliteration of the ISO string. As an example, Morocco may want to use its name in Arabic script (xn--mgbc0a9azcg7dsq in Punycode), in parallel with .ma. This is a simple case: Morocco has only one official language.
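For readers unfamiliar with the "xn--" form mentioned above, here is a minimal sketch of how a Unicode label maps to its ASCII-compatible encoding and back. It uses the "idna" codec from Python's standard library, which implements the older IDNA 2003 rules. The helper names `to_ascii` and `to_unicode` and the sample label are illustrative assumptions, not something taken from ICANN's process.

```python
# Sketch only: relies on Python's built-in "idna" codec (IDNA 2003).
# to_ascii / to_unicode are hypothetical helper names chosen for this example.

def to_ascii(name: str) -> str:
    """Convert a Unicode domain name to its ASCII-compatible (xn--) form."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Convert an ASCII-compatible (xn--) domain name back to Unicode."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    label = "münchen"          # sample label, not from the post
    ace = to_ascii(label)
    print(ace)                 # the form that is actually stored in the DNS
    print(to_unicode(ace))     # what a browser would display to the user
```

The DNS itself only ever sees the ASCII form; turning it back into local script is the application's job. Passing the label quoted in the post to `to_unicode` should recover the original script, assuming the label survived extraction intact.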

Imagine the case of India, where there are 1,652 languages, of which 24 are spoken by more than one million people. All have a distinct alphabet. Further, the Constitution of India does not impose an official language. Are we going to have at least 24 new IDN TLDs for India? This would make political sense, but would be a real burden to manage at the root level, especially if we end up with 1,652 of them, just for India. Obviously, other countries which use several languages may want to do the same.

When it comes to gTLDs, the situation becomes even more interesting. Take, for example, .ORG. ORG stands for “not-for-profit organization”. How does that translate into IDN TLDs in different languages? If we simply transliterate the “org” string into local script, we might end up with a meaningless name or, more unfortunately, an offensive word in the local language.
On the other hand, there may be several ways to translate the not-for-profit (NFP) organization concept into a specific language. As an example, if I had to translate it into French, it would be association à but non lucratif in France, but association sans but lucratif in Belgium or association sans but économique in Switzerland.

Yet it does not seem logical that the incumbent gTLD registries could automatically claim the right to run any IDN TLD that more or less translates the concept of the original string. We should expect those countries which were not offered a piece of the multimillion-dollar gTLD cake in previous years to want some money out of the IDN TLDs in their own script. Just imagine how much money a .com TLD in Mandarin or Arabic could potentially represent.

ICANN will have a hard time designing a policy for IDNs. The technical challenges are actually small compared to the economic, political and cultural issues surrounding those internationalized domain names.

By Patrick Vande Walle, All around Internet governance troublemaker

Comments

Paul Hoffman  –  Jul 6, 2007 12:23 AM

What is wrong with giving India 24 TLDs? Given that the vast majority of countries have only one to three languages, is there really a problem with adding a thousand or so new TLDs? Have the people running the root servers suggested that this is an issue? If not, why assume that it is a problem?

Also, the statement that all 1652 Indic languages have their own alphabet is absurd, and certainly not supported by the reference.

Patrick Vande Walle  –  Jul 6, 2007 8:37 AM

There is nothing intrinsically “wrong” with allocating 24 TLDs or more for India.  From my own experience living in countries where several languages are used, there is a strong link between language, culture and political and social recognition.
ICANN may find itself in the middle of political tensions if it limits the number of accepted IDN equivalents for each ccTLD or gTLD. The current .EH case exemplifies quite well how local political issues can have an effect on the root zone file.

As for the alphabets, Wikipedia mentions that 29 Brahmic scripts are already included in Unicode and around twenty are not yet included. Some languages, like Kashmiri, are written in two different scripts. I agree the word “all” should have been “many”. The main point is that it is quite a complicated matter. It will be difficult for ICANN to come up with an implementation policy that will suit everyone.

Stephane Bortzmeyer  –  Jul 6, 2007 9:59 AM

Also, the statement that all 1652 Indic languages have their own alphabet is absurd

Indeed, there are only 124 scripts registered (for the entire world) in ISO 15924 (http://www.unicode.org/iso15924/codelists.html), so pretending the number of scripts is a problem is pure FUD. There are fewer scripts than countries!

Paul Hoffman  –  Jul 6, 2007 4:23 PM

Patrick Vande Walle said:

ICANN may find itself in the middle of political tensions if it limits the number of accepted IDN equivalents for each ccTLD or gTLD. The current .EH case exemplifies quite well how local political issues can have an effect on the root zone file.

It is already in the middle of them. Allowing too much or too little won’t change that.

I am still interested in your response to the other questions about why you think that adding all the language equivalents is a problem. To me, unless the root server operators say it is a problem, folks like you and me should not be saying that it is. They are the ones who understand the deployment issues.

As for the alphabets, Wikipedia mentions that 29 Brahmic scripts are already included in Unicode and around twenty are not yet included. Some languages, like Kashmiri, are written in two different scripts. I agree the word “all” should have been “many”. The main point is that it is quite a complicated matter. It will be difficult for ICANN to come up with an implementation policy that will suit everyone.

I would disagree with “many” as well. I also don’t believe that 20 actively used Indic scripts are not encoded. They may not be stand-alone scripts, but it is likely that all the characters they need are already encoded in the Unicode standard. I agree with Stephane: this is just FUD.

David Wrixon  –  Jul 7, 2007 10:48 AM

At present any substantive commitment from ICANN to do anything at all rather than simply promising to talk about more talks would be a massive step forward.

It would also help if Paul Twomey would stop pretending that ICANN is actually putting Unicode in the root. It is not and never will. It would also be useful if he would make it abundantly clear that what ICANN is doing only affects the top level and that the biggest logjam is in fact being caused by Microsoft. It is really browser support that is critical to progress, not DNS support.

Microsoft has effectively used its monopolistic position to cripple the Internets of Asia through its neglect. Dysfunctional Internets are causing economic disadvantage, which vastly outweighs any philanthropic efforts its owner might be making.

Anyway, payback time is coming. Supremacy in the browser wars will largely determine Microsoft’s fate. If they fail to roll out IE7 to ward off the dual threats from Safari and Firefox, they are doomed. Google and Apple will overrun them just as the Goths and Vandals sacked Rome. The browser is the key to control of the advertising and software markets, and failure to get a replacement for IE6 to market is likely to prove Microsoft’s epitaph.
