DNS RPZ, Malicious Domains… Bring Your Own Policy. Dress Casual.

By Eric Brunner-Williams, Mathematician
August 02, 2010

Paul observed that most new domain names are malicious. Are they?

Since the “dawn of tasting”, some 30 million domain names have been created for the purposes of interposition on existing name to resource mappings. That is a third of the .COM historical growth, and mostly in the last five years. These differ from NXDOMAIN synthetic return only in implementation details.

The first is a fixed point in an anchored string space, a value in a measure space that defines a Hamming Cloud around a persistent, public referent in the same string space. It is managed as a set of points, by one or several authors of the act of interposition on a resource, and monetized by ad networks. The second is a collection of probabilistic matches about a public referent in a string space, made from one or more relative anchors of that string space. It is managed as a set of match metrics, each by distinct authors, and is also monetized by ad networks.
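To make the "Hamming Cloud" concrete, here is a minimal sketch that enumerates the labels one substitution away from a referent label. It assumes the letter-digit-hyphen label rule and a distance of exactly 1; the function names and the choice of radius are illustrative, not anything the registries or tasters are known to use.

```python
import string

# Characters allowed in a DNS label under the letter-digit-hyphen (LDH) rule.
LDH = string.ascii_lowercase + string.digits + "-"


def hamming_distance(a: str, b: str) -> int:
    """Number of positions at which two equal-length labels differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def hamming_cloud(label: str) -> set[str]:
    """All LDH labels at Hamming distance exactly 1 from `label`."""
    cloud = set()
    for i, original in enumerate(label):
        for c in LDH:
            if c == original:
                continue
            candidate = label[:i] + c + label[i + 1:]
            # A label may not begin or end with a hyphen.
            if candidate[0] == "-" or candidate[-1] == "-":
                continue
            cloud.add(candidate)
    return cloud
```

Each member of the returned set is one of the "fixed points" abutting the referent; a larger radius, or insertions and deletions, would widen the cloud.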

It is difficult to conclude that interposition on persistent, public referents is without malice, and difficult not to conclude that the malicious parties are: advertisers, seeking to transform public referents into private property as promotional devices; NXDOMAIN synthetic return forgers, seeking to offer advertisers a means to articulate their malice; and registrants of points abutting public referents, seeking to offer “natural traffic” as another means for advertisers to articulate their malice.

These domains are as persistent as the referents they extract value from, and it is reasonable to view VGRS’s recent “Domain Name Exchange” as a very modest modification of the domain taster’s five day taste and commit/drop algorithm to a thirty day taste and commit/drop algorithm, requiring only a six-fold increase in the no-cost resource (cohorts of names) to achieve the same sample-and-capture value to sample set operators as the earlier AGP exploit.

Impersistent domain names exist impermanently in vast numbers due to the absence of real cost to acquire the resource. As a means to allow the authors of spam to dynamically modify the payload of a cohort of spam messages, domain names greatly increase the lifetime of any message cohort, allowing the author to substitute in real-time uncompromised resources for compromised (by law enforcement and others) resources.

Real cost, not the sticker price that is never paid by any actor using the market in credit card data, whether measured in manual processing or in temporal offset, is the highest bar to impersistent domain names. There is no fraud in .cat or .museum or .coop, and the Conficker .C authors did not go where acquisition of rendezvous points was not automated.

But, in sum, most of the new persistent domains, and all of the impersistent ones, are dreck or worse.

Did universal access between e-mail servers offer a greater boon to the bad guys than to the good guys? Yes. At-sign addressing meant all addresses shared fate. With bang-sign addressing, only those with shared hop-by-hop addressing shared fate. There could still be spam in a bang-path email addressing architecture, but it would be dependent upon a UUCP Maps-like database, and limited by per-hop policies, hence vastly different from the present rational economic exploit of port 25 tolerant access network ISPs and the DNS as the routing architecture.

Does organized e-crime now require access to the Internet’s resource allocation systems?

Without a doubt. For impersistent zero cost resources, both names (via automated registrars within the stolen plastic chargeback “Attack Gratis Period”) and addresses (via automated fate sharing of unprotected memory commodity nodes, broadband or narrowband indifferent).

Crime, then, is shaped, like any cost, either away from policy protected use, such as registries which impose real registration cost and access networks which implement ingress filtering, or costed in, within a vastly larger pre-net risk model, the seven basis points of credit card fraud. Note that ICANN’s new gTLD model imposes the credit card fraud risk on all new registries, as “one size fits all”, and BCP 38 remains beyond the ken of the technical coordinating body of the IANA anchored name and address allocators.

A problem with response is learning. The actors first confronted with address filtering a decade ago learned that address agility had value, so address acquisition became economically rational, hence botnets. The same or similar actors, when confronted with content filtering, learned that content agility had value, escaping from early checksums and later stochastic techniques with content and link agilities.

The subtle point is scope. There exists the view that a universal scope is compelling, and contained within that view are several assumptions. A minority view, which would be a vanishingly small minority view if not for the fact that China Telecom and China Unicom broadband have almost 100 million subscribers, is that a universal scope claim may not be technically neutral, but may contain assumptions unexamined by its proponents. It is on this subtle point that scoped reputation systems can exist. The earlier DLZ was another alternative, temporary in theory, to a single, universal, scopeless assertion about proof properties of delegations.

The utility of a multi-producer, multi-consumer, multi-vendor activity is not merely dependent upon the cooperation of these parties, but also upon the technical taxa of domains as tools for value capture from others. Nameservers and domains used for periods of a few days, hours or minutes are sufficiently different from nameservers and domains used for months, years, and even decades, to be distinguishable.
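One way a resolver operator might draw that distinction is by observed lifetime. The sketch below assumes the operator records when a name was first and last seen resolving; the seven-day cut-off and all names are hypothetical, since the text gives only orders of magnitude.

```python
from datetime import datetime, timedelta

# Hypothetical cut-off. The article contrasts "days, hours or minutes"
# with "months, years, and even decades" but names no threshold.
IMPERSISTENT_MAX = timedelta(days=7)


def classify(first_seen: datetime, last_seen: datetime,
             threshold: timedelta = IMPERSISTENT_MAX) -> str:
    """Label a domain or nameserver by how long it was observed in use."""
    if last_seen < first_seen:
        raise ValueError("last_seen precedes first_seen")
    lifetime = last_seen - first_seen
    return "impersistent" if lifetime <= threshold else "persistent"
```

A real classifier would need more than one feature, but even this coarse split separates a name used for a spam run from one used for a decade.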

Another approach, from the registry point of view, is to ask which recursive resolvers lie to their users. For geographically scoped, and therefore access network operator limited, service populations, the question is not intractable. The registry may simply drop requests from lying recursive resolvers, and enforce a reputation system. This point was contained in a proposal to a major metropolitan registry public applicant, that correctness could be enforced, to the benefit of the registry and its service population.

Reasonable questions to ask are:

what, if anything, is known about the address block to which a particular name resolves? More generally, how does routing information relate to resolution?
is any cooperating resolver aware of the per-protocol properties of the address the name resolved to? More generally, how does protocol information relate to resolution?
are the cooperating resolvers aware of the temporal properties of the name to address mapping, and similarly, of the frequency of resolution of the mapping? More generally, how do time and sampled time relate to resolution?
is knowledge of the secondary market inventory (dropped names acquired for repurposing, usually for advertising) available to the cooperating resolver system?
does reputation extend to registrar, or nameserver, or PTR clustering, or other trivially discovered property well known to parties involved in infringement prosecution or content prosecution?
does reputation extend to application specific protocol exploits such as the presence of web beacons and similar malware injectors on a web site?
other than routing domain knowledge, and protocol domain knowledge, and temporal domain knowledge, what other knowledge is available?

One of the IDNAbis design choices, whether a feature should be implemented in the protocol or implemented by registries, can be revisited. Design choices not implemented in the protocol can be implemented not only by registries, but also by resolvers. Another provisioning design choice which can be implemented in resolvers is a restriction on Hamming Distances and on variations of address resources within a “bundle” of similar names.
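A resolver-side bundle restriction could look something like the following sketch. It reads the restriction as "names within a small Hamming distance of a referent should not resolve to addresses the referent does not use", which is one interpretation of the paragraph above, not a published rule. The function names, the distance of 1, and the sample data are all assumptions.

```python
def hamming(a: str, b: str):
    """Hamming distance, or None when the lengths differ."""
    if len(a) != len(b):
        return None
    return sum(x != y for x, y in zip(a, b))


def bundle_violations(referent, referent_addrs, observed, max_distance=1):
    """Names near `referent` whose addresses fall outside the referent's own.

    `observed` maps a domain name to the set of addresses it resolved to.
    """
    violations = []
    for name, addrs in sorted(observed.items()):
        d = hamming(name, referent)
        if d is None or d == 0 or d > max_distance:
            continue  # outside the bundle, or the referent itself
        if not set(addrs) <= set(referent_addrs):
            violations.append(name)
    return violations


# Illustrative data only; the addresses are from documentation ranges.
observed = {
    "examp1e.com": {"203.0.113.9"},   # near the referent, foreign address
    "exampla.com": {"192.0.2.1"},     # near the referent, same address
    "other.org": {"198.51.100.7"},    # not in the bundle at all
}
flagged = bundle_violations("example.com", {"192.0.2.1"}, observed)
```

What the resolver does with a flagged name (refuse, rewrite, or merely log) is a policy decision left open here.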

The DNS RPZ specification (Vixie/Schryver) is a step towards persons or automata that invoke stub, and hence recursive, resolvers exercising scoped subsetting semantics. It is similar to the related patch to BIND 8 published in September 2003, which was described as a source of “incoherence” in the DNS by its critics. It does not yet attempt to incorporate routing, protocol, or temporal domain knowledge, or character set, or string space sequence properties. It adopts a heuristic approach to deprecated mappings, with manual data origination and automated data provisioning across cooperating instances.
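The lookup side of such a policy can be modelled in a few lines. This is a simplified sketch of the idea, not BIND's implementation: it assumes policy triggers are keyed by query name, that a wildcard entry covers all subdomains, and that the only actions are a forced NXDOMAIN or a forced empty answer. The table contents and names are hypothetical.

```python
NXDOMAIN = "NXDOMAIN"   # pretend the name does not exist
NODATA = "NODATA"       # pretend the name exists but has no records
NO_MATCH = "NO_MATCH"   # no policy applies; answer normally

# Hypothetical policy data, as a subscriber might receive it from a producer.
POLICY = {
    "evil.com": NXDOMAIN,
    "*.evil.com": NXDOMAIN,
    "tracker.example": NODATA,
}


def rpz_action(qname: str, policy: dict) -> str:
    """Return the policy action for a query name, or NO_MATCH."""
    name = qname.lower().rstrip(".")
    if name in policy:
        return policy[name]
    labels = name.split(".")
    # Try wildcards from the closest enclosing domain outward.
    for i in range(1, len(labels)):
        wildcard = "*." + ".".join(labels[i:])
        if wildcard in policy:
            return policy[wildcard]
    return NO_MATCH
```

Because the table is just data, each consumer can subscribe to a different producer, which is where the scoped, bring-your-own-policy character of the approach comes from.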

I’ve read the patches to bin/named/{query,server}.c, lib/dns/include/dns/view.h, lib/dns/view.c and lib/isccfg/namedconf.c, and Paul and Vernon’s draft, and I’m prepared to “party”.

However, my notion of “evil.com” is probably not completely contained in the initial implementation of an RPZ. I think it is a come-as-you-are kind of party.

# /usr/local/sbin/named -v
BIND 9.7.1-P2-RPZ-0

Bring Your Own Policy. Dress Casual.

Comments

I do not understand this article. Kevin Murphy – Aug 2, 2010 10:59 PM

It looks interesting, but I’m afraid I just don’t understand it.

Since the “dawn of tasting”, some 30 million domain names have been created for the purposes of interposition on existing name to resource mappings. That is a third of the .COM historical growth, and mostly in the last five years. These differ from NXDOMAIN synthetic return only in implementation details.

The first is a fixed point in an anchored string space, a value in a measure space that defines a Hamming Cloud around a persistent, public referent in the same string space.


Some sequences are more frequently requested by stub resolvers than others ... Eric Brunner-Williams – Aug 3, 2010 12:30 PM

If you start with the assumption that “life” is more likely to be the target of resolution requests than “0000”, and that resolution requests are not uniformly distributed over the set of all strings formed by four letters, digits and hyphens, then you have enough to follow the discussion. There’s not a lot of typo squatting on the 63 digit value of pi (one of the fundamental constant domains on the net), but there is on trademarks. Once you grasp that point the rest should follow, at least up to the point of the discussion of scope.

Good luck!


30 Million Gareth Andrew – Aug 3, 2010 8:45 PM

Eric, where does the value of 30 million come from? Is this typo-squatters you are talking about, or malicious-use domain names in general? Have you any stats or references on how many impersistent (credit-card fraud created or otherwise) new domain registrations there are?

Since the “dawn of tasting”, some 30 million domain names have been created for the purposes of interposition on existing name to resource mappings. That is a third of the .COM historical growth, and mostly in the last five years.


Re: 30 Million Eric Brunner-Williams – Aug 4, 2010 1:15 PM

Welcome to CircleID, Gareth.

Start with Verisign’s periodic domain report for coarse data. The distinction you (and to be fair, most of the domainer-benefited parties) offer between “typo-squat” and “malicious” is beneficial to domainers. I think correctly characterizing the malice of repurposing a query from resolution of the user-desired resource to the advertiser-desired resource is a better choice. It is a purpose distinct from injection of behavior profiling mechanisms and other purposes, but it shares the motive of trickery common to all.

The literature on impersistence is available through Fast Flux WG work product and the sources we drew upon to understand domains as no acquisition cost attack side assets (along with the much more well understood ip addresses as no acquisition cost attack side assets). See also the Conficker .C’s sub-author’s construction of rendezvous points (domains in a subset of 105 namespaces) for system bootstrap purposes.

The credit card risk literature is open and abundant and the basis for the seven basis points (pun obviously) is a historic development in that industry.
