Digging Through the Problem of IPv6 and Email - Part 3

By Terry Zink

IPv6 changes things up because there are 128 bits in an IP address. Here's an example from Wikipedia:

It's beyond the scope of this post to describe the notation of IPv6, but you can see that a /32 is no longer the smallest IP range; it is now a /64. The size of a standard subnet is 2^64 IP addresses, the square of the number of IPs in all of IPv4. While the planners of IPv6 don't think that the entire address space will be used, having this much space will very much make network routing and management more efficient.

One idea to make the problem of mail more manageable is to restrict the address space that is allowed to send mail. In an ideal world, we'd restrict where mail servers could send mail from. So, if we say that the number of individual mail servers in the world will probably never exceed 32 million (not unreasonable), or 2^25, then what if the 25 least significant bits were reserved for mail servers? Right off the hop, any IP address that tried to connect to you and send mail from outside the range (in hexadecimal) of 0:0:0:0:0:0:0:0 up to, but not including, 0:0:0:0:0:0:0200:0000 (that is, :: through ::01ff:ffff) could automatically be rejected. This would almost be a PBL in reverse. Whereas the PBL lists IPs that should never send mail, this algorithm would say to only accept mail from IPs that are allowed to send it and to reject everything else.
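As a rough illustration of the check described above, here is a minimal Python sketch. It assumes the hypothetical reservation is the lowest 2^25 addresses (::/103), exactly as in the paragraph; the names MAIL_BITS, MAIL_RANGE and may_send_mail are made up for this example, and in real IPv6 the very bottom of the address space (:: and ::1) already has other meanings.

```python
import ipaddress

MAIL_BITS = 25  # the post's figure: room for 2**25 (about 32 million) mail servers

# Hypothetical reserved range: the lowest 2**25 addresses, i.e. ::/103.
MAIL_RANGE = ipaddress.ip_network(f"::/{128 - MAIL_BITS}")


def may_send_mail(addr: str) -> bool:
    """Return True if addr falls inside the hypothetical mail-only range."""
    return ipaddress.ip_address(addr) in MAIL_RANGE


if __name__ == "__main__":
    for candidate in ("::1ff:ffff", "::200:0", "2001:db8::1"):
        print(candidate, may_send_mail(candidate))
```

A receiving server would run a test like this before any reputation lookup, so anything outside the range is rejected without consulting a blocklist at all.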

This is actually related to the idea of moving to a whitelist solution: only accepting mail from the servers you want to receive mail from. However, the problem with whitelisting is that you would never be able to hear from a new sender, only pre-existing ones, and that defeats the purpose of email, which is that you can hear from new people with whom you haven't previously communicated. With this idea of mail addressing restriction, you do get to hear from new servers/IPs to which you have never been introduced, because new people whom you might want to hear from will be sending mail from a permitted set of IP addresses. All of the standard reputation tracking applies, and we have now restricted the amount of space that spammers can hide in. If they want to send spam from machines that traditionally never send mail, they won't be able to do it because all of the good guys have already set up an agreement that says "If you want to send mail to us, you must do it from this set of IP addresses." Randomizing the sending IP to one outside the pre-agreed range will not make it easier for a spammer to hide because they wouldn't have been able to send mail from it anyhow. To make an analogy, if you send mail from an IP on the PBL and then switch to another IP on the PBL, it doesn't matter because in either case your email would still be rejected.

Now, as it turns out, the least significant 64 bits are actually reserved in IPv6. The first 64 bits of the IPv6 address are the network address (a 48-bit routing prefix and a 16-bit subnet ID), and the last 64 bits are the interface identifier. The 64-bit interface identifier is either automatically generated from the interface's MAC address using the modified EUI-64 format, obtained from a DHCPv6 server, automatically established randomly, or assigned manually. So, using those least significant 64 bits is going to be problematic because an IP address is how we identify a device attached to the Internet, and if those bits are already predefined by some algorithm, then we can't use them. In other words, the least significant 25 bits in an IPv6 address are already spoken for. However, we could allocate some other 32 million or so IP addresses (a /103) somewhere that is used for sending mail… couldn't we?
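To make the modified EUI-64 case concrete, here is a small Python sketch of how a 48-bit MAC address becomes a 64-bit interface identifier: the universal/local bit of the first octet is flipped and ff:fe is inserted in the middle. The function names and the sample MAC and prefix are illustrative assumptions, not anything from the post.

```python
import ipaddress


def eui64_interface_id(mac: str) -> int:
    """Build the 64-bit modified EUI-64 interface identifier from a 48-bit MAC."""
    octets = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    # Flip the universal/local bit of the first octet, then insert ff:fe in the middle.
    eui = bytes([octets[0] ^ 0x02]) + octets[1:3] + b"\xff\xfe" + octets[3:]
    return int.from_bytes(eui, "big")


def address_from_mac(prefix: str, mac: str) -> ipaddress.IPv6Address:
    """Combine a /64 network prefix with the identifier derived from mac."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("expected a /64 prefix")
    return ipaddress.IPv6Address(int(net.network_address) | eui64_interface_id(mac))


if __name__ == "__main__":
    # Sample values from the documentation prefix 2001:db8::/32.
    print(address_from_mac("2001:db8:0:1::/64", "00:1a:2b:3c:4d:5e"))
```

Because every one of the low 64 bits is determined this way (or by DHCPv6, randomization, or manual assignment), none of them are free to carry a "this is a mail server" marker.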

[Side note: because the MAC address of the machine is used to generate the interface identifier in some cases, this makes it easier to reject mail from these servers. You're no longer blocking an IP address that is subject to change in the case of DHCP, but instead blocking the actual piece of hardware, which cannot change its MAC address. It's a more granular level of block that is more reliable… if we can determine that the IP was generated using the MAC address.]
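One way to guess whether an address was generated from a MAC address is to look for the ff:fe marker that modified EUI-64 places in the middle of the interface identifier. The Python sketch below does that and, if the marker is present, reverses the transformation. It is only a heuristic under that assumption: a random or manually assigned identifier can contain ff:fe by coincidence, and the function name is invented for this example.

```python
import ipaddress
from typing import Optional


def mac_from_eui64(addr: str) -> Optional[str]:
    """Return the embedded MAC if the interface identifier looks like modified EUI-64."""
    iid = int(ipaddress.IPv6Address(addr)) & ((1 << 64) - 1)  # low 64 bits
    b = iid.to_bytes(8, "big")
    if b[3:5] != b"\xff\xfe":
        return None  # no ff:fe marker, so not (recognizably) EUI-64
    # Drop the marker and flip the universal/local bit back.
    mac = bytes([b[0] ^ 0x02]) + b[1:3] + b[5:8]
    return ":".join(f"{octet:02x}" for octet in mac)


if __name__ == "__main__":
    print(mac_from_eui64("2001:db8:0:1:21a:2bff:fe3c:4d5e"))
    print(mac_from_eui64("2001:db8::1"))
```

A blocklist keyed on the recovered MAC would follow the machine across networks, but only for hosts that actually use this addressing mode.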

While in theory this could work, it would have to be managed and that could sprawl out of control. The reason is this: which block of IP addresses do we reserve for only sending mail? What if that range had to be shared across millions of customers? For example, suppose we had 1024 IP addresses to allocate and we decided to reserve 500-563 (64 addresses, or 1/16 of the Internet) for sending mail. How do we share it? Let's suppose that there are 10 major regional Internet registries who hand out the IPs to their customers (ISPs, people with their own permanent home Internet connections, etc). Let's suppose they decided to divide it up manually. RIR 1 gets addresses 0-99, RIR 2 gets 100-199, and so forth up to RIR 10 who gets 900-999, with the final 24 IPs being reserved for special functions. However, RIR 6 has all of the IPs that get to send mail. That's not fair and nobody would agree to that.

So, we decide to divide things up. RIR 1 gets addresses 0-99 plus 500-504 (5 IP addresses used to send mail). RIR 2 gets 100-199 plus 505-509 (also 5 IP addresses). Thus, each of the registries has to "logically" manage both its allocated range and its special email range. Instead of using CIDR ranges to allocate everything nicely, it has to have a big table of who owns what. This gets very messy when you have to have a lot of different IP ranges, particularly when the universe is as vast as IPv6 is. On the other hand, we're going to have to manage lots and lots of IP addresses anyhow. If IANA publishes the rules and says these are the designated IP ranges that are used to send mail, and here's how you apply for them, then everyone is playing by the same set of rules right from the beginning. Not only that, but it's really not all that different from today. Regional Internet registries (RIRs) already allocate space to local Internet registries (LIRs) who then distribute the blocks down to their customers. When IANA provisions space, it would have to ensure that it provisions it such that it takes the special reserved range for mail into account. Indeed, this is something that it already does today when it provisions IP space as well as geo-allocates it. Smarter people than me could probably figure out the necessary algorithms.
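The "big table of who owns what" can be sketched in a few lines of Python. The numbers come straight from the toy example (10 registries, 100 general addresses each, 5 mail addresses each starting at 500); the function and variable names are invented for illustration, and the sketch takes the toy numbers as given without resolving the fact that the mail range sits inside RIR 6's general block.

```python
NUM_RIRS = 10
BLOCK = 100        # general-purpose addresses per registry
MAIL_START = 500   # start of the reserved mail range in the toy example
MAIL_EACH = 5      # mail addresses handed to each registry


def allocation_table():
    """Map each registry to its (general range, mail range), both inclusive."""
    table = {}
    for i in range(NUM_RIRS):
        general = (i * BLOCK, i * BLOCK + BLOCK - 1)
        mail = (MAIL_START + i * MAIL_EACH, MAIL_START + (i + 1) * MAIL_EACH - 1)
        table[f"RIR {i + 1}"] = (general, mail)
    return table


def mail_owner(addr, table):
    """Return the registry allowed to send mail from addr, or None."""
    for rir, (_general, (lo, hi)) in table.items():
        if lo <= addr <= hi:
            return rir
    return None


if __name__ == "__main__":
    table = allocation_table()
    for rir, (general, mail) in table.items():
        print(rir, "general", general, "mail", mail)
    print(mail_owner(507, table))
```

Each registry now owns two unrelated ranges, so ownership can no longer be read off a single prefix; it has to be looked up, which is the bookkeeping burden the paragraph describes.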

You can see from the above that doing an even distribution based upon numerical order is not going to work, but reserving IP ranges and then mapping them out and handing them out probably would. Even today, we have reserved IP address space that nobody is supposed to use for ordinary hosts (224.0.0.0/4 is reserved for multicast, 10.0.0.0/8 is part of RFC 1918's internal address space, and so forth). The work that needs to be done here is that a committee of people has to sit down, figure out how many IP addresses should be reserved for sending mail (such that we are not likely to run out of space in a couple of decades), and then reserve an appropriate range for it. IANA then has to reserve that space and come up with rules for how to hand that out to the RIRs, who then have to come up with rules for how to allocate it to the LIRs, who then have to figure out how to allocate it to their customers. They then have to manage the infrastructure necessary to maintain the mappings of who owns what.

Next, RFCs need to be written on how to send and receive mail over IPv6. Then, software vendors need to write code for IPv6 email transactions that implements these rules. Finally, IP blocklist maintainers need to start populating their lists in IPv6 notation, subject to the restrictions that are built into the RFCs.

It's a ton of work, years of it, but if we want to start receiving mail over IPv6 then that's what needs to be done.

Click to read Part 1 and Part 2.

By Terry Zink, Program Manager. Visit the blog maintained by Terry Zink here.



Comments

Confused Frank Bulk  –  Mar 24, 2011 7:07 AM PDT

With IPv4 the default block sizes tend to be a /32 and /24.  We can make the equivalent default block sizes in IPv6 a /64 and /48.  Residential customers infected with bots will still be operating out of one /64 — blocking the whole /64 solves the problem. 

Perhaps you can address this approach.

Frank

possible clarification Carl Byington  –  Mar 24, 2011 8:13 AM PDT

Ok, suppose we make all dnsbl lookups in ipv6 space only consider the upper 64 bits, and ignore the lower 64 bits. That still does not solve the problem of the cache size in the dns recursive resolvers.

Currently, spammers have no problems obtaining ipv4 /24 chunks, and many of them control ipv4 /16 chunks. The rough equivalent is ipv6 /48 and ipv6 /32. I think it is *easier* to get an ipv6 /32 now than it was to get an ipv4 /16 a few years ago.

Currently, a spammer with an ipv4 /16 will at worst fill up 64K cache entries. But that same spammer with an ipv6 /32 can fill up 4B cache entries.
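The two figures above follow from simple powers of two, as this short Python sketch shows. It assumes, as the comment does, that IPv4 lookups are per address (/32) and IPv6 lookups are per /64; the function name is made up for the example.

```python
def worst_case_entries(prefix_len: int, lookup_len: int) -> int:
    """Number of distinct lookup_len-sized blocks inside one prefix_len allocation."""
    if lookup_len < prefix_len:
        raise ValueError("lookup granularity must not be coarser than the allocation")
    return 2 ** (lookup_len - prefix_len)


if __name__ == "__main__":
    print(worst_case_entries(16, 32))  # IPv4 /16, one cache entry per address
    print(worst_case_entries(32, 64))  # IPv6 /32, one cache entry per /64
```

The first call gives 65,536 (the "64K") and the second gives 4,294,967,296 (the "4B").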

An entire /32 can be blocked up Frank Bulk  –  Mar 24, 2011 10:55 AM PDT

An entire /32 can be blocked up with just one line entry.  =) Yes, the number of bits to store a /64 entry is more than twice as much as a /24, but the databases can still be manageable.

My chief concern would be churn in the data center space: there's so much IPv6 address space that a data center might more easily hand out a /48 than they would have handed out a /24.  The spammer could go from one /48 assignment to a totally different one with a different data center site and so on, leaving lots of polluted IPv6 space.  As long as data center owners are responsible this shouldn't be a problem, but they're not all the same.

But in terms of the residential broadband networks, I'm not concerned that their space would represent a threat to DNSBLs.

two different issues Carl Byington  –  Mar 24, 2011 12:29 PM PDT

There are two different issues here.

One is the size of the dnsbl zone file. That can be reasonably controlled by blocking larger chunks. If the spammer has an ipv6 /32 we put in a wildcard entry to block all 2**32 /64 pieces in that /32 in one dns entry. If they have an ipv6 /48, we put in a wildcard entry to block all 2**16 /64 pieces in that /48. That (more or less) controls the size of the zone file, and the memory used in the *authoritative* dns servers.

Now your mail server starts receiving spam from that ipv6 /32. For each email, it makes a recursive query to some local recursive resolver asking about some specific ipv6 /64 prefix (since we are always ignoring the low order 64 bits, both for the zone file, and for the queries) in some dnsbl zone. The recursive resolver asks the authoritative server for that dnsbl zone, and gets an answer with a TTL.  The recursive resolver saves that answer in its cache.
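Here is a rough Python sketch of what the query names and wildcard zone entries described above might look like. It assumes the scheme proposed in this thread (only the upper 64 bits are encoded, as reversed hex nibbles) rather than the usual convention of encoding all 32 nibbles, and the zone name dnsbl.example and the function names are hypothetical.

```python
import ipaddress


def reversed_nibbles(addr: str, bits: int) -> str:
    """Return the top `bits` of addr as dot-separated hex nibbles, last nibble first."""
    if bits % 4:
        raise ValueError("prefix length must be a multiple of 4")
    hex_digits = f"{int(ipaddress.IPv6Address(addr)):032x}"[: bits // 4]
    return ".".join(reversed(hex_digits))


def query_name(addr: str, zone: str) -> str:
    """Name a mail server would look up for the /64 containing addr."""
    return f"{reversed_nibbles(addr, 64)}.{zone}"


def wildcard_name(prefix: str, zone: str) -> str:
    """Single wildcard owner name covering every /64 inside prefix."""
    net = ipaddress.IPv6Network(prefix)
    return f"*.{reversed_nibbles(str(net.network_address), net.prefixlen)}.{zone}"


if __name__ == "__main__":
    print(query_name("2001:db8:1234:5678::1", "dnsbl.example"))
    print(wildcard_name("2001:db8::/32", "dnsbl.example"))
```

The wildcard keeps the zone file at one entry for the whole /32, but every distinct /64 still produces a distinct query name, and it is those per-name answers that the recursive resolver caches.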

And that is where the problem lies. That recursive resolver cache may grow to (attempt to) contain 4B entries, as the spammer sends each email from a different ipv6 /64 block. Soon that resolver cache will contain nothing but dnsbl entries. If that cache is shared with your normal web browsing clients (for example, your mail servers and your workstations both point to the same set of dns servers) then when they try to reach www.cnn.com, your recursive resolver will need to start asking the root servers where they can find the .com name servers. So all name resolution will take much longer.

So you need to set up dedicated dns servers for your mail servers. That is doable, but an extra step that many folks don't need today. Even if you do that, the cache flushing above will make your ipv4 dnsbl lookups take longer, since you will (almost) always be going back to the authoritative (spamhaus, spamcop, surbl, whatever) dns servers for answers. Your recursive resolver just came down with Alzheimer's. Look at the number of dns lookups your mail server makes now for each incoming message, and consider the effect if none of those could be answered from cache.
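The cache-flushing effect can be illustrated with a toy simulation in Python. It assumes a small fixed-size cache with least-recently-used eviction and ignores TTLs; real resolvers use their own eviction policies, and every name here (LRUCache, simulate, the example hostnames) is invented for the sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny least-recently-used cache standing in for a resolver cache (TTLs ignored)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def lookup(self, name):
        """Return True on a hit; on a miss, store name and evict the oldest entry if full."""
        if name in self.entries:
            self.entries.move_to_end(name)
            return True
        self.entries[name] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)
        return False


def simulate(capacity=1000, spam_lookups=5000):
    """Return (hit before the spam run, hit after the spam run) for an ordinary name."""
    cache = LRUCache(capacity)
    cache.lookup("www.example.com")             # first lookup warms the cache
    warm_hit = cache.lookup("www.example.com")  # second lookup is served from cache
    for i in range(spam_lookups):               # each spam message uses a fresh /64
        cache.lookup(f"block-{i}.dnsbl.example")
    after_hit = cache.lookup("www.example.com")
    return warm_hit, after_hit


if __name__ == "__main__":
    print(simulate())
```

With the default parameters the ordinary name is cached before the spam run and gone afterwards, because the one-off dnsbl entries outnumber the cache capacity.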

Two ideas come to mind: a) age out Frank Bulk  –  Apr 14, 2011 6:32 AM PDT

Two ideas come to mind:
a) age out the oldest cache entries? 
b) modify the RBL to store the matching /48 or /32, rather than the /64
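Idea (b) might look something like the following Python sketch, in which the mail server's local cache stores the covering prefix instead of the individual /64. It assumes the blocklist answer somehow reports the covering /48 or /32, which is a hypothetical extension and not something ordinary DNSBL responses carry; the class and method names are made up, and the linear scan is only for brevity.

```python
import ipaddress


class PrefixCache:
    """Negative-reputation cache keyed by covering prefix rather than by /64."""

    def __init__(self):
        self.listed = []  # IPv6Network objects reported as listed

    def remember(self, covering_prefix: str) -> None:
        """Store the covering prefix that a (hypothetical) blocklist answer reported."""
        net = ipaddress.IPv6Network(covering_prefix)
        if net not in self.listed:
            self.listed.append(net)

    def is_listed(self, addr: str) -> bool:
        """Return True if any cached prefix covers addr."""
        ip = ipaddress.IPv6Address(addr)
        return any(ip in net for net in self.listed)


if __name__ == "__main__":
    cache = PrefixCache()
    cache.remember("2001:db8::/32")  # one entry for the whole allocation
    print(cache.is_listed("2001:db8:1:2::1"))
    print(cache.is_listed("2001:db9::1"))
```

One stored entry then answers for every /64 inside the listed allocation, so a spammer rotating through a /32 adds nothing further to the cache.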