DNS: Understanding The SOA Record

In the hosting industry, the Domain Name System (DNS) is one of the most critical pieces, right behind websites themselves. Without DNS, that website you've worked so hard on would be completely invisible. (Although it's possible to access some sites using only the IP address of their web server, this is not the case for virtual websites, which require that their hostname be included in the HTTP request header. Without a working DNS record, virtual websites are completely inaccessible.) But I've found that DNS is something that is not well understood by many website operators. The basics of creating A records (which translate a hostname to an IP address) are simple enough, but when it comes to understanding how changes are propagated in DNS, this is often something of a mystery.

There is a widely held belief that any change made to the DNS zone file of a domain is instantly seen throughout the Internet. Yet nothing could be further from the truth. When advising that changes be made to a zone file to fix a problem, I routinely add the following caveat:

Please allow up to 24 hours for any change to completely propagate throughout the world-wide DNS system.

Changes to a zone file are almost never instantaneous, regardless of how desperate you are for them to be. Any change requires time before it will be seen everywhere on the Internet. But what many don't understand is that how quickly or slowly these updates propagate is actually under their direct control through the SOA record.

Let me be completely clear on this one point. Although you have control over the speed at which updates are propagated throughout the Internet, they will never, ever, be instantaneous! There will always be a delay. Your only control is over how short or long this delay will be.

SOA: Start Of Authority

The SOA record is perhaps the least understood record in the entire zone file. But it controls the speed at which any update is propagated throughout the Internet. The purpose of the SOA record is to:

  • Identify the DNS server that is authoritative for all information within the domain.
  • List the email address of the person in charge of the domain.
  • Control how often secondary servers check for changes to the zone file.
  • Control how long secondary servers keep the zone file active when the primary server cannot be contacted.
  • Control how long a negative response is cached by a DNS resolver (but for some DNS servers, this is also how long a DNS resolver should cache any response).

Now if you control all of the authoritative DNS servers for a domain (that is, the DNS servers that actually host the zone files and can answer queries for the domain, as opposed to having to ask another DNS server), then with the exception of how long negative responses should be cached, these settings may not seem as important, since you can force the secondary servers to update whenever needed. But if you are using third-party name servers which you do not control as your secondary name servers (such as Peer 1's SuperDNS servers), then these settings are vitally important to how fast any changes are propagated. So let's go over each of these settings.

I will be using the official names for each of these fields as listed in RFC 1035: Domain Names — Implementation and Specification.
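To keep the field names straight while reading the sections that follow, here is a minimal Python sketch that splits a one-line SOA record into the seven RFC 1035 fields. The record text, the example.com names, and the parse_soa helper are all hypothetical illustrations, and the parser assumes the simplest form of the record: no parentheses, comments, or time-unit suffixes.

```python
from typing import NamedTuple


class SOA(NamedTuple):
    mname: str    # primary name server
    rname: str    # responsible person's mailbox, with "@" written as "."
    serial: int   # zone file serial number
    refresh: int  # seconds between update checks
    retry: int    # seconds to wait after a failed check
    expire: int   # seconds a secondary keeps the zone without the primary
    minimum: int  # seconds (negative caching TTL, or default TTL on some servers)


def parse_soa(rdata: str) -> SOA:
    """Parse 'MNAME RNAME SERIAL REFRESH RETRY EXPIRE MINIMUM' into named fields."""
    parts = rdata.split()
    if len(parts) != 7:
        raise ValueError(f"expected 7 SOA fields, got {len(parts)}")
    mname, rname = parts[0], parts[1]
    serial, refresh, retry, expire, minimum = (int(p) for p in parts[2:])
    return SOA(mname, rname, serial, refresh, retry, expire, minimum)


# Hypothetical values chosen for illustration only.
EXAMPLE = "ns1.example.com. hostmaster.example.com. 2013051101 3600 900 1209600 3600"

soa = parse_soa(EXAMPLE)
print(soa)
```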

MNAME: Primary Name Server

Fully-qualified domain name of the primary or master name server for the zone file. Within the structure of DNS, there can only be one server that holds the master, editable zone file. (Yes, there are exceptions, but I won't cover them here.) All secondary name servers create their zone files by transferring the contents from the primary name server. Changes to the domain's resource records are made to the primary name server's zone file and are then propagated to the secondary name servers when they check for updates.

The domain name of the primary name server must end with a period.

RNAME: Responsible Person

Email address of the person responsible for the domain's zone file. Often it will be an alias or group address rather than a particular individual. It uses a special format where the "@" character is replaced with a "." (period) character and the email address ends with a period. So the email address hostmaster@example.com would become hostmaster.example.com. (note that the ending period is part of the email address).

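Avoid an email address which uses a period before the "@" character (such as host.master@example.com), since DNS will interpret the first unescaped period as the "@" character (where host.master.example.com. would become host@master.example.com). The zone file syntax in RFC 1035 does allow a period to be escaped with a backslash (host\.master.example.com.), but an address that needs no escaping is the safer choice.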
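The conversion in both directions can be sketched in a few lines of Python. The function names email_to_rname and rname_to_email are made up for this illustration, and the sketch deliberately ignores backslash escaping: it simply refuses addresses with a period before the "@", and it splits an RNAME at the first period the way an unescaped name would be read.

```python
def email_to_rname(email: str) -> str:
    """Convert 'user@domain' to the SOA RNAME form 'user.domain.'."""
    local, sep, domain = email.partition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a simple email address: {email!r}")
    if "." in local:
        # An unescaped period here would be misread as the "@" separator.
        raise ValueError("period before '@' is ambiguous in an RNAME")
    return f"{local}.{domain.rstrip('.')}."


def rname_to_email(rname: str) -> str:
    """Read an unescaped RNAME: the first period stands for '@'."""
    local, _, domain = rname.rstrip(".").partition(".")
    return f"{local}@{domain}"


print(email_to_rname("hostmaster@example.com"))    # hostmaster.example.com.
print(rname_to_email("host.master.example.com."))  # host@master.example.com
```

The second call shows the misreading described above: the address that was meant as host.master@example.com comes back as host@master.example.com.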

SERIAL

Serial number of the zone file that is incremented each time a change is made. The secondary name servers compare the serial number returned by the primary name server with the serial number in their copy of the zone file to determine if they should update their zone file. If the serial number from the primary name server is greater than their serial number, they will do a zone update transfer. Otherwise, no action is taken.

If you make a change to the zone file on the primary name server and forget to increment the serial number, the change will not be propagated to the secondary name servers even if you attempt to force a zone update transfer. The primary and secondary name servers will remain out of sync until the serial number is incremented on the primary name server. Unless you are manually editing the zone files (something that is not uncommon when using BIND), most DNS servers or frontend DNS applications will increment the serial number for you. But if you find that updates are not being propagated to the secondary name servers, the serial number is the first thing you should check.
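The decision a secondary makes can be sketched as a single comparison. This is a simplified illustration with a hypothetical function name; it uses plain integer comparison as described above, whereas real name servers compare serials using the wraparound-aware arithmetic of RFC 1982.

```python
def secondary_should_transfer(primary_serial: int, secondary_serial: int) -> bool:
    """Return True when the primary's serial is greater than the secondary's copy."""
    return primary_serial > secondary_serial


# Zone edited and serial incremented: the secondary pulls the new copy.
print(secondary_should_transfer(2013051102, 2013051101))  # True

# Zone edited but serial left unchanged: the edit stays on the primary.
print(secondary_should_transfer(2013051101, 2013051101))  # False
```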

In the early days of DNS, the serial number was just that — a number that was incremented by 1 each time the zone file was changed. So that one could have a better idea of when the zone file was actually changed, it's recommended (but not required) that you use the format YYYYMMDDnn, where YYYY is the year, MM is the month, DD is the day, and nn is the revision number (in case the zone file is changed more than once in a single day).

Never use a decimal in the serial number, such as 20130511.01, even if it is allowed by your DNS server. The serial number is an unsigned 32-bit number, so using a decimal in the serial number will cause it to be converted to something unexpected.
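If you edit zone files by hand, a small helper can take the guesswork out of picking the next serial. The following is a minimal sketch, assuming the YYYYMMDDnn convention with nn starting at 01; the next_serial function is hypothetical and not part of any DNS server.

```python
from datetime import date

MAX_SERIAL = 2**32 - 1  # the SOA serial is an unsigned 32-bit number


def next_serial(current: int, today: date) -> int:
    """Return the next YYYYMMDDnn serial, always greater than `current`."""
    first_of_today = int(today.strftime("%Y%m%d")) * 100 + 1  # nn starts at 01
    candidate = max(first_of_today, current + 1)
    if candidate > MAX_SERIAL:
        raise OverflowError("serial would not fit in 32 bits")
    return candidate


# First change of the day: jump to today's date with revision 01.
print(next_serial(2013051001, date(2013, 5, 11)))  # 2013051101

# Second change on the same day: just increment the revision.
print(next_serial(2013051101, date(2013, 5, 11)))  # 2013051102
```

Taking the larger of the two candidates means the serial never goes backwards, even if the current serial is already ahead of today's date. More than 99 changes in one day would spill into the next day's range, which is untidy but still increases the number.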

REFRESH: Refresh Interval

Time in seconds that a secondary name server should wait between zone file update checks. The value should not be so short that the primary name server is overwhelmed by update checks, and not so long that propagation of changes to the secondary name servers is unduly delayed. If you control the secondary name servers and the zone file doesn't change that often, then you might want to set this to as long as a day (86400 seconds), especially if you can force an update on the secondary name servers if needed. But if your secondary name servers are not under your control, then you'll probably want to set this to somewhere between 30 minutes (1800 seconds) and 2 hours (7200 seconds) to ensure any changes you make are propagated in a timely fashion.

Even if you configure your primary name server to send NOTIFY messages (which I will cover in a future article) to the secondary name servers whenever a change is made, you should never completely depend on this to ensure timely propagation of the changes, especially when using third-party secondary name servers. The decision to honor a NOTIFY message is entirely up to the secondary name server and some DNS servers do not support NOTIFY.

RETRY: Retry Interval

Time in seconds that a secondary name server should wait before trying to contact the primary name server again after a failed attempt to check for a zone file update. There are all kinds of reasons why a zone file update check could fail, and not all of them mean that there is something wrong with the primary name server. Perhaps it was too busy handling other requests just then. The Retry Interval simply tells the secondary name server to wait for a period of time before trying again. A good retry value would be between 10 minutes (600 seconds) and 1 hour (3600 seconds), depending on the length of the Refresh Interval.

The retry interval should always be shorter than the refresh interval. But don't make this value too short. When in doubt, use a 15 minute (900 second) retry interval.
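How the refresh and retry intervals work together can be sketched as a small simulation of a secondary's check schedule. This is a simplified model with hypothetical function names: it assumes a check either succeeds or fails outright, and it ignores NOTIFY messages and any jitter a real server might add.

```python
from typing import List


def next_check_delay(last_check_succeeded: bool, refresh: int, retry: int) -> int:
    """Seconds to wait before the next update check."""
    if retry >= refresh:
        raise ValueError("retry interval should be shorter than the refresh interval")
    return refresh if last_check_succeeded else retry


def simulate(outcomes: List[bool], refresh: int, retry: int) -> List[int]:
    """Return the elapsed time in seconds at which each check takes place."""
    times = []
    elapsed = 0
    ok = True  # assume the zone was just transferred successfully
    for outcome in outcomes:
        elapsed += next_check_delay(ok, refresh, retry)
        times.append(elapsed)
        ok = outcome
    return times


# Refresh of 1 hour, retry of 15 minutes; the second and third checks fail.
print(simulate([True, False, False, True], refresh=3600, retry=900))
# [3600, 7200, 8100, 9000]
```

After the failed check at 7200 seconds the secondary switches to the shorter retry interval, and it goes back to the refresh interval once a check succeeds.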

EXPIRE: Expiry Interval

Time in seconds that a secondary name server will treat its zone file as valid when the primary name server cannot be contacted. If your primary name server goes offline for some reason, you want the secondary name servers to keep answering DNS queries for your domain until you can get the primary back online. Make this value too short and your domain will disappear from the Internet before you can bring the primary back online. A good value would be something between 2 weeks (1209600 seconds) and 4 weeks (2419200 seconds).

If you stop using a domain and delete it from the configuration of the primary name server, remember to remove it from the secondary name servers as well. This is especially important if you use third-party secondary name servers since they will continue to answer queries for the deleted domain — answers which could now be completely incorrect — until the expiry interval is reached.
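To get a feel for what an expiry interval means on the calendar, here is a short sketch that works out when a secondary would stop answering. It assumes the interval is counted from the secondary's last successful check with the primary; the function names and dates are hypothetical.

```python
from datetime import datetime, timedelta


def zone_expires_at(last_successful_check: datetime, expire: int) -> datetime:
    """Moment at which the secondary stops treating its copy as valid."""
    return last_successful_check + timedelta(seconds=expire)


def secondary_still_answers(now: datetime, last_successful_check: datetime, expire: int) -> bool:
    """True while the secondary's copy of the zone has not expired."""
    return now < zone_expires_at(last_successful_check, expire)


last_ok = datetime(2013, 5, 11, 8, 0)  # last time the primary was reachable

# With a 2 week expiry the zone survives until 25 May.
print(zone_expires_at(last_ok, 1209600))  # 2013-05-25 08:00:00

# With a 1 hour expiry it is gone by the next morning.
print(secondary_still_answers(datetime(2013, 5, 12, 8, 0), last_ok, 3600))  # False
```

The same arithmetic applies to a deleted domain: a third-party secondary could keep serving the old zone until that date unless the domain is removed there too.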

MINIMUM: Negative Caching Time To Live

This field requires special attention since how it's interpreted depends on the DNS server you are using. There have been three possible meanings for the MINIMUM field:

  • Defines the minimum time in seconds that a resource record should be cached by any name server. Though this was the original meaning of this field (and it still retains the name from this meaning), it was never actually used this way by most name servers. This meaning is now officially deprecated.
  • Defines the default Time To Live (TTL) for all resource records that do not have an explicit TTL. This only applies to the zone file on the primary name server since a zone transfer to the secondary server adds the explicit TTL to the resource record if it is missing. Versions of BIND prior to 8.2 use the MINIMUM field as the default TTL for all resource records, as do all versions of Windows DNS Server.
  • Defines the time in seconds that any name server or resolver should cache a negative response. This is now the official meaning of this field as set by RFC 2308.

Unlike all the other SOA fields, MINIMUM affects every name server or resolver that queries your domain. If your DNS server is compliant with RFC 2308, then this field only applies to how long a negative response (that is, for a query where no resource record is found) is cached. But if your DNS server uses this as the default TTL for resource records without an explicit TTL, then it controls how long any response could be cached by a name server.

If you make this too long, then name servers and resolvers will keep using their cached result even after all the secondary name servers have updated their zone files. And there is no method available for you to force these name servers and resolvers to flush their cache. Again, if your DNS server is compliant with RFC 2308, it only applies to negative responses. But if not, then all resource records without an explicit TTL will use this value as the default TTL. If you were to set this to 1 week (604800 seconds), then it could take up to a week for any change to finally be seen everywhere on the Internet.

$TTL: Default Time To Live

This was added in RFC 2308 to define the default TTL that should be used for any resource record that does not have an explicit TTL. But as pointed out earlier, not all DNS servers support it. BIND 8.2 and higher use $TTL to define the default TTL in their zone files, but Windows DNS Server does not, relying on the SOA MINIMUM field instead. So check your DNS server manual to find out how it sets the default TTL.
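The different ways a TTL can be chosen are easier to compare side by side. The sketch below is a simplified model, not the logic of any particular DNS server: positive_ttl and negative_ttl are hypothetical names, a missing value is represented by None, and the negative caching rule (the smaller of the SOA record's own TTL and the MINIMUM field) follows RFC 2308.

```python
from typing import Optional


def positive_ttl(record_ttl: Optional[int], dollar_ttl: Optional[int], soa_minimum: int) -> int:
    """TTL for a normal answer: explicit TTL, then $TTL, then SOA MINIMUM."""
    if record_ttl is not None:
        return record_ttl   # the record carries its own TTL
    if dollar_ttl is not None:
        return dollar_ttl   # server supports $TTL and the zone sets it
    return soa_minimum      # server falls back to MINIMUM as the default TTL


def negative_ttl(soa_record_ttl: int, soa_minimum: int) -> int:
    """TTL for a 'no such record' answer under RFC 2308."""
    return min(soa_record_ttl, soa_minimum)


# Zone with $TTL of 1 hour and MINIMUM of 1 week.
print(positive_ttl(None, 3600, 604800))  # 3600

# Same zone on a server without $TTL support: MINIMUM becomes the default.
print(positive_ttl(None, None, 604800))  # 604800

# Negative answers are cached for the smaller of the two values.
print(negative_ttl(86400, 3600))  # 3600
```

The second call is the case to watch for: on a server that treats MINIMUM as the default TTL, a one week value applies to every record without an explicit TTL.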

Final Thoughts

There is no hard and fast rule for setting the refresh, retry, and TTL values. For domains where changes are rarely made, longer values are usually preferred. But if you are planning to make changes, then reducing these values beforehand, especially the default TTL, can go a long way to ensuring your changes get propagated in a timely fashion. But you must change these values at least as far in advance as the default TTL. If, for example, the current default TTL is set to one week, you'll need to change the default TTL at least a week before the zone file is changed to ensure that every DNS server and resolver is using the new TTL. Otherwise you could find that scattered sections of the Internet don't see the change until the older, cached record finally expires.
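The timing rule above is simple date arithmetic, sketched here in Python. The function name and the dates are hypothetical, and the calculation only covers resolver caching under the current TTL; it does not account for the time secondary name servers need to pick up the lowered value.

```python
from datetime import datetime, timedelta


def latest_time_to_lower_ttl(planned_change: datetime, current_ttl: int) -> datetime:
    """Latest moment to lower the TTL so old cached copies expire before the change."""
    return planned_change - timedelta(seconds=current_ttl)


# Planned change on 18 May at noon, current default TTL of 1 week.
deadline = latest_time_to_lower_ttl(datetime(2013, 5, 18, 12, 0), 604800)
print(deadline)  # 2013-05-11 12:00:00
```

Lowering the TTL after that deadline means some resolvers could still hold a record cached under the old one week TTL when the change goes live.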