Troubleshooting A Dns Server Failure On Windows Server 2003
Troubleshooting a DNS server failure on
Windows Server 2003
by Brien M. Posey MCSE | More from Brien M. Posey MCSE | 11/27/06
Tags: Hardware servers | Software servers | Microsoft Windows | Microsoft Server
• Comments: None | click here to start it
• Rating: Not yet rated Rate it
• Save to my Workspace
• E-mail Article
• Print Article
Takeaway: DNS is a vital service on your network. When it goes wrong, you've got big
problems. Here are some strategies you can use to diagnose and repair DNS problems on
DNS servers are true network workhorses. They are essential both for resolving Internet
domain names, and for the functionality of the Windows Active Directory. When a DNS
Server fails, it can be disruptive to say the least. In this article, I will walk you through
some steps that you can use to troubleshoot DNS failures.
Narrow down the problem
When you first notice that names are not being resolved, the very first thing that I
recommend doing is taking a few minutes to narrow down the problem. There are a wide
variety of DNS related problems, so knowing what the symptoms are of the problem that
you are having is going to be essential to helping you to resolve the problem quickly.
Initially, there are two things that you will want to check for. First, you need to determine
whether all of the computers in the office are having name resolution problems, or if the
problem is isolated to a particular network segment. It is important to keep in mind that
fully qualified domain names can be cached, so it is important to try to resolve a
computer name that the machine in question would not normally need to resolve.
The second thing that I recommend checking on is to see whether the computers on your
network are having trouble resolving internal names, external names, or both.
Performing these two tests will give you a good starting point from which to begin
diagnosing the problem. For example, if you determine that the problem is isolated to a
particular network segment, then the next logical step in the troubleshooting process
would be to check to see if there is a dedicated DNS server that services that segment, or
if the users on that network segment use the same DNS server as everyone else. If the
workstations on the segment that is having problems use the same DNS server as
workstations on other network segments, then there is most likely a communications
problem of some sort. There might be a router down or a firewall may have been
inadvertently configured to block name resolution traffic.
The test to determine whether machines on your network have trouble resolving internal
names, external names, or both is also designed to help you to figure out where you
should start troubleshooting the problem. For example, if you are having trouble
resolving the names of other computers on your internal network, then you most likely
have a true DNS failure. If, on the other hand, internal name resolutions are working,
then the DNS server is obviously functional. It could be that the forwarder is set
incorrectly or that an external DNS server is down.
Problems with your DNS server
Let's pretend that you have done these two tests and you determine that both internal and
external name resolutions are failing for every computer on the network. If that's the case,
then all signs point to a problem with your DNS server.
As simple as it sounds, the first thing that I recommend doing is taking a quick glance at
your DNS server to make sure that the monitor is not displaying the Blue Screen of Death
or some other type of catastrophic error message.
If the DNS server appears at first glance to be functional, then select the Services
command from the server's Administrative Tools menu to open the Service Control
Manager. Scroll through the Service Control Manager and make sure that the DNS
service is running. If for some reason the DNS service is not running, then you can right-
click on the service and select the Start command from the resulting shortcut menu.
Hopefully, doing so will start the DNS service and fix your problem. Even if it does
though, you need to take some time and look through the server's System log for clues as
to why the service has failed.
Assuming that the server appears to be functional and the DNS service is running, then
the next step is to test for a communications problem between the DNS server and other
machines on your network. The easiest way of accomplishing this is to go to one of the
workstations that is experiencing name resolution problems and open a Command
When the Command Prompt window opens, enter the IPCONFIG /ALL command. This
will display the IP configuration for each network adapter on the system. There are two
things that you need to look for when this information is displayed. You need to make
sure that the workstation itself has a valid IP address, and that the IP address of the DNS
server is correct.
The reason why this is important is because in most companies IP configurations are
assigned by DHCP servers. If the workstation has an invalid IP address or if the IP
address of the DNS server is incorrect, then then problem might be with your DHCP
server, not with your DNS server. If your DHCP server is failing or if it has been
misconfigured and is assigning an incorrect IP address scope or an incorrect IP address
for the DNS server, then the symptoms of the problem can mimic that of a DNS server
If the workstation's TCP/IP configuration appears to be correct, then the next step in the
troubleshooting process is to ping the DNS server's IP address. If the ping is returned,
then it means that there is a functional communications path between the workstation and
the DNS server.
If the ping fails, then it doesn't automatically mean that there is a communications
problem. It could be that a firewall between the workstation and the DNS server is
blocking ICMP traffic. One way of testing this is to ping a known good server (by IP
address not by fully qualified domain name) that is in close proximity to the DNS server.
If the ping continues to fail and you have ruled out firewall restrictions as the cause, then
there is most likely a communications problem of some sort going on. I recommend
returning to the DNS server, opening a Command Prompt window, and entering the
IPCONFIG /ALL command.
Doing so will display the TCP/IP configuration for each of the server's network
interfaces. You should verify that the IP address that is bound to the server's primary
network interface matches the address that network workstations are configured to use as
their DNS server.
If everything checks out, then try pinging the server's primary IP address from the
Command Prompt window. Since you are pinging the server's own IP address, the ping
won't verify network connectivity. What it will do though is to verify the integrity of the
TCP/IP stack. If this ping should fail then it means that either some of the files that make
up the TCP/IP stack might be corrupt, or it could mean that the IP address is not being
bound to the network adapter correctly.
If the self ping succeeds, then try pinging some other IP addresses on your network
(especially the IP address of the workstation that was unable to ping the DNS server
earlier). If these pings fail, then a communications problem is definitely to blame. You
might make sure that the patch cable is connected securely to the server's network
adapter. If that doesn't solve the problem, you might try plugging the DNS server into a
different port on your switch, or replacing the network adapter and patch cable.
What if the ping tests are successful though? In a situation like that, communications are
definitely functioning. I recommend going to one of the workstations that is having
problems, and opening a Command Prompt window. Upon doing so, try using the
NSLOOKUP command to resolve some names on your network to IP addresses. This
might seem pointless at first since we have already established that name resolutions are
failing, but I like doing the NSLOOKUP test anyway because it allows you to gather a
little bit more information about the problem. For example, when you perform the
NSLOOKUP, name resolution might fail entirely, or the name might be resolved to an
incorrect IP address.
If the name used in the NSLOOKUP query is resolved to an incorrect IP address, then
there are a couple of different things that could be going on. One possible cause is that
the DNS server involved contains one or more typos in its records. Normally, this should
only be a problem if the IT department manually creates DNS records though. This is
fortunate because the only real way to test for this problem is to manually review the
various host records and make sure that they are correct.
Another situation that could cause an NSLOOKUP query to return an incorrect IP
address is that dynamic updates may be failing. Dynamic updates are typically used
because the majority of the workstations on a corporate network typically receive their
TCP/IP configuration from a DHCP server. As such, a workstation's IP address may
change frequently. That being the case, the DNS server simply can not use static host
records for these machines. Instead, dynamic updates are used to insure that a computer's
host record matches its current IP address.
If dynamic updates are failing, then the DNS server database will contain outdated (often
invalid) IP addresses for various host records. The easiest way of forcing an update of a
host record is to go to one of the machines that has an outdated host record associated
with it and open a Command Prompt window. At the command prompt, enter this
command: IPCONFIG /REGISTERDNS This command should force a host record
update. If updates continue to fail, you should make sure that your DNS server is
configured to accept dynamic updates.
One other issue that can cause the DNS database to contain incorrect IP addresses is that
zone transfers might be failing. Normally though, this will only be an issue if the DNS
server is incorrectly resolving names from a secondary zone. If a zone transfer failure
occurs then outdated host records will remain in the secondary zone database file.
If you suspect a zone transfer problem then you can try to manually force a zone transfer
or try rebooting all of the DNS servers involved. If zone transfers have never worked
between the two zones, then it could be that you have incompatible DNS server types.
Although DNS name resolution itself is universal, some types of DNS servers use
different compression formats or resource record types than others.
If none of these techniques help the DNS server to start supplying the correct IP
addresses, then it could be that the DNS server has cached the incorrect IP addresses.
You can manually clear the DNS server's cache by opening the DNS console, right-
clicking on the DNS server in question, and selecting the Clear Cache command from the
resulting shortcut menu.
External name resolution failures
In some cases, a DNS server will have no trouble resolving names on your local network,
but may be unable to resolve Internet domain names. If that is the case, the problem is
most likely related either to your forwarders or to a failure of either your ISP's DNS
server or a router between you and your ISP's DNS server.
To troubleshoot this problem, open the DNS console, right-click on the listing for your
DNS server, and select the Properties command from the resulting shortcut menu. When
you do, Windows will display the server's properties sheet. Select the properties sheet's
Forwarders tab and make note of the list of forwarding IP addresses.
You can try pinging these IP addresses to make sure that there is a communications path
to them. If nothing seems amiss then you could contact your ISP to verify that they are
still using those addresses for their DNS servers.
While you are looking at the server's properties sheet, you might also check the Root
Hints tab to make sure that it is populated. The Root Hints tab lists the IP addresses of the
root DNS servers. On a Windows based DNS server, the root hints are prepopulated, and
the root addresses rarely if ever change. Even so, it's worth making sure that the root
hints have not been accidentally removed.