Monday, June 23, 2008

Exim is a bit nuts

I'm using Nagios to monitor servers, but was having trouble getting email to leave the Ubuntu JeOS server that I had set up under VMware to run it. (Most of my stuff is Windows, and Nagios is a Linux program.) It turns out that a program called Exim is used to send email, and it's a bit crazy.

All attempts to send email to myself produced bounce messages containing the error:
all relevant MX records point to non-existent hosts
Thanks to this entry in the PkgExim4UserFAQ, I was able to get a clue:

A probable cause for this might be that all MX records for the offending domain point to site local or link local IP addresses, which are ignored by the dnslookup router to protect from misconfigured external domains. The default configuration has relaxed checking for domains that the local system is configured to allow relaying to, so adding the offending domain to dc_relay_domains will most probably help. Please note that this entry might be necessary anyway to bypass relay control for the domains in question.

Please note that no domain on the public Internet should have MX records pointing to site local or link local IP addresses, so you might check your externally visible MX records.

If this doesn't help, try analyzing the output of exim -d -bt some.local.part@the.offending.domain.example

Well, I ran the suggested test and found this in the output:

ignored host [10.0.0.x]

Ignored host?? Clearly not the same as one that is non-existent. So... the first error was a lie.

All the relevant records pointed correctly to a host that was very much alive and well, but Exim chose to ignore it because its address was in a private, local range.
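To make the "ignored, not non-existent" distinction concrete, here is a minimal Python sketch of the kind of check that seems to be going on. It is not Exim's code. The list of networks is my assumption, based on the FAQ's mention of site-local and link-local addresses, and the real list in your Exim configuration may differ. The function name is made up, and 10.0.0.5 is just a stand-in address.

```python
import ipaddress

# Assumed ranges, for illustration only. The actual list Exim uses is set
# in its router configuration and may differ on your system.
IGNORED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # RFC 1918 private
    ipaddress.ip_network("172.16.0.0/12"),   # RFC 1918 private
    ipaddress.ip_network("192.168.0.0/16"),  # RFC 1918 private
    ipaddress.ip_network("169.254.0.0/16"),  # IPv4 link-local
    ipaddress.ip_network("127.0.0.0/8"),     # loopback
]


def is_ignored_target(address: str) -> bool:
    """Return True if the address falls inside one of the assumed ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in IGNORED_NETWORKS)


if __name__ == "__main__":
    # Hypothetical addresses: one inside 10.0.0.0/8, one outside every range.
    for candidate in ("10.0.0.5", "192.0.2.10"):
        verdict = "ignored" if is_ignored_target(candidate) else "accepted"
        print(f"{candidate}: {verdict}")
```

A mail host that resolves fine but sits inside one of these ranges gets thrown away, and if every MX host for the domain is thrown away you end up with the "non-existent hosts" message.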

To get around this, you have to follow the suggestion of Marc Haber and add your local domain to dc_relay_domains, which tells Exim that it is going to relay email for that domain (which sounds like a very bad idea).

I don't know why they did it this way, but I'm posting this here to help others figure it out.

1 comment:

Mike Warot said...

Furthermore, I went to register this as a bug, and the web server died... here's what I wrote up:

Title: misleading information about local mail domains

If you use exim to send mail to a domain which is on the local network, it fails.

The error you get back via mail includes the text:

all relevant MX records point to non-existent hosts

This information is incorrect and misleading. I personally don't know why the default is to ignore local mail servers, but I assume you have good reasons for doing so. This message needs to be modified to give correct and/or supplemental information.

I'd suggest that if an MX host has been ignored, it should set a flag that changes the error message to state that a local address was found but ignored, with a hyperlink to the page explaining why this policy exists.

This would save everyone, users and developers, some grief.

Thanks for your time and attention.

Mike Warot