A.J. Clark

Solutions Architect

Highly Available Mail Cluster – v2

It has been some time since I last blogged about my quest to build a Highly Available Mail Cluster. If you recall, the last HA Mail Cluster architecture that I designed involved four identically configured servers, spread between two data centres utilizing DNS round-robin load balancing. The Maildirs were rsynced from the ‘master node’ to the three slaves every 5 minutes. MySQL master-slave replication was used for user authentication. This worked “well enough” but it wasn’t real High Availability and it meant that, at any given time, three out of four Mail nodes were passive (idle) – wasting resources.

I decided to design a brand new platform to deliver a Highly Available Mail Cluster. The platform consists of two Xen domU nodes in an active/active configuration. Each node uses a DRBD block device in Primary/Primary mode, with OCFS2 as the clustered file system on top of DRBD. This gives both nodes concurrent, shared access to the Maildir directories.

Both nodes run Dovecot for POP3/IMAP and Postfix for SMTP. Each node has an equally weighted A/MX record for SMTP/POP3/IMAP load balancing. The load balancing is still performed by DNS, using two IP addresses in the A record. The beauty of this active/active heartbeat setup is that, in the event of a server failure, the IP resource of the failed server is taken over (via heartbeat) by the other mail server. This means there is virtually zero chance of a user hitting a stale IP address in the DNS A record. That matters because I have noticed that when users check their inbox (POP3/IMAP) every 5 minutes or so, Outlook caches the DNS entry indefinitely, regardless of TTL.
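To make the IP takeover idea concrete, here is a minimal Python sketch of the behaviour described above. It is only a toy model, not how heartbeat itself is implemented or configured: the `Node` class, the `fail_over` and `owner_of` helpers, the node names and the addresses (taken from the RFC 5737 documentation range) are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A mail server that may hold one or more service IP addresses."""
    name: str
    alive: bool = True
    ips: set = field(default_factory=set)


def fail_over(nodes):
    """Move every IP held by a dead node onto the first surviving node."""
    survivors = [n for n in nodes if n.alive]
    if not survivors:
        raise RuntimeError("no surviving node to take over the service IPs")
    for node in nodes:
        if not node.alive and node.ips:
            survivors[0].ips |= node.ips
            node.ips = set()


def owner_of(ip, nodes):
    """Return the name of the live node answering on `ip`, or None."""
    for node in nodes:
        if node.alive and ip in node.ips:
            return node.name
    return None


# Hypothetical addresses: both appear in the DNS A record.
A_RECORD = ["192.0.2.10", "192.0.2.11"]
mail1 = Node("mail1", ips={"192.0.2.10"})
mail2 = Node("mail2", ips={"192.0.2.11"})
cluster = [mail1, mail2]

mail1.alive = False   # simulate a crash of the first node
fail_over(cluster)    # stands in for heartbeat moving the IP resource

for ip in A_RECORD:
    print(ip, "->", owner_of(ip, cluster))
```

After the simulated crash both addresses resolve to `mail2`, so a client that has cached either entry from the A record still reaches a live server.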

The above solution is working very well. I did have a couple of initial concerns regarding the stability of DRBD/OCFS2 within a Xen domU, but I have had no problems to date. Overall the entire solution appears to be very stable.

The diagram below (servers on the right) shows the architecture. The full-size image can also be found at: http://napta2k.googlepages.com/linode-v2.png

The drawback of this solution is that it does not scale past 16 nodes. The scaling limit comes from how many cluster members can be part of Heartbeat, DRBD and OCFS2. Thankfully I only host 400 or so mailboxes and cannot see the need to scale to handle any more load. One domU handles my current load just fine. The two-node active/active setup could just as easily be active/passive and work just as well.

If I were going to implement a large, single-data-centre Highly Available Mail Cluster, I would use the following architecture:

1. CARP IP address layer to distribute the load to the TCP load balancers (probably only needed for the largest setups). You may have 16 load balancers on your front line, but you definitely do not want to have 16 IP addresses in your DNS A record. CARP hides this from DNS.

2. TCP load balancer layer, such as HAProxy, to balance the IMAP/POP3 traffic across the IMAP/POP3 farm

3. A farm of Dovecot servers all sharing a _resilient_ NFS backend of Maildirs

4. A resilient NFS architecture. This could be part of a SAN (e.g. EMC Celerra) or Linux iSCSI/DRBD.

x. SMTP can be load balanced via MX.
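As a rough illustration of how MX records spread SMTP load, the Python sketch below orders MX hosts the way a sending mail server is expected to try them: lowest preference first, with hosts of equal preference tried in random order. The function name `mx_delivery_order` and the hostnames are hypothetical, and this is a simplified model rather than the code of any particular MTA.

```python
import random
from collections import Counter, defaultdict


def mx_delivery_order(mx_records, rng=random):
    """Return MX hostnames in the order a sender would try them.

    mx_records is a list of (preference, hostname) tuples. Lower
    preference values are tried first; hosts that share a preference
    are shuffled, which is what spreads the load between them.
    """
    by_pref = defaultdict(list)
    for pref, host in mx_records:
        by_pref[pref].append(host)

    ordered = []
    for pref in sorted(by_pref):
        hosts = list(by_pref[pref])
        rng.shuffle(hosts)
        ordered.extend(hosts)
    return ordered


# Hypothetical zone: two equally weighted front-line hosts plus a backup.
records = [
    (10, "mail1.example.com"),
    (10, "mail2.example.com"),
    (20, "backup.example.com"),
]

# Count which host each of 10,000 simulated senders tries first.
rng = random.Random(42)
first_choice = Counter(mx_delivery_order(records, rng)[0] for _ in range(10_000))
print(first_choice)
```

The two equal-preference hosts should each be picked first roughly half the time, while the backup is only ever tried after both of them.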

Written by napta2k

March 5, 2009 at 8:56 pm