Momentum MTA Performance Tuning Tips
7 Jan 2012

This post is being constantly updated as we find out more useful information on Momentum tuning. Last update: 2012-05-05.

About two months ago I joined the LivingSocial technical operations team, and one of my first tasks was to figure out how to make our MTAs perform better and deliver mail faster. We use a really great product called Momentum MTA (formerly Ecelerity), and it is really fast, but it is always good to squeeze out as much performance as possible, so I started looking for ways to make our system faster.

While working on this I created a set of scripts to integrate Momentum with Graphite for all kinds of crazy stats graphing; those scripts will be open-sourced soon. For now, I've decided to share a few tips about the performance-related changes that improved our throughput at least 2x:

  1. Use the EXT2 filesystem for the spool storage – After a lot of benchmarking we noticed that the amount of I/O we were doing was way too high compared to our throughput. Some investigation showed that the EXT3 filesystem we were using for the spool partition had far too much metadata update overhead, because the spool consists of a huge number of really small files. Switching to EXT2 gained us at least 50-75% additional performance, and turning on the noatime mount option for the spool gave us a further boost.

    There are some sources that claim XFS is a better option for spool directories, but we've decided to stick with EXT2 for now.
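
    For reference, here is roughly what the resulting mount setup looks like. This is a minimal sketch: the device name and mount point are placeholders for whatever your spool partition actually is.

    # /etc/fstab: spool partition on EXT2 with atime updates disabled
    /dev/sdb1   /var/spool/ecelerity   ext2   defaults,noatime   0 2

    If you are converting an existing EXT3 spool, recreate the filesystem with mkfs.ext2 (after draining or relocating the spool) and verify the options with mount once it is remounted.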

  2. Do not use the %h{X} macro in your custom logs – Custom logging is an awesome feature of Momentum, and we use it to log our bounces along with some information from the mail headers. Unfortunately, the most straightforward approach (using the %h{X} macro) is not the best option for I/O-loaded servers, because every time Momentum needs to log a bounce it has to swap the message body in from disk and parse it to get the header value.

    To solve this we created a Sieve+ policy script that extracts the headers we need during the initial spooling phase (while the message is still in memory) and puts the values into the message metadata. This way, when we need to log those values, we don't have to swap the message body in from disk. Here is the Sieve script that extracts a header value:

    require [ "ec_header_get", "vctx_mess_set", "ec_log" ];

    # Extract x-ls-send-id header to LsSendId context variable
    # (later used in deliver log)
    ($send_id) = ec_header_get "x-ls-send-id";
    vctx_mess_set "LsSendId" $send_id;

    After this we could use it in a custom logger like this:

    custom_logger "custom_logger1"
    {
      delivery_logfile = "cluster:///var/log/ecelerity/ls-delivery_log.cluster=>master"
      delivery_format = "%t@%BI@%i@%CI@D@%r@%R@%m@%M@%H@%p@%g@%b@%vctx_mess{LsSendId}"
      delivery_log_mode = 0664
    }
  3. Give more RAM to Momentum – When Momentum receives a message, it stores it on disk (as required by the SMTP standard) and then tries to deliver the copy it has in memory; if delivery succeeds, the on-disk copy is unlinked. The problem with a really heavy outbound traffic load is that Momentum needs to keep tons of messages in memory, but by default it will only hold 250 messages resident. At a load of 250-500 messages per second this is just too small.

    To raise this limit we increased the Max_Resident_Active_Queue parameter to 1000000 (after making sure we have enough RAM to hold that many messages if needed) and set Max_Resident_Messages to 0 (which means unlimited). This lets Momentum keep as many messages resident as possible and reduces the load caused by swap-in operations required for re-delivery attempts, etc.
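
    For reference, the relevant lines in ecelerity.conf end up looking roughly like this; a sketch based on the option names above, so double-check placement and defaults against the configuration reference for your Momentum version:

    # Keep up to 1M messages resident in RAM instead of the default 250
    Max_Resident_Active_Queue = 1000000
    # 0 means no limit on the number of resident messages
    Max_Resident_Messages = 0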

  4. Choose a proper size for your I/O-related thread pools – the default Momentum config sets the SwapIn and SwapOut thread pool sizes to 20. Under really high load, even on our 4x SAS 15k RAID10 array this turned out to be too many threads. We shrank both pools to 8 threads each, which reduced I/O contention and the overall I/O load.
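
    As an illustration, the pool sizing lives in ecelerity.conf in stanzas along these lines. Treat the stanza and option names here as an assumption on my part and check them against your version's configuration reference before copying anything:

    # Shrink the disk I/O thread pools from the default of 20 threads each
    ThreadPool "SwapIn"
    {
      Concurrency = 8
    }
    ThreadPool "SwapOut"
    {
      Concurrency = 8
    }
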
  5. Disable adaptive delivery if you do not need it (especially critical in 3.4 with cluster-aware adaptive delivery) – in our case this allowed us to increase single-server throughput from 1.5M to 2M messages/hour on servers with warmed-up bindings that do not need adaptive warm-up and other fancy features. The reason for the difference is that, to do its adaptive delivery magic, the MTA has to spend considerable resources on all the decisions that make adaptive so valuable when warming up new IPs (adaptive backoff on ISP blocks, moving mail between nodes based on binding suppression information). After you disable adaptive delivery, make sure you restart your ecelerity process and check that the module is not loaded (using the module list console command).
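
    A quick way to verify this after the restart (the console prompt below is illustrative; module list is the command mentioned above):

    $ ec_console
    > module list

    If adaptive delivery still shows up in the output, the module is still loaded and you are still paying for it.
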
  6. Remove Replicate clauses from your configs – if you do not use the data replication features in Momentum (inbound_cidr, metrics, etc.), removing all replicate stanzas from your config files can give you a great performance boost, because the replication machinery takes a lot of resources and noticeably reduces throughput (though it can be useful in some cases).

To sum up, I'd like to note that, as with any optimization, before tuning your system it really helps to set up as much monitoring for your MTA servers as possible: Cacti graphs, Graphite, Ganglia or something else – it does not matter. Just make sure you can see every aspect of your system's performance and understand what is going on before changing any performance-related settings.