Archive for Server Notices

Mandatory SPF Checks

Mandatory SPF checks will be implemented on the servers starting tomorrow afternoon.  This will hopefully combat the deluge of spam customers have been receiving lately.

If you are receiving Undelivered/Undeliverable messages, then make sure you visit the “SPF Wizard” within the control panel to add an SPF record for your domain.  If you are using Apis to send e-mail (and your mail client should be configured to do so), then the default settings should help limit the amount of backscatter you are currently receiving.

Failing that, there is one more idea with maildrop presets, but let’s see what happens.

12:49 PM EDT — SPF checks are now implemented on all of the servers.  You should begin seeing a reduction in spam.  Also, make sure the SPF record is a strict fail (-all) instead of a softfail (~all).  You may edit the TXT record via DNS Manager if it is a softfail.
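For anyone unsure which kind of record they have, here is a minimal Python sketch that inspects the text of an SPF TXT record and reports the qualifier on its `all` mechanism.  The function names and the sample records are made up for illustration; this is not how the SPF Wizard or DNS Manager works internally, and it does not do any DNS lookups.

```python
from typing import Optional


def spf_all_qualifier(txt_record: str) -> Optional[str]:
    """Return the qualifier ('-', '~', '?', '+') on the `all` mechanism.

    Returns None if the text is not an SPF record or has no `all` term.
    A bare `all` with no qualifier defaults to '+' (pass).
    """
    terms = txt_record.strip().strip('"').split()
    if not terms or terms[0].lower() != "v=spf1":
        return None
    for term in terms[1:]:
        lowered = term.lower()
        if lowered in ("all", "+all", "-all", "~all", "?all"):
            return lowered[0] if lowered[0] in "+-~?" else "+"
    return None


def is_strict_fail(txt_record: str) -> bool:
    """True only when the record ends its policy with -all."""
    return spf_all_qualifier(txt_record) == "-"


# Hypothetical records, pasted in by hand from a DNS editor.
print(is_strict_fail("v=spf1 a mx -all"))
print(is_strict_fail("v=spf1 a mx ~all"))
```

If the second form is what you see for your domain, that is the softfail case to change.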

A solution to HTTP bandwidth monitoring

Good news for me, bad news for everyone else: I have an amicable solution for handling bandwidth tracking on the new servers.  Instead of going the cumbersome route of writing an Apache module to track bandwidth, I will be taking a page from cPanel’s book and just writing traffic to a flat file.  Every night at approximately 3 AM a script will reconcile the bandwidth logs and dump them into the database.  There is a marginal hit on filesystem performance, but again this is a far better solution than Ensim’s cumbersome configuration file parsing, plus we will still be able to retain the rapid Web server restarts that we have all grown to appreciate and love.

Currently, the script to parse and dump the logs into the database is being written and should be live no later than Saturday morning.
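To give a rough idea of what the nightly reconciliation amounts to, here is a small Python sketch.  It is not the actual script: the log format (one `site bytes` pair per line) and the function name are assumptions for illustration, and the database step is left out entirely.

```python
from collections import defaultdict
from typing import Dict, Iterable


def reconcile(lines: Iterable[str]) -> Dict[str, int]:
    """Sum bytes per site from flat-file log lines of the form 'site bytes'.

    The line format is hypothetical. Malformed lines are skipped.
    """
    totals: Dict[str, int] = defaultdict(int)
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue
        site, raw_bytes = parts
        try:
            count = int(raw_bytes)
        except ValueError:
            continue
        totals[site] += count
    return dict(totals)


# Made-up sample of one night's log.
sample = ["example.com 100", "example.org 50", "example.com 25", "garbage"]
print(reconcile(sample))
```

The per-site totals are what would then be written to the database in a single pass, which is why the Web server never has to be involved.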

IMAP/POP3 and SSH traffic are negligible.  Dedicated port traffic will be next on the list of problems to solve… possibly with some iptables packet counting.

- Matt

Disk Quota Upgrades

Disk quotas have finally been doubled from their initial figures as promised “back in the day”.  The first step was a 50% increase from 350 MB to 525 MB and the second step doubled the initial disk quota to 700 MB.

All sites should have these changes available at this time.  Packages have been updated to reflect the new disk space allocations.  Next step is to once again count apache-owned files towards the quota.  Once completed, disk space upgrades will be taken care of for some time to come.

RoundCube webmail now available

Due to popular demand, and a stable release, RoundCube support has been added to all accounts, accessible via http://roundcube.<your domain>/

Have fun with the snazzy, new webmail client.

Kernel upgrade pilot, May 3rd on Aleph

The kernel on Aleph will be upgraded to 2.6.25 on May 3rd at 1 AM EST (-0500 GMT).  There will be a brief 3 - 5 minute period of service interruption as the server is rebooted.  We will be performing the kernel upgrade to assess the changes in the sky2 network driver used on the servers.  Currently with 2.6.24 we still see intermittent timeouts once every several days lasting 20 seconds when the network card “hangs” and is restarted.  While these are rare, ensuring optimal stability around the clock is important.  If the sky2 driver fixes eliminate the hangs, then 2.6.25 will be rolled out on all the servers in 10 - 14 days.
Update: the new kernel will also include offline support for the second processor to reduce power consumption during non-peak hours.  This feature will be demoed on Saturday night as well between 2 AM - 9 AM EST.

New feature, Web-based FTP Client

Visit http://ftp.apisnetworks.com/ or http://net2ftp.apisnetworks.com/ to access the Web FTP client.  Please bear in mind that you may only connect to accounts hosted through Apis Networks (IP addresses starting with 64.22.68.x).  Also remember that if you want to hand out temporary FTP accounts so friends can host files for a while, you can jail those users too.

Quota Overage -> Disable MySQL Insert/Update Policy Change

A new policy change will be enacted for all accounts on Wednesday, February 20th.  Accounts that remain over disk quota for more than one week will have INSERT and UPDATE privileges revoked.  This will have no effect on those already over quota or those who are in danger of going over quota.  Accounts that went over quota for a week, somehow managed not to notice the sudden stoppage of e-mail and the database errors, and eventually freed up space will need to visit MySQL Manager in the control panel to re-enable INSERT and UPDATE privileges.

This policy has been introduced to counteract the terrible threading issues in MySQL, which have been present as far back as 4.0.  Fast forward a few years and it’s still an obscure bug that hits randomly.  When an INSERT or UPDATE statement occurs, MySQL attempts to acquire an exclusive lock on the table to safeguard against corruption; however, this lock, which is meant to be used solely for writes, trickles out past the table level, past the database level, and locks all tables in all databases.  Regardless of whether it is a read or write operation, it’s locked.  Although I have written a monitoring service to check for these locks every 3 minutes and restart as necessary, I feel it is inadequate at addressing the problem head-on.  The new policy change should make lock-ups extremely infrequent (1 per 180+ days across all servers).  Right now we’re still seeing them happen at a rate of 1 per 30 days across all servers.

Again just to rehash: if you run over quota for more than a week, visit MySQL Manager after fixing disk space usage to re-enable the INSERT and UPDATE privileges.
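The rule itself is simple enough to sketch in a few lines of Python.  This is only an illustration of the one-week grace period described above; the function name, the idea of tracking an "over quota since" timestamp, and the dates are assumptions, not the actual enforcement code.

```python
from datetime import datetime, timedelta
from typing import Optional

# One week over quota before write privileges are pulled.
GRACE_PERIOD = timedelta(weeks=1)


def should_revoke_writes(over_quota_since: Optional[datetime], now: datetime) -> bool:
    """True when an account has been over quota for more than one week.

    `over_quota_since` is None for accounts currently under quota.
    """
    if over_quota_since is None:
        return False
    return now - over_quota_since > GRACE_PERIOD


# Hypothetical dates for illustration.
now = datetime(2008, 2, 20, 12, 0)
print(should_revoke_writes(datetime(2008, 2, 10), now))  # over for 10 days
print(should_revoke_writes(datetime(2008, 2, 18), now))  # over for 2 days
print(should_revoke_writes(None, now))                   # under quota
```

Note that nothing in this check restores privileges once disk usage drops, which matches the policy: re-enabling is a manual step in MySQL Manager.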

Emergency kernel upgrade

An emergency kernel upgrade is scheduled for February 11th at 12 AM EST (-0500 GMT).  This kernel upgrade will bring the servers to 2.6.24.1 and is mandatory to avoid a serious exploit within the kernel.  While I do apologize for the quick succession of kernel upgrades over the past few days, this one cannot be avoided; further, the temporary fix tends to result in system instability, as has been noted on 2 of the servers earlier today.  More information will be available later tonight.

12:22 AM EST:  one more exploit is out in the wild, which hasn’t been pushed upstream yet.  Rebuilding kernels with custom patch in git; another reboot is coming up.

12:28 AM EST:  the last patch has been applied and it looks like we’re good to go.  This is a nasty set of easy exploits in the vmsplice section of the kernel, which is used all over the place such that it’s not something that can be removed.

There are three exploits, all of which affect the 2.6.17 - 2.6.24.1 branches of the Linux kernel.  Unfortunately this means very high and easy penetration for attackers.  If you work in IT, then you have my sympathies tomorrow.  It’s going to be a fun day for those who don’t keep up with security on the weekends.  Although it doesn’t quite rival do_brk() in terms of notoriety, I have a hunch it’ll be an infamous bug for some time to come.

One final note: the RET “fix” can lead to system instability and is not recommended.

Postfix 2.5 goes live tonight, IMAP/POP3 quota enhancements, sky2 still buggy

IMAP Quota Support
Amidst the massive confusion users experience when they suddenly find themselves over quota with little notification aside from within the control panel, I’ve patched Dovecot’s quota reporting extension to work correctly with the servers.  Even better, it’s reported in SquirrelMail and Horde now too.  In summary, the quota reporting works like this: check if there’s a user quota limit and it’s less than the account quota limit; if so, report that, otherwise report the total account quota used and account quota limit.
[Screenshots: sqmail-quota.png, horde-quota.png]
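The reporting rule described above can be sketched in Python as follows.  This is not the Dovecot patch itself (which is not shown here); the `Quota` type, the function name, and the convention that a missing or zero user limit means "no user quota" are assumptions for illustration.

```python
from typing import NamedTuple, Optional


class Quota(NamedTuple):
    used: int   # e.g. megabytes used
    limit: int  # e.g. megabytes allowed


def reported_quota(user: Optional[Quota], account: Quota) -> Quota:
    """Pick which quota to show in the mail client.

    Report the user's own quota only when a user limit exists and is
    lower than the account limit; otherwise report the account totals.
    """
    if user is not None and 0 < user.limit < account.limit:
        return user
    return account


# Made-up figures: a 100 MB mailbox cap inside a 700 MB account.
print(reported_quota(Quota(50, 100), Quota(400, 700)))
# No per-user cap, so the account-wide usage is what gets reported.
print(reported_quota(None, Quota(400, 700)))
```

The point of the comparison is that a user limit higher than the account limit can never be reached, so showing the account figures is the more honest number.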

Postfix 2.5
Postfix 2.5 will be rolled out on the servers tonight with another very important enhancement designed to protect against accounts monopolizing the mail server.  Delays may be configured between delivery attempts in order to prohibit one particular e-mail account from consuming all delivery slots as we saw earlier this week.  Currently Postfix will add a 10 second delay between delivery attempts to the same address.  For example, let’s say you’ve been inundated with a spam flood of 100 messages destined to your e-mail account, all of which were accepted on the mail server’s end in a 2 second window.  Postfix 2.4 and earlier would attempt to deliver all of the messages at once, quickly tying up delivery slots in the process as each message is scanned by SpamAssassin and hopelessly delivered.  With the delay in place, those deliveries are spaced out instead, leaving delivery slots free for other accounts.  Another new feature in Postfix 2.5 is the revised scheduling algorithm that takes into account feedback to determine if a hop is up or down.
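To put numbers on the spam flood example, here is a back-of-the-envelope Python sketch.  It is a simplification, not a model of the Postfix scheduler: it assumes deliveries to one address happen strictly one at a time with a fixed 10 second gap and that each delivery itself takes no time.  The function name is made up.

```python
from typing import List


def delivery_offsets(message_count: int, delay_seconds: int) -> List[int]:
    """Start time (in seconds) of each delivery attempt to one address,
    assuming a fixed delay between consecutive attempts."""
    return [i * delay_seconds for i in range(message_count)]


offsets = delivery_offsets(100, 10)
# The last of 100 messages starts 99 gaps after the first.
print(offsets[-1], "seconds, or", offsets[-1] / 60, "minutes")
```

Under those assumptions the flood drains over roughly a quarter of an hour using one slot at a time, rather than grabbing every slot in the first two seconds.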

vsFTPd+ 
I’ve heard 2 reports from users experiencing authentication difficulties with the new vsFTPd+ build.  I have been unable to confirm the bug from my end.  I am waiting on a packet trace from another customer with Ethereal later on tonight.  If I can get that squared away, then vsFTPd+ will be rolled out on the servers tonight; otherwise we’ll hold off until that bug is understood.

sky2 NIC Drivers
Despite my extremely high hopes that the 2.6.24 Linux kernel would address the deadlocking problems present in Marvell Yukon XL chips, it’s still an off and on problem.  Issues have been further exacerbated by an unexpected kernel panic on Assmule Wednesday night.  No other servers have exhibited a kernel panic, but we’re still seeing the NIC lose connectivity for 10-15 seconds randomly.  Fortunately it is extremely rare; Augend has had it happen once this week, and 10 seconds/604800 seconds is such a marginal number you could chalk it up to random network glitches elsewhere.  Anyway, I am still following the threads as they pop up on the LKML and I’ll let you know if anything turns up.
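For the curious, the 10 seconds out of 604800 figure works out as follows.  This is just the arithmetic from the paragraph above in Python form; the function name is made up.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604800


def downtime_fraction(outage_seconds: float, window_seconds: float) -> float:
    """Fraction of the window during which the NIC was unreachable."""
    return outage_seconds / window_seconds


fraction = downtime_fraction(10, SECONDS_PER_WEEK)
print(f"downtime: {fraction * 100:.5f}% of the week")
print(f"uptime:   {(1 - fraction) * 100:.5f}% of the week")
```

One 10 second hang per week is on the order of a thousandth of a percent, which is why it is hard to tell apart from ordinary network noise.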

rc.local Support
Because we are destined for another kernel upgrade in the next few months, it’s a good time to think about adding rc.local support to each user’s Basic and higher account.  rc.local allows you to include commands that are executed upon boot by the server.  There’s a global rc.local we use to set readahead rates on the hard drives and toss up any auxiliary services, but as I know from process output, many of you run your own services like mongrel, pen, svnserve, and the like, which are critical to your site.  You can’t always be around when there’s a kernel panic like with Assmule, and oftentimes you won’t witness it.  Servers can recover in as little as 3 minutes.  That is less time than it takes to go to the bathroom and come back if you’re gifted in the large bladder department.  rc.local will be rolled out next week.

That’s all for this Friday’s installment.  Log rotation support in the control panel will be split between a basic and advanced editor… in case you were curious as to the hold-up.

Service Pilots

New versions of the SMTP and FTP servers will be piloted on Borel and Assmule respectively through Saturday morning.  Borel will be running Postfix 2.5, which introduces a nifty new tunable parameter to add wait times between mail deliveries.  This should be helpful in alleviating potential bottlenecks when a user’s account goes over quota and ties up all delivery slots as each message is fruitlessly delivered.  Assmule is running vsFTPd+, which introduces a handful of fixes including proper authorization prompts in Internet Explorer/Firefox if authorization fails.
Both servers are running the new versions at this time with no apparent interruption or side-effects.
