June 25, 2008 at 6:04 pm EDT
· Filed under Miscellaneous
In case you were curious as to the cause of the drought of esprit updates as of late, I am feverishly working on giving Apis’ site a much-needed facelift. You may preview the new site at http://v6.apisnetworks.com/. Feel free to leave any usability remarks in the comments section of the blog.
Look for it to appear sometime in July, during the six-year anniversary of Apis.
June 13, 2008 at 1:47 pm EDT
· Filed under apnscp esprit Updates
A new esprit update has been pushed to the servers.
- Added: administrative ticketing capability
- Added: WebDisk component; relaxed WebDAV to listen on all interfaces on the server on port 2077/TCP
- Added: svn DAV provider in WebDAV
- Added: Admin module
- Added: Roundcube Webmail
- Fixed: use noreply address for SQL backups
- Fixed: disable duplicate postback for tickets
- Fixed: Horde, SquirrelMail display fix-ups. Truncate displayed URL, automatically redirect if login successful
- Fixed: Horde authentication bug
- Changed: force low priority on bandwidth/billing/control panel bug/enhancement tickets
June 12, 2008 at 6:05 pm EDT
· Filed under Server Notices
After 24 hours of tweaking and logging, the crashes look to be directly attributable to Zend Optimizer. As a result, we recommend that all customers switch their encoded applications from Zend Encoder to ionCube at this time. ionCube does not exhibit the same issues: after a few hours of running exclusively with ionCube + eAccelerator and ionCube + APC, both combinations have performed admirably.
With Zend Optimizer, crashes tend to come in clumps, e.g.:
[Thu Jun 12 13:16:24 2008] [notice] child pid 17624 exit signal Segmentation fault (11)
[Thu Jun 12 13:18:05 2008] [notice] child pid 27997 exit signal Segmentation fault (11)
[Thu Jun 12 13:18:05 2008] [notice] child pid 27999 exit signal Segmentation fault (11)
[Thu Jun 12 14:04:08 2008] [notice] child pid 30838 exit signal Segmentation fault (11)
[Thu Jun 12 14:04:09 2008] [notice] child pid 6332 exit signal Segmentation fault (11)
[Thu Jun 12 15:30:18 2008] [notice] child pid 17317 exit signal Segmentation fault (11)
[Thu Jun 12 15:30:18 2008] [notice] child pid 18367 exit signal Segmentation fault (11)
[Thu Jun 12 15:45:19 2008] [notice] child pid 28589 exit signal Segmentation fault (11)
[Thu Jun 12 15:45:42 2008] [notice] child pid 15373 exit signal Segmentation fault (11)
[Thu Jun 12 16:45:03 2008] [notice] child pid 31388 exit signal Segmentation fault (11)
[Thu Jun 12 16:45:19 2008] [notice] child pid 19022 exit signal Segmentation fault (11)
[Thu Jun 12 16:45:19 2008] [notice] child pid 22038 exit signal Segmentation fault (11)
[Thu Jun 12 16:45:45 2008] [notice] child pid 19016 exit signal Segmentation fault (11)
[Thu Jun 12 16:45:45 2008] [notice] child pid 5576 exit signal Segmentation fault (11)
And so on. This is a snippet from Borel, which appears to be all over the map in terms of crashes. Other servers crash less frequently (2 every 4 - 6 hours). There is likely some suspect code evaluated by Zend that causes the issues, but having noted long-standing reliability issues with Zend in the past, it’s safe to say we need to put that extension to rest.
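To make "clumps" a little more concrete, here is a rough Python sketch that pulls the segfault lines out of an Apache error log and groups crashes that land close together. The function names, the `SAMPLE` excerpt (the first five lines of the log above), and the 2-minute gap are all assumptions chosen for illustration, not part of any tooling we run.

```python
import re
from datetime import datetime, timedelta

# Matches lines such as:
# [Thu Jun 12 13:16:24 2008] [notice] child pid 17624 exit signal Segmentation fault (11)
SEGFAULT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[notice\] child pid (?P<pid>\d+) "
    r"exit signal Segmentation fault \(11\)"
)

SAMPLE = """\
[Thu Jun 12 13:16:24 2008] [notice] child pid 17624 exit signal Segmentation fault (11)
[Thu Jun 12 13:18:05 2008] [notice] child pid 27997 exit signal Segmentation fault (11)
[Thu Jun 12 13:18:05 2008] [notice] child pid 27999 exit signal Segmentation fault (11)
[Thu Jun 12 14:04:08 2008] [notice] child pid 30838 exit signal Segmentation fault (11)
[Thu Jun 12 14:04:09 2008] [notice] child pid 6332 exit signal Segmentation fault (11)
""".splitlines()


def parse_crashes(lines):
    """Return (timestamp, pid) pairs for every segfault line found."""
    crashes = []
    for line in lines:
        match = SEGFAULT.match(line.strip())
        if match:
            ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            crashes.append((ts, int(match.group("pid"))))
    return crashes


def clump(crashes, gap=timedelta(minutes=2)):
    """Group crashes so that consecutive crashes in a group are at most `gap` apart."""
    clumps = []
    for ts, pid in sorted(crashes):
        if clumps and ts - clumps[-1][-1][0] <= gap:
            clumps[-1].append((ts, pid))
        else:
            clumps.append([(ts, pid)])
    return clumps


if __name__ == "__main__":
    for group in clump(parse_crashes(SAMPLE)):
        print(group[0][0], "->", len(group), "crash(es)")
```

The gap is the only knob here; a wider gap merges neighbouring bursts into a single clump.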
Now, as an added perk, we will be able to deploy APC, which has upload byte logging support. Previously, APC and Zend Optimizer could not co-exist (well, they could, but every request segfaulted at the end…). eAccelerator and Zend Optimizer can, which is why I originally chose that solution. Notwithstanding that, APC was also extremely buggy during the initial round of software evaluation in November ’06.
If these results remain as promising as they have been during the first few hours of testing, the servers will be running ionCube and APC by Monday morning. We can finally put the erratic restarts to rest on Augend and Borel.
- Matt
Update: Oh, and that reminds me: if my thought process is right, then Zend Optimizer is also the likely cause of the odd phenomenon of blank PHP pages that hit less frequently than lottery winners. Assume that eAccelerator needs to compile the page (exceeded TTL or new code), finishes servicing the request, then reserves an inode on the filesystem to store the compiled code. After the inode is reserved (or a block in memory, or whatever signals that the code is about to be stored), the shutdown hook calls Zend Optimizer, which in turn crashes. Because Apache has died prematurely, no buffer flush writes data to disk/memory, which leaves an opened inode or section in shared memory with nothing in it, thus leading to… the blank page.
(Purely speculation, but a possibility.)
June 11, 2008 at 4:01 pm EDT
· Filed under Server Notices
There are three important changes on Augend and Borel today aimed at increasing the stability of the Web server. We have been seeing a worker crash every few minutes on each server. These crashes are, as far as I am aware, entirely random, and may be triggered during the 3-minute service check on http://<server>/phpinfo.php. When the service check fails due to a crash, a restart is issued. The crash, however, appears to be far-reaching, stalling the proper shutdown of all Apache workers. Instead of a normal restart happening then, the service monitor fails to restart the service properly and finalizes the restart during the next check 3 minutes later.
From prior knowledge, Zend Optimizer has caused these dangling restarts in the past, so this is the first thing to be upgraded on the two more problematic servers. Augend has also had ionCube temporarily removed as no one uses ionCube’s loader to load ionCube-encoded code on it. Hopefully one of these two is the culprit, with a simple fix.
Further, the amount of time the init script waits for Apache to shut down has been upped from 10 seconds to 30 seconds. If Apache does not shut down properly within 30 seconds, the shutdown/startup process will bail. Apache is configured with a graceful shutdown timeout of 15 seconds. These three fixes should permanently resolve the erratic segfaults that we have seen predominantly on Augend and Borel.
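The wait-then-bail behaviour can be sketched roughly as below. This is a Python illustration only, not the init script itself; `wait_for_exit` and its parameters are hypothetical names, and `is_running` stands in for whatever check is used to tell whether Apache is still alive.

```python
import time


def wait_for_exit(is_running, timeout=30.0, interval=0.5,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll until is_running() returns False or `timeout` seconds pass.

    Returns True if the process exited in time, False if the caller
    should bail out of the shutdown/startup sequence.
    """
    deadline = clock() + timeout
    while is_running():
        if clock() >= deadline:
            return False  # still running after the timeout: give up
        sleep(interval)
    return True


if __name__ == "__main__":
    # Stand-in process that is still "running" for the first two polls.
    polls = iter([True, True, False])
    print(wait_for_exit(lambda: next(polls), timeout=30, interval=0.1))
```

The `sleep` and `clock` arguments exist only so the loop can be exercised without actually waiting.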
- Matt
Update: just to clarify, the crashes affect a select few users. With one crash every 3 minutes and 30 pages served per second on average, any given request has a 1/5400 chance of being the one to crash. PHP pages tend to have a higher frequency of encountering this bug, so that’s why I am looking into the trifecta of loaders/optimizers first.
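For anyone who wants to check the 1/5400 figure, the arithmetic is spelled out below in Python. It assumes the crash is equally likely to land on any request in the 3-minute window, which is a simplification; the variable names are only for illustration.

```python
from fractions import Fraction

crash_interval_s = 3 * 60   # one crash every 3 minutes
pages_per_second = 30       # average request rate

# Requests served between two consecutive crashes.
requests_per_interval = crash_interval_s * pages_per_second

# Chance that any one given request is the one that crashes.
chance = Fraction(1, requests_per_interval)

print(requests_per_interval, chance)
```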
June 10, 2008 at 12:02 am EDT
· Filed under apnscp esprit Updates
New esprit update scheduled for 3 AM EDT:
- Added: prefer slave billing server, fall back to master billing server if unable to contact the slave billing server
- Added: trouble ticket priority system
- Fixed: leading forward slash in forbidden files in File Manager
- Fixed: CSC changes: nag on missing CSC only if the card number changed, or if there is no CSC, the card number is unchanged, and it is the first access; further, deny credit card updates without a CSC provided
- Fixed: Impromptu display bugs in IE
- Fixed: Misc Majordomo fix-ups: missing majordomo-owner alias, removed -w check on majordomo directories as -w doesn’t handle ACLs
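The slave-first billing lookup in the first item above follows an ordinary try-in-order pattern. Here is a minimal Python sketch of that idea; `query_billing` and the callables passed to it are hypothetical stand-ins, not the actual esprit code.

```python
def query_billing(servers, request):
    """Try each billing server in order of preference.

    `servers` is a list of callables (hypothetical connectors), slave
    first and master last. The first one that answers wins; a
    ConnectionError moves on to the next one.
    """
    last_error = None
    for connect in servers:
        try:
            return connect(request)
        except ConnectionError as exc:
            last_error = exc  # remember why this server failed, try the next
    raise ConnectionError("no billing server reachable") from last_error


if __name__ == "__main__":
    def slave(request):
        raise ConnectionError("slave unreachable")

    def master(request):
        return ("master", request)

    print(query_billing([slave, master], "invoice lookup"))
```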
June 6, 2008 at 12:17 am EDT
· Filed under apnscp esprit Updates
esprit now has a functional level 1 WebDAV server. There are still a few remaining bugs with not picking up exotic filenames (UTF-8-named files, garbage characters in the name) and PROPPATCH (can’t rationalize an immediate need for it), but it does work and it works well. The server is actually built using the API, very neat stuff once you get into the nitty-gritty. The API still needs some clean-up* and documentation.
Accessing the filesystem from your desktop is very simple now. On Windows 2000/XP go to Explorer, My Network Places -> Add Network Place -> Choose another network location -> http://cp.<server name>.apisnetworks.com:2077/ -> plug in your login/password, e.g. msaladna@apisnetworks.com/password and you’re set. Right now it doesn’t support SSL and likely won’t be bound to port 80 due to conflicts between the dashboard and dav/ subdirectory when accessed under Windows.
- Added: welcome screen for first-time users
- Added: DAV interface accessible via http://cp.<server>.apisnetworks.com:2077/.
- Added: overwrite parameter to File_Module::copy
- Added: File_Module::find_quota_files, file_exists
- Fixed: equal spacing on action images in list view for File Manager
- Fixed: use built-in JSON handler for SOAP
- Changed: upgraded jQuery to 1.2.6
- Removed: unnecessary buffer flushes in Module_Skeleton::call_proc
* Chiefly the *_backend methods not directly invoked from outside the API are scheduled for removal in a few months to be replaced by compact versions, e.g. if (IS_CLI) { /** do elevated tasks */ } else { /** unprivileged actions */ }
June 3, 2008 at 4:46 pm EDT
· Filed under Server Notices, apnscp esprit Updates
esprit currently supports WebDAV through Apache; however, there will be a WebDAV interface similar to cPanel’s “Web Disk” component. This will allow remote access to the filesystem from your desktop, without having to log in to the control panel. A rough cut of esprit’s WebDAV interface should be up by the weekend. This revolves around the File module and is actually a direct interface into esprit’s API.
Additional modifications have been made to Apache (the Web server) to skip the configuration rebuild when necessary; this is called a fast restart. This modified start-up type shaves 30 seconds off the total restart time. Configuration files will now be checked before a reload/restart as well. In the event of a syntax error, the Web server will postpone reloading. Finally, before starting Apache up again, the server will wait up to 15 seconds for all processes to exit. This is a workaround for the case when a restart occurs before Apache has had the opportunity to shut down 100% (still processing the remains of a request). The added pause is introduced in response to a false positive last night.
June 2, 2008 at 1:34 pm EDT
· Filed under Server Notices
- The servers now support e-mail addresses in the form user+extension@domain.com. +extension is implicitly checked in the lookup table in Manage Mailboxes. If an address named user+extension on the named domain does not exist, then Postfix will fall back to the address named user on the named domain. In other words, you may, for example, start providing the e-mail address user+paypal@domain.com without necessarily creating the e-mail account named user+paypal in the control panel.
- One-shot deliveries will be attempted on all Chinese domains due to the marked rise in backscatter/spam bots with .cn e-mail addresses. One delivery will be attempted and in the event of failure, instead of keeping the message in the queue for 7 days to repeatedly attempt delivery, the message will permanently fail. Unless you’re in direct communication with e-mail addresses ending in .cn that happen to be extremely unreliable, this will not affect you.
- vsftpd+ 2.0.6-ext1.1 has been deployed on all servers at this time. After examining the logs over the weekend on Assmule everything looked fine. Per-user configuration for vsftpd+ has changed. The full username@domain is no longer needed for both jailing and configuration.
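The user+extension fallback from the first item above can be pictured with a short Python sketch. It only imitates the lookup order described there (exact address first, then the address with the extension stripped); it is not Postfix code, and `resolve_recipient` and the `mailboxes` set are hypothetical names.

```python
def resolve_recipient(address, mailboxes, delimiter="+"):
    """Return the mailbox that would receive mail for `address`, or None.

    Lookup order: the full address as given, then the same address with
    the +extension removed from the local part.
    """
    local, at, domain = address.rpartition("@")
    if not at:
        return None  # not a user@domain address

    exact = f"{local}@{domain}"
    if exact in mailboxes:
        return exact

    base, plus, _extension = local.partition(delimiter)
    if plus and base:
        fallback = f"{base}@{domain}"
        if fallback in mailboxes:
            return fallback
    return None


if __name__ == "__main__":
    mailboxes = {"user@domain.com"}
    print(resolve_recipient("user+paypal@domain.com", mailboxes))
```

If a mailbox named user+paypal@domain.com is created later, the exact match takes over and mail stops falling through to user@domain.com.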