[TYPO3-english] TYPO3 CMS, multiple app servers and load balancing

Daniel Neugebauer mailinglists at energiequant.de
Wed Apr 20 21:38:02 CEST 2016


Hi!

We are running one 4.5 website on two backend servers behind HAProxy
(upgrade to 7 LTS planned, read the end). Both servers are standing next
to each other and are linked directly via a second ethernet card. That
setup was originally meant only to provide basic failover
(if one server fails, switch as seamlessly as possible to a standby
server), but then, why not do generic load balancing while everything is
working normally, when you can get LB "for free"? (Otherwise one server
would just be running idle for years until something finally fails...
and maybe, in the worst case, that server then also fails unexpectedly
as it wasn't continuously tested, oops...?! ;) )

As TYPO3 isn't a persistent web application (meaning each request
boots everything up from scratch with nothing shared across page calls,
as would be possible with e.g. Java web applications), all you basically
need is a shared database, a shared filesystem and a frontend proxy
server. Instead of sharing sessions you may also just use "sticky
sessions" for requests which need persistence (most don't) to forward
consecutive requests to the same backend server.

In our setup:

 - MySQL master on server A, MySQL slave on server B, all client
   connections run through HAProxy. HAProxy queries special check
   scripts to make sure that only the master server is accessed by
   clients. If replication breaks, the slave reconfigures itself to
   master mode within 10 seconds. The check scripts for HAProxy notice
   it within 1 or 2 seconds and the proxy directs all client connections
   to server B until manually resolved. This leaves a very small chance
   of "split-brain" operation for a limited time but should be good
   enough for simple websites (mind that it is not suitable for
   shops/registration systems).

 - HAProxy triggers sticky sessions on TYPO3 login and POST requests.
   This pins all subsequent requests from the same browser
   session to one backend server and eliminates the need for sharing
   sessions across servers. Depending on your specific website you may
   need to handle this differently (e.g. you could set sticky sessions
   depending on client IP addresses or assign a backend randomly per
   browser session).
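
The two rules in the list above can be sketched in a few lines of
Python. This is only an illustration of the decision logic, not the
actual check scripts or proxy configuration: the NodeState fields, the
function names and the "/typo3/" login prefix are assumptions made up
for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class NodeState:
    """Hypothetical snapshot of one MySQL node, as a check script might see it."""
    reachable: bool   # the MySQL server answered at all
    read_only: bool   # True while the node is acting as a slave


def check_status(node: NodeState) -> int:
    """Return the HTTP status a check script could report to the proxy.

    200 keeps the node in rotation for client connections, 503 takes it
    out. Only a reachable node in master (writable) mode is usable.
    """
    if node.reachable and not node.read_only:
        return 200
    return 503


def needs_sticky(method: str, path: str,
                 login_prefixes: Tuple[str, ...] = ("/typo3/",)) -> bool:
    """Decide whether a request should pin its browser session to one backend.

    Follows the rule described above: logins and POST requests.
    The default login prefix is an assumption for illustration only.
    """
    if method.upper() == "POST":
        return True
    return any(path.startswith(prefix) for prefix in login_prefixes)
```

With this logic a slave that has promoted itself to master starts
answering 200, while the old master (if it is still alive but
unreachable or read-only) answers 503, so the proxy only ever sees one
usable node.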

Unfortunately, TYPO3 works with a lot of temporary files. We thought it
would be likely to fail if we didn't use one consistent shared file
system across both servers. For example, server A may generate a
temporary image file when initially generating a page but the next
client request referring to that file may be routed to server B instead
- we would have to make sure the file is available from both servers "at
the same time". Just rsync'ing every minute or so (as would be
possible for more static websites/CMS) didn't seem sufficient, so we
didn't even bother to try it. The biggest issue was actually finding
a shared file system which requires just two physical nodes. As far as
we could tell, there is only one working solution: GlusterFS.

Allow me to elaborate a bit on our current network FS setup as I feel
it's the most essential part:

|==========|                             |==========|
| Server A |                             | Server B |
|==========|                             |==========|
|  Host OS |                             |  Host OS |
|  Gluster | <== Gluster Replication ==> |  Gluster |
|----------|                             |----------|
|    ^     |                             |    ^     |
|    | NFS |                             |    | NFS |
|    v     |                             |    v     |
|----------|                             |----------|
|  Web VM  |                             |  Web VM  |
|==========|                             |==========|

The GlusterFS daemon runs on the host OS of our servers. It uses regular
file system mounts to store all content and keeps metadata mostly in the
file system's extended attributes. Both daemons connect over the direct
ethernet link and replicate all changes to each other. We use different
VMs separated by function; the one which connects to GlusterFS is our
"Web VM" running TYPO3. Each of these VMs connects to its local Gluster
host. Mounting the file system is possible via either NFS or a dedicated
GlusterFS FUSE module (which then supports extra features).

While GlusterFS has been very stable so far, we encountered a few pitfalls
when implementing this setup. In short: Don't trust the official
documentation!

 1. Performance is absolutely horrible (at least for TYPO3) when using
    the FUSE module for mounting Gluster on the clients ("Web VM").
    I am absolutely sure the documentation stated at one point that it
    should automatically fall back to NFS mounts if reasonable - this
    is not the case. Mounting the volume via NFS reduced response times
    for plain files to less than 10% of what we saw with FUSE,
    drastically reduced network overhead and cut TYPO3 page generation
    times to roughly 30% (from 1.8 seconds down to ~500ms). The
    documentation is contradictory at best; while one page says FUSE is
    the only recommended way to mount the FS, another recommends using
    NFS for read-heavy operations.

 2. Online upgrades don't work. At least they didn't for us. If you
    don't read the documentation *very* carefully you will easily miss
    that the section about online upgrades in the copy-pasted upgrade
    instructions is missing but still referred to everywhere
    (great if you just read the same paragraphs 4 times for 4 different
    releases). Therefore, I would only recommend performing upgrades
    during planned cluster-wide downtime. Not ideal for an HA solution...

 3. Upgrade process is not fully documented. You need to perform
    additional steps to get rid of various error messages and/or enable
    new features. Those steps are only documented on mailing lists and
    bug reports. :(

 4. Just because you are upgrading from version x.y.z to x.y.(z+1)
    (commonly assumed to be a "bugfix release") doesn't mean the
    versions will be compatible until all nodes have been upgraded and
    the cluster restarted...
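
Regarding item 1: a comparison between two mount types can be
reproduced with a very small read benchmark. The following Python
sketch is not the measurement behind the numbers above; the function
names are made up for this example, and any mount paths you pass in
are your own.

```python
import os
import time
from typing import Tuple


def time_reads(root: str, limit: int = 500) -> Tuple[int, float]:
    """Read up to `limit` regular files below `root`.

    Returns (number of files read, elapsed seconds).
    """
    count = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if count >= limit:
                return count, time.perf_counter() - start
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip sockets, broken symlinks etc.
            with open(path, "rb") as handle:
                handle.read()
            count += 1
    return count, time.perf_counter() - start


def ratio_percent(after_seconds: float, before_seconds: float) -> float:
    """Express `after_seconds` as a percentage of `before_seconds`."""
    return 100.0 * after_seconds / before_seconds
```

Run time_reads() once against the same directory on each mount and
compare the elapsed times. Mind that client-side caching can distort
repeated runs, so results are only a rough indication. For the page
generation times quoted in item 1, ratio_percent(0.5, 1.8) gives about
27.8, i.e. roughly 30%.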

So, in retrospect, I would likely choose another file system if I had
more than 2 servers available (all other solutions require at least 3
nodes, unfortunately).


Note that we have not upgraded that website to 6.2 or 7 LTS yet (however,
finally, a relaunch with 7 LTS is planned for this summer). To
complicate everything, TYPO3 6 introduced what I would call "caching
hell" - I can hardly think of anything which is accessed directly
without copying it to typo3temp etc. before use... There may be a chance
of producing race conditions with GlusterFS replication after caches are
cleared and both servers produce the same files concurrently, but I hope
(given the documentation isn't reliable) it will work just fine. If
anyone has any thoughts on it or knows if TYPO3 releases after 4.5 work
equally well (or not at all) on GlusterFS, please let me know. :)
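
For reference, a common general technique against readers seeing
half-written files is to write to a temporary file in the same
directory and then rename it over the target. The Python sketch below
only illustrates that pattern; it is not a description of how TYPO3
writes its caches, the function name is made up, and whether the
rename stays atomic across replicated GlusterFS nodes is an assumption
that would need testing.

```python
import os
import tempfile


def write_atomically(target: str, data: bytes) -> None:
    """Write `data` to `target` via a temporary file plus rename.

    On a local POSIX file system, readers see either the old or the new
    content, never a partial file. Behaviour on a replicated network
    file system is an assumption, not a guarantee.
    """
    directory = os.path.dirname(target) or "."
    # The temporary file must live in the same directory (same file
    # system) as the target, otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, target)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If both servers generate the same file concurrently, each would write
its own temporary file and the last rename wins, which is harmless as
long as both produce identical content.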

Hope it helps,
Daniel

