One of the projects we are working on at Zimbra is project “Always On”. The goal is very simple: Email and collaboration should be “always on” for end users. It’s no secret that email is the number one tool used for business collaboration. Project Always On solves key issues related to operating a platform built to be the central hub of collaboration delivered as a cloud service.
Our Design Goals
We started with some very basic design goals based on what we believe are the key attributes of a cloud service.
- Inherently resilient to failure
- Scaling should be elastic based on workload demands
- The software can be enhanced without service disruption
- Efficient usage of commodity hardware resources
As we thought about these design goals, we arrived at a few technical capabilities and basic architectural requirements to achieve them.
- No single points of failure in the application components
- Separating the application code from the data
- Distributing state information across commodity storage
- Automatic failover of application and data storage components
- Automatic load balancing of client requests across the application and data layers
How Zimbra is Changing
A great deal of work goes into re-architecting software, so we have been doing this over several phases. Our MTA and Proxy components already have load balancing and failover capabilities, and we introduced multi-master replication for our LDAP directory in Zimbra 8.0 last year. Our focus has now shifted to our mailbox server. We have spread this work across two phases.
Phase 1
Phase 1 work is centered on breaking up the application and data components of the mailbox server.
- Splitting the web client code from the server logic. Today all of our static HTML, JavaScript and Java code runs in the same Jetty instance. We are splitting it apart for two reasons: to enable UI customizations and code changes in real time, and to allow the web app and mailbox services to be optionally deployed on separate servers. In larger environments, deploying and scaling these components on separate servers will improve overall user density.
- Separating our application code from the data stores. Today our Java application code running in a Jetty instance has an affinity to the instance of MySQL (soon to be MariaDB) running on the same server. We have been refactoring and enhancing our application code to remove this dependency for two reasons: to enable MariaDB to be optionally deployed and scaled in a separate data services layer, and to enable distributed metadata through a MariaDB Galera Cluster.
- Switching from MySQL to MariaDB. We have been following the MariaDB community for a while. The improvements in performance and the speed at which bugs and security issues are resolved are compelling reasons on their own. Now that SkySQL is providing commercial support, we felt it was the right time to make the switch. And switch we did: it took us about 5 minutes to replace MySQL with MariaDB, about 3 minutes to replace Connector/J with the MariaDB Java Client, and one bug fix.
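The driver swap is small because the MariaDB Java Client is designed as a drop-in replacement for Connector/J: essentially only the driver class and the JDBC URL scheme change, and the MariaDB driver even accepts existing `jdbc:mysql://` URLs. A minimal sketch; the host, port, and database names below are placeholders, not Zimbra's actual configuration:

```java
// Sketch of the Connector/J -> MariaDB Java Client swap.
// Only the driver class name and URL scheme differ; all JDBC API
// calls (DriverManager, Connection, PreparedStatement) stay the same.
public class DriverSwap {
    // Before: MySQL Connector/J
    static final String MYSQL_DRIVER = "com.mysql.jdbc.Driver";

    // After: MariaDB Java Client (also accepts jdbc:mysql:// URLs)
    static final String MARIADB_DRIVER = "org.mariadb.jdbc.Driver";

    // Build a JDBC URL for either driver; "db-host"/"zimbra" are
    // placeholder names for illustration only.
    static String jdbcUrl(String scheme, String host, int port, String db) {
        return "jdbc:" + scheme + "://" + host + ":" + port + "/" + db;
    }

    public static void main(String[] args) {
        System.out.println(jdbcUrl("mysql", "db-host", 3306, "zimbra"));
        System.out.println(jdbcUrl("mariadb", "db-host", 3306, "zimbra"));
    }
}
```

Because the wire protocol is compatible, the rest of the application's JDBC code is untouched by the switch.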
Phase 2
Phase 2 work is centered on distributed data and install/upgrade orchestration.
- Mailbox Metadata. As I mentioned earlier, we will be using MariaDB Galera Cluster to implement a scale out, shared nothing, active-active design for our mailbox metadata.
- Search indexes. We are looking at using Apache SOLR for distributed indexing building on our current use of Apache Lucene.
- Distributed Blobs. We plan to continue to enhance our StoreManager API and implement an S3 compatible interface for use with cloud storage solutions like Scality and Amazon. We have also changed the pathing of blob data on the file system to include a “node name”. This will enable the use of clustered/distributed file systems like NFS and Ceph using their POSIX file system interface.
- Rolling updates and upgrades. We want to enable software updates and even version upgrades to coexist with previous versions and to be installed across each application component layer without user disruption.
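To give a feel for the Galera-based metadata design mentioned above, here is a minimal my.cnf fragment of the kind a three-node MariaDB Galera Cluster typically requires; the host names and cluster name are placeholders, not Zimbra's actual configuration. Galera requires row-based binary logging, InnoDB as the storage engine, and interleaved auto-increment locking:

```
# Hypothetical my.cnf fragment for a three-node Galera cluster
[mysqld]
binlog_format            = ROW
default_storage_engine   = InnoDB
innodb_autoinc_lock_mode = 2
wsrep_provider           = /usr/lib/galera/libgalera_smm.so
wsrep_cluster_name       = zimbra_metadata
wsrep_cluster_address    = gcomm://db1,db2,db3
wsrep_sst_method         = rsync
```

With synchronous multi-master replication, any node can serve reads and writes, which is what makes the shared-nothing, active-active design possible.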
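The node-name pathing described for distributed blobs can be illustrated with a short sketch. The directory layout, method name, and file naming below are hypothetical, for illustration only, and are not Zimbra's actual StoreManager implementation:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical illustration of node-scoped blob pathing: inserting a
// node name into the on-disk path lets multiple mailbox nodes share a
// clustered/distributed file system without path collisions.
public class BlobPath {
    static Path blobPath(String storeRoot, String nodeName,
                         long mailboxId, int itemId, int revision) {
        return Paths.get(storeRoot, nodeName,
                Long.toString(mailboxId),
                itemId + "-" + revision + ".msg");
    }

    public static void main(String[] args) {
        System.out.println(blobPath("/opt/zimbra/store", "node1", 42, 1001, 1));
    }
}
```

Because the node name is part of the path, a POSIX mount of NFS or CephFS can back every node from one namespace.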
There is so much more work going into the Always On project than I've covered here. I'll post some demos of our Always On lab soon, as well as some of the other features we are working on for future releases.
This sounds great!
Hopefully some of these will make it to the OSS version.
Glad to see a switch to MariaDB. I've really come to dislike Oracle recently, so I'll be glad to see the move from MySQL to MariaDB. I made the same switch on a couple of my own (non-email) servers and can say I've been pleased with MariaDB so far. No issues as of yet (5.5.x versions).
But yeah, it’s great to see Zimbra is still moving along under its new management. Keep it up, and good luck :-)
I wonder why you are not switching to PostgreSQL instead?
>>It took us about 5 minutes to replace MySQL with MariaDB and about 3 minutes to replace ConnectorJ with the MariaDB Java Client and 1 bug fix<<
.. could this be main reason ? :)
As I read this I sighed with relief. Looks like Zimbra is finally getting the attention it deserves.
Any updates?
When can we expect this release???
These are good features. Will they be available in the Zimbra open source edition as well?
This is a great improvement as we understand.
We have always had difficulties implementing HA solutions with Zimbra (VMware HA involves additional components and is not true HA). Clustering is also not the right solution for the new generation of collaboration suites.
The new architecture should enable geographically dispersed servers servicing both active and passive mailboxes (in Exchange parlance, something similar to a DAG).
This movement will actually encourage partners like us a lot to invest further on Zimbra resources and stay positive.
Can you share the tentative schedule for the Phase 1 and Phase 2 implementations?