NHibernate really is a fantastic ORM… unless you use it badly. Or unless you use it kinda OK. Or unless you use it almost-but-not-quite-perfectly. Then it can be a right pain in the neck. There are a lot of things you can get wrong, and those things can cause you a world of pain.
The particular pain I dealt with recently was memory pressure.
One of the most crippling things you can do to a server is to fill up its RAM. Once your RAM is full, your OS has to start copying things out of RAM onto disk, and then back into RAM when you need to use them again. Disks are slow – much, much slower than RAM – and this is going to hurt your performance badly.
Picture this. You’ve done all the right things. Your software is built using a loosely-coupled Service-Oriented Architecture. You have a website, and it hands all sorts of tasks off to a separate service layer. You have a second service handling various data import tasks. As your load increases, it’s going to be very easy to scale horizontally: you can move your services off to separate servers, and the only thing you need to do is update a few network addresses. Once you expand beyond what you can handle with four servers (those three functions plus a separate database server), you can load-balance each function individually.
You’ve also decided to handle multiple tenants with multiple databases. This one isn’t the right decision in every situation, but there are lots of times when it makes sense, particularly if you’re holding a lot of data for each client. It makes it trivial to archive individual clients off. It makes it easy to offer different tiers of backup. It stops your row-counts from getting too high, and it isolates small clients from the performance headaches of the massive data sets maintained for larger clients.
NHibernate is going to kick you in the teeth for doing this.
We’ve been watching the problem approach for a while now. The base memory overhead for each process soared past a gigabyte some time ago. As our client list headed towards a hundred, our memory overhead headed towards two gigabytes per process. I didn’t need to run a memory profiler to know where the problem was (although I did use one to confirm my suspicions). The culprit was the NHibernate session factories. A single session factory can run towards 20 MB. With fifty clients, that means you have a full gigabyte of RAM filled with nothing but session factories. I didn’t want to have to start scaling horizontally early just because of this, and after all, this gigabyte of memory consisted of lots and lots of copies of 20 MB structures which were identical except for a single string: the database connection string. That’s horribly wasteful. (Actually, there were other differences, but we’ll get to those.) I also couldn’t start disposing of session factories once they hadn’t been used for a little while: these things take a while to construct, and we can’t let our users sit around for several seconds when they log in for the first time in a while. I needed to start re-using our session factories.
There are at least two approaches you can take here. The one I chose has two caveats: firstly, that you’re setting NHibernate.Cfg.Environment.ReleaseConnections to "on_close", and secondly, that you’re not using stateless sessions at all. We’ve been moving towards ditching stateless sessions for some time anyway, because stateless sessions don’t support listeners, so the second requirement wasn’t a problem for us. The first is a bit more troubling, because it’s legacy behaviour: rather than letting NHibernate manage connections using one of its newer strategies, it forces NHibernate to obtain a connection when a session is first opened, and to use that connection for the duration of the session. This was acceptable because we were already using the legacy setting, for reasons undocumented in either code comments or our source control history. I haven’t looked into the costs and benefits of this legacy mode compared to the other strategies.
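For reference, here is a minimal sketch of setting that release mode in code. It assumes you build your NHibernate Configuration programmatically; the class and method names (ReleaseModeSetup, Build) are made up for illustration, and your real setup will have mappings and other properties around it.

```csharp
using NHibernate.Cfg;

public static class ReleaseModeSetup
{
    // Hypothetical helper: returns a Configuration with the legacy release mode set.
    public static Configuration Build()
    {
        var cfg = new Configuration();

        // Environment.ReleaseConnections is the "connection.release_mode" property key.
        // "on_close" makes a session hold on to its connection until it is closed.
        cfg.SetProperty(NHibernate.Cfg.Environment.ReleaseConnections, "on_close");

        return cfg;
    }
}
```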
So, let’s dive into some code. First of all, you’re going to need to set your connection provider:
cfg.SetProperty(Environment.ConnectionProvider, typeof(SharedCompanyConnectionProvider).AssemblyQualifiedName);
Then, seeing as there’s no such thing as a SharedCompanyConnectionProvider, you’ll need to implement it!
public class SharedCompanyConnectionProvider : DriverConnectionProvider
{
    protected override string ConnectionString
    {
        get { return NHibernateSessionManager.Instance.DatabaseSettings.GetCurrentDatabaseConnectionString(); }
    }
}
If that looks a bit scary, good. If not, let me explain. Your connection provider is no longer thread-safe! It’s relying on a singleton which serves up a connection string. This is dangerous code, and you need to be careful how you use it. (Don’t even think of using this without putting some tests around it – see later in this post.)
Now, on to wherever it is you build your sessions. Mine looks something like this:
private static readonly object CompanySessionFactoryLockObject = new object();
...
lock (CompanySessionFactoryLockObject)
{
    var sessionFactory = NHibernateSessionManager.Instance.GetSessionFactory();
    NHibernateSessionManager.Instance.DatabaseSettings.SetCurrentDatabaseConnectionString(databaseGUID);
    ISession session = sessionFactory.OpenSession();
}
I’ve removed a lot of the detail, but that should give you the gist of what’s going on. The key component here is the lock() line. Now that our connection provider isn’t thread-safe, we have to ensure no other thread gets in between setting the connection string on the singleton and creating the actual session (at which time the connection provider will provide a connection using the current connection string).
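To see why the lock matters, here is a small self-contained sketch of the same set-then-read pattern with NHibernate taken out of the picture. Everything in it (CurrentTenant, LockedOpenDemo, the "Database=tenantN" strings) is a made-up stand-in for the singleton and the session-opening code above, not our real implementation.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the singleton that holds the "current" connection string.
public static class CurrentTenant
{
    public static volatile string ConnectionString;
}

public static class LockedOpenDemo
{
    private static readonly object Gate = new object();

    // Stand-in for "open a session": set the shared value, then read it back,
    // the way the connection provider reads it when the session is opened.
    public static string Open(string connectionString, bool useLock)
    {
        if (!useLock)
            return SetThenRead(connectionString);

        lock (Gate)
        {
            return SetThenRead(connectionString);
        }
    }

    private static string SetThenRead(string connectionString)
    {
        CurrentTenant.ConnectionString = connectionString;
        Thread.Yield(); // widen the race window for demonstration purposes
        return CurrentTenant.ConnectionString;
    }

    // Counts how often a caller got back a different tenant's connection string.
    public static int CountMismatches(bool useLock, int tenants = 4, int iterations = 10000)
    {
        int mismatches = 0;
        Parallel.For(0, tenants, t =>
        {
            string mine = "Database=tenant" + t;
            for (int i = 0; i < iterations; i++)
            {
                if (Open(mine, useLock) != mine)
                    Interlocked.Increment(ref mismatches);
            }
        });
        return mismatches;
    }

    public static void Main()
    {
        Console.WriteLine("with lock:    " + CountMismatches(useLock: true));
        Console.WriteLine("without lock: " + CountMismatches(useLock: false));
    }
}
```

With the lock in place the mismatch count is always zero, because the set and the read happen as one uninterrupted unit. Without it, the count depends on thread timing: it will usually be non-zero on a multi-core machine, but a race like this is never guaranteed to show up on any given run.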
The final step in the process is to make sure you have some thorough testing around what you’re doing. The risk of getting it wrong is that your session factory hands you a connection to the wrong database, and that could be very bad. I’m not going to run through the entire test setup, but it’s certainly not a unit test – this thing runs in a test suite which uses a real database instance and creates (in the case of this test) five complete databases which we’ll be accessing from various threads.
private static volatile string _assertFailed;
private const int NumThreadsPerDb = 2;

[Test]
public void HammerMultipleDatabasesSimultaneously_BehavesWell()
{
    List<Thread> runningThreads = new List<Thread>();
    foreach (var coGuid in companyGuids)
    {
        for (int i = 0; i < NumThreadsPerDb; i++)
        {
            var thread = new Thread(StartHammering);
            runningThreads.Add(thread);
            thread.Start(coGuid);
        }
    }

    // Poll until every worker has finished; if any worker reports a failure, stop the rest.
    while (runningThreads.Any(thread => thread.IsAlive))
    {
        if (_assertFailed != null)
            runningThreads.ForEach(thread => thread.Abort());
        else
            Thread.Sleep(1000);
    }

    if (_assertFailed != null)
        Assert.Fail(_assertFailed);
}

public void StartHammering(object companyGUIDObj)
{
    // nb don't assert on a thread. We're set up to set a message into _assertFailed instead.
    var CompanyGUID = (Guid)companyGUIDObj;
    string expectedDbName = CoDatabaseNames[companyGuids.IndexOf(CompanyGUID)];
    try
    {
        Entity entity = new Entity();
        using (var session = NHibernateSessionManager.Instance.GetNewSession(CompanyGUID))
        {
            // Set up the entity with some unique data
            session.Save(entity);
            // Flush so the insert is written before the session is disposed
            // (not needed if your session manager already wraps this in a transaction).
            session.Flush();
        }

        for (int i = 0; i < NumTests; i++)
        {
            using (var session = NHibernateSessionManager.Instance.GetNewSession(CompanyGUID))
            {
                if (!session.Connection.ConnectionString.Contains(expectedDbName))
                    throw new Exception("Got a connection for the wrong database!");

                var ent = session.Get<Entity>(entity.GUID);
                // Check some unique thing about the entity. Change it to something else for the next iteration.
            }
        }
    }
    catch (ThreadAbortException)
    {
        // Another thread already failed and the test is shutting us down.
    }
    catch (Exception ex)
    {
        if (!ex.ToString().Contains("ThreadAbortException"))
            _assertFailed = ex.ToString();
    }
}
There’s a lot going on there. The key theme is that we’re creating a bunch of threads, and each thread is assigned to a particular database. New sessions are continuously created, and then queried to ensure they contain the expected object. If the object is not found, or the session has the wrong connection string, then something has gone wrong, and the whole system isn’t behaving in a thread-safe fashion.
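Note that in a multi-threaded test situation, you cannot just throw an exception or fail an assertion on a worker thread if something goes wrong: it isn’t reliably reported as a test failure, and an unhandled exception on a worker thread will typically take down the whole process instead. You need to pass information about the failure to your primary thread.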
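If you’d rather not poll and abort threads (Thread.Abort isn’t supported on .NET Core and later), a different way to get failures back to the primary thread is to collect them in a thread-safe queue and join every worker. This is a sketch of that alternative, not what our test suite does; WorkerFailures and RunAll are names invented for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public static class WorkerFailures
{
    // Runs each action on its own thread and returns every exception thrown,
    // so the calling (test) thread can decide whether to fail.
    public static List<Exception> RunAll(IEnumerable<Action> work)
    {
        var failures = new ConcurrentQueue<Exception>();
        var threads = new List<Thread>();

        foreach (var action in work)
        {
            var thread = new Thread(() =>
            {
                try { action(); }
                catch (Exception ex) { failures.Enqueue(ex); }
            });
            threads.Add(thread);
            thread.Start();
        }

        foreach (var thread in threads)
            thread.Join(); // wait for every worker before inspecting failures

        return new List<Exception>(failures);
    }
}
```

The test method would then call RunAll with one action per worker and assert on the primary thread that the returned list is empty. The trade-off is that every worker runs to completion even after the first failure, which may be slower than aborting early.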
One final (and important) step is to ensure the test does fail appropriately if the system doesn’t behave as expected. Remove the lock statement around your session creation code and run the test; you should see it fail. Adding the lock back in should fix it.