Deleting RavenDB View Models Before Replaying Your Events

I’ve recently been looking for a clean way to programmatically blow away the view models I have stored in RavenDB as part of a new CQRS project I’m working on, so that I can rebuild them by replaying my events. Whilst the rebuilding part is fairly simple, the deleting part isn’t. At first glance there is no obvious answer as to how to get rid of them all. I did find this post, but I wanted something a bit cleaner. The problem is that each document type you define gets stored in its own collection, so when you want to replay your events to rebuild those view models you somehow need to delete them all first. You might be tempted to query for the documents and then pass each one to the delete method on the session:

using (var session = store.OpenSession())
{
    // Note: an unbounded query like this only returns a single page of
    // results (128 documents by default), so it wouldn't even find them all.
    var viewModels = session.Query&lt;SampleViewModel&gt;();
    foreach (var vm in viewModels)
    {
        session.Delete(vm);
    }
    session.SaveChanges();
}

but this isn’t a good approach at all, as you don’t want to have to load the documents into memory just to delete them. Even if it were a good idea, you’d have to repeat that code for each view model document type, and update it every time you added a new one. Not good at all.

Luckily, RavenDB supports a set-based delete operation, DeleteByIndex, which takes the name of an index that can be used to return all the view model documents:

using (var session = store.OpenSession())
{
    // DeleteByIndex is a database command that executes directly against
    // the server, so no SaveChanges() call is needed.
    session.Advanced.DatabaseCommands.DeleteByIndex("ViewModelIndex",
        new IndexQuery());
}

The question is what should the index definition look like?

Ordinarily you would define an index in code along the following lines:

public class ViewModelIndex : AbstractIndexCreationTask<SampleViewModel>
{
    public ViewModelIndex()
    {
        Map = docs => from doc in docs
                      select new { Id = doc.Id };
    }
}

but again this index is tied to a specific view model type, in this case SampleViewModel. What I really wanted was an index that would return documents from multiple collections: one I could use to find all my view models no matter what their type. After quite a bit of experimentation and a few dead-ends, I finally hit upon a way of achieving my goal. The trick is to take advantage of a piece of metadata that every document in RavenDB carries, and to follow the “convention over configuration” approach. Take a look at this index definition:

public class ViewModelIndex : AbstractIndexCreationTask<object>
{
    public ViewModelIndex()
    {
        Map = docs => from doc in docs
                      where MetadataFor(doc)["Raven-Clr-Type"]
                            .ToString().Contains("ViewModels")
                      select new {};
    }
}

This index looks at the “Raven-Clr-Type” metadata on each document, which contains the fully qualified type name. If that name contains the word “ViewModels”, the document is added to the results. The convention, therefore, is simply that all your view model documents must live in a namespace containing the word “ViewModels”. The result is a set of documents from multiple collections which can be used with the DeleteByIndex method. Once your views have been destroyed, it’s a simple matter to reload your events and push them through your event handlers again.
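Putting the pieces together, a full rebuild might look something like the sketch below. IndexCreation.CreateIndexes and DeleteByIndex are part of the RavenDB client; the IEventStore and IEventDispatcher types and the method names on them are hypothetical stand-ins for whatever your own project uses:

```csharp
// A rough sketch of a full view model rebuild. eventStore and dispatcher
// are hypothetical stand-ins for your own event store and event handler
// dispatcher; substitute whatever your project actually uses.
public void RebuildViewModels(IDocumentStore store,
                              IEventStore eventStore,
                              IEventDispatcher dispatcher)
{
    // Make sure the index exists (safe to call repeatedly).
    IndexCreation.CreateIndexes(typeof(ViewModelIndex).Assembly, store);

    // Blow away every document the index returns, i.e. all view models.
    store.DatabaseCommands.DeleteByIndex("ViewModelIndex", new IndexQuery());

    // Replay every stored event through the event handlers to
    // regenerate the view models from scratch.
    foreach (var @event in eventStore.GetAllEvents())
    {
        dispatcher.Dispatch(@event);
    }
}
```

Note that a freshly created index needs time to finish indexing before DeleteByIndex will see all the documents, so in practice you may want to wait for non-stale results first.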

I am quite new to RavenDB, so there may well be an alternative or better approach; if so, I’ve yet to find it, and I’d be interested to hear from anyone who has. In the meantime, this solved my dilemma neatly.


CQRS – Use Your Common Sense

Jimmy Bogard just published a timely and excellent article on CQRS with regard to how the UX should be implemented (synchronous vs asynchronous), a topic I’ve wrestled with many times, and something that trips up a lot of people who expend a lot of effort jumping through hoops to make the UI look synchronous even when it isn’t. This is because nearly all of the guidance you’ll find on CQRS describes the implementation this way, and people read it and then follow it dogmatically. There are other areas of CQRS that seem to cause people to make life unnecessarily hard for themselves too, and I want to touch on those as well. Regardless, whatever you might be struggling with, it all essentially boils down to standing back and questioning whether what you are doing is right for YOUR system.

The asynchronous nature of updating the read side is there to facilitate cheap horizontal scaling to handle large numbers of users issuing lots of read requests, but the simple truth is that a lot of us work in environments where that requirement just doesn’t exist. Unless you happen to know that your system is going to serve such a large audience, my advice would be to start with a completely synchronous system, so that by the time your command has been processed and the projections have been created, you can confidently redirect to a view knowing full well that the user will have the most up-to-date data on their screen. The fact that you’re using commands, command handlers, events, event handlers and so on means that should the need arise to go async, you’ll be well placed to make that leap without much rework; until then, YAGNI. Keeping it simple means you can make a lot of progress very quickly instead of hitting stumbling blocks right away.
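To make that concrete, here’s a minimal sketch of a fully synchronous pipeline: the command handler publishes its event in-process, the projection updates the read model before the handler returns, and the UI can then safely redirect to a view. All the names here (RenameCustomerCommand, CustomerRenamed, and so on) are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command and event types, for illustration only.
public class RenameCustomerCommand { public string Id; public string NewName; }
public class CustomerRenamed      { public string Id; public string NewName; }

public class CustomerCommandHandler
{
    private readonly Action<CustomerRenamed> publish;

    public CustomerCommandHandler(Action<CustomerRenamed> publish)
    {
        this.publish = publish;
    }

    public void Handle(RenameCustomerCommand cmd)
    {
        // ...validate and update the write model here...
        // Publishing in-process means the projection has been updated by
        // the time this method returns, so the UI can redirect immediately
        // and the user sees up-to-date data.
        publish(new CustomerRenamed { Id = cmd.Id, NewName = cmd.NewName });
    }
}

public class CustomerViewProjection
{
    public readonly Dictionary<string, string> NamesById =
        new Dictionary<string, string>();

    public void Handle(CustomerRenamed e)
    {
        NamesById[e.Id] = e.NewName; // update the read model synchronously
    }
}
```

Swapping the in-process publish for a message bus later is then a localized change; the commands, events, and handlers themselves stay exactly as they are.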

If you’re the dev on the team who wants to get his colleagues interested in CQRS, then trying to explain why the UI should be asynchronous is going to be a tough sell. It feels intuitively wrong to people because it’s just not something they’ve ever done before, and if they feel like they’re having to fudge the UI to make it “look” synchronous, as is often discussed, it’s going to strengthen any resistance they may already have. Likewise with event sourcing. Event sourcing is often seen as synonymous with CQRS, but again, this is absolutely not the case. Yes, it has some incredible benefits, but it comes with a requirement to think about current state differently, something I’ve seen people really struggle to grasp. If you’re looking to introduce CQRS into your shop, start simply and store state as you’re used to doing. When the separation of reads from writes becomes an established part of your architecture and people begin to see the benefits, that is the time to think about introducing things like event sourcing, as they are much more likely to be receptive to the idea. I personally believe that there are so many game-changing concepts associated with CQRS that trying to bring them all to the table at once is likely to result in resistance and/or failure.

Other CQRS guidance taken too far can be found in the questions posted to the DDD/CQRS Google group or Stack Overflow, where developers can be found trying to do things that seem a poor fit for CQRS. One such example is the discussion centered around how to implement security features such as user registration and logging on using commands, whilst interrogating the read model to enforce uniqueness of usernames or email addresses. Whenever I see this I always wonder why they’re jumping through hoops to implement something that, to my mind, has nothing to do with CQRS. As Udi Dahan and Greg Young say, CQRS is not a top-level architecture; it’s something you implement in a particular bounded context that you’ve identified as being suitable for CQRS. User registration and logging on to a system is not suitable, at least to me. Why make it harder for yourself by implementing something with CQRS principles that could otherwise be solved in a much simpler manner? Is there a business requirement to track registrations and log-ins? Even if there is, a simple row in an audit table would do the job.

In my workplace we’ve just begun to prototype a new application and have decided to use CQRS. Not because we need scalability, and not because we need the ability to replay events: the most important reason is that we want the simpler, cleaner architecture that separating reads from writes affords us. We’re going with RavenDB to cut out the tedium of mapping SQL with an ORM, and also because it gives us transactions, something most NoSQL databases don’t have (due to their scalability goals). We update the UI in a synchronous manner and employ NServiceBus for async operations. Yes, that means using the MSDTC, but so what? The point is, we don’t feel the need to follow all the guidance to the letter. We’ve simply looked at what our requirements are and have taken what we think will work for us, aiming to keep it as simple as possible. All other decisions are deferred until they’re shown to be needed.

I believe that if more emphasis were placed on the read/write separation benefits of CQRS, less on the async nature of updating read models, and less still on event sourcing, we might get more people thinking about adopting CQRS, and hopefully see fewer traditional CRUD architectures. If CQRS is on the horizon for you and your team and you’re unsure of the team’s current skill level, I’d say keep it as simple as possible to begin with, and try not to over-engineer it from the start. The architecture is such that if the need arises for some of the more complex aspects, it should be relatively easy to evolve in that direction when the time is right. Just use your common sense and ignore anyone telling you you’re doing it wrong.

Finally, check out Rob Ashton’s blog for a breakdown of the simpler ways to implement CQRS.
