My thoughts on Core Data

March 8th, 2010

Over the last few weeks, there has been a lot of discussion about Core Data and why it may or may not be a good idea. There was Matt Gallagher’s post, “The differences between Core Data and a Database”. Then Brent Simmons kicked off a whole slew of other posts with “On switching away from Core Data”, such as:

There are others as well. I was dying to post about this but was even more determined to get my first release out the door. Now that that’s done, I figured I’d share my thoughts. I think Gus comes closest to my own feelings when he says he doesn’t like its smell. Although maybe he didn’t need to be beaten on the head as much as I did to come to this realization (someone not liking EJB is just proof that they’re not totally nuts). The problem I see with Core Data isn’t with Core Data in particular, but with object-relational mapping as a whole.

I have used a lot of object-relational mapping tools: Hibernate, NHibernate, ActiveRecord (from Rails), DataMapper (another Ruby one), and of course, Core Data itself (I’ve never used EOF, however). I’ve also created custom object-relational mappers (several times, in different languages) on top of very large data models. After all this, my conclusion is that it’s best to avoid having to do this mapping at all.

Here’s my theory on how all this started. In the beginning, there were databases, and we put data in them, and life was good. Well, really it wasn’t, because then along came object-oriented programming, and we realized that life would be better if we used it and had objects instead of data. Of course we still needed a place to save the objects, so we figured out many different ways to shove our objects into a database. Fast-forward many years and now it’s basically standard practice: make your objects, convert them into data for databases, then get them back out, converting them back into objects.

Really, all of these various tools do an incredible job of managing this, assuming you let them own the data model completely (I won’t talk about using them on legacy systems since it’s a nightmare and doesn’t really relate to Core Data), and they get you a bunch of things:

  • In general switching to different SQL databases isn’t too bad
  • You have some powerful searching capabilities at your hands
  • Depending on your backend, it can scale (although Core Data’s failure to scale in certain situations is what started this whole thing off)
  • You have all the other potential side benefits of a database – transactions, sharing data between clients, etc.

Sometimes this stuff is important. In Core Data’s case, and in the case of almost any kind of desktop application (where the database is local), this is all irrelevant or untrue. In the meantime you’ve set yourself up with the biggest limitation of a SQL database: all of your objects must be mapped to tabular data. In the simple cases, this isn’t a big deal. However, the more complicated your object model gets, the more complicated this mapping gets, or the more you’re constrained in how you can even design your object model. In short, the fewer translations your objects/data/whatever must go through, the better.

It’s easy to see the evidence of this impedance mismatch in Core Data. You want ordered collections? You’ll have some work to do. Or how about dealing with the synthetic ids of newly created objects? Or (common to all object-relational mappers) the hackery that goes on to deal with inheritance? All of these problems are solvable, but they’re just a few indications that SQL databases might not be ideal for storing objects.
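To make the ordered-collection complaint concrete, here is a rough sketch of what the mapping tends to look like when you do it by hand. This is not how Core Data works internally; it is plain SQLite driven from Python, and the `Playlist` class, the `track` table, and the `position` column are all names I made up for illustration. The point is only that a list, which is ordered for free in the object model, needs an explicit index column and an `ORDER BY` to survive the round trip, and that the object’s id does not exist until the row has been inserted.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class Playlist:
    # Hypothetical model object: the list keeps its order for free.
    name: str
    tracks: list = field(default_factory=list)


def save(conn, playlist):
    """Flatten the playlist into rows, storing each track's index explicitly."""
    cur = conn.execute("INSERT INTO playlist (name) VALUES (?)", (playlist.name,))
    playlist_id = cur.lastrowid  # the synthetic id only exists after the insert
    conn.executemany(
        "INSERT INTO track (playlist_id, position, title) VALUES (?, ?, ?)",
        [(playlist_id, i, title) for i, title in enumerate(playlist.tracks)],
    )
    return playlist_id


def load(conn, playlist_id):
    """Rebuild the object; the order comes back only because we sort on position."""
    (name,) = conn.execute(
        "SELECT name FROM playlist WHERE id = ?", (playlist_id,)
    ).fetchone()
    rows = conn.execute(
        "SELECT title FROM track WHERE playlist_id = ? ORDER BY position",
        (playlist_id,),
    )
    return Playlist(name, [title for (title,) in rows])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE playlist (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE track (playlist_id INTEGER, position INTEGER, title TEXT)")

original = Playlist("Mix", ["Intro", "Chorus", "Outro"])
playlist_id = save(conn, original)
restored = load(conn, playlist_id)
```

Even in this toy version, inserting a track in the middle of the list would mean renumbering every later row, which is the kind of extra work I mean.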

So what’s the solution? I’m not sure. No, neither CouchDB nor my own CouchDB-inspired BRDocBase are really answers (in case you thought that’s where I was headed). They’re not intended for generic object persistence. Whatever happened to object databases? I know some are still around, and I’ve seen really cool stuff done in Smalltalk with them. Of course ‘seeing cool stuff done’ is not at all the same as actually developing with one, which I have zero experience with. I’d love to see an Objective-C based OODB, and have often thought of developing one for fun, but I know that if I’m not using it in an actual product it wouldn’t come out right (that’s just me…I need to be actually using something to see if it is a good idea or a horrible one).

As mentioned by a few others, Aaron Hillegass is working on BNRPersistence, which sounds interesting. I haven’t looked deeply into it, and Brent Simmons mentioned the LGPL issue of Tokyo Cabinet. I’m under the impression that including an LGPL component is fine for non-free, closed-source products, as long as you share the source (and your changes) of that component (but I’m definitely no lawyer). I’m certainly planning on taking a better look at it though.

Before I finish, I just wanted to make clear that I am not at all saying ‘Core Data sucks, don’t use it’. It’s still the best general-purpose solution for persisting objects for a desktop Cocoa application (although I really wish they’d add ordered collections). I would say that if your data is more tabular in nature and you do more of the kinds of operations SQL is good at (searching, scaling to lots of rows, etc.), you might consider doing your own database access. But even if we had an awesome Cocoa OODB available to us, I’d say the same thing (the right tool for the right job). The main point of my lengthy semi-rant is that it seems the development community as a whole has gone down this path of object-relational mapping and accepted it as The Way to be persisting our objects, and I just wish we had better options.
