Is One Repository Even an Option?
When I present the FatWire product architecture, I am often asked THE question: Is all my content stored in one repository? When I say “no”, I get a look that is usually reserved for misbehaving relatives or the seriously ill.
Alfresco has also been getting a lot of heat lately for not storing all its content in one repository. The product apparently has two different repositories, one for web content and one for document management and collaboration.
People seem to want to store all their stuff in one place. Let me ask you this, though … Is this even possible?
We traditionally defined content by what we could NOT do with it. We knew we could not store content in the database. Some people tried. It did not work. It had to be stored elsewhere, in some other repository. You have one database. By analogy you have to have one of these other repositories to store all the stuff that cannot go into the database.
Time goes on, however. Just as Eskimos have 20 terms for snow, we content pros now have as many terms for content. We now define content by what we CAN do with it: collaborate on it, manage it for compliance, publish it on the web, let users generate it, etc. These are very different patterns:
- Web content is mostly read-only. It is created by a small group of people and distributed to millions. Repositories managing web content are usually structured to optimize reads at the expense of updates. A common trick is to pre-compute database joins and denormalize metadata.
- User-generated content is write-intensive. Just look at Twitter or Facebook. These sites can get hundreds of millions of updates every single day. Repositories used to store user-generated content are optimized for updates at the expense of data and content consistency. A comment posted on Facebook, for instance, is not immediately visible to all Facebook users.
- Document management content is read-write. A typical example is Documentum or SharePoint. Data and content consistency cannot be compromised and therefore transaction-based processing is the only way to go. This is actually the type of content that can be stored in a database, as vividly demonstrated by SharePoint.
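To make the "pre-compute joins and denormalize metadata" trick from the web content pattern concrete, here is a minimal sketch. The `authors`, `articles`, `publish`, and `render` names and the sample data are all made up for illustration; this is not how FatWire or any other product actually does it.

```python
# Hypothetical normalized store: articles and authors are kept separately,
# the way a transactional repository might hold them.
authors = {1: {"name": "Ann"}}
articles = {
    10: {"title": "Hello", "author_id": 1, "tags": ["cms", "web"]},
}


def publish(article_id):
    """Pre-compute the join once, at publish time, and return a flat record."""
    article = articles[article_id]
    author = authors[article["author_id"]]
    return {
        "id": article_id,
        "title": article["title"],
        "author_name": author["name"],      # denormalized copy of author data
        "tags": ",".join(article["tags"]),  # flattened metadata
    }


# Read-optimized store: one lookup per page view, no joins at read time.
read_store = {article_id: publish(article_id) for article_id in articles}


def render(article_id):
    """Serve a page from the flat record only."""
    rec = read_store[article_id]
    return f'{rec["title"]} by {rec["author_name"]} [{rec["tags"]}]'
```

The price is paid on update: if an author's name changes, every flat record that copied it has to be republished. That is an acceptable trade when reads outnumber writes by several orders of magnitude.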
So the bottom line is that it takes many different kinds of repositories to make the world go round. It is time we made peace with this fact. Let different kinds of content be managed by the repositories that are most suitable for the job, and use a content federation layer to present one access point to all of them. This is where CMIS, the new content interoperability standard, can be particularly useful.
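A federation layer of this kind can be sketched in a few lines. Everything below is hypothetical: the `Repository`, `InMemoryRepository`, and `Federation` names, the prefix-based routing, and the sample objects are assumptions made for illustration, and this is not the CMIS API. It only shows the shape of the idea: several specialized repositories behind one access point.

```python
from typing import Optional, Protocol


class Repository(Protocol):
    """The one operation the federation needs from any backing repository."""

    def get(self, object_id: str) -> Optional[dict]: ...


class InMemoryRepository:
    """Stand-in for a real repository (web content, documents, UGC, ...)."""

    def __init__(self, name: str, items: dict):
        self.name = name
        self._items = dict(items)

    def get(self, object_id: str) -> Optional[dict]:
        item = self._items.get(object_id)
        if item is None:
            return None
        # Tag the result so the caller can see which repository served it.
        return {**item, "repository": self.name}


class Federation:
    """Single access point that routes each request by path prefix."""

    def __init__(self):
        self._routes = {}

    def register(self, prefix: str, repo: Repository) -> None:
        self._routes[prefix] = repo

    def get(self, path: str) -> Optional[dict]:
        prefix, _, object_id = path.partition("/")
        repo = self._routes.get(prefix)
        if repo is None:
            raise KeyError(f"no repository registered for {prefix!r}")
        return repo.get(object_id)


# Usage: two differently optimized stores behind one entry point.
fed = Federation()
fed.register("web", InMemoryRepository("web", {"home": {"title": "Home page"}}))
fed.register("docs", InMemoryRepository("docs", {"contract-1": {"title": "Contract"}}))

page = fed.get("web/home")
contract = fed.get("docs/contract-1")
```

Callers only ever talk to `fed`; which repository holds an object, and how that repository is tuned, stays an implementation detail behind the prefix.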
About this entry
You’re currently reading “Is One Repository Even an Option?” an entry on (t)cherevik
- Published:
- 9.22.09 / 10am
- Category:
- cmis
