Software Developing: Domain Driven Design, looking at Persistence 
First I wrote Software Developing: Domain Driven Design, where jff asked a few general questions about DDD, since it isn't his area. As a result I not only replied to his questions but also wrote a second post, Software Developing: More About Domain Driven Design, to make some concepts clearer. The second time it was jpmsi who raised interesting questions on the subject, pointing out information tarpits for people who are new to DDD.
In this post I'll try to expand on one of the subjects that jpmsi pointed out: persistence in DDD.
Note: I won't go into the database structure (yet) or explain how to build the object-relational bridge, but that can be expected in a future post. Also, at the end of each item I'll present some example code.
To anyone who has read Martin Fowler's Patterns of Enterprise Application Architecture this post will seem very familiar. Since I'm talking about the book, let me say it's a must-read for anyone who is interested in software architecture.
But let's start with the basics, and the easiest thing - yet still very important - which is defining what persistence means.
Persistence is nothing more than the ability to preserve data beyond the execution of the program that created it. Persistence is associated with some kind of storage system, from an RDBMS to file system storage.
When creating the architecture there are at least three things regarding persistence that should be kept in mind:
- How to access the data layer:
  - Mainly how CRUD operations are performed
- What the domain-level behaviour regarding persistence is:
  - How to persist every change made to an object that is in memory
  - How to deal with object relations
  - How to load the same object into memory only once
- Domain objects must have some sort of relation with the data stored in the RDBMS:
  - Store the object's database primary key in a field of the object
For now let's put the third point aside and assume that each domain object has an extra attribute, the primary key for the object, and that it is auto-magically set. I'll get to that at the end.
I'm sure there are various ways of solving such problems but, as I already said, I'll be presenting design patterns that are based on Martin Fowler's previously mentioned book.
I'll present two different kinds of patterns. Using Martin Fowler's names, I'll be talking about the so-called Data Source Architectural Patterns and the Object-Relational Behavioural Patterns.
Data Source Architectural Patterns
These patterns are responsible for the objects in the Data Layer that take care of direct access to the data storage, hiding the actual storage system from the rest of the application. Abstraction in this kind of situation is good, because with it the storage system can be replaced by another one and no layer besides the Data Layer has to be rewritten.
Pattern 1: Table Data Gateway
In this pattern there is an object that acts as a gateway to an RDBMS table - or tables, depending on the database structure. That object has the CRUD operations as its interface, and any domain object that needs to be persisted uses this gateway.
As the name implies, the gateway is a gateway to a table, so there is one gateway object for each table that exists.
As an example, let's imagine that we have a Book object which has a title and a number of pages - along with the primary key that has been previously mentioned.
Example code
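public class Book {
    // One shared gateway for the book table.
    private static final BookTableDataGateway gateway =
            new BookTableDataGateway();

    private Integer id;
    private String title;
    private Integer numberOfPages;

    // Creating a new Book also inserts its row; the gateway returns the primary key.
    public Book(String title, Integer numberOfPages) {
        this.title = title;
        this.numberOfPages = numberOfPages;
        this.id = gateway.createBook(title, numberOfPages);
    }

    public String getTitle() {
        return this.title;
    }

    // Every change is written through the gateway immediately.
    public void setTitle(String title) {
        this.title = title;
        gateway.updateBook(this.id, this.title, this.numberOfPages);
    }

    public static Book readBook(Integer id) {
        return gateway.readBook(id);
    }

    // more code here
}

public class BookTableDataGateway {
    // Inserts a row and returns the generated primary key.
    public Integer createBook(String title, int numberOfPages) {
        // jdbc code here
        throw new UnsupportedOperationException("jdbc code here");
    }

    // Builds a Book from an existing row; this needs a way to create a Book
    // without inserting it again (part of the "more code here" above).
    public Book readBook(Integer id) {
        // jdbc code here
        throw new UnsupportedOperationException("jdbc code here");
    }

    public void updateBook(Integer id, String title, int numberOfPages) {
        // jdbc code here
    }

    public void deleteBook(Integer id) {
        // jdbc code here
    }
}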
Besides the maintenance overhead - there's a new gateway object for each table in the database - there is also the question of how relations with other objects should be resolved:
- using proxies?
- also loading the other objects into memory?
Each way has its pros and cons, so it's up to whoever implements it to decide.
Side note: a simple way to test the domain layer when using this pattern is to replace the gateways with mock objects.
Pattern 2: Active Record
The Active Record is a very well-known pattern, especially after the hype around RoR, which uses Active Records extensively.
In the Active Record pattern each domain object knows by itself how to deal with its own persistence. This means that the data access logic lives inside the domain objects, which in my opinion makes the domain objects dirty. This can cause problems during development, especially if the domain in question isn't trivial.
On the other hand, if the domain is simple and a few conventions are used - like in RoR - then it can be a powerful, yet not very flexible, aid to the developer.
Example code
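public class Book {
    private Integer id;
    private String title;
    private Integer numberOfPages;

    // Creating a new Book also inserts its row.
    public Book(String title, Integer numberOfPages) {
        this.title = title;
        this.numberOfPages = numberOfPages;
        this.id = save();
    }

    // Meant to be called by the setters after a change.
    private void update() {
        // jdbc code to update the object
    }

    // Inserts the row and returns the generated primary key.
    private Integer save() {
        // jdbc code to save the object
        throw new UnsupportedOperationException("jdbc code here");
    }

    public void delete() {
        // jdbc code to delete the object
    }

    public static Book readBookById(Integer id) {
        // jdbc code to read the row and build the object
        throw new UnsupportedOperationException("jdbc code here");
    }
}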
Pattern 3: Data Mapper
In the two previous patterns the domain objects knew that they needed to be persisted. Domain objects had in their own code either methods that access the data layer directly (Active Record) or calls delegating to another object whose interface they knew (Table Data Gateway). In those two patterns there was no real abstraction between the domain layer and the persistence layer.
This pattern is different: it introduces a true separation between both layers, and the domain objects do not know anything about persistence. It brings a bit more complexity in the mapping system, but in applications where developers want their domain to be truly independent from persistence a Data Mapper might be the solution.
This pattern can be used along with the Lazy Load pattern, which is described later in this post. Basically, when a certain object needs to be loaded, the load method of the mapper is called and returns the object in question.
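Example code
The following is only a minimal sketch of the separation described above, not the way any particular framework does it. The names BookMapper, insert, update and find are made up for illustration, and a HashMap stands in for the table so that the example runs on its own; real code would have JDBC in the mapper instead. Note that Book contains no persistence code at all.

```java
import java.util.HashMap;
import java.util.Map;

// Plain domain object: it has no reference to any persistence code.
class Book {
    private Integer id;            // assigned by the mapper, not by domain code
    private String title;
    private int numberOfPages;

    Book(String title, int numberOfPages) {
        this.title = title;
        this.numberOfPages = numberOfPages;
    }

    Integer getId() { return id; }
    void setId(Integer id) { this.id = id; }
    String getTitle() { return title; }
    void setTitle(String title) { this.title = title; }
    int getNumberOfPages() { return numberOfPages; }
}

// Hypothetical mapper: the only class that knows how a Book is stored.
// The HashMap of rows is an assumption standing in for the database table.
class BookMapper {
    private final Map<Integer, Object[]> rows = new HashMap<>();
    private int nextId = 1;

    // Stores a new book and gives it its primary key.
    void insert(Book book) {
        book.setId(nextId++);
        rows.put(book.getId(), toRow(book));
    }

    // Overwrites the stored row with the current state of the book.
    void update(Book book) {
        if (!rows.containsKey(book.getId())) {
            throw new IllegalStateException("Book was never inserted");
        }
        rows.put(book.getId(), toRow(book));
    }

    // Returns a fresh Book built from the stored row, or null if there is none.
    Book find(Integer id) {
        Object[] row = rows.get(id);
        if (row == null) {
            return null;
        }
        Book book = new Book((String) row[0], (Integer) row[1]);
        book.setId(id);
        return book;
    }

    private Object[] toRow(Book book) {
        return new Object[] { book.getTitle(), book.getNumberOfPages() };
    }
}
```

The calling code decides when to persist: it creates or changes a Book and then hands it to the mapper with insert or update. Swapping the HashMap for a real database would only touch BookMapper.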
Object-Relational Behavioural Patterns
The behavioural patterns aim to support the operations that happen in the domain layer. By that I mean keeping track of what changes in the objects, bringing objects from the database and putting objects into it.
Each pattern presented here solves a different problem; in complex systems all three can be found implemented.
Pattern 1: Unit of Work
This pattern introduces a way of keeping track of all in-memory modifications and making sure they get reproduced in the database. The idea is to flag, inside a transaction, all objects that have been created, modified or deleted and, at the end of the transaction - in the commit - find all flagged objects and write their modifications to the database.
Example code
import java.util.ArrayList;
import java.util.List;

// DomainObject, TransactionalSystem and PersistenceException are assumed to
// exist elsewhere in the application.
public class UnitOfWork {
    private final List<DomainObject> newObjects;
    private final List<DomainObject> dirtyObjects;
    private final List<DomainObject> deletedObjects;

    public UnitOfWork() {
        this.newObjects = new ArrayList<DomainObject>();
        this.dirtyObjects = new ArrayList<DomainObject>();
        this.deletedObjects = new ArrayList<DomainObject>();
    }

    public void markNewObject(DomainObject object) {
        newObjects.add(object);
    }

    public void markDirty(DomainObject object) {
        dirtyObjects.add(object);
    }

    public void markDeleted(DomainObject object) {
        deletedObjects.add(object);
    }

    public void commit() {
        boolean failed = false;
        TransactionalSystem.openTransaction();
        try {
            for (DomainObject object : newObjects) {
                // code to create the new object
            }
            // loops for the other two lists
        } catch (PersistenceException exception) {
            failed = true;
            exception.printStackTrace();
        }
        if (failed) {
            rollBack();
        } else {
            TransactionalSystem.commit();
            clear(); // everything is written, so nothing is pending any more
        }
    }

    public void rollBack() {
        TransactionalSystem.fail();
        clear();
    }

    // Forgets all flagged objects.
    private void clear() {
        newObjects.clear();
        dirtyObjects.clear();
        deletedObjects.clear();
    }
}
Pattern 2: Identity Map
This pattern has two main objectives: avoiding data incoherence and speeding up the materialization of objects.
The Identity Map is nothing more than a cache that only materializes objects from the database if they aren't already in memory. Its behaviour can be seen in the activity diagram below.
This pattern avoids data incoherence because while an object is in memory only one instance of it exists. Even if it is somehow asked to be read from the database again, the same instance is returned, so at no time are there two different instances of the same object. Two instances would allow the same object to have different internal states, which would be a severe data incoherence.
Example code
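import java.util.HashMap;
import java.util.Map;

public class IdentityMap {
    private final Map<Integer, DomainObject> objectsCache;
    private final DatabaseConnector connector;

    public IdentityMap() {
        objectsCache = new HashMap<Integer, DomainObject>();
        connector = new DatabaseConnector();
    }

    public DomainObject getObjectWithId(Integer id) {
        DomainObject object = objectsCache.get(id);
        if (object == null) {
            // Not in memory yet: materialize it once and remember the instance,
            // so later requests for the same id get the same object back.
            object = connector.materializeObjectWithId(id);
            objectsCache.put(id, object);
        }
        return object;
    }
}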
It may not be obvious, but what has been called DatabaseConnector in this example can - and should - be one of the patterns already described under the Data Source Architectural Patterns.
Pattern 3: Lazy Load
Lazy loading is the ability to load attributes only when they are requested. In Domain Driven Design this is extremely useful, since domain objects are strongly related to each other: if every attribute of a domain object were resolved and materialized whenever a single object was loaded, the entire domain model could end up loaded into memory. In a small model this behaviour might not be a problem, but in a big one it is.
The idea is to use something that represents the object but is not the object itself. That something is called a proxy; the Lazy Load pattern is often used with the Proxy pattern, more specifically with the virtual proxy. I said often because there are other ways of achieving lazy load besides proxying, such as value holders.
As an example, imagine that a Book has a relation with Authors: one book may have many authors. Both Book and Author are domain objects.
Example code
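import java.util.Collection;

// Common interface, so that a proxy can stand in for the real object.
public interface IAuthor {
    String getName();
}

public class Author implements IAuthor {
    private String name;
    // ...

    public String getName() {
        return name;
    }
}

public class AuthorProxy implements IAuthor {
    // Same kind of connector as in the Identity Map example.
    private final DatabaseConnector connector = new DatabaseConnector();
    private final Integer authorsId;
    private Author author = null;

    public AuthorProxy(Integer idForRealObject) {
        this.authorsId = idForRealObject;
    }

    public String getName() {
        // The real Author is only loaded the first time it is needed.
        if (author == null) {
            author = connector.materializeAuthorWithId(authorsId);
        }
        return author.getName();
    }
}

// IBook is assumed to be declared elsewhere.
public class Book implements IBook {
    private Integer id;
    private String title;
    private Integer numberOfPages;
    private Collection<IAuthor> authors; // may hold AuthorProxy instances

    public Collection<IAuthor> getAuthors() {
        return authors;
    }
}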
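The value holder alternative mentioned above could look roughly like the sketch below. ValueHolder is a made-up name, and the loader passed to it is an assumption standing in for whatever reads the data from the database; this is just one possible shape, not a reference implementation.

```java
import java.util.function.Supplier;

// Hypothetical value holder: wraps a loader and runs it at most once,
// on the first call to get().
final class ValueHolder<T> {
    private final Supplier<T> loader; // assumed to do the actual database read
    private T value;
    private boolean loaded = false;

    ValueHolder(Supplier<T> loader) {
        this.loader = loader;
    }

    // Loads the value on first use and returns the same value afterwards.
    T get() {
        if (!loaded) {
            value = loader.get();
            loaded = true;
        }
        return value;
    }

    boolean isLoaded() {
        return loaded;
    }
}
```

A Book could then keep a ValueHolder for its collection of authors and have getAuthors() return what the holder's get() gives back. Unlike the proxy, the holder is visible in the type of the field, so the domain object knows it is dealing with something lazy. The sketch is not thread-safe.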
Keeping track of what is what
As was said before, the in-memory objects need to have some relation with the data stored in the RDBMS. That relation is kept by an extra field in the object, which holds the key of the object's row in the relational table.
The question to ask is, where does that identification used as primary key come from?
There are two different approaches:
- It can be generated in the application
- It can be generated by the RDBMS
If the identification is generated by the application, the following should be kept in mind:
- The identification numbers have to be unique, at least within each class and, depending on how the domain is mapped to the relational database, within its hierarchy.
- The generation and assignment of such identifications need to be transactional.
There are persistence frameworks that use this strategy, some implementing more than one way of calculating the identification numbers, though there is a simple algorithm, called the Hi/lo Algorithm, that is often used due to its efficiency. This algorithm generates the next identification based on some parameters.
One strategy is to keep those parameters in a database table and use them to get the new object id.
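As a rough sketch of how such a generator can work, assuming the usual hi/lo split: a "hi" value is reserved from the database, and the application then hands out a whole block of ids from it without touching the database again. HiLoGenerator, nextHi and blockSize are names made up for this example, and nextHi is assumed to read and increment the stored value inside a transaction.

```java
import java.util.function.LongSupplier;

// Hypothetical hi/lo id generator.
final class HiLoGenerator {
    // Assumption: reads and increments the stored "hi" value in one transaction.
    private final LongSupplier nextHi;
    private final int blockSize; // how many ids each "hi" value is worth
    private long hi;
    private int lo;

    HiLoGenerator(LongSupplier nextHi, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.nextHi = nextHi;
        this.blockSize = blockSize;
        this.lo = blockSize; // forces a "hi" to be reserved on first use
    }

    // Returns the next id; only goes to the database once per block.
    synchronized long nextId() {
        if (lo >= blockSize) {
            hi = nextHi.getAsLong();
            lo = 0;
        }
        return hi * blockSize + lo++;
    }
}
```

With a block size of 100, only one id in every hundred costs a round trip to the database. The price is that ids of a reserved block are lost if the application stops before using them all, so gaps in the sequence are to be expected.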
On the other hand, if the choice is to let the RDBMS take care of the identification numbers, using for example AUTO_INCREMENT on the keys, there are also ways of retrieving the identification number.
It's usual for database drivers to provide an interface that gives access to the key generated by the query that was just performed (in JDBC, Statement.getGeneratedKeys()). Even if such functionality is not available, there's always the SQL way: retrieve the maximum - and therefore the latest - identification from the given table.
In this last case keep in mind that the creation and the read of the identification number should be done in the same transaction, in order to be sure that the correct identification is being read.
Conclusions
It's true that there already are frameworks that deliver the behaviours described in this post and that can be - and probably should be - used instead of implementing all this from scratch. An example of one of those frameworks is Hibernate.
I do maintain that if there are good frameworks they should be used instead of reinventing the wheel. However, the concepts underlying such frameworks should be understood, in order to be able to analyse them and to understand "how the show is running backstage" in case something strange happens.
After this post I hope some of those concepts and ideas have been made clear.