
Sunday 13 October 2013

Active Record vs Data Mapper for Persistence

These two design patterns are explained in Martin Fowler's book 'Patterns of Enterprise Application Architecture', and represent ways of handling data persistence in object oriented programming. Here are the basics:

Active Record Example

class Foo 
{
    protected $db;
    public $id;
    public $bar;
    
    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function do_something()
    {
        $this->bar .= uniqid();
    }

    public function save()
    {
        if ($this->id) {
            $sql = "UPDATE foo SET bar = :bar WHERE id = :id";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $this->bar);
            $statement->bindParam("id", $this->id);
            $statement->execute();
        }
        else {
            $sql = "INSERT INTO foo (bar) VALUES (:bar)";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $this->bar);
            $statement->execute();
            $this->id = $this->db->lastInsertId();
        }
    }
}

//Insert
$foo = new Foo($db);
$foo->bar = 'baz';
$foo->save();


In this simplified example, a database handle is injected into the constructor of Foo (using dependency injection here allows the object to be unit tested without using a real database), and Foo uses that to save its own data. The do_something method is just a placeholder for some business logic.
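The testing point above can be sketched roughly as follows. This is only an illustration, assuming the pdo_sqlite extension is available and that the foo table has an integer id column and a text bar column (the post does not give the schema, so that part is an assumption). The class is repeated in condensed form so the snippet runs on its own, and it passes values to execute() rather than calling bindParam(), purely to keep it short.

```php
<?php
// Condensed copy of the Active Record class, so this sketch is self-contained.
class Foo
{
    protected $db;
    public $id;
    public $bar;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function save()
    {
        if ($this->id) {
            $statement = $this->db->prepare("UPDATE foo SET bar = :bar WHERE id = :id");
            $statement->execute(array(':bar' => $this->bar, ':id' => $this->id));
        } else {
            $statement = $this->db->prepare("INSERT INTO foo (bar) VALUES (:bar)");
            $statement->execute(array(':bar' => $this->bar));
            $this->id = $this->db->lastInsertId();
        }
    }
}

// Assumption: a throwaway in-memory SQLite database stands in for the real one.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec("CREATE TABLE foo (id INTEGER PRIMARY KEY, bar TEXT)");

$foo = new Foo($db);
$foo->bar = 'baz';
$foo->save();   // no id yet, so this runs the INSERT and sets $foo->id

$foo->bar = 'qux';
$foo->save();   // id is set now, so this runs the UPDATE on the same row

// Read the table back to see what was actually persisted.
$rows = $db->query("SELECT id, bar FROM foo")->fetchAll(PDO::FETCH_ASSOC);
```

Because Foo only depends on the PDO handle it is given, the test never touches the production database. A mocked PDO object would work as well, at the cost of more setup.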

Active Record advantages

  • It is fairly quick and easy to write an active record object, particularly when the properties of the object correlate directly with columns in the database.
  • Everything is kept in one place, making it easier to see how the code works.

Active Record disadvantages

  • The Active Record pattern breaks SOLID design principles - in particular the Single Responsibility Principle (SRP - the 'S' of SOLID). According to the SRP, a domain object should only have a single responsibility, i.e. its own business logic. By asking it to handle persistence as well, you are giving it an additional responsibility, and this increases the complexity of the object - making it more difficult to maintain and test.
  • The persistence implementation is closely coupled with the business logic, meaning that if you later wanted to use a different persistence layer (for example to store the data in an XML file instead of a database), you would need to refactor the code.

Data Mapper Example

class Foo 
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

class FooMapper
{
    protected $db;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    // Objects are passed by handle in PHP 5+, so no & is needed for the new id to reach the caller.
    public function saveFoo(Foo $foo)
    {
        if ($foo->id) {
            $sql = "UPDATE foo SET bar = :bar WHERE id = :id";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $foo->bar);
            $statement->bindParam("id", $foo->id);
            $statement->execute();
        }
        else {
            $sql = "INSERT INTO foo (bar) VALUES (:bar)";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $foo->bar);
            $statement->execute();
            $foo->id = $this->db->lastInsertId();
        }
    }
}

//Insert
$foo = new Foo();
$foo->bar = 'baz';
$mapper = new FooMapper($db);
$mapper->saveFoo($foo);


Here the Foo class is a lot simpler and only has to worry about its own business logic. Not only does it not have to persist its own data, it doesn't even know or care whether its data is persisted at all. The FooMapper class doesn't contain any business logic, but it does handle the data persistence, and all of the SQL is tucked away in the mapper class.

Data Mapper advantages

  • Each object has a single responsibility, thus preserving the integrity of SOLID design principles, and keeping each object simple and to the point.
  • The business logic and persistence are loosely coupled - if you want to persist in an XML file or some other format, you can just write a new mapper and don't have to touch the domain object.
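As a rough illustration of the second point, here is what an alternative mapper might look like. FooXmlMapper, the file layout (a foos root element with one foo element per object) and the id scheme are all hypothetical and not part of the original example. The sketch assumes the DOM extension is enabled. Note that the Foo class itself is unchanged.

```php
<?php
class Foo
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

// Hypothetical mapper that persists Foo objects to an XML file instead of a database.
class FooXmlMapper
{
    protected $path;

    public function __construct($path)
    {
        $this->path = $path;
    }

    public function saveFoo(Foo $foo)
    {
        $doc = new DOMDocument('1.0', 'UTF-8');
        clearstatcache(); // avoid stale file information between saves
        if (is_file($this->path) && filesize($this->path) > 0) {
            $doc->load($this->path);
        } else {
            $doc->appendChild($doc->createElement('foos'));
        }
        $root = $doc->documentElement;

        // Look for an element with this id, and track the highest id in use.
        $element = null;
        $maxId = 0;
        foreach ($root->getElementsByTagName('foo') as $node) {
            $nodeId = (int) $node->getAttribute('id');
            $maxId = max($maxId, $nodeId);
            if ($foo->id && $nodeId === (int) $foo->id) {
                $element = $node;
            }
        }

        if ($element === null) {
            // "Insert": assign the next id if needed and append a new element.
            if (!$foo->id) {
                $foo->id = $maxId + 1;
            }
            $element = $doc->createElement('foo');
            $element->setAttribute('id', (string) $foo->id);
            $root->appendChild($element);
        }

        // Replace the element's text with the current value of bar.
        while ($element->firstChild) {
            $element->removeChild($element->firstChild);
        }
        $element->appendChild($doc->createTextNode((string) $foo->bar));

        $doc->save($this->path);
    }
}

// Calling code is the same shape as before; only the mapper differs.
$path = sys_get_temp_dir() . '/foo_' . uniqid() . '.xml';
$mapper = new FooXmlMapper($path);
$foo = new Foo();
$foo->bar = 'baz';
$mapper->saveFoo($foo);   // appends a new element and assigns an id
$foo->bar = 'qux';
$mapper->saveFoo($foo);   // rewrites the same element
```

A real implementation would also need file locking and a way to load objects back, which are left out here to keep the sketch short.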

Data Mapper disadvantages

  • You have to think a bit harder before you type!
  • You end up with more objects to manage, which makes the calling code a little more complex and slightly harder to follow.

Service Objects

When using the data mapper pattern, the calling code has to choose a mapper object and a business object and put them together. If this calling code is in your controller, you could end up having your model layer leak into your controller, and this can cause more difficulties with maintenance and unit testing. This problem can be resolved by introducing a service object. The service object is the gateway between the controller and the model, and it handles marrying up domain objects with their mappers as necessary.
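A minimal sketch of the idea, under some assumptions: FooService, FooMapperInterface and InMemoryFooMapper are hypothetical names introduced for illustration, and the in-memory mapper simply stands in for the PDO-backed FooMapper so the snippet runs without a database.

```php
<?php
class Foo
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

// Hypothetical interface, so the service does not care which mapper it is given.
interface FooMapperInterface
{
    public function saveFoo(Foo $foo);
}

// In-memory stand-in for the PDO-backed FooMapper, for illustration only.
class InMemoryFooMapper implements FooMapperInterface
{
    public $rows = array();

    public function saveFoo(Foo $foo)
    {
        if (!$foo->id) {
            $foo->id = count($this->rows) + 1;
        }
        $this->rows[$foo->id] = $foo->bar;
    }
}

// Hypothetical service object: the only model-layer class the controller talks to.
class FooService
{
    protected $mapper;

    public function __construct(FooMapperInterface $mapper)
    {
        $this->mapper = $mapper;
    }

    // One discrete task: build a Foo, run its business logic, persist it.
    public function createFoo($bar)
    {
        $foo = new Foo();
        $foo->bar = $bar;
        $foo->do_something();
        $this->mapper->saveFoo($foo);
        return $foo;
    }
}

// In the controller, the pairing of domain object and mapper is no longer visible.
$mapper = new InMemoryFooMapper();
$service = new FooService($mapper);
$foo = $service->createFoo('baz');
```

The controller now makes a single call and never sees the mapper, so swapping the persistence mechanism means changing only how the service is constructed.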

It should of course be remembered that the M in MVC represents a model layer, not necessarily a single model object. There could be several types of object in a single model (in the above example, you could have a service object, domain object, and data mapper object, all forming a single model). If you use the active record pattern, on the other hand, your model could well be just a single object.

Which option to use

Active record objects have historically been very popular because they are simpler, easier to understand and quicker to write, and many PHP frameworks and ORMs use the active record pattern by default. The active record pattern may make the most sense if you are confident that you will never need to swap out the persistence layer (perhaps when dealing with an object which represents an .ini file, for example), if you are dealing with very simple objects that don't have much (or even any) business logic, or if you just prefer to keep things contained in fewer classes.

Using a data mapper, though, does lead to code that is cleaner, easier to test and easier to maintain, and it allows for greater extensibility - at the cost of slightly increased complexity. If you haven't tried it before, give it a whirl - you might like it.

14 comments:

  1. Hello, thanks for the Post.
    In the case of Active Record, how do you recommend handling lists of rows/objects, meaning the SELECT part? In the same class?

    Replies
    1. Whether using active record or data mapper, I would typically use a collection class for lists of objects - with active record, the collection class would include its own load method.

  2. Could you please elaborate on the Service Object topic a little, for example give some sample code of having to deal with both mapper and model in the controller, the model leaking into the controller, and how it would look when doing it using Service Objects instead? Thanks!

    Replies
    1. I'm afraid I don't have time right now to elaborate (maybe in a future post), but here are a couple of links you might find helpful:

      http://stevelorek.com/service-objects.html
      (talking about Rails, but the principles still apply - when using the controller to co-ordinate things, it can become a 'fat controller')

      http://akrabat.com/php/objects-in-the-model-layer-part-2/
      (an example of using service objects and data mappers in PHP)

  3. I've seen a lot of mistaken use of design pattern names applied to PHP libraries. Is there a library for PHP that implements the true data mapper design pattern as Martin Fowler defines it?

    Replies
    1. Doctrine 2 uses the data mapper pattern (and by extension, Symfony 2). I tend to use annotations in my entities to define the mapping characteristics, which I guess could be seen as a pollution of the pattern, but you can use yaml or xml and keep your entities completely separate from the mapping.

  4. This is a great post, Russell. I got a very clear picture of the differences.

  5. Nice post. Clear and useful. Thanks! :)

    - Daniel

  6. very nice post. thank you

  7. This "service object" approach looks an awful lot like the repository pattern to me...am I missing something?

    Replies
    1. My understanding (and I am not an expert on the subject) is that where you have complex queries, perhaps involving multiple domains, you can use a repository pattern to deal with collections of objects, and data mappers for individual object mapping. Service objects handle a discrete business task and would call either the repository or data mapper as needed.
