Always Learning

Is Best Practice Actually Poor Practice? Dependency Injection, Type Hinting, and Unit Tests...

2017-03-30T19:59:00.000+01:00

I've recently been in discussion with a colleague who thinks that dependency injection (DI) is over-used and, in cases where the dependency is a concrete class, unnecessary (in the latter case, he advocates simply creating new objects on the fly). I raised the point that dependency injection allows you to pass in a mock object while unit testing, but he dismissed that as irrelevant and not a valid argument as to why dependency injection is useful. My colleague also stated that type hinting dependencies results in tight coupling, and he would prefer it if we simply abandoned type hints, allowing the developer to pass in whatever they like.

In my opinion, this line of thinking is misguided, but he sent through some links to pages that he felt supported his point of view (including Tony Marston's rant on DI, and the Laravel documentation about 'facades' - which are actually used as an alternative syntax for the service locator [anti-]pattern). I genuinely wanted to understand the reasoning behind his point of view, as it flies in the face of just about everything I have ever read regarding best practice in PHP development. After reading the resources he sent through, I began to notice some misconceptions about what unit testing actually is, as well as confusion about the difference between code that is "strongly typed" (usually good) and "tightly coupled" (usually bad), and also a tendency to blame the wrong thing when problems arise.

Not All Automated Tests Are Unit Tests

There are different types of automated test that can be performed, and sometimes these different types get erroneously conflated. A unit test is used to verify that a particular class behaves correctly in isolation. When you write a unit test, you are only testing that one class - not any of its dependencies (they have their own tests).

That's not to say you can't write a test that covers a number of classes at once, but if you do, it is not a unit test - it is either an integration test (if it deals with the integration between classes), or a functional test (if it tests an entire function regardless of how many classes are involved). Even if you use a unit testing framework (like PHPUnit) to write the test, it is only a unit test if it tests a single unit.

Using Mocks to Test in Isolation

The reason we try to test units in isolation is that we want the test to live with the class, not the implementation of its dependencies. The implementation can change by passing in different dependencies, but we don't care about that, as those dependencies will have their own tests - we only care that this particular class is behaving correctly. If our test fails, it should be because there is something wrong with that particular class (or with the test itself), not because something changed in a dependency. This helps with debugging and provides reassurance that our classes are behaving correctly regardless of external implementation (integration and functional tests are useful for different reasons).

When writing a unit test, it is common to inject mock objects instead of real objects for the dependencies. This has a number of advantages: it allows you to run tests without activating things you don't want (sending e-mails, making database connections, API calls, etc.), it allows you to force the test down paths that would otherwise be difficult to reach, and it allows you to isolate the class you are testing. That's not to say that all dependencies are always mocked. It might be that a concrete class provides what the test needs, and does no harm, so it might be quicker and easier to use that (in which case it is still a unit test, as the dependency is not part of the test subject).

If you create dependencies on the fly though, you no longer have an option - it is not possible to supply a mock object, nor to extend that dependency to tweak the implementation without changing the class that uses it. This is the very definition of tight coupling - it doesn't get any tighter than that. It is no longer possible to unit test that class without also taking the concrete dependency along for the ride.
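As a minimal sketch of the difference (Mailer, SmtpMailer, SpyMailer and both service classes are hypothetical names invented for this illustration), compare a class that creates its dependency on the fly with one that has it injected:

```php
<?php
// Hypothetical names throughout: none of these classes come from a real project.

interface Mailer
{
    public function send($to, $subject);
}

class SmtpMailer implements Mailer
{
    public function send($to, $subject)
    {
        // Imagine a real SMTP connection here - not something to trigger in a unit test.
        throw new RuntimeException('Real e-mail would be sent here');
    }
}

// Tightly coupled: the dependency is created on the fly, so a test cannot replace it.
class HardWiredWelcomeService
{
    public function welcome($email)
    {
        $mailer = new SmtpMailer();
        $mailer->send($email, 'Welcome!');
    }
}

// Loosely coupled: the dependency is injected, so a test can pass in a stand-in.
class WelcomeService
{
    private $mailer;

    public function __construct(Mailer $mailer)
    {
        $this->mailer = $mailer;
    }

    public function welcome($email)
    {
        $this->mailer->send($email, 'Welcome!');
    }
}

// A hand-rolled test double that records calls instead of sending anything.
class SpyMailer implements Mailer
{
    public $sent = array();

    public function send($to, $subject)
    {
        $this->sent[] = array($to, $subject);
    }
}

$spy = new SpyMailer();
$service = new WelcomeService($spy);
$service->welcome('alice@example.com');
```

In a real test suite, a mock generated by PHPUnit would normally take the place of the hand-written SpyMailer. There is no equivalent way to test HardWiredWelcomeService without running SmtpMailer as well.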

What, Never Ever Create Objects on the Fly?

There are occasions when it is perfectly acceptable to create objects on the fly, but we need to fully understand the consequences of doing so, and be able to justify it. I find a good rule of thumb is that the 'new' keyword should only be used in dependency injection containers and factory classes, but (as with all best practice principles) that is just a rule of thumb, not a law. Some occasions where it might be harmless to create objects are:

Where your entities form a hierarchical structure, and a parent entity needs to create and initialise a child entity for one of its properties.
Where a new value object is needed - a value object represents a value with no identity separate from its properties, and is essentially a data structure (albeit possibly with methods). In some cases I might justify creating a value object in the same way as creating a new PHP DateTimeImmutable object (which is itself a value object).
When using a helper class which acts as an extension to the language (for example a string or maths helper class that provides general convenience functions not provided by PHP).

Whenever we do one of these things though, we must bear in mind that the code is now tightly coupled to that class, and be willing to live with the consequences of that. If we ever want to re-use that code in a different situation, will the class that object is based on still be available? Is it possible we would ever want to swap it out for something else? Would we ever want to mock it for unit testing purposes?

What About Those Laravel Facades?

In Laravel, you can make a static call to what they refer to as a facade (not a true facade, more of a proxy), which calls a service locator to find the requested dependency. As long as the service locator is properly implemented, it is possible to populate it with mock objects that can be used in unit testing. This is certainly better than just creating new objects on the fly, as the dependencies are not tightly coupled and can be swapped out relatively easily.

The service locator does still need to be populated though, so there is not a great deal of benefit in terms of setup when using a service locator - populate a service locator, or populate a dependency injection container; it is essentially the same thing, the difference is in how you use it.

Actually consuming a service locator is arguably easier than dependency injection though, because you don't have to declare your dependencies with constructor or setter parameters. All you need is the service locator (and in the case of Laravel, not even that - you can just make static calls wherever you like). That ease of use is what makes it so alluring to some. There are dangers here though...

Hidden Dependencies and Other Dangers

A service locator, whether used explicitly, or via a pseudo facade, hides dependencies from view. This alone has earned it a reputation as an anti-pattern, and a poor choice for enterprise web development. With dependency injection, you can tell from the constructor signature what the dependencies are, and with type hints, you can ensure that only valid values are accepted. With a service locator, you have to examine every line of code in the class, or consult some documentation (and hope that it is up-to-date and accurate) to work out what the class needs in order to run. To my mind, that is a lot more painful than just being explicit up-front with DI.

In addition, code that proxies to a service locator is not very portable. If you use Laravel facades to make static calls to a service locator, you cannot re-use your class in any non-Laravel project without stripping them all out first. Whilst some tight coupling to your framework might be difficult to avoid, there is no need to tie yourself down with facades when dependency injection provides a virtually painless alternative.
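To illustrate the hidden dependency problem, here is a rough sketch using a deliberately bare-bones locator (ServiceLocator, Logger and the two report classes are hypothetical names, and this is not how Laravel's container is implemented):

```php
<?php
// Hypothetical illustration: all class names here are invented.

class Logger
{
    public $lines = array();

    public function log($message)
    {
        $this->lines[] = $message;
    }
}

// A bare-bones service locator: a global registry of named services.
class ServiceLocator
{
    private static $services = array();

    public static function set($name, $service)
    {
        self::$services[$name] = $service;
    }

    public static function get($name)
    {
        if (!isset(self::$services[$name])) {
            throw new RuntimeException("Service '$name' has not been registered");
        }
        return self::$services[$name];
    }
}

// Hidden dependency: nothing in the public signature reveals that a logger is needed.
class LocatorReport
{
    public function run()
    {
        ServiceLocator::get('logger')->log('report run');
    }
}

// Explicit dependency: the constructor states exactly what is required.
class InjectedReport
{
    private $logger;

    public function __construct(Logger $logger)
    {
        $this->logger = $logger;
    }

    public function run()
    {
        $this->logger->log('report run');
    }
}

// The locator version only fails at run time, when the missing service is requested.
$failedLate = false;
try {
    $report = new LocatorReport();
    $report->run();
} catch (RuntimeException $e) {
    $failedLate = true;
}

// The injected version cannot even be constructed without its dependency.
$logger = new Logger();
$injected = new InjectedReport($logger);
$injected->run();
```

LocatorReport can be instantiated without complaint and only falls over once run() is called, whereas the requirements of InjectedReport are visible from its constructor alone.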

Strongly Typed is not Tightly Coupled

In PHP 4, it was not possible to type hint arguments in functions - every argument was 'mixed' and you could pass in anything you like. With PHP 5 came the ability to type hint on a class name, interface, array, or callable, which allowed developers to specify what type of data they expected. In PHP 7, this is taken even further, and we can type hint with scalar types (either strictly, or coercively), and even type hint return values. If, as my colleague suggests, strongly typed arguments lead to tight coupling, why this seemingly unstoppable march toward stronger typing?

It could be argued that it is just fashionable, and that there can be drawbacks to strong typing as it is implemented in PHP (unlike statically typed languages, if you override a method, you cannot change the signature - unless it is the constructor). However, there are a few reasons why I think type-hinted code is much easier to work with. Type hints are self-documenting, giving you lots of useful information on what the argument is for and how it is being used. I like the fact that my IDE can offer code completion and navigation based on type hints (docblocks are OK, but they can get out of sync as development progresses - type hints are always up-to-date). I like the fact that I can catch errors early if the wrong type of argument is passed in, an argument is accidentally missed, or sent in the wrong order - rather than accepting anything only to have the code fail in mysterious and difficult-to-debug ways later on, or have to write my own instanceof checks.

None of that causes tight coupling. Type hinting on concrete classes is more tightly coupled than type hinting on base classes or interfaces, but even with a concrete class (it is not always possible or even desirable to avoid them) it is still possible to extend and mock, so type-hinted dependency injection is always going to be more loosely coupled than having your classes create their own dependencies on the fly.
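A small sketch of that point, assuming PHP 7 (where a failed type hint throws a catchable TypeError); Cache, ArrayCache, NullCache and PageRenderer are hypothetical names used only for illustration:

```php
<?php
// Hypothetical names: these classes are for illustration only.

interface Cache
{
    public function get($key);
}

class ArrayCache implements Cache
{
    private $items;

    public function __construct(array $items)
    {
        $this->items = $items;
    }

    public function get($key)
    {
        return isset($this->items[$key]) ? $this->items[$key] : null;
    }
}

class NullCache implements Cache
{
    public function get($key)
    {
        return null; // Never holds anything
    }
}

class PageRenderer
{
    private $cache;

    // Strongly typed, yet loosely coupled: any Cache implementation is accepted.
    public function __construct(Cache $cache)
    {
        $this->cache = $cache;
    }

    public function render($key)
    {
        $cached = $this->cache->get($key);
        return $cached !== null ? $cached : 'rendered from scratch';
    }
}

$fromCache = (new PageRenderer(new ArrayCache(array('home' => '<h1>Home</h1>'))))->render('home');
$fromScratch = (new PageRenderer(new NullCache()))->render('home');

// Passing the wrong type fails immediately, at the call site.
$rejected = false;
try {
    new PageRenderer('not a cache');
} catch (TypeError $e) {
    $rejected = true;
}
```

The type hint restricts what can be passed in, but it does not tie PageRenderer to any one implementation. A mock that implements Cache would be accepted just as readily.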

Another Reason?

Aside from possible misunderstandings like those mentioned above, it appears to me that often, developers who want more 'flexibility' than commonly accepted best practice allows, are actually struggling to understand how to implement best practice. A poor design will result in difficulties implementing best practice, and it is tempting to blame SOLID (or other principles) as being wrong or inappropriate rather than to recognise flaws in one's own design. In my experience, learning OOP is a slow and sometimes painful process. It is not easy to see how and why SOLID principles are necessary. Things like dependency injection, unit testing, using interfaces, and design patterns, are confusing and can appear to be unnecessary or irrelevant. Why go to all that trouble when it works just fine if I do it the way I know?

Exacerbating things, we have popular projects and frameworks (such as Laravel) which are engineered primarily with rapid application development in mind, being used in potentially inappropriate ways for enterprise development. Following SOLID and other best practice principles is less important when your focus is on creating software with a short lifecycle (that may be discarded in a few months' time). Active record (breaking the single responsibility principle), auto-wired dependencies (relying on magic and hiding what is going on), static methods (tightly coupled and hiding dependencies) make development easier, but the resulting software harder to maintain. The phenomenal success of Laravel has much more to do with the fact that it is easy to understand and quick to get started than the use of best practice (that's not to say that you can't use best practice with Laravel if you are disciplined enough).

I think you cannot really appreciate the benefits of some of these things until you get into the habit of unit testing. As the pieces gradually start coming together, SOLID principles become more natural, your code becomes more robust and the whole process of development becomes more of a pleasure. The learning process never stops though.

My conclusion is that SOLID is an invaluable guide to doing things in a way that leads to cleaner, more maintainable, more extendable, more testable code. If I feel tempted to violate any of those principles (or discover that I already have done so without realising it), I need to double check that my approach is not flawed and be certain I understand the consequences.

Naturally, some people will always disagree!

Domain example.com has exceeded the max defers and failures per hour

2015-02-26T17:11:00.000+00:00

When sending e-mail from a cPanel hosting account, you might find that you get an error message similar to this: "Domain yourdomain.com has exceeded the max defers and failures per hour (5/5 (100%)) allowed. Message discarded."

If you are getting that error, it means that a large number of e-mails that have been sent from your domain have been rejected by the receiving mail server. Just increasing the limit is not a good idea, as it exposes the server to blacklisting if a spammer compromises an account. You therefore need to find out why so many sent messages are failing. Here are some suggestions on what to look for:

Do you have a mailing list, perhaps for a newsletter? If so, your list might need cleaning up - too many dead e-mail addresses will push you over the limit. Better still, use a third party mailing service such as MailChimp for your newsletters, so you don't risk getting your own server blacklisted.
Has your account been hacked? Try running an exploit scan (eg. using Configserver Exploit Scanner [CXS]) and make sure all the scripts running on your hosting account are up-to-date.
Are any of your mailboxes forwarding to another mail service such as gmail or hotmail? This is a terrible idea! Delete those forwarders! If you want to get your mail into gmail, you need to 'pull' it from the gmail end, not 'push' it from the cPanel server (likewise for other mail services). Why? Well, if spammers send mail to your mailbox, and your mailbox forwards it to gmail, gmail will reject it (with a message like this: "Our system has detected an unusual rate of unsolicited mail originating from your IP address. To protect our users from spam, mail sent from your IP address has been temporarily rate limited. Please visit http://www.google.com/mail/help/bulk_mail.html to review our Bulk Email Senders Guidelines."). This will count against your failed send limit, causing the above 'max defers/failures' error. It will also get your server's IP address blacklisted, which could cause problems for anyone on the same server sending mail to gmail.
Have any of your mailboxes been hacked? If someone has discovered or guessed your mailbox password, they could be connecting and sending spam. Change your mailbox passwords and run a virus/malware scan on any device that you use to connect to the server.
Log in to cPanel and click on the 'Email Trace' icon. Don't enter an e-mail address, just click the 'Run Report' button (this will show details for all mail both in and out of your account). Look through the list for any groups of messages that failed to send from one of your mailboxes - that might give you a clue as to where the failures are coming from.

PHPUnit: cannot open file bootstrap.php

2014-01-29T14:36:00.001+00:00

This post is Google fodder for anyone who comes across the error 'cannot open file: /some/directory/here/tests/unit/bootstrap.php' when attempting to run unit tests with PHPUnit. In particular, I got this error when trying to set up unit tests for a new project in PHPEd. The reason for the error in this case is that PHPEd helpfully adds a default phpunit.xml file including a switch that attempts to load said bootstrap file.

To fix: Just edit the phpunit.xml file (or it might be named phpunit.xml.dist) in the root folder of the project, and remove the switch: bootstrap="tests/unit/bootstrap.php".

Entities vs Value Objects and Doctrine 2

2013-11-14T20:05:00.001+00:00

Update (4th June 2014): Value Objects can now be used in Doctrine 2.5 (currently still in Beta), using the new 'Embeddables' feature: http://doctrine-orm.readthedocs.org/en/latest/tutorials/embeddables.html. An example is given at the end of this post.

An aspect of Domain Driven Design (DDD) that I find quite appealing is the differentiation between entities and value objects. I don't think you even need to embrace DDD as a whole to benefit from this distinction (I'm not saying you shouldn't embrace DDD, just that this aspect of it stands on its own). If you are using Doctrine 2 though, there is no native support for value objects (not yet anyway), which leaves you with the problem of mapping them yourself. In this post I will talk a bit about the difference between entities and value objects and give one suggestion of how to handle value objects when using Doctrine 2.

What is an entity?

An entity is a class that represents something which has an identity independent of its properties. In other words, even if some of its properties change, its identity remains the same. Entities normally represent the fundamental building blocks of your application.

For example, a Customer would be an entity. A customer record might have various properties like name, address, email address, order list, etc., but even if you change one of those values (eg. if the customer moves house), it is still the same customer.

What is a value object?

A value object is a class (usually small, with just a few properties, and frequently used), that represents a value which has no identity separate from its properties. If any one of its properties changes, it is no longer the same value.

An example of a value object is an address. If you change any part of an address, it becomes a different address. Another example would be a price - it might have properties for currency, amount, and payment frequency, but changing any of those would yield a new price.

As such, value objects are immutable - their properties cannot change (it is normal to use getters but not setters for the properties of value objects). If two value objects have exactly the same properties, they can be regarded as equal.

Value objects can have methods, but the behaviour of a method doesn't ever change the object's state. If a method needs to change a property, it will return a brand new instance of the value object instead of modifying the existing one. Methods on a value object can also return other types of value of course (eg. a boolean return value would be typical for a validate method).
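Here is a minimal sketch of those rules, using a hypothetical Price class (the class, its withAmount and equals methods, and the choice to hold the amount in minor units are all assumptions made for this illustration):

```php
<?php
// A hypothetical Price value object, invented for illustration.

class Price
{
    private $currency;
    private $amount; // Held in minor units (eg. pence) to avoid floating point rounding

    public function __construct($currency, $amount)
    {
        $this->currency = $currency;
        $this->amount = $amount;
    }

    // Getters but no setters, so the properties cannot change after construction.
    public function getCurrency()
    {
        return $this->currency;
    }

    public function getAmount()
    {
        return $this->amount;
    }

    // "Changing" a property returns a brand new instance; the original is untouched.
    public function withAmount($amount)
    {
        return new Price($this->currency, $amount);
    }

    // Two value objects with exactly the same properties are regarded as equal.
    public function equals(Price $other)
    {
        return $this->currency === $other->currency
            && $this->amount === $other->amount;
    }
}

$original = new Price('GBP', 1000);
$discounted = $original->withAmount(800);
$same = new Price('GBP', 1000);
```

Here $original still holds 1000 after withAmount is called, and it compares as equal to $same even though they are separate instances.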

Can a Value Object have an Entity as a property?

It doesn't happen very often, but there is nothing wrong with referencing an entity from a value object - having a reference to an entity does not force a value to become an entity. For example, an address value object would typically include a country property. Whilst this would normally be a string, your application might need to have a Country class which has mutable properties (such as whether the country is a member of the EU) and is therefore an entity. But you could still have a country property on your address value object, even if it refers to an entity (I will use this example in the sample code below).

Can an address/price/date range/etc. ever be an Entity?

Yes! Whether something is an entity or a value object depends on how you intend to use it. If you were writing an application for a postal service or a town planner, an address might be more than just a value - you might have the power to change properties of an address without changing its identity (eg. if you can assign house numbers or postcodes).

Why use value objects?

This is really two separate questions, one of which is a little easier to answer than the other...

Why use value objects instead of scalar values?

This is the easy one. You could have a Customer class with properties for address_line_1, address_line_2, etc. - each as a string. Or a phone number could be held as a string with the area code in brackets. By collecting these properties together into a single value object, or separating a string into constituent parts as separate properties of a value object, it is not hard to see that you will gain many OOP benefits. If there are several pieces of data that go together to make a single domain concept, they belong in an object. You can then perform computations on that discrete set of data, validate it, re-use it, and extend it. It also allows you to separate concerns - your entity does not have to worry about any domain logic relating to the value, leading to cleaner code.
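Taking the phone number case as a rough sketch (the PhoneNumber class and the "(area code) number" format it accepts are assumptions made purely for this example):

```php
<?php
// Hypothetical PhoneNumber value object; the accepted format is an assumption for this sketch.

class PhoneNumber
{
    private $area_code;
    private $local_number;

    public function __construct($area_code, $local_number)
    {
        $this->area_code = $area_code;
        $this->local_number = $local_number;
    }

    // Split a string such as "(020) 7946 0958" into its constituent parts.
    public static function fromString($value)
    {
        if (!preg_match('/^\((\d+)\)\s*([\d ]+)$/', trim($value), $matches)) {
            throw new InvalidArgumentException("Unrecognised phone number: $value");
        }
        // Spaces inside the local number are dropped so equal numbers compare equal
        return new PhoneNumber($matches[1], str_replace(' ', '', $matches[2]));
    }

    public function getAreaCode()
    {
        return $this->area_code;
    }

    public function getLocalNumber()
    {
        return $this->local_number;
    }

    public function __toString()
    {
        return '(' . $this->area_code . ') ' . $this->local_number;
    }
}

$phone = PhoneNumber::fromString('(020) 7946 0958');
```

The parsing and validation now live with the value itself, so an entity holding a PhoneNumber does not need to know anything about how phone numbers are formatted.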

Why use value objects instead of Entities?

Why not just give everything an ID and make it an entity? There's no technical reason why you couldn't have an address ID for example, and even allow the properties to be modified. However, although at first glance it might seem as though separating out entities and value objects makes things more complicated, it actually makes things simpler. Why give an address an ID if it doesn't need one? It unnecessarily complicates your model, and could also cause confusion as it is not clear how you intend the data to be used. If something is defined as a value object, we know what it is, that it will not change, and that another object of the same type with the same properties is considered equal to it. This gives us lots more information on how it is meant to be used - so the conceptual difference is valuable in making your code understandable and easy to maintain.

Having said that, if you are using an ORM which does not support value objects (such as Doctrine 2), there is indeed some additional complexity involved, as you need to provide some way of mapping custom value objects to database columns yourself. If you can do this in a way that will be easy to refactor later (when hopefully the next version of Doctrine will support value objects), it might still be worthwhile doing that extra legwork while defining your entities so that you can reap the rewards of improved code clarity and usability.

Using Value Objects with Doctrine 2

Let's take a simplified example. Here is a basic Customer class, including phpDoc annotations for Doctrine 2 (the BaseEntity class it derives from [not shown] contains magic methods __get and __set for accessing the protected properties, as Doctrine 2 doesn't like you to use public properties):

namespace Entities;
/**
* @Entity @Table(name="customer")
* @property int $id
* @property string $name
* @property \ValueObjects\Address $address
* @property string $email_address
* @property string $telephone
**/
class Customer extends BaseEntity
{
    /** @var integer @Id @GeneratedValue @Column(type="integer") */
    protected $id;
    /** @var string @Column(type="string") */
    protected $name = '';
    /** @var \ValueObjects\Address */
    protected $address;
    /** @var string @Column(type="string", length=150) **/
    protected $email_address = '';
    /** @var string @Column(type="string", length=40) **/
    protected $telephone = '';
}

Here we have an Address value object as one of our properties, but we can't map it directly to a column, because it is made up of several fields and Doctrine 2 currently won't resolve them for us. We could give Address an ID and make it an entity, and Doctrine 2 would then be able to map it, but if we want to use a value object, we have to do our own mapping, which I will come to in a minute. First, here is the Address class (the BaseValueObject it derives from has a magic getter but no setter, so the properties are read only):

namespace ValueObjects;
class Address extends BaseValueObject
{
    /** @var string **/
    protected $line_1 = "";
    /** @var string **/
    protected $line_2 = "";
    /** @var string **/
    protected $line_3 = "";
    /** @var string **/
    protected $town = "";
    /** @var string **/
    protected $state = "";
    /** @var string **/
    protected $postcode = "";
    /** @var \Entities\Country **/
    protected $country;

    public function __construct($line_1 = '', $line_2 = '', $line_3 = '',
                                $town = '', $state = '', $postcode = '',
                                \Entities\Country $country = null)
    {
        $this->line_1 = $line_1;
        $this->line_2 = $line_2;
        $this->line_3 = $line_3;
        $this->town = $town;
        $this->state = $state;
        $this->postcode = $postcode;
        $this->country = $country;
    }
}

As described above, the country property of the Address value object here refers to an entity (although it could very well refer to another value object, or just a scalar value).

Now, to enable Doctrine 2 to persist our Customer class, with its Address value object, we will need to tell it which columns hold each part of the address. We could use a custom mapping type for this, but there is another way which is probably simpler and a little more generic.

As Address is not an entity, and has no identifier (ie. no primary key), the address data cannot be held in a separate table - it will have to go in the same table as the Customer entity. We could therefore just add the fields to our Customer entity and map them to an accessor like this (note that here we do have a setter as well as a getter for the address, because the Customer entity is not immutable):

namespace Entities;
/**
* @Entity @Table(name="customer")
* @property int $id
* @property string $name
* @property \ValueObjects\Address $address
* @property string $email_address
* @property string $telephone
**/
class Customer extends BaseEntity
{
    /** @var integer @Id @GeneratedValue @Column(type="integer") */
    protected $id;
    /** @var string @Column(type="string") */
    protected $name = '';
    /** @var \ValueObjects\Address */
    protected $address;
    /** @var string @Column(type="string", length=150) **/
    protected $email_address = '';
    /** @var string @Column(type="string", length=40) **/
    protected $telephone = '';

    /** @var string @Column(type="string", length=100) **/
    private $address_line_1 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_line_2 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_line_3 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_town = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_state = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_postcode = "";
    /**
    * @ManyToOne(targetEntity="Country", fetch="EAGER")
    * @JoinColumn(name="address_country", referencedColumnName="code")
    * @var Country
    **/
    private $address_country;

    public function setAddress(\ValueObjects\Address $address)
    {
        $this->address = $address;
        $this->address_line_1 = $address->line_1;
        $this->address_line_2 = $address->line_2;
        $this->address_line_3 = $address->line_3;
        $this->address_town = $address->town;
        $this->address_state = $address->state;
        $this->address_postcode = $address->postcode;
        $this->address_country = $address->country;
    }

    /** @return \ValueObjects\Address **/
    public function getAddress()
    {
        if (!isset($this->address)) //Lazy load
        {
            $this->address = new \ValueObjects\Address( //Fully qualified, as we are inside the Entities namespace
                    $this->address_line_1,
                    $this->address_line_2,
                    $this->address_line_3,
                    $this->address_town,
                    $this->address_state,
                    $this->address_postcode,
                    $this->address_country);
        }
        return $this->address;
    }
}

Here we have private properties to map the individual parts of the address to database columns without exposing these to the outside world (the magic getter in the base class will not have access to them), but we have a public getter and setter for the address value object.

This will work, but it doesn't look very nice - our Customer class is now bloated with address mapping code. Also, we might have other entities that also need an address, and we don't want to have to copy and paste this all over the place. This is where traits come in handy! We can use a trait to keep the nasty mapping kludge out of our entity, and re-use it wherever an address is needed:

namespace Traits;
/**
* This trait allows us to map address value objects to database columns using Doctrine 2
*/
trait Address
{
    /** @var string @Column(type="string", length=100) **/
    private $address_line_1 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_line_2 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_line_3 = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_town = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_state = "";
    /** @var string @Column(type="string", length=100) **/
    private $address_postcode = "";
    /**
    * @ManyToOne(targetEntity="Country", fetch="EAGER")
    * @JoinColumn(name="address_country", referencedColumnName="code")
    * @var Country
    **/
    private $address_country;

    public function setAddress(\ValueObjects\Address $address)
    {
        $this->address = $address;
        $this->address_line_1 = $address->line_1;
        $this->address_line_2 = $address->line_2;
        $this->address_line_3 = $address->line_3;
        $this->address_town = $address->town;
        $this->address_state = $address->state;
        $this->address_postcode = $address->postcode;
        $this->address_country = $address->country;
    }

    /** @return \ValueObjects\Address **/
    public function getAddress()
    {
        if (!isset($this->address)) //Lazy load
        {
            $this->address = new \ValueObjects\Address( //Fully qualified, as we are inside the Traits namespace
                    $this->address_line_1,
                    $this->address_line_2,
                    $this->address_line_3,
                    $this->address_town,
                    $this->address_state,
                    $this->address_postcode,
                    $this->address_country);
        }
        return $this->address;
    }
}

Putting the ugly stuff there in the trait means that the only pollution in our entity class is a use statement:

namespace Entities;
/**
* @Entity @Table(name="customer")
* @property int $id
* @property string $name
* @property \ValueObjects\Address $address
* @property string $email_address
* @property string $telephone
**/
class Customer extends BaseEntity
{
    use \Traits\Address; //Allows Doctrine 2 to map the Address value object

    /** @var integer @Id @GeneratedValue @Column(type="integer") */
    protected $id;
    /** @var string @Column(type="string") */
    protected $name = '';
    /** @var \ValueObjects\Address */
    protected $address;
    /** @var string @Column(type="string", length=150) **/
    protected $email_address = '';
    /** @var string @Column(type="string", length=40) **/
    protected $telephone = '';
}

The fly in the ointment

Aside from the fact that manually mapping fields is a bit of a kludge in itself, there is another problem here. What about cases where an entity needs more than one address? Perhaps a billing address and shipping address for example? We can't include the trait twice (well, we could use aliasing, but both would refer to the same values and would require a monumental kludge to differentiate), so we would have to add more traits for the different address types. I still think that is better than putting it all directly in the entity class, but it does smell a bit iffy. What do you think? Is there a better way?

Update: Using Embeddables

If you can use Doctrine 2.5, this problem is easily solved using Embeddables. Here is how you would do the mapping for a customer address in Doctrine 2.5:

namespace ValueObjects;

/** @Embeddable **/
class Address extends BaseValueObject
{
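    /** @var string @Column(type="string") **/
    protected $line_1 = "";
    /** @var string @Column(type="string") **/
    protected $line_2 = "";
    /** @var string @Column(type="string") **/
    protected $line_3 = "";
    /** @var string @Column(type="string") **/
    protected $town = "";
    /** @var string @Column(type="string") **/
    protected $state = "";
    /** @var string @Column(type="string") **/
    protected $postcode = "";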
    /** @var \Entities\Country **/
    protected $country;
 
    public function __construct($line_1 = '', $line_2 = '', $line_3 = '',
                                $town = '', $state = '', $postcode = '',
                                \Entities\Country $country = null)
    {
        $this->line_1 = $line_1;
        $this->line_2 = $line_2;
        $this->line_3 = $line_3;
        $this->town = $town;
        $this->state = $state;
        $this->postcode = $postcode;
        $this->country = $country;
    }
}

namespace Entities;
/**
* @Entity @Table(name="customer")
* @property int $id
* @property string $name
* @property \ValueObjects\Address $address
* @property string $email_address
* @property string $telephone
**/
class Customer extends BaseEntity
{
    /** @var integer @Id @GeneratedValue @Column(type="integer") */
    protected $id;
    /** @var string @Column(type="string") */
    protected $name = '';
    /** @var \ValueObjects\Address @Embedded(class="\ValueObjects\Address") */
    protected $address;
    /** @var string @Column(type="string", length=150) **/
    protected $email_address = '';
    /** @var string @Column(type="string", length=40) **/
    protected $telephone = '';
}

Painless!
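
Embeddables also seem to answer the billing/shipping question raised earlier. The following is only a sketch of the mapping, not something lifted from a working project: the Order entity and the two column prefixes are made up for illustration, and it relies on the columnPrefix attribute of @Embedded to stop the two sets of address columns from colliding.

```php
namespace Entities;

/**
* Hypothetical entity, for illustration only
* @Entity @Table(name="customer_order")
**/
class Order
{
    /** @var integer @Id @GeneratedValue @Column(type="integer") */
    protected $id;
    /** @var \ValueObjects\Address @Embedded(class="\ValueObjects\Address", columnPrefix="billing_") */
    protected $billing_address;
    /** @var \ValueObjects\Address @Embedded(class="\ValueObjects\Address", columnPrefix="shipping_") */
    protected $shipping_address;
}
```

With this mapping, the same Address class is re-used for both properties, and no extra traits are needed.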

Active Record vs Data Mapper for Persistence

2013-10-13T13:58:00.000+01:00

These two design patterns are explained in Martin Fowler's book 'Patterns of Enterprise Application Architecture', and represent ways of handling data persistence in object oriented programming. Here are the basics:

Active Record Example

class Foo 
{
    protected $db;
    public $id;
    public $bar;
    
    public function __construct(PDO $db)
    {
        $this->db = $db;
    }

    public function do_something()
    {
        $this->bar .= uniqid();
    }

    public function save()
    {
        if ($this->id) {
            $sql = "UPDATE foo SET bar = :bar WHERE id = :id";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $this->bar);
            $statement->bindParam("id", $this->id);
            $statement->execute();
        }
        else {
            $sql = "INSERT INTO foo (bar) VALUES (:bar)";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $this->bar);
            $statement->execute();
            $this->id = $this->db->lastInsertId();
        }
    }
}

//Insert
$foo = new Foo($db);
$foo->bar = 'baz';
$foo->save();

In this simplified example, a database handle is injected into the constructor of Foo (using dependency injection here allows the object to be unit tested without using a real database), and Foo uses that to save its own data. The do_something method is just a placeholder for some business logic.

Active Record advantages

It is fairly quick and easy to write an active record object, particularly when the properties of the object correlate directly with columns in the database.
Everything is kept in one place, making it easier to see how the code works.

Active Record disadvantages

The Active Record pattern breaks SOLID design principles - in particular the Single Responsibility Principle (SRP - the 'S' of SOLID). According to the SRP, a domain object should only have a single responsibility, ie. its own business logic. By asking it to handle persistence as well, you are giving it an additional responsibility, and this increases the complexity of the object - making it more difficult to maintain and test.
The persistence implementation is closely coupled with the business logic, meaning that if you later wanted to use a different persistence layer (for example to store the data in an XML file instead of a database), you would need to refactor the code.

Data Mapper Example

class Foo 
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

class FooMapper
{
    protected $db;

    public function __construct(PDO $db)
    {
        $this->db = $db;
    }
    public function saveFoo(Foo $foo) //Objects are passed by handle, so no & is needed
    {
        if ($foo->id) {
            $sql = "UPDATE foo SET bar = :bar WHERE id = :id";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $foo->bar);
            $statement->bindParam("id", $foo->id);
            $statement->execute();
        }
        else {
            $sql = "INSERT INTO foo (bar) VALUES (:bar)";
            $statement = $this->db->prepare($sql);
            $statement->bindParam("bar", $foo->bar);
            $statement->execute();
            $foo->id = $this->db->lastInsertId();
        }
    }
}

//Insert
$foo = new Foo();
$foo->bar = 'baz';
$mapper = new FooMapper($db);
$mapper->saveFoo($foo);

Here the Foo class is a lot simpler and only has to worry about its own business logic. Not only does it not have to persist its own data, it doesn't even know or care whether its data is persisted at all. The FooMapper class doesn't contain any business logic, but it does handle the data persistence, and all of the SQL is tucked away in the mapper class.

Data Mapper advantages

Each object has a single responsibility, thus preserving the integrity of SOLID design principles, and keeping each object simple and to the point.
The business logic and persistence are loosely coupled - if you want to persist in an XML file or some other format, you can just write a new mapper and don't have to touch the domain object.
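
To make that second point a bit more concrete, here is a rough sketch of a mapper that persists the same Foo object to a file instead of a database. I have used JSON rather than XML purely to keep the example short, and the FooJsonMapper class name and the file layout are invented for this illustration.

```php
class Foo
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

class FooJsonMapper
{
    protected $file;

    public function __construct($file)
    {
        $this->file = $file;
    }

    public function saveFoo(Foo $foo)
    {
        //Load the existing records (if any), keyed by id
        $records = array();
        if (is_file($this->file)) {
            $records = json_decode(file_get_contents($this->file), true) ?: array();
        }
        if (!$foo->id) {
            //Insert: allocate the next free id
            $foo->id = $records ? max(array_keys($records)) + 1 : 1;
        }
        $records[$foo->id] = array('bar' => $foo->bar);
        file_put_contents($this->file, json_encode($records));
    }
}

//Insert - the Foo class is unchanged, only the mapper is different
$file = tempnam(sys_get_temp_dir(), 'foo');
$foo = new Foo();
$foo->bar = 'baz';
$mapper = new FooJsonMapper($file);
$mapper->saveFoo($foo);
```

The calling code is the same shape as before, and Foo itself has not been touched.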

Data Mapper disadvantages

You have to think a bit harder before you type!
You end up with more objects to manage, which makes the calling code a little more complex and slightly harder to follow.

Service Objects

When using the data mapper pattern, the calling code has to choose a mapper object and a business object and put them together. If this calling code is in your controller, you could end up having your model layer leak into your controller, and this can cause more difficulties with maintenance and unit testing. This problem can be resolved by introducing a service object. The service object is the gateway between the controller and the model, and it handles marrying up domain objects with their mappers as necessary.

It should of course be remembered that the M in MVC represents a model layer, not necessarily a model object. There could be several types of object in a single model (in the above example, you could have a service object, domain object, and data mapper object, all forming a single model). If you use the active record pattern on the other hand, your model could well be just a single object.
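
As a rough illustration, a service object for the Foo example might look something like this. The FooService class and its createFoo method are hypothetical names, and a simple in-memory mapper stands in for the PDO-based FooMapper so that the sketch runs without a database.

```php
class Foo
{
    public $id;
    public $bar;

    public function do_something()
    {
        $this->bar .= uniqid();
    }
}

//Stand-in for the PDO-based FooMapper shown earlier
class InMemoryFooMapper
{
    public $saved = array();

    public function saveFoo(Foo $foo)
    {
        if (!$foo->id) {
            $foo->id = count($this->saved) + 1;
        }
        $this->saved[$foo->id] = $foo;
    }
}

class FooService
{
    protected $mapper;

    public function __construct($mapper)
    {
        $this->mapper = $mapper;
    }

    //The controller calls this and never sees the mapper
    public function createFoo($bar)
    {
        $foo = new Foo();
        $foo->bar = $bar;
        $foo->do_something();
        $this->mapper->saveFoo($foo);
        return $foo;
    }
}

//In the controller
$service = new FooService(new InMemoryFooMapper());
$foo = $service->createFoo('baz');
```

The controller only ever talks to the service, so the choice of mapper can change without the controller knowing about it.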

Which option to use

Active record objects have historically been very popular due to them being simpler, easier to understand and quicker to write, and many PHP frameworks and ORMs use the active record pattern by default. If you are confident that you will never need to swap out the persistence layer (perhaps when dealing with an object which represents an .ini file for example), or are dealing with very simple objects that don't have much (or even any) business logic, or just prefer to keep things contained in fewer classes, the active record pattern may make the most sense.

Using a data mapper though does lead to cleaner, easier to test, easier to maintain code, and allows for greater extensibility - at the cost of slightly increased complexity. If you haven't tried it before, give it a whirl - you might like it.

Public properties, getters and setters, or magic?

2013-09-23T10:36:00.000+01:00

Opinion seems to be divided on whether it is good practice to use public properties in PHP classes, or whether to use getter and setter methods instead (and keep the properties private or protected). A sort of hybrid third option is to use the magic methods __get() and __set(). As always, there are advantages and disadvantages to each approach, so let's take a look at them...

Public properties example

class Foo 
{
    public $bar;
}

$foo = new Foo();
$foo->bar = 1;
$foo->bar++;

In this example, bar is a public property of Foo. The calling code can manipulate that property in any way it likes, and stuff any old data in it.

Public properties advantages

Compared with getter and setter methods, there is a lot less typing involved!
The calling code is more readable and easier to work with than getter and setter method calls.
A call to a public property (to either set or get) is faster and uses less memory than a method call - but the saving is small unless you are calling it many times in a long loop.
Objects with public properties can be used as parameters for some PHP functions (such as http_build_query), and can be serialized by json_encode.
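
A quick sketch of that last point (the property names are arbitrary). Note that only the public properties are picked up by either function:

```php
class Foo
{
    public $bar = 1;
    public $baz = 'two';
    protected $secret = 'hidden'; //Ignored by both functions below
}

$foo = new Foo();
$json = json_encode($foo);       //{"bar":1,"baz":"two"}
$query = http_build_query($foo); //bar=1&baz=two
```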

Public properties disadvantages

There is no way of controlling what data is held in the properties - it is very easy to populate them with data that the class methods might not expect. This can be particularly problematic if the class is being used by another developer (ie. other than the author of the class), as they might not (and should not need to) be aware of the internal workings of the class (and let's face it, after a few months, the author of the class probably won't remember the internal workings either).
If you want to expose a public property in an API, you can't do so using an interface, as in PHP interfaces only allow method definitions.

Using getters and setters example

class Foo
{
    private $bar;
    
    public function getBar()
    {
        return $this->bar;
    }
    
    public function setBar($bar)
    {
        $this->bar = $bar;
    }
}

$foo = new Foo();
$foo->setBar(1);
$foo->setBar($foo->getBar() + 1);

Here the bar property is private so cannot be accessed by the calling code directly. The caller has to use the getBar method to retrieve the value and the setBar method to set it. The methods can perform validation processing to ensure that only valid values are allowed through.

Getters and setters advantages

With getters and setters you can control exactly what is stored in your object properties, and reject any values that are not valid.
You also have the option of performing additional processing when a value is set or retrieved (for example, if updating this property should trigger some action such as notifying an observer).
When setting a value which represents an object or array (rather than a scalar value), you can specify the type in the function declaration (eg. public function setBar(Bar $bar)). Such a damn shame PHP won't let you do the same thing with ints and strings!
If the value of a property has to be loaded from an external data source or the runtime environment, you can lazy load it - so the resources required to load the data are only used if the property itself is called. Of course you would need to be careful that you don't needlessly load the data from the external source on every call to the property. And it would be more common to make a single call to the database to populate all the properties rather than fetching them one at a time.
You can make properties read-only or write-only by only creating a getter but not a setter or vice-versa.
You can include a getter and setter in an interface to expose them in an API.
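
A few of these advantages can be sketched in one small class. This is just an illustration under my own assumptions: the integer-only rule for bar, the config property and the loads counter are all invented for the example.

```php
class Foo
{
    private $bar = 0;
    private $config;   //Lazy loaded, and read-only as it has no setter
    public $loads = 0; //Only here to show how often the load runs

    public function getBar()
    {
        return $this->bar;
    }

    public function setBar($bar)
    {
        //Reject anything that is not a whole number
        if (!is_int($bar)) {
            throw new InvalidArgumentException('bar must be an integer');
        }
        $this->bar = $bar;
    }

    public function getConfig()
    {
        if (!isset($this->config)) {
            //Pretend this is an expensive call to an external source
            $this->config = array('colour' => 'blue');
            $this->loads++;
        }
        return $this->config;
    }
}

$foo = new Foo();
$foo->setBar(1);
try {
    $foo->setBar('one'); //Throws, so bar keeps its previous value
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
$foo->getConfig();
$foo->getConfig(); //Second call re-uses the value loaded by the first
```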

Getters and setters disadvantages

For developers who are used to accessing properties directly, getters and setters are a pain in the neck to use! For every property you have to define the property itself, a getter, and a setter, and to use the property in the calling code you have to make extra method calls - it is much easier to say $foo->bar++; rather than the long-winded $foo->setBar($foo->getBar() + 1); (although of course you could add yet another method such as $foo->incrementBar();)
As noted above, there is a small additional overhead when making a method call over using a plain ol' public property.
By convention, getter and setter methods start with the verbs 'get' and 'set' respectively, but these verbs are also commonly used for other methods which don't necessarily relate to properties. This is not necessarily a problem for the calling code as you might not care what the implementation is, but the ambiguity about the type of implementation you are dealing with can make the code harder to understand.

Magic method getters and setters example

class Foo
{
    protected $bar;

    public function __get($property)
    {
        switch ($property)
        {
            case 'bar':
                return $this->bar;
            //etc.
        }
    }

    public function __set($property, $value)
    {
        switch ($property)
        {
            case 'bar':
                $this->bar = $value;
                break;
            //etc.
        }
    }
}

$foo = new Foo();
$foo->bar = 1;
$foo->bar++;

Here, the bar property is not publicly exposed, but the calling code still calls it as though it were public. When PHP can't find a matching public property, it calls the appropriate magic method (__get() for retrieving a value, __set() for assigning a value). This might seem like the best of both worlds, but it has a major drawback (see disadvantages below!). Note also that __get and __set are NOT called if a matching public property exists, but they are called if a protected or private property exists but is out of scope, or if no property exists.

Magic method getter and setter advantages

You can manipulate properties directly from the calling code (as though they were public properties) but still have complete control over what data goes into what property.
As with declared getter and setter methods, you can perform additional processing when the property is used.
You can use lazy loading.
You can make properties read-only or write-only.
You can use the magic methods as a catch-all to handle calls to properties that don't exist, and still process them in some way (effectively allowing the calling code to decide what property names to use).

Magic method getter and setter disadvantages

The showstopper for magic methods is that they do not expose the property in any way. To use or extend the class, you have to 'just know' what properties it supports. This is simply unacceptable most of the time (unless perhaps you are one of those hard core programmers who think notepad is an IDE!), although there are times when the advantages mentioned above outweigh this limitation. As pointed out by a commentator below, this disadvantage can be overcome by using the phpDoc tags @property, @property-read, and @property-write. Cool.
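
Here is a minimal sketch of how those phpDoc tags might be combined with the magic methods. The created property and the choice of exceptions are my own assumptions for the example; the tags only document the properties for the benefit of your IDE, and it is still the magic methods that enforce the rules.

```php
/**
 * @property int $bar
 * @property-read string $created
 */
class Foo
{
    protected $bar = 0;
    protected $created;

    public function __construct()
    {
        $this->created = date('Y-m-d');
    }

    public function __get($property)
    {
        switch ($property)
        {
            case 'bar':
            case 'created':
                return $this->$property;
        }
        throw new OutOfBoundsException('Unknown property ' . $property);
    }

    public function __set($property, $value)
    {
        switch ($property)
        {
            case 'bar':
                $this->bar = (int) $value;
                return;
            case 'created':
                throw new LogicException('created is read-only');
        }
        throw new OutOfBoundsException('Unknown property ' . $property);
    }
}

$foo = new Foo();
$foo->bar = 1;
$foo->bar++;
```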

Which option to use

Clearly there are some significant advantages to getters and setters, and some people feel that they should therefore be used all the time (especially those who come from a Java background!). But in my opinion, they do break the natural flow of the language, and their additional complexity and verbosity puts me off using them unless there is a need to do so (I find it a little irritating when naive getters and setters don't actually DO anything other than get or set the property). My choice then is to use public properties most of the time, but getters and setters for critical settings that I feel need stricter control, would benefit from lazy loading, or that I want to expose in an interface.

Another alternative?

Before I learned PHP, I used to use C#. In C#, all properties have accessor methods, but you don't have to call them as methods, you can just manipulate the properties directly and the corresponding methods are called magically. So it is similar to PHP's magic __get and __set, but the properties are still declared and exposed. This really is the best of both worlds, and it would be great to see a feature like this in PHP.

Tragically though, the RFC for a C#-like property accessor syntax feature could not quite muster up the two-thirds majority needed to take it forward: https://wiki.php.net/rfc/propertygetsetsyntax-v1.2 Bah!

Handling Global Data in PHP Web Applications

2013-09-07T16:04:00.003+01:00

Almost every web application needs to handle global data. There are certain things that just have to be available throughout the entire code base, such as database connections, configuration settings, and error handling routines. As a PHP developer, you may have heard the mantra 'globals are evil', but this naturally begs the question 'what should I use instead of global variables?'

There are several different strategies available to help cope with the demand for global data - each has its advantages and disadvantages and it can be a challenge to know which approach to use for any given situation. Here I will try to outline what the options are, how they work, their advantages, their disadvantages, and examples of under what circumstances each option might be used. The code samples are not necessarily realistic, they are kept as simple as possible to demonstrate the idea. The principles and design patterns that follow are not specific to PHP - they can be applied to any object oriented language.

Global Variables

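Probably the easiest solution to understand and to implement is the use of global variables. Any variable defined in the global scope (ie. outside of a function or class) is a global variable, and the global keyword makes its contents available inside any function or method that needs it. Declaring the variable with the global keyword where it is first defined is not strictly required, but it guarantees that the variable ends up in the global scope even if the file happens to be included from inside a function. Typically you declare the variable (again using the global keyword) in every scope that you want to use it, although you can just use the $GLOBALS 'super global' array to reference the value without declaring it first.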

Global variable example:

global $database;
$database = new Database();

function doSomething()
{
    global $database; //This has to be declared so that PHP knows 
                      //you want to use the global variable, not 
                      //a local one
    $data = $database->readStuff();

}

Global variable example using $GLOBALS super global:

global $database;
$database = new Database();

function doSomething()
{
    $data = $GLOBALS['database']->readStuff(); //No need to 
                                               //declare the 
                                               //global first

}

Global variable advantages

The main advantage of global variables is that they are easy to understand, and easy to use in your code. Their ease of use makes it tempting to use them a lot, even though there may be much better options available!

Global variable disadvantages

The disadvantages of using global variables nearly always outweigh the advantages.

They make your code hard to read and hard to understand. It is not obvious what the variable is for, where or how it was initialised, or what is the proper way to use it.
They make your code hard to maintain. It is very difficult to make changes to global variables, as you have to search through your entire code base looking for where they have been used.
It is easy to abuse a global variable and cause errors that are hard to debug. Without any control mechanism for how the variable is used, it is easy to populate it with invalid data which can cause errors in other parts of the code (for example, if one part of the code populates the variable with an array but another part of the code expects it to contain an object).
It is easy to get confused regarding the variable's scope. If you forget to declare that the variable is global, you can end up unwittingly working with a local variable without noticing - until your app breaks. This can also be hard to debug.
If you combine your code with someone else's code (eg. by using a third party library or writing an extension for another piece of software), and both systems use global variables, there is a chance that the variable names could clash, causing errors in both systems which are hard to debug.
All parts of your code that use a global variable are tightly coupled and it becomes very difficult to separate out or re-use a module elsewhere.
Unit testing is made more difficult, as the test has to know which global variables are needed, and how to initialise all global variables with valid values.
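
The scope confusion mentioned above is easy to reproduce. In this small sketch (the function and variable names are made up), the first function forgets the global declaration and quietly works with a brand new local variable instead:

```php
global $counter;
$counter = 0;

function incrementWithoutGlobal()
{
    //No global declaration, so this creates a new local $counter
    //that is thrown away when the function returns
    $counter = 1;
}

function incrementWithGlobal()
{
    global $counter;
    $counter++;
}

incrementWithoutGlobal();
$after_first = $GLOBALS['counter'];  //Still 0 - the global was untouched

incrementWithGlobal();
$after_second = $GLOBALS['counter']; //Now 1
```

PHP raises no error for the first function, which is exactly why this kind of mistake can go unnoticed.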

When to use global variables

It is rarely a good idea to use global variables, especially in a large application, but there are some occasions when their ease of use and simplicity make them an acceptable option. In particular, if you are writing a short and relatively simple plugin or small app, which is going to be easy to read and understand, or perhaps a proof-of-concept or prototype script.

Static Classes (Helper Classes)

Using helper classes that just contain static members is another easy way of dealing with global data, although they share many of the same disadvantages as global variables. Classes with static members can contain both properties and methods, like a normal object, but do not need to be instantiated before use, and retain their values throughout the scope of your application. They have more in common with procedural code than object oriented code, despite the use of classes.

Static class example:

class SmtpConfig
{
    public static $host = 'localhost';
    public static $port = 465;
    public static $user = 'me@example.com';
    public static $password = 'j4a!9Sd@aKP2f';
    public static $tls = true;
}

echo SmtpConfig::$user; //This and other values from the class 
                        //are available everywhere as long as 
                        //the file containing the class 
                        //declaration has been included or can 
                        //be autoloaded

Static class advantages

A static helper class enables you to group several related pieces of data together.
Static classes are easy to use, easy to understand, and not so prone to naming clashes as global variables (although now that PHP supports namespaces, name clashes are not really an issue except in legacy code).
It is easier to locate where the data was initially defined (most IDEs will automatically locate the class definition for you, whereas with global variables, it is not always possible to tell where they were first declared), although this still doesn't stop the values being initialised or changed anywhere throughout your code base.

Static class disadvantages

If a static class has methods which have their own dependencies, they can be more difficult to unit test than instantiated classes with dependency injection (see below).
The main disadvantage of static classes is that they promote close coupling - any object relying on the static members is closely coupled with the code that initialises those members (and which in turn may have its own dependencies).
Static classes do not have a constructor, so any static methods have to do their own dependency checking, and the calling code has to perform any initialisation beyond just using the default values. This also typically requires error checking after the method call - at which point it is difficult to ascertain which dependency failed.
Dependencies are not enforced, so the code can fail to execute with few clues as to why, and for reasons related to a dependency of a dependency of a dependency rather than to the place in the code where the failure occurred (a debugging nightmare).

When to use static classes

Static classes are best used in simple cases where there are no dependencies, or where the dependencies are simple and fundamental enough to the operation of your application that they can be taken as read (since you have no way of enforcing them). An example of this would be your application's global error handler (although that could equally well be a singleton - see below - it is best to keep error handling as simple as possible, as it needs to be bulletproof, and static classes are arguably simpler than singletons - you could even use procedural code in a bootstrap file which is simpler still).

Static members are useful as private or protected members of an instantiated class, and provide a way of storing data once for many instances (thus reducing the amount of memory needed for each instance) - for example by holding immutable metadata such as database column information. They can also be used effectively for providing small algorithms that are never likely to need changing or overriding for internal use within an instantiated object. But a class full of just static members is a bit of a 'code smell', and there is usually a better way.
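
As a sketch of that more acceptable use of static members, here is an instantiated class that keeps its column metadata in a private static property and uses a small private static method internally. The CustomerRecord class and its columns are hypothetical.

```php
class CustomerRecord
{
    //Held once for the class, not once per instance
    private static $columns = array(
        'id'    => 'integer',
        'name'  => 'string',
        'email' => 'string',
    );

    private $data = array();

    public function set($column, $value)
    {
        if (!self::isColumn($column)) {
            throw new InvalidArgumentException('Unknown column ' . $column);
        }
        $this->data[$column] = $value;
    }

    public function get($column)
    {
        return isset($this->data[$column]) ? $this->data[$column] : null;
    }

    //Small internal algorithm that is unlikely to need overriding
    private static function isColumn($column)
    {
        return isset(self::$columns[$column]);
    }
}

$a = new CustomerRecord();
$a->set('name', 'Alice');
$b = new CustomerRecord();
$b->set('name', 'Bob');
```

Each instance holds its own data, but the column list exists only once and is not visible to the calling code.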

Singleton

A singleton is a class which is instantiated, but for which there can only be a single instance. A singleton class cannot be directly instantiated by the calling code (the constructor carries the private or protected modifier) - it has to be accessed through a static member which checks whether the class has already been instantiated, and if so, returns the existing instance, otherwise, creates a new one (which it holds on to in case another caller wants it). Compared with a static class, this allows better support for inheritance, and allows the object to be passed around and treated like any other object. The intention of the singleton pattern is not really to provide a mechanism for global data, but to ensure that only one object is created (it being globally available is a side effect).

Singleton example

class Singleton
{
    protected static $instance = null;

    protected function __construct()
    {
    }

    protected function __clone()
    {
    }

    public static function getInstance()
    {
        if (!isset(static::$instance))
        {
            static::$instance = new static();
        }
        return static::$instance;
    }
}

$singleton = Singleton::getInstance();

Singleton advantages

Ensures there is only one version of the object (allowing a resource to be shared).
Can be used from anywhere in the code - if it is not already instantiated, it will be on its first use.
Can support inheritance and polymorphism to a limited degree.

Singleton disadvantages

Enforcing a single instance of an object is rarely the desired behaviour (for example, whilst it might seem like you would only need one database connection in an application, requirements might change, requiring the application to access more than one database - perhaps for backup or synchronisation purposes).
There is no need for classes that rely on the singleton to declare their dependency on it, so it is not obvious that they rely on it and it creates a close coupling between them.
Inheritance and polymorphism are restricted, as there can still only be a single instance per request (but the implementation could be different for different types of request).
Once instantiated, the singleton will be held in memory for the life of the request even if it is not needed again (this might be desirable for objects that are expensive to instantiate and/or that are used frequently, but can negatively impact memory usage if used indiscriminately).
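
The hidden dependency problem is easier to see in code. In this sketch (Logger and OrderProcessor are invented names), nothing in the public face of OrderProcessor tells you that it needs the Logger singleton - you only find out by reading the method body:

```php
class Logger
{
    protected static $instance = null;
    public $messages = array();

    protected function __construct()
    {
    }

    protected function __clone()
    {
    }

    public static function getInstance()
    {
        if (!isset(static::$instance)) {
            static::$instance = new static();
        }
        return static::$instance;
    }

    public function log($message)
    {
        $this->messages[] = $message;
    }
}

class OrderProcessor
{
    //Nothing in the signature reveals that this method needs the Logger
    public function process()
    {
        Logger::getInstance()->log('Order processed');
    }
}

$processor = new OrderProcessor();
$processor->process();

$first = Logger::getInstance();
$second = Logger::getInstance(); //Same object as $first
```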

When to use a singleton

A singleton should only really be used if a single resource needs to be shared among different objects. It is necessary to check that even if the current requirements do not call for multiple instances of the object, any likely or potential future requirements will also not need to allow for multiple instances. A common use of the singleton pattern is for combining with a global registry pattern (see below), or for interacting with the operating system or host that the application is running on (of which there will only ever be one at a time).

Registry

The registry design pattern allows you to define an object (usually a singleton) which holds references to various other resources (typically as key/value pairs) that may be needed by your application (for example, database connections and configuration settings). Although the registry itself is usually a singleton (as you only want a single registry available to the whole application), the resources it stores are not expected to be singletons - it can store several different instances of the same class. The resources it stores do not even have to be objects - they can be primitive data types or arrays.

Resources can be stored in a hash table (array), or if you know that certain items will need to be in the registry, you can strongly type them (which will help with the code autocomplete features of your IDE). You could also have a mixture!

Registry example (weakly typed)

class Registry
{
    protected static $instance;
    protected $resources = array();

    protected function __construct()
    {
    }

    protected function __clone()
    {
    }

    public static function getInstance()
    {
        if (!isset(self::$instance)) {
            self::$instance = new Registry();
        }
        return self::$instance;
    }

    public function setResource($key, $value, $force_refresh = false)
    {
        if (!$force_refresh && isset($this->resources[$key])) {
            throw new RuntimeException('Resource ' . $key . ' has already been set. If you really ' 
                                       . 'need to replace the existing resource, set the $force_refresh '
                                       . 'flag to true.');
        }
        else {
            $this->resources[$key] = $value;
        }
    }

    public function getResource($key)
    {
        if (isset($this->resources[$key])) {
            return $this->resources[$key];
        }
        throw new RuntimeException ('Resource ' . $key . ' not found in the registry');
    }
}

//Add a resource to the registry
$db = new Database();
Registry::getInstance()->setResource('Database', $db);

//Retrieve a resource from the registry (elsewhere in the code)
$db = Registry::getInstance()->getResource('Database');

Registry example (strongly typed)

class Registry
{
    protected static $instance;
    protected $main_db;
    protected $sync_db;
    protected $config;

    protected function __construct()
    {
    }

    protected function __clone()
    {
    }

    public static function getInstance()
    {
        if (!isset(self::$instance)) {
            self::$instance = new Registry();
        }
        return self::$instance;
    }

    public function setMainDatabase(Database $value, $force_refresh = false)
    {
        if (!$force_refresh && isset($this->main_db)) {
            throw new RuntimeException('Main database has already been set. If you really '
                                       . 'need to replace the existing database, set the '
                                       . '$force_refresh flag to true.');
        }
        else {
            $this->main_db = $value;
        }
    }

    public function getMainDatabase()
    {
        if (isset($this->main_db)) {
            return $this->main_db;
        }
        throw new RuntimeException ('Main database resource not found in the registry');
    }

    public function setSyncDatabase(Database $value, $force_refresh = false)
    {
        if (!$force_refresh && isset($this->sync_db)) {
            throw new RuntimeException('Synchronisation database has already been set. If you really '
                                       . 'need to replace the existing database, set the $force_refresh '
                                       . 'flag to true.');
        }
        else {
            $this->sync_db = $value;
        }
    }

    public function getSyncDatabase()
    {
        if (isset($this->sync_db)) {
            return $this->sync_db;
        }
        throw new RuntimeException ('Synchronisation database resource not found in the registry');
    }

    public function setConfig(Config $value, $force_refresh = false)
    {
        if (!$force_refresh && isset($this->config)) {
            throw new RuntimeException('Configuration object has already been set. If you really '
                                       . 'need to replace the existing configuration, set the '
                                       . '$force_refresh flag to true.');
        }
        else {
            $this->config = $value;
        }
    }

    public function getConfig()
    {
        if (isset($this->config)) {
            return $this->config;
        }
        throw new RuntimeException('Configuration resource not found in the registry');
    }
}

//Add a resource to the registry
$db = new Database();
Registry::getInstance()->setMainDatabase($db);

//Retrieve a resource from the registry (elsewhere in the code)
$db = Registry::getInstance()->getMainDatabase();

In these examples, the developer is allowed to overwrite existing resources, but only if they make it clear that this was their intention (by setting the $force_refresh flag).
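
To make the overwrite rule concrete, here is a minimal sketch of what happens when a resource is set twice. It uses a cut-down copy of the registry with a single resource, and a hypothetical stand-in Database class with a $name property (an assumption made purely so the example can run on its own).

```php
<?php
//Hypothetical stand-in for the Database class assumed by the registry
class Database
{
    public $name;

    public function __construct($name)
    {
        $this->name = $name;
    }
}

//Cut-down registry holding a single resource, following the same rules as the full example
class Registry
{
    protected static $instance;
    protected $main_db;

    protected function __construct()
    {
    }

    public static function getInstance()
    {
        if (!isset(self::$instance)) {
            self::$instance = new Registry();
        }
        return self::$instance;
    }

    public function setMainDatabase(Database $value, $force_refresh = false)
    {
        if (!$force_refresh && isset($this->main_db)) {
            throw new RuntimeException('Main database has already been set.');
        }
        $this->main_db = $value;
    }

    public function getMainDatabase()
    {
        if (isset($this->main_db)) {
            return $this->main_db;
        }
        throw new RuntimeException('Main database resource not found in the registry');
    }
}

$registry = Registry::getInstance();
$registry->setMainDatabase(new Database('first'));

//A second call without the flag is rejected...
$rejected = false;
try {
    $registry->setMainDatabase(new Database('second'));
} catch (RuntimeException $e) {
    $rejected = true;
}

//...but an explicit overwrite is allowed
$registry->setMainDatabase(new Database('third'), true);
$current_name = $registry->getMainDatabase()->name;
```

After this runs, $rejected is true and $current_name is 'third': the accidental overwrite was blocked and the deliberate one went through.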

Registry advantages

If there are common dependencies that are used throughout your code, you can use the global registry instead of passing an individual parameter for each one.
A registry allows you the freedom to store and manage your global data centrally, without restricting the implementation to a single instance, and allowing full use of inheritance and polymorphism for the resources it manages.
A strongly typed registry allows your IDE to help you avoid typing mistakes.
A registry is somewhat easier to use than dependency injection.

Registry disadvantages

A registry still hides dependencies and is tightly coupled to objects that rely on it (or its contents), although not as tightly as a global variable (because the resources can be replaced with different subclasses).
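
The hidden dependency problem can be sketched as follows. The PageTitle class, the Config class with its get() method, and the 'site_name' key are all hypothetical names invented for this illustration, and the registry is cut down to the configuration resource only.

```php
<?php
//Hypothetical configuration object
class Config
{
    protected $values;

    public function __construct(array $values)
    {
        $this->values = $values;
    }

    public function get($key)
    {
        return $this->values[$key];
    }
}

//Cut-down registry holding only the configuration object
class Registry
{
    protected static $instance;
    protected $config;

    protected function __construct()
    {
    }

    public static function getInstance()
    {
        if (!isset(self::$instance)) {
            self::$instance = new Registry();
        }
        return self::$instance;
    }

    public function setConfig(Config $value)
    {
        $this->config = $value;
    }

    public function getConfig()
    {
        if (isset($this->config)) {
            return $this->config;
        }
        throw new RuntimeException('Configuration resource not found in the registry');
    }
}

class PageTitle
{
    //Nothing in this signature reveals that a Config object is needed
    public function render($page)
    {
        $config = Registry::getInstance()->getConfig();
        return $page . ' | ' . $config->get('site_name');
    }
}

$title = new PageTitle();

//The call fails because the registry has not been primed
$failed = false;
try {
    $title->render('Home');
} catch (RuntimeException $e) {
    $failed = true;
}

//Once the registry is primed, the identical call works
Registry::getInstance()->setConfig(new Config(array('site_name' => 'Always Learning')));
$rendered = $title->render('Home'); //'Home | Always Learning'
```

The calling code has no way of knowing from PageTitle alone that the registry must be set up first, which is exactly what a constructor parameter would have made obvious.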

When to use a registry

Some developers reject the use of a registry on the grounds that it is just a global array in disguise, and (in particular with a weakly typed implementation) gives no clue as to how its contents are meant to be used. However, the generally preferred alternative (dependency injection - see below) can get out of hand when you have to inject lots of dependencies, many of which are the same ones over and over again (you can use a dependency injection container to manage this, but it is arguably more complex than using a global registry). Used sparingly, then, a registry can be an appropriate vehicle for managing the most common dependencies that are fundamental to the workings of your application (typically, one or more databases, maybe a logger, and a configuration object), without requiring an unreasonably long list of parameters, or repeated use of the same parameters, for every object instantiation.

Dependency Injection

Dependency injection requires that the calling code supply all of the dependencies to an object before use. In most cases, this is done by passing parameters to the constructor - so that the object cannot be instantiated unless it has been given all of the data it needs to do its job. Optional dependencies are often injected using a separate method call. Injected dependencies are often objects but they don't have to be - any data the object requires to do its job is a dependency and must be supplied by the calling code.

Dependency injection example

class Person
{
    protected $database;
    public $title;
    public $first_name;
    public $last_name;
    
    public function __construct(Database $db, $last_name)
    {
        //Validate before storing anything, so an invalid object can never be created
        if (strlen($last_name) == 0) {
            throw new InvalidArgumentException('Last name required');
        }
        $this->database = $db;
        $this->last_name = $last_name;
    }
    
    public function setTitle($title)
    {
        $this->title = $title;
    }

    public function setFirstName($first_name)
    {
        $this->first_name = $first_name;
    }
    
    //More methods here which use the database object
}

$db = new Database('localhost', 'user', 'password');
//The database object is passed to the Person object in the constructor, along with some other data
$person = new Person($db, 'Smith');
$person->setTitle('Mr'); //Optional dependencies can be set with a separate method call

Dependency injection advantages

Dependency injection de-couples your code, allowing each object to exist and perform operations without requiring any particular environmental setup.
This makes code re-use much easier, as you can just use the same object in another application or in another setting in the same application.
It also makes unit testing much easier, as a test can be set up to inject real or dummy dependencies for the purposes of testing the object.
It is obvious to the calling code what the dependencies are - it can therefore supply everything that is needed without worrying that there might be some hidden dependency which will break the application if not supplied.
Inheritance and polymorphism can be used to great effect by specifying the parent class (or interface) as a dependency - the calling code can then supply any subclass and the object doesn't need to know or care what the implementation is (allowing for easy extensibility). For example, if a class has a constructor which requires a database object to be injected, the calling code can inject a MySQL database object or an SQLite database object, or any other subclass of database (perhaps even one that hasn't been invented yet).
By passing dependencies in the constructor, any problems can be caught early - the class can verify that it has valid dependency data before it will allow instantiation. This makes debugging much easier.
For a lucid explanation of why dependency injection is generally superior to using statics, please see David C. Zentgraf's post: How Not to Kill Your Testability Using Statics (the article is not just about testability).
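
As a rough illustration of the testing and polymorphism points above, the sketch below injects a stub in place of a real database. DatabaseInterface, StubDatabase, PersonFinder and the fetchRow() method are hypothetical names chosen for this example, not part of any real library.

```php
<?php
//Hypothetical interface that any database implementation would satisfy
interface DatabaseInterface
{
    public function fetchRow($sql);
}

//Stub used only in tests: returns a canned row and records the queries it receives
class StubDatabase implements DatabaseInterface
{
    public $queries = array();
    protected $row;

    public function __construct(array $row)
    {
        $this->row = $row;
    }

    public function fetchRow($sql)
    {
        $this->queries[] = $sql;
        return $this->row;
    }
}

//The class under test depends on the interface, not on a concrete database
class PersonFinder
{
    protected $database;

    public function __construct(DatabaseInterface $db)
    {
        $this->database = $db;
    }

    public function findFullName($id)
    {
        $row = $this->database->fetchRow(
            'SELECT first_name, last_name FROM person WHERE id = ' . (int) $id
        );
        return $row['first_name'] . ' ' . $row['last_name'];
    }
}

//No real database connection is needed to test PersonFinder
$stub = new StubDatabase(array('first_name' => 'John', 'last_name' => 'Smith'));
$finder = new PersonFinder($stub);
$full_name = $finder->findFullName(42); //'John Smith'
```

Because PersonFinder only asks for something that implements the interface, the same class would work unchanged with a MySQL or SQLite implementation in production.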

Dependency injection disadvantages

The calling code may have more work to do to initialise an object, especially if the dependencies you are injecting have dependencies of their own (if this gets out of hand you could look into using a dependency injection container).
If there are lots of dependencies, you could end up with a long list of parameters in your constructor which makes the code difficult to read and understand.

When to use dependency injection

In most cases, dependency injection provides more advantages than disadvantages, so it is becoming common practice to use it by default and only avoid it if it is causing problems. Where certain objects are used extensively throughout the code base (such as a database or configuration object), injecting them into every object can become laborious and inelegant. In such cases, it might be better to just accept a certain amount of tight coupling for the sake of code readability (and writability!), and to use the setup and teardown features of your unit testing software to initialise and destroy the most common dependencies.

Increasingly though, dependency injection containers are used to handle multiple object dependencies. Using a container allows the dependencies to be defined just once instead of at every instantiation, and the dependencies can even be defined in a config file or in annotations rather than in the code itself. There are various frameworks available that provide dependency injection containers, some of which are very lightweight and specialise in just providing containers.
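
To give a feel for what a container does, here is a deliberately tiny sketch. It is not modelled on any particular framework: the Container class, its set() and get() methods, and the PersonRepository class are all assumptions made for illustration, and it needs PHP 5.3 or above for the closures.

```php
<?php
//Minimal hypothetical container: each dependency is defined once as a factory
class Container
{
    protected $factories = array();
    protected $instances = array();

    public function set($name, $factory)
    {
        $this->factories[$name] = $factory;
        unset($this->instances[$name]);
    }

    public function get($name)
    {
        if (!isset($this->factories[$name])) {
            throw new RuntimeException("No definition found for '$name'");
        }
        //Build the object on first request, then reuse it
        if (!isset($this->instances[$name])) {
            $this->instances[$name] = call_user_func($this->factories[$name], $this);
        }
        return $this->instances[$name];
    }
}

//Hypothetical classes to wire together
class Database
{
    public $host;

    public function __construct($host)
    {
        $this->host = $host;
    }
}

class PersonRepository
{
    public $database;

    public function __construct(Database $db)
    {
        $this->database = $db;
    }
}

$container = new Container();
$container->set('database', function ($c) {
    return new Database('localhost');
});
$container->set('person_repository', function ($c) {
    //The repository's own dependency is resolved by the container
    return new PersonRepository($c->get('database'));
});

$first = $container->get('person_repository');
$second = $container->get('person_repository');
```

Here $first and $second are the same object, and the repository still receives its database through its constructor, so the class itself remains as testable as before. Real containers add features such as configuration files and automatic resolution, which this sketch leaves out.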

In conclusion

The developer has to make a judgement call about when to use which approach for handling global data. Each approach has its advantages and disadvantages, and whilst some (dependency injection) are clearly more desirable than others (global variables) in most situations, it is not helpful to make blanket rules (like 'singletons are evil').

Using Procedural Code in PHP

2013-09-04T16:58:00.000+01:00

Object oriented programming (OOP) in PHP has become increasingly popular since PHP 5 was released, and especially from PHP 5.3 onwards. Without doubt, writing a large application using OOP in PHP has a lot of advantages over using purely procedural code (and if you are new to PHP I strongly advise concentrating your efforts on OOP). But despite the advances PHP has seen in OO support, it is still a procedural language at heart, and there are times when procedural is the way to go. Here are some examples...

When to use procedural code in PHP

All PHP scripts have to start with procedural code! Typically you have an 'index.php' file as the entry point to your application, and before you can use any objects, you have to instantiate them, and the first instantiation has to be done procedurally - there is no other option.
In most cases there are certain 'bootstrap' tasks that have to be carried out before you can process a request. This might include registering an autoload function, setting up error handling, and checking that the system is capable of running your application. All of this is typically done procedurally (perhaps also using static methods) because it is impractical to do it using OOP. For example, if you do not control the deployment environment, you might want your app to die gracefully even if someone tries to run it on PHP 4 - but if the bootstrap code itself relies on OO features introduced in PHP 5, it will fail before it gets the chance to report the problem.
You could use procedural code to provide alternatives or additions to the built-in PHP functions (although you should first consider whether OOP would be better). For example, your script might or might not have access to multibyte string functions - it could be useful to have your own alternative functions that use multibyte functions if available or the standard PHP functions if not. In PHP 5.3 and above you can use namespaces to override default PHP functions, but this might not be very wise, as your code will no longer behave in the way any other PHP developer would expect. It is better to write separate functions with a different name.
If you are just writing an example script, a proof of concept, or a very simple script or plugin, there is probably no need to go OO.
If you prefer to write procedurally and feel you write better code that way, there is no reason why you shouldn't do so! Procedural code is not 'wrong' per se; there is just a tendency for it to lead to code that is difficult to test and difficult to maintain. But some great applications have been written procedurally, and it can be an acceptable choice, especially if you don't need to collaborate with anyone else. Better to write good procedural code than bad object oriented code!
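
The multibyte fallback idea mentioned above might look something like this. The function name safe_strlen() is made up for this example, and the choice of UTF-8 is an assumption about the encoding your application uses.

```php
<?php
//Hypothetical helper with its own name, rather than an override of strlen()
function safe_strlen($string)
{
    //Use the multibyte function if the mbstring extension is available
    if (function_exists('mb_strlen')) {
        return mb_strlen($string, 'UTF-8');
    }
    //Otherwise fall back to the standard function, which counts bytes rather than characters
    return strlen($string);
}

$length = safe_strlen('hello'); //5, whichever branch is taken
```

For plain ASCII text both branches give the same answer. For multibyte text the fallback will over-count, so whether that is acceptable depends on what the length is being used for.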

When not to use procedural code in PHP

Pretty much any other time! Of course, I might not have thought of everything, so there are probably other occasions when procedural code is the right choice.

Other considerations

Object oriented vs procedural is not the be-all and end-all of programming models. I found this post about the 'messaging' programming model very helpful (at least the first part - it gets more biased towards MS Windows later on): Drastically Improving Your Code With Messaging as a Programming Model.

First Post

2013-09-04T15:28:00.001+01:00

I'm always learning new stuff. Sometimes stuff is worth sharing, or at least recording so I can find it again later. So I decided to start a blog. And here it is.

I'll try to make sure I know what I'm talking about before posting, but I'm always learning and always happy to be corrected. I'm interested in a lot of subjects, so my posts might seem a bit random, but I appreciate any feedback on whether posts are useful or not.