Almost every web application needs to handle global data. Certain things have to be available throughout the entire code base, such as database connections, configuration settings, and error handling routines. As a PHP developer, you may have heard the mantra 'globals are evil', but this naturally raises the question 'what should I use instead of global variables?'
There are several different strategies available to help cope with the demand for global data. Each has its advantages and disadvantages, and it can be a challenge to know which approach to use for any given situation. Here I will try to outline what the options are, how they work, their advantages and disadvantages, and the circumstances under which each option might be used. The code samples are not necessarily realistic; they are kept as simple as possible to demonstrate the idea. The principles and design patterns that follow are not specific to PHP - they can be applied to any object-oriented language.
Global Variables
Probably the easiest solution to understand and to implement is the use of global variables. Any variable assigned in the top-level scope of a script is global, and the global keyword makes it available inside a function or method. Typically you declare the variable with the global keyword in every scope where you want to use it, although you can instead use the $GLOBALS 'superglobal' array to reference the value without declaring it first.
Global variable example:
    global $database; //Only has an effect if this file is
                      //included from inside a function
    $database = new Database();

    function doSomething()
    {
        global $database; //This has to be declared so that PHP knows
                          //you want to use the global variable, not
                          //a local one
        $data = $database->readStuff();
    }
Global variable example using the $GLOBALS superglobal:
    global $database;
    $database = new Database();

    function doSomething()
    {
        $data = $GLOBALS['database']->readStuff(); //No need to
                                                   //declare the
                                                   //global first
    }
Global variable advantages
- The main advantage of global variables is that they are easy
to understand, and easy to use in your code. Their ease of use makes
it tempting to use them a lot, even though there may be much better
options available!
Global variable disadvantages
The disadvantages of using global variables nearly always outweigh the advantages.
- They make your code hard to read and hard to understand. It
is not obvious what the variable is for, where or how it was
initialised, or what is the proper way to use it.
- They make your code hard to maintain. It is very difficult to
make changes to global variables, as you have to search through your
entire code base looking for where they have been used.
- It is easy to abuse a global variable and cause errors that
are hard to debug. Without any control mechanism for how the
variable is used, it is easy to populate it with invalid data which
can cause errors in other parts of the code (for example, if one
part of the code populates the variable with an array but another
part of the code expects it to contain an object).
- It is easy to get confused regarding the variable's scope. If
you forget to declare that the variable is global, you can end up
unwittingly working with a local variable without noticing - until
your app breaks. This can also be hard to debug.
- If you combine your code with someone else's code (eg. by
using a third party library or writing an extension for another
piece of software), and both systems use global variables, there is
a chance that the variable names could clash, causing errors in both
systems which are hard to debug.
- All parts of your code that use a global variable are tightly
coupled and it becomes very difficult to separate out or re-use a
module elsewhere.
- Unit testing is made more difficult, as the test has to know
which global variables are needed, and how to initialise all global
variables with valid values.
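The scope confusion and the lack of any control over the variable's contents can be seen in a few lines. This is only a sketch: $counter and the two functions are hypothetical names invented for the illustration, and it assumes the code runs at the top level of a script.

```php
<?php
// Hypothetical example: $counter and both functions are illustrative names.
$counter = 0;

function setWithoutGlobal()
{
    // No "global $counter;" here, so this creates a new local variable.
    // The global $counter is never touched.
    $counter = 100;
    return $counter;
}

function incrementWithGlobal()
{
    global $counter; // Import the global variable into this scope
    $counter++;
    return $counter;
}

setWithoutGlobal();
$afterLocal = $counter;  // The global is unchanged by the first function

incrementWithGlobal();
$afterGlobal = $counter; // The global has now been incremented

// Nothing stops other code from replacing the value with a different type
$counter = array('oops');
$isStillInt = is_int($counter);
```

The first function runs without any error or warning, which is exactly why this kind of mistake tends to go unnoticed until something else breaks.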
When to use global variables
It is rarely a good idea to use global variables, especially in a large application, but there are some occasions when their ease of use and simplicity make them an acceptable option. In particular, they can be reasonable if you are writing a short and relatively simple plugin or small app which is going to be easy to read and understand, or perhaps a proof-of-concept or prototype script.
Static Classes (Helper Classes)
Using helper classes that just contain static members is another easy way of dealing with global data, although they share many of the same disadvantages as global variables. Classes with static members can contain both properties and methods, like a normal object, but do not need to be instantiated before use, and retain their values throughout the lifetime of the request. They have more in common with procedural code than object-oriented code, despite the use of classes.
Static class example:
    class SmtpConfig
    {
        public static $host = 'localhost';
        public static $port = 465;
        public static $user = 'me@example.com';
        public static $password = 'j4a!9Sd@aKP2f';
        public static $tls = true;
    }

    echo SmtpConfig::$user; //This and other values from the class
                            //are available everywhere as long as
                            //the file containing the class
                            //declaration has been included or can
                            //be autoloaded
Static class advantages
- A static helper class enables you to group several related
pieces of data together.
- Static classes are easy to use, easy to understand, and
not so prone to naming clashes as global variables (although now
that PHP supports namespaces, name clashes are not really an issue
except in legacy code).
- It is easier to locate where the data was initially defined
(most IDEs will automatically locate the class definition for you,
whereas with global variables, it is not always possible to tell
where they were first declared), although this still doesn't stop
the values being initialised or changed anywhere throughout your
code base.
Static class disadvantages
- If a static class has methods which have their own
dependencies, they can be more difficult to unit test than
instantiated classes with dependency injection (see below).
- The main disadvantage of static classes is that they promote
close coupling - any object relying on the static members is closely
coupled with the code that initialises those members (and which in
turn may have its own dependencies).
- Static classes do not have a constructor, so any static
methods have to do their own dependency checking, and the calling
code has to perform any initialisation beyond just using the default
values. This also typically requires error checking after the method
call - at which point it is difficult to ascertain which dependency
failed.
- Dependencies are not enforced, so the code can fail
with few clues as to why, and for reasons related to a
dependency of a dependency of a dependency and not for reasons
relating to the place in the code where the failure occurred (a
debugging nightmare).
When to use static classes
Static classes are best used in simple cases where there are no dependencies, or where the dependencies are simple and fundamental enough to the operation of your application that they can be taken as read (since you have no way of enforcing them). An example of this would be your application's global error handler. That could equally well be a singleton (see below), but it is best to keep error handling as simple as possible, as it needs to be bulletproof, and static classes are arguably simpler than singletons. You could even use procedural code in a bootstrap file, which is simpler still.
Static members are useful as private or protected members of an instantiated class, and provide a way of storing data once for many instances (thus reducing the amount of memory needed for each instance) - for example by holding immutable metadata such as database column information. They can also be used effectively for providing small algorithms, for internal use within an instantiated object, that are never likely to need changing or overriding. But a class full of just static members is a bit of a 'code smell', and there is usually a better way.
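As a rough illustration of static members inside an instantiated class, the sketch below keeps column metadata once for the whole class while each instance holds only its own values. The class name, the columns and the methods are all hypothetical, made up for this example.

```php
<?php
// Hypothetical class: the name, columns and methods are illustrative only.
class CustomerRecord
{
    // Stored once for the class, not once per instance
    private static $columns = array(
        'id'    => 'int',
        'name'  => 'string',
        'email' => 'string',
    );

    // Per-instance data
    private $values = array();

    public function set($column, $value)
    {
        if (!self::isKnownColumn($column)) {
            throw new InvalidArgumentException('Unknown column: ' . $column);
        }
        $this->values[$column] = $value;
    }

    public function get($column)
    {
        return isset($this->values[$column]) ? $this->values[$column] : null;
    }

    // Small internal helper that does not depend on instance state
    private static function isKnownColumn($column)
    {
        return isset(self::$columns[$column]);
    }
}

$a = new CustomerRecord();
$a->set('name', 'Smith');

$b = new CustomerRecord();
$b->set('name', 'Jones');
```

Because the static members are private, nothing outside the class can read or change them, which avoids most of the problems described for fully static helper classes.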
Singleton
A singleton is a class which is instantiated, but for which there can only be a single instance. A singleton class cannot be directly instantiated by the calling code (the constructor carries the private or protected modifier) - it has to be accessed through a static method which checks whether the class has already been instantiated, and if so, returns the existing instance, otherwise creates a new one (which it holds on to in case another caller wants it). Compared with a static class, this allows better support for inheritance, and lets the object be passed around and treated like any other object. The intention of the singleton pattern is not really to provide a mechanism for global data, but to ensure that only one object is created (it being globally available is a side effect).
Singleton example
    class Singleton
    {
        protected static $instance = null;

        protected function __construct() { }
        protected function __clone() { }

        public static function getInstance()
        {
            //static:: is used throughout so that a sub class
            //calling getInstance() is instantiated as itself
            if (!isset(static::$instance)) {
                static::$instance = new static();
            }
            return static::$instance;
        }
    }

    $singleton = Singleton::getInstance();
Singleton advantages
- Ensures there is only one version of the object (allowing a
resource to be shared).
- Can be used from anywhere in the code - if it is not already
instantiated, it will be on its first use.
- Can support inheritance and polymorphism to a limited degree.
Singleton disadvantages
- Enforcing a single instance of an object is rarely the
desired behaviour (for example, whilst it might seem like you would
only need one database connection in an application, requirements
might change, requiring the application to access more than one
database - perhaps for backup or synchronisation purposes).
- There is no need for classes that rely on the singleton to
declare their dependency on it, so it is not obvious that they rely
on it and it creates a close coupling between them.
- Inheritance and polymorphism are restricted, as there can
still only be a single instance per request (but the implementation
could be different for different types of request).
- Once instantiated, the singleton will be held in memory for
the life of the request even if it is not needed again (this might
be desirable for objects that are expensive to instantiate and/or
that are used frequently, but can negatively impact memory usage if
used indiscriminately).
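The inheritance restriction is easy to miss, so here is a small sketch of it. It assumes a singleton written with late static binding and a single shared $instance property that the sub class does not redeclare. BaseLogger and FileLogger are hypothetical classes invented for the illustration.

```php
<?php
// Hypothetical classes used only to illustrate the behaviour.
class BaseLogger
{
    protected static $instance = null;

    protected function __construct() { }
    protected function __clone() { }

    public static function getInstance()
    {
        if (!isset(static::$instance)) {
            static::$instance = new static();
        }
        return static::$instance;
    }

    public function describe()
    {
        return 'base';
    }
}

class FileLogger extends BaseLogger
{
    // $instance is not redeclared, so it is shared with BaseLogger
    public function describe()
    {
        return 'file';
    }
}

// Whichever class is asked first decides the implementation
$first  = BaseLogger::getInstance();
$second = FileLogger::getInstance();

$sameObject  = ($first === $second); // Both calls return one object
$description = $second->describe();  // The BaseLogger implementation
```

If the calls were made in the opposite order, both variables would hold a FileLogger instead, which is what is meant by the implementation being different for different types of request.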
When to use a singleton
A singleton should only really be used if a single resource needs to be shared among different objects. Even if the current requirements do not call for multiple instances of the object, check that likely or potential future requirements will not need them either. A common use of the singleton pattern is for combining with a global registry pattern (see below), or for interacting with the operating system or host that the application is running on (of which there will only ever be one at a time).
Registry
The registry design pattern allows you to define an object (usually a singleton) which holds references to various other resources (typically as key/value pairs) that may be needed by your application (for example, database connections and configuration settings). Although the registry itself is usually a singleton (as you only want a single registry available to the whole application), the resources it stores are not expected to be singletons - it can store several different instances of the same class. The resources it stores do not even have to be objects - they can be primitive data types or arrays.
Resources can be stored in a hash table (array), or if you know that certain items will need to be in the registry, you can strongly type them (which will help with the code autocomplete features of your IDE). You could also have a mixture!
Registry example (weakly typed)
    class Registry
    {
        protected static $instance;
        protected $resources = array();

        protected function __construct() { }
        protected function __clone() { }

        public static function getInstance()
        {
            if (!isset(self::$instance)) {
                self::$instance = new Registry();
            }
            return self::$instance;
        }

        public function setResource($key, $value, $force_refresh = false)
        {
            if (!$force_refresh && isset($this->resources[$key])) {
                throw new RuntimeException('Resource ' . $key . ' has already been set. If you really '
                    . 'need to replace the existing resource, set the $force_refresh '
                    . 'flag to true.');
            } else {
                $this->resources[$key] = $value;
            }
        }

        public function getResource($key)
        {
            if (isset($this->resources[$key])) {
                return $this->resources[$key];
            }
            throw new RuntimeException('Resource ' . $key . ' not found in the registry');
        }
    }

    //Add a resource to the registry
    $db = new Database();
    Registry::getInstance()->setResource('Database', $db);

    //Retrieve a resource from the registry (elsewhere in the code)
    $db = Registry::getInstance()->getResource('Database');
Registry example (strongly typed)
    class Registry
    {
        protected static $instance;
        protected $main_db;
        protected $sync_db;
        protected $config;

        protected function __construct() { }
        protected function __clone() { }

        public static function getInstance()
        {
            if (!isset(self::$instance)) {
                self::$instance = new Registry();
            }
            return self::$instance;
        }

        public function setMainDatabase(Database $value, $force_refresh = false)
        {
            if (!$force_refresh && isset($this->main_db)) {
                throw new RuntimeException('Main database has already been set. If you really '
                    . 'need to replace the existing database, set the '
                    . '$force_refresh flag to true.');
            } else {
                $this->main_db = $value;
            }
        }

        public function getMainDatabase()
        {
            if (isset($this->main_db)) {
                return $this->main_db;
            }
            throw new RuntimeException('Main database resource not found in the registry');
        }

        public function setSyncDatabase(Database $value, $force_refresh = false)
        {
            if (!$force_refresh && isset($this->sync_db)) {
                throw new RuntimeException('Synchronisation database has already been set. If you really '
                    . 'need to replace the existing database, set the $force_refresh '
                    . 'flag to true.');
            } else {
                $this->sync_db = $value;
            }
        }

        public function getSyncDatabase()
        {
            if (isset($this->sync_db)) {
                return $this->sync_db;
            }
            throw new RuntimeException('Synchronisation database resource not found in the registry');
        }

        public function setConfig(Config $value, $force_refresh = false)
        {
            if (!$force_refresh && isset($this->config)) {
                throw new RuntimeException('Configuration object has already been set. If you really '
                    . 'need to replace the existing configuration, set the '
                    . '$force_refresh flag to true.');
            } else {
                $this->config = $value;
            }
        }

        public function getConfig()
        {
            if (isset($this->config)) {
                return $this->config;
            }
            throw new RuntimeException('Configuration resource not found in the registry');
        }
    }

    //Add a resource to the registry
    $db = new Database();
    Registry::getInstance()->setMainDatabase($db);

    //Retrieve a resource from the registry (elsewhere in the code)
    $db = Registry::getInstance()->getMainDatabase();
In these examples, the developer is allowed to overwrite existing resources, but only if they make it clear that this was their intention (by setting the $force_refresh flag).
Registry advantages
- If there are common dependencies that are used throughout
your code, you can use the global registry instead of passing an
individual parameter for each one.
- A registry allows you the freedom to store and manage your
global data centrally, without restricting the implementation to a
single instance, and allowing full use of inheritance and
polymorphism for the resources it manages.
- A strongly typed registry allows your IDE to help you avoid
typing mistakes.
- A registry is somewhat easier to use than dependency
injection.
Registry disadvantages
- A registry still hides dependencies and is tightly coupled to
objects that rely on it (or its contents), although not as tightly
as a global variable (because the resources can be replaced with
different sub classes).
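Both halves of that trade-off can be sketched briefly: the function below gives no hint in its signature that it needs the registry, yet the resource it fetches can be swapped for a sub class. All of the names (DemoRegistry, Mailer, NullMailer, sendWelcome) are hypothetical, and DemoRegistry is a cut-down stand-in for a registry rather than a recommended implementation.

```php
<?php
// Hypothetical names throughout: DemoRegistry, Mailer, NullMailer, sendWelcome().
class Mailer
{
    public function send($to, $subject)
    {
        // A real implementation would talk to a mail server here
        return 'sent to ' . $to;
    }
}

class NullMailer extends Mailer
{
    public $sent = array();

    public function send($to, $subject)
    {
        // Record the recipient instead of sending anything
        $this->sent[] = $to;
        return 'recorded ' . $to;
    }
}

class DemoRegistry
{
    private static $instance = null;
    private $resources = array();

    private function __construct() { }

    public static function getInstance()
    {
        if (self::$instance === null) {
            self::$instance = new self();
        }
        return self::$instance;
    }

    public function set($key, $value)
    {
        $this->resources[$key] = $value;
    }

    public function get($key)
    {
        if (!isset($this->resources[$key])) {
            throw new RuntimeException('Resource ' . $key . ' not found');
        }
        return $this->resources[$key];
    }
}

// The dependency on the registry is invisible in this signature
function sendWelcome($address)
{
    $mailer = DemoRegistry::getInstance()->get('Mailer');
    return $mailer->send($address, 'Welcome');
}

// A sub class is stored in place of the real mailer
DemoRegistry::getInstance()->set('Mailer', new NullMailer());
$result = sendWelcome('me@example.com');
```

A caller who forgets to populate the registry before calling sendWelcome() only finds out at run time, when get() throws.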
When to use a registry
Some developers reject the use of a registry on the grounds that it is just a global array in disguise, and (in particular with a weakly typed implementation) gives no clue as to how its contents are meant to be used. However, the generally preferred alternative (dependency injection - see below) can get out of hand when you have to inject lots of dependencies, many of which are the same ones over and over again (you can use a dependency injection container to manage this, but it is arguably more complex than using a global registry). Used sparingly then, a registry can be an appropriate vehicle for managing the most common dependencies that are fundamental to the workings of your application (typically, one or more databases, maybe a logger, and a configuration object), without requiring an unreasonably long list of parameters, or repeated use of the same parameters, for every object instantiation.
Dependency Injection
Dependency injection requires that the calling code supply all of the dependencies to an object before use. In most cases, this is done by passing parameters to the constructor - so that the object cannot be instantiated unless it has been given all of the data it needs to do its job. Optional dependencies are often injected using a separate method call. Injected dependencies are often objects but they don't have to be - any data the object requires to do its job is a dependency and must be supplied by the calling code.
Dependency injection example
    class Person
    {
        protected $database;
        public $title;
        public $first_name;
        public $last_name;

        public function __construct(Database $db, $last_name)
        {
            $this->database = $db;
            if (strlen($last_name) == 0) {
                throw new Exception('Last name required');
            }
            $this->last_name = $last_name;
        }

        public function setTitle($title)
        {
            $this->title = $title;
        }

        public function setFirstName($first_name)
        {
            $this->first_name = $first_name;
        }

        //More methods here which use the database object
    }

    $db = new Database('localhost', 'user', 'password');

    //The database object is passed to the Person object in the
    //constructor, along with some other data
    $person = new Person($db, 'Smith');

    //Optional dependencies can be set with a separate method call
    $person->setTitle('Mr');
Dependency injection advantages
- Dependency injection de-couples your code, allowing each
object to exist and perform operations without requiring any
particular environmental setup.
- This makes code re-use much easier, as you can just use the
same object in another application or in another setting in the same
application.
- It also makes unit testing much easier, as a test can be set
up to inject real or dummy dependencies for the purposes of testing
the object.
- It is obvious to the calling code what the dependencies are -
it can therefore supply everything that is needed without worrying
that there might be some hidden dependency which will break the
application if not supplied.
- Inheritance and polymorphism can be used to great effect by
specifying the parent class (or interface) as a dependency - the
calling code can then supply any sub class and the object doesn't
need to know or care what the implementation is (allowing for easy
extensibility). For example, if a class has a constructor which
requires a database object to be injected, the calling code can
inject a MySQL database class or an SQLite database, or any other
sub class of database (perhaps even one that hasn't been invented
yet).
- By passing dependencies in the constructor, any problems can
be caught early - the class can verify that it has valid dependency
data before it will allow instantiation. This makes debugging much
easier.
- For a lucid explanation of why dependency injection is
generally superior to using statics, please see David C. Zentgraf's
post: How Not to Kill Your Testability Using Statics (the article is
not just about testability).
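To make the points about polymorphism and testing concrete, here is a minimal sketch in which a class depends on an interface and is handed an in-memory implementation. Storage, ArrayStorage and ReportWriter are hypothetical names chosen for the illustration; a real application would inject something backed by a database instead.

```php
<?php
// Hypothetical names: Storage, ArrayStorage and ReportWriter.
interface Storage
{
    public function save($key, $value);
    public function load($key);
}

// An in-memory implementation, handy as a dummy dependency in tests
class ArrayStorage implements Storage
{
    private $data = array();

    public function save($key, $value)
    {
        $this->data[$key] = $value;
    }

    public function load($key)
    {
        return isset($this->data[$key]) ? $this->data[$key] : null;
    }
}

class ReportWriter
{
    private $storage;

    // Any implementation of Storage can be supplied; the writer
    // does not know or care which one it has been given
    public function __construct(Storage $storage)
    {
        $this->storage = $storage;
    }

    public function write($name, $body)
    {
        if (strlen($name) == 0) {
            throw new InvalidArgumentException('Report name required');
        }
        $this->storage->save('report:' . $name, $body);
    }
}

$storage = new ArrayStorage();
$writer  = new ReportWriter($storage);
$writer->write('sales', 'Q1 totals');

$stored = $storage->load('report:sales');
```

Because the constructor declares the Storage type, passing anything else fails immediately at instantiation rather than somewhere deep inside a later method call.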
Dependency injection disadvantages
- The calling code may have more work to do to initialise an
object, especially if the dependencies you are injecting have
dependencies of their own (if this gets out of hand you could look
into using a dependency injection container).
- If there are lots of dependencies, you could end up with a
long list of parameters in your constructor which makes the code
difficult to read and understand.
When to use dependency injection
In most cases, dependency injection provides more advantages than disadvantages, so it is becoming common practice to use it by default and only avoid it if it is causing problems. Where certain objects are used extensively throughout the code base (such as a database or configuration object), injecting them into every object can become laborious and inelegant. In such cases, it might be better to just accept a certain amount of close-coupling for the sake of code readability (and writability!), and to use the setup and teardown features of your unit testing software to initialise and destroy the most common dependencies.
Increasingly though, dependency injection containers are used to handle multiple object dependencies. Using a container allows the dependencies to be defined just once instead of at every instantiation, and the dependencies can even be defined in a config file or in annotations rather than in the code itself. There are various frameworks available that provide dependency injection containers, some of which are very lightweight and specialise in just providing containers.
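The idea behind a container can be sketched in a few dozen lines. This is a deliberately minimal, hypothetical container (SimpleContainer, AppConfig, Connection and UserRepository are invented for the example) and it does not reflect the API of any particular framework. It assumes each service should be created once and then shared.

```php
<?php
// Hypothetical minimal container; real containers offer far more.
class SimpleContainer
{
    private $factories = array();
    private $instances = array();

    // Define how to build a service, once
    public function register($name, callable $factory)
    {
        $this->factories[$name] = $factory;
    }

    // Build the service on first use, then reuse the same object
    public function get($name)
    {
        if (!isset($this->instances[$name])) {
            if (!isset($this->factories[$name])) {
                throw new RuntimeException('No service registered as ' . $name);
            }
            $factory = $this->factories[$name];
            $this->instances[$name] = $factory($this);
        }
        return $this->instances[$name];
    }
}

// Hypothetical classes with a chain of constructor dependencies
class AppConfig
{
    public $dsn = 'sqlite::memory:';
}

class Connection
{
    public $dsn;

    public function __construct(AppConfig $config)
    {
        $this->dsn = $config->dsn;
    }
}

class UserRepository
{
    public $connection;

    public function __construct(Connection $connection)
    {
        $this->connection = $connection;
    }
}

$container = new SimpleContainer();
$container->register('config', function () {
    return new AppConfig();
});
$container->register('connection', function ($c) {
    return new Connection($c->get('config'));
});
$container->register('users', function ($c) {
    return new UserRepository($c->get('connection'));
});

// The whole dependency chain is resolved by the container
$users = $container->get('users');
```

The classes themselves still receive their dependencies through the constructor, so they stay decoupled from the container. Only the code that wires the application together needs to know it exists.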
Thanks for this. As a newbie to OOP, this helped solidify a few different concepts for me.
I think that the rule "never use global variables" should really be expressed as "never OVER-use, MIS-use or AB-use global variables".
There are two ways of applying a rule - indiscriminately or intelligently. Those who do the former just apply a rule without thinking, and it is this lack of thinking which causes me to doubt everything which follows.
I disagree that using global variables will ALWAYS make your code harder to read and harder to maintain. Sometimes the efforts taken to avoid them produce convoluted code, which makes the solution worse than the problem.
I didn't say 'ALWAYS', I was just speaking generally, so I don't think we disagree really. :)
You should probably correct $_GLOBALS to $GLOBALS.
Oops! Thanks, I have corrected that (I guess it just proves I never use globals!).
Great, great, great article! Thorough! Insightful! Academic. Practical. This piece of work thankfully came up near the top in my search engine results. I've grappled with several of the techniques you discussed over the past several years in my work and have done my best to reach what for me would be the optimal trade off (balance) between: maintainability (fixing), extensibility (enhancing), readability (easy to revisit without studying for a week), portability (re-using), and other good desirable stuff (you name what that is). It doesn't matter what I favor--so I won't say because, yes, the author must decide for themselves. I'd say a key element to any project is discipline and consistency--that will get you a long way despite any of the choices you make in your coding style. I think we all need to look at ourselves as more than just coders or programmers if we can. If we can--sometimes we work in a box and have a supervisor or a project manager or a boss who may not afford us our own discretion or the time to do things the best way (even if the best way better meets business objectives). I read this article to learn and reinforce some good concepts as I press forward in my work. I think, Russell, what you've written supports the notion that we might lift our heads up from our screens and realize that yes, we don't have to be just programmers. We can be architects. Build good foundations. Sound ones. Resilient ones. That can be built upon even further. I don't wanna just code. Thanks again for your article!
Did not answer the question I was googling, but this pedagogic article finally gives an overview of classes in context, in a way that can be related to, as $GLOBALS['whatever'] is often the starting solution for new, upcoming PHP application programmers!
Somehow, all the variations of using classes are like a horror movie, "The Scope". I will use this article's approach for educational purposes.