Data Envelopment Analysis (DEA): A method for measuring efficiency, benchmarking and continuous improvement
Presented at 10th Annual Effective Management Accountant Conference - March 2010
Introduction
Measurement and comparison of measurements has long been an established process in all areas of human activity. The ability to measure and to compare is such an integral part of our daily lives - "which mountain is higher (or highest)", "who is the fastest skier", "which is the fastest boat", "which is the cheapest brand", "what is the shortest route", "which is the most fuel efficient car" - that we seldom think about the actual process of measurement or comparison. In government and business, however, the measurement and comparison process is much more important since it enables us to detect change from one period to another. Typical examples include the number of employees, the consumer price index, populations, taxation revenues and company turnover and profits. In these examples, the measures are relatively well defined and comparisons are made on a reasonably equitable basis. In other examples, however, we encounter problems in determining an equitable basis for comparisons because some factors may involve a "human element", or factors may be difficult to quantify or because comparisons might reflect conflicting objectives.
In many situations, we are concerned with performance, in the sense that the questions of interest are not "How many . . ?" but "How well are they doing?" or "How can they improve?". The difficulties of measuring and comparing performance in these circumstances are very real. Even where a profit motive is prevalent, as in the private sector, it is important to understand the underlying reasons for performance variability. This becomes even more crucial in the public sector, where profit considerations are more often replaced by considerations of "better value for the dollar" and cost containment. Both public and private sectors use performance indicators consisting of simple ratios of a single output measure to a single input measure. Performance ratios such as "sales per employee" or "on-time delivery" in the private sector and "pupils per teacher" or "average length of stay" in the public sector are well known, but taken alone they give a limited view of performance.
An obvious approach to providing a more complete picture of performance is to compute a number of performance ratios - e.g. the Balanced Scorecard approach of Kaplan and Norton (1993) mentions twenty measures used by the US firm, Rockwater. While the use of multiple performance indicator ratios gives a more complete picture of performance, it still fails to provide satisfactory answers to questions such as "How well are they doing?" or "How can they improve?" since each ratio can give different messages. Data Envelopment Analysis (DEA) is a modern approach which can help to provide meaningful answers to these questions by taking into account the actual inputs and outputs used to define the multiple performance indicators.
DEA, developed by Charnes, Cooper and Rhodes (1978) and subsequently the focus of considerable research, development and application both in Europe and the United States, is a non-parametric technique for evaluating the technical efficiencies of a collection of "Decision Making Units" (DMUs) (e.g. bank branches, Crown Health Enterprises) which consume common inputs to generate common outputs. A DMU is said to be 100% efficient if:
(a) None of the outputs can be increased without either
(i) increasing one or more inputs; or
(ii) decreasing some of the other outputs; and
(b) None of the inputs can be decreased without either
(i) decreasing some of its outputs; or
(ii) increasing some of its other inputs.
Since we usually have no way of establishing an absolute standard of efficiency, this definition must be adapted so that it refers to levels of efficiency relative to known levels of efficiency in other DMUs. We therefore say that a DMU is 100% efficient when comparisons with other DMUs do not provide evidence of inefficiency in the use of any input or output.
Based on this definition of efficiency, DEA is a mathematical optimization technique which determines the efficiency of each DMU by maximising the ratio of a weighted sum of its outputs to a weighted sum of its inputs, while ensuring that the same weights do not give any other unit an efficiency above 100%. Besides determining relative efficiency measures for each DMU, DEA also identifies efficient peer DMUs for each inefficient DMU and quantifies the increase in outputs or decrease in inputs required to transform an inefficient DMU into an efficient DMU. It is this information which helps to provide answers to the question "How can they improve?".
From a mathematical point of view, DEA solves a sequence of simple linear programmes. The post-optimal analysis of these linear programmes provides us with the important information which quantifies inefficiencies.
There is now an extensive literature discussing both theory and applications of DEA. While applications have been reported in the private sector (e.g. in retailing, banking, hotels and the airline industry), most of the applications of DEA have occurred in the public sector. In New Zealand, besides promoting the use of DEA in the health sector, we have also been involved in projects in retailing, roading maintenance, audit risk evaluation and horticulture in which DEA has been used to measure performance and quantify inefficiencies.
The remainder of this paper explains the technical aspects of DEA using a simple example. We believe that its limited use in New Zealand is due primarily to a lack of knowledge about the technique - we are therefore pleased to have this opportunity to "spread the word" and invite you to contact us if you require further information.
Performance Measurement and Evaluation
The measurement and evaluation of performance is a fundamental aspect of managerial planning and control. The sophistication and complexity of performance measurement systems range from simple summarised measures through to hierarchies of multiple financial and non-financial measures connected by explicit structures enabling the identification of critical success factors and their underlying activities. Notwithstanding the variety of systems employed in practice, there are usually certain similarities in the objectives and environment of measurement systems. These include:
1. A collection of units or entities with common objectives or performing similar tasks or processes whose performance is to be evaluated;
2. An agreed or imposed set of common measures to be used in making comparisons and evaluations;
3. The specification of a time period for which comparisons are to be made;
4. An implicit or explicit notion of ranking units in terms of relative performance;
5. The existence of differences in environmental and organisational factors which may affect relative performance.
The most difficult part of performance measurement is the determination of appropriate measures to provide an overall ranking of performance. Although a ranking can be obtained with a single measure of performance, this is almost always insufficient: it fails to capture the relevant dimensions of performance needed for planning and control, and it gives underperforming units a valid excuse to claim that the measure does not fully reflect their activities and results.
Given that the use of a single measure is unacceptable and/or undesirable, multiple measures must be used. This gives rise to the problem of translating performance on several dimensions, usually measured in different units (e.g. dollars, weight, area, units) into an overall measure of performance which can then be used for ranking purposes. A solution is to use a system of weights applied to each measure to obtain an overall measure. This solution, as will be explained next, also has its problems.
The Use of Uniform Weights to Obtain Overall Performance
There are various ways in which uniform weights can be provided for measures to allow them to be combined into an overall measure of efficiency. The methodology, however, for choosing weights is almost always a major issue as it requires consensus on the choice of weights which may not be forthcoming.
The use of uniform weights requires prior specification of the weights to be used, which are then applied to all units regardless of differences in local policies or circumstances. It is thus very much a 'top-down' system of evaluation reflecting higher level priorities and perceptions.
This may be appropriate when a 'natural' weighting system exists, but in most practical situations, local perceptions and priorities differ from those at higher levels and in other local units. In these situations, a uniform system of weighting tends to frustrate learning and innovative activities at the level of individual units, which are often better placed to pursue different strategic directions to match local conditions. In fact, this is usually the reason for the devolution of decision making to local levels.
Data Envelopment Analysis (DEA) has the capacity to allow local policies and circumstances to be considered in performance evaluation. It achieves this through a systematic search for weights which will optimise performance for any individual unit. Thus, a set of weights is determined for each unit which presents that unit in its best possible light, in contrast to the 'top-down' approach which imposes a uniform set of weights on all units.
Performance Using Alternative Weights
Consider a simple example with twelve units which provide the same two outputs using the same single input. The units could be of any kind, e.g. retail outlets or months in a year. Table 1 contains the information for the units.
Table 1: Outputs and Input for Units 1 to 12
Unit | Input | Output 1 | Output 2 |
1 | 45 | 450 | 45 |
2 | 55 | 528 | 176 |
3 | 75 | 525 | 525 |
4 | 60 | 192 | 576 |
5 | 100 | 100 | 1000 |
6 | 55 | 248 | 330 |
7 | 55 | 176 | 176 |
8 | 90 | 450 | 360 |
9 | 60 | 360 | 300 |
10 | 56 | 364 | 196 |
11 | 61 | 153 | 415 |
12 | 60 | 330 | 90 |
As there is only one input, a possible set of measures would be the ratio of output to input for both outputs. The effect on performance evaluation of different weightings on these measures can be determined. For example, Table 2 shows the two ratios for each unit and the rankings obtained using five alternative sets of weights: (i) 100% on ratio 1, 0% on ratio 2; (ii) 0% on ratio 1, 100% on ratio 2; (iii) 50% on ratio 1, 50% on ratio 2; (iv) 75% on ratio 1, 25% on ratio 2; (v) 25% on ratio 1, 75% on ratio 2.
It can be clearly seen that the performance of individual units varies significantly according to the weights used in their evaluation. For example, unit 1 is most efficient under weighting system (i) but least efficient under weighting system (ii) whereas the opposite applies to unit 5. Thus, the manager for unit 1 would be strongly in favour of 100% on output 1 whereas the manager of unit 5 would argue for 100% on output 2. Managers of units 2, 3 and 4 would argue for the weights which presented them as the most efficient. Any attempt to impose a uniform set of weights will face considerable resistance from units whose performance is better under an alternative set of weights.
Table 2: Output Ratios and Rankings Under Alternative Weights
Unit | O1 / I | O2 / I | (i) | (ii) | (iii) | (iv) | (v) |
1 | 10.0 | 1.0 | 1 | 12 | 4= | 2 | 10 |
2 | 9.6 | 3.2 | 2 | 9 | 2= | 1 | 7 |
3 | 7.0 | 7.0 | 3 | 3 | 1 | 3 | 3 |
4 | 3.2 | 9.6 | 9 | 2 | 2= | 7= | 1 |
5 | 1.0 | 10.0 | 12 | 1 | 4= | 11 | 2 |
6 | 4.5 | 6.0 | 8 | 5 | 7 | 6 | 5 |
7 | 3.2 | 3.2 | 10 | 10 | 12 | 12 | 11 |
8 | 5.0 | 4.0 | 7 | 7 | 10 | 7= | 8= |
9 | 6.0 | 5.0 | 5 | 6 | 4= | 4= | 6 |
10 | 6.5 | 3.5 | 4 | 8 | 8 | 4= | 8= |
11 | 2.5 | 6.8 | 11 | 4 | 9 | 10 | 4 |
12 | 5.5 | 1.5 | 6 | 11 | 11 | 9 | 12 |
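The weighted scores behind Table 2 can be reproduced with a few lines of Python. This is an illustrative sketch only: the names `UNITS`, `WEIGHTINGS`, `weighted_score` and `best_unit` are our own, and the code reports just the top-ranked unit under each weighting rather than the full rankings.

```python
# (input, output 1, output 2) for units 1 to 12, copied from Table 1.
UNITS = {
    1: (45, 450, 45),
    2: (55, 528, 176),
    3: (75, 525, 525),
    4: (60, 192, 576),
    5: (100, 100, 1000),
    6: (55, 248, 330),
    7: (55, 176, 176),
    8: (90, 450, 360),
    9: (60, 360, 300),
    10: (56, 364, 196),
    11: (61, 153, 415),
    12: (60, 330, 90),
}

# Weighting systems (i) to (v): (weight on ratio 1, weight on ratio 2).
WEIGHTINGS = {
    "i": (1.00, 0.00),
    "ii": (0.00, 1.00),
    "iii": (0.50, 0.50),
    "iv": (0.75, 0.25),
    "v": (0.25, 0.75),
}


def weighted_score(unit, weights):
    """Weighted sum of the two output/input ratios for one unit."""
    inp, out1, out2 = UNITS[unit]
    w1, w2 = weights
    return w1 * out1 / inp + w2 * out2 / inp


def best_unit(weights):
    """Unit with the highest weighted score under the given weights."""
    return max(UNITS, key=lambda unit: weighted_score(unit, weights))


for label, weights in WEIGHTINGS.items():
    print(f"weighting ({label}): best unit is {best_unit(weights)}")
```

Each of the five weightings puts a different unit in first place (units 1, 5, 3, 2 and 4 respectively), which is the disagreement the text describes.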
Figure 1: Performance of Units Using Two Outputs and One Input
The DEA approach is to identify the set of units which are efficient under different weights and to construct an efficiency frontier using these units. Table 2 and Figure 1 show that units 6 through to 12 will always be dominated by one or more of units 1 through 5 regardless of the weights adopted. Thus, units 1 through 5 form an efficiency frontier as depicted in Figure 1. The efficiency of any unit below the frontier can therefore be determined by reference to its distance from the frontier. For example, unit 6 lies approximately three-quarters of the distance along a line extending from the origin to the segment of the frontier connecting units 3 and 4. Its efficiency is therefore 77% of a target unit on the frontier which reflects similar policies.
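The geometry described above can be checked numerically. The sketch below uses the rounded ratios from Table 2; the function names are our own, and `radial_efficiency` assumes that the ray through the unit really does cross the chosen frontier segment (true for unit 6 and the segment joining units 3 and 4). The simple dominance test happens to pick out units 1 to 5 for this data set, but in general a unit that no single unit dominates can still lie below the frontier.

```python
# Output-to-input ratios (O1/I, O2/I) as listed in Table 2.
RATIOS = {
    1: (10.0, 1.0), 2: (9.6, 3.2), 3: (7.0, 7.0), 4: (3.2, 9.6),
    5: (1.0, 10.0), 6: (4.5, 6.0), 7: (3.2, 3.2), 8: (5.0, 4.0),
    9: (6.0, 5.0), 10: (6.5, 3.5), 11: (2.5, 6.8), 12: (5.5, 1.5),
}


def dominates(a, b):
    """True if a is at least as good as b on both ratios and differs from b."""
    return a[0] >= b[0] and a[1] >= b[1] and a != b


def undominated(ratios):
    """Units that no other single unit dominates."""
    return [
        unit for unit, point in ratios.items()
        if not any(dominates(other, point) for other in ratios.values())
    ]


def radial_efficiency(point, end_a, end_b):
    """Distance from the origin to `point` as a fraction of the distance
    from the origin to the segment end_a-end_b along the same ray."""
    dx, dy = end_b[0] - end_a[0], end_b[1] - end_a[1]

    def cross(p):
        # 2-D cross product of p with the direction of the segment.
        return p[0] * dy - p[1] * dx

    return cross(point) / cross(end_a)


print("undominated units:", undominated(RATIOS))
eff6 = radial_efficiency(RATIOS[6], RATIOS[3], RATIOS[4])
print(f"unit 6 radial efficiency: {eff6:.3f}")
```

For unit 6 this gives a value of about 0.77, in line with the 77% quoted above.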
The frontier that we have identified is an estimate of the "true" efficient frontier in the same sense as a sample mean is an estimate of the population mean in statistical analysis. Thus, the sample frontier is efficient in terms of the sample data but not necessarily in terms of the population. It is, however, the best estimate of the population efficiency frontier. This distinction is shown in Figure 1 where the solid line connecting units 1 to 5 depicts the efficiency frontier determined from data observations of our twelve units which is our best estimate of the "true" frontier which is depicted by the broken line. The "true" frontier is (probably) unobservable given the difficulties of defining the correct production function and obtaining observations of the measures required.
An Algebraic Illustration
The problem may be defined in three parts: first, find a set of weights for a particular unit which will maximise its performance with an upper limit of 100% efficiency; second, these weights when applied to any other unit cannot provide a performance greater than 100%; and third, the weights cannot be negative.
This can be written as follows:
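$$
\begin{aligned}
\text{Maximise}\quad & z = w_1 \frac{O_{1o}}{I_o} + w_2 \frac{O_{2o}}{I_o} \\
\text{subject to}\quad & w_1 \frac{O_{1j}}{I_j} + w_2 \frac{O_{2j}}{I_j} \le 1, \qquad j = 1, \dots, n, \\
& w_1, w_2 \ge 0
\end{aligned}
\tag{1}
$$
where O_{1j}, O_{2j} and I_j are the two outputs and the input of unit j, and o is the unit being evaluated.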
In simple terms, this searches for a set of weights, w1 and w2, which maximise the efficiency score for a unit o subject to the constraint that these weights, when applied to the output-input ratios of all units (1 to n), including unit o, do not result in an efficiency score of greater than 100% for any of those units.
Applying this to the data from Table 2 and evaluating unit 1, this becomes:
Maximise z = 10 w1 + 1 w2 unit 1
Subject to: 10.0 w1 + 1.0 w2 <= 1 unit 1
9.6 w1 + 3.2 w2 <= 1 unit 2
7.0 w1 + 7.0 w2 <= 1 unit 3
3.2 w1 + 9.6 w2 <= 1 unit 4
1.0 w1 + 10.0 w2 <= 1 unit 5
4.5 w1 + 6.0 w2 <= 1 unit 6
3.2 w1 + 3.2 w2 <= 1 unit 7
5.0 w1 + 4.0 w2 <= 1 unit 8
6.0 w1 + 5.0 w2 <= 1 unit 9
6.5 w1 + 3.5 w2 <= 1 unit 10
2.5 w1 + 6.8 w2 <= 1 unit 11
5.5 w1 + 1.5 w2 <= 1 unit 12
w1, w2 >= 0
A solution to this is w1 = 0.1, w2 = 0, which gives z = 1, so unit 1 is 100% efficient. The constraint for unit 1 is active and all other constraints are inactive. Note that this corresponds to the ranking for 100% on ratio 1 and 0% on ratio 2 in the fourth column of Table 2, which has already shown unit 1 to be efficient given those weights.
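Because this linear programme has only two variables, it can be solved by brute force: the optimum lies at a vertex of the feasible region, and every vertex is the intersection of two constraint boundaries. The sketch below does exactly that in plain Python. It is a stand-in for a proper LP solver, not a description of how DEA software works, and the name `dea_score` and the tolerance `tol` are our own choices.

```python
from itertools import combinations

# Output-to-input ratios (O1/I, O2/I) for units 1 to 12, from Table 2.
RATIOS = {
    1: (10.0, 1.0), 2: (9.6, 3.2), 3: (7.0, 7.0), 4: (3.2, 9.6),
    5: (1.0, 10.0), 6: (4.5, 6.0), 7: (3.2, 3.2), 8: (5.0, 4.0),
    9: (6.0, 5.0), 10: (6.5, 3.5), 11: (2.5, 6.8), 12: (5.5, 1.5),
}


def dea_score(unit, ratios=RATIOS, tol=1e-9):
    """Return (z, (w1, w2)) maximising z = r1*w1 + r2*w2 for `unit`."""
    # Boundary lines a*w1 + b*w2 = c: one per unit, plus the axes
    # w1 = 0 and w2 = 0 that bound the non-negative quadrant.
    lines = [(a, b, 1.0) for a, b in ratios.values()]
    lines += [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
    r1, r2 = ratios[unit]
    best_z, best_w = 0.0, (0.0, 0.0)
    for (a1, b1, c1), (a2, b2, c2) in combinations(lines, 2):
        det = a1 * b2 - a2 * b1
        if abs(det) < tol:
            continue  # parallel boundaries do not meet in a vertex
        # Cramer's rule for the intersection point.
        w1 = (c1 * b2 - c2 * b1) / det
        w2 = (a1 * c2 - a2 * c1) / det
        if w1 < -tol or w2 < -tol:
            continue  # negative weights are not allowed
        if any(a * w1 + b * w2 > 1.0 + tol for a, b in ratios.values()):
            continue  # some unit would score above 100%
        z = r1 * w1 + r2 * w2
        if z > best_z:
            best_z, best_w = z, (w1, w2)
    return best_z, best_w


for unit in RATIOS:
    score, _ = dea_score(unit)
    print(f"unit {unit:2d}: efficiency {score:.2f}")
```

Run on the Table 2 ratios, units 1 to 5 each score 1.00 and unit 6 scores about 0.77. Where several vertices give the same score, as happens for unit 1, the weights returned are just one of the optimal sets.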
Figure 2 shows the constraints for units 1 through to 5 and units 6 and 8 for values of w1 and w2.
Figure 2: Constraints Graph of Feasible Region and Units 6 and 8
For example, the constraint for unit 3 holds with equality at weights of either w1 = 1/7 (0.143), w2 = 0 or w1 = 0, w2 = 1/7 (0.143), or at combinations of these along the line connecting these points. Those familiar with linear programming will recognise a feasible region bounded by the constraints for units 1 through 5. The constraint lines for units 6 and 8 lie outside the feasible region. This means that these two units can never be 100% efficient, as the constraints for one or more of units 1 to 5 will become active first. In other words, those constraints will be equal to one while the constraints for units 6 or 8 will be less than one. If the line depicting the constraint for unit 6 is examined in Figure 2, it can be seen that the greatest distance it can move outwards from the origin before it is constrained by units 3 and 4 is at the point of intersection of these two constraints, point b, with weights of w1 = 0.058036 and w2 = 0.084821. Reference to Figure 1 will confirm that units 3 and 4 are superior in performance to unit 6 and, more specifically, that there is a point on the line segment connecting units 3 and 4 which forms the target unit for unit 6. This point is shown in Figure 3 and is the intersection of the ray extending from the origin through unit 6 with the efficiency frontier.
Orientation (CRS efficiency) | Unit 6 actual | Peers and lambda values | Targets |
Input orientation (77%) | Input 55; Output 1 248; Output 2 330 | Unit 3: 0.393; Unit 4: 0.215 | Input 42; Output 1 248; Output 2 330 |
Output orientation (77%) | Input 55; Output 1 248; Output 2 330 | Unit 3: 0.510; Unit 4: 0.279 | Input 55; Output 1 321; Output 2 428 |
Figure 3: Unit 6 efficiency under input and output orientation
The efficiency of unit 6 is obtained by applying these weights to the output ratios for unit 6. Reformulating the above model to evaluate unit 6 gives:
Maximise z = 4.5 w1 + 6.0 w2 = 4.5 (0.058036) + 6.0 (0.084821) = 0.77
subject to: 7.0 w1 + 7.0 w2 = 7.0 (0.058036) + 7.0 (0.084821) = 1 Unit 3
3.2 w1 + 9.6 w2 = 3.2 (0.058036) + 9.6 (0.084821) = 1 Unit 4
and all other constraints inactive
w1, w2 >=0
Unit 6 is therefore 77% efficient and its 'peer units' are units 3 and 4. Figure 3 also shows the target outputs and input for unit 6 which would enable it to be 100% efficient using either an input orientation, in which outputs are held constant and input reduced, or an output orientation, in which input is held constant and outputs increased. Note that the target ratios are the same under either orientation.
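The lambda values and targets in Figure 3 can be recovered by expressing unit 6 as a combination of its two peers. The sketch below is one way to do this under stated assumptions: output 1 of unit 6 is taken as 4.5 x 55 = 247.5, the value implied by the Table 2 ratio rather than the rounded figure in Table 1, and `peer_weights` is a helper of our own that solves the resulting two-by-two system.

```python
# (input, output 1, output 2). Units 3 and 4 are copied from Table 1.
UNIT_3 = (75.0, 525.0, 525.0)
UNIT_4 = (60.0, 192.0, 576.0)
# Assumption: output 1 of unit 6 is 4.5 * 55 = 247.5, the value implied
# by the Table 2 ratio (Table 1 shows it rounded to a whole number).
UNIT_6 = (55.0, 4.5 * 55.0, 330.0)


def peer_weights(unit, peer_a, peer_b):
    """Solve la*peer_a + lb*peer_b = unit on both outputs (Cramer's rule)."""
    _, a1, a2 = peer_a
    _, b1, b2 = peer_b
    _, y1, y2 = unit
    det = a1 * b2 - b1 * a2
    la = (y1 * b2 - b1 * y2) / det
    lb = (a1 * y2 - y1 * a2) / det
    return la, lb


# Input orientation: reproduce unit 6's outputs from its peers, then see
# how much input that combination of peers would need.
la_in, lb_in = peer_weights(UNIT_6, UNIT_3, UNIT_4)
target_input = la_in * UNIT_3[0] + lb_in * UNIT_4[0]
efficiency = target_input / UNIT_6[0]

# Output orientation: keep the input at 55 and scale up by 1/efficiency.
la_out, lb_out = la_in / efficiency, lb_in / efficiency
target_out1 = UNIT_6[1] / efficiency
target_out2 = UNIT_6[2] / efficiency

print(f"input orientation:  lambdas {la_in:.3f}, {lb_in:.3f}; "
      f"target input {target_input:.1f}")
print(f"efficiency: {efficiency:.3f}")
print(f"output orientation: lambdas {la_out:.3f}, {lb_out:.3f}; "
      f"target outputs {target_out1:.1f}, {target_out2:.1f}")
```

To three decimal places the lambdas come out as 0.393 and 0.215 (input orientation) and 0.510 and 0.279 (output orientation), with a target input of about 42 and a target output 1 of about 321, which agrees with Figure 3.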
The DEA Model
The example used above had only two outputs and one input which enabled us to present the analysis by way of a two-dimensional graph. The DEA model can use any number of outputs and inputs subject to the proviso that, as the number of measures is increased, the number of efficient units may increase as units discover additional ways in which their efficiency may be maximised. This is not a problem peculiar to DEA as generally increasing the number of measures will increase the number of efficient units under most if not all methods of evaluation.
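The standard DEA ratio model has two orientations as previously mentioned. For inefficient units, the input orientation searches for a minimum level of inputs given the actual level of outputs, and the output orientation searches for the maximum output production given the actual level of inputs. The general form of the input-oriented DEA model for units i = 1, ..., n, with s outputs and m inputs, when evaluating unit o is:
$$
\max_{u,v}\; h_o = \frac{\sum_{r=1}^{s} u_r\, y_{ro}}{\sum_{k=1}^{m} v_k\, x_{ko}} \tag{2}
$$
$$
\text{subject to}\quad \frac{\sum_{r=1}^{s} u_r\, y_{ri}}{\sum_{k=1}^{m} v_k\, x_{ki}} \le 1, \qquad i = 1, \dots, n \tag{3}
$$
$$
u_r \ge 0,\quad r = 1, \dots, s; \qquad v_k \ge 0,\quad k = 1, \dots, m \tag{4}
$$
where y_{ri} is the amount of output r produced by unit i, x_{ki} is the amount of input k consumed by unit i, and u_r and v_k are the weights placed on output r and input k.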
The output orientation is simply the reciprocal of the above. Certain algebraic transformations are made to the above model to convert it to ordinary linear programming formats which may then be solved using an ordinary linear programming software package or one of the DEA software packages available.
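One widely used transformation of this kind, usually credited to Charnes and Cooper, fixes the weighted input of the unit under evaluation at one so that the ratio objective becomes linear. As a sketch in our own notation (an assumption, since symbols differ between texts), let y_{ri} and x_{ki} be output r and input k of unit i, u_r and v_k their weights, and o the unit being evaluated:

$$
\begin{aligned}
\max_{u,v}\quad & \sum_{r=1}^{s} u_r\, y_{ro} \\
\text{subject to}\quad & \sum_{k=1}^{m} v_k\, x_{ko} = 1, \\
& \sum_{r=1}^{s} u_r\, y_{ri} - \sum_{k=1}^{m} v_k\, x_{ki} \le 0, \qquad i = 1, \dots, n, \\
& u_r \ge 0, \quad v_k \ge 0 .
\end{aligned}
$$

With a single input the normalisation gives v = 1 / x_o, and substituting w_r = u_r x_o recovers the two-weight programme solved earlier for units 1 and 6, in which every constraint is a weighted sum of output-input ratios bounded by one.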
Extensions to the Basic DEA Model
There are several extensions which have been made to the model described in order to accommodate additional requirements or modifications. Differences in sizes of units can be accommodated by the DEA scale model so that smaller or larger units are allowed to incorporate an allowance for their size. Uncontrollable inputs and/or outputs can be separately identified and allowed for in evaluation. For example, the evaluation of retail outlets may need to incorporate market size and location to ensure that these differences are reflected. Other activities such as road maintenance may need to include type and difficulty of terrain and other geological characteristics in evaluating performance throughout the country.
It may also be considered that restrictions should be placed on the absolute freedom for a unit to choose whatever weights best reflect its performance. Upper and lower limits may be imposed on the weights for any inputs or outputs with a unit being free to select weights falling within these limits.
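As a small worked illustration using the two-output example, suppose a purely hypothetical lower limit of 0.02 is imposed on w2 (the figure is ours, chosen only to show the effect). The programme for unit 1 becomes:

$$
\begin{aligned}
\text{Maximise}\quad & z = 10\, w_1 + 1\, w_2 \\
\text{subject to}\quad & \text{the twelve unit constraints as before}, \\
& w_1 \ge 0, \quad w_2 \ge 0.02 .
\end{aligned}
$$

Unit 1 can no longer place all of its weight on output 1. The optimum moves to w2 = 0.02, where the constraint for unit 2 becomes binding:

$$
w_1 = \frac{1 - 3.2 \times 0.02}{9.6} = 0.0975, \qquad z = 10 \times 0.0975 + 0.02 = 0.995 .
$$

Under this restriction unit 1 would be rated 99.5% efficient rather than 100%, while a unit such as unit 6, whose preferred weights already satisfy the limit, would be unaffected.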
Summary
This paper has described a methodology for performance measurement which has been used extensively in international applications and provides new insights into data analysis. Some of the advantages of the method are:
1. It can handle multiple inputs and outputs measured in different units (e.g. dollars, time, employees, location).
2. It enables the identification of efficient units which define the efficiency frontier.
3. It quantifies the inefficiency of remaining units and identifies their efficient peers i.e. it provides information on what changes in inputs and outputs are required for inefficient units to reach the efficiency frontier as defined by the peer units.
4. Each unit selects its own weights on each input and output to maximise its own efficiency, but for inefficient units the choice is limited by their efficient peer units.
5. It includes as special cases the single input, single output ratios.
6. It can incorporate environmental and demographic measures as inputs which affect performance. These of course are not always easy to quantify!
7. It can control lower and upper bounds on both inputs and outputs. In other words, it can force or limit consumption of inputs or production of outputs.
8. Differences in size can be explicitly taken into account.
The major difficulties are those found in most if not all methods of performance measurement. These are the selection and measurement of suitable input and output categories for analysis and the choice of appropriate actions in order to improve efficiency. The emphasis on peer identification and target values provided by DEA, however, can assist in the latter.
Bibliography
Charnes, A., Cooper, W.W. and Rhodes, E. (1978), 'Measuring the Efficiency of Decision Making Units', European Journal of Operational Research, 2, 429-444.
Kaplan, R.S. and Norton, D.P. (1993), 'Putting the Balanced Scorecard to Work', Harvard Business Review, Sept-Oct, 134-147.