Saturday, June 19, 2010

Software Development is Not like Building a House

My hope is that when most of you see this, you'll say the equivalent of "no shit". You'll continue, "Of course software development is totally different than building a house." But, for those few hold-overs that think software development can progress in sequential phases like building a house, well, this blog is dedicated to you. I will try to convince you otherwise. At least wish me luck.

Take-Away Points

  1. Building non-trivial software systems is a creative endeavor.
  2. Planning will be driven by the realities of the endeavor, just as much as the endeavor will be driven by the planning.
  3. Coding is design, and so are some of the traditional design methods.
  4. Source code is a blueprint (design document); the compiled application is what the user interacts with.
  5. Construction is the compiling or interpreting of the code into machine instructions.

Analogies

I love analogies. I love them especially when you can use an analogy to reduce a complex issue into a simple one, and use it to show why something does or doesn't make sense. On the other hand, sometimes analogies can be powerful, and yet completely wrong. Their simplicity may leave out crucial aspects of what they are intended to describe. Even when an analogy hasn't accurately captured the essence of what it is trying to describe, its simplicity may be so compelling that people accept the analogy as truth. In these cases, analogies are dangerous. They lead us down the wrong path. They allow us to make decisions that are appropriate at the level of the analogy, yet inappropriate for the actual situation.

An analogy is effectively a model of an unfamiliar or complex concept expressed as something with which we are all familiar, thus making it easier to understand and intuit. As a model, it attempts to express the essence, leaving out unnecessary details.

As with models, analogies have domains of relevance. In other words, an analogy may be useful to explain a concept under certain circumstances. And at the same time, it may be incorrect when used to describe the same concept under different circumstances. A simple example from physics illustrates this idea. Classical mechanics describes the motions of bodies well when those motions are slow and the bodies (objects) are not too big or too small. For extremely small bodies or fast-moving particles, classical mechanics becomes a poor description, and the errors from using it can be quite significant.
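To make the "domain of relevance" idea concrete, here is a minimal sketch that compares the classical kinetic energy formula with the relativistic one. The function names and the chosen speeds are my own, picked purely for illustration; the formulas are the standard textbook ones.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def classical_ke(mass_kg, speed):
    """Kinetic energy according to classical mechanics: (1/2) m v^2."""
    return 0.5 * mass_kg * speed ** 2


def relativistic_ke(mass_kg, speed):
    """Kinetic energy according to special relativity: (gamma - 1) m c^2."""
    gamma = 1.0 / math.sqrt(1.0 - (speed / C) ** 2)
    return (gamma - 1.0) * mass_kg * C ** 2


def relative_error(mass_kg, speed):
    """How far the classical model is from the relativistic one, as a fraction."""
    exact = relativistic_ke(mass_kg, speed)
    return abs(classical_ke(mass_kg, speed) - exact) / exact


# Inside the model's domain (1% of light speed) the error is tiny;
# outside it (90% of light speed) the classical model is badly wrong.
for fraction in (0.01, 0.9):
    print(fraction, relative_error(1.0, fraction * C))
```

The simple model is not "wrong" at 1% of light speed. It is only wrong when you carry it into circumstances it was never meant to describe, which is exactly the risk with an analogy.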

In this case, the analogy that software development is like building a house over-simplifies software development. And, over the years, this oversimplification has led to many dramatic failures.

Software Development

For decades people have attempted to describe the software development process and create processes to ensure that software development succeeds. And, for decades, people have failed. Practitioners of software development recognize that software development is not one homogeneous thing. Rather, the development endeavor can vary dramatically depending on

  1. What is being developed.
  2. The technology choices.
  3. The constraints imposed on the system.
  4. How the system will be used.
  5. The technical and domain skills and experience of those developing the system.
  6. And, I'm sure it depends on many more things.

Except for the simplest software systems, software development is anything but a linear process that proceeds along a well-defined sequence of steps. Assuming that the intended outcome of the development is well understood, which is often not the case, the way that subject matter is modeled has huge implications on the actual development. By modeling I mean the way the system has been broken into its constituents and how those parts interact. For any given problem, there may be tens to hundreds to an infinite number of different ways of modeling the system. Some models may be better than others for certain situations. Understanding that there are different models requires experience. Developing different models requires skill and experience. And choosing the "best" model requires making trade-offs, skill, and experience. There is a significant artistic component to modeling and design in the same way that there is an artistic component to science. Now I don't want you to walk away with the impression that you can create and select the "best" design before a single line of code has been pounded into the keyboard. That is unlikely, except, again, for the simplest systems. Writing code is an integral part of designing software, which I'll talk about shortly.

In the previous paragraph, I started off the second sentence "Assuming that the intended outcome of the development is well understood". The outcome may be understood. But is it "well" understood? This is unlikely, except, again, for the simplest systems. Let's explore what it means for the "intended outcome" to be "well understood". Conceptually, it means that we know what we want to build. But here's the rub: we need to "know what we want to build" at a sufficient level of detail that will allow us to design/build the system. And the "sufficient level of detail" is an ill-defined concept. Not only can "sufficient" be vastly different for different people, it also changes as new information becomes available. I may know how to select the appropriate non-linear least-squares algorithm for a particular aspect of the system. And at the same time, someone else may not even know what a non-linear least-squares algorithm is. After implementing the non-linear least-squares algorithm that I selected, we may decide that it takes too long to perform the optimization. And therefore, we may have to choose an entirely different approach. You might argue that we should have known about the performance issue up front. But usually you don't know how it will perform until you've built it. You may have developed similar code before and can use that experience to guide you. But this isn't always the case. You could argue further that the performance characteristics should have been defined up front. Sure, that is a valid point. And the overall performance may have been defined. But you can't define the performance characteristics for every method up front, because you don't know what methods will be implemented. And again, we are back to figuring out what is the "sufficient" level of detail with which we need to define the performance characteristics.

At this point you should be seeing a pattern. There are aspects of the system for which you don't have solutions up front. As you solve these parts of the puzzle through development, you may find that you have invalidated some assumptions, and introduced new problems, which again need solutions. And again, solutions to these new problems will invalidate some assumptions, and introduce new problems. You may argue that with better planning, you can avoid these "unknowns" that crop up. But the problem with that argument is this: if you don't know the solution, then how could you ever imagine that you could know the problems that the solution (which isn't yet known) will bring? And furthermore, how would your planning have changed that?

Coding is Designing

I believe that the misunderstanding described in the previous paragraphs largely stems from one central incorrect assumption: design comes first, then the coding (or implementation or build). You may argue that by iterating between designing and coding you are not making that incorrect assumption. But I would take it a step further: design and coding are one and the same. This may seem like a Zen koan.

Design is the process of originating or developing a plan that lays the basis for the making of every object or system.

And a plan can be a procedure used to achieve an objective, a set of plans (or drawings) used to provide instructions to build or fabricate a place or object, a blueprint, an engineering drawing, an architectural drawing, and so on.

If design is the planning that lays the basis for the making of every object or system, then the way in which we express the design is left to us. For example, you could draw the design on a napkin while having a beer at a nice outdoor pub. Or you could fire up a design program and draw class diagrams, object diagrams, sequence diagrams, use-cases, and so on. And this is what we traditionally think of when we think about design. Now here's the thing: a person could no more move into the architect's blueprint of a house than a user of a software system could use the source code describing how the system performs. In both cases, the designs must be built into a tangible thing that can be used by the person.

Design is the "plan" that allows us to make the object. In the example of a house, it is taking the "plan" (blueprints) and "making" (constructing) an "object or system" (the house) in which the person can live. Similarly, in the example of a software system, it is taking the "plan" (source code) and "making" (compiling/building) an "object or system" (the application) with which the person can interact. To take it a step further, the blueprints for a software system are source code. And the "plan" includes the blueprints as well as the traditional design documents I listed in the previous paragraph.
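Here is a small sketch of that mapping, using Python's built-in compile() and exec(). The source text and the names in it (SOURCE, area) are made up for illustration. The point is only that the text is the blueprint, and the "construction" step is mechanical.

```python
# The "blueprint": source text that describes the behavior.
# Nobody can call this string; it is only a design.
SOURCE = '''
def area(width, height):
    return width * height
'''

# "Construction": mechanically turn the blueprint into something executable.
code_object = compile(SOURCE, "<blueprint>", "exec")

# Load the constructed artifact so that it can actually be used.
namespace = {}
exec(code_object, namespace)
area = namespace["area"]

print(area(3, 4))  # prints 12
```

Notice where the thinking happened. Every decision about what the function does was made while writing SOURCE. The compile step made no decisions at all, which is why I call it construction and call the coding design.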

Planning Software Development

I have worked with people who believe that good planning leads to good execution. And I agree. But I disagree with the idea of developing detailed plans for the entire endeavor at the outset, except for the simplest of endeavors. During planning, we must recognize what we don't know. Our planning efforts should not be wasted on developing detailed plans for things that may never happen. All we gain by doing this are precise plans that are completely inaccurate, and need to be redone. In fact, a good plan should have built within it the ability to change and update the plan.

A good plan, violently executed now, is better than a perfect plan next week.
-- General George S. Patton

The fact is that planning will be driven by the realities of the endeavor just as much as the endeavor will be driven by the planning. Planning and execution are not separable for the development of anything but simple systems. Rather, they are intertwined.

The Mountain of Unknown

To see why planning the entire endeavor from the outset is rarely possible, let's look at an example. Suppose your company has been around for some time, say 30 years. Your financial accounting systems were built to meet idiosyncratic needs, written in COBOL, and over the years have been integrated with many additional systems that feed them data and consume their output. Not an uncommon scenario. These systems were updated in 1999 to solve date issues related to the year 2000. But, many of the solutions were "band-aid" fixes. Newly issued accounting standards require substantial modifications to one of the larger financial accounting modules.

At the outset, we logically assume that the financial accounting module that needs modification can simply be pulled out, and a new one popped in. Software should be "modularized", and everyone speaks about "the accounting module". So why not pull it out? Isn't it just like replacing the bathtub in your home with a nice, new whirlpool massage tub with water heating built in? Probably not! Let's examine why we may not be able to just pull out the accounting module and replace it. And keep in mind that building a new system from scratch can be much more complex than replacing an existing one.

Connections to Other Systems

Typically, any interesting system has connections to other systems. Systems need to get fed data and other input. And they produce some output that is likely consumed by other systems or people. How, and to what systems, this system is connected will play an important role in its replacement. The problem in many larger organizations may be that all the connections to the system may not be known. This happens, for example, when the system writes output to a shared database or a file. Unless there has been a good process in place to record and manage what systems use what data (database tables, database fields, or files) these connections aren't discovered until the connections are severed—when the system is replaced and stops writing to those sources.

  1. Do you know all the systems that consume this system's output?
  2. Are there scheduling constraints imposed by other connected systems?
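To illustrate the first question, here is a minimal sketch that walks a registry of which systems write and read which data stores. Every system and store name below is hypothetical. The sketch also assumes that such a registry exists and is complete, which is precisely what many organizations lack.

```python
from collections import deque

# Hypothetical registry: each system -> the data stores it writes to.
WRITES = {
    "accounting": ["gl_table", "nightly_extract.csv"],
    "reporting": ["board_report.pdf"],
    "tax": ["tax_filing.xml"],
}

# Hypothetical registry: each data store -> the systems that read it.
READERS = {
    "gl_table": ["reporting", "tax"],
    "nightly_extract.csv": ["treasury"],
    "board_report.pdf": [],
    "tax_filing.xml": [],
}


def downstream_consumers(system):
    """Return every system that directly or indirectly consumes `system`'s output."""
    seen = set()
    queue = deque([system])
    while queue:
        current = queue.popleft()
        for store in WRITES.get(current, []):
            for reader in READERS.get(store, []):
                if reader not in seen and reader != system:
                    seen.add(reader)
                    queue.append(reader)
    return sorted(seen)


print(downstream_consumers("accounting"))  # ['reporting', 'tax', 'treasury']
```

The walk only finds the connections that someone recorded. A consumer that quietly reads gl_table without appearing in READERS stays invisible until the day the table stops being written.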

Data Exchange with Other Systems

Because the system is connected to other systems, it exchanges data with other systems. How these data are exchanged drives the number, type, and complexity of the data-exchange mechanisms that need to be built.

  1. Are the inputs given to the system through a file, shared database, proprietary communication protocol, method/function call, web service?
  2. Are the inputs formatted as text, binary, XML?
  3. Are the data handed to the system in bulk, in spurts, real-time, based on events, nightly batch loads?
  4. And you have the same questions for the system's output.

Data Storage

Likely the system you would like to replace stores data and results in some data store. The approach to the data storage will be a big factor in how easy it is to replace the system. If the system has its own, self-contained data store for the data it "owns", and no other systems pull from this data store, then you're in reasonable shape. But, more than likely, the system also writes data into a shared data store. And more than likely, as is usual with older systems that have evolved over time, other systems read that data, and use that data store as an integration mechanism. Now, all of a sudden, you have to worry about potentially unrelated systems.

Part of a Business Process

Adding to the complexity is the fact that this system will, more than likely, be part of a larger business process. How that process is constructed will, again, be a big driver in the complexity of the replacement of this system.

  1. Does the process proceed through a series of systems, scheduled with some Unix "cron" job, where one system writes data, needed by the next system, to a file or database?
  2. Or is there a workflow that manages the process according to a schedule, or events?
  3. What data is needed by the next system?
  4. And will the new system be able to source that data (this is a real concern for commercially developed products)?

Add-Ons that Don't Belong

Legacy systems tend to accrete capabilities over time. And more than likely, some of those capabilities are not a logically consistent part of the system. This tends to be more of a problem when purchasing commercially developed ("off-the-shelf") products than it is for custom builds (either in-house or third-party).

Commercially developed products typically contain capabilities related to the objectives of that product. Additional capabilities that are idiosyncratic to your company are probably not to be found in the commercial product. So what do you do?

  1. Do you customize?
  2. How much customization makes sense?
  3. What about maintenance and updates for the customized product?
  4. Do you know the missing capabilities at the outset?

Lack of Documentation

Finally, in larger, older organizations, there is simply a lack of documentation that describes the systems, their capabilities, their input needs, their dependencies, where they fit into what processes, and the "band-aid" fixes that have been applied to them over the years. The lack of documentation is a big issue. It forces the organization to perform archeological analyses to determine all the systems and processes that are impacted by this replacement.

  1. When replacing the system, how do you know the answers to the previous five concerns?
  2. Do you have someone, or a few people, with that knowledge?
  3. How complete is their knowledge?
  4. What don't they know?

Take-Away Points

Building, replacing, or integrating a system can be quite complex. There are likely thousands to tens-of-thousands of things to contemplate. Many of them are deeply buried in some undocumented "band-aid" fix, or are a forgotten capability or dependency. Even if you have the developers and business people responsible for the system, expecting them to know all the downstream consumers of the current system's output is unreasonable. The misguided idea that you can analyze and plan the entire effort at the outset, to a sufficient level of detail to allow execution of the plan, will take you down a road of disappointment and failure. Instead, recognizing what you do know and can plan, executing those aspects, while continuing to plan, in real time, as new information becomes available, will yield a higher probability of success. Don't get me wrong. I'm not arguing for jumping in without planning. What I'm arguing for is to start with a possibly imperfect plan, and keep that plan fluid.

Building a House

A common analogy for software development is that software development can be viewed and managed in the same way that you would build a house or a bridge. It amazes me that this analogy has survived to this day. It stuns me that this analogy is still applied to the development of software systems of any real complexity (i.e. most systems). It's like accepting the theory of phrenology in this day and age. Time and time again this analogy has proven inadequate. And yet, software is still developed using processes that are directly modeled on it. Wow!

The idea is that software development progresses sequentially through phases. First there is the planning for the custom build, in which the customer's desires and needs are gathered. What style of house? How many square feet? How many bedrooms, bathrooms? What type of kitchen? Then, based on the customer's desires and needs, an architect begins designing the house. Most likely, this is an iterative process. Once the design is complete, construction commences. During the construction, there may be a number of inspections by the customer and by regulatory agents (e.g. county inspector). Upon completion, the customer and the regulator perform a final inspection, and the house is delivered.

So what's the problem? This seems reasonable enough. Why not just apply this to software development?

Houses are largely the same. In my neighborhood, my wife and I live in one of four types of houses. Sure, my neighbor's house is the mirror image of ours. But effectively, it is the same house. That means that there are probably thousands, if not tens of thousands, of houses identical to mine built in the U.S. Our house has been built over and over again. Given the architect's blueprints, the construction crew hammers out one house after another.

Houses have more similarities to each other than they have differences. They all have bathrooms, bedrooms, kitchens, entrance ways, toilets, sinks, bathtubs, windows, floors, walls, and doors. Bathrooms have a combination of shower, bathtub, toilet, and sink. Most rooms have electrical outlets. And bathrooms have a water supply and drain. These similarities allow for the creation of a detailed set of building codes that describe most every aspect of the construction of a typical house. Building codes ensure that load-bearing walls can support the load. They ensure that the gauge of the electrical wire can support the current it must carry, and that the circuit breakers trip if the current goes too high. Building codes ensure that, in areas with high expected snowfalls, the pitch of the roof is high enough and the trusses are strong enough to support the weight of the snow. And therefore, it stands to reason that the specifications for our house have been worked out pretty well. What's more, in most cases, different house styles are effectively following the same set of patterns in construction.

The fundamental difference between developing a software system and building a house is that when building most houses we have enough experience to be able to anticipate what needs to get done and in what order. There is virtually no need for experimentation. The same parts have been built many times. There are specialists for specific tasks. On the other hand, for most software development there are many unknowns—both at the outset and ones that emerge during the endeavor. Experimentation is usually required. There are complex interactions between various parts of the system that only become apparent during the endeavor.

There are fundamental differences between building a house and developing a software system. There are also some real similarities if we align them correctly.

Conclusion

The development of any non-trivial software system is a creative endeavor. The steps towards the solution, or even the ultimate solution, may not be completely known at the outset of the endeavor. In fact, during the development of the system, issues that had not even been conceived of at the outset may arise and need to be solved.

When developing a non-trivial software system:

  1. We must accept that we can't always know the solution in its entirety before we begin the endeavor.
  2. And therefore, we cannot plan the entire endeavor at the outset.
  3. And therefore, we must allow the planning to be fluid as new issues arise and must be solved.

The above statements run counter to the idea that we can plan out the development at the outset of the endeavor and expect to adhere to that plan. Furthermore, when projects go "off the rails", the cause is not necessarily a shortcoming of the team, but rather a stubborn adherence to the notion that a plan can be created and executed without a real understanding of what truly needs to be done. And sadly, these adherents express a naive surprise when the actual work doesn't follow that plan. We must keep in mind that as the creative endeavor progresses, the solution to one problem may lead to the need to solve another, which requires planning that could not have been done at the outset.

Saturday, June 5, 2010

Out of Control Controls and Processes

Summary

Processes and controls are critical for any enterprise. Well-constructed processes and controls describe how things get done, provide order and repeatability, and reduce the risk of undesirable outcomes. However, no process and set of controls can guarantee that there will be no errors. And a poorly constructed process can actually do more harm than good.

The key is balance. The process and its controls should be as simple as possible, no simpler, and no more complicated than it needs to be. Achieving this balance requires:
  1. A clear and crisp set of objectives that the process, and its controls, is intended to achieve
  2. An understanding of the costs and risks of implementing the process and its controls, versus not implementing them
  3. In-depth knowledge of the domain for which the processes are being developed
  4. An understanding of the limits of what a process and its controls can do

Introduction

No doubt. Processes are important for businesses. So are controls. Don't get me wrong. I believe in the need for good processes and solid controls. Anarchy, however appealing to some, is, well, anarchy. And some order is needed to ensure that things get done how and when they need to get done. Entropy increases over time unless you expend energy to reduce it. So there needs to be some mechanisms in place to keep things in order.

But an overreliance on process and controls isn't the solution. In fact, complex processes with many controls can substantially increase the risk to the enterprise in terms of opportunity costs. If the processes are so complex and the controls so burdensome that the enterprise must expend extraordinary effort to accomplish even trivial tasks, then it is logical that there may be little risk of error. Yet, in this case, the ratio of output to expended resources becomes small. And the very existence of the enterprise is at risk.

Clearly, on the other hand, having no processes and controls can lead to errors that result in the end of the organization as well.

Definitions

Let's start with a few definitions.
A process describes the act of taking something through an established, and usually routine, set of procedures to achieve a defined outcome.
Examples of processes include the act of converting recycled PET into bottles for holding spring water, the on-boarding of a new employee, or the purchase of an asset for a portfolio.
A procedure is a sequence of steps that are performed to accomplish a task.
Examples of procedures are the sequence of steps a pilot performs as part of the safety check process, the sequence of steps to log on to your computer, or the sequence of steps performed to close out your cash register at the end of your shift.
Processes may contain controls that provide checkpoints to ensure that the process is being followed or determine whether the process is in control. 
Examples of controls include a measurement of the thickness of the plastic bottle after the first injection, approval to allocate various resources to a new employee during the on-boarding process, or the required approval by the portfolio manager when the asset being purchased has a price above some threshold.
The set of controls and their objectives are often referred to as the controls framework.
Examples of a control's objective may be to reduce the likelihood of errors or malfeasance, and to give management comfort that the enterprise's processes are being followed. Examples of control frameworks are COBIT and COSO.

Process Limitations

Let me back up and talk about what processes can and cannot do. As I defined above, a process describes the act of taking something through an established set of steps to achieve a defined outcome. Usually, the procedures of a process are routine. This means that you have a repeatable set of steps, which are likely to be executed many times over the course of the enterprise's existence. In other words, you have written down the steps for something that is being done over and over again, and you want to ensure that the appropriate outcome occurs every time. An example of this could be the manufacturing of a bottle. Or it could be a nightly marking and recording of economic variables used by the enterprise. In each case, you want the same outcome to occur every time. You want each bottle coming off the manufacturing line to have the same measurements (within a specified tolerance) every single time. Or you want to ensure that the same approach to marking the economic variables occurs every single night so that the variables are consistent, and any trending or relative comparisons make sense.

Next, you may add controls to the process to ensure that the process is being followed. For example, you may add a device to measure the thickness of the side of the bottle; noting when a bottle is out of tolerance, and potentially stopping the manufacturing line. Or, you may have someone review and approve the marks for the economic variables before they are moved to a central, shared repository.
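As a rough sketch of the bottle example, a control of this kind might look like the following. The class name, the target and tolerance values, and the rule of stopping the line after a number of consecutive failures are all assumptions I made up for illustration; a real line would have its own rules.

```python
class ThicknessControl:
    """Checks each measured wall thickness against a target and a tolerance."""

    def __init__(self, target_mm, tolerance_mm, stop_after=3):
        self.target_mm = target_mm
        self.tolerance_mm = tolerance_mm
        self.stop_after = stop_after  # consecutive failures before stopping the line
        self.consecutive_failures = 0

    def check(self, measured_mm):
        """Return 'ok', 'out_of_tolerance', or 'stop_line' for one bottle."""
        if abs(measured_mm - self.target_mm) <= self.tolerance_mm:
            self.consecutive_failures = 0
            return "ok"
        self.consecutive_failures += 1
        if self.consecutive_failures >= self.stop_after:
            return "stop_line"
        return "out_of_tolerance"


control = ThicknessControl(target_mm=0.30, tolerance_mm=0.02, stop_after=2)
for reading in (0.31, 0.35, 0.36):
    print(reading, control.check(reading))
```

The control is only as good as the number handed to check(). It has no way of knowing whether the measuring device itself is telling the truth.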

These controls can prevent some types of errors, but remain completely ineffective against potentially unexpected events. In the manufacturing line, the measuring device could fail or be incorrectly calibrated. You may argue that part of the process could be to check the calibration every x number of hours. But in order to check the calibration, you may have to shut down the manufacturing line. And shutting down and restarting the manufacturing line can be too costly, lead to additional risks of equipment failure, or make the manufacturing process too inefficient and expensive relative to competitors. To return to our example of the nightly marking of economic variables, the review of the marks may check against external data sources and perform additional sanity checks. But if one or several of the existing data sources are incorrect, the review may not (and should not be expected to) catch that difference.
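The review of the marks can be sketched the same way. Here the control compares the internal mark with the median of a few external quotes. The function name, the 1% threshold, and the numbers are hypothetical, and the sketch assumes the reference value is never zero.

```python
from statistics import median


def review_mark(internal_mark, external_quotes, max_relative_gap=0.01):
    """Approve the mark if it is within max_relative_gap of the external median."""
    reference = median(external_quotes)
    # Assumes reference is non-zero; a real check would have to handle that case.
    gap = abs(internal_mark - reference) / abs(reference)
    return gap <= max_relative_gap


quotes = [99.8, 100.1, 100.3]
print(review_mark(100.0, quotes))  # approved: close to the external sources
print(review_mark(105.0, quotes))  # rejected: far from the external sources
```

Now suppose the true value is 95 and every external source is wrong in the same direction. The mark of 100 is still approved, because the control can only compare against the data it is given. It was followed perfectly and the error got through anyway.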

If you're not yet convinced that there are unknowable events against which a process and its controls cannot protect, try to develop a process for driving a car such that no one ever has an accident. Can it be done? Of course it can't be done. There are too many unexpected, and more importantly, unknowable events that occur even during a simple drive to the local grocery store. As you drive, you may follow the process and look in all directions at a four-way stop sign, mark it down in a notebook and have it signed by the passenger so that it is auditable, and while you're doing that, someone driving behind you has a heart-attack and plows into your trunk. What controls or processes could have prevented that?

No process, set of processes, or controls can guarantee that no error or malfeasance will occur. It is true. Good processes and controls can significantly reduce the risk of an undesirable outcome. And good processes and controls are constructed in such a way that they balance negative impact on productivity against the risks of undesirable outcomes.

Routine and Creative Endeavors


Placing a process around routine endeavors can increase, possibly significantly, the likelihood that the outcome is the desired outcome. Through procedures and controls you can build a fair amount of predictability into a process. And as required in many enterprises, the artifacts required by the process can be archived as evidence that the correct steps were taken and made available for internal and external review.

But not all endeavors are routine. Many endeavors we undertake within an enterprise are what I call creative. By creative I mean that they are endeavors for which the steps we must undertake to get to an outcome are not knowable at the outset. In some ways, driving is a creative endeavor. At the outset of your trip to the local grocery, there are many unknowable steps you will have to take. Your tire could blow out and you would have to take corrective action to avoid driving into the median. The details of the actions you need to take are not knowable before they are taken. They are a response to a complex set of circumstances that reveal themselves only when they occur. (It is true that you can come up with general principles, such as "don't pump the brakes, because you have ABS". But those are principles, not procedures.)

There are many problems for which the solution isn't known. These problems must be solved through thought, experimentation (trial and error), discussion, research, or whatever it takes. I call the act of solving these problems a creative endeavor. At the outset you may know what you are trying to solve in detail. On the other hand, the problem may be open-ended and part of the endeavor will be to figure out what the problem is that you must solve. In either case, you may not know what steps you will need to take to develop the solution.

You can rightly argue that you can put a process into place for solving these types of problems. For example, you may argue that the first step is to identify the problem. Then you must develop the solution. Then you must validate that the solution works. And so on. But that is only marginally helpful. In fact, such a process is so high level that it is just plain obvious, and hardly useful at all. The fact that the steps toward the solution aren't known means that no process can be described in any significant detail which lays out the sequence of steps that must be taken to arrive at the solution. Rather, it is during the endeavor that the next steps may become clear—or maybe not.

This doesn't mean that you can't or shouldn't put a process in place to manage a creative endeavor. No. I believe that you should. But for a creative endeavor, the objectives of the process and its controls should be carefully aligned with the risk they are intended to mitigate. Any creative endeavor that could have an impact on the enterprise should clearly have controls to reduce the risk of undesirable outcomes.

Take, for example, the development of a mathematical model. Most people would agree that this is a creative endeavor. Most people would also agree that before the model is used for driving decisions, the model must be fully tested, and the results of that model should match the expected results and be sensible. In this case, you may put into place a process that focuses on ensuring that the model is well tested and gives reasonable results and, at the same time, puts less focus on the steps that were taken to arrive at the model in the first place. You may also want to ensure that the model and its implementation are well documented, to protect the business should the author of the model leave the organization. And so you add that to your process.

As another example, take the development of a piece of software by an organization outside of the enterprise. Because the requirements provided to the external organization ("Vendor") form part of the contractual obligation, the requirements are clearly important. In this case, the process should ensure that the requirements are clear and complete before the vendor begins the development. It should also ensure that there is a clear arrangement in the contract so that requirements can change, which is likely.

In summary, developing sensible and realistic processes and controls is complicated. For routine endeavors, at least, we have a good idea of the sequence of steps and the desired outcome. But for creative endeavors, developing processes and controls is complicated because the solution and the steps towards the solution may not be knowable at the outset.

Developing the Process and its Controls

Overly complex processes and too many controls can substantially increase the risk to the enterprise in terms of opportunity costs. If the processes are so complex and the controls so burdensome that the enterprise must expend extraordinary effort to accomplish even trivial tasks, then it is logical that there may be little risk of error. Yet, on the other hand, the ratio of artifacts produced to resources expended becomes small. In other words, the cost (resource expenditure) of producing output becomes large, profits (monetary or reputation) decrease, and the very existence of the enterprise is at risk.

Clearly, on the other extreme, having no processes or controls in place can lead to errors that result in the collapse of the organization as well.

The key is balance. But how do we achieve this balance?

Clear, Crisp Objectives

Before developing a process and its controls, we must define what the process is attempting to describe and what risks it is attempting to mitigate. Are we attempting to prevent someone in the back office from depositing funds into their account? Are we attempting to ensure that the numbers for financial reporting are correct? Are we attempting to ensure that we have adequate documentation for a mission critical application? Whatever the objectives are, we must know them before we can build a process to attempt to achieve them.

The objectives are only part of the equation. We must also understand the costs.

Understanding the Costs

We achieve part of the balance by ensuring that the cost of executing the process and its controls is significantly less than the cost of the undesirable outcome. To illustrate this point, imagine a scenario where the cost (monetary, reputation, etc.) of the process and its controls is equal to the cost of the undesirable outcome. Further, imagine that in this case, if we don't use the process, we always get the undesirable outcome. This implies that if we don't use the process we incur the cost of the undesirable outcome. And it also implies that if we do use the process, we incur the cost of the undesirable outcome (recall, that was the cost of executing the process). These two implications lead to the conclusion that we incur the same costs regardless of the process. In other words, we gain nothing by using the process.

To further the argument, let us imagine that for the same endeavor, we get the undesirable outcome only some fraction of the time if we don't use the process. In this case, not using the process costs the enterprise, on average, only a fraction of what it would have spent had it followed the process.

In other words, when the process costs as much as the undesirable outcome, then unless the undesirable outcome occurs every time we don't adhere to the process and its controls, the cost to the company ends up being higher when the endeavor adheres to the process and its controls than it would be if they were merely neglected. Clearly, this does not make sense for the enterprise.

The solution in this case isn't to get rid of the process and its controls. Rather, the solution is to lighten the process and its controls so that the cost of executing them is only a fraction of the cost of the undesirable outcome. And that fraction should be smaller than the likelihood that the undesirable outcome occurs when the process and its controls aren't followed.
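
For those who like to see the arithmetic, here is a minimal sketch of the comparison above in Python. The function names and the dollar figures are made up for illustration. The p_with parameter is my own added assumption: it allows for the outcome still occurring some of the time even when the process is followed, and it defaults to zero to match the simplified argument above.

```python
def expected_cost_without_process(outcome_cost, p_without):
    """Average loss per endeavor when the process is skipped."""
    return p_without * outcome_cost


def expected_cost_with_process(process_cost, outcome_cost, p_with=0.0):
    """Cost of running the process plus any residual expected loss.

    p_with is an assumption added for illustration: the chance that the
    undesirable outcome still occurs even though the process was followed.
    """
    return process_cost + p_with * outcome_cost


def process_pays_off(process_cost, outcome_cost, p_without, p_with=0.0):
    """True only if following the process is cheaper on average."""
    with_process = expected_cost_with_process(process_cost, outcome_cost, p_with)
    without_process = expected_cost_without_process(outcome_cost, p_without)
    return with_process < without_process


# Hypothetical numbers: the bad outcome costs 100,000 and happens
# one time in four when nobody follows the process.
print(process_pays_off(100_000, 100_000, 0.25))  # process as costly as the outcome: False
print(process_pays_off(10_000, 100_000, 0.25))   # process at a tenth of the outcome: True
```

With these made-up numbers, the break-even point is a process cost of 25,000, which is the likelihood (one in four) times the cost of the outcome. Anything heavier than that costs the enterprise more than it saves.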

OK, so that's all in theory. In practice, the cost to the enterprise of the "undesirable" outcome is difficult to determine. The likelihood that the "undesirable" outcome occurs is also difficult to determine. In cases where the endeavors are routine, there is hope. When we have routine endeavors, we can capture statistics. We can use these statistics to determine the likelihood of "undesirable" outcomes. And we can capture the actual costs of these "undesirable" outcomes.
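
As a rough sketch of what capturing statistics for a routine endeavor might look like, the Python below estimates the likelihood and the average cost of the undesirable outcome from a log of past endeavors. The record layout, the names, and the sample data are all hypothetical. A real enterprise would need far more records than this before trusting the numbers.

```python
from dataclasses import dataclass


@dataclass
class EndeavorRecord:
    """One past endeavor (hypothetical layout, for illustration only)."""
    followed_process: bool
    had_bad_outcome: bool
    loss: float = 0.0  # cost of the bad outcome, if there was one


def estimate_risk(records):
    """Return (likelihood, average loss) for endeavors that skipped the process.

    Returns None when no endeavor skipped the process, because then
    there is nothing to estimate from.
    """
    skipped = [r for r in records if not r.followed_process]
    if not skipped:
        return None
    bad = [r for r in skipped if r.had_bad_outcome]
    likelihood = len(bad) / len(skipped)
    average_loss = sum(r.loss for r in bad) / len(bad) if bad else 0.0
    return likelihood, average_loss


# Made-up sample log: four endeavors skipped the process, one went wrong.
sample_records = [
    EndeavorRecord(followed_process=False, had_bad_outcome=False),
    EndeavorRecord(followed_process=False, had_bad_outcome=True, loss=8_000.0),
    EndeavorRecord(followed_process=False, had_bad_outcome=False),
    EndeavorRecord(followed_process=False, had_bad_outcome=False),
    EndeavorRecord(followed_process=True, had_bad_outcome=False),
]

print(estimate_risk(sample_records))
```

The two numbers that come out are the ones the cost argument needs: how often things go wrong without the process, and what it costs when they do.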

In the end, though, we need to use judgment to guide the development of the process, its controls, and the real risks we are trying to mitigate.

In-Depth Domain Knowledge

Good judgment requires, at a minimum, an understanding of the domain in which you are making judgment calls. An investor who makes decisions on buys and sells without an in-depth understanding of the products in which (s)he is investing is not an investor, but rather a gambler. The development of a process and its controls requires an in-depth understanding of the risks involved in the endeavor that the process aims to describe and control. Developing a process without in-depth domain knowledge runs the risk that the controls are too light where they should be heavy, too heavy where they should be light, or some combination of both. I have seen processes developed by people who don't have the domain knowledge. In each instance, they have attempted to provide controls that prevent undesirable outcomes. And as we now know, this is not possible; we can only reduce risk to an acceptable level.

A good process needs to be low drag. It should align with the way its adherents work. The process may require additional work, but that work should make sense, be logical, and have an intuitive purpose. A good process must allow for the nuances of the endeavor where they are truly needed, render unwanted nuances unnecessary, and prevent those that shouldn't occur. A good process should have little overhead. For example, how much does it cost to take a simple endeavor (say one hour's worth of work) through the process--two hours, three days, one month? I argue that developing a "good" process as described above requires an understanding of the endeavor the process is meant to describe and control.

As an aside, in physics there is a principle of least action that says that the path followed by a real physical system is that for which the action (think expended energy) is minimized. This applies to people as well. And so when the process is too onerous, people tend to look for an easier path. An easier path may not have the needed controls. In this way, they may circumvent important controls.

Understand the Limits of What the Process Can Solve

From our discussions about developing a process to drive a car, we learned that it is not possible to create processes that prevent all undesirable outcomes. When we develop processes and controls, we must resist the temptation to build processes that "prevent" all undesirable outcomes. At best we can hope to reduce the risk of undesirable outcomes. When we attempt to build processes to prevent all undesirable outcomes, we tend to make the processes extremely costly and complex.

Understanding the limitations of processes--what they can and cannot solve--helps us develop processes that are simpler and less costly to execute and that, at the same time, address the areas where they can actually reduce risk.

Coupled with an understanding of the real risks and the costs of the undesirable outcomes, understanding what the processes can and can't solve helps us develop a reasonable balance between the cost of executing the process and the risk of loss.