Understanding Risks in IT Projects

Introduction

It is common practice that when a business, whether small or large, decides to undertake an IT project, the first thing a project manager will do is whip out a Gantt chart and start planning: drawing up the budget, the timeline and the expectations. This is then presented to the board (in big businesses) or the owner (in small businesses) for approval. Then voila... everything should work according to plan, with some minor adjustments factored in, right? Wrong. This is a common misconception, and those who hold it ultimately learn that any IT project, big or small, carries risks; the bigger the project, the bigger the risks and the higher the stakes. These risks may eventually cost the business heavily in monetary losses or damaged reputation, and this can happen in all sorts of unthinkable ways. That said, even small IT projects have had severe negative repercussions. Trying to "dive in" without first spending some time doing the groundwork, talking to the relevant parties and exploring the bigger picture is simply a recipe for disaster.

The Devil is in the Details

The common thing that happens when a project is first tabled is that assumptions are made. While it is easy to make assumptions, the problem is that some of them appear true until someone actually does the work and finds otherwise. This can happen despite the best intentions of all parties. For example, I have come across many cases where third-party products (software and hardware) promise certain specifications, only to fail to live up to those promises when actually put to use. The sales teams from these third-party companies typically try very hard to sell and make promises to the procurement team, and the buyer is convinced it is the right solution without running it past the stakeholders, until the people who do the work actually try to use it.

Without naming any companies (and risking getting sued), one multi-million dollar company that specialises in producing software libraries promised that their new product would support the legacy interface and persuaded us to upgrade at a cost of £10,000, only for us to find out that it was not 100% compliant. The result? A major software rewrite that almost doubled the initial cost, as a partial, half-baked solution was not an option. Another multi-million dollar shopping cart company endorsed by Google stated that their automated payment system fully supported Google Checkout for digital goods, only for me to point out after testing that it failed to automatically email the license key upon purchase. It had to be done manually, which defeats the whole concept of automation. When this was brought to their attention, it took them nearly 4 weeks (and counting) to try to fix their bug. Needless to say, these problems cost money and we will not be reimbursed by the party at fault. I should be glad, then, that at least some companies have the decency to acknowledge and fix their faults. Shockingly, some even refused point blank and blamed others, including their own customers, for the failure of their own products.

Small changes can have a very BIG impact

I remember all too well that, many times, when asked how long a seemingly easy software change would take, some talented software engineers would say, "Ah... that's easy, it's a five minute job." In one case I protested and said that it would take two days, and obviously everyone thought I was trying to be funny. The result? The job took just under two days. Why? Because the actual change may take only five minutes, but there are other associated costs, such as integration and testing, not to mention housekeeping. Before the job is carried out, one must carefully look at the bigger picture and think through the implications: the change could cause an unexpected problem somewhere far away that seems unrelated, or break something obvious. In this case, the five minute job would have allowed the end user to click a button and wreck the system.
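
To make that concrete, here is a minimal, entirely hypothetical sketch (the classes and names are mine, not from any real project) of such a "five minute" change: un-hiding a reset button. The one-line change is trivial; the guard around it, the integration and the re-testing are where the two days go.

    # Hypothetical sketch: the "five minute" change is one line (visible = True),
    # but without the guard below any end user could click the button and wreck
    # the system. The guard, integration and re-testing are the hidden cost.

    class User:
        def __init__(self, name, is_administrator=False):
            self.name = name
            self.is_administrator = is_administrator

    class ResetButton:
        def __init__(self, user):
            self.user = user
            self.visible = True  # the "five minute" change: previously hidden

        def on_click(self):
            # The part that actually takes the time: deciding who may click,
            # wiring it into the rest of the system, and testing it.
            if not self.user.is_administrator:
                raise PermissionError("Only administrators may reset the system")
            print("System reset performed by", self.user.name)

    if __name__ == "__main__":
        ResetButton(User("admin", is_administrator=True)).on_click()
        try:
            ResetButton(User("ordinary user")).on_click()
        except PermissionError as error:
            print("Blocked:", error)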

In another case, I told a former colleague the steps that had to be performed just before the first official release. He documented them after I spent an hour with him; it was all in the notes. The company had spent a great deal of money and man-hours on security and encryption, as the new laptops contained very sensitive information that would be invaluable to criminals. The user accounts had been locked down, privileges removed, and every precautionary step taken in case a laptop was stolen. On the eve of the major release, in his haste, he skipped one important step. The result? After a week nobody could log into the laptops, because a security auto-lockdown kicked in by design, and the company had to issue the administrator password to everybody as the outage severely impacted its operations. Releasing the administrator password negated the whole mammoth effort and put the company in harm's way. Weeks were spent thereafter undoing the damage.

At another company many years ago, a small change to fix a bug was applauded, only for the customers to complain vigorously about the fix. The reason? These customers had unwittingly come to rely on that bug in their workflow, and by fixing it we had actually disrupted their workflow and done them a huge disservice. There were only two choices: reintroduce the bug and keep the customers happy, or develop additional functionality so that a different workflow could be introduced. The management decided to do both.

Maintenance

In a lot of project plans there is no mention of a maintenance phase. Projects are expected to complete, and one FTSE 100 company even closed down the budget code upon completion (meaning that once the project had been delivered, there was no more money to do anything else). In reality, every project has a tail-off phase in which bugs WILL be found despite heavy testing, or the usability may not be quite right. For example, a project to upgrade the telephony system to Avaya had to be changed after release, as the call handlers worked in a very particular fashion. I only found out after spending an afternoon training the first operator, when I noticed that he had to manually jot down, on a piece of paper, the phone number displayed after an incoming call, in case he had to ring the customer back when the line dropped. Needless to say, I then introduced a "callback" button post-delivery, which made their lives easier. No input on this had come from the stakeholders during the three month project.

Conclusion

In conclusion, many risks are hard to avoid, but one should at least find ways to mitigate them and never underestimate the risks involved, no matter how small the project is. Many projects do not allow the luxury of time for extra fixes, for example those geared towards a Formula One event or the Olympic Games. Obviously any delay would mean disaster and the project would be as good as dead. In scenarios like these a compromise must be made: can we deliver less, with fewer features or functionalities, but crucially deliver on time?

Testing is of paramount importance, and it must be done extensively even after a minute change. Hence the best practice is to lock down a release (i.e. no more minor changes, quick fixes and so on) and to test that release thoroughly. The stakeholders must be made to understand that while a small change may seem innocent and risk-free, the reality is that no one can provide a 100% guarantee that it is. A small change may leave customers unable to purchase goods on your website, or charged twice. Stakeholders must also understand that any quick fix invalidates any previous lengthy tests and a painstaking re-test is necessary.
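
As an illustration, here is a minimal sketch of the kind of regression suite that has to be re-run after every change, however small. The shop and its pricing function are hypothetical, invented purely for this example.

    # Hypothetical example: a tiny checkout function and the regression tests
    # that must pass on every release candidate. A "quick fix" to the pricing
    # logic invalidates previous test runs, so the whole suite is run again.

    def checkout_total(prices, discount=0.0):
        """Sum the item prices and apply a percentage discount."""
        subtotal = sum(prices)
        return round(subtotal * (1.0 - discount), 2)

    def test_no_discount():
        assert checkout_total([10.00, 5.50]) == 15.50

    def test_with_discount():
        assert checkout_total([100.00], discount=0.10) == 90.00

    def test_empty_basket():
        # Guards against the edge cases a hurried fix can easily break.
        assert checkout_total([]) == 0.00

    if __name__ == "__main__":
        test_no_discount()
        test_with_discount()
        test_empty_basket()
        print("All regression tests passed")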

There are also scenarios to consider: what happens if a stakeholder or key team member falls sick, or has a car accident? I remember that once, when we were days away from a major first release, I fell terribly sick and was shivering the whole day. My phone rang continuously as the project manager was desperate to find out how ill I was; I was too sick to even answer it. Fortunately I was back at work the next day, but it was a close call.

The risk in any IT project, pre- or post-delivery, is huge. Take, for instance, the recent Royal Bank of Scotland computer glitches. The failure was due to a slight human error, but the consequences were devastating, at least for the thousands of customers who could not access their money, make mortgage payments or even complete their house purchases (crucial for those who bought at auction, where delays may mean huge capital losses). Some could not even buy food or fuel.

Understanding the risk is crucial in any IT project, and the stakeholders must be made to think of the doomsday scenario. Risk mitigation is essential and there must be contingency plans for these scenarios. While this may sound pessimistic, it is crucial to the planning, much like having a fire assembly point where employees gather in case the building burns down.
