Don’t Let Your Cloud Migration Become A ‘Clusterf*ck’
Earlier this year, the UK-based bank TSB attempted a major IT migration, a massive project several years in the making. The result was a complete and utter fiasco.
Customers were unable to log into their accounts. Data from some customer accounts appeared in different ones. Obscure technical error messages abounded. Customers overwhelmed call centers to the point that beleaguered call center reps walked off the job.
In an effort to stanch the bleeding, the bank waived tens of millions of pounds’ worth of overdraft fees and increased interest payments to cover customer losses.
At the center of this ill-fated initiative was a migration from a mishmash of legacy apps to a modern suite of applications, in part in the public cloud. Despite the noble goal of moving legacy apps to the cloud, the effort was, as one insider described it, “a clusterfuck in the making.”
How could such an effort go so wrong? And perhaps more to the point, how can you avoid your cloud migration suffering a similar fate?
Fiasco Years in the Making
Lloyds Bank in the UK spun off Trustee Savings Bank (now TSB) after the 2008 financial crisis. Even as an independent bank, however, TSB continued to use Lloyds’ aging IT platform, which consisted of a mix of older apps that Lloyds had accumulated over several years of bank acquisitions.
In 2015, Spain’s Banco Sabadell acquired TSB for £1.7 billion (about $2.2 billion), but TSB remained on the legacy platform. Migrating to a new platform, however, was an urgent priority, as TSB was paying Lloyds as much as £220 million ($286 million) per year for use of its platform.
To this end, Sabadell decided to migrate TSB’s application infrastructure to its own Proteo platform, which was based on Accenture’s COBOL-based Alnova system. Sabadell, however, had heavily customized this platform over the years.
TSB planned to customize Proteo even further for its own purposes and to migrate it (at least in part) to the AWS cloud from Amazon.com. The bank dubbed the result Proteo4UK.
By late 2017, TSB leadership was sanguine about Proteo4UK. “We have created a more digital, agile and flexible TSB,” crowed Paul Pester, CEO of TSB.
The bank played up the speed of the migration. “We have a great partnership with Sabadell and the development team in Barcelona,” said Genevieve Kangurs from the Digital Transformation Office at TSB. She said that Proteo4UK boasted “speed, adaptability and flexibility unheard of before.”
Sabadell leadership was equally sanguine. “With this migration, Sabadell has proven its technological management capacity, not only in national migrations but also on an international scale,” bragged Josep Oliu, Chairman of Banco Sabadell.
Fell Over Right Away
It didn’t take long for people to realize such confidence was unwarranted. Within hours of TSB flipping the switch on Proteo4UK in April 2018, the system came to a screeching halt, locking out up to 1.9 million TSB customers – and problems persisted well into the following week.
The same insider at the bank spoke to The Guardian about the situation in the weeks following the debacle. “It seemed a bad fit for a smaller bank to inherit all the problems of a bloated mess to service far fewer customers,” said the insider. “I could have put money on the rollout being the disaster it has been, with evidence of major code changes on the hoof over last weekend and into this week.”
The insider had more to say. “The time period to develop the new system and migrate TSB over to it was just 18 months,” the insider added. “It was unbelievable – hardly even a prototype or proof of concept, yet it was supposed to be fully tested and working by May before the integration work started.”
Customers were livid. Here is one tweet among hundreds. “It’s getting worse as the day goes on,” tweeted Jobseeker (@Anonymous_Nottz). “I can’t login online at all now!! I’m locked out of the app as its trying to text me on a number I changed 3 years ago.”
Avoiding Your Own Migration Fiasco
TSB made many mistakes, including an overly aggressive timeline, inadequate testing, and an untold number of technical faux pas. As with many legacy migration efforts, however, the biggest mistake was tackling the entire effort as a single ‘big bang’ initiative.
Many organizations face the choice between two risky alternatives. “Today, enterprises making the shift to the cloud face two options: The first option is to ‘lift and shift,’ moving applications to the cloud as they are currently running on premises, and moving all the headaches to the cloud too,” explained Google vice president of technical infrastructure, Urs Hölzle. “Or it’s a very scary transition, where you have to rewrite a lot of things.”
TSB bet on the latter choice, which was perhaps their primary mistake. Fortunately, many organizations are learning this lesson. “Not long ago, many enterprise teams we spoke with believed there were only two ways to deal with existing applications – rearchitect them for the cloud, or keep them on-premises indefinitely. But that’s a false choice,” explained Dan Jones, VP of Product for Skytap. “Smart IT teams understand the cloud has benefits for many application types and that rearchitecting isn’t always necessary, or even advisable, to gain those advantages.”
In some cases, rewriting apps for the cloud does make sense – but it’s important to take a gradual approach. “Rebuilding your app to take advantage of cloud features can help you realise reduced costs, reduced risk and less need for operational support,” said Richard Latham, principal consultant at KCOM. “Trying to do everything at once runs the risk of slowing down the entire migration process, multiplying costs and can even delay implementation by as much as several years.”
The bigger the migration effort, the more can go wrong. “Typical migration solutions are high risk and can take years to complete. This is because altering programs and applications can require writing thousands, and in some cases, millions of lines of code, meaning that the scope for error is far-reaching,” said Carl Davies, managing director at TmaxSoft UK. And yet, “cloud-based architectures are certainly possible and offer tremendous flexibility. Almost all forms of processing are now moving to open systems made possible by the cloud.”
Failure may sink the entire business – an eventuality that may or may not be in the cards for TSB. Success, however, can lead to strategic benefits. “It’s not enough to merely shift applications and infrastructure to the cloud. By critically evaluating the options for the whole business, companies can benefit from opportunities to advance or extend their business models and stakeholder experiences,” explained Suranjan Chatterjee, global head of cloud apps, microservices & API-fication Unit at Tata Consultancy Services. “When executives emphasise business drivers in their decision-making to move away from legacy systems, it will open up cloud opportunities beyond cutting capital expenses, ending software maintenance contracts and reducing IT staff.”
Postscript: Will TSB Survive?
TSB’s failed migration left the bank with both a customer relations nightmare and a technical quagmire that it has had to clean up quickly.
The fiasco (outside of the technical fixes) may end up costing TSB as much as £150 million ($195 million), according to Mark Brown, who heads up TBU, a union that represents many TSB employees. This number includes compensation to customers, fines, increased costs, and lost business – as well as the cost of addressing rampant fraud, as criminals have been quick to take advantage of the situation.
To clean up the technical mess – and most importantly, the sorry state the fiasco had put the bank’s data in – TSB hired IBM.
IBM’s estimated cost? As much as £955 million ($1.26 billion), according to an article on Wolf Street. This number includes the technical fixes and cleaning up the data mess. Compared to the £1.7 billion purchase price for TSB, these numbers would suggest that the fix may not be worth the trouble, customers be damned.
Only time will tell – but the moral of this story is clear. Cloud or no, IT migration promises both substantial upside and commensurate risk. There are no panaceas, and it is all too easy to oversimplify the alternatives.
And most of all? Know what you’re getting yourself into. “Where is the problem simply about infrastructure – which is the case for many traditional, scale-up apps – and where is the issue an app’s functionality?” asks Skytap’s Jones. “The answers are critical to creating a cloud strategy that actually solves your problems and delivers measurable value.”
Intellyx publishes the Agile Digital Transformation Roadmap poster, advises companies on their digital transformation initiatives, and helps vendors communicate their agility stories. As of the time of writing, IBM and Skytap are Intellyx customers. None of the other organizations mentioned in this article are Intellyx customers. Image credit: Dan Foy.