Frustration: The Data Governance Sinkhole

by Ian Hellström (23 November 2014)

Plentiful are the companies that revel in fancy descriptions of data-driven decision-making cultures on their corporate websites. Scarce are they who actually have a data governance office to back up these grand claims, for any data-centred programme without clear definitions and business processes regarding data is doomed to fail.

What is less known about data governance is that there is a phase during which companies run a risk of losing their best and brightest because of inaction or even worse: wrong actions. Michael Lopp has written an excellent article on why bored engineers quit, and as bad as that may be in general situations it is disastrous in the early phases of a data governance programme.

Maturity Models

There are many maturity models for data governance and the companies behind every single model promulgate their own as the bee’s knees. In essence, they all share the same characteristics as they describe the phases organizations go through on their way from data anarchies to realms where data is king. The models differ in the number of individual stages and the names given to each stage but that pretty much is it. The National Association of State CIOs has summarized the most commonly used models in case you are interested in the details.

I particularly like the DataFlux maturity model as described by Tony Fisher in the must-read The Data Asset because it is succinct yet it covers all relevant aspects. According to that model, there are four stages: 1) undisciplined, 2) reactive, 3) proactive, and 4) governed.

Figure: DataFlux's maturity model for data governance

Undisciplined organizations are stuck in firefighting mode: whenever issues with data pop up they are dealt with on a per-case basis and only, to put it bluntly, when the shit hits the fan. There is discord about ownership: business users see data quality issues as IT’s problem, whereas IT only feels responsible for the technical administration, not the contents. Without executive interference this ownership ping-pong can go on for years with IT usually bearing the brunt of the workload. If there is an effort to improve data quality it is often outsourced, which in itself causes more problems in the long run than it solves. A data governance programme is typically only initiated after a data catastrophe.

‘Your data is your business.’ (Tony Fisher, The Data Asset, p. 81)

When these organizations mature they can reach the reactive stage where units or departments still work largely independently and with little executive oversight but where there is a need for cross-functional data gathering and thus coordination and collaboration. Companies in the reactive stage are often involved in CRM, ERP, or data warehouse implementations. Do not let that fool you though: many companies attempt to build a data warehouse and the like without having done the groundwork. I have seen it happen and fail. Horribly.

‘More than 40 percent of companies take on CRM or similar projects without understanding data quality problems in existing systems.’ (Tony Fisher, The Data Asset, p. 50)

Once organizations have come to terms with the idea that data is a strategic asset and not something that each department stores in silos that need to be guarded against poachers from other departments, they can climb to the proactive stage. It is a steep climb but the efforts are worth it. Proactive organizations have clear, unified views of their data assets and they have started to deploy master data management (MDM) to guarantee that the core business data is standardized across the enterprise. Compliance is a key component of proactive organizations, not to police people but to ensure that everyone sees the same data in the same way: there is only one way to interpret data and that way is known to all and documented for all to see. Data stewards or data custodians are appointed and responsible for the accuracy and relevance of corporate data.

In the proactive stage the foundations for the governed stage have already been laid and ‘all’ that is required to reach the next level is a continuation of all the hard work done so far. A zero-defects policy is commonly adopted and people are aware of all facets of data quality. Because the organization trusts its data to the bone it can invest in business process management (BPM) automation, thus freeing valuable resources to work on improving the organization and its products and/or services. Such organizations additionally tend to move towards a service-oriented architecture (SOA), in which data services allow access to corporate data. It is important that complacency does not take hold once organizations have reached the governed stage: data governance requires continuous investment and vigilance.

Any successful enterprise-wide data governance strategy must encompass people, processes, and technology. First, you need to have executive sponsorship and, more importantly, a willingness for IT and business people to cooperate. Executive support is crucial and it goes beyond saying, ‘You have my support. Now go and do it.’ Second, processes that accurately document what is considered high-quality data and how that high-quality data is collected, handled, maintained, and archived need to be defined, refined, and enforced. These processes and standards need to be captured in business processes that the organization lives by. Third, all solutions are built on technology that needs to be understood, developed, and managed.

People are without a doubt the most critical piece of the puzzle. Data governance requires people to think differently, to open up, to share, to admit failure and fix issues. Issues they often do not have because of their myopic view inside their own data silo. Executives sometimes choose to overlook the issues involved because they inherently want to believe in the goodwill of their employees. Often they do not believe or know their data is in such a terrible state because they only see what people below them want to show: massaged data and reports.

Truth be told, it is tough to break boundaries between business units and departments without executive support and even if your data governance programme has executive backing, the desire of departments to cooperate is not guaranteed. I have frequently been on the receiving end of passive-aggressive behaviour and without direct proof, which by the very nature of the behaviour is difficult to obtain, it’s their word against yours. Moreover, business units in many undisciplined and reactive organizations seem to live with the paranoid belief that other units want to look at their data to show how awful it is. The fact of the matter is that these units have their own data problems and could not care less about yours. The conundrum with paranoia is that anyone who points out fallacies in the argumentation is labelled as a co-conspirator and thus ignored.

Frustration

Usually there are not too many data wizards at companies: people who know both the intrinsic business value of data and the technical intricacies of the various data sources. The list of candidates for the role of data steward is equally small. People who are knowledgeable and respected on both the business and the IT side of the aisle are hard to come by.

As undisciplined organizations crawl out of their cocoons and try to fly in reactive mode, there is a real risk of alienating those people on whom a successful data governance programme depends the most. The capable engineers who sift through gigabytes of data are easily overlooked and neglected. The work is often thrown into their laps as something that simply has to be done:

‘[B]eing left too long on “have to” work is a guarantee of eventual boredom.’ (Michael Lopp, Bored People Quit)

And bored people quit.

Dull but important tasks such as data profiling are frequently dismissed as extravagances that impede quick progress, which causes even more trouble down the line. Frustration is a common gripe on data warehouse projects and it is one that in the infancy of a data governance programme can seriously undermine your efforts.

‘Reactive organizations risk losing their smartest data managers when those individuals become frustrated by the lack of attention paid to data quality issues and choose to join organizations that are more responsive.’ (Tony Fisher, The Data Asset, p. 81)

Another common cause of frustration and eventually boredom is when developers are not consulted on possible solutions. I have lived through such an episode where a data warehouse was outsourced because the contractor sold it as the best thing since sliced bread. The technical know-how to build the data warehouse internally was there but the contractor promised to be a lot quicker than what any in-house data expert considered to be realistic. The reservations voiced by several knowledgeable people, including myself, were brushed off as being overly pessimistic and an indication that the critics needed to lighten up and ‘let the grown-ups do the talking’. Even the most basic data profiling tasks were skipped to ‘speed up the project’, which, as you may have guessed, caused considerable delays in the deployment because the validation took much longer than anticipated. In fact, the project took a lot longer than the critics had estimated it would take had they done it themselves.

The ‘grown-ups’ also wanted certain features in applications that were built on top of the data warehouse. Several software engineers proposed solutions that seemed to cover the requirements and even went beyond the ones specified. The solutions elegantly allowed for future changes, changes that they already knew would come up because they were not born yesterday. Instead of at least hearing out the engineering team, management told the engineers to do what was asked and not come up with ideas of their own. On one occasion the team was even asked to produce data sets that did not exist. When they pointed out the deficiency in the data sources, they were told to simply generate the data and stop complaining. How? It’s not as if developers can magically make data appear at the drop of a hat. When management has little or no idea of what is actually involved in data-related tasks it is very difficult to move forward with data governance.

Needless to say, the ‘solutions’ required continuous fixing because a lot of the problems were unforeseen by the people who requested these particular solutions, and the fixes needed even more fixes. (Software) engineers live to solve problems; they are trained to solve problems and (usually) they are quite good at it. Bugs that have to be fixed have to be fixed. That’s a fact of life. But hardly any engineer enjoys fixing other people’s bugs all day, especially when there is a solution in a drawer that would have reduced the number of headaches. Have-to work is a mental drain that causes frustration and that frustration eventually leads to boredom. And by now I think you have a pretty good idea of what happens once boredom kicks in.

You do not want people to feel like caretakers cleaning up after all who do not care about cleaning up after themselves. A data governance programme, as limited as its scope may well be in the beginning, allows the data janitors to measure progress. It’s a way for them to see the finish line, as far away as it may yet seem. Without data governance the work they do is futile and that ultimately leads to frustration.

What I am not saying is that data experts and business-savvy engineers are porcelain dolls that need to be handled with care, although treating them like human beings rather than mindless drones is a start. What I am saying is that, since they have their heads inside all systems all day, they have a pretty good picture of what can and needs to be done, so do not dismiss their concerns straight away.

In all data-related projects it is important to address the data experts’ concerns. They are the ones who do most of the heavy lifting. They are the ones who get their hands dirty and deal with data issues that have been around since before their time. They are the ones who swim against the tide. Without their full engagement companies have little chance of launching a successful data governance programme, and corporations risk alienating the ones they cannot afford to lose. Organizations will be left with a hole where once their expertise was. And without the efforts of these data gurus, an enterprise data strategy seems all but impossible to reach.