If you’ve seen it once, you’ve probably seen it a thousand times. Hey, you may even have done it yourself.
We’re talking about the extended horizontal box that’s about an inch tall and (typically) traverses the lower third of a PowerPoint slide. The box begins all the way over on the left and stretches all the way to the right. Neatly placed in the middle is a label bearing the term “Data Governance.” Ah, yes, that box!
If you allow your eyes to scan upwards along the left-hand side (establishing a mental image for yourself), you’ll probably see an old familiar list of source objects. Rather than a single icon representing a multitude of sources, there will be an icon for each source type: databases, applications exposed through an API, message queues, change data capture output from database logs, and IoT devices. And that’s just the list of sources marked as “internal.” There will be another set of icons tagged as “external.” The external icons would undoubtedly include labels such as vendor, supplier, open data, weather data, additional IoT devices, and a plethora of social media feeds. Presumably, all of this amounts to nothing unusual? Everything is seemingly okay, no?
Ostensibly, yes; but semantically, not so much.
So, what’s the underlying issue here?
The issue is that the diagram, through the elongated box labeled “Data Governance,” explicitly asserts the role and function of data governance over these source feeds. All of these source feeds! The purview and expanse of data governance implies, diagrammatically at least, that the internal sources and all of the external sources too are under the unilateral control of data governance. Semantically, does the diagram really mean what it is insinuating? Namely, can you exert data governance over assets that you don’t own or control outright?
Are we simply ignoring semantics, or are we misunderstanding the role and function of data governance? Perhaps another question is: does this even matter? Let’s use an example. A particular person within the organization enters data into a source system that is shared with other internal systems. The problem is that this person often flips the month and day fields of a date when keying in a transaction, so that June 1st becomes January 6th and vice versa. A data steward or data custodian who represents “data governance” may very well have free rein to go and talk with this person about the data quality issue and provide some additional tutelage to prevent any recurrence.
Now consider a comparable situation that, hypothetically, involves a Twitter feed where someone is consistently misusing a word. Let’s say that word is “irony” since, ironically, irony is a word that is often misused! For example, in Alanis Morissette’s song “Ironic,” from Jagged Little Pill, she ends each verse with a question to the listener: “and isn’t it ironic, don’t you think?” However, none of the scenarios laid out in the verses could actually be considered ironic. The upshot is that whether you’re critiquing a song lyric or what someone is writing in a social media feed, you aren’t going to be able to send your team of data stewards to have a quiet word with the individual and improve the quality of their writing.
Semantically, the horizontal box for data governance (in our diagram) should begin somewhere under the line that connects one or more of the sources with their initial targets. The message is that you can’t govern what you can’t control, and for many data governance initiatives, control over the external sources is remarkably absent. Additionally, the ability to assert control can be weak or non-existent even over the internal sources, because those sources may belong to other organizations and your enterprise data governance program may not be as enterprise-wide as you’d presuppose.
In the future, if you see such a diagram, with the data governance function encompassing every source type, challenge the representation: does “data governance” actually apply to these sources? If a picture is worth a thousand words, let’s make sure we first remove the semantic dissonance from the narrative.