Root Causes of the Data Integration Problem

The Fundamental Phenomenon – Human Behavior


Writing over a century ago, Emile Durkheim and Marcel Mauss recognized and documented the true root cause of today’s data integration woes (Primitive Classification, 1903, pages 5-6, as quoted by Mary Douglas in Natural Symbols, pages 61-62):

At the bottom of our conception of class there is the idea of circumscription with fixed and definite outlines. 

Given that this concept of classification is the basis of logic, social discourse, religion and ritual, it should be no surprise that it also comes into play when software developers write software. They make assumptions and assertions in the design, data and code of their systems that rely on a fixed vision of the problem. Applications may be written for maximum flexibility in some respects, yet the developers still intend to define the scope of the system; in other words, to bound and fix in place the concepts and relations supportable by the application.

Highly successful ERP products like SAP, JD Edwards, and Oracle Financials allow tremendous flexibility to configure for different business practices. The range of businesses that can make these products work for them is very large. However, it is commonly understood in the ERP professional community (of installers) that there are some things in each product that simply can’t be changed or accomplished. In these areas, the business is said to have to change to accommodate the tool. The whole industry of “change management” was born from the need to change the practice of business to fit the ultimate limitations of these systems, limitations imposed by the conceptual boundaries their authors had to place upon them. (This is a different subject, one which should be pressed and researched.) No matter how flexible the business system is, it is ultimately, and fundamentally, a fixed and bounded symbolic system.

So how does this relate to my claim that Durkheim and Mauss unwittingly predicted the current crisis of data integration? It relates because they go on to point out that:

It would be impossible to exaggerate, in fact, that state of indistinction from which the human mind developed. Even today a considerable part of our popular literature, our myths, and our religions is based on a fundamental confusion of all images and ideas. They are not separated from each other, as it were, with any clarity. 

This “conceptual stew” is present in every aspect of life. The individual human mind is particularly adept at working within this broad confusion, picking and choosing what to believe is true based on internal processes. Groups of individuals, in order to communicate, will add structure and formality to certain portions through discussion and negotiation. But this “social” activity is not always accompanied by strong enforcement by the community.

As Mary Douglas (Natural Symbols, page 62) continues from Durkheim and Mauss, individuals in modern society (and increasingly this encompasses the global community) are presented with many different conceptual milieus during the course of a single day. Within each person, she indicates:

A classification system can be coherently organized for a small part of experience, and for the rest it can leave the discrete items jangling in disorder. Or it can be highly coherent in the ordering it offers for the whole of experience, but the individuals for whom it is available may enjoy access to another competing and different system, equally coherent in itself, from which they feel free to select segments here and there eclectically, not worrying about the overall lack of coherence. Then there will be conflicts, contradictions and uncoordinated areas of classification for these people.

It is my contention that this describes not just a few individuals but the whole of human experience. Nowhere, and especially not in the modern world, except perhaps when alone with oneself, will the individual find a single, coherent, non-contradictory and comprehensive classification of the world. Instead, the individual is faced with dozens or hundreds of partial, conflicting conceptions of the world. Being the adaptable human being her ancestors evolved her to be, however, a healthy person rarely finds this utter muddle to be a problem. The brain is a reasoning engine built especially to handle this confusion. In fact it thrives on it: much of what we call “creative” or “humorous” or “brilliant” derives from this ever-changing juxtaposition and jostling of different, partial conceptions. Human society expands from the breadth and complexity created by these different classification systems. Communication between strangers depends on the human capacity to process and understand commonalities and to fill in the blanks in the signal.

The very thing that defines us as human, our ability to communicate across fuzzy boundaries, is also the thing that creates and exacerbates the Data Integration Problem in our software. Our software “circumscribes with fixed and definite outlines” some small aspect of our experience. In doing so, it denies the fuzziness of our larger reality, and imposes barriers between systems.

The Folk Model – What We Really Build Software From

The anthropological notion of a “folk model” can be a useful paradigm to consider when analyzing the implementation of software applications. Folk models are the proto-scientific conceptualizations that a group of people use to describe, understand and interact with some aspect of their collective experience.

When writing software, especially but not only within the Agile approach, it is through the elicitation and joint “discovery” of the user’s folk model that a common set of requirements for the software is defined. Ultimately, it is the closeness of fit between the folk model and the operation and symbology of the software that will determine its success or failure.

Different groups of people faced with the same or similar problems may develop largely similar folk models, and from these, different software development teams may create largely similar software applications. This is one reason why the software development process works best as a hand-crafted enterprise.

But what at first appear to be minor discrepancies between what the software model presents and what the folk model expects can grow so large that they cause the software to fail for those users. This is especially likely if the folk model was flawed or in a state of flux at the time the software tried to codify it (and really, when is a folk model not in flux?).

