Bridge Contexts: Meaning in the Edgeless Boundary

Previously, I’ve written about the idea of the “edgeless boundary” between semiospheres for someone with knowledge of more than one context. This boundary is “edgeless” because to the person perceiving it, there is little or no obvious boundary.

In software systems, by contrast, especially where different software applications are in use, the boundary between them can be quite stark and apparent. I’ll describe the reasons for this in other postings at a later time. The nutshell explanation is that each software system must be constrained to a well-defined subset of concepts in order to operate consistently. The subset of reality about which a particular application system can capture data (symbols) is limited by design to those regularly observable conditions and events that are of importance to the performance of some business function.

Often (in an ideal scenario), an organization will select only one application to support one set of business functions at a time. A portfolio of applications will thus be constructed through the acquisition/development of different applications for different sets of business functions. As mentioned elsewhere on this site, sometimes an organization will have acquired more than one application of a particular type (see ERP page). 

In any case, information contained in one application oftentimes needs to be replicated into another application within the organization.  When this happens, regardless of the method by which the information is moved from one application to another, a special kind of context must be created/defined in order for the information to flow. This context is called a “bridging context” or simply a “bridge context”.

As described previously, an application system represents a mechanized perception of reality. If we anthropomorphize the application, briefly, we might say that the application forms a semiosphere consisting of the meaning projected onto its syntactic media by the human developers and its current user community, forming symbols (data) which carry the specifically intended meaning of the context.

Two applications, therefore, would present two different semiospheres. The communication of information from one semiosphere to the other occurs when the symbols of one application are deconstructed and transformed into the symbols of the other application, with or without commensurate changes in meaning. This transformation may be effected by human intervention (as through, for example, the interpretation of outputs from one system and the re-coding/data entry into the other), or by automated transformation processes of any type (i.e., other software).
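To make this concrete, here is a minimal sketch, in Python, of an automated transformation of this kind. The two contexts, their schemas, and all field names are invented for illustration only; they are not drawn from any particular system.

```python
# A minimal sketch of an automated bridge between two applications' symbols.
# The schemas and field names are hypothetical, chosen only to illustrate the
# deconstruction and re-construction of symbols described above.

def bridge_customer_record(crm_row: dict) -> dict:
    """Transform a record from a hypothetical CRM context into the
    symbols expected by a hypothetical billing context."""
    # Deconstruct the originating symbol: split a single name field.
    first, _, last = crm_row["full_name"].partition(" ")

    # Re-encode it in the target context's terms, with a (lossy) change in
    # meaning: the CRM's richer status codes collapse to a boolean flag.
    return {
        "given_name": first,
        "family_name": last,
        "is_active": crm_row["status"] in {"ACTIVE", "TRIAL"},
    }

if __name__ == "__main__":
    print(bridge_customer_record({"full_name": "Ada Lovelace", "status": "TRIAL"}))
```

Note that the change in meaning (a many-valued status becoming a two-valued flag) is exactly the kind of commensurate, or not so commensurate, shift the paragraph above describes.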

“Meaning” in a Bridging Context

Bridging Contexts have unique features among the genus of contexts overall. They exist primarily to facilitate the movement of information from one context to another. The meaning contained within any Bridging Context is limited to that of the information passing across the bridge. Some of the concepts and facts of the original contexts will be interpretable (and hence will have meaning) within the bridging context only if they are used or transformed during this flow.  Additional information may exist within the bridge context, but will generally be limited to information required to perform or manage the process of transformation.

Hence, I would consider that knowledge held or communicated by an individual (or system) operating within a bridging context, but which is otherwise unrelated to either of the original contexts or to the process of transference, would exist outside of the bridging context, possibly in a third context. As described previously, the individual may or may not perceive the separation of knowledge in this manner.

Special symbols called “travellers” may flow through untouched by transformation and unrecognized within the bridging context. These symbols represent information important in the origin context which may be returned unmodified to the origin context by additional processes. During the course of their trip across the bridging context(s) and through the target context, travellers typically will have no interpretation, and will simply be passed along in an unmodified syntactic form until returned to their origin, where they can then be interpreted again. By this definition, a traveller is a symbol that flows across a bridge context but which has meaning only in the originating context.
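A small sketch may help illustrate the traveller idea. The field names below are hypothetical; the point is only that the origin’s internal key crosses the bridge as an opaque value and becomes interpretable again only on its return.

```python
# A sketch of a "traveller": a symbol carried across the bridge untouched,
# meaningful only in the originating context. All names are illustrative.

def to_target(order: dict) -> dict:
    """Map an order into the target context, carrying the origin's
    internal key along as an uninterpreted traveller."""
    return {
        "customer": order["customer_id"],
        "amount_cents": order["amount_cents"],
        # Opaque within the bridge and the target context; never transformed.
        "traveller": order["origin_order_key"],
    }

def back_to_origin(ack: dict) -> dict:
    """On the return trip the traveller is interpretable again, letting the
    origin context correlate the acknowledgement with its own order."""
    return {"origin_order_key": ack["traveller"], "accepted": ack["accepted"]}

if __name__ == "__main__":
    outbound = to_target({"customer_id": "C-17", "amount_cents": 4200,
                          "origin_order_key": "ORD-000123"})
    print(back_to_origin({"traveller": outbound["traveller"], "accepted": True}))
```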

Given a path P from context A to context B, the subset of concepts of A that are required to fulfill the information flow over path P is meaningful within the bridging context surrounding P. Likewise, the subset of concepts of B which are evoked or generated by the information flowing through path P is also part of the content of the bridge context. Finally, the path P may generate or use information in the course of events which is part of neither context A nor context B. This information is also contained within the bridge context.
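The same statement can be restated as simple set operations. The concept names below are invented purely for illustration; the point is that the bridge context holds only the used subset of A, the evoked subset of B, and whatever path-private information P itself requires.

```python
# Restating the paragraph above as sets, with invented concept names.
concepts_A = {"customer", "order", "loyalty_tier", "sales_region"}
concepts_B = {"invoice", "payee", "tax_code"}

# Concepts of A actually required to feed the flow over path P.
used_from_A = {"customer", "order"}
# Concepts of B evoked or generated by that flow.
evoked_in_B = {"invoice", "payee"}
# Information the path itself generates or uses, belonging to neither A nor B.
path_private = {"batch_id", "retry_count"}

# The content of the bridge context surrounding P.
bridge_context = (used_from_A & concepts_A) | (evoked_in_B & concepts_B) | path_private
print(bridge_context)
# Concepts such as "loyalty_tier" or "tax_code" remain outside the bridge context.
```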

Bridge contexts may contain more than one path, and paths may transfer meaning in any direction between the bridged contexts. For that matter, it is possible that any particular bridging context may connect more than two other contexts (for example, when an automated system called an “Operational Data Store” is constructed, or when messaging interfaces such as those underlying Service Oriented Architecture (SOA) components are built).

An application system itself can represent a special case of a bridging context. An application system marries the context defined by the data modeller to the context defined by the user interface designer. This is almost a trivial distinction, as the two are generally so closely linked that their divergence should not be considered a sign of separate contexts. In this usage, an application user interface can be thought of as existing in the end user’s context, and the application itself acts to bridge that end user context to the context defining the database.
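As a toy illustration of this last point, the sketch below maps symbols from a hypothetical end user’s context (form labels) into a hypothetical database context (column names). All names are invented; the application’s own mapping table plays the role of the bridge.

```python
# A toy illustration of an application bridging the end user's context
# (form labels) to the database context (column names). Names are invented.

UI_TO_DB = {
    "Preferred Name": "cust_pref_nm",
    "Date of Birth":  "dob_dt",
}

def form_to_row(form: dict) -> dict:
    """Carry the user's symbols over into the data modeller's symbols."""
    return {UI_TO_DB[label]: value for label, value in form.items()}

if __name__ == "__main__":
    print(form_to_row({"Preferred Name": "Sam", "Date of Birth": "1970-01-01"}))
```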


Value Proposition of Metamorphic Modeling

Integration projects are started for the following reasons:

  - the replacement of core application systems, especially such applications as enterprise resource planning (ERP) systems;
  - the creation of enterprise data warehousing or business intelligence capabilities;
  - the establishment of supply chain automation and business-to-business “e-Commerce”;
  - the automation of business processes through workflow;
  - the replacement or introduction of infrastructure, especially of “middleware”;
  - the federation and synchronization of corporate systems due to mergers;
  - the establishment of a “service oriented architecture”; and
  - the introduction of Semantic Web technologies, especially such things as the Resource Description Framework (RDF) and the Web Ontology Language (OWL).

Each of these endeavors has its merits and value to the organization, some more than others at different times in the organization’s lifetime. While each of these projects can share data integration requirements, they also have vast differences in requirements such as throughput, periodicity (the continuum from real-time, instant synchronization to periodic batch update), communication infrastructure, and data storage strategy. Despite the similarities in their data integration requirements, it is these other differences which have led vendors to develop products with highly diverse architectures.

One of the more interesting consequences of the diversity of integration products and architectures is that, oftentimes, the vendors seem unaware of the similarities between their products, especially if they are being marketed to different types of projects. The same can often be said of their customers. It is often the case, therefore, that each new integration project is approached as completely separate and unrelated to other existing or planned integration efforts within the organization. Often this can mean that the organization assembles different teams of people to staff the different projects. Staffing such projects often focuses on the team’s familiarity with the chosen technologies or the languages used to invoke the integration tool, instead of their ability to understand and think through how the data ought to be integrated for the highest value. This in turn leads to each project inventing its own methods for documenting (or, often, not documenting) the data integration.

Very large organizations may, over time, find themselves implementing examples of each of the types of projects listed above. If they don’t take a broad perspective to each problem, they are likely to find themselves with investments in several, incompatible data integration solutions, with little or no way of reusing the knowledge of data equivalences that are embedded in each one.

With this system-architectural and market environment in mind, the Metamorphic Modeling Methodology has been defined. Metamorphic Modeling provides a language for describing and capturing the integration design details of all of an organization’s data integration efforts. It presents a standard, reusable way for an organization to produce high quality designs for data integration of any type and for any purpose. Using it, the organization will find the following capabilities open to it:

  - A “design for integration” can be codified and standardized, and a body of reusable work products can be developed with applicability across the full spectrum of data integration projects, resulting in less redundancy of effort and more consistency of results across the organization.
  - Data integration skills can be moved from one platform to another – the portability of method across different problem types.
  - A cadre of practitioners can be trained in the methodology, establishing a team which can produce reliable, consistent results repeatedly, for any data integration project the organization is faced with.
  - High-quality, consistent integration designs are more easily managed, and their implementation is more readily measured, tested and verified.
  - High-value business knowledge can be retained by the organization while projects are freed to locate the best quality and most cost-effective development team they can for the chosen architecture, even and especially when the organization decides to outsource the actual development effort (perhaps even offshore).
  - Established, pre-existing data integrations can be reverse-engineered into Metamorphic Modeling conventions, perhaps even automatically, where they can then be used as specifications for re-implementation in a different technology or tool.

The Common Features of Data Integration Tools

The tools available in the marketplace for data integration are diverse. To say that there is a standard set of required features for data integration tools would be a bit of a stretch. There is little, at the present moment, in the way of recognition that there are common features and problems in the data integration space. This is due to the fact that companies are not buying products for their ability to unify and integrate their data alone, but rather to solve some other class of problem.

On the other hand, there is a lot of commonality, both in functionality and in presentation or user interaction, among tools in very different tool categories. A certain core set of features appears again and again, and a common graphical depiction has also become nearly ubiquitous among the products.

This stereotypical user interface consists of one or more boxes, each with a list of data element names stacked vertically, and the ability to connect individual columns in one box to individual columns in a second box by drawing lines between the boxes. Some of the common features of data integration tools include: a data dictionary for the schemas of the company’s applications, automated or semi-automated processes for capturing the basic schema information about these applications, and some way of linking or tying data elements from one schema to another.
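Expressed as plain data structures, and independent of any particular vendor’s product, those core features might look something like the sketch below. The schema and element names are hypothetical.

```python
# A sketch of the core features named above: a data dictionary of application
# schemas, plus element-to-element links. All names are hypothetical.

data_dictionary = {
    "crm.customer":  ["cust_id", "full_name", "status"],
    "billing.payee": ["payee_id", "given_name", "family_name", "is_active"],
}

# The "lines drawn between boxes" in the stereotypical UI, as plain data.
element_links = [
    ("crm.customer.full_name", "billing.payee.given_name"),
    ("crm.customer.full_name", "billing.payee.family_name"),
    ("crm.customer.status",    "billing.payee.is_active"),
]

if __name__ == "__main__":
    for source, target in element_links:
        print(source, "->", target)
```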

Many products tout their inherent architecture as a major benefit, namely that their product presents some sort of semantic “centralized hub and spoke” model. Key features of this architecture, in addition to the typical features described above, are a language or representation for building a common, unified data (or information) model (e.g., Common Information Model) spanning the data structures of the application systems of the corporation, a technique and notation for relating the application data structures to this unified model, and the nearly universal marketing pitch touting how the centralization reduces and removes the redundancies and inefficiencies inherent in any alternate design not using their centralized hub approach.
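The sketch below states the hub-and-spoke claim in the simplest possible terms: each (hypothetical) application schema is related once to a common model rather than to every other schema, so the number of mappings grows linearly rather than quadratically with the number of applications. The model and field names are illustrative only.

```python
# A sketch of the centralized hub-and-spoke idea: each application schema is
# mapped once to a common model rather than to every other schema.

common_model = {"Party": ["id", "name", "active_flag"]}

spoke_mappings = {
    "crm.customer":  {"cust_id": "Party.id", "full_name": "Party.name",
                      "status": "Party.active_flag"},
    "billing.payee": {"payee_id": "Party.id", "given_name": "Party.name",
                      "is_active": "Party.active_flag"},
}

# With N applications, the hub needs N mappings to the common model, versus
# up to N * (N - 1) directed point-to-point mappings without it.
n = 5
print(n, "hub mappings vs up to", n * (n - 1), "point-to-point mappings")
```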

The Real Reason Systems Fail

It seems as if every week there’s another news article bemoaning the state of data integration within some large enterprise. Mission objectives are stymied because “systems don’t talk to each other”. Intelligence failures are due to “incompatible data”. The surprise and outrage expressed would make the lay reader think that this is a recent trend, but they’d be wrong in this notion. The “Data Integration Problem” has been around since the first humans began to speak. Most practitioners and experts who work in the software version of the problem space haven’t realized this, but it’s true.

What is “data” in the modern sense? Most people think that it is “information”, the detailed “facts” of a modern culture. This colloquial understanding is a major simplification, one which is at the root of the Data Integration Problem. It is the reason why most people, even seasoned experts, are constantly surprised and frustrated when the monster appears, seemingly out of nowhere, before them.

So what is data?
Data is CODE.
Data is REPRESENTATION.
Data is SYMBOLOGY.

What this implies is that without someone who can decode it – an INTERPRETER – data is nothing. Let that sink in for a minute.

Data is nothing without INTERPRETATION.

What does this mean? Well for one thing it means that without an interpretation, there is no way to even recognize that data exists. And without an interpreter, there can be no interpretation.

So when we think about all of the data being generated and passed around in our modern world, the question arises: Who is the interpreter that gives data its meaning? Well obviously it’s us. The computer doesn’t understand the data it contains! No matter how we might try to anthropomorphize them, computers are still just as dumb as the lumps of metal, plastic and sand from which they are constructed. The systems that we humans create using these computers are just that – mechanical systems which manipulate physical media, morphing symbols from one representation to another. Everything a computer does is devoid of intrinsic meaning until some human comes along and interprets the symbols.

Imagine computer systems after an apocalypse. Imagine the systems of the stock exchange, or the weather service, or any of the thousands of other automated systems that may run unattended by their now defunct human inventors. Now answer the question: without humans to interact with them, do they produce anything? Is there any content to them without, ultimately, some human being to interpret their output?

This is more than that old saw about a tree falling in the forest. Consider some famous examples of symbols that have lost their meaning:

  1. The cave paintings of Lascaux depict hunts and animals, but what did our prehistoric cousins intend when they outlined their hands on the walls?
  2. Until the Rosetta Stone was found, Egyptian hieroglyphics had lost their meaning in the world.
  3. When the Confederacy fell, Confederate currency lost its meaning and value.

The Data Integration Problem, simply stated, is caused by the fact that data is symbolic code onto which some group of humans has projected meaning. Meaning, therefore, is local to the humans doing the projection. Without the knowledge of how meaning was projected onto the symbols, the information they contain cannot be retrieved in any complete sense.

Only the people who have projected their meaning onto the symbols (or, in special instances, those who share significant experiences of the world with the people who have) are able to interpret the data correctly. In any large, complex enterprise, where business necessitates that small groups complete their own missions expeditiously and with vigor, who should be surprised that locally defined data doesn’t integrate well from one end of the enterprise to the other?

Really, the answer to this question should be “nobody”.

This has been true since humans (and possibly our predecessors) first started making symbols.
