Meaning Over Transformation

This entry is probably ahead of the story, but I wanted to start moving into this subject and I’m not yet organized. It should make more sense later on when I’ve explained such things as the “magical” function M() more thoroughly.

Review: The Magical Function “M()”

As a review for those who may not have seen this function previously on this site: I have invented a mysterious and powerful function over all things used as signs by humans. Named the “M()” function, it can be applied to any symbol or set of symbols of any type, and it will return what that symbol represents. I call it the “M()” function because it takes something which is a symbol and returns its meaning (all of its meaning).

How Meaning Carries Over Symbol Transformations

When we move information from one data structure to another, we may or may not use a reversible process. By this I mean that sometimes a transformation is a one-way operation because some of the meaning is lost in the transformation. Sometimes this loss is trivial, but sometimes it is crucial. (Alternatively, there can be transformations which actually add meaning through deductive reasoning and projection. SFAT (story for another time))

Whether a transformation loses information or not, there are some interesting conclusions we can illustrate using my magical, mysterious function M(). Imagine a set β of data structure instances (data) in an anchor state. The full meaning of that data can be expressed as M(β). Now imagine a transformation operation T which maps all of the data in β onto a second set of data Δ.

T : β |–> Δ such that for each symbol σ in β, there is a corresponding symbol δ in Δ that represents the same things, and σ <> δ

By definition, since we have defined T to be an identity function over the meaning of β, then we can conclude that if we apply M() before and after the transformation, we will find ourselves with an equivalence of meaning, as follows:

By definition: T(β) = Δ

Hence: M( T(β) ) ≡ M( Δ )

Also, by definition of T(), then M( β )  ≡ M( T(β) )

Finally, we conclude: M( β ) ≡ M( Δ )

Now, obviously this is a trivial example concocted to show the basic idea of M(). Through the manner by which we have defined our scenario, we get an obvious conclusion. There are many instances where our transformation functions will not produce equivalent sets of symbols. When T() does produce an equivalence, we call it a “lossless” transformation (borrowing a term from information theory) because no information is lost through its operation.

Another relationship that can be defined in this manner is semantic equivalence. This should be obvious on reflection, as I was careful above to refer to “equivalence of meaning”, which is really what I mean when I say two things are semantically equivalent. In this situation, we defined T() as an operation over symbols such that one set of symbols was replaced with a different set of symbols, and the individual pairs of symbols were NOT THE SAME (σ <> δ)! In a most practical sense, what is happening is that we are exchanging one kind of data structure (or sign) for another, such that the two symbols are not syntactically equivalent (they have different signs) but they remain semantically equivalent. (You can see some of my thoughts on semantic and syntactic equivalence by searching entries tagged and/or categorized “equivalence” and “comparability”.)

A quick example might be a data structure holding a person’s name. Let’s say that within β the name is stored as a string of characters in signature order (first name, middle name, last name), such as “John Everett Doe”. This symbol refers to a person by that name, and so if we apply M() to it, we would recognize the meaning of the symbol to be the thought of that person in our head. Now by applying T() to this symbol, we convert it to a symbol in Δ, also constructed from a string data structure, but this time the name components are listed in phone directory order (last name, first name middle name), such as “Doe, John Everett”. Clearly, while the syntactic presentation of the transformed symbol is completely different, the meaning is exactly the same.

T(“John Everett Doe”) = “Doe, John Everett”

M( T(“John Everett Doe”) ) ≡ M( “Doe, John Everett” )

M( “John Everett Doe” ) ≡ M( T(“John Everett Doe”) )

M( “John Everett Doe” ) ≡ M( “Doe, John Everett” )

“John Everett Doe” <> “Doe, John Everett”
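The relationships above can be sketched in code. M() itself is of course not computable; as a stand-in for it, the sketch below parses each sign back to the underlying (first, middle, last) name parts, so that two signs count as semantically equivalent when they parse to the same referent. The function names are mine, invented for illustration:

```python
# A sketch of the lossless transformation T over name symbols, with its
# inverse. M() is not actually computable; here we approximate it by
# parsing each sign back to the (first, middle, last) parts that pick
# out the referent.

def T(sign: str) -> str:
    """Signature order -> phone directory order."""
    first, middle, last = sign.split()
    return f"{last}, {first} {middle}"

def T_inverse(sign: str) -> str:
    """Phone directory order -> signature order."""
    last, rest = sign.split(", ")
    return f"{rest} {last}"

def M(sign: str) -> tuple:
    """Stand-in for meaning: the name parts identifying the referent."""
    if ", " in sign:                      # phone directory order
        last, rest = sign.split(", ")
        first, middle = rest.split()
    else:                                 # signature order
        first, middle, last = sign.split()
    return (first, middle, last)

sign = "John Everett Doe"
assert T(sign) == "Doe, John Everett"   # syntactically different signs...
assert sign != T(sign)
assert M(sign) == M(T(sign))            # ...but semantically equivalent
assert T_inverse(T(sign)) == sign       # and T is reversible
```

Note that the inverse exists precisely because no name component was discarded on the way into Δ.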

When the transformation is lossless, there is a good chance that it is also reversible, i.e., that an inverse transformation T′() can be created. As an inverse transformation, we would expect that T′() will convert symbols in Δ back into symbols in β, and that it will also carry the meaning with complete fidelity back onto the symbols of β. Hence, given this expectation, we can make the following statements about T′():

T′(Δ) = β

M( T′(Δ) ) ≡ M( β )

By definition of T′(), then M( Δ ) ≡ M( T′(Δ) )

And again: M( Δ ) ≡ M( β )

Extending our example a moment, if we apply T′() to our symbol “Doe, John Everett”, we will get back our original symbol “John Everett Doe”.

Meaning Over “Lossy” Transformation

So what happens when our transformation is not lossless over meaning? Let’s imagine another transformation which transforms all of the symbols σ in β into symbols ε in Ε. Again, we’ll say that σ <> ε, but we’ll also define T″() as “lossy over meaning” – which just indicates that as the symbols are transformed, some of the meaning of the original symbol is lost in translation. In our evolving notation, this would be stated as follows:

T″(β) = Ε

M( T″(β) ) ≡ M( Ε )

However, by the definition of T″(), then M( β ) ≢ M( T″(β) )

Therefore: M( β ) ≢ M( Ε )

In this case, while every symbol in β generates a symbol in Ε, the total information content of Ε is less than that of β. Hence, the symbols of the two sets are no longer semantically equivalent. With transformations such as this, it becomes less likely that an inverse transformation could restore β from Ε. Logically, it would seem there could be no circumstances where β could be reconstituted from Ε alone, since otherwise the information would have been carried completely across the transformation. I don’t outright make this conclusion, however, since it depends on the nature of the information lost.

An example of a reversible, lossy transformation would be the substitution of a primary key value for an entire row of other data: the key does not itself carry all of the information for which it is a key, but it can be used, index fashion, to recall the full set of data. For example, if we created a key value symbol consisting of a person’s social security number and last name, we could use that as a reference to that person. This reference symbol could be passed as a marker to another context (from β to Ε, say) where it could be interpreted only partially as a reference to a person. But which person, and what other attributes are known about that person in the new context Ε, if we define the transformation in such a way that all of the symbols for these other attributes stay in β? Not much, making this a transformation where information is “lost” to Ε. However, due to its construction from β, the key symbol can still be used in the inverse transformation back to β to reconstitute the missing information (presuming β retains it).
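A minimal sketch of this key-based pattern, with invented field names: only the key symbol crosses into the new context, but because β retains an index keyed on it, the inverse transformation can reconstitute the full record:

```python
# Sketch of a "reversible lossy" transformation: only a key symbol
# (SSN + last name) crosses from context beta into context E, but beta
# retains an index, so an inverse transformation can reconstitute the
# full record. All names and fields here are illustrative only.

beta = {
    ("123-45-6789", "Doe"): {"first": "John", "middle": "Everett",
                             "last": "Doe", "city": "Springfield"},
}

def T_lossy(record_key):
    """Carry only the key symbol into context E; other meaning stays behind."""
    return record_key

def T_lossy_inverse(key_in_E):
    """Reconstitute the full record, presuming beta still retains it."""
    return beta[key_in_E]

key = T_lossy(("123-45-6789", "Doe"))
# Within E, the key alone says little about the person...
# ...but the inverse transformation back to beta recovers everything:
assert T_lossy_inverse(key)["first"] == "John"
```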

An example of a one-way transformation might be one that drops the middle and last name components from a string containing a name. Hence, T″( “John Everett Doe” ) might be defined to result in a new symbol, “John”. Since many other symbols could map to the same target, creating an inverse transformation without using other information becomes impossible.
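In code, the many-to-one collapse is what makes the inverse impossible:

```python
# A one-way (non-invertible) transformation: drop all but the first name.
# Many distinct source symbols collapse onto the same target symbol, so
# no inverse can recover the original without outside information.

def T_one_way(full_name: str) -> str:
    return full_name.split()[0]

assert T_one_way("John Everett Doe") == "John"
assert T_one_way("John Quincy Adams") == "John"  # collision: no inverse exists
```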


Example Interaction Between Parent and Child Context

In a previous post, I described in general some of the relationships that could exist between and across a large organization’s sub-contexts. What follows is a short description of some actual observations of how the need for regional autonomy in the examination and collection of taxes affected the use of software data structures at the IRS.

Effect of Context on Systems and Integration Projects

July 15, 2005

Contexts lay claim to individual elements of a syntactic medium. A data structure (syntactic medium) used in more than one context by definition must contain meaningful symbols for each context. Some substructures of the data structure may be purposefully “reserved” for local definition by child contexts. In the larger, shared context, these data structures may have no meaning (see the idea of “traveller” symbols). When used by a child context, the meaning may be idiosyncratic and opaque to the broader context.

One way this might occur is through the agreement across different organizational groups that a certain structure be set aside for such uses. Two examples would include the automated systems at the IRS used respectively for tax examinations and tax collections.

Within the broad context defined by the practitioners of “Tax Examination” which the examination application supports, several child contexts have been purposefully developed corresponding to “regions” of the country. Similar organizational structures have also been defined for “Tax Collection” which the collection application supports. In both systems, portions of the syntactic media have been set aside with the express purpose of allowing the regional contexts to project additional, local meaning into the systems.

While all regions are contained in the larger “Examination” or “Collection” contexts, it was recognized that the sheer size of the respective activities was too great for the IRS central offices to be able to control and react to events on the ground in sufficient time. Hence, recognizing that the smaller regional authorities were in better position to diagnose and adjust their practices, the central authorities each ceded some control. What this allowed was that the regional centers could define customized codes to help them track these local issues, and that each application system would capture and store these local codes without disrupting the overall corporate effort.

Relying on the context defined and controlled by the central authorities would not be practical, and could even stifle innovation in the field. This led directly to the evolution of regional contexts. 

Even though each region shares the same application, and 80 to 90% – even 95% – of the time uses it in the same way, each region was permitted to set some of its own business rules. In support of these regional differences in practice, portions of the syntactic medium presented by each of the applications were defined as reserved for use by each region. Often this type of approach would be limited to classification elements or other informational symbols, as opposed to functional markers that would affect the operation of the application.

This strategy permits the activities across the regions to be rolled up into the larger context nearly seamlessly. If each region had been permitted to modify the functionality of the system, the ability to integrate would be quickly eroded, causing the regions to diverge and the regional contexts to share less and less with time. Eventually, such divergence could lead to the need for new bridging contexts, or in the worst case into the collapse of the unified activity of the broader context.

By permitting some regional variation in the meaning and usage of portions of the application systems, the IRS actually strengthened the overall viability of these applications, and mitigated the risk of cultural (and application system) divergence.

Is MDM An Attempt to Reach “Consensus Gentium”?

Consensus gentium

An ancient criterion of truth, the consensus gentium (Latin for agreement of the peoples), states “that which is universal among men carries the weight of truth” (Ferm, 64). A number of consensus theories of truth are based on variations of this principle. In some criteria the notion of universal consent is taken strictly, while others qualify the terms of consensus in various ways. There are versions of consensus theory in which the specific population weighing in on a given question, the proportion of the population required for consent, and the period of time needed to declare consensus vary from the classical norm.*

* “Consensus theory of truth”, Wikipedia entry, November 8, 2008.

The Data Thesaurus

October 25, 2005

So many of IT’s best practices are taken for granted that no one ever asks if there might be a better way. An example of this is in the area of data standards, enterprise data modeling, and Master Data Management (MDM). The core idea of these initiatives is to try to create a single data dictionary in which every concept important to the enterprise is recorded once, with a single standardized name and definition.

The ideal promoted by this approach is that everyone who works with data in the organization will be much more productive if they all follow one naming convention, and if every data item is documented only once. Sounds logical and practical, and yet when we look around for examples of organizations that have managed to successfully create such a document, complete it for ALL of their systems (even commercial software applications), and then keep it maintained and complete for more than a year or two, we find very few. In fact, in my experience, which has included a number of valiant efforts, I have found no examples.

When one digs into the anecdotal reasons why such success seems so rare, some mixture of the following statements are often heard:

  1. The company lost its will, the sponsor left and so they cut the budget.
  2. It took too long, the business has redirected the staff to focus on “tactical” efforts with short return on investment cycles.
  3. Even after all that work, no one used it, so it was not maintained.
  4. We were fine until the merger, then we just haven’t been able to keep up with the document and the systems integration/consolidation activities.
  5. Our division and their division just never agreed.
  6. We got our part done, but that other group wouldn’t talk to us.

With ultimate failure at the enterprise level the more common experience, it’s surprising that no one involved in the performance and practice of data standardization has questioned what might really be going on. Lots of enterprises have had successes within smaller efforts. Major lines of business may successfully establish their own data dictionaries for specific projects. Yet very few, if any, have succeeded in translating these tactical successes into truly enterprise-altering programs.

What’s going on here is that the search for the “consensus gentium” as the Romans called it, the universal agreement on the facts and nature of the world by a group of individuals, is a never-ending effort. Staying abreast of the changes in the world that affect this consensus is increasingly impossible, if it ever was possible.

The point here is that IT and the enterprise need to stop trying to create a single universal dictionary. It must be recognized that such a comprehensive endeavor is an impossible task for all but the most extravagantly financed IT organizations. It can’t be done because the different contexts of the enterprise are constantly morphing and changing. Keeping abreast of changes costs a tremendous amount in time, effort, and dollars. Proving an appropriate return on investment for such an ongoing endeavor is problematic, and suffers from the problem of diminishing returns.

 A better approach must be out there. One that takes advantage of the tactical point solutions that most enterprises seem to succeed with, while taking into account the practical limitations imposed by the constant press of change that occurs in any “living” enterprise. This blog attempts to document first-principles affecting the entire endeavor, and will build a case based on the human factors which create the problem in the first place.

 A better approach?

Why not build data dictionaries for individual systems or even small groups (as is often the full extent attempted and completed in most organizations)? But instead of trying to extend these point solutions into a universal solution, take a different approach: the creation of a “data thesaurus” in which portions of each context are related to each other as synonyms, but only as needed for some particular solution. This thesaurus would track the movement of information through the organization by mapping semantics through and across changes in the “syntactics” of the data carrying this information. The thesaurus would need to track the context of a definition, and that definition would be less abstract and more detailed than those created by the current state of the practice. Links across contexts within the organization would be filled in only as practicality required, as the by-product of data integration projects or system consolidation efforts.
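A minimal sketch of the thesaurus idea, with invented context names and terms: each context keeps its own dictionary, and synonym links are recorded sparsely, one per bridge that an actual integration project required:

```python
# Minimal sketch of a "data thesaurus": each context keeps its own
# dictionary, and synonym links are added only where an integration
# project actually crossed between two contexts. Context names, terms,
# and the bridge label are invented for illustration.

dictionaries = {
    "claims":  {"CLMT_NM": "Name of the person filing the claim"},
    "billing": {"CUST_NAME": "Customer name as printed on invoices"},
}

# Each link records the two (context, term) ends and the bridge needing it.
thesaurus = [
    (("claims", "CLMT_NM"), ("billing", "CUST_NAME"), "claims-to-billing feed"),
]

def synonyms(context: str, term: str):
    """Find terms in other contexts linked as synonyms of this one."""
    out = []
    for a, b, bridge in thesaurus:
        if a == (context, term):
            out.append(b)
        elif b == (context, term):
            out.append(a)
    return out

assert synonyms("claims", "CLMT_NM") == [("billing", "CUST_NAME")]
```

The point of the structure is what it leaves out: no entry is forced to exist until a real integration effort pays for it.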

 What’s wrong with the data dictionary of today:

  1. obtuse naming conventions (including local standards and ISO)
  2. abstract data structures that have lost connection with actual data structures
  3. only one name for a concept, when different contexts may have their own colloquialisms – making it hard for practitioners to find “their data”, and even causing the introduction of additional entries for synonyms and aliases as if they were separate things
  4. abstracted or generalized definitions reflecting the “least common denominator” and losing the specificity and nuance present in the original contexts
  5. loss of variations and special cases
  6. detachment from modern software development practices like Agile, XP and even SOA

A Parable for Enterprise Data Standardization (as practiced today)

The enterprise data standard goal of choosing “one term for one concept with one definition” would be the same thing as if the United Nations convened an international standards body whose charter would be to review all human languages and then select the “one best” term for every unique concept in the world. Selection, of course, would be fairly determined to ensure that the term that “best captures” the concept, no matter what the original language was in which the idea was first expressed, would be the term selected. Beyond the absurd nature of such a task, consider its practical impossibility.

First, getting sufficient representation of the world’s languages to make the process fair would require a lot of time. Once started, think of the months of argument, the years and decades that would pass before a useful body of terms would be established and agreed upon. Consider also that while these eggheads were deliberating, life around them would continue. How many new words or concepts would be coined in every language before the first missives would come out of this body? Once an initial (partial) standard was chosen, then the proselytizing would begin. Consider the difficult task of convincing the entire world to stop using the terms of their own language. How would the sale be made? Appealing to some future when “everyone will speak the same language” thus eliminating all barriers to communication most likely. As a person in this environment, how do you learn all of those terms – and remember to use them?

The absurdity of this scenario is fairly clear. Then why do so many data standardization efforts approach their very similar problem in the same way? The example above may be extreme, and some will say that I’ve exaggerated the issue, but that’s just the point I’m trying to make. When one talks with the practitioners of data standardization efforts, they almost always believe that the end goal they are striving for is nothing less than the complete standardization of the enterprise. They may realize intellectually that the job may never be finished, but they still believe that the approach is sound, and that if they can just stay at it long enough, they’ll eventually attain the return on investment and justify their long effort.

If the notion of the UN attempting a global standardization effort seems absurd, then why is the best practice of data standardization the very same approach? If we create a continuum for the application of this approach (see figure) starting at the very smallest project (perhaps the definition of the data supporting a small application system used by a subset of a larger enterprise), and ending at this global UN standardization effort, one has to wonder: where along this scale does the practical success of the small effort turn into the absurd impossibility of the global effort? If we choose a point on this continuum and say “here and no further”, then no doubt arguments will ensue. Probably, there will be individuals who find the parable above not ridiculous at all. Likewise, there will be others who believe that trying any standardization is a waste of time. Others might try to rationally put an end point on the chart at the point representing their current employer. These folks will find, however, that their current employer merges with another enterprise in a few months, which then raises the question: is the point of absurdity further out now, at the edges of the combined organization?

Where Is The Threshold of Absurdity in Data Standardization?


Myself, I believe in being practical, as much as possible. The point of absurdity for me is reached whenever the standardization effort becomes divorced from other initiatives of the enterprise and becomes its own goal. When the data standardization focuses on the particular problem at hand, then the return on the effort can be justified. When data standardization is performed for its own sake, no matter how noble or worthy the sentiment expressed behind the effort, then it is eventually going to overextend its reach and fail.

If we all agree that at SOME point on the continuum, attempting data standardization is an absurd endeavor, then we must recognize that there is a limit to the approach of trying to define data standards. The smaller the context, the more the likelihood of success, and the more utility of the standard to that context. Once we have agreed to this premise, the next question that should leap to mind is: Why don’t our data dictionaries, tools, methods, and best practices record the context within which they are defined? Since we agree we must work within some bounds or face an absurdly huge task, why isn’t it clear from our data dictionaries that they are meaningful only within a specific context?

The XML thought leaders have recognized the importance of context, and while I don’t believe their solution will ultimately solve the problems presented by the common multi-context environments we find ourselves working in, it is at least an attempt. This construct is the “namespace” used to unambiguously tie an XML tag to a validating schema.

Data standards proponents, and many data modelers, have not recognized the importance and inevitability of context to their work. They come from a background where all data must be rationalized into a single, comprehensive model, resulting in the loss of variation, idiosyncrasy and colloquialism from their environments. These last simply become the “burden of legacy systems” which are anathema to the state of the practice.

Unmanage Master Data Management

Master Data Management is a discipline which tries to create, maintain and manage a single, standardized conceptual information model of all of an enterprise’s data structures. It takes as its goal that all IT systems will eventually be unified under a single semantic description so that information from all corners of the business can be understood and managed as a whole.

In my opinion, while I agree with the ultimate goal of information interoperability across the enterprise, I disagree with the approach usually taken to get there. A strategy that I might call:

  • Data Management with Multiple Masters
  • Uncontrolled/Unmanaged Master Data Management
  • Associative Search on an Uncontrolled Vocabulary
  • Emergent Data Management (added 2015)
  • Master-less Data Management (added 2015)

takes a different approach. The basic strategy is to permit multiple vocabularies to exist in the enterprise (one for each major context that can be identified). Then we build a cross reference of the semantics only describing the edges between these contexts (the “bridging” contexts between organizations within the enterprise), where interfaces exist. The interfaces that would be described and captured in this way would include non-automated ones (e.g., human mediated interfaces) as well as the traditionally documented software interfaces.

Instead of requiring that the entire content of each context be documented and standardized, this approach would provide the touchpoints between contexts only. New software (or business) integration tasks which the enterprise takes on would require new interfaces and new extensions of mappings, but would only have to cover the content of the new bridging context.

Information collected and maintained under this strategy would include the categorization of data element structures as follows:

  1. Data structure syntax and basic manipulations
  2. Origin Context and element Role (for example, markers versus non-markers)
  3. Storage types: transient (not stored), temporary (e.g., staging schemas and work tables), permanent (e.g., structures which are intended to provide the longest storage)
  4. “Pass-through” versus “consumed” data elements. Also called “traveller” and “fodder”, these data structures and elements have no meaning and possibly no existence (respectively) in the Target Context.

For data symbols that are just “passing through” one context to another, these would be the traveller symbols (as discussed on one of my permanent pages and in the glossary) whose structure is simply moved unchanged from one context to the next, until it reaches a context which recognizes and uses them. “Fodder” symbols are used to trigger some logic or filter to change the operation of the bridging context software, but once consumed, do not move beyond the bridge.
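A small sketch of a bridging context handling these two element roles, with invented field names and an invented routing rule: travellers pass through unchanged for a downstream context to interpret, while fodder is consumed to steer the bridge’s own logic and goes no further:

```python
# Sketch of a bridging context handling "traveller" and "fodder" elements.
# Travellers pass through unchanged; fodder is consumed by the bridge
# itself. Field names and the routing rule are invented for illustration.

TRAVELLERS = {"regional_code"}   # opaque here, meaningful downstream
FODDER = {"priority_flag"}       # consumed by the bridge, goes no further

def bridge(record: dict) -> dict:
    out = {}
    for field, value in record.items():
        if field in FODDER:
            # Fodder: trigger bridge behavior, then drop the element.
            if value == "urgent":
                out["queue"] = "expedite"
            continue
        # Travellers (and ordinary recognized fields) are carried across.
        out[field] = value
    return out

result = bridge({"case_id": 42, "regional_code": "SW-07",
                 "priority_flag": "urgent"})
assert result == {"case_id": 42, "regional_code": "SW-07",
                  "queue": "expedite"}
```

Note that the bridge never interprets the traveller’s value; it merely preserves its structure until a context that recognizes it takes over.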

The problem that I have encountered with MDM efforts is that they don’t try to scope themselves to what is RECOGNIZABLY REQUIRED. Instead, the focus is on the much larger, much riskier effort of attempting to eliminate local contexts within the enterprise. MDM breaks down the moment it becomes divorced from a practical, immediate attempt to capture just what is needed today. The moment it attempts to “bank” standard symbols ahead of their usage, the MDM process becomes speculative and prescriptive. The likelihood of wasting time on symbology which ultimately proves wrong and unused is very high, once steps are taken past the interface and into the larger contexts.

Uses of Metamorphic Models in Data Management and Governance

In the Master Data Management arena, Metamorphic Models would allow the capture of the data elements necessary to stitch together an enterprise. By recognizing the information needed to pass as markers or to act as travellers, the scope of the data governance task should be reducible to a practical minimum.

Then the data governance problem can be built up only as needed. The task becomes, properly, just another project-related activity similar to Change Control and Risk Management, instead of the academic exercise into which it often devolves.

The scope of data management should focus on and document 100% of the data being moved across interfaces, whether these interfaces are automated or human-performed. Simple data can just be documented, and the equivalence of syntax and semantics captured. Data elements that act as markers for the processes should be recorded. Also all data elements/structures intended merely to make the trip as travellers should be indicated.

This approach addresses the high-value portion of the enterprise’s data structures, while minimizing work on documenting concepts which only apply within a particular context.

Functions On Symbols

Data integration is a complex problem with many facets. From a semiotic point of view, quite a lot of human cognitive and communicative processing capability is involved in its resolution. This post is entering the discussion at a point where a number of necessary terms and concepts have not yet been described on this site. Stay tuned, as I will begin to flesh out these related ideas.

You may also find one of my permanent pages on functions to be helpful.

A Symbol Is Constructed

Recall that we are building tautologies showing equivalence of symbols. Recall that symbols are made up of both signs and concepts.

If we consider a symbol as an OBJECT, we can diagram it using a Unified Modeling Language (UML) notation. Here is a UML Class diagram of the “Symbol” class.

UML Diagram of the "Symbol" Object


The figure above depicts how a symbol is constructed from both a set of “signs” and a set of “concepts”. The sign is the arrangement of physical properties and/or objects following an “encoding paradigm” defined by the members of a context. The “concept” is really the meaning which that same set of people (context) has projected onto the symbol. When meaning is projected onto a physical sign, a symbol is constructed.

Functions Impact Both Structure and Meaning

Symbols within running software are constructed from physical arrangements of electronic components and the electrical and magnetic (and optical) properties of physical matter at various locations (this will be explained in more depth later). The particular arrangement and convention of construction of the sign portion of the symbol defines the syntactic media of the symbol.

Within a context, especially within the software used by that context, the same concept may be projected onto many different symbols of different physical media. To understand what happens, let’s follow an example. Let’s begin with a computer user who wants to create a symbol within a particular piece of software.

Using a mechanical device, the human user selects a button representing the desired symbol and presses it. This event is recognized by the device, which generates the new instance of the symbol using its own syntactic medium, which is the pulse of current on a closed electrical circuit on a particular wire. When the symbol is placed in long term storage, it may appear as a particular arrangement of microscopic magnetic fields of various polarities in a particular location on a semi-metallic substrate. When the symbol is in the computer’s memory, it may appear as a set of voltages on various microscopic wires. Finally, when the symbol is projected onto the computer monitor for human presentation, it forms a pattern of phosphorescence against a contrasting background allowing the user to perceive it visually.

Note through all of the last paragraph, I did not mention anything about what the symbol means! The question arises, in this sequence of events, how does the meaning of the symbol get carried from the human, through all of the various physical representations within the computer, and then back out to the human again?

First of all, let’s be clear, that at any particular moment, the symbol that the human user wanted to create through his actions actually becomes several symbols – one symbol for each different syntactic representation (syntactic media) required for it to exist in each of the environments described. Some of these symbols have very short lives, while others have longer lives.

So the meaning projected onto the computer’s keyboard by the human:

  • becomes a symbol in the keyboard,
  • is then transformed into a different symbol in the running hardware and operating system,
  • is transformed into a symbol for storage on the computer’s hard drive, and
  • is also transformed into an image which the human perceives as the shape of the symbol he selected on the keyboard.
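The chain above can be sketched in code. The following is a hedged illustration, not the author's formalism: the media names, sign values, and the dictionary representation of a symbol are all invented for the example. The point is that each step creates a new symbol in a new syntactic medium, while M() returns the same meaning for every one of them.

```python
def M(symbol):
    """The 'magical' meaning function: here, simply a lookup of what
    a given sign denotes. (A toy stand-in for the real M().)"""
    return symbol["meaning"]

def re_encode(symbol, medium, sign):
    """Create a NEW symbol in a different syntactic medium; the old
    symbol is discarded, but the projected meaning is carried over."""
    return {"medium": medium, "sign": sign, "meaning": symbol["meaning"]}

# The user presses the 'A' key: the first symbol, in the keyboard's medium.
s1 = {"medium": "keypress", "sign": "scan_code_0x1C", "meaning": "letter A"}
s2 = re_encode(s1, "memory", "voltages 01000001")   # symbol in RAM
s3 = re_encode(s2, "disk", "magnetic 01000001")     # symbol in storage
s4 = re_encode(s3, "screen", "glyph 'A'")           # symbol on the monitor

# Four different symbols in four different syntactic media,
# yet M() returns the same meaning for all of them.
assert M(s1) == M(s2) == M(s3) == M(s4) == "letter A"
```

Each `re_encode` call here is one of the "identity" functions discussed below: the sign changes completely at every step, but the meaning passes through unaltered.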

But the symbol is not actually “transforming” in the computer, at least in the conventional notion of a thing changing morphology. Instead, the primary operation of the computer is to create a series of new symbols in each of the required syntactic media described, and to discard each of the old symbols in turn.

It does this trick by applying various “functions” to the symbols. These functions may affect not only the structure (syntactic medium) of the symbol, but possibly also the meaning itself. Most of the time, as the symbol is copied and transferred from one form to another, the meaning does not change. Most of the functions built into the hardware making up the “human-computer interface” (HCI) are “identity” functions, transferring the originally projected concept from one syntactic media form to another. If this were not so, if the symbol printed on the key I press were not the symbol I see on the screen after the computer has “transformed” it from keyboard to wire to hard drive to wire to monitor screen, then I would conclude that the computer was broken or faulty, and I would cease to use it.

Sometimes, it is necessary or desirable that the computer apply a function (or a set of functions called a “derivation”) which actually alters the meaning of one symbol (concept), creating a new symbol with a different meaning (and possibly a different structure, too).
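To contrast with the identity functions above, here is a hedged sketch of a meaning-altering “derivation”. The scenario (line items summed into an order total) and all names are invented for illustration; the point is that the derived symbol carries a meaning that none of the input symbols carried.

```python
def derive_total(line_items):
    """A 'derivation': a function whose output symbol has a NEW meaning.

    The inputs each mean 'a purchase of some quantity of an item';
    the output means something different: the total amount owed.
    Both the structure and the meaning change.
    """
    total = sum(item["price"] * item["qty"] for item in line_items)
    return {"medium": "memory", "sign": total, "meaning": "order total"}

items = [
    {"medium": "memory", "price": 3.50, "qty": 2, "meaning": "line item"},
    {"medium": "memory", "price": 1.25, "qty": 4, "meaning": "line item"},
]
total_symbol = derive_total(items)

assert total_symbol["meaning"] != items[0]["meaning"]  # new meaning created
assert total_symbol["sign"] == 12.0                    # new structure, too
```

Unlike the identity re-encodings of the HCI, applying M() before and after this function would not yield an equivalence: the derivation genuinely adds a concept that was only implicit in the inputs.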

Tension and Intention: Shifting Meaning in Software

If a software system is designed for one particular purpose, the data structures will have one set of intended meanings. These will be the meanings which the software developers anticipated would be needed by the user community for which the software was built. This set of intended meanings and the structure and supported relationships make up the “domain” of the software application.

When the software is actually put to use, the user community may actually redefine the meaning of certain parts of the syntactic media defined by the developers. This often happens at the edges of a system, where there may exist sets of symbols whose content is not critical to the operating logic of the application, but which are of the right syntactic media to support the new meaning. The meaning that the user community projects onto the software’s syntactic media forms the context within which the application is used. (See “Packaged Apps Built in Domains But Used in Contexts“)

Software typically has two equally important components. One is the capture, storage, retrieval and presentation of symbols meaningful to a human community. The second is a set of symbol transformation processes (i.e., programming logic) which are built in to systematically change the structure, and possibly the meaning, of one set of symbols into another set of symbols.

For a simplistic example, perhaps the software reads a symbol representing a letter of the alphabet and transforms it into an integer using some regular logic (as opposed to picking something at random). This sort of transformation occurs a lot in encryption applications, and is a kind of transformation which preserves the meaning of the original symbol while completely changing its sign (syntactic medium).
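A minimal sketch of that letter-to-integer transformation, assuming a simple A=1 through Z=26 mapping (my choice of rule, not one named in the text): because the rule is regular and invertible, the sign changes completely while the meaning — which letter is denoted — survives the round trip.

```python
def encode(letter: str) -> int:
    """Map 'A'..'Z' onto 1..26 by a fixed, invertible rule."""
    return ord(letter) - ord("A") + 1

def decode(number: int) -> str:
    """The inverse transformation: recover the original sign."""
    return chr(number + ord("A") - 1)

assert encode("C") == 3
assert decode(encode("C")) == "C"                  # round trip loses nothing
assert all(decode(encode(c)) == c for c in "HELLO")
```

Had the mapping been chosen at random (with no inverse), the transformation would have been one-way, and some or all of the meaning would have been unrecoverable — the lossy case described earlier.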

When we push data (symbols) from a different source or context into the software application, especially data defined in a context entirely removed from that in which the software was defined and currently used, there are a number of possible ways to interpret what has happened to the original meaning of the symbols in the application.

What are some of the ways of re-interpretation?

  1. The meaning of the original context has expanded to a new, broader, possibly more abstract level, encompassing the meanings of both the original and the new contexts.
  2. Possibly, the mere fact that the original data and the new data could be mixed into the same syntactic media may indicate that the data from the two contexts are actually the same. How might you tell?
  3. Might it also imply that the syntactic medium is more broadly useful, or that the transformation logic is somewhat generically applicable (and hence more semantically benign)?
  4. Are the data from the two contexts cohabiting the same space easily? Are they therefore examples of special cases of a larger, or broader, symbolic phenomenon, or merely a happy coincidence made possible by loose or incomplete software development practices?
  5. How do the combined populations of data symbols fare as maintenance of the software for one of the contexts using it is applied? Does the other context’s data begin to be corrupted? Or is it easy to make software changes to the shared structures? Do changes in the logic and structure supporting one context force additional changes to be made to disambiguate the symbols from the other context?

These questions come to mind (or should) whenever a community starts thinking about using existing applications in new contexts.

Bridge Contexts: Meaning in the Edgeless Boundary

Previously, I’ve written about the idea of the “edgeless boundary” between semiospheres for someone with knowledge of more than one context. This boundary is “edgeless” because to the person perceiving it, there is little or no obvious boundary.

In software systems, especially in situations where different software applications are in use, the boundary between them, by contrast, can be quite stark and apparent. I’ll describe the reasons for this in other postings at a later time. The nutshell explanation is that each software system must be constrained to a well-defined subset of concepts in order to operate consistently. The subset of reality about which a particular application system can capture data (symbols) is limited by design to those regularly observable conditions and events that are of importance to the performance of some business function.

Often (in an ideal scenario), an organization will select only one application to support one set of business functions at a time. A portfolio of applications will thus be constructed through the acquisition/development of different applications for different sets of business functions. As mentioned elsewhere on this site, sometimes an organization will have acquired more than one application of a particular type (see ERP page). 

In any case, information contained in one application oftentimes needs to be replicated into another application within the organization.  When this happens, regardless of the method by which the information is moved from one application to another, a special kind of context must be created/defined in order for the information to flow. This context is called a “bridging context” or simply a “bridge context”.

As described previously, an application system represents a mechanized perception of reality. If we anthropomorphize the application, briefly, we might say that the application forms a semiosphere consisting of the meaning projected onto its syntactic media by the human developers and its current user community, forming symbols (data) which carry the specifically intended meaning of the context.

Two applications, therefore, would present two different semiospheres. The communication of information from one semiosphere to the other occurs when the symbols of one application are deconstructed and transformed into the symbols of the other application, with or without commensurate changes in meaning. This transformation may be effected by human intervention (as through, for example, the interpretation of outputs from one system and the re-coding/data entry into the other), or by automated transformation processes of any type (i.e., other software).

“Meaning” in a Bridging Context

Bridging Contexts have unique features among the genus of contexts overall. They exist primarily to facilitate the movement of information from one context to another. The meaning contained within any Bridging Context is limited to that of the information passing across the bridge. Some of the concepts and facts of the original contexts will be interpretable (and hence will have meaning) within the bridging context only if they are used or transformed during this flow.  Additional information may exist within the bridge context, but will generally be limited to information required to perform or manage the process of transformation.

Hence, I would consider that the knowledge held or communicated by an individual (or system) operating within a bridging context, which is otherwise unrelated either to the original contexts or to the process of transference, would exist outside of the bridging context, possibly in a third context. As described previously, the individual may or may not perceive the separation of knowledge in this manner.

Special symbols called “travellers” may flow through untouched by transformation and unrecognized within the bridging context. These symbols represent information important in the origin context which may be returned unmodified to the origin context by additional processes. During the course of their trip across the bridging context(s) and through the target context, travellers typically will have no interpretation, and will simply be passed along in an unmodified syntactic form until returned to their origin, where they can then be interpreted again. By this definition, a traveller is a symbol that flows across a bridge context but which only has meaning in the originating context.
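The traveller idea can be sketched as a pass-through rule in a bridge transformation. All field names and transform rules below are hypothetical, invented for the example; the point is that fields the bridge context does not interpret flow through syntactically unmodified.

```python
def bridge(record, known_fields, transforms):
    """Transform the fields the bridge context understands; pass every
    other field through untouched, as an uninterpreted traveller."""
    out = {}
    for key, value in record.items():
        if key in known_fields:
            out[key] = transforms[key](value)  # meaningful within the bridge
        else:
            out[key] = value                   # traveller: no meaning here
    return out

# 'origin_ref' is meaningful only back in the originating context.
source = {"cust_id": "007", "amount": "12.50", "origin_ref": "X-99-B"}
result = bridge(
    source,
    known_fields={"cust_id", "amount"},
    transforms={"cust_id": int, "amount": float},
)

# The bridge re-encoded what it understood and left the traveller intact.
assert result == {"cust_id": 7, "amount": 12.5, "origin_ref": "X-99-B"}
```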

Given a path P from context A to context B, the subset of concepts of A that are required to fulfill the information flow over path P are meaningful within the bridging context surrounding P. Likewise, the subset of concepts of B which are evoked or generated by the information flowing through path P, is also part of the content of the bridge context.  Finally, the path P may generate or use information in the course of events which are neither part of context A nor B. This information is also contained within the bridge context.
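The three kinds of content in a bridge context named above can be expressed with plain set operations. The concept names below are invented purely for illustration, assuming two contexts A and B joined by a single path P.

```python
# Concepts of two bridged contexts (hypothetical names).
concepts_A = {"customer", "order", "invoice", "loyalty_tier"}
concepts_B = {"account", "transaction", "statement"}

used_from_A   = {"customer", "order"}          # subset of A that path P consumes
evoked_in_B   = {"account", "transaction"}     # subset of B that path P evokes
p_bookkeeping = {"batch_id", "run_timestamp"}  # P's own info, in neither A nor B

# The bridge context contains exactly these three kinds of content.
bridge_context = used_from_A | evoked_in_B | p_bookkeeping

assert used_from_A <= concepts_A and evoked_in_B <= concepts_B
assert "loyalty_tier" not in bridge_context   # unused concepts of A stay outside
assert p_bookkeeping.isdisjoint(concepts_A | concepts_B)
```

Note that `invoice` and `loyalty_tier` remain outside the bridge context entirely: only what the flow over P actually touches is meaningful there.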

Bridge contexts may contain more than one path, and paths may transfer meaning in any direction between the bridged contexts. For that matter, it is possible that any particular bridging context may connect more than two other contexts (for example, when an automated system called an “Operational Data Store” is constructed, or a messaging interface such as those underlying Service Oriented Architecture (SOA) components are built).

An application system itself can represent a special case of a bridging context. An application system marries the context defined by the data modeller to the context defined by the user interface designer. This is almost a trivial distinction, as the two are generally so closely linked that their divergence should not be considered a sign of separate contexts. In this usage, an application user interface can be thought of as existing in the end user’s context, and the application itself acts to bridge that end user context to the context defining the database.
