You Can’t Store Meaning In Software

I’ve had some recent conversations at work which made me realize I needed to make some of the implications of my other posts more obvious and explicit. In this case, while I posted a while ago about How Meaning Attaches to Data Structures, I never really carried the conversation forward.

Here is the basic, fundamental mistake that we software developers (and others) make in talking about our own software: we start thinking that the data structures and programs actually and directly hold the meaning we intend. We think that if we do things right, our data structures, be they tables with rows and columns or POJOs (Plain Old Java Objects) in a Domain layer, just naturally and explicitly contain the meaning.

The problem is that whatever symbols we make in the computer, the computer can only hold structure. Our programs are only manipulating addresses in memory (or on disk) and only comparing sequences of bits (themselves just voltages on wires). Now through the programming process, we developers create extremely sophisticated manipulations of these bits, and we are constantly translating one sequence of bits into another in some regular, predictable way. This includes pushing our in-memory patterns onto storage media (and typically constructing a different pattern of bits), and pushing our in-memory patterns onto video screens in forms directly interpretable by trained human users (such as displaying ASCII codes as characters in an alphabet forming words in a language which can be read).

This is all very powerful, and useful, but it works only because we humans have projected meaning onto the bit patterns and processes. We have written the code so that our bit symbol representing a “1” can be added to another bit symbol “1” and the program will produce a new bit symbol that we, by convention, will say represents a value of “2”.

The software doesn’t know what any of this means. We could have just as easily defined the meaning of the same signs and processing logic in some other way (perhaps, for instance, to indicate that we have received signals from two different origins, maybe to trigger other processing).
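As a rough illustration of this point, here is a minimal Python sketch. The four byte values are arbitrary, chosen only for the example, and the three "readings" are just conventions I picked to show that one fixed bit pattern supports several interpretations equally well.

```python
import struct

# One fixed pattern of 32 bits. The pattern itself carries no meaning.
raw = bytes([0x41, 0x42, 0x43, 0x44])

# Three conventions a human might project onto the very same bits.
as_text = raw.decode("ascii")            # read as four ASCII characters
as_int = int.from_bytes(raw, "big")      # read as one unsigned integer
as_float = struct.unpack(">f", raw)[0]   # read as one IEEE 754 float

print(as_text)   # ABCD
print(as_int)    # 1094861636
print(as_float)  # some number near 12; the bits did not change

# "1 + 1 = 2" is likewise a convention about bit patterns: the machine
# turns the pattern 01 into the pattern 10, and we agree to call that "2".
one = 0b01
two = one + one
print(format(two, "02b"))  # 10
```

Nothing in the program records which of these readings is the "right" one. That choice lives with whoever wrote or uses the code.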

Why This Is Important

The comment was made to me that “if we can just get the conceptual model right, then the programming should be correct.” I won’t go into the conversation more deeply, but it led me to thinking how to explain why that was not the best idea.

Here is my first attempt.

No matter how good a conceptual model you create, how complete, how general, how accurate to a domain, there is no way to put it into the computer. The only convention we have as programmers when we want to project meaning into software is that we define physical signs and processes which manipulate them in a way consistent with the meaning we intend.

This is true whether we manifest our conceptual model in a data model, or an object model, or a Semantic Web ontology, or a rules framework, or a set of tabs on an Excel file, or an XML schema, or … The point is that the computer can only store the sign portion of our symbols, never the concept. So if you intend to create a conceptual model of a domain, and have it inform and/or direct the operation of your software, you are basically just writing more signs and processes.

Now if you want some flexibility, there are many frameworks you can use to create a symbolic “model” of a “conceptual model” and then you can tie your actual solution to this other layer of software. But in the most basic, reductionist sense, all you’ve done is write more software manipulating one set of signs in a manner that permits them to be interpreted as representing a second set of signs, which themselves only have meaning in the human interpretation.
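One way to see this is with a toy "conceptual model" stored as triples. The sketch below is an assumption-laden illustration, not any real ontology framework: the triples, the `is_a` helper, and the renaming table are all hypothetical. The inference works by string comparison alone, so swapping every meaningful name for a meaningless token changes nothing about what the program does.

```python
def is_a(triples, thing, kind):
    """Follow 'is_a' links transitively, using only string equality."""
    seen = set()
    frontier = [thing]
    while frontier:
        current = frontier.pop()
        if current == kind:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Collect every sign that 'current' is linked to by an is_a triple.
        frontier.extend(o for s, p, o in triples if s == current and p == "is_a")
    return False

# A tiny "model" whose names look meaningful to a human reader.
model = [
    ("invoice", "is_a", "document"),
    ("document", "is_a", "record"),
]

# The same structure with every sign replaced by an arbitrary token.
renaming = {"invoice": "x1", "document": "x2", "record": "x3"}
scrambled = [(renaming[s], p, renaming[o]) for s, p, o in model]

print(is_a(model, "invoice", "record"))  # True
print(is_a(scrambled, "x1", "x3"))       # True
print(is_a(model, "record", "invoice"))  # False
```

The program gives the same answers for both versions because it only ever had the structure. The word "invoice" means something to us, not to it.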

What a Context Is: Information Flow Theory

I’ve been busy lately, and let this discussion lapse for a bit. Let’s see if I can kickstart it again.

Sometimes, information flows from the physical world to a person who observes and interprets his perceptions to create and recognize new knowledge. Sometimes, someone creates a message in a physical medium and “sends” it into the world hoping that there will be another person who not only can sense it, but can recognize it as a message and can receive the information layered on top of the physical perception. The first is an example of simple observation and perception, the second is an example of communication within a context. While both rely on the perception of physical reality by the observer, they are fundamentally, though subtly, different.

I have been reading a book on the mathematical theory underlying the “flow” of “information” in the real world, and while I don’t yet really understand the theory, there are some points the book is failing to make about context which I think are important.

Barwise, Jon and Seligman, Jerry, Information Flow: The Logic of Distributed Systems, Cambridge University Press, 1997.

Not that the book was intended to cover the concept of context, per se, being an attempt to lay out a mathematical/logical framework for describing the flow of information across, through and between physical systems. It does have an extensive section on “local logics” which when I understand it better may help me describe my own ideas about context in a formal manner.

What struck me, and forms the origin point of today’s discussion on contexts, is the example the authors use to introduce and illustrate the technical discussion. And as you will see, it is not just the example given, but variations of it that I will use to elucidate better what a context is and is not. To get to the meat of my thinking, here is their example, as written on pages 4-5.

Judith, a keen but inexperienced mountaineer, embarked on an ascent of Mt. Ateb. She took with her a compass, a flashlight, a topographic map, and a bar of Lindt bittersweet chocolate. The map was made ten years previously, but she judged that the mountain would not have changed too much. Reaching the peak shortly after 2 P. M. she paused to eat two squares of chocolate and reflect on the majesty of her surroundings.

At 2:10 P. M. she set about the descent. Encouraged by the ease of the day’s climb, she decided to take a different route down. It was clearly indicated on the map and clearly marked on the upper slopes, but as she descended the helpful little piles of stones left by previous hikers petered out. Before long she found herself struggling to make sense of compass bearings taken from ambiguously positioned rocky outcrops and the haphazard tree line below. By 4 P. M. Judith was hopelessly lost.

Scrambling down a scree slope, motivated only by the thought that down was a better bet than up, the loose stones betrayed her, and she tumbled a hundred feet before breaking her fall against a hardy uplands thorn. Clinging to the bush and wincing at the pain in her left leg, she took stock. It would soon be dark. Above her lay the treacherous scree, below her were perils as yet unknown. She ate the rest of the chocolate.

Suddenly, she remembered the flashlight. It was still working. She began to flash out into the twilight. By a miracle, her signal was seen by another day hiker, who was already near the foot of the mountain. Miranda quickly recognized the dots and dashes of the SOS and hurried on to her car where she phoned Mountain Rescue. Only twenty minutes later the searchlight from a helicopter scanned the precipitous east face of Mt. Ateb, illuminating the frightened Judith, still clinging to the thorn bush but now waving joyously at the aircraft.

Was Judith Lucky That Miranda Knew Morse Code?

In a word, yes. In so many ways:

  • like the fact that Miranda was near the bottom of the mountain,
  • that she had a clear view of the side of the mountain where Judith lay,
  • that she had a cell phone at the car,
  • and most importantly, Judith was indeed lucky that Miranda knew the Morse code for “SOS”.

In fact, it was also lucky that Judith herself knew the code. So in the story as given, since Judith knew the code to use when she found herself in trouble, she used her light as the syntactic medium in which she encoded a message of her need for help. The fact that Morse code is a globally standard coding scheme simply meant that Judith and Miranda shared a common context without ever having met. Their shared knowledge of the code provided the context by which Judith was able to get her message to Miranda.

What if Miranda Didn’t Know Morse Code

Things could have been much worse for Judith if either of them had no knowledge of the code, or if neither of them did. In addition, it was lucky that Miranda was somehow aware (or realized as she was watching) that a flashing light could be used to signify such a code, and that she then obviously deduced from the repeated pattern that someone was sending a message.

Imagine what might have happened if Miranda had seen the flashing light, but didn’t recognize it as a code, and therefore didn’t try to translate what she was seeing. Instead of reacting by calling for help, she might have thought to herself “Oh look, there’s a light up on the mountain. I wonder what that is?” but then gone on about her business.

In other words, Miranda could have observed her environment and perceived the flashing light but concluded that it was simply a physical phenomenon of no particular import. She may have perceived the signal but failed to recognize it as a message. That would show that Judith and Miranda had failed to establish a context for the communication.
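The role of the shared code can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `encode` and `receive` functions are hypothetical, and the codebook holds just the two letters needed for “SOS”. The same flashes arrive in both cases; only the receiver who holds the codebook gets a message out of them.

```python
# The shared convention: a two-letter fragment of International Morse Code.
MORSE = {"S": "...", "O": "---"}
DECODE = {pattern: letter for letter, pattern in MORSE.items()}

def encode(text):
    """Sender's side: turn letters into groups of dots and dashes."""
    return " ".join(MORSE[ch] for ch in text)

def receive(flashes, codebook):
    """Receiver's side: return the message, or None if nothing is recognized."""
    if codebook is None:
        # The light is perceived, but not recognized as a message.
        return None
    letters = [codebook.get(group) for group in flashes.split(" ")]
    if None in letters:
        return None  # a pattern the receiver's convention does not cover
    return "".join(letters)

signal = encode("SOS")
print(signal)                   # ... --- ...
print(receive(signal, DECODE))  # SOS
print(receive(signal, None))    # None
```

The physical signal is identical in the last two calls. What differs is whether the receiver brings the convention needed to read it.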

What if Judith didn’t know Morse Code?

If Judith didn’t know Morse Code, perhaps she would still have started waving her flashlight around. Miranda, having seen the light, would have no reason to recognize a code.

Would this mean Judith would be out of luck? Not necessarily, if Miranda was also an experienced hiker. Miranda being in the context of hiking, it might occur to her that there shouldn’t be a light on that part of the mountain at that moment in time. She might think to herself that the random way the light was moving, plus its position on the mountain compared with where the safe trails were, added up to someone in distress.

In this case, a message has still been sent from Judith to Miranda, with the same result. The context that Miranda was thinking in, plus her perception and prior knowledge of the mountain trails, allowed her to reach the conclusion that there was someone on the mountain in trouble. But it is important to note that Judith was not in the same context as Miranda.

In fact, if Miranda was a ranger, she may have been trained to look for and recognize the behavior of people in distress. In this example, we must conclude that Judith was not actually participating in the context with Miranda. It was Miranda’s knowledge of and mindset regarding her perceptions of the dangerous mountain environment which led her to deduce the existence of a person in trouble, not the fact of Judith’s trying to send a message.

Yes, this Judith tried to send a message, but she couldn’t have known that her random wavings would be recognized in any way. Whereas the Judith who used Morse code actually knew of a context and encoded a very specific message using that context, with the expectation and hope that someone else might also understand it.

The difference in the two versions of the story is subtle. In both cases a message was sent, and in both a message was received and an action was taken. But in the first story, a bridging context in the form of Morse Code was called upon to carry a very specific message, while in the second story there was no bridging context. In the second story, it was entirely the perceptiveness and deductive power of Miranda’s “hiking Mt. Ateb context” which allowed her to create for herself new information: namely that “someone out there is in trouble”.

Once More, What If Judith Wasn’t In Trouble?

Let’s take one more variation of the story to reinforce this last point. Let’s say that everything happened as described, except that instead of falling down the scree, Judith purposefully rappelled down the side of the mountain. And furthermore, instead of clinging desperately to a thorn bush, she had actually managed to establish a bivouac in that peculiar outpost. In this version of the story, perhaps Judith is waving her flashlight around as in our second story, only this time merely to light her little campground while fixing herself dinner.

Now imagine ranger Miranda, trained as before and with a knowledge of the trails, but without prior knowledge of anyone camping where Judith found her perch. Using the same skills of perception and deduction, Miranda may still come to the conclusion that there is someone on the side of the mountain in trouble, and take the aforementioned steps to effect a rescue. Only she would find that Judith was not in need of help, and was now put out by the disturbance of her relaxation by the whirring chopper blades.

In this version of the story, Miranda is still in the same context as before, and uses her perceptions and the rules of that context to reach her conclusion. The fact is, and this latest version of the story should make it clear, that while information did flow from Judith to Miranda just as before, we cannot call this information a “message” carried on a medium and in a context shared by Judith and Miranda. In other words, it was not a purposeful communication across a bridging context.

No, quite simply, in both of these latter examples, Miranda’s context guided her to her perception and the creation of the knowledge that Judith was on the mountain and in trouble (even if she was mistaken on this last point in the final story).

Summary

Whether or not a person who presses a flashlight button intends to send a message, to communicate through that act, defines whether we classify the information flow as a symbolic act. Perhaps the person does not realize or care whether there is another person watching for a flashlight in the dark. The fact that someone sees the light and acts in response does not mean that communication has occurred. Just because information has flowed does not mean that symbols have flowed.
This is a subtle distinction but an important one.

Packaged Apps Built in Domains But Used In Contexts

Packaged applications are software systems developed by a vendor and sold to multiple customers. Those applications, especially the ones which include some sort of database and data storage, are built to work in a “domain”.

The “domain” of the software application is an abstract notion of the set of contexts the software developers have designed the software to support. While the notion of “domain” as described here is similar to and related to the notion of “context”, the domain of the software only defines the potential types of symbols that can be developed. In other words, the domain defines a syntactic medium (consisting of physical signs, functions and transformations on those signs, and the encoding paradigm).

But the software application domain is NOT its context. Context, when applied to software applications, is defined by the group of people who use the software together.
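A small Python sketch may help separate the two ideas. Everything in it is hypothetical: the `Ticket` type, the 1 to 5 priority range, and the two user communities are invented for illustration. The vendor's code fixes which signs are legal (the domain), while each group of users supplies its own reading of those signs (the context).

```python
from dataclasses import dataclass

# The vendor's "domain": which signs may be stored at all.
ALLOWED_PRIORITIES = range(1, 6)

@dataclass(frozen=True)
class Ticket:
    title: str
    priority: int

    def __post_init__(self):
        # The software can enforce structure, and nothing more.
        if self.priority not in ALLOWED_PRIORITIES:
            raise ValueError(f"priority must be 1..5, got {self.priority}")

# Two hypothetical user communities running the same packaged application.
def most_urgent_for_support_desk(tickets):
    """Here the users' convention is that 1 means 'drop everything'."""
    return min(tickets, key=lambda t: t.priority)

def most_urgent_for_warehouse(tickets):
    """Here the users' convention is that 5 means 'drop everything'."""
    return max(tickets, key=lambda t: t.priority)

tickets = [Ticket("printer jam", 1), Ticket("forklift down", 5), Ticket("new badge", 3)]
print(most_urgent_for_support_desk(tickets).title)  # printer jam
print(most_urgent_for_warehouse(tickets).title)     # forklift down
```

The stored data is identical for both groups. Which ticket counts as urgent depends entirely on the community reading it.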

There’s a difference, therefore, between how developers and designers of business software think about and design their systems, and how those systems are used in the real world. No matter how careful the development process is, no matter how rigorous and precise, no matter how closely the software matches the business requirements, and no matter how cleanly and completely the software passes its tests, the community using the software will eventually be forced to bend it to a purpose for which it was never intended.

This fact of life is the basis of several relatively new software development paradigms, including Agile and Extreme Programming, and the current Service-Oriented Architecture. Each of these rests on the recognition that the business will not pause and wait while IT formally re-writes and re-configures application systems.

One of the shared tenets of these practices is that because the business is so fluid, it is impossible to follow formal development methods. In SOA, the ultimate ideal is a situation where the software has become so configurable (and so easy to use), that it no longer requires IT expertise to change the behavior. The business users themselves are able to modify the operation of the software daily, if necessary.

Different Contexts Use Different Signs

The following is an excerpt from one of my permanent pages.

Photo of an Actual Stop Sign In Its Normal Context

In the Context defined for “driving a car in the United States,” a particularly shaped, painted metal plate attached to a wooden post which has been planted in the ground at the intersection of two roads and facing toward oncoming vehicles represents the concept of a command to the oncoming motorist to “stop” their vehicle when they reach the intersection.

However, a similarly colored and shaped object, say a computer bitmap of a drawing of a “stop sign”, not only is represented by a different Syntactic Medium, it exists in an entirely different context (perhaps one that is not obviously recognized by the casual observer).

 

Cartoon Drawing of a Stop Sign

If this computer bitmap “stop sign” were to be displayed on a large computer monitor, and this computer monitor was used to replace the wood and metal Stop Sign, even if placed in the same position and orientation as the more typical structure, it is not certain that every driver would recognize the validity of the new Syntactic Medium, which could lead to accidents! This example should give the reader a clear understanding of how a Context constrains and defines the physical structures that are permitted to represent the concepts it contains.

Software as Semantic Choice

When I design a new software system, I have to choose what parts of reality matter enough to capture in the data (data is little bits of information stored symbolically and in great repetitive quantities). I can’t capture the entirety of reality symbolically. Software is another example in life of having to divide an analog reality into discrete named chunks, choosing some and leaving others unmentioned.

This immediately sets the system up for future “failure” because at some point, other aspects of the same reality will become important. This is what in artificial intelligence is called “brittleness”, a quality which bedeviled the expert system movement and kept it from becoming a mainstream phenomenon. This is also a built-in constraint on semantic web work, but I’ll leave that for another post.
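Here is a minimal Python sketch of that kind of design choice. The schema, the field names, and the sample observation are all made up for illustration. The point is only that whatever the designer leaves out of the schema is gone from the stored record, however real it was at the moment of capture.

```python
# A hypothetical design decision: these are the parts of reality that "matter".
SCHEMA = ("customer_name", "order_total")

def capture(observation):
    """Keep only the fields the designer chose; everything else is dropped."""
    return {name: observation[name] for name in SCHEMA}

# What was actually there to be observed on the day (invented sample values).
observation = {
    "customer_name": "Ada",
    "order_total": 42.50,
    "paid_in_cash": True,
    "weather": "rain",
}

record = capture(observation)
print(record)                    # {'customer_name': 'Ada', 'order_total': 42.5}
print("paid_in_cash" in record)  # False
```

If payment method becomes important next year, the stored records cannot answer the question. That is the brittleness in miniature.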

Taking quantum physics research as an example, there’d be no point in writing one application to capture both the exact momentum and the exact position of a quantum particle in a database because, as we all know, the two cannot both be measured precisely at the same time. Thus we choose to capture the one that’s important to our study, and we ignore the other.

This is why a picture is worth a thousand words: because it is an analog of reality and captures details that can remain unnamed until needed at a future time.

This is also why we say that in communication we must “negotiate reality”. We must agree together (software developer and software user) what parts of reality matter, and how those parts are named, recognized, and interact.

In reading a recent thread on Library Science, it sounds like in the “indexing and abstracting” problem (used to set up a searchable space for finding relevant documents), a choice has to be made on what we think searchers will most likely bring with them in order to find the information they seek. But by virtue of making one choice, we necessarily eliminate other choices we might have made which may have supported other seekers better.

This is an interesting parallel, and I must assume that I’ll find more as this dialog continues.

How to Emculturate

This post is really about the basic pre-conditions needed for two people to communicate. This is really a naive, basic description, and I know that. However, it can be a useful way to think about and discuss in lay terms the technical aspects of acts of communication.

When I think about semantics and symbology, I focus on how meaning flows from one person to another. There are several components that have to come together in order for meaning to transfer between people.

First of all, two people must share the same context, even if it is not an exact fit. Without having some commonality of experience, however tenuous, there can be no communication. Now this context may be based on shared experience (e.g., attending the same event, reading the same book) or parallel experience (e.g., becoming a parent, learning to drive a car).

With that precondition established, the next element that must exist is some physical mechanism (i.e., a syntactic medium) that can be manipulated and sensed by both individuals.

There would be no sense in writing on posters to communicate with a blind person across a great distance, or whispering a song to a deaf person from behind them, unless a second medium is also employed (such as having a third person read the poster aloud, or sign the song).

With a medium chosen that satisfies both conditions for both persons, then one person has to put the meaning into the medium using an established convention. In other words, the intended meaning of the message must be “encoded” onto the medium in such a way that both the sender of the message and the intended receiver of the message agree on the meaning conveyed.

These are the three minimal conditions required for communication between any two or more parties. In summary:

  1. Shared Context
  2. Physical Media that can be manipulated and sensed by both
  3. Agreed Upon Encoding

The only other elements required are that there be something to communicate and that the two individuals have the volition to try.
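The three conditions can be written down as a simple check. This Python sketch is a deliberate simplification under my own assumptions: the `Person` fields and the `can_communicate` function are hypothetical, it only models a one-way message (the sender must be able to produce the medium and the receiver must be able to sense it), and it treats each condition as a plain set overlap.

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    contexts: set = field(default_factory=set)     # experiences they can draw on
    can_produce: set = field(default_factory=set)  # media they can manipulate
    can_sense: set = field(default_factory=set)    # media they can perceive
    encodings: set = field(default_factory=set)    # conventions they know

def can_communicate(sender, receiver):
    """Check the three minimal conditions for a one-way message."""
    shared_context = sender.contexts & receiver.contexts
    shared_medium = sender.can_produce & receiver.can_sense
    shared_encoding = sender.encodings & receiver.encodings
    return bool(shared_context and shared_medium and shared_encoding)

judith = Person("Judith", {"hiking"}, {"light", "sound"}, {"light", "sound"}, {"morse"})
miranda = Person("Miranda", {"hiking"}, {"light", "sound"}, {"light", "sound"}, {"morse"})
passerby = Person("Passerby", {"hiking"}, {"sound"}, {"light", "sound"}, set())

print(can_communicate(judith, miranda))   # True
print(can_communicate(judith, passerby))  # False: no agreed encoding
```

Real communication is of course messier than three set intersections, since an encoding is usually tied to a particular medium and contexts overlap by degree. The sketch only shows that if any one of the three is missing, the check fails.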
