Good Summary on How Engineers Define Symbols

An interesting summary of how software engineers are constrained to develop data structures based on their locality is presented in a comment by “katelinkins” at this blog discussing a book about how “information is used”. I think, however, that it ends on a note of wishful thinking, in suggesting that engineers don’t really

…KNOW and UNDERSTAND the code…

and implying that additional effort by them will permit

validating the representations upfront to aid in development of common taxonomy and shared context

I wasn’t sure whether the comment was suggesting that only software engineers “continually fall short” in this effort, or if she was suggesting a greater human failing.

While software developers can be an arrogant lot (I saw a description of “information arrogance” earlier in this discussion stream, and we can definitely fall into those traps, as anyone else can too), it is not always arrogance that causes our designs to fall short of everyone’s expectations.

Software developers do define symbols based on their regional context. But it gets even more constrained than that, because they must define the “symbology” based on what they know at a particular point in time and from a very small circle of sources, even if the software is intended for broad usage.

The fundamental problem is that there is ALWAYS another point of view. The thing that I find endlessly fascinating, actually, is that even though a piece of software was written for one particular business context (no matter how broad or constrained that is), someday, somewhere, a different group of users will figure out how to use the system in an entirely different context.

So, for example, the software application written for the US market that gets sold overseas and is used productively anyway, if not completely or in the same fashion, is a tremendous success, in my mind. This is how such applications as SAP (the product of German software development) have had such great success (if not such great love) worldwide!

I don’t believe there is such a thing as a “universal ontology” for any subject matter. In this I think I’m in agreement with some of the other posts on this discussion thread, since the same problem arises in organizing library indexes for various types of “information seeker” in any search. While having different sets of symbols and conceptions among a diverse set of communicating humans can muddy the space of our discourse, we at least have a capacity to compartmentalize these divergent views and switch between them at will. We can even become expert at switching contexts and mediating between people from different contexts.

One of the big problems with software is that it has to take what can be a set of fuzzy ideas, formalize them into a cohesive pattern of structure and logic that satisfies a certain level of rigor, and then “fix in cement” these ideas in the form of bug-free code. The end result is software that had to choose between variations and nuance which the original conceptions may not have ever tried to resolve. Software generally won’t work at all, at least in the most interesting parts of an ontology, if there is a divergence of conception within the body of intended users.

So in order to build anything at all, the developer is forced to close the discussion at some point and try their best to get as much right as is useful, even while they recognize there are variations left unhandled. Even in a mature system, where many of these semantic kinks have been worked out through ongoing negotiations with a particular user community, the software can never be flexible enough to accommodate every semantic variation that presents itself over time without being revised or rewritten.

In the software development space, this fundamental tension between getting some part of the ontology working and getting all points of view universally right in a timely fashion has been one of the driving forces behind all sorts of paradigm shifts in best practices and architectures.  Until the computer software can have its own conversation with a human and negotiate its own design, I don’t see how this fundamental condition will change.

Chasing the Chimera: Searching for Universal Truth in the Data Center

There’s a widespread belief in the data community (sometimes stated and sometimes just implied) that not only does the pursuit of the definition of a universal Single Version of Truth have “obvious technical merits”, but that it is crucial to our collective success. Having spent an entire career helping customers in many different industries codify and fabricate business systems, including participating in more than a few attempts at establishing a single version of truth by standardizing data, I have been surprised by my own revelation in recent years that we, as an industry, have been chasing an unreachable, and possibly an undesirable, chimera.

It’s like the old riddle about how to swallow an elephant. The solution is to take small bites, and just keep at it. This is a common metaphor used whenever a large project to standardize an enterprise’s data is begun. The problem is, trying to create that all-encompassing, single standard for all of the data in the organization is not really comparable to eating a rotting elephant corpse. You’re not really eating a finite mass of elephant at all! A more appropriate metaphor would be to consider that you are actually chewing the grass on the edge of a vast plain, and it just keeps growing faster than you can chew!

The value of some data standardization cannot be denied. Re-engineering selected areas can result in better data quality, timeliness and actual value. Certainly we have seen that the wheels of e-commerce can be sped up by careful selection of the right standard. Some practitioners, however, feel that this “piecemeal” approach is insufficient, and may even detract from the ultimate goal. These practitioners have seen how much good came from a little standardization and rationalization, and conclude that taking the practice to its logical conclusion should reap the ultimate benefit.

The problem with this logic is that it fails to take into account the cost of completion. My point is that no matter how valuable the end point is expected to be, the constant churn of systems coming on and off line, changes to the business, external business partners, external standards bodies, and mergers and acquisitions means that they will never reach that end state.

Some people may agree with me on this point, and others may not. However, even those who agree with me on the ultimate likelihood of success may still take the same old approach to the problem: convening a steering committee of diverse end users, locking them in a room for weeks on end, and forcing them to define an abstract but universal data dictionary, only to find that major portions are already out of date, or that major subject areas are still missing, or worse still, that most people outside of this pressure-cooker committee disagree with or do not understand the result!

An alternative approach to this search for the universal would be to recognize that diversity of meaning and representation will be a given in any sufficiently large organization of humans, and to address this inevitability directly. This can be accomplished by creating a “federated data dictionary” following these rules:

  1. Don’t attempt to “swallow the elephant” – try “mapping the terrain” instead by creating well-documented data dictionaries of each context.
  2. Document the context that defined a concept in the first place.
  3. Only standardize as much as is necessary to knit together those portions of the enterprise that must work together, and do no more.
  4. Create a “data thesaurus” in addition to the data dictionaries that describes and documents the equivalence of meaning between the data structures of the different contexts, but only for those which must touch each other across the enterprise.
  5. Focus on the points of integration between the contexts first, where data flows from one context to another.
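
To make the rules above a little more concrete, here is a minimal sketch of what a federated dictionary with a thesaurus might look like. Everything in it is an assumption made for illustration: the class names `Term` and `Thesaurus`, the `relate` and `translate` methods, and the sample sales and billing terms are all hypothetical, not taken from any real product.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Term:
    """One entry in a single context's data dictionary (hypothetical structure)."""
    context: str      # the community that defined the concept (rule 2)
    name: str         # the local name used inside that context
    definition: str   # what the term means to that community


@dataclass
class Thesaurus:
    """Equivalences recorded only where two contexts must exchange data (rules 3 to 5)."""
    links: dict = field(default_factory=dict)

    def relate(self, a: Term, b: Term, note: str) -> None:
        # Store the mapping in both directions, with a note on how close the match is.
        self.links[(a.context, a.name, b.context)] = (b, note)
        self.links[(b.context, b.name, a.context)] = (a, note)

    def translate(self, term: Term, target_context: str) -> Optional[tuple]:
        # Returns (equivalent term, note), or None if no mapping was ever needed.
        return self.links.get((term.context, term.name, target_context))


# Each context keeps its own dictionary; nobody is forced onto one vocabulary.
sales_customer = Term("sales", "customer", "A party that has signed an order")
billing_payer = Term("billing", "account_holder", "The party that receives invoices")

thesaurus = Thesaurus()
thesaurus.relate(sales_customer, billing_payer, "same party once the first invoice is issued")

equivalent, note = thesaurus.translate(sales_customer, "billing")
print(equivalent.name, "|", note)
print(thesaurus.translate(sales_customer, "shipping"))  # no integration point, so no mapping
```

The point of the sketch is the asymmetry: every context documents all of its own terms, but the thesaurus only grows where data actually has to cross a boundary.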

Isn’t it time we recognize that diversity exists? Maybe if we stop the never-ending chase for the universal, we’ll realize that diversity has its value too, and start trying to do a better job accommodating it.

Looking For The Semiotic Layperson

In searching for kindred spirits out there, I found a number of individual posts which I thought I could use to elucidate some of my own opinions. The following are mini-quotes from some of the people I’ve noticed online who appear to be thinking about symbols, meaning and communication in some fashion. I know there are lots of others; these just struck me as particularly interesting.

kristof28 has the same idea that I do about how symbols work:

Semiotics deals with the production of meaning. A perfectly sensible view of meaning would say that as I am the writer of this sentence so I put the meaning into it and that you, the reader, are the receiver so you take the meaning out. Semiotics is the science of understanding how signs work and how meaning emerges from the relationship between the sender and receiver.

What I would add to their basic statement is that the meaning that the receiver takes out of the message may not be exactly the same as the meaning that the sender put in. The more closely the two communicators share a common context, the more closely aligned will be their understanding. The less sharing before the message, the more likely that the message received will be different than intended.

cjc89 focuses on semiotics as the study of a larger societal process:

it is important to keep in mind that the key to semiotics is an attempt to define how meaning is socially produced (and not individually created). In this light, it will always be subject to power relations and struggles. Furthermore, meaning is always negotiated – it is never static.

In my mind, what “society” does with a symbol is to reinforce it, repeat it, and in this way amplify it. The most commonly shared concepts packaged in the most commonly recognized symbols will tend to get the most use and hence will tend toward relatively more people receiving the same message. But “society” is really a set of individual people. So it is through the popularity among a large set of people that certain symbols and concepts hold sway. I know I’m nit-picking a little here.

iheartunswjourno seems to share a worry about the power of the media:

Choosing to suppress or engage certain arbitrary relations that exist between the signifier and the signified, effectively oppressing or supporting the political agendas of their society. It is quite a scary reality to realize that the media is subtly constructing how we perceive the world.

While I agree that the bombardment of the majority conception of meaning through mass-produced symbols can be hard to counteract, I actually hold out the hope that we as individuals do have power to create meaning, at least within a sphere of influence.

 (The “semiotic” term for this would be “semiosphere”, apparently.)

I don’t believe in the existence of “meaning” living outside of the individual. I recognize the volume of symbolic detritus – the notion of our being surrounded by other people’s messages – certainly. And, yes, I recognize that the most powerful will control what is said in the most official channels, but none of us has to merely succumb and accept the message.

The notion of meaning being negotiated is spot on. That’s how it works between two people, and that’s how it works within a society. The miracle of it is that we humans are able to shift between points of view (contexts) with such ease that we don’t often notice ourselves that we have done so.  So while we might disagree with the consensus opinion of our countrymen, we are able to reach common ground with our next door neighbors.

And that’s just the thing that gets the larger process moving, talking with your neighbors and coming to agreement on some aspect of reality.

Every individual can choose to accept or reject the overwhelming flow, or to create their own discourse.  And that is part of our heritage as human beings.

How to Emculturate

This post is really about the basic pre-conditions needed for two people to communicate. This is really a naive, basic description, and I know that. However, it can be a useful way to think about and discuss in lay terms the technical aspects of acts of communication.

When I think about semantics and symbology, I focus on how meaning flows from one person to another. There are several components that have to come together in order for meaning to transfer between people.

First of all, two people must share the same context, even if it is not an exact fit. Without having some commonality of experience, however tenuous, there can be no communication. Now this context may be based on shared experience (e.g., attending the same event, reading the same book) or parallel experience (e.g., becoming a parent, learning to drive a car).

With that precondition established, the next element that must exist is some physical mechanism (i.e., a syntactic medium) that can be both manipulated and sensed by both individuals.

There would be no sense in writing on posters to communicate with a blind person across a great distance, or whispering a song to a deaf person from behind them, unless a second medium is also employed (such as having a third person read the poster aloud, or sign the song).

With a medium chosen that satisfies both conditions for both persons, then one person has to put the meaning into the medium using an established convention. In other words, the intended meaning of the message must be “encoded” onto the medium in such a way that both the sender of the message and the intended receiver of the message agree on the meaning conveyed.

These are the three minimal conditions required for communication between any two or more parties. In summary:

  1. Shared Context
  2. Physical Media that can be manipulated and sensed by both
  3. Agreed Upon Encoding

The only other elements required are that there be something to communicate and that the two individuals have the volition to try.
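
For readers who think in code, the three conditions can be sketched as a simple check. This is only a toy model under my own assumptions: the `Person` fields, the `shared` and `can_communicate` functions, and the sample people are all hypothetical, and real communication is obviously not reducible to set intersections.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    experiences: set   # contexts the person has lived through
    media: set         # media the person can both manipulate and sense
    encodings: set     # conventions the person knows (languages, codes)


def shared(a: Person, b: Person) -> dict:
    """Return what two people have in common for each of the three conditions."""
    return {
        "context": a.experiences & b.experiences,
        "medium": a.media & b.media,
        "encoding": a.encodings & b.encodings,
    }


def can_communicate(a: Person, b: Person) -> bool:
    # All three intersections must be non-empty; an empty set is falsy.
    return all(shared(a, b).values())


writer = Person("writer", {"learning to drive"}, {"poster", "speech"}, {"English"})
blind_friend = Person("friend", {"learning to drive"}, {"speech", "braille"}, {"English"})
poster_only = Person("sign painter", {"learning to drive"}, {"poster"}, {"English"})

print(can_communicate(writer, blind_friend))       # speech is a shared medium
print(can_communicate(poster_only, blind_friend))  # no medium in common
```

The second case mirrors the poster example above: the context and the encoding are shared, yet nothing gets through until a second medium is brought in.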

If You Reached This Entry…

So I’m incrementally organizing this blog to my liking, and have a lot left to do, having only started it a week ago now. One of the things I’m trying to do is create certain fixed pages of interesting (I hope) content to act as the entry point to some of my envisioned posts. My first example of this is my Glossary page. Eventually, I will have a large number of posts (if the pile of ideas I’ve been collecting on a desk behind me is any real indication). What these fixed pages will do is allow me to provide summaries of the basic premises and ideas, and then group related posts into categories related to these basics. That way, finding one’s way around and finding more content related to an idea you might find interesting will be made easier.

So if you have reached this entry by way of one of my fixed pages, my apologies. It just means that I haven’t yet had the time to write that next post on this subject. I’ll get to it, I promise!

Semantics of Architecture, Personal and Public

Poking around the blogosphere (should that be capitalized…?) this weekend, I came across Prof. Lindsay Clark’s blog describing some of her research interests in how architectural space becomes a “symbolic space”. I would love to see more details of her thinking there.

If I apply my own thought process to an architectural space, I could see several ways in which that space could be imbued with meaning. 

First of all, as an individual person living in a space, even a simple box-like room, that space will begin to acquire meaning by virtue of my living in it.

 “This corner is where I stood when I first saw the 9-11 video.”

 “I was sitting right here, just so, when I got the phone call about the birth of my nephew.”

 “The last thing she did when she left was to drop the key right there on that spot on the carpet.”

But this meaning is private, personal, and not at all obvious. Anyone else who comes into my physical abode won’t notice these things, unless they happened to be in the room at the same time and hence remembered these events for themselves.

Second, I could embellish or alter my little space in various ways. I could paint it (with a pattern or not), add images or statuary, or architectural elements, etc. These too may or may not present themselves to a second person as terribly meaningful, unless my selection of elements includes icons or references from some community we both happen to share.

Third, I could imagine, as an architect, working very hard at embedding cultural (community) references through the use of shape and structure, materials, position and location, etc. While I would try to be clever about such symbology, I would likely also try to not be too esoteric, lest my intent be lost on the majority of visitors to the space. The best work, I would think, would appear fresh and clever, and be mostly obvious or at least easily accessed/discovered through direct experience of the space without other forms of description.

 (Nothing like ruining a good joke or a good symbol by having to explain it over and over…)

In this sense, the referents of the structure’s symbols should be recognized through the context of the surrounding environment as experienced in conjunction with or on approach to the space.  

A structure whose meaning requires explicit description (say through placards or brochures) becomes less a symbol in its own right, and more just an exhibit space. While the purpose and meaning of the Egyptian pyramids of Giza in their particulars are not obvious, their size, shape, age and location lend an obvious gravitas to them that I imagine a visitor cannot help but recognize, even if they don’t read the brochure. Such a space is what I would describe as a symbolic space.

(Full disclosure: I’ve never been, but would love to go someday).

The Real Reason Systems Fail

It seems as if every week there’s another news article bemoaning the state of data integration within some large enterprise. Mission objectives are stymied because “systems don’t talk to each other”. Intelligence failures are due to “incompatible data”. The surprise and outrage expressed would make the lay reader think that this is a recent trend, but they’d be wrong in this notion. The “Data Integration Problem” has been around since the first humans began to speak. Most practitioners and experts who work in the software version of the problem space haven’t realized this, but it’s true.

What is “data” in the modern sense? Most people think that it is “information”, the detailed “facts” of a modern culture. This colloquial understanding is a major simplification, one which is at the root of the Data Integration Problem. It is the reason why most people, even seasoned experts, are constantly surprised and frustrated when the monster appears, seemingly out of nowhere, before them.

So what is data?
Data is CODE.

What this implies is that without someone who can decode it – an INTERPRETER – data is nothing. Let that sink in for a minute.

Data is nothing without INTERPRETATION.

What does this mean? Well for one thing it means that without an interpretation, there is no way to even recognize that data exists. And without an interpreter, there can be no interpretation.

So when we think about all of the data being generated and passed around in our modern world, the question arises: Who is the interpreter that gives data its meaning? Well obviously it’s us. The computer doesn’t understand the data it contains! No matter how we might try to anthropomorphize them, computers are still just as dumb as the lumps of metal, plastic and sand from which they are constructed. The systems that we humans create using these computers are just that – mechanical systems which manipulate physical media, morphing symbols from one representation to another. Everything a computer does is devoid of intrinsic meaning until some human comes along and interprets the symbols.
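
A small sketch shows how little the machine itself contributes. The same four bytes are read three ways below; the byte string `b"DATA"` and the three readings are my own choices for illustration, and nothing in the bytes says which reading is the "right" one.

```python
import struct

raw = b"DATA"  # four bytes on some physical medium; no meaning attached yet

as_text = raw.decode("ascii")              # one interpreter reads letters
as_integer = struct.unpack(">I", raw)[0]   # another reads a big-endian unsigned integer
as_float = struct.unpack(">f", raw)[0]     # a third reads an IEEE 754 single-precision float

print(as_text, as_integer, as_float)
```

The computer will happily produce all three results. Choosing among them takes a human who knows what was intended when the bytes were written.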

Imagine computer systems after an apocalypse. Imagine the systems of the stock exchange, or the weather service, or any of the thousands of other automated systems that may run unattended by their now defunct human inventors. Now answer the question: without humans to interact with them, do they produce anything? Is there any content to them without, ultimately, some human being to interpret their output?

This is more than that old saw about a tree falling in the forest. Consider some famous examples of symbols that have lost their meaning:

  1. Cave paintings of Lascaux depict hunts and animals, but what did our pre-historic cousins intend when they outlined their hands on the walls?
  2. Until the Rosetta Stone was found and deciphered, Egyptian hieroglyphics had lost their meaning in the world.
  3. When the Confederacy fell, Confederate currency lost its meaning and value.

The Data Integration Problem, simply stated, is caused by the fact that data is symbolic code onto which some group of humans has projected meaning. Meaning, therefore, is local to the humans doing the projection. Without the knowledge of how meaning was projected onto the symbols, the information they contain cannot be retrieved in any complete sense.
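
Here is a minimal illustration of meaning being local, assuming a hypothetical date field passed between two systems whose builders followed different conventions. The value `"03/04/2021"` and the variable names are invented for the example.

```python
from datetime import datetime

field_value = "03/04/2021"  # hypothetical value exchanged between two systems

us_reading = datetime.strptime(field_value, "%m/%d/%Y")   # US convention: month first
eu_reading = datetime.strptime(field_value, "%d/%m/%Y")   # European convention: day first

print(us_reading.date(), eu_reading.date())
print((eu_reading - us_reading).days)  # how far apart the two readings land
```

Both readings parse without error, which is exactly the trouble: neither system can tell from the symbols alone that the other group projected a different meaning onto them.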

Only the people who have projected their meaning onto the symbols (or in special instances who share significant experiences of the world with those people who have) are able to interpret the data correctly. In any large, complex enterprise, where business necessitates that small groups complete their own missions expeditiously and with vigor, who should be surprised that locally defined data doesn’t integrate well from one end of the enterprise to the other?

Really, the answer to this question should be “nobody”.

This has been true since humans (and possibly our predecessors) first started making symbols.
