You Can’t Store Meaning In Software

I’ve had some recent conversations at work which made me realize I needed to make some of the implications of my other posts more obvious and explicit. In this case, while I posted awhile ago about How Meaning Attaches to Data Structures I never really carried the conversation forward.

Here is the basic, fundamental mistake that we software developers make (and others) in talking about our own software. Namely, we start thinking that the data structure and programs actually and directly hold the meaning we intend. That if we do things right, that our data structures, be they tables with rows and columns or POJOs (Plain Old Java Objects) in a Domain layer, just naturally and explicitly contain the meaning.

The problem is, that whatever symbols we make in the computer, the computer can only hold structure. Our programs are only manipulating addresses in memory (or disk) and only comparing sequences of bits (themselves just voltages on wires). Now through the programming process, we developers create extremely sophisticated manipulations of these bits, and we are constantly translating one sequence of bits into another in some regular, predictable way. This includes pushing our in-memory patterns onto storage media (and typically constructing a different pattern of bits), and pushing our in-memory patterns onto video screens in forms directly interpretable by trained human users (such as displaying ASCII numbers as characters in an alphabet forming words in a language which can be read).

This is all very powerful, and useful, but it works only because we humans have projected meaning onto the bit patterns and processes. We have written the code so that our bit symbol representing a “1” can be added to another bit symbol “1” and the program will produce a new bit symbol that we, by convention, will say represents a value of “2”.

The software doesn’t know what any of this means. We could have just as easily defined the meaning of the same signs and processing logic in some other way (perhaps, for instance, to indicate that we have received signals from two different origins, maybe to trigger other processing).

Why This Is Important

The comment was made to me that “if we can just get the conceptual model right, then the programming should be correct.”  I won’t go into the conversation more deeply, but it lead me to thinking how to explain why that was not the best idea.

Here is my first attempt.

No matter how good a conceptual model you create, how complete, how general, how accurate to a domain, there is no way to put it into the computer. The only convention we have as programmers when we want to project meaning into software is that we define physical signs and processes which manipulate them in a way consistent with the meaning we intend.

This is true whether we manifest our conceptual model in a data model, or an object model, or a Semantic Web ontology, or a rules framework, or a set of tabs on an Excel file, or an XML schema, or … The point is the computer can only store the sign portion of our symbols and never the concept so if you intend to create a conceptual model of a domain, and have it inform and/or direct the operation of your software, you are basically just writing more signs and processes.

Now if you want some flexibility, there are many frameworks you can use to create a symbollic “model” of a “conceptual model” and then you can tie your actual solution to this other layer of software. But in the most basic, reductionist sense, all you’ve done is write more software manipulating one set of signs in a manner that permits them to be interpreted as representing a second set of signs, which themselves only have meaning in the human interpretation.

Advertisements

Q&A: Meaning Symbol Sign and Mind (Part 1)

On one of my recent posts, a commentor named “psycho” asked me some very good questions. I decided I needed to respond in more detail than just a single comment reply. I respond in pieces below, so just for context, here is psycho’s entire original comment.

But if you take more meanings, and put them together to get yet another meaning. Don’t you feel like those meanings were again like symbols creating a new meaning?

In my understanding, every bit of information is a symbol – what is represented by the invididual neurons in the brain. And if you take all related bits (that is neurons, symbols), and look at it as a whole, what you get is meaning.

The sentence is a symbol, and it is made of word-symbols. And the list of word-symbols makes a meaning. Which, when given a name (or feeling), becomes a symbol, that can be further involved in other meanings.

I’ll respond to each paragraph in a separate post, in order to get all of my thoughts down in a reasonably readable fashion. Here is part one.

Construction of Symbols

But if you take more meanings, and put them together to get yet another meaning. Don’t you feel like those meanings were again like symbols creating a new meaning?

I try to make a very strong statement of the difference between symbols, signs and their “meanings”. Perhaps I’m being too analytical, but it allows my to think about certain types of information events in a way I find useful in my profession as a data modeller. So let me try to summarize here the distinctions I make, then I’ll try to answer this question.

First, in my writings, I separate the thing represented by a symbol from the thing used as the representation. The thing represented I call the “concept” or “meaning”. The thing which is used to represent the concept I have termed “the sign”.  A symbol is the combination of the two. In fact, a specific symbol is a discrete object (or other physical manifestation) built for the express purpose of representing something else. That specific symbol has a specific meaning to someone who acts as the interpreter of that symbol.

As I have come to learn as I continue reading in this subject area, this is a somewhat ideosyncratic terminology compared to the formal terms that have grown out of semiology and linguistics. To that I say, “so be it!” as I would have  a lot of re-writing to do to make my notions conform. I think my notions are comparable, in any case, and don’t feel I need to be bogged down by the earlier vocabulary, if I can make myself clear. You can get a feel for some of my basic premises by poking around some of my permanent pages, such as the one on Syntactic Media and the Structure of Meaning.

There is obviously a lot of nuance to describing a specific symbol, and divining its specific meaning can be a difficult thing, as my recurring theme concerning “context” should indicate. However, within my descriptive scheme, whatever the meaning is, it is not a symbol. Can a symbol have several meanings? Certainly. But within a specific context at a specific time, a specific symbol will tend to have a single specific meaning, and the meaning is not so fluid.

How do you express a more complex or different idea, then? It is through the combination of SIGNS which each may represent individual POTENTIAL concepts that I am able to express my thoughts to you. By agreement (and education) we are both aware of the potential meanings that a specific word might carry. Take for example this word (sign):

blue

When I show you that word in this context, what I want you to recognize is that by itself, I am merely describing its “sign”-ness. Those four letters in that combination form a word. That word when placed into context with other words may represent several different and distinct ideas. But by itself, it is all just potential. When you read that word above, you cannot tell if I’m going to mean one of the colors we both might be able to see, or if I might be about to tell you about an emotional state, or if I might describe the nature of the content of a comedian’s act I just saw…

While I can use that sign when I describe to you any of those specific meanings, in and of itself, absent of other symbols or context, it is just a sign with all of those ambiguous, potential meanings, but in the context of our discussion, it has no specific meaning.

It has a form, obviously, and it has been constructed following rules which

Photo of an Actual Stop Sign In Its Normal Context

you and I now tacitly understand. Just as a stop sign has been constructed following rules we have been trained to recognize.

Imagine now a warehouse at the Department of Transportation where a pile of new stop signs has been delivered. Imagine they are laid flat and stacked on a pallet, just waiting to be installed on a corner near you.

While they lay in that stack, they certainly have substance, and they each have the potential to mean something, but until they are placed into a proper context (at a corner by a road) their meaning is just as ambiguous as the word sign above. If you were driving a fork lift through the warehouse and came upon the pallet, would you interpret the sign right then as applying to you? Probably not! Could you say, just be looking at an individual instance of a sign, exactly which cars on which road it is intended to stop? No, of course not.

So this is the distinction between the sign and the meaning of a symbol. The sign is a physical construct. When placed into a recognized context, it represents a specific meaning. In that context, the sign will only carry that one specific meaning. If I make another instance of the sign and put it in a different context, while the signs may look the same, they will not mean the same, and hence I will have made two different symbols.

Just to be perfectly clear on the metaphor I’m presenting, here is a “pile” of signs (words) which I could use in a context to express meaning:

blue

blue blue

blue blue blue blue

Now let me use some of them and you will see that given a context (which in this case consists of other word signs and some typcal interpretations) I express different meanings (the thoughts in your head when you read them together):

once in a blue moon

blue mood

blue sky project

blue eyes crying in the rain

But make no mistake, while i have now expressed several different ideas to you using the same sign in different contexts, they are each, technically, NOT THE SAME SIGN AT ALL! Rather they are four examples of a type of sign, just as each of the stop signs on that pallet at the DoT are examples of a type of sign, but each is uniquely, physically its own sign! This subtlety is I think where a lot of people’s thinking goes awry, leading to conflation and confusion of the set of all instances of a sign with all of the concepts which the SET of signs represents.

To make this easier to see, consider the instance of the word (sign) “blue” above which I have colored red. That is a specific example of the “blue” sign, and it has a specific, concrete meaning which is entirely different from the word (sign) “blue” above which I have colored green.  The fact that both phrases have included a word (sign) of “blue” is almost coincidental, and does not actually change or alter the individual meanings of the two phrases on their own.

Finally, since I have belabored my nit-picking a bit, if I were to re-word your initial statement slightly to use the terminology I prefer on this site, It would change to:

But if you take more [signs], and put them together to get yet another meaning. Don’t you feel like those [signs] were again like symbols creating a new meaning?

And to this question, it should be clear, that my answer is “Yes, precisely: when you put other signs together, you create new meaning”.

Just What Is Meaning? A Lay Perspective

7/20/2005

The Origin of Symbols, Code and Meaning

Memories are NOT CODED. They are ANALOG recordings, not unlike phonograph records and the old photographs before the invention of digital cameras. There is some evidence that memories are stored in a manner similar to holographs within the medium of the brain. Memories may include recordings of coded information, this would be how symbols are recognized.

 Only when communication between brains is needed does CODE come into play. One brain must create appropriate SYMBOLS which represent the information. These symbols must be physicalized in some manner because the only input mechanism available to the other brain are the five senses of the body. Information is packaged and lumped, nuances and unimportant details are necessarily removed, symbols are selected and generated. If the other brain is receptive, then the symbols are sensed by the body, evoking the memory centers of the second brain. Communication is completed if the second brain understands the code and “remembers” the meaning in its own analog memory.

The Origin of Language

The brain records sensory inputs as memory. The mind constructs an internal symbol system describing the sensory information in ways the human body can communicate or relate the information. Details of the input which the mind cannot put a name to may be remembered or memorable, but cannot be communicated. Have you ever experienced something that you were unable to describe to someone else who had not experienced it?

Two people who have experienced the same or similar types of events can have a conversation about it and begin to form a language. Language is a shortcut to memory. It is the human capacity for the invention of vocabulary that sets them apart from other creatures (and from computers). If two people share a new experience, they’ll be able to talk about it by recognizing the same features in the sensory record and describing it in terms that evoke the same memory in the other person. Eventually, they’ll form a unique vocabulary of short hand symbolic terms and phrases to permit efficient communications. This is how strangers who meet at 12-Step Meetings are able to express and understand each other.

 But if only one of the two persons has experienced the events, there is no referent memory in one of the two. Think of the old saw “a picture is worth a thousand words”. Have you ever heard a new musical piece and tried to explain it to someone who hasn’t heard it? It takes a lot of explanation and yet is ultimately a failure.

 Consider another example: Wine tasting connoisseurs

 These people have an intense sensitivity to subtle features of taste and smell making their experience of wine very rich with information. More importantly, they have been able to attach vocabulary to these differences in unique ways that allows them to communicate with other wine experts. Of course, their success at communicating is predicated on the existence of other individuals with similar talents and experiences. When they try to explain to someone without the sensitivity of taste, their words merely confuse or sound hilariously out of place.

 This is one example of how “context” arises in human communication.

 What does this suggest for our major theme? 

  1. The features that are recognized in the sensory record are dependent first on the individuals whose senses recorded them
  2. The features that are chosen for communication are dependent on the interests and needs of the individuals doing the communicating. Other features that at first do not seem to contribute to the remembrance of the experience are often ignored or discounted.
  3. The vocabulary describing and naming these features is dependent on both the individual who sensed and on the people to whom they try to explain the sensation. Thru trial and error, the person who is trying to communicate will hit upon terms that find resonance in their audience.

Meaning Over Transformation

This entry is probably ahead of the story, but I wanted to start moving into this subject and I’m not yet organized. It should make more sense later on when I’ve explained such things as the “magical” function M() more thoroughly.

Review: The Magical Function “M()”

As a review for those who may not have seen this function previously on this site, I have invented a mysterious and powerful function over all things used as signs by humans. Named the “M()” function, I can apply it to any symbol or set of symbols of any type and it will return what that symbol represents. I call it the “M() function because it takes something which is a symbol and it returns its meaning (that’s all of its meaning).

How Meaning Carries Over Symbol Transformations

When we move information from one data structure to another, we may or may not use a reversible process. By this I mean that sometimes a transformation is a one-way operation because some of the meaning is lost in the transformation. Sometimes this loss is trivial, but sometimes it is crucial. (Alternatively, there can be transformations which actually add meaning through deductive reasoning and projection. SFAT (story for another time))

Whether a transformation loses information or not, there are some interesting conclusions we can illustrate using my magical, mysterious function M(). Imagine a set β of data structure instances (data) in an anchor state. The full meaning of that data can be expressed as M(β). Now imagine a transformation operation T which maps all of the data in β onto a second set of data Δ.

T : β |–> Δ such that for each symbol σ in β, there is a corresponding symbol δ in Δ that represents the same things, and σ <> δ

By definition, since we have defined T to be an identity function over the meaning of β, then we can conclude that if we apply M() before and after the transformation, we will find ourselves with an equivalence of meaning, as follows:

By definition: T(β) = Δ

Hence: M( T(β) ) ≡ M( Δ )

Also, by definition of T(), then M( β )  ≡ M( T(β) )

Finally, we conclude: M( β ) ≡ M( Δ )

Now, obviously this is a trivial example concocted to show the basic idea of M(). Through the manner by which we have defined our scenario, we get an obvious conclusion. There are many instances where our transformation functions will not produce equivalent sets of symbols. When T() does produce an equivalence, we call it a “loss-less” transformation (borrowing a term from information theory) because no information is lost through its operation.

Another relationship we claim can also be defined in this manner is namely that of semantic equivalence.  This should be obvious as well, from reflection, as I was careful above to refer to “equivalence of meaning”, which is really what I mean when I say two things are semantically equivalent. In this situation, we defined T() as an operation over symbols such that one set of symbols were replaced with a different set of symbols, and the individual pairs of symbols were NOT THE SAME (σ <> δ)! In a most practical sense, what is happening is that we are exchanging one kind of data structure (or sign) with another, such that the two symbols are not syntactically equivalent (they have different signs)  but they remain semantically equivalent. (You can see some of my thoughts on semantic and syntactic equivalence by searching entries tagged and/or categorized “equivalence” and “comparability“.)

A quick example might be a data structure holding a person’s name. Let’s say that within β the name is stored as a string of characters in signature order (first name  middle name  last name) such as “John Everett Doe”. This symbol refers to a person by that name, and so if we apply M() to it, we would recognize the meaning of the symbol to be the thought of that person in our head. Now by applying T() to this symbol, we convert it to a symbol in Δ, also constructed from a string data structure, but this time the name components are listed in phone directory order (last name, first name middle name) such as “Doe, John Everett”. Clearly, while the syntactic presentation of the transformed symbol is completely different, the meaning is exactly the same.

T(“John Everett Doe”) = “Doe, John Everett”

M( T(“John Everett Doe”) ) ≡ M( “Doe, John Everett” )

M( “John Everett Doe” ) ≡ M( T(“John Everett Doe”) )

M( “John Everett Doe” ) ≡ M( “Doe, John Everett” )

“John Everett Doe” <> “Doe, John Everett”

When the transformation is loss less, there is a good chance that it is also reversible, that an inverse transformation T ‘ () can be created. As an inverse transformation, we would expect that T ‘ () will convert symbols in Δ back into symbols in β, and that it will also carry the meaning with complete fidelity back onto the symbols of β. Hence, given this expectation, we can make the following statements about T ‘ ():

T ‘ (Δ) = β

M( T ‘ (Δ) ) ≡ M( β )

By definition of T ‘ (), then M( Δ )  ≡ M( T ‘ (Δ) )

And again: M( Δ ) ≡ M( β )

Extending our example a moment, if we apply T ‘ () to our symbol, “Doe, John Everett”, we will get our original symbol “John Everett Doe”.

Meaning Over “Lossy” Transformation

So what happens when our transformation is not loss-less over meaning? Let’s imagine another transformation which transforms all of the symbols σ in β into symbols ε in Ε. Again, we’ll say that σ <> ε, but we’ll also define T ‘ ‘ () as “lossy over meaning” – which just indicates that as the symbols are transformed, some of the meaning of the original symbol is lost in translation. In our evolving notation, this would be stated as follows:

T ‘ ‘ (β) = Ε

M( T ‘ ‘ (β) ) ≡ M( Ε )

However, by the definition of T ‘ ‘ (), then M( β )  !≡ M( T ‘ ‘ (β) )

Therefore: M( β ) !≡ M( Ε )

In this case, while every symbol in β generates a symbol in Ε, the total information content of Ε is less than that in B. Hence, the symbols of the two sets are no longer semantically equivalent. With transformations such as this, the likelihood that there is an inverse transformation that could restore β from Ε becomes more unlikely. Logically, it would seem there could be no circumstances where β could be reconstituted from Ε alone, since otherwise the information would have been carried completely across the transformation. I don’t outright make this conclusion, however, since it depends on the nature of the information lost.

An example of a reversible, lossy transformation would include the substitution of a primary key value for an entire row of other data which in itself does not carry all of the information for which it is a key, but which can be used in an index fashion to recall the full set of data. For example, if we created a key value symbol consisting of a person’s social security number and last name, we could use that as a reference for that person. This reference symbol could be passed as a marker to another context (from β to Ε, say) where it could be interpreted only partially as a reference to a person. But which person and what other attributes are known about that person in the new context Ε if we define the transformation in such a way that all of the symbols for these other attributes stay in β? Not much, making this transformation one where information is “lost” in Ε.  However, due to its construction from β, the key symbol could still be used on the inverse transformation back to β to reconstitute the missing information (presuming β retains it).

An example of a one-way transformation might be one that drops the middle name and last name components from a string containing a name. Hence, T ‘ ‘ ( “John Everett Doe” ) might be defined to result in a new symbol, “John”. Since many other symbols could map to the same target, creating an inverse transformation without using other information becomes impossible.

Example of How Meaning Is Attached to Structure

What follows is a detailed example of the thought process followed by a software developer to create a class of data structures and how meaning is attached to those structures.

Consider that the meaning of one data structure may be composed of the collection of meanings of a set of smaller structures which themselves have meaning. Take the following description as the meaning to be represented by a structure:

An employee is a human being or person. Each employee has a unique identity of their own. Each employee has a name, which may be the same as the name of a different person or employee. Being human, each employee has an age, calculated by counting the number of years since they were born up to some other point in time (such as present day).Each person of a certain age may enter into a marriage with another human being, who in turn also has their own identity and other attributes of a person.

To represent this information using data structures (i.e., to project the meaning of this information onto a data structure), we might tie the various concepts about a human being/employee to a computer-based data structure. Recognizing that a human being is an object with many additional characteristics of which we might want to know about, we might choose to project the concept of “human beings” or “people” onto a relational table and the concept of a particular individual onto one of that table’s rows (or a similar record structure).

This table would represent a set of individual human beings, and onto each row of the table would be projected the meaning of a particular human being. Saying this again in a more conventional manner, we would say that each row of the table will reference a singular and particular human being, the all of the rows will represent the set of all human beings we’ve observed in the context of our usage of the computer system.

In a more mathematical vein, we would define a projection Þ from the set of actual human beings Α onto Š, (Þ(Α) |–> Š), the set of data structures such that for any α in Α where α is a human being, there is a record or row σ in Š that represents that human being.

A record data structure being a conglomeration of fields, each of which can symbolically represent some attribute of a larger whole, then we might project additional attributes of the human being, such as their name and identifier, to particular fields within the record. If σ is the particular record structure representing a particular human being, α, then the meaning (values) of the attributes of that person could be associated with the fields, f1..fn, of that record through attribute-level projections, ψ1..ψn for attributes 1 .. n.

To represent a particular person, first we would project the reference to the person to a particular row, Þ(α) |–> σ, then we would also project the attribute facts about that person onto the individual fields of that row:

ψ1(α.1) |–> σ.f1

ψn(α.n) |–> σ.fn

Projection onto Relational Structure

When modeling a domain for incorporation into computer software, the modeler’s task is to define a set of structures which software can be written to manipulate. When that software is to use relational database management systems, then the modeler will first project the domain concepts onto abstract relational structures defined over “tuples”. These abstract structures have a well-defined mathematical nature which if followed provides very powerful manipulations. The developer projects meaning onto relations in a conventional way, such as by defining a relation of attributes to represent “PERSON” – or the set of persons, and another relation of attributes to represent “EMPLOYEE” – or the set of persons who are also employees. Having defined these relational sets, the relational algebra permits various mathematical operations/functions to be applied, such as “JOIN” and “INTERSECTION”. These functions have strictly defined properties and well-defined results over arbitrary tuples. The software developer having projected meaning onto the individual relations, he is also therefore able to project meaning on the outcomes of these operations which can then be used to manipulate large sets of data in an efficient, and semantically correct way.

As the developer creates the software however, they must keep in mind what these functions are doing on two levels, at the level of the set content and at the level of the represented domain (the referent of the sets and manipulations). Thus the intersection of the PERSON and EMPLOYEE relations should produce the subset of tuples (records, etc.) which has its own meaning derived from the initial projected meaning of the original sets. Namely, this intersection represents the set of PERSONS who are also EMPLOYEES, (which is the same, alternatively, as the set of EMPLOYEES who are also PERSONS). This is an important point about software: the meaning is not simply recorded in the data structure but the manipulations of the data by the computer themselves have specific connotations and implications on the meaning of data as it is processed.

Representational Redundancy

As a typical practice in the projection of information onto data structures within the relational model, there will usually be a repetition of the information projected onto more than one symbol. In particular, the reference to the identity of a single person will be represented both by the mere existence of a single row in the table, and also by a subset of fields on the row which the software developers have chosen (and which the software enforces) for this purpose. In other words, under common software development practices, each record/row as a conglomerate entity will represent a single person. In addition, there will be k attributes (1 <= k <= n) on that record structure whose values in combination also represent that same individual. These k attributes make up the “primary key” of the data structure. The software developer will use and repeat these columns on multiple data structures to permit additional concepts regarding the relationship between that person and other ideas also being recorded. For example, a copy of one person record’s primary key could be placed on another person record and be labelled “spouse”. The attributes which make up the primary key often have less mechanical meanings as well (for example, perhaps the primary key for our person includes the name attribute. As part of the primary key, the name value of the person merely helps to reference that person. It also in its own right represents the name of the person.

How Meaning Attaches to Data Structures: A Summary

What follows is a high level summary of how humans attach meaning to various kinds of data structures within a computer. It will serve as a good baseline account, though certainly not an exhaustive one, providing a model upon which more detailed dicussion can begin. 

 Background Terminology

Computer systems provide functionality to support the performance and record of business processes. They do that through three inter-related features: DATA, LOGIC, and PRESENTATION. The presentation consists of information displays permitting both an information visualization aspect and an information capture aspect. The logic consists of several aspects, much of it having to do with support of the presentation and manipulation of displays, but also a lot of it having to do with creation, transformation and storage of data. Data consists of sets of symbols constructed in a systematic, regular fashion using a set of data structures. Different data structures are constructed to represent different aspects of the recorded activity. It is in the relationships between the macro and micro structures where the specific detailed information captured.generated by the business process resides. By following a codified, rigid construction of its data structures, the computer system is able to record multiple recurring instances of similar events. Through the development of fixed transformations using program logic, the computer system is able to make routine, conventional conclusions about those events or observations, and it is able to maintain and retain those observations virtually indefinitely.
Data is maintained and stored in DATA STRUCTURES. The more regular these data structures are, the more easily they are interpreted by a broad audience of software developers. In most situations, the PRESENTATION of the data captured by a system to the end user of that system is in a more directly understandable form than the way that information is stored in the computer.  (This statement is not only trivially true, but in a very deep sense too, since the computer actually stores everything using more and more complex sequences of binary digits. That’s a different subject than our current presentation.)  The data structures within the computer system typically exist in two, simultaneous forms, one intended to support human reasoning (through what is often called a “logical”, “abstract” or “conceptual” model) and one supporting manipulations by the computer. Most software developers today strictly deal with the abstract model of the data for design, coding, and discussion. (There are still some developers working in assembly level code, but even that is at a more abstract level than the actual electro-mechanical machinations of the actual hardware!)
An obvious observation, at least on its face, is that different computer systems will store data representing similar ideas using different structures. We need to keep this in the back of our minds as we progress through the rest of this discussion, but it will be more directly adressed in other entries.
 A final thought concerns sets of data of similar structure, called a POPULATION. A population of data consists of some set of data symbols, all constructed using the same data structure pattern which represents a set of similar ideas. The classification of populations of data structures applies to the DATA portion of systems, represents an analogous classification of sets of observed events external to the computer system, and is affected by and affecting the LOGIC and PRESENTATION portions of the computer system. A more detailed definition of the notion of a “population” will also be treated in separate sections.

Commonalities of Structure

Many computer systems, especially those built in support of business (or other human activity) processes, are constructed using a conventional system of abstract data structures. (When I say they are “conventional” what I mean is that the majority of software developers follow conventional patterns for the construction of data structures to represent their idiosynchratic subject areas.) Whether these structures are called “objects”, “tables”, “records”, or something else, they typically take the form of a heterogenous collection of smaller structures grouped together into regular conglomerations. Instances or examples of the larger collections of data structures will each be said to “represent” individual intances of some real-world conglomerate. Each of the individual component element structures of these conglomerations will each be said to represent the individual attributes or characteristics of the real-world conglomerate object. In order to permit efficient processing by the computer,   instances of similar phenomenon will be represented by the same kind of conglomeration.
Typically, business systems will be based on a data structure called a RECORD.  Records consist of a series of “attribute data structures” all related in some fashion to each other. (A more complex structure called an “object” still has record-like attributes combined together to represent a larger whole, the nuances and variation of object-based representation is a subject for later.)  Each RECORD will stereotypically symbolize one instance of a particular concept. This could be a reference to and certain observed details of a real-world object, or it could be something more ephemereal like observations of an event. For example, one “PERSON” record would represent a single individual person.
RECORDS themselves consist of individually defined data elements or FIELDS. Each RECORD of a particular type will share the same set of FIELDS. Each FIELD will symbolize one kind of fact about the thing symbolized by the RECORD. For example, a NAME field on a PERSON record will record what the represented individual’s name is, at least as it was at the time the record was created. 
The set of all records within a system having the same structure will typically be collected and stored together, often in a data structure called a TABLE. Each TABLE will symbolize the set of KNOWN INSTANCES of whatever type of thing each record represents. TABLES are also described as having ROWS and COLUMNS. Each row of a table is one RECORD. The set of shared element-attribute structures across the set of  rows can be described as the “columns” of the table. Each column represents the set of all instances of a FIELD in the table, in other words, the same field across all records. Tables are a commonly used data structure because they readily support interpretation using relational algebra and set theoretic operations, as well as being easily presented and understood both by human and computer.  

Basic Data Structures and Their Relationships

The nomenclature of “record”, ” table”, “row”, “column” and “fields” describes the construction building blocks of an abstract syntactic medium whose usage permits humans to represent complex concepts within the computer system. By assigning names to various collections and combinations of these generic structures, humans project meaning onto them. Using diagrams called “data models”, a short hand of sorts allows the modeler to describe how the generic tables and fields relate to each other and what these relationships signify in the external world. These models also, by virtue of the typified short hand they use, allows for the generation of computer logic that can be applied to a database to support certain standard operations and manipulations of the data generated by a computer system.

Traditional data modeling results in the creation of a data dictionary which relates each structural element to a particular kind of concept. Every structure will be given a name, and if the developers are diligent, these can be associated with more fully realized text descriptions as well. Some aspects of the data structures are not described, at least typically, within a data model, such as populations or subsets of records with similar structures.

Traditional data dictionary entries record name and description of the set of all structures contained in a table. Using a set of structures to represent a set or collection of similar objects is itself a symbolic action. So not only does each row in a table represent one instance of some type of thing, and each column represents one observed (or derived) fact or attribute of that instance, but the collection of all instances of these row data structures also represents the logical set or population of these things.

The strategy for applying meaning to these data structures begins when the decision is made to treat the entirety of each record as the representation of a member of a population of like things. Being similar, then, a set of fields is conceived to capture various detailed observations regarding the things. These fields are intended to capture details about both how each thing is different from the other things in the collection, but also how different things may share similarities. Much of the business logic of the application system will be consumed by the comparisons between individual things, and the mathematical derived counts (and other metrics) of those sets of things (and of subsets within). Using the computer to compare the bit sequences contained in each field, the computer will indicate whether these contents are the same or different between different instances. Humans will then interpret the results of these comparisons by projecting the conclusion out of the computer and into the conceptual world.

For example, let’s say that we have defined the computer sequence “10101010” to represent a reference to a specific person, “Julie Smith”. If we take two different instances of bit sequences and compare them in the computer, the computer will tell us if they are the same or not. As humans, we would then interpret the purely electro-mechanical result which the computer calculated that “10101010” and “10101010” are the same as an indication that the two instances of these sequences represent the same specific person. Likewise, we would interpret a computer result indicating that two bit sequences were not the same as an indication that different people were being referred to.  This type of projection of meaning from mechanical result to logical inference is fundamental to the way humans use computers.

The specific number of fields and their bit sequence representations (data types)  that are developed within a computer application is entirely dependent on the complexity of the problem domain and the attributes of the objects required to reason over that domain. However, no matter how simple or complex, it is the projection of meaning onto the representation of these attributes in the computer and the projection of an interpretation onto the results of the computer comparisons of the physical representations which makes the computer the powerful engine that it is in our society.

How Row Subsets Represent Subpopulations
How Row Subsets Represent Subpopulations

 

The Syntactics of Speech: What a Language Permits You to Say Is Less Than What You Know

I found this article intensely interesting. It corroborates and validates some of my own ideas about how language and symbols are used in communication. Namely, it suggests that even though a language does not contain structures and syntactic rules allowing for precise designation of a concept, that does not mean that such a concept cannot be communicated and understood by someone who uses that language. It just may take a lot more time to convey the thought. It may also be difficult to confirm the listener’s understanding because the language they have available to respond is the same one as the original message (which we said could not directly convey the meaning).

NY Times article

%d bloggers like this: