The Syntactics of Speech: What a Language Permits You to Say Is Less Than What You Know

I found this article intensely interesting. It corroborates and validates some of my own ideas about how language and symbols are used in communication. Namely, it suggests that even though a language does not contain structures and syntactic rules allowing for precise designation of a concept, that does not mean that such a concept cannot be communicated and understood by someone who uses that language. It just may take a lot more time to convey the thought. It may also be difficult to confirm the listener’s understanding because the language they have available to respond is the same one as the original message (which we said could not directly convey the meaning).

NY Times article

Comparability: How Software Works

Back in 1990, I was working on a contract with NASA building a prototype database integration application. This was the dawn of the Microsoft Windows era, as Windows 3.0 had just been released (or was about to be). Oracle was still basically a start-up relational database vendor trying to reach critical mindshare. The following things, which we take for granted today (and even think of as somewhat outdated), did not yet exist:

  • ODBC – allowing standardized access to databases from the desktop
  • Microsoft Access and similar personal data management utilities
  • Java (in fact most of the current web software stack was still just the twinkles in the eyes of their subsequent inventors)
  • Message-based engines, although EDI techniques existed
  • SOA and XML data formats
  • Screen-scrapers, user simulators, ETL utilities…

The point is, it was still largely a research project just to connect different databases that an enterprise might be using. Not only did the data representational difficulties that we face today exist back in 1990, but there was also a complete lack of infrastructure to support remote connection to databases: from network communication protocols, to query interfaces, to security and session continuity functions, even to standardized query languages (SQL was not the dominant language for accessing data back then), and more.

In this environment, NASA had asked us to prototype a generic capability that would permit them to take user search criteria, and to query three different database applications. Then, using the returned results from the three databases, our tool was to generate a single, unified query result.

While the prototype was generally successful, it became clear to NASA and to us during a critical review that maintaining such an application would be horribly expensive, so the research effort was ended, and the final report I wrote was delivered, then put into the NASA archives. It is just as well too, because within five years, many of the functional capabilities we’d prototyped had started to become available in more robust, standards-based commercial products.

What follows is a handful of excerpts from the final report, which, while now out of context, still express some important ideas about how software symbols actually work. The excerpts describe how software establishes the comparability, and sometimes the equivalence of meaning, of the symbols it manipulates.

In a nutshell, software works with memory addresses with particular patterns of voltage (or magnetic field direction) representing various concepts from the human world. Software is constantly having to compare such “structures” in order to establish either equivalence of meaning, or to alter meaning through the alteration of the pattern through heavily constrained manipulations. The key operation for the computer, therefore, is to establish whether or not two symbols are “comparable”. If they are not comparable, quite literally, then the computer cannot reliably compare them and produce a meaningful result.

Without further ado, here are the important excerpts from the research study’s final report, which I wrote and delivered to NASA in November 1990.

“Database Integration Graphical Interface Tools, Future Directions and Development Plan”, Geoff Howe, November 1990

2.2 The Comparability of Fields

There are many kinds of comparisons that can be made among fields. In databases, the simplest level of comparability is at the data type level. If two fields have the same simple data type (e.g., integer, character, fixed string, real number), then they can be compared to each other by a computer. This level of comparability is called “basal comparability”. Thus, if fields A and B are both integers, they can be combined, compared and related in any way appropriate for two integers.

However, two elements meeting the qualification for basal comparability may still be incomparable at the next level, that of the syntactic level. The syntactic level of comparability is that level in which the internal structure of a field becomes important. Examples of internal formats which might matter and might be important at this level include date formats, identification code formats, and string formats. In order to compare two fields in different formats, one or the other of these fields would have to be converted into the other format, or else both would have to be converted into a third format. The only meaningful comparisons that can be made among the fields of a database or databases must be made at the syntactic level.

As an example, suppose A is a field representing a date in Julian format, and suppose B is a field representing a date in Gregorian format. Assuming that both fields are stored as integers, comparing these dates would be meaningless because they lack the same syntactic structure. In order to compare these dates one or the other of these dates would have to be converted into the other format, or else both would have to be converted into a third format.
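The date example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report does not specify either layout, so the sketch assumes "Julian format" means an ordinal date packed as YYYYDDD and "Gregorian format" means YYYYMMDD, and the function names are hypothetical.

```python
from datetime import date, timedelta

def from_julian_int(value: int) -> date:
    """Decode an integer assumed to be packed as YYYYDDD (year, day of year)."""
    year, day_of_year = divmod(value, 1000)
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

def from_gregorian_int(value: int) -> date:
    """Decode an integer assumed to be packed as YYYYMMDD."""
    year, rest = divmod(value, 10000)
    month, day = divmod(rest, 100)
    return date(year, month, day)

a = 1990305    # field A: day 305 of 1990 (assumed YYYYDDD layout)
b = 19901101   # field B: 1 November 1990 (assumed YYYYMMDD layout)

# Basal comparison: both are integers, so the machine will compare them,
# but the result says nothing about the dates they encode.
raw_equal = (a == b)

# Syntactic comparison: convert both fields into a third, common format first.
normalized_equal = (from_julian_int(a) == from_gregorian_int(b))
```

Here the raw integer comparison reports the two fields as different, while the comparison after conversion reports the same calendar day.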

Unfortunately, having the same syntactic structure is not a guarantee that two fields can be compared meaningfully by a computer process. Rather, syntactic comparability is the minimum requirement for meaningful comparison by computer process. Another form of comparability must be incorporated as well, that of semantic comparability. Semantic comparability is based on the equivalence of the meanings attached to the contents of some pair of data items. The semantics of data items are not readily available to computer processes directly; a separate description in some form must be used to allow the computer to understand the semantic equivalence of concepts. Once such representation is in place, the computer should be able to reason over the semantic equivalence of concepts.

As an example of semantic comparability, consider the PCASS fields ITEM PART NUMBER from the FMEA PARTS table of the PCASFME subsystem, and CRIT LRU PART # from the CRITICAL LRU table of the PCASCLRU subsystem. Under certain circumstances, both of these fields will hold the part numbers of “line replaceable units” or LRUs. Hence, these fields are semantically comparable. Given a list of the contents of ITEM PART NUMBER, and a similar list for CRIT LRU PART #, the assumption can be made that some of the same “line replaceable units” will be referenced in both lists.

Semantic comparability is useful when integrating data from different databases because it can be used to indicate the equivalence of concepts. Yet, semantic comparability does not imply syntactic comparability, and thus both must be present in order to satisfactorily integrate the values of fields from different databases. A definition of the equivalence of fields across databases can now be offered. Two fields are equivalent if they share the same base type; if their internal syntactic structure is the same; if their representational domains are the same; and if they represent the same concept in all contexts.
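The layered definition can be read as a series of predicates over field descriptions. The sketch below is one hypothetical way to write it in Python; the `FieldDescriptor` class and its attribute names are assumptions, as is the twenty-character length given to the second field. A single `concept` string cannot express "the same concept in all contexts", so the last test is only an approximation of the report's definition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FieldDescriptor:
    name: str
    base_type: str  # basal level, e.g. "char" or "integer"
    syntax: str     # syntactic level, e.g. "char(12)" or "YYYYMMDD"
    domain: str     # representational domain of the values
    concept: str    # semantic level: what the contents mean

def basally_comparable(a: FieldDescriptor, b: FieldDescriptor) -> bool:
    return a.base_type == b.base_type

def syntactically_comparable(a: FieldDescriptor, b: FieldDescriptor) -> bool:
    return basally_comparable(a, b) and a.syntax == b.syntax

def semantically_comparable(a: FieldDescriptor, b: FieldDescriptor) -> bool:
    return a.concept == b.concept

def equivalent(a: FieldDescriptor, b: FieldDescriptor) -> bool:
    """All four conditions of the definition must hold at once."""
    return (syntactically_comparable(a, b)
            and a.domain == b.domain
            and semantically_comparable(a, b))

# Illustrative descriptors; the second field's length is invented.
item_part_number = FieldDescriptor(
    "ITEM PART NUMBER", "char", "char(12)", "part numbers", "LRU part number")
crit_lru_part_no = FieldDescriptor(
    "CRIT LRU PART #", "char", "char(20)", "part numbers", "LRU part number")

same_meaning = semantically_comparable(item_part_number, crit_lru_part_no)
same_format = syntactically_comparable(item_part_number, crit_lru_part_no)
fully_equivalent = equivalent(item_part_number, crit_lru_part_no)
```

With these invented descriptors the two fields share a meaning but not a format, so they are semantically comparable without being equivalent.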

2.3 Heterogeneous Data Dictionary Architecture

 The approach which seems to have the most documentary support in the research for solving the integration of heterogeneous distributed databases uses a two-tiered data dictionary to support the construction of location-independent queries. The single data dictionary, used by both the single-site database management system and the homogeneous distributed environment, is split in two across the physical-conceptual boundary. This results in a two-level dictionary where one level describes in detail the physical fields of each integrated database, and the second level describes the general concepts stored across systems. For each unique concept represented by the physical level, there would be an entry in the conceptual level data dictionary describing that concept. Figure 2 shows the basic architecture of the two-level data dictionary.

As an example of the difference between the conceptual and physical data dictionary levels, consider again the field PCASFME.FMEA PARTS.ITEM PART NUMBER. This is the full name of the actual field in the PCASS database. The physical level of the data dictionary would have this full name, plus the details of how this field is represented (character string, twelve places long). The conceptual level of the data dictionary would contain a description of the contents of the field, and a conceptual field name, “line replaceable unit part number”. Other fields in other tables of PCASS or in other databases may also have the same meaning. This fact poses the problem of mapping the concept to the physical field, which will be described below. Notice, however, how much easier it would be for a user to be able to recall the concept “line replaceable unit part number”, as opposed to the formal field name. This ease of recall is one of the major benefits of the two-level data dictionary being proposed.

Two important relationships exist between the conceptual and physical data dictionaries. One of the relationships between fields of the conceptual level data dictionary and fields of the physical level data dictionary can be characterized as one-to-many. That is, one concept in the conceptual data dictionary could have many physical implementations. Identification of this type of relationship would be a matter of identifying and recording the semantic equivalences across system boundaries among fields at the physical level. All physical fields sharing the same meaning are examples of this one-to-many relationship.

Within the PCASS system, the concept of a “line replaceable unit part number” occurs in a number of places. It has already been mentioned that both the ITEM PART NUMBER field of the FMEA_PARTS table, and the CRIT LRU PART # field of the CRITICAL_LRU table, represent this concept. The relationship between the concept and these two fields is, therefore, one-to-many.

The second type of relationship which may also be present, depending on the nature of the existing databases, relates several different concepts to a single field. This relationship is characterized as “many-to-one”. Systems which have followed strict database design rules should result in a situation where every field of the database represents one and only one concept. In practical implementations, however, it is often the case that this rule has not been thoroughly implemented, for a variety of reasons. Thus it is more than likely, especially in large database systems, that some field or set of fields may have more than one meaning under various circumstances. Often, these differences in meaning will be indicated by the values of other associated fields.

As an example of this type of relationship, consider the case of the ITEM PART NUMBER field of the PCASS table FMEA PARTS in the FMEA dataset one more time. This field can have many meanings depending on the value of the PART TYPE field in the same table. If PART TYPE is set to “LRU”, the ITEM PART NUMBER field contains a line replaceable unit part number. If PART TYPE is set to “SRU”, the ITEM PART NUMBER field actually contains a shop replaceable unit part number. Storing both kinds of part numbers in the same structure is convenient. However, in order to use the ITEM PART NUMBER field properly, the user must know how to read and set the PART TYPE field to disambiguate the meaning of any particular instance of the record. Thus, the PART TYPE field in the physical database must hold either an “SRU” or “LRU” flag to indicate the particular meaning desired at any one time.
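A small Python sketch of this many-to-one case follows. The rows and part numbers are invented for illustration, and the helper names are hypothetical; only the field names and the “LRU”/“SRU” flag values come from the report.

```python
# Invented sample rows standing in for records of the FMEA PARTS table.
rows = [
    {"ITEM_PART_NUMBER": "A-100", "PART_TYPE": "LRU"},
    {"ITEM_PART_NUMBER": "B-200", "PART_TYPE": "SRU"},
    {"ITEM_PART_NUMBER": "C-300", "PART_TYPE": "LRU"},
]

# The flag field decides which concept the same physical field holds.
CONCEPT_BY_PART_TYPE = {
    "LRU": "line replaceable unit part number",
    "SRU": "shop replaceable unit part number",
}

def concept_of(row: dict) -> str:
    """Return the concept that ITEM_PART_NUMBER represents in this row."""
    return CONCEPT_BY_PART_TYPE[row["PART_TYPE"]]

def part_numbers_for(concept: str, records: list) -> list:
    """Collect the part numbers whose row carries the requested meaning."""
    return [r["ITEM_PART_NUMBER"] for r in records if concept_of(r) == concept]

lru_parts = part_numbers_for("line replaceable unit part number", rows)
sru_parts = part_numbers_for("shop replaceable unit part number", rows)
```

Reading ITEM_PART_NUMBER without consulting PART_TYPE would mix the two lists together, which is exactly the ambiguity described above.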

In the heterogeneous environment, it may be possible to find a different database in which the same two concepts which have been stored in one field in one database are stored in separate fields. It may in fact be possible that in one or more databases, only one of the two concepts has been stored. This is certainly the case among the separate data sets which make up the PCASS system. For example, in the PCASCLRU data set, only the “line replaceable unit part number” concept is stored (in the field CRIT_LRU_PART_#). For this reason, the conceptual level of the data dictionary must include both concepts. Then there must be some appropriate construct within the data definition language of the data dictionary system which could express the constraints under which any particular field had any particular meaning. In order to be useful in raising the level of data location transparency, these conditional semantics must be entered into the data dictionary using this construct.

It is obvious now that the relationship between entries in the conceptual data dictionary and the physical data dictionary is truly many to many (see Figure 3). To implement such a relationship, using relational techniques, a third major structure (in addition to the set of tables supporting the conceptual data dictionary and the set of tables supporting the physical data dictionary) must be developed to mediate this relationship. This structure is described in the next section.

2.3.1 Conceptual – Physical Data Mapping

As an approach to implement this mapping from conceptual to physical structures, a table must be developed which relates every concept with the fields which represent it, and every field with the concepts it represents. This table will consist of tautological statements of the semantic equivalence of physical fields to concepts. A tautology is a logical statement that is true in all contexts and at all times. In this approach, the tautologies take the following form (please note that the “==” operator means “is semantically equivalent to”, not “is equal to”):

 normalized field f == field a from location A

 The normalized field f of the above example corresponds directly to an entry in the conceptual data dictionary. We call the field, f, normalized to indicate that it is a standard form. As will be described later, the comparison of values from different databases will be supported by normalizing these values into the representation described in the conceptual data dictionary for the normalized field.

Conditional semantics must now be added to the structure to support discussion. Given a general representation for a tautology, conditional semantics may be represented by adding logical operations to the right side of the equivalence. Assume that a new database, D, has a field, d1, which is equivalent to the normalized field, f, but only when certain other fields have specific values. Logically, we could represent this in the following manner:

normalized field f == field d1 from location D iff
field d2 from location D = VALUE1 AND
field d3 from location D = VALUE2 AND …
field dn from location D opn VALUEn

 In more general terms, the logical statement of the tautology would be as follows:

 R == P iff  E

where R is the normalized field representation, P is the physical field, and E is the set of equivalence constraints which apply to the relation. In our part number example, the following tautologies would be stored in the mapping:

Line Replaceable Unit Part Number == PCASFME.FMEA_PARTS.ITEM_PART_NUMBER iff PCASFME.FMEA_PARTS.PART_TYPE = “LRU”

Shop Replaceable Unit Part Number == PCASFME.FMEA_PARTS.ITEM_PART_NUMBER iff PCASFME.FMEA_PARTS.PART_TYPE = “SRU”

Line Replaceable Unit Part Number == PCASCLRU.CRITICAL_LRU.CRIT_LRU_PART_#
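The three tautologies can be held in a single mapping structure that is searchable from either side, which is what makes the many-to-many relationship workable. The Python sketch below is one hypothetical representation: the `Tautology` type and the lookup functions are assumptions, and the physical names are written in a subsystem.table.field pattern with underscores.

```python
from typing import NamedTuple, Optional

class Tautology(NamedTuple):
    concept: str               # R: the normalized (conceptual) field
    physical_field: str        # P: the fully qualified physical field
    constraint: Optional[str]  # E: condition text, or None if unconditional

MAPPING = [
    Tautology("Line Replaceable Unit Part Number",
              "PCASFME.FMEA_PARTS.ITEM_PART_NUMBER",
              "PCASFME.FMEA_PARTS.PART_TYPE = 'LRU'"),
    Tautology("Shop Replaceable Unit Part Number",
              "PCASFME.FMEA_PARTS.ITEM_PART_NUMBER",
              "PCASFME.FMEA_PARTS.PART_TYPE = 'SRU'"),
    Tautology("Line Replaceable Unit Part Number",
              "PCASCLRU.CRITICAL_LRU.CRIT_LRU_PART_#",
              None),
]

def fields_for(concept: str) -> list:
    """One concept, many physical fields."""
    return [t.physical_field for t in MAPPING if t.concept == concept]

def concepts_for(physical_field: str) -> list:
    """One physical field, possibly many concepts."""
    return [t.concept for t in MAPPING if t.physical_field == physical_field]

lru_fields = fields_for("Line Replaceable Unit Part Number")
item_concepts = concepts_for("PCASFME.FMEA_PARTS.ITEM_PART_NUMBER")
```

Looking up the LRU concept returns two physical fields, and looking up ITEM_PART_NUMBER returns two concepts, so the same table answers both directions of the relationship.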

The condition statements are similar to condition statements in the SQL query language. In fact, this similarity is no accident, since these conditions will be added to any physical query in which ITEM PART NUMBER is included.
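To make the idea concrete, here is a hedged Python sketch of how a query over a concept might be expanded into one physical query per mapped field, with the stored condition appended as a WHERE clause. The `MAPPING` layout and the `physical_queries` function are assumptions for illustration, not the prototype's actual design; table and column names are inserted verbatim, and a real system would need to quote them according to its SQL dialect.

```python
# concept -> list of (table, column, condition or None); layout is hypothetical.
MAPPING = {
    "Line Replaceable Unit Part Number": [
        ("PCASFME.FMEA_PARTS", "ITEM_PART_NUMBER", "PART_TYPE = 'LRU'"),
        ("PCASCLRU.CRITICAL_LRU", "CRIT_LRU_PART_#", None),
    ],
    "Shop Replaceable Unit Part Number": [
        ("PCASFME.FMEA_PARTS", "ITEM_PART_NUMBER", "PART_TYPE = 'SRU'"),
    ],
}

def physical_queries(concept: str) -> list:
    """Build one SELECT per physical field that can represent the concept."""
    queries = []
    for table, column, condition in MAPPING[concept]:
        sql = f'SELECT "{column}" FROM {table}'
        if condition:
            # The equivalence constraint travels with the field into the query.
            sql += f" WHERE {condition}"
        queries.append(sql)
    return queries

lru_queries = physical_queries("Line Replaceable Unit Part Number")
```

The user names only the concept; the condition on PART_TYPE is supplied by the mapping rather than by the person writing the query.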

From a user’s point of view, implementing this feature allows the user to create a query over the concept of a line replaceable unit part number without having to know the conditions under which any particular field represents that concept. In addition, by representing the general concept of a line replaceable unit part number, something the user would be very familiar with, this conceptual mapping technique has also hidden the details of the naming conventions used in each of the physical databases.

2.4.2 Integrating Data Translation Functions Into the Data Dictionary

In the simplest case, the integration of data translation functions into the data dictionary would be a matter of attaching to the data mapping tautologies described above a field which would store an indication of the type of translation which must occur to transform a result from its location-specific form into the normalized form. This approach can be simplified further by allowing translations at the basal level to be identified by the source and target data types involved, and not recording any further information about the translation. It may not be unreasonable to assume that in certain well-defined domains, most of the translation functions required would be either identity functions or simple basal translation functions.

It is now possible to define completely the data structure required to store any arbitrary physical-conceptual field mapping tautology. The data structure would consist of the following parts:

  • concept field – a single, unique concept which the physical projection represents
  • normalized field – a reference to the conceptual data dictionary entry used to represent the concept
  • physical projection – the field or set of fields from the physical data dictionary which under the conditions specified in the equivalence constraints represent the concept
  • equivalence constraints – the conditions under which the physical projection can be said to represent the concept
  • translation function – the function which must be performed on the physical projection in order to transform it into the normalized format of the normalized field

The logical statement of the tautology would be as follows:

R = Ft (P) iff E

where R is the normalized field representation, Ft is the translation function over the physical projection, P, and E is the set of equivalence constraints which apply to the relation. The exact implementation of this data structure would depend on the environment in which the system were to be developed, and would have to be specified in a physical design document. Note that the “==” sign, which was defined above as “is semantically equivalent to”, has been replaced by “=”, which means “is equivalent to” and is a stronger statement. The “=” implies that not only is the left side semantically equivalent to the right, but it is also syntactically equivalent.
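The five-part structure and the statement R = Ft(P) iff E map naturally onto a small record type. The Python sketch below is hypothetical throughout: the class name, the attribute names, the `lru_part_number` normalized field name and the trim-and-uppercase translation are assumptions chosen only to show the shape of the idea.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

@dataclass(frozen=True)
class MappingTautology:
    concept: str                         # the concept being represented
    normalized_field: str                # entry in the conceptual dictionary
    physical_projection: Tuple[str, ...] # P: one or more physical fields
    constraint: Callable[[dict], bool]   # E: when P represents the concept
    translate: Callable[..., str]        # Ft: P -> normalized format

    def normalize(self, row: dict) -> Optional[str]:
        """Return R = Ft(P) when E holds for this row, otherwise None."""
        if not self.constraint(row):
            return None
        values = [row[name] for name in self.physical_projection]
        return self.translate(*values)

# Assumed normalized format: trimmed, upper-case text.
lru_from_fmea = MappingTautology(
    concept="Line Replaceable Unit Part Number",
    normalized_field="lru_part_number",
    physical_projection=("ITEM_PART_NUMBER",),
    constraint=lambda row: row["PART_TYPE"] == "LRU",
    translate=lambda value: value.strip().upper(),
)

r1 = lru_from_fmea.normalize({"ITEM_PART_NUMBER": " ab-100 ", "PART_TYPE": "LRU"})
r2 = lru_from_fmea.normalize({"ITEM_PART_NUMBER": "cd-200", "PART_TYPE": "SRU"})
```

An identity translation would simply be `lambda value: value`, which matches the report's expectation that most translation functions in a well-defined domain are trivial.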

Unmanage Master Data Management

Master Data Management is a discipline which tries to create, maintain and manage a single, standardized conceptual information model of all of an enterprise’s data structures. It takes as its goal that all IT systems will eventually be unified under a single semantic description, so that information from all corners of the business can be understood and managed as a whole.

In my opinion, while I agree with the ultimate goal of information interoperability across the enterprise, I disagree with the approach usually taken to get there. A strategy that I might call:

  • Data Management with Multiple Masters
  • Uncontrolled/Unmanaged Master Data Management
  • Associative Search on an Uncontrolled Vocabulary
  • Emergent Data Management (added 2015)
  • Master-less Data Management (added 2015)

takes a different approach. The basic strategy is to permit multiple vocabularies to exist in the enterprise (one for each major context that can be identified). Then we build a cross reference of the semantics only describing the edges between these contexts (the “bridging” contexts between organizations within the enterprise), where interfaces exist. The interfaces that would be described and captured in this way would include non-automated ones (e.g., human mediated interfaces) as well as the traditionally documented software interfaces.

Instead of requiring that the entire content of each context be documented and standardized, this approach would provide the touchpoints between contexts only. New software (or business) integration tasks which the enterprise takes on would require new interfaces and new extensions of mappings, but would only have to cover the content of the new bridging context.

Information collected and maintained under this strategy would include the categorization of data element structures as follows:

  1. Data structure syntax and basic manipulations
  2. Origin Context and element Role (for example, markers versus non-markers)
  3. Storage types: transient (not stored), temporary (e.g., staging schemas and work tables), permanent (e.g., structures which are intended to provide the longest storage)
  4. “Pass-through” versus “consumed” data elements. Also called “traveller” and “fodder”, these data structures and elements have no meaning and possibly no existence (respectively) in the Target Context.

For data symbols that are just “passing through” one context to another, these would be the traveller symbols (as discussed on one of my permanent pages and in the glossary) whose structure is simply moved unchanged from one context to the next, until it reaches a context which recognizes and uses them. “Fodder” symbols are used to trigger some logic or filter to change the operation of the bridging context software, but once consumed, do not move beyond the bridge.
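As a rough illustration of the distinction, the Python sketch below shows a bridging function that passes travellers through untouched, consumes fodder to steer its own behaviour, and translates everything else into the target context's vocabulary. Every name in it (`trace_token`, `route_hint`, `cust_no`, `customer_id`, the `bridge` function) is invented for the example.

```python
TRAVELLERS = {"trace_token"}          # carried through, not interpreted here
FODDER = {"route_hint"}               # used by the bridge, then dropped
RENAMES = {"cust_no": "customer_id"}  # origin name -> target name

def bridge(message: dict):
    """Move a message from the origin context into the target context."""
    outgoing = {}
    route = "default"
    for name, value in message.items():
        if name in FODDER:
            # Fodder changes how the bridge operates and goes no further.
            route = value
        elif name in TRAVELLERS:
            # Travellers cross unchanged, structure and name intact.
            outgoing[name] = value
        else:
            # Ordinary elements are mapped into the target vocabulary.
            outgoing[RENAMES.get(name, name)] = value
    return route, outgoing

route, delivered = bridge(
    {"cust_no": "C17", "trace_token": "xyz", "route_hint": "priority"}
)
```

Only the elements that cross the bridge need documenting: the rename, the traveller and the fodder. Nothing about the interior of either context appears in the mapping.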

The problem that I have encountered with MDM efforts is that they don’t try to scope themselves to what is RECOGNIZABLY REQUIRED. Instead, the focus is on the much larger, much riskier effort of attempting to eliminate local contexts within the enterprise. MDM breaks down the moment it becomes divorced from a practical, immediate attempt to capture just what is needed today. The moment it attempts to “bank” standard symbols ahead of their usage, the MDM process becomes speculative and prescriptive. The likelihood of wasting time on symbology which ultimately is wrong and unused is very high once steps past the interface and into the larger contexts are taken.

Uses of Metamorphic Models in Data Management and Governance

In the Master Data Management arena, Metamorphic Models would allow the capture of the data elements necessary to stitch together an enterprise. By recognizing the information needed to pass as markers or to act as travellers, the scope of the data governance task should be reducible to a practical minimum.

Then the data governance problem can be built up only as needed. The task becomes, properly, just another project-related activity similar to Change Control and Risk Management, instead of the academic exercise into which it often devolves.

The scope of data management should focus on and document 100% of the data being moved across interfaces, whether these interfaces are automated or human-performed. Simple data can just be documented, and the equivalence of syntax and semantics captured. Data elements that act as markers for the processes should be recorded. Also all data elements/structures intended merely to make the trip as travellers should be indicated.

This approach addresses the high-value portion of the enterprise’s data structures, while minimizing work on documenting concepts which only apply within a particular context.

What’s in a Name: Not That Much, Actually

The referenced paper is seminal. The comments that appear here are largely unaltered from when I first wrote them back in 1989. I follow this older writing with some additional conclusions, looking back over twenty years of experience working with data.

September 23, 1989:

When parsing a record-based system’s data, the software developer is faced with all of the problems of data structure semantics described by W. Kent (William Kent, “Limitations of Record-Based Information Models”, ACM Transactions on Database Systems 4(1), March 1979; also in John Mylopoulos and Michael Brodie (eds), Readings in Artificial Intelligence and Databases, Morgan Kaufmann, San Mateo, California, 1989 [20 pp]).

Field naming problems can be handled by naming all fields with a field number, then providing synonyms for all fields. I gave each field a “name” similar to the name in the original system, which was possibly meaningless. This name was to allow for maintenance and information mapping between systems. Then, using synonyms, I could give a more semantically significant name to the field. The record is just a place keeper – the concept represented is buried in the code supporting the use of the record, or perhaps in agreement (explicit or implicit) among the designers and users of the system. When this agreement is verbal, or worse, implied by training, that’s when the trouble arises: idiosyncratic usage enters the picture, along with the possibly disastrous loss of meaning accompanying the departure of those whose concept is being represented.

November 1, 2009:

This note was just one of several ideas I was toying with as I worked on a thesis paper for my Masters. The project I was working on was to integrate and add expert system capabilities (using Prolog) to an existing business application built on top of COBOL fixed record structures. What it describes is the idea I used to get around the very badly named columns of the COBOL records in order to improve the effectiveness and readability of the Prolog code. The basic trick was to put into the Prolog knowledgebase multiple names for the same data structures and attach to these Prolog structures logic statements that permitted the statement (in nearly human-language terms) of logical constraints.
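The original work was done in Prolog; the same idea can be sketched in Python, which I use here only for illustration. The record layout, the field numbers and all of the names are made up: each field is identified by a number, and any number of synonyms, including the cryptic legacy name, resolve to that number.

```python
# A fixed-width record in the style of a COBOL layout (contents invented).
RECORD = "0001234" + "SMITH".ljust(10) + "19890923"

# Field number -> (start, end) offsets within the record.
FIELDS = {
    1: (0, 7),
    2: (7, 17),
    3: (17, 25),
}

# Several names may point at the same numbered field.
SYNONYMS = {
    "CUST-NO-X": 1,        # hypothetical legacy name, kept for maintenance
    "customer_number": 1,  # semantically richer synonym
    "CUST-NM-X": 2,
    "customer_name": 2,
    "LST-ORD-DT": 3,
    "last_order_date": 3,
}

def get(record: str, name_or_number) -> str:
    """Fetch a field by number or by any of its synonyms."""
    number = SYNONYMS.get(name_or_number, name_or_number)
    start, end = FIELDS[number]
    return record[start:end].strip()

by_legacy_name = get(RECORD, "CUST-NM-X")
by_synonym = get(RECORD, "customer_name")
by_number = get(RECORD, 2)
```

All three lookups return the same value, which is the point: the machine only needs the number, and the names exist for the people reading the code.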

In later years, I have come to recognize that this problem of naming conventions within code, while important to an extent, is not as important as some practitioners think. The fact of the matter is that the computer couldn’t care less what the column name of a table is, or the variable name within a program, etc. For all the computer cares, so long as the programming code references the right data structure at the right moment consistently, the actual references might as well be unique, semantically meaningless numbers.

Naming conventions are for the humans who have to write and maintain the code, or, more generally, who have to directly interact with the data structures. And while there can often be contentious, protracted debate amongst software developers on the “right” naming convention for various situations, in my mind, it is not usually worth the amount of attention it gets during development.

If left to my own devices, the naming convention I try to impose is as richly semantic as possible. Column names and table names come as close as possible to expressing the intended content, down to adding qualifying adjectives and role names to an appropriate, context-specific noun. The context I select the name from is defined by the context of the problem domain for which the software is being written. I also try to be very consistent in the use of names and name parts from one end to the other of whatever system I’m working on.

If the system already has a naming convention, so long as it can be written down in a set of repeatable rules, I’ll use whatever it is. Oftentimes I find I have to rationalize and standardize terms used previously, due to the fact that at different times, different developers may have used different conventions.

I have participated in efforts at making a universal naming convention, and these have all ultimately hit a wall and been stopped (the reasons for this have been to this point the primary subject of this blog – even if I haven’t explicitly described the scenario yet). Namely, the cross-context politics, long initial duration, required ongoing maintenance activities and ultimately the diminishing returns of such efforts cause them to sink from their own weight.

But even when I have had complete control over the data structure development, and I have had time to craft the “perfect” name for each column, even when I’ve checked and double checked and triple checked that I have consistently applied the same naming convention from one end of the system to the next, once my software has gone into use, it hasn’t taken long for the user community to start redefining the meaning of some aspect of the data structure. Or, the requirement changes and the programming team must change the usage of one of my finely-crafted data structures so that it supports a new meaning, not reflected in that finely crafted name.

This can be frustrating, and it can also pose a long term hazard to the maintenance of the system, as either the original meaning or the new meaning becomes a minority of the usage. But it is not the end of the world, and it does not always break the software if the code is changed to handle the new meaning correctly.

However, it does mean that the actual name of the field no longer reflects the contents it holds. But if the code is working properly, the name no longer matters to the operation of the system. Plus, the maintenance problem such a change presents is also no big deal, so long as the revised meaning is captured in an appropriate dictionary and made available to the programming team for future reference.

Why is this the case? The real truth is that the data structure stores symbols which have a meaning within a context defined by the USERS of the software. The data structures merely represent the SYNTAX of the symbols, consisting of the data type of the symbol, and the manipulations of the symbol performed by the code. So long as the manipulations are applied appropriately to the correct part of the syntax, no matter HOW it is named, then the software will manage the MEANING intended by the USERS, in spite of, not because of, the naming convention of the data structure.

Hence, what’s in a name used on a data structure? From the computer’s point of view, not so much. From the human’s point of view, since the meaning can change over time, the name shouldn’t be trusted until the code has been reviewed to confirm the content. So there again, not so much…

Living in My Own, Personal Semiosphere

I am sure I’m not getting this right when I read these seminal papers on the “semiosphere”, beginning with Juri Lotman’s “On The Semiosphere” (Sign Systems Studies 33.1, 2005). I have to admit that the text has me a bit confused. On the one hand, Juri defines the semiosphere as an analog to the biosphere, a large, all-pervading expanse of interconnected life on our planet. On the other hand, as he describes its features (what it is and what it is not), he describes examples of something which can be quite a bit smaller than the entirety of semantic discourse in the world. This includes the semiospheres of countries, language groups, and professional practitioners.

In other words, what I would call contexts.

Taking from this the idea that a semiosphere represents the sum total aggregate of the symbolic space around this context, I had a vision of myself, walking with a sphere of communication techniques and examples (language, art, gesture, expression) floating about me. This cloud represented not just anything that I had ever said or written (or otherwise communicated) but included the entirety of what I might ever say, or be able to say.

The sum total of everything I will ever be able to communicate.

And then I thought of two of us coming together, each with our own spheres of semiotics, including personal and community symbols, and an ability to recognize and quickly adapt to contexts known to us. I imagine the interplay of our own personal semiospheres, one to the other, as we begin to try to communicate.

Having brought with ourselves the entirety of our communicative arsenal, we lob niceties and platitudes at each other, then observe which ones hook together in the shared semiotic space surrounding us. Not all of our personal spheres can be fit together: oil and water cannot mix, even if we give them both the name “liquid”.

On first encounter, we may only recognize “the weather” and “the place” as subjects shared and in common. But as we meet over time, and we remember what connections we made before, we build the “bridge” of communication between us, and this bridge becomes our starting point for subsequent communication (in other words, our context).

Umwelt and Semiosphere

Found this fascinating thread on Juri Lotman and his school of thought regarding the “semiosphere”, which he posited as being similar to a “biosphere”. Still reading, but wanted to capture this thread of discussion.

[Lotman] Umwelt and Semiosphere

Software as Semantic Choice

When I design a new software system, I have to choose what parts of reality matter enough to capture in the data (data is little bits of information stored symbolically and in great repetitive quantities). I can’t capture the entirety of reality symbolically; software is another example in life of having to divide an analog reality into discrete named chunks, choosing some and leaving others unmentioned.

This immediately sets the system up for future “failure” because at some point, other aspects of the same reality will become important. This is what in artificial intelligence is called “brittleness”, a quality which bedeviled the expert system movement and kept it from becoming a mainstream phenomenon. This is also a built-in constraint on semantic web work, but I’ll leave that for another post.

Taking quantum physics research as an example, there’d be no point in writing one application to capture both the speed and position of a quantum particle in a database, because as we all know, only one or the other of these data points is available to us to measure at any one time. Thus we choose to capture the one that’s important to our study, and we ignore the other.

This is why a picture is worth a thousand words: because it is an analog of reality and captures details that can remain unnamed until needed at a future time.

This is also why we say that in communication we must “negotiate reality”. We must agree together (software developer and software user) what parts of reality matter, and how those parts are named, recognized, and interact.

In reading a recent thread on Library Science, it sounds like in the “indexing and abstracting” problem (used to set up a searchable space for finding relevant documents), a choice has to be made on what we think searchers will most likely bring with them in order to find the information they seek. But by virtue of making one choice, we necessarily eliminate other choices we might have made which may have supported other seekers better.

This is an interesting parallel, and I must assume that I’ll find more as this dialog continues.
