Thursday, March 15, 2018

Do indexes dream of humans?

I tweeted a link to a book indexing conference with a comment about indexing maybe being the stuff of thought. Reiver kindly replied with some tweets on how search and indexing are viewed in computer programming. So here I'm trying to explain my thoughts more thoroughly, and then address the difference in computer programming. The following are my mild musings, wild speculations, and some thoughts whimsical. I welcome constructive comments.


Imagine a host of nodes, or if you prefer something more concrete, biologists.
They are using many terms that may mean different things to each of them.
The meanings are already there; they just aren't evenly distributed, or consistent.


I've heard a possibly apocryphal story about L. Wittgenstein: after a conference he said, in exasperation, that it sometimes seemed to him that (some very high percent) of philosophy was discussing word meanings. (I've heard different versions about just how high a percent.) Wittgenstein explained language as a kind of eternal guessing game where participants tried to discern the meanings of others. It seems simple and natural to humans but is in fact extremely vague and complex. We would now call it a kind of instinct blindness. Like how fish minds might think about swimming in water: of course everyone knows how to do it - it's so simple!


Chomsky, perhaps inspired in part by the emergence of computational linguistics around the 1960s, argued against Skinner's book Verbal Behavior that the possibilities for babies learning the meanings of words were under-determined. He argued that there was a 'combinatorial explosion' of possible meanings babies could ascribe to teacher-behaviors, and that there must be some kind of cognitive module providing a scaffolding for grammar and the learning of word meanings.


Pinker in The Language Instinct explicitly discussed such scaffolding as evolved, and in the context of reasoning instincts - humans are more capable of reason because we have evolved new instincts, not because we have lost instincts.


Imagine a host of biologists, before Carl Linnaeus. There are many taxonomies, often conflicting, designed around different ideas of what it is best to optimize (and around scientists' egos too, no doubt). Eventually the systems merge or become forgotten and what remains is the Linnaean system. (One might alternatively look at the nomenclature rules in chemistry, for example.)


At one level of description, indexes are pages at the back of a book with names of ideas and where those ideas are discussed in the text. Rephrasing, such indexes are collections of pointers to the meaning of ideas. (Aside: A very common index item is ".. (word), definition of" ..) It seems to me that in a very real sense scientific taxonomies are also collections of pointers to ideas. Only they are guides to how to converse rather than to a page where an author discusses an idea.


Okay then, that is all at the level of communication between individuals. What about within?


"Neurons that fire together wire together." - quip about Hebb's learning rule (1949). But a quip does not an explanation make. And even if it did here, I'm more interested in meaning and communication theory than the rabbit hole of neurology, at this point.


How do different parts of the brain know they are communicating about the same thing? Currently brain waves are being touted as the unifying principle. Okay. They probably have something to do with it. Maybe a lot. But I'm much more interested in functional explanations than mechanical ones.  More ultimate causation and less proximate causation. Sometimes you can understand all the proximate causation you want, but still not understand the functioning. (Which is probably a good way of roughly describing the state of knowledge today about individual neurons.)


There have been some explicit comparisons of the hippocampus to a kind of index to the memories in the brain. (Sorry, no cite on hand. Only general memory.) I like this comparison. It makes sense to me that the index should be in a particular place, centralized, in one of the most phylogenetically ancient structures of the brain. I like to imagine a brain wave travelling across the hippocampus like the eyes of a reader travelling down the entries of an index.


So this is what I implied about indexing being at the heart of the thought process - The meanings of a concept are distributed in different brain regions, like discussions in the text of a book, and linked together and accessible from different small areas of the brain's index. This is speculation. Or a thought in process perhaps.


(By the way, the part of the brain that includes the hippocampus continues on to form the amygdala, which has over the decades been a subject of much interest in brain studies of autistics. I've wondered just about as long if the claims around enlarged amygdalae in autistics have been somehow overlooking enlarged hippocampi in autistic savants.)


Now as to what Reiver tweeted about the terms search and index in computer science. First of all, thank you. That's important information. By the way, cognitive psychology, which often draws on computers as metaphors for the human brain, seems to define the terms similarly. (Search seems to be often operationalized in cognitive psychology as eye saccades, or some such thing.) Defining the terms that way is very useful, particularly in computer science. They are defined in terms of discrete proximal operations by actual equipment and registers you can point at.

But for me they are the same function, only one is an accumulated operation of the other. In short, roughly speaking perhaps, indexes are accumulated searches. Indexes make searches faster because they have done all the searching beforehand. Perhaps I will think on this more and in the future come to a different view. But for now, some metaphors -

In psychometrics the general intelligence factor is sometimes broken down into crystallized intelligence and fluid intelligence. (Gc and Gf.)

In accounting you have Income Statements covering a period of time, and Balance Sheets at a point in time. Usually the Retained Earnings section of Balance Sheets shows the change in retained earnings from the prior balance sheet, in effect summarizing the income statement. Roughly speaking, Balance Sheets are an accumulation, summation, and transformation of all the previous Income Statements.

Maybe all those metaphors don't make any sense. Maybe they don't even make that much sense to me. But hopefully they help.

Perhaps explaining a little something about how indexers index might help. Typically they carefully read through the text (sheets from the publisher sometimes) and mark up the likely index main headings and sub-headings directly on the sheets. Then they type the headings and the page numbers and such into one of the professional software programs, which is mostly a sorting program (but not entirely.) Those are the easy parts. Then comes the editing. Deleting and re-organizing old entries, looking up new entries, over and over and over.  I knew one highly experienced indexer who said he usually spent twice as much time editing as all the other tasks. The point is that the first part of indexing is search in reverse. You are at the location where the topic is mentioned, and you then sort that into the constructed index. The editing process involves looking at the collection of links/locators/locations in the initial index, and .. working the indexer magic to make sense of it and make it useful.
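
For the programmers, here is a minimal sketch in Python of the "search in reverse" idea, and of an index as accumulated searches. It is only an illustration under my own assumptions: the page texts, the page numbers, and the names PAGES, search, and build_index are all invented for the example, and it leaves out the editing stage entirely, which is where the real indexer magic happens.

```python
from collections import defaultdict

# Hypothetical page texts; the wording and page numbers are invented.
PAGES = {
    1: "hebb proposed a learning rule for neurons",
    2: "the hippocampus may act as an index to memory",
    3: "an index points to where each idea is discussed",
    4: "memory and learning depend on the hippocampus",
}


def search(term, pages):
    """Search: reread every page each time a term is asked for."""
    return [number for number, text in sorted(pages.items())
            if term in text.split()]


def build_index(pages):
    """Indexing as search in reverse: stand at each location and
    file that location under every term found there."""
    index = defaultdict(list)
    for number, text in sorted(pages.items()):
        for term in set(text.split()):
            index[term].append(number)
    return dict(index)


# All the searching is done once, up front.
INDEX = build_index(PAGES)

# One lookup in the index gives the same answer as a fresh search.
print(INDEX["hippocampus"])
print(search("hippocampus", PAGES))
```

In this toy version every index entry is exactly the stored answer to one search, which is the sense in which I mean that indexes are accumulated searches. A real back-of-the-book index is of course far more selective than a list of every word on every page.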

By the way, my impression is that in psychometrics people still write about Gc and Gf in the journals, but it doesn't seem they do it as much as they used to. I suspect it's because the tools being used to measure general intelligence tended to fall into the categories of measuring by way of crystallized intelligence (mostly vocabulary or world-knowledge tests) or fluid intelligence (computations, insights, etc.). Those issues don't seem to dominate academic interests the way they used to.

And that is why for me the concepts of search, index, and thought are all right next to each other in my hippocampi.

And now some things whimsical.

Why are back-of-the-book indexes organized alphabetically? Why are they not organized more like a detailed table of contents, according to the structure of the material in the author's mind? It seems to be because the readers may have a different sense of how the meanings are related to one another, and what would be a logical ordering to the author is not necessarily the same for the readers. Of course there is always the table of contents there at the front of the book for the reader to access instead. But I find it curious that the injection of a kind of randomness, a somewhat arbitrary ordering of the meanings in a text, can actually improve the communications from the author to the reader. (Of course, being alphabetical makes it easier to directly look up a term of interest. But again, the term of interest then is de-contextualized from the overall structure of the meaning of the book as the author sees it.)

Do hippocampi dream of editing indexes?
It has often seemed to me that the accessing of never-before-accessed old memories results in very clear images at first, but they rapidly degrade. What if dreams are the by-products of accessing memory components in the process of editing out the old, no-longer-accessed items in the memory index? It might help explain why so many dreams go unremembered.
