Following the path of least effort

08-12-2010
In November, professor emeritus Howard White visited both departments of the Royal School of Library and Information Science. He held discussions with researchers and students and presented, among other things, his work on Relevance Theory.


Howard White has a PhD in librarianship from the University of California, Berkeley, and is now professor emeritus at Drexel University’s College of Information Science and Technology. His areas of expertise include bibliometrics and co-citation analysis, social science data archives, and literature retrieval for meta-analysis.

Measuring relevance
One of the lectures was on "Relevance Theory and Citations". Relevance Theory, first introduced by Sperber and Wilson, deals with the relevance of communications and how this is determined by their cognitive effects and the effort needed to process them. Relevance can be thought of as the ratio of cognitive effects to processing effort:


Relevance = Cognitive effects / Processing effort


The formula illustrates that the greater the cognitive effects, the greater the relevance will be, and that the less the processing effort, the greater the relevance will be. For instance, if one were to give a lecture using only mathematical examples to an audience without a mathematical background, the information would not be very relevant to them - the processing effort would be too great compared to the cognitive effects, and thus the relevance would be diminished.

Choosing citations
Over his career, White has done a lot of research on the subject of citation analysis, for which he received the Derek J. de Solla Price medal in 2005. His studies show that neither grammar nor rhetorical rules dictate the citation choices of authors. When authors cite the works of others, they usually do it via the path of least effort, meaning they often choose works which require little processing effort. They frequently cite themselves (as it is easy to relate present to past work), members of their in-group (acquaintances or co-workers) and orienting figures from their reading. Therefore, the choices tend to be rather stereotyped over time.

Relevance theory in new contexts
White examined citation data to investigate whether relevance theory can be used to link bibliometrics and information retrieval. He has invented a new type of visualization, called pennant diagrams, to illustrate the connections. In addition to Relevance Theory, the idea rests on the well-known TF*IDF formula for weighting search terms in relevance rankings:


Weight = term frequency * inverse document frequency


As he makes clear in his article "Pennants for Strindberg and Persson", White related this weighting formula to the formula for relevance and applied it to bibliometric distributions, showing that relevance theory can explain the function of tf*idf in information retrieval in a new way. The article was published in a Festschrift for Olle Persson in June 2009.
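
As a rough illustration of the tf*idf weighting named above, the following Python sketch computes the weight of a term from raw counts. It is not taken from White's article: the function name tf_idf, the example counts, and the choice of a base-10 logarithm for the inverse document frequency are all assumptions made for this example (log-scaling idf is a common convention, but the formula as quoted here does not specify it).

```python
import math


def tf_idf(term_count, doc_count, total_docs):
    """Weight = term frequency * inverse document frequency.

    term_count: how often the term occurs in the item being weighted
    doc_count:  how many documents in the collection contain the term
    total_docs: size of the collection
    The idf part is log10(total_docs / doc_count), an assumed convention.
    """
    if term_count <= 0 or doc_count <= 0:
        return 0.0
    idf = math.log10(total_docs / doc_count)
    return term_count * idf


# Hypothetical counts, invented for illustration only.
total_docs = 1000
weights = {
    # Occurs in every document: idf is log10(1) = 0, so the weight is 0.
    "common": tf_idf(term_count=5, doc_count=1000, total_docs=total_docs),
    # Occurs in only 10 documents: idf is log10(100) = 2, so the weight is 10.
    "rare": tf_idf(term_count=5, doc_count=10, total_docs=total_docs),
}

for term, weight in weights.items():
    print(term, weight)
```

With equal term frequencies, the term found in fewer documents receives the higher weight, which is the effect the inverse document frequency is meant to produce.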

Defining information science
White also gave an open lecture on "Information Science and Relevance Theory" in which he provided concrete examples of how relevance theory can be used to define major terms in information science more clearly. These definitions can then help to create themes, connecting various parts of information science, such as research on relevance judgments in document retrieval and research on least effort in information-seekers' behavior.


By Helle Saabye