Sunday, May 12, 2019

“Lies, damned lies, and statistics”

The title of the eighth chapter of From Scratch: Writings in Music Theory, the University of Illinois Press collection of articles by music theorist and composer James Tenney, is “The Chronological Development of Carl Ruggles’s Melodic Style.” The essay was written in 1977; and those who have been following Tenney’s intellectual development by reading the chapters of From Scratch in their chronological order might expect that this would be an article in which Tenney takes his theoretical speculations about phenomenology and tries to put them into practice by examining the relatively limited (but decidedly revolutionary) repertoire of a single composer. Such readers need go no further than the last sentence of the first paragraph to abandon such expectations:
In this paper I shall report the results of some statistical analyses of Ruggles’s melodic lines, carried out with the aid of a computer.
Readers who now look at the above headline can probably see where things are going. I was rather pleased to see that this phrase actually has its own Wikipedia page. On that page one may read the “source sentence” taken from an article by Mark Twain entitled “Chapters from My Autobiography.” Wikipedia reproduced that sentence as follows:
Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: “There are three kinds of lies: lies, damned lies, and statistics.”
In the context of this particular Tenney article, that sentence deserves to be paired with another quotation:
To err is human, but to really foul things up you need a computer.
The Quote Investigator Web site has an impressively extensive discussion about the many possible origins of this witticism.

Those who have been following my progress through From Scratch may recall that I concluded my last article with a rather waspish dig about “misconceptions that Tenney probably picked up from Lejaren Hiller at the University of Illinois.” Hiller was trained as a chemist but became known as a pioneer in the domain of computer music, a rather loosely defined area of both research and practice. The misconception I had in mind involved the technical term “entropy” as it is applied to the discipline of information theory. From a mathematical point of view, information theory amounts to a specialization within probability theory and its relation to statistical analysis. Statistical analysis is, in turn, a technique that has such a history of abuse that, in 1954, the journalist Darrell Huff published a book that remains one of the best plain-speaking accounts of the discipline. Huff chose to title his book How to Lie with Statistics.

Now I have far too much respect for Tenney and his efforts to accuse him of being a liar. However, I fear that he may have fallen victim to at least some of the misconceptions that Huff made it a point to explain. Most important is the basic principle that statistical analysis can only be applied to what is usually called a “representative sample space.” Put in plain talk: You are trying to identify some common property shared by a very large number of objects. That number is too large for you to examine all of those objects, so you limit your analysis to a smaller number of them. That limited quantity is your “sample space;” and there are a variety of statistical tools one may apply to determine whether the small number of objects in the sample space accurately represents that larger collection of objects.
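For readers who like to see such things in action, here is a minimal sketch in Python of the idea in plain talk. Everything in it is my own assumption for the sake of illustration: the "population" is ten thousand made-up numbers, the function name sample_gap is hypothetical, and comparing means is only one of many possible checks on how well a sample represents the larger collection.

```python
import random
import statistics

def sample_gap(population, sample_size, seed=0):
    """Return (population mean, sample mean, absolute difference).

    The sample is drawn at random without replacement; the seed is
    fixed only so that the illustration is repeatable.
    """
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)
    pop_mean = statistics.mean(population)
    samp_mean = statistics.mean(sample)
    return pop_mean, samp_mean, abs(pop_mean - samp_mean)

# Hypothetical "population": 10,000 made-up measurements.
rng = random.Random(42)
population = [rng.gauss(50, 10) for _ in range(10_000)]

# Compare samples of increasing size against the whole population.
for n in (10, 100, 1000):
    pop_mean, samp_mean, gap = sample_gap(population, n)
    print(n, round(pop_mean, 2), round(samp_mean, 2), round(gap, 2))
```

The size of the gap is only a rough indicator. A sample can match the population mean and still misrepresent it in other respects, which is very much Huff's point.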

Tenney’s decision to apply statistical tools to the melodic lines in Ruggles’s compositions must be grounded in the way in which he chose to define a sample space. However, his approach to analysis presumed that the elements of that space (i.e., the notes in those melodic lines) had some underlying uniformity that made them all “equal” with respect to the mathematical tools applied to them. The problem is that, in just about any piece of music, there are always some notes that (to paraphrase George Orwell) are more equal than others. In other words, whether in the act of composing or in the act of performing, there is almost always some underlying sense that differentiates an embellishing note from the note it is embellishing. Thus, the methodology behind Tenney’s data analyses levels a playing field that is not supposed to be level.
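A toy sketch in Python may make the point concrete. I should stress that this is not Tenney's program, and the melodic fragment is not by Ruggles. The pitches, the labels marking which notes are embellishments, and the function name interval_counts are all my own hypothetical choices. The sketch simply counts melodic intervals twice: once treating every note as "equal" and once after setting the embellishing notes aside.

```python
from collections import Counter

# Hypothetical melodic fragment: (MIDI pitch, is_embellishment).
# The labels are assumptions made by hand, not the result of any analysis.
melody = [
    (60, False),
    (62, True),   # treated here as a passing tone
    (64, False),
    (65, True),   # treated here as an upper neighbor
    (64, False),
    (67, False),
]

def interval_counts(pitches):
    """Count the signed semitone intervals between successive pitches."""
    return Counter(b - a for a, b in zip(pitches, pitches[1:]))

# Every note counted alike: the "level playing field."
all_notes = [pitch for pitch, _ in melody]

# Only the notes being embellished.
structural = [pitch for pitch, is_emb in melody if not is_emb]

print(interval_counts(all_notes))
print(interval_counts(structural))
```

The two tallies share only a single interval. The first is dominated by steps, while the second consists of two leaps and a repeated note. Which tally better describes the "melodic style" depends entirely on the decision about which notes count, and that decision is not a statistical one.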

Now, to be fair, sorting out the embellishing from the embellished in a Ruggles score is no easy matter. Because of the constraints of copyright, only one Ruggles composition can be found on IMSLP, a very early song titled “Toys.” Finding those melodic lines that Tenney decided to analyze is similarly difficult; and, even if one limits oneself to the vocal part, resolving questions of embellishment is a significant challenge. Nevertheless, between changes in duration and breath commas, it is a challenge that can be met given adequate time to work with the content. (One can also take advantage of a perfectly good recording of a performance of “Toys.”)

I suppose the underlying fallacy in Tenney’s essay arises from his using mathematical tools that apply to abstract constructs. However, the concrete qualities of his “source data” cannot be reduced to such abstract constructs. As a result, the output of his techniques can reflect only the marks on paper that try to document the music, not the properties of the music that Ruggles actually composed.

To be fair, Tenney was neither the first nor the last to be ensnared by this methodological trap. When I first started doing my own research, it did not take me long to recognize the shortcomings of Hiller’s work in both theory and practice. Ironically, 1977 was the year in which I sat as Respondent in the Current Advances in Computer Methods session of the Twelfth Congress of the International Musicological Society. At that time much of my active research involved using the programming language LISP to translate the somewhat cryptic graphic system of notation used by Heinrich Schenker into more formal symbol structures that better captured Schenker’s own representations of relations between the embellishing and the embellished. The more of Tenney’s work that I read, the more I regret that our paths never really crossed!
