Tuesday, September 20, 2016

Reproducing Music Without a Microphone

I suppose that my background in technology has a lot to do with my fascination with player pianos. Back when I was teaching at the University of Pennsylvania in the Seventies, I knew someone who was passionate about restoring old player pianos; and it was from him that I first learned that, in the most advanced models, there could be up to five additional columns of holes that would control the amplitude, in effect five binary “bits” capable of encoding 32 distinct loudness levels. The makers of player pianos understood the digital-to-analog converter long before the electrical circuit for such a device had been built!

Nevertheless, there is a downside to that story. It involves the promotion of the Welte-Mignon, a reproducing piano that was first introduced in 1904. To promote the instrument, members of the Welte family sought out as many of the best pianists of that period as they could muster, inviting them all to allow their performances to be captured by their technology. The reason I can remember that there were five additional columns comes from a story I heard about what happened when Artur Schnabel was approached. Whichever Welte was making the pitch to him boasted that their system could capture 32 levels of loudness. Schnabel replied curtly, “I have 33;” and refused to make any recordings for the instrument!

In this respect it is interesting that MIDI messages are built from eight-bit bytes but reserve one bit of each data byte for signaling, so a value such as loudness (“velocity”) gets only seven bits, meaning that, unless extended by different interpretation software, the representation of amplitude has 128 levels (with a velocity of zero conventionally treated as silence, since it serves as a note-off). I was actually reluctant to do anything with MIDI when it first appeared. This was not because of Schnabel but because I was interested in microtonal tuning at the time and had not yet encountered a representation of pitch that could accommodate the one I had developed for my own EUTERPE language, which divided the octave into 72 equal parts.
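For what it is worth, later MIDI practice did find a workaround for microtonality: since MIDI note numbers step in 100-cent semitones, a pitch between the keys can be expressed as the nearest note number plus a pitch-bend offset. The sketch below shows one way a 72-EDO step (1200/72 ≈ 16.67 cents) could be mapped this way; the function name and the assumption of a ±2-semitone bend range (a common synthesizer default) are mine, and this has nothing to do with how EUTERPE itself worked.

```python
# Sketch: mapping 72-EDO (72 equal divisions of the octave) onto MIDI.
# Each 72-EDO step is 1200/72 cents; MIDI notes step in 100-cent
# semitones, so pitch bend fills the gap between adjacent keys.
# Assumes a pitch-bend range of +/- 2 semitones (a common default).

def edo72_to_midi(step, bend_range_semitones=2.0):
    """Convert a 72-EDO step number (step 0 = MIDI note 0) to a
    (midi_note, pitch_bend) pair. pitch_bend is the 14-bit MIDI
    value, where 8192 means no bend."""
    semitones = step * 12.0 / 72.0       # pitch in fractional semitones
    midi_note = round(semitones)         # nearest equal-tempered key
    offset = semitones - midi_note       # residual, in semitones
    bend = 8192 + round(offset / bend_range_semitones * 8192)
    return midi_note, bend

# Middle C is MIDI note 60, i.e. 72-EDO step 60 * 6 = 360;
# step 361 is one 72-EDO step (about 16.67 cents) higher.
note, bend = edo72_to_midi(361)   # note 60 with a slight upward bend
```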

However, when I moved to Singapore in 1991, I knew that I was not going to bring along my Baldwin grand. So I visited a Yamaha showroom and got myself a Clavinova with sampled sounds and support (sort-of) for three functioning pedals. (“Sort-of” meant that the “virtual dampers” controlled how long a note would be sustained but not how it would cause other open strings to reverberate.) Because this instrument supported MIDI control, it was not long before I bought a Macintosh, perched it on the instrument, and started playing around with MIDI software.

In Singapore Yamaha ran a music school in conjunction with selling instruments. I got to know one of the piano teachers, and she started asking me questions about MIDI. I happened to have one piece of software that would provide a numerical representation of MIDI content in terms of time (both start and end), piano key, and degree of loudness. I showed her a printout of my own performance of Wolfgang Amadeus Mozart’s K. 2 minuet (which has only 24 measures). Once I explained what the numbers meant, she immediately started to critique my performance; and she could explain what she was saying in terms of the “raw data” of the numbers themselves.
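The essence of that printout is simple to reconstruct: raw MIDI arrives as separate note-on and note-off events, and the software pairs them up into rows of start time, end time, key, and loudness. The sketch below is my own guess at that pairing logic, not the actual program; the event format and sample data are hypothetical.

```python
# Sketch of tabulating MIDI data as described above: pairing note-on
# and note-off events into (start, end, key, velocity) rows.
# The event tuple format here is hypothetical.

def tabulate(events):
    """events: list of (time, kind, key, velocity) tuples, where kind
    is 'on' or 'off'; a note-on with velocity 0 also ends a note."""
    pending = {}   # key -> (start_time, velocity) for sounding notes
    rows = []
    for time, kind, key, velocity in events:
        if kind == 'on' and velocity > 0:
            pending[key] = (time, velocity)
        elif key in pending:             # 'off', or 'on' with velocity 0
            start, vel = pending.pop(key)
            rows.append((start, time, key, vel))
    return sorted(rows)

# Two notes of a hypothetical performance (times in milliseconds):
events = [(0, 'on', 60, 72), (450, 'off', 60, 0),
          (480, 'on', 62, 64), (900, 'off', 62, 0)]
rows = tabulate(events)   # [(0, 450, 60, 72), (480, 900, 62, 64)]
```

Even a table this crude exposes tempo, dynamics, and note overlap at a glance, which is why the teacher could critique a performance from the numbers alone.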

Thus began one of the most interesting projects from my time in Singapore. Two colleagues and I developed a music-based visualization of MIDI recordings. This meant that the display was structured around score notation constructs. Amplitude was represented by coloring the note heads on a color scale that would “dissolve” from high-intensity red (loudest) to high-intensity blue (softest). We also developed a “virtual metronome” that would insert tick marks corresponding to the position of the beat with respect to the notation. Finally, there was a representation of “articulation,” basically how much of a pause there was from one note to the next.
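The amplitude coloring amounts to interpolating between two endpoint colors as velocity rises. The exact scale our project used is lost to me now, so the mapping below is only a plausible reconstruction: a straight linear blend from high-intensity blue (softest) to high-intensity red (loudest).

```python
# Sketch of the red-to-blue note-head coloring described above:
# a linear blend from pure blue (softest) to pure red (loudest).
# The actual scale the project used may have differed.

def velocity_to_rgb(velocity):
    """Map a MIDI velocity (1-127) to an (r, g, b) tuple, 0-255 each."""
    t = (velocity - 1) / 126.0        # 0.0 at softest, 1.0 at loudest
    red = round(255 * t)
    blue = round(255 * (1 - t))
    return (red, 0, blue)

velocity_to_rgb(1)     # (0, 0, 255): pure blue for the softest note
velocity_to_rgb(127)   # (255, 0, 0): pure red for the loudest
velocity_to_rgb(64)    # a purple, midway between the two extremes
```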

Such visualization provided a new “channel” through which teacher could communicate with student. The object may have been to get the student to be a better listener, but establishing what the student should pay attention to while listening could be tricky. Visual cues, even relatively primitive ones, could go a long way towards guiding auditory attention.

Having established a relationship with Yamaha, I could now devote some of my attention to their own efforts to promote reproducing pianos, such as the Disklavier, which was basically a grand piano with the ability to translate MIDI data into physical hammer and pedal actions. Yamaha was trying to market a library of floppy discs with recordings of performances by famous classical and jazz pianists. (As I recall there was even a floppy with MIDI transcriptions of player piano rolls recorded by George Gershwin.)

Needless to say, these sounded pretty feeble on my Clavinova. However, I discovered that the effect was not much better when I listened to the same recorded performance on a Disklavier in the Yamaha showroom. Was this just another example of Schnabel’s complaint that there were “not enough bits?”

My current thinking is that this was not the case. I thought about some of the vinyl recordings I had of Welte-Mignon performances; and I recalled that the audio recordings were based on using the instrument on which the piano roll had initially been created. This led to the hypothesis that reproduction was as much a matter of what the pianist heard as of what the pianist did physically. Our research in visualization had offered a representation of audible features; but there was a listening-playing loop that would be unique to any performance, regardless of how it was visualized.

This hypothesis has two serious corollaries. One involves “scientific” approaches to researching the nature of piano performance. One can take recordings of pianists and abstract them to a MIDI-like representation. That representation will capture, at least adequately, what the pianist was doing. However, when isolated from the sounds of the recording, it abstracts out any information about how the pianist is responding to the sounds (s)he is creating. This seems to be a critical gap in the data that are available. It also suggests that, when a pianist is warming up on an unfamiliar instrument, (s)he is aware not only of the physical responsiveness of the keys and pedals but also of how the whole instrument reverberates and how some frequencies may be stronger than others.

The second corollary goes back to Yamaha’s interest in marketing those floppy discs. Because every piano has its own unique sonorities, playing one of those floppies on a piano that is different from the one on which it was recorded may well have unsatisfying results. In other words the abstraction of the data on the floppy, which is probably grounded in MIDI, cannot capture the impact of listening on the nature of the performance.

Can the capture technology be improved? This would require a richer understanding of the nature of that listening-playing loop. At the present time we can do little more than come up with some highly general hypotheses. However, this is one of those cases in which the devil is in the details; and we still have quite a ways to go before we even know which of those details are the relevant ones!
