Vol. 24 Issue 1 Reviews
Marc Leman, ed: Music, Gestalt, and Computing: Studies in Cognitive and Systematic Musicology

Springer Verlag Lecture Notes in Artificial Intelligence, Vol. 1317, 1997, 524 pages, softcover, recorded examples on CD, illustrated, index, ISBN 3-540-63526-2;


available in North America from
Springer-Verlag New York, Inc.,
P.O. Box 2485, Secaucus, New Jersey 07096-2485, USA,
telephone 1-800-777-4643,
World Wide Web http://www.springer-ny.com/catalog/np/jan98np/3-540-63526-2.html
available outside of North America from
Springer-Verlag Heidelberg,
Tiergartenstrasse 17, D-69121 Heidelberg, Germany,
telephone +49-6221-4870,
electronic mail orders@springer.de

Reviewed by Eric Scheirer (Cambridge, Massachusetts, USA)

Introduction
At the end of the 19th century, positivist philosophy and the new sciences of physics and psychology began to challenge the dominant view of musical aesthetics. Brilliant polymaths such as Carl Stumpf and Hermann von Helmholtz raised a new proposition: scientists, not just musicians, could provide insight into the nature of music. This engendered two developments. On the one hand, the development of an analytic theory of music–systematic musicology–connected formal notations, such as the figured bass, to aesthetic ideals of composition, such as tension, release, implication, and completion. Music theorists such as Hugo Riemann viewed this as grounding musicology in a rational, logical framework. On the other hand, a new school of psychology, called Gestalt psychology, emerged at the turn of the 20th century. The Gestaltists attempted to deal with the thorny questions posed by the reductionist approach to science. In particular, they tried to discover how it could be that the perception of a whole (whether in sound or image) could be greater than the sum of the percepts of its parts.

Gestalt psychology is built upon three basic principles. First, organisms and stimuli are wholes that are composed of parts, which in turn are composed of smaller parts. Different properties apply to parts at different levels of detail. For example, a piece of music might consist of a melody and some accompanying chords; the melody and chords each consist of notes. Second, the properties of the whole (the Gestalt) depend not only on the properties of the parts, but on the relationships between the parts as well. A melody is not simply a collection of notes; its quality depends on their order and rhythm. It is not possible just by knowing the set of notes in the melody to know what it sounds like. Third, perception cannot be understood statically, as though removed from time, but only as an evolving interaction between an organism and a stimulus. Reading the score of a piece of music is a very different experience from listening to the music as it evolves over time. For example, when reading a score, I can look back a page, or twenty pages, to compare one phrase with another, but I cannot do this while listening.

These principles appeal intuitively to both the scientist and the artist. However, the Gestaltists were never able to adequately formalize them and arrive at the testable predictions demanded by the scientific method. This line of research was largely abandoned in the 1930s when many of the researchers fled Europe before the Second World War, and the strict new behaviorist psychology of B. F. Skinner and his collaborators came into vogue. Gestalt psychology is today viewed, at least in textbooks, as a scattered set of rules, philosophies, and vague generalizations that never amounted to much.

Now, at the end of the 20th century, musical aesthetics is being challenged anew. Experiments in computer composition, synthesis, and sound design call into question the privileged role of humans in music-making. The new cognitive psychology aims to create models of the representations and processes involved in human thought. By doing this, researchers believe they can merge music theory and psychological models to create artificially-intelligent computer musicians, a better understanding of the human music-listening process, and a new view of the nature of music. A primary benefit of this challenge has been a fresh look at the ideas of the Gestaltists, to see what can be captured computationally from their appealing insights.

The modern study of auditory scene analysis can be understood as a turning-away from the narrow psychophysics of S. S. Stevens, Harvey Fletcher, and Eberhard Zwicker that has dominated 20th-century sound research. These prolific and principled researchers were focused on the perceptual properties, such as pitch and loudness, of the (supposedly) fundamental units of sound such as tones and noises. In contrast, contemporary psychoacoustics is dominated by the question of what happens when these units are combined into words, musical compositions, and sound environments. This question of parts and wholes is very similar to the one that concerned the Gestaltists; indeed, much of Prof. Albert Bregman’s seminal volume, Auditory Scene Analysis, is couched in terms of "Gestalt grouping rules" such as good continuation, proximity, and closure.

Over the last 10 years, the Belgian musicologist Marc Leman has developed a research program to create computational models that draw together ideas from cognitive and systematic musicology. His work shows enormous promise for the development of a new sort of music theory that is grounded not only in notational formalism, but also in an understanding of human cognitive processing. Mr. Leman’s models connect musical signal-processing to music analysis and computational psychoacoustics. They include the first significant computational study of tonality in the acoustic rather than symbolic (MIDI or other note-list) domain. A major aspect of Mr. Leman’s methodology is an exploration of the relationship between symbolic and subsymbolic (his term for acoustic, or sonological) processing of music.

About the book
Music, Gestalt, and Computing is a collection of papers organized around the general themes in its title. A number of researchers who have collaborated directly with Mr. Leman are represented, as well as a number who have not. Taken as a whole, the collection clearly reflects the interests of the editor: musicological, psychological, hermeneutic, and analytic. However, certain individual chapters, particularly those from researchers outside his circle, provide a welcome counterpoint and prevent the volume from seeming polemical.

The volume is very dense and will require a serious investment for complete understanding. It contains 33 chapters divided into five sections: Gestalt Theory Revisited, From Pitch to Harmony, From Rhythm to Expectation, From Timbre to Texture, and From Musical Expression to Interactive Computer Systems. The general progression is from the more humanistic–many of the chapters in the first section touch on computational methods only in passing, or not at all–to the more practical. The opening section is mainly focused on philosophical reanalysis of Gestalt theory in the context of present-day systematic musicology and music psychology. The remaining sections are a mix of the theoretical, the experimental, and the computational. Many of the chapters in the later parts draw only sketchy connections to Gestalt theory; in this, the subtitle of the volume may be more reflective of the overall contents than the main title.

The book is well edited, particularly considering that English is not the mother tongue of many of the authors. Most of the chapters (with a few exceptions) are clearly written, engaging, and comprehensible. Each chapter concludes with references to the technical (and humanistic) literature, and there is an overall index at the end of the book. For those who have read Mr. Leman’s previous output and were put off by the heavy use of unusual jargon, I am pleased to report that the language is more down-to-earth here.

The chapters vary in depth. Some (such as those by Richard Parncutt, Carol Krumhansl, and Mr. Leman & Francesco Carreras) are part of an extensive, ongoing exploration of a certain topic by that author, and as such dive deeply into one particular aspect of a problem. Others are more in the nature of a "first experiment" in some psychological or computational framework, and are more easily accessible to readers making an initial approach to these research areas. All of the contributions are published for the first time in this book.

To a music psychologist or psychoacoustician, this volume is an exciting achievement and raises many questions for further research. Many, perhaps most, of the papers address topics that have been rarely explored in the previous literature. If the arguments are sometimes not as rigorous as I would prefer, that only leaves room for more research along the suggested lines. In the areas of harmonic and textural analysis particularly, this book contains a wealth of original thinking that can be found nowhere else. An important next step in this sort of research is to draw connections back to what is presently considered "mainstream" psychoacoustics. Many of the models and techniques have a close resemblance to those proposed elsewhere (for example, Mr. Leman’s "tone completion image" is essentially the same thing as the "summary autocorrelogram" of R. Meddis and M. J. Hewitt), but these connections have not yet been explored very deeply. If this program of study is to have a lasting influence on mainstream psychoacoustics (it is not clear whether this is held as an important goal or not), it will also be essential to develop a more rigorous approach to experimental human studies, which are not always as convincingly presented as the computer-modeling results.

It is more difficult to recommend the book to those computer musicians who do not also have a strong interest in psychological issues. There are really only half-a-dozen chapters that describe computational methods in enough detail to evaluate them as contributions to the computer-music literature in their own right. The main thrust of these chapters is in the analysis of texture and timbre; a few touch on the analysis and synthesis of musical expression. Many of the other articles use computational methods as tools for psychological analysis and modeling, but the focus of these contributions is not on the computing methods per se. The main computational methodology is connectionism using the self-organizing Kohonen map (a technique brought to the music-analysis literature by Mr. Leman). A computer musician with a bent towards aesthetics will find more interesting material; a number of chapters present new methods for the formal analysis of post-tonal music. There is little said about the composition or analysis of acousmatic music, or music with electronic sounds.

My major criticism of the book as a whole is the ongoing confusion of issues that I consider perceptual with issues that I consider musicological. To some extent, I think this is intentional–the very term cognitive musicology implies the drawing together of these areas. This laudable goal notwithstanding, it is often difficult to tell whether a particular argument is meant to treat music as an aesthetic object, or music as a process conveyed in sound and perceived by a listener.

In his chapter on the roots of chords, for example, Mr. Parncutt writes, "The new theory of the perceptual root [strives] … to reliably predict the root whenever it is clearly and unambiguously defined in existing music theory [among other goals]." It seems from this that his aim is to build a theory of perception that is in accord with the definitions of music theory. This goal raises an important question: to what degree do the definitions of music theory themselves reflect the actual perceptions of listeners? More importantly, this goal conflates the two domains, promoting the aesthetic definitions of music theory to the status of perceptual theory. If the definitions of music theory turn out not to be borne out in perception (for example, if chords are rarely perceived to have an unambiguous root), then the idea of constructing a perceptual model of music-theoretical predictions is nonsensical. Most of the other chapters implicitly reflect a similar confusion, in my opinion (Mr. Parncutt’s article, in fact, is notable in that it explores this confusion more rigorously than most of the others).

At a deeper level, the musicological bent of the volume exerts a strong influence by limiting the questions posed. The traditional discourse of music theory seems to greatly constrain the topics and methods of the contents–there is a great concern with the harmony, rhythm, and texture of music in the traditional Western "classical" style. These are questions of primary concern to a Western music theorist, but perhaps of less interest to other researchers in the music sciences. Music from other cultures, and Western musics such as jazz, folk, and rock that fall outside of the mainstream of the classical tradition, are given lip service if they are treated at all. This is very unfortunate because, to me, one of the greatest strengths of the sort of analytic perspective represented here is the possibility of opening up a broader, more inclusive, musical theory. It is certainly possible to envision the application of some of the techniques presented to other musical styles, but it would be welcome to find more research that immediately concerned itself with the broad world of music rather than putting most of it off "for later."

In sum, this book is an essential addition to the library of any researcher interested in psychological approaches to the study of music. It contains a wealth of well-reasoned philosophy as well as intriguing new insights into the music-listening process. For a researcher mostly interested in engineering approaches to sound analysis and synthesis by computer, there are some chapters of interest, but there are also many that will not hold much interest.