
Joseph Paradiso. Electronic Music Interfaces

1) Introduction

As the craft of musical instrument making has turned into technology over the last few centuries, inventors and musicians have produced a wealth of ideas for improving traditional instruments and creating entirely new ones. The classic acoustic instruments, such as the strings, winds, and percussion of the modern orchestra (along with sitars, kotos, and other members of non-European cultures), have been with us for centuries, so their designs are generally considered optimal and subject only to minor, incremental improvement. For centuries, the construction of fine acoustic instruments, especially the string family, remained a mysterious art, and only recently have their structure, acoustics, and material properties been understood well enough to attempt something new on that basis, as in the instruments of Carleen M. Hutchins.

Electronic music has no such long history. As a phenomenon, it has existed for less than a century, leaving electronic musical instruments no time for evolutionary development. Technology advances so quickly that one set of sound-synthesis methods and capabilities supersedes another within just a few years. As new synthesis methods bring new expressive possibilities, the designs of the corresponding interfaces also demand continual refinement. This is happening now with, for example, the active development of physical modeling. As the name implies, this synthesis method implements a mathematical model of a real acoustic instrument, or of some other complex acoustic system, on DSPs (digital signal processors). Since most acoustic instruments have interfaces very different from the keyboard typical of commercial synthesizers, the performer needs a different, perhaps more expressive, interface to realize the full potential of this type of synthesis.
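To make the idea concrete, here is a minimal sketch of the best-known physical model, the Karplus-Strong plucked string; the algorithm choice and all constants are our own illustrative assumptions, not something specified in the text.

```python
import numpy as np

def pluck(freq_hz, duration_s, sample_rate=44100, damping=0.996):
    """Karplus-Strong plucked string: a delay line initialized with noise,
    repeatedly fed back through a two-point averaging (lowpass) filter."""
    period = int(sample_rate / freq_hz)       # delay-line length sets the pitch
    line = np.random.uniform(-1, 1, period)   # the "pluck" = a burst of noise
    out = np.empty(int(duration_s * sample_rate))
    for i in range(len(out)):
        out[i] = line[i % period]
        # average adjacent samples: models energy loss at the string ends
        line[i % period] = damping * 0.5 * (line[i % period] + line[(i + 1) % period])
    return out

note = pluck(220.0, 2.0)   # a two-second A3 pluck, ready to write to a sound device
```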

For most of the history of electronic music, the interface, as part of instrument design, belonged to the realm of ergonomics. In recent years, electronic instruments have become digital, and their functions will likely soon be taken over by general-purpose personal computers. In a practical sense, then, research on musical interfaces merges with the broader topic of human-computer interface research. Two main extremes are possible here: on one hand, interfaces for virtuoso performers who spend years mastering control over all the capabilities of a particular instrument; on the other, the computational power of the computer makes it possible to map simple movements onto any sound-generation parameters, allowing even non-musicians to control a multilayered musical stream. While the former approach drives the development of precise sensing technologies for complex real-time user interfaces, the latter leans more toward pattern recognition, algorithmic composition, and artificial intelligence.

Using a computer as a mediator between physical action and musical result permits virtually any conceivable sonic response to a given set of actions. This is what is called mapping. Because digital musical interfaces are so new, there is no established set of rules governing these mappings, although a performer arguably needs some visible connection between gesture and result. Similarly, serious debate surrounds the basic principles of digital performance. When a violinist plays, for example, the audience has some notion of how the musician extracts sounds from the instrument. But with an unfamiliar digital interface, where any gesture can trigger any possible sound event, the performer risks losing the audience. For contemporary composers working in this genre, it is quite difficult to sustain the kind of interest inherent in watching a virtuoso push an instrument to the limits of its capabilities. In most cases, audiences expect to be able to sense the strain and sweat of the performer, so to speak.
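As a toy illustration of how arbitrary these mappings can be, the sketch below (all names hypothetical) routes the same gesture stream through two very different mappings: a literal one-to-one assignment, and a one-to-many mapping in which a single motion shapes several synthesis parameters at once.

```python
def one_to_one(gesture):
    # literal mapping: horizontal position -> pitch, pressure -> loudness
    return {"pitch_hz": 220 + 440 * gesture["x"], "amp": gesture["pressure"]}

def one_to_many(gesture):
    # a single hand height drives pitch, brightness, and vibrato at once,
    # letting a simple motion control a complex, layered result
    h = gesture["y"]
    return {"pitch_hz": 110 * 2 ** (2 * h),
            "filter_cutoff_hz": 200 + 5000 * h,
            "vibrato_depth": 0.5 * h}

gesture = {"x": 0.25, "y": 0.8, "pressure": 0.6}
print(one_to_one(gesture))
print(one_to_many(gesture))
```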

Although, as noted above, mapping is an essential component of all modern musical interfaces (and much interesting software has been developed for it: Opcode's MAX (Miller Puckette), Interactor (Mark Coniglio and Morton Subotnick), Lick Machine (STEIM), the CMU MIDI Toolkit (Roger Dannenberg), Flex (Cesium Sound), and ROGUS and HyperLisp (MIT Media Lab)), this article will concentrate on the hardware side: the history of electronic music interfaces, with descriptions of examples and research that illustrate their main concepts.

 

2) Keyboards

With the exception of the Theremin, which will be discussed below, early electronic musical instruments were controlled primarily by keyboards, often the same 12-tone chromatic keyboard as on the acoustic piano. One of the first such instruments was Elisha Gray's "Musical Telegraph" of 1876, a set of electric buzzers tuned to fixed frequencies and activated by pressing the keys of a musical keyboard. In 1899, the public was introduced to the "Singing Arc" of the English physicist William Duddell, in which a keyboard modulated the voltage of a circuit containing a carbon arc lamp, which emitted musical tones directly.

In 1906, Thaddeus Cahill's 200-ton Telharmonium began broadcasting music from a building filled with dynamos tuned to fixed frequencies, one per note. Because this instrument predated the vacuum-tube amplifier, each dynamo (or tonewheel) had to generate on the order of 10 kW of electrical power to carry the sound over telephone lines to thousands of subscribing listeners. The Telharmonium was controlled from a multi-keyboard console designed for two performers. Cahill's keyboard was genuinely touch-sensitive, a feature largely absent from the electromechanical organs that were, in essence, the Telharmonium's descendants. Since each key was connected to a mechanism adjusting the coupling between two transformer coils, the signal amplitude depended on how hard the key was pressed. Dedicated switches and pedals were also used to control timbre and dynamics. With a separate generator for each note, the Telharmonium was a polyphonic instrument, as were its tonewheel-based descendants of decades later, such as the Rangertone and Hammond organs. By contrast, most early vacuum-tube instruments were monophonic. The few exceptions, such as Hugo Gernsback's Pianorad (1926) and the Coupleaux-Givelet organ (1938), required enormous numbers of bulky tube oscillators (one per note) and suffered from noticeable tuning drift.

Maurice Martenot's Ondes Martenot, first demonstrated in 1928, was the first electronic keyboard instrument to be produced serially, if in very small numbers. Because significant works were written for it, above all by French composers of the period, it is still used in concert practice today. In early versions, pitch was set by a stretched ribbon with a ring attached, through which the performer inserted an index finger. The ribbon was wound around the shaft of a variable capacitor, like the tuning string in old radios. As the performer moved the ring along the keyboard (provided only as a visual and tactile reference), the ribbon rotated the capacitor shaft, changing its capacitance and hence the oscillator frequency that determined the pitch. The performer could compensate for tuning drift by ear, simply correcting the position of the ring. Notes were articulated (started and stopped) with a special button played by the left hand, and nearby switches set the timbre and the amplitude-envelope characteristics. Martenot steadily refined the instrument, eventually releasing a model with a true keyboard, where each key sounded the corresponding pitch. The ribbon remained, now as an auxiliary controller for portamento and vibrato effects, and the left hand was freed to control timbre and dynamics. This division of roles, with the left hand handling articulation while the right plays the notes, survives in most modern keyboard synthesizers.

Several more keyboard-based electronic instruments with sophisticated timbre control, mostly monophonic, appeared in the following decades. Among them was Friedrich Trautwein's Trautonium of 1928, whose keyboard was a kind of ribbon controller in which pitch was determined by the resistance of a circuit. There were also the various instruments of Harald Bode: the Warbo Formant Orgel of 1937, which achieved 4-voice polyphony with four oscillators; the Melodium of 1938, with a pressure-sensitive keyboard (dynamics depended on the force with which the key mechanism pressed on a resistive element, a felt band soaked in glycerin); and the Melochord of 1947, a two-keyboard Melodium installed in the famous Cologne Studio for Electronic Music in the early 50's. It was a direct predecessor of modular systems with controlled filters and envelope generators. In 1941, the French engineer Georges Jenny introduced the Ondioline, a popular monophonic instrument with a 3-octave keyboard that used a resistor chain to control pitch (the same method used in the analog synthesizers of the 60's and 70's). The keyboard could move both vertically and laterally, allowing flexible control of dynamics and a vibrato effect.

Though never produced or used commercially, the instruments of the Canadian electronic music pioneer Hugh LeCaine were important milestones in the history of electronic music interfaces. They included a polyphonic electronic organ with a touch-sensitive keyboard using a capacitive displacement sensor under each key (1953), and the keyboard-controlled "Special Purpose Tape Recorder" (1955), a predecessor of the famous Mellotron samplers of the 1960's. Most interesting was the Electronic Sackbut (named after the medieval predecessor of the trombone), built in 1948. It was conceived as a synthesizer capable of fully imitating the expressive sound of acoustic instruments. The Sackbut was a monophonic keyboard instrument with several channels of articulation control. Loudness and attack were determined by key displacement and pressure, as well as by a foot pedal. The keys could also move laterally, bending the pitch, and another foot pedal controlled the amount of portamento. Pitch could also be controlled from a touch ribbon above the keyboard. LeCaine designed a very subtle timbre controller for the left hand: the thumb controlled a pair of formant filters, the index finger adjusted the waveform of the main oscillator, and the remaining three fingers selected different types of frequency modulation. With this set of controls, the Sackbut came alive in the hands of a sufficiently practiced performer, as can be heard in a surviving 1952 recording, where it imitates the sound of a string quartet (an excerpt from Gluck's famous "Dance of the Blessed Spirits") and then plays a blues melody.

Although the modular analog synthesizers of the late 1960's and early 1970's (from Moog, ARP, E-Mu, etc.) could use virtually any source of control voltage, the standard 12-tone keyboard remained the dominant interface. These keyboards generated a boolean Gate signal when a key was pressed and output a control-voltage (CV) level corresponding to the key. The Gate signal triggered an envelope generator that shaped the dynamics, while the CV determined pitch, filter parameters, and so forth. Most of these keyboards were monophonic, though some were duophonic, producing a pair of CVs when two keys were pressed. They were generally not touch-sensitive; very few responded to velocity, pressure, or anything beyond the key-press event itself. Real-time dynamic control on these devices came from turning knobs, pressing pedals, or manipulating something like a ribbon controller (a resistive strip responding to the touch and motion of a finger).
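The behavior of those two outputs can be sketched as follows, assuming the common 1-volt-per-octave convention (actual scalings varied by manufacturer):

```python
def key_to_cv(key_number, volts_per_octave=1.0):
    """Control voltage rises by one volt per 12 keys (one octave); key 0 is lowest."""
    return key_number * volts_per_octave / 12.0

def keyboard_outputs(key_number, key_down):
    gate = key_down              # boolean Gate: triggers the envelope generator
    cv = key_to_cv(key_number)   # CV: sets oscillator pitch, filter cutoff, etc.
    return gate, cv

print(keyboard_outputs(12, True))   # one octave up: Gate on, CV = 1.0 V
```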

Many musicians, however, wanted a portable synthesizer that would be practical not just in the studio but on tour. In response, Moog Music released its famous MiniMoog in 1970, which had an enormous influence on subsequent synthesizer design. The instrument was a simplified version of the large modular systems, with signal routing handled entirely by knobs and switches rather than thickets of patch cords and huge racks of equipment. This mobility and affordability, of course, came at the cost of flexibility and timbral variety, but there was no arguing with the economics. The influence of the MiniMoog's colossal success (more than 12,000 units were sold, more than any previous synthesizer model) is still felt today; the most obvious example is the pair of wheel controllers at the left of nearly every electronic keyboard instrument (Bob Moog has more than once expressed regret that this innovation was not patented). On the MiniMoog, one wheel controls pitch bend and the other controls oscillator and filter modulation (the mod wheel). Modern MIDI synthesizers can assign these controllers to any parameter. Nearly all the expressive articulation in the synthesizer solos of 1970's progressive rock and jazz fusion was produced with these wheels. Although alternatives have since appeared, such as joysticks (e.g., on the SCI Prophet VS and other vector synthesizers), the combined ribbon and wheel controllers of the Korg Prophecy, and the 2-D touchpad on the new Korg Z1, the classic MiniMoog wheel pair is usually present as well.

In the early 70's, synthesizer keyboards became polyphonic with the introduction of digital scanning. In such keyboards, each key is connected to an input of a digital multiplexer that continuously monitors its state (pressed/released). The first polyphonic keyboard for a modular synthesizer was developed by Donald Buchla (the Model 237 and 238?), followed by Ralph Deutsch and his colleagues at Rockwell International for the Allen Digital Organ (which later evolved into the famous RMI Keyboard Computer). Best known, however, are the polyphonic keyboards designed by Dave Rossum and Scott Wedge for the E-Mu Systems modular synthesizers. Each key pressed on such a keyboard launches a separate "voice" consisting of an oscillator, envelope generator, VCA, and filter (provided a free voice is available), and different note-to-voice assignment priorities can be selected. In early devices this was done entirely in hardware (e.g., in the E-Mu 4050 keyboard, which evolved into the controller for the Oberheim 4 Voice and 8 Voice synthesizers); the 1977 E-Mu 4060 keyboard (adapted a year later for the famous Sequential Prophet 5) already used a microprocessor. In modern synthesizers, keyboard scanning and voice allocation, along with synthesis and signal processing, can all be performed digitally on embedded application-specific integrated circuits (ASICs) and processors.
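A sketch of scanned-keyboard voice allocation follows; the structure and the oldest-note stealing priority are illustrative assumptions, not any manufacturer's actual firmware.

```python
class VoiceAllocator:
    def __init__(self, n_voices=5):
        self.free = list(range(n_voices))   # idle voice numbers
        self.held = {}                      # key -> voice, kept in press order

    def note_on(self, key):
        if self.free:
            voice = self.free.pop(0)
        else:                               # no free voice: steal the oldest note
            oldest_key = next(iter(self.held))
            voice = self.held.pop(oldest_key)
        self.held[key] = voice
        return voice                        # this voice's VCO/EG/VCA/VCF now sound `key`

    def note_off(self, key):
        voice = self.held.pop(key, None)
        if voice is not None:
            self.free.append(voice)

alloc = VoiceAllocator()
for key in [60, 64, 67]:
    print("key", key, "-> voice", alloc.note_on(key))
```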

After the arrival of the MIDI specification in 1983, which provided for transmitting several parameters beyond the pitch of each note, "expressive" keyboards became standard in the mass market. Each note carries a 7-bit velocity parameter, obtained on most keyboards by timing how long the key takes to travel between its upper and lower contacts. Many keyboards also transmit aftertouch: Channel Pressure and Polyphonic Key Pressure messages (an averaged value and per-note values, respectively).
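A sketch of that two-contact scheme; the timing range and logarithmic taper here are our assumptions, since real keyboards apply manufacturer-specific velocity curves.

```python
import math

def travel_time_to_velocity(t_ms, t_fast=1.0, t_slow=40.0):
    """Map contact-to-contact travel time (ms) logarithmically onto MIDI 1..127."""
    t = min(max(t_ms, t_fast), t_slow)                        # clamp to sensible range
    frac = math.log(t_slow / t) / math.log(t_slow / t_fast)   # 1.0 = fastest press
    return 1 + round(126 * frac)

print(travel_time_to_velocity(2.0))    # fast strike -> high velocity (~103)
print(travel_time_to_velocity(30.0))   # slow strike -> low velocity (~11)
```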

There have been several relatively recent attempts to improve the synthesizer keyboard by giving the performer more control over the sound. The keys of the Notebender keyboard, developed by John Allen and colleagues at Key Concepts Inc. in 1983, could slide longitudinally (toward and away from the player) after being pressed, creating an additional articulation channel. The Multiply-Touch-Sensitive keyboard, developed by Bob Moog and Thomas Rhea in the late 1980's, goes far beyond measuring velocity and pressure: each key also uses capacitive sensors to read the two-dimensional position of the finger on its surface. Some commercial analog synthesizers of the late 60's to mid 70's dispensed with the mechanical keyboard entirely, using similar capacitive touch plates as controllers. The best-known examples are the portable English synthesizers: the Wasp and Gnat from Electronic Dream Plant, and the Synthi-AKS from EMS. In these devices, a diatonic keyboard is drawn on a flat metal plate, beneath which capacitive sensors register a finger's touch as a change in the characteristics of the corresponding circuit. Donald Buchla, one of the first builders of modular systems, was wary of keyboards because of the constraints they impose on the possibilities of electronic instruments. Many of his early interfaces used similar capacitive touch plates (some of which also responded to pressure and position), though they were not meant to imitate a musical keyboard.

As any pianist knows, the quality of a keyboard lies in its tactile response, its "action". Although the best electronic keyboards now have passive weighted actions, they feel at best like a low-end acoustic piano. Keyboards with active force feedback have been developed by various engineers, including Chuck Monte (developer of the "Miracle Piano"), Claude Cadoz and his colleagues in Grenoble, and Brent Gillespie at Northwestern University and Stanford's CCRMA. These devices allow the dynamic response to be programmed, and can thus imitate the tactile character of any keyboard, from the finest concert grands to utterly fantastic devices with "impossible" mechanics.

New keyboard types with nontraditional layouts have also appeared in recent years, although the electronics giant Motorola built such a controller for the Scalatron synthesizer as far back as 1974. Two of these synthesizers were fitted with the so-called generalized keyboard designed by George Secor, a dense array of 240 button-keys for playing microtonal music (i.e., pitches lying between the notes of the 12-tone equal-tempered scale). The MicroZone keyboard from Starr Labs is a MIDI keyboard likewise intended for microtonal music: an array of hexagonal keys resembling a honeycomb. Other nonstandard interfaces built from buttons and switches have been created, such as Jacob Duringer's Monolith, and others are under development, such as Grant Johnson's Chordboard in Fair Oaks, or John Allen's generalized MIDI keyboard of the same type as the Bosanquet keyboard.

One of the most impressive interfaces of this kind was designed and built by Sal Martirano and his colleagues at the University of Illinois in 1969. The device, called the "Sal-Mar Construction", was a panel of 291 touch switches, each connected through a massive switching matrix to a logic block that drove an analog synthesizer generating up to 24 independent audio channels. The Sal-Mar was a live-performance instrument, and Martirano used it in many concerts and recordings. It was (one of the first) digital/analog machines capable of creating music from a given algorithm, with the performer defining sound sequences and interacting with them in real time. Another such interface was Peter Otto's Contact, developed in 1981 for Luciano Berio. It consisted of 91 controls (knobs, switches, and faders with programmable functions) and allowed real-time control of a digital audio workstation (DAW). These interfaces date from the era of the large modular analog synthesizers, where every parameter is controlled by its own knob or switch. Musicians such as Tangerine Dream, Klaus Schulze, and Keith Emerson hauled these huge, complicated modular systems on tour and performed with them live, switching patches and turning knobs in mid-performance to obtain the desired sound. Despite their size, clumsiness, and high cost, such interfaces, which give the performer a physical connection to settings and parameters, can considerably ease the process of sound design. In some areas (mixing consoles, for example) this kind of physical, tactile interface is still considered optimal, and abstract digital GUIs (graphical user interfaces) are making inroads only very gradually. Manufacturers occasionally heed the call to once again allow synthesizer control through a physical interface of knobs and switches, since editing sounds from an LCD menu often becomes very cumbersome; one example is the PG1000 programmer that Roland released for its D50 synthesizer. But this approach is becoming less common, giving way to readily available software editor/librarian packages for personal computers, where interaction with the synthesizer happens through the virtual knobs and sliders of a graphical interface, which is not as fast or intuitive but is far more practical. MIDI controllers with a smaller (but still substantial) set of programmable faders and switches are made by several manufacturers; examples include the JL Cooper FaderMaster, E-Mu Launchpad, and Peavey PC 1600X (16 assignable faders, 16 programmable buttons, a wheel controller, and support for two pedals).

 

3) Percussion Interfaces

Quite close to keyboard interfaces are the so-called drum controllers, which give percussionists access to the world of electronic sound. The first percussion interfaces appeared in the late 1960's and were simply acoustic pickups attached to a struck surface. Their signals were shaped into an envelope, i.e., a CV proportional to the force of the strike, and an impulse-type Gate signal initiating the note on/off (standard for the modular synthesizers of the day). These signals were then routed to the synthesizer modules generating the sound. The first widely distributed percussion interface was the Moog 1130 Drum Controller. Appearing in 1973, it used a force-sensitive resistor as its sensor and first gave the public a taste of electronic drums at concerts by prog-rock bands like Emerson, Lake and Palmer. Other such controllers, mostly with extremely simple built-in synthesizers, appeared before the spread of MIDI in the second half of the 70's, and can be heard across much of the dance music of that era; notable examples include Pearl synthetic drums, Synares, Syndrum, and the inexpensive electronic percussion interfaces of ElectroHarmonix.

Electronic percussion took a major leap in the early 80's with the designs of Dave Simmons, which combined new appealing sounds with very playable flat, elastic drumpads in what were then exotic shapes (eventually annealing into the familiar hexagon); these devices also evolved a MIDI output for driving external synthesizers. The Simmons SDX drumpads introduced the concept of "zoning", where hits of varying intensity in different areas of a single pad could trigger different sonic and MIDI events.

Nowadays, although Simmons is long vanished, nearly every musical instrument manufacturer makes electronic percussion interfaces. One of the longest lines of innovative percussion controllers arises from KAT (now distributed by E-Mu), who make products such as electronic mallet interfaces for marimba players. Most percussion devices use Force-Sensitive Resistors (FSR's) as sensing elements, while some incorporate piezoelectric pickups. Essentially all percussion pads are acoustically damped, and radiate little direct sound. In recent years, several MIDI drum synthesizer modules (e.g., the Alesis DM series) have incorporated analog inputs for third-party percussion transducers, enabling triggers from essentially any source to produce MIDI output. By necessity, these devices are very adaptive to signals of different character; all relevant parameters (such as trigger thresholds, noise discrimination, crosstalk between pads, etc.) can be digitally adjusted and compensated through menu parameters for each transducer channel.
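What one such trigger input channel must do can be sketched roughly as below; the parameter names and values are illustrative, not those of any actual drum module.

```python
class TriggerChannel:
    def __init__(self, threshold=0.05, mask_ms=30.0, sample_rate=10000):
        self.threshold = threshold
        self.mask_samples = int(mask_ms * sample_rate / 1000)
        self.cooldown = 0

    def process(self, sample):
        """Return a strike amplitude when a new hit is detected, else None."""
        if self.cooldown > 0:
            self.cooldown -= 1          # ignore ringing/crosstalk right after a hit
            return None
        if abs(sample) > self.threshold:
            self.cooldown = self.mask_samples   # start the retrigger-mask window
            return abs(sample)          # would then be scaled to a MIDI velocity
        return None
```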

An interesting approach to percussion controllers and synthesis has been explored by Korg in their Wavedrum. This device supersedes the limited information in simple trigger detection by employing the actual audio signal received by transducers on the drumhead as excitation for the synthesis engine (various synthesis and processing algorithms are implemented), enabling a very natural and responsive percussion interface.

In recent years, the famous synthesizer innovator Donald Buchla has been directing his attention to designing new musical interfaces. One of his devices, called "Thunder", can be thought of as a very articulate percussion controller, designed to be played with bare hands. The flat surface of Thunder is split into several labeled zones of different shapes, adjusted to complement the ergonomics of the human hand. These zones respond separately to strike velocity, strike location, and pressure. Whereas the original Thunder designs employed capacitive touch sensing, later renditions use electro-optic detection of the surface membrane's deformation under hand contact.

Here at the Media Lab, we have built perhaps the world's largest percussion interface in the "Rhythm Tree", an array of over 300 smart drumpads constructed for the "Brain Opera", a big, touring, multimedia installation that explores new ways in which people can interact with musical and graphical environments. Each pad in the Rhythm Tree features an 8-bit PIC 16C71 microcontroller that analyzes the signal from a PVDF piezoelectric foil pickup and drives a large LED, both potted in a translucent urethane mold. Up to 32 drumpads share a single, daisy-chained RS-485 digital bus to simplify cabling. All pads on a bus are sequentially queried with a fast poll; if a pad has been hit, it responds with data containing the strike force, the zone of the pad that has been hit (obtained by analyzing the rising edge of the transducer waveform), and the resonant character of the hit (obtained by counting the waveform's zero-crossings). A MIDI stream is then produced, which triggers sounds and gives visual feedback by flashing the LED in the struck pad or illuminating others in the vicinity. All parameters (thresholds, modes, LED intensity) in each pad are completely and dynamically downloadable from the host computer.
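The per-pad analysis might be sketched as below; this is a rough reconstruction in high-level code (the real version runs on an 8-bit PIC), with the rising-edge zone detection omitted.

```python
import numpy as np

def analyze_hit(waveform):
    """Extract strike force and a crude 'resonance' feature from one pad hit."""
    force = float(np.max(np.abs(waveform)))                   # strike intensity
    signs = np.sign(waveform)
    zero_crossings = int(np.sum(signs[:-1] * signs[1:] < 0))  # count sign changes
    return {"force": force, "zero_crossings": zero_crossings}

# e.g., a dull palm hit vs. a sharp stick hit yields very different counts
t = np.linspace(0, 0.02, 200)
print(analyze_hit(np.exp(-t * 100) * np.sin(2 * np.pi * 300 * t)))
```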

 

4) Batons

 

Another interesting interface that began life as a percussion controller was computer music pioneer Max Mathews' "Daton", where a sensitive plate responded to the location and force of a strike. The strike location was determined by measuring differential force with 4 pressure sensors at the corner plate supports. Mathews then collaborated with his colleague at Bell Labs, Bob Boie, who evolved this device into one of the best known modern interfaces in academic music, the "Radio Baton". This instrument is played with two batons, one in each hand. The 3D location of each baton can be determined above a sensitive platform by measuring the signal induced through capacitive coupling between oscillators driving transmit electrodes in the batons and shaped receive electrodes in the platform. The signal from each baton is synchronously detected in the pickup electronics to reduce noise and extend the measurement range.

A similar interface has been designed by Donald Buchla. Termed the "Lightning", this is an optical tracker that measures the horizontal and vertical positions of a pair of wireless wands, each with a modulated IR LED at its tip (several musicians and researchers use the Lightning controller; for an example showing an application, check out Jan Borchers' WorldBeat installation). The current version of this system is specified to sense across a region 12 feet high by 20 feet wide. A highly-interpreted MIDI output stream can be produced, ranging from simple coordinate values through complicated responses to location changes, beats, and gesture detection. Other researchers have explored related optical interfaces; e.g., the Light Baton by Bertini and Carosi at the University of Pisa and the IR baton of Morita and colleagues at Waseda University both use a CCD camera and frame-grabber to track the 2D motion of a light source at the tip of a wand in real time.

A series of batons has been produced that sense directional beats via an array of accelerometers. These include 3-axis devices such as the baton built by Sawada and colleagues at Waseda University, the MIDI baton designed by David Keane and collaborators at Queen's University in Kingston, Canada, and a commercial dual-wand MIDI unit called the "Airdrum" made by Palmtree Instruments in 1987. Even simpler devices have been built with inertial switches replacing the accelerometers; these establish momentary contact when the baton velocity changes sign in the midst of a beat. Examples are the Casio SS1 Soundsticks and other such devices that have appeared on the toy market.

The "Digital Baton", which we have built at the Media Lab, incorporates both of the sensor modes mentioned above; i.e., it tracks the horizontal/vertical position of an IR LED at the baton tip for precise pointing (using a synchronously demodulated PSD photosensor to avoid problems with latency and background light) and uses a 3-axis 5G accelerometer array for fast detection of directional beats and large gestures. It also features 5 FSR's potted in the urethane baton skin for measuring distributed finger/hand pressure. This multiplicity of sensors enables highly expressive control over electronic music; the performer can "conduct" the music at a high level, or descend into a "virtuoso" mode, actually controlling the details of particular sounds. We have used this baton in hundreds of performances of the Brain Opera at several worldwide venues.

 

5) Assimilating the Guitar

 

Stringed instruments are highly expressive and complex acoustic devices that have followed a long and difficult path into the world of electronic music controllers. The popularity of the guitar in modern music has given it considerable priority for being assimilated into the world of the synthesizer, and a look at the history of the guitar controller aptly reflects the evolution of signal-processing technology.

The world saw the birth of an important and popular electronic instrument when guitars were mated to electronic pickups back in the 1930's. Although it was an extreme break with tradition, pioneered and explored by many innovative musicians like Charlie Christian and Les Paul, many sounds in this new instrument lay latent, as it were, until the arrival of the 1960's, when Jimi Hendrix and other contemporary guitarists turned up the volume and started exploring the virtues of distortion and feedback. The electric guitar now became part of a complex driven system, where the timbral quality and behavior of the instrument depend on a variety of external factors; i.e., distance to the speakers, room acoustics, body position, etc. Musicians explored and learned to control these additional degrees of freedom, producing the very intense, kinetic performance styles upon which much of modern rock music is based.

The next stage in the marriage of electronics and the guitar resulted in an array of analog gadgets and pedals that modified the guitar pickup signal directly; these included wah-wah's (sweeping bandpass filters), fuzzboxes (nonlinear waveshaping and limiting), flangers (analog delays and comb filters), octave dividers, and various other dedicated processors. One of the most unusual was "The Bag" by Kustom, essentially a brute-force vocoder that injected the guitar signal directly into the player's mouth through a small speaker tube, then picked it up with a nearby vocal microphone. By shaping the mouth into different forms, the timbral characteristics of the guitar sound would be effectively dictated by the dynamic oral resonances, resulting in the familiar "talking guitar" effect.

The first stages in the melding of guitars and synthesizers were experiments with running guitar signals through envelope followers and processing devices such as filters in the old modular synthesizers. Designers then eagerly assailed the next step in this union, namely extracting the pitch of the guitar so it could drive an entirely synthesized sound source and become a true "controller". This task has proven quite difficult and is still a challenge to do quickly, accurately, and cheaply, even with today's technology. The problems come in at several levels; e.g., noise transients included with the attack of the sound, the potential need for several cycles of a steady-state waveform for robust pitch determination, dealing with variabilities in playing style, and the difficulty of separating the sounds and coupling effects from the different strings. The so-called "guitar synthesizers" of the mid-1970's were mainly monophonic analog devices (allowing only one note to be played at a time) that were unreliable and technical disasters. The Avatar, for instance, was essentially the first commercial guitar synthesizer, often credited with hastening the demise of its maker, ARP. Rather than interfacing to an actual guitar, the commercial successes of that era shaped a portable keyboard roughly along the lines of a guitar and slung it over the neck of a mobile performer; the right hand played the keys, while the left hand adjusted knobs, sliders, etc. to articulate the sounds. Although this let keyboardists prance around the stage in the style of guitarists (the fusion player Jan Hammer was perhaps the best-known early proponent of this type of interface), it was by no means a guitar controller, and still required keyboard technique.

The late seventies and early eighties saw the widespread adoption of the hexaphonic pickup: a magnetic pickup with one coil for each string, producing 6 independent analog outputs, mounted very close to the strings and bridge to avoid crosstalk. This, together with better pitch extraction circuitry, enabled the design of polyphonic guitar synthesizers such as the 360-Systems devices designed by Bob Easton and collaborators and the well-known Roland GR500 and GR300 series; although they could be slow and quirky, they gained some degree of acceptance in the community, being adopted by guitar giants such as Pat Metheny and Robert Fripp.

In the mid-80's, the guitar controller began evolving significantly away from its familiar form, with many devices being developed that weren't guitars at all, but enabled musicians with guitar technique to gain fast, expressive control over MIDI synthesizers. These avoided pitch tracking altogether, and merely detected the playing parameters by directly sensing the fretting finger position (typically by membrane switches or resistive/capacitive pickups on the fretboard or between string and fretboard), pitch bend (measuring strain or displacement of the strings), and the dynamics of the string being plucked (from the amplitude of the audio signal produced by each string; the pitch is never determined). Perhaps the most famous was the SynthAxe, invented by Bill Aitken. This device actually sported two sets of strings: one a set of short segments across the guitar body used to detect picking, the other running down the fretboard for determining pitch, as described above. With its faster response, the SynthAxe was adopted by several well-known power jazz guitarists, notably Allan Holdsworth. It was very heavy and expensive, however, and thus rapidly spawned a related set of much more affordable controllers, such as the Suzuki XG and Casio DG series, which retained the short plucking strings but dispensed with the strings down the fretboard, directly sensing finger pressure there instead. Only one such device is currently in production, namely the ZTAR from Starr Labs in San Diego, CA. All of these controllers feature several additional means of generating data from the guitar body; e.g., whammy bars (directly producing MIDI events), sliders, buttons, touchpads, joysticks, etc. Much simpler versions of these designs have appeared in toy products (such as the Virtual Guitar from Ascend Inc. in Burlington, MA), some of which are still marketed.

Another set of guitar controllers was introduced in the mid-to-late 80's that likewise didn't detect the pitch directly, but used yet another technique to sense the fretting positions. The Beetle Quantar and Yamaha G10 launched an ultrasonic pulse down the strings from the bridge; when a string was held against a metal fret, this pulse would be reflected back to the bridge, where it would be detected, and the fretting position determined from the acoustic time-of-flight. An additional optical sensor on each string detected the lateral string position (thus pitch bend), and electromagnetic pickups determined the amplitude dynamics. Optical pickups were used exclusively on another guitar controller called the "Photon", developed by K-Muse in 1986. Here, the standard magnetic pickup was replaced with an IR sensor that detected the string vibration, enabling nonconductive nylon strings to be used.

The guitar controllers from Zeta Music were interesting hybrid approaches of considerable renown that appeared in the late 1980's. These were actual guitars with a multimodal MIDI interface, culminating in the Mirror 6, which featured a wired fretboard for determining pitch, a capacitive touch detector on each string for determining the expected acoustic damping on strings contacted but not pressed to the fretboard, hex pickups for determining amplitude and pitch bend, accelerometers for measuring the instrument's rigid-body dynamics (e.g., shaking), plus an instrumented whammy bar and other tactile controls. Although they no longer produce guitars (having been bought and subsequently spun off by the guitar giant Gibson), Zeta still makes other MIDI string instruments, as described later.

In recent years, as signal processing capability has improved, there has been a shift away from the dedicated MIDI guitar controllers described above and back toward retrofits for existing, standard electric guitars that identify the playing features by running real-time DSP algorithms on the pickup signals. These systems generally consist of a divided hex pickup (still mainly magnetic, although some are beginning to employ contact piezoelectric transducers, which produce a more robust signal and work with nonmetallic strings) and an interface unit that runs the pitch and feature extraction software. Examples are the Yamaha G50 and an interesting new device called the "Axon" controller from Blue Chip Systems, which employs a neural network to learn the playing characteristics of an individual player (the Axon claims to be able to reliably determine pitch after a single period of the string frequency; because it has learned the picking style of the player, it is said to be able to use the first, "noisy" period immediately after picking). These controllers, once properly calibrated (often still a nontrivial operation), are said to track quickly and reliably, plus estimate other parameters from the string signals, such as the longitudinal picking position. Some claim to respond quickly enough to enable good performance with a bass guitar, where the string oscillation period is much longer.
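To give a feel for the underlying problem, here is the textbook autocorrelation pitch estimator, which needs an analysis window spanning several periods of the lowest string; it is a stand-in baseline, far simpler than the proprietary and neural-network methods named above.

```python
import numpy as np

def estimate_pitch(x, sample_rate, fmin=60.0, fmax=1200.0):
    """Pick the autocorrelation peak within the plausible string-pitch range."""
    x = x - np.mean(x)
    corr = np.correlate(x, x, mode="full")[len(x) - 1:]   # autocorrelation, lag >= 0
    lo, hi = int(sample_rate / fmax), int(sample_rate / fmin)
    lag = lo + int(np.argmax(corr[lo:hi]))                # strongest periodicity
    return sample_rate / lag

sr = 44100
t = np.arange(2048) / sr                                  # ~46 ms window
string = np.sin(2 * np.pi * 196.0 * t) + 0.4 * np.sin(2 * np.pi * 392.0 * t)
print(estimate_pitch(string, sr))                         # ~196 Hz (a G string)
```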

Nonetheless, it's generally accepted that the MIDI standard is inadequate for handling the wealth of data that most acoustic instruments (especially the strings) can produce. While a guitar performance can be somewhat shoehorned into a set of features that fit the 31.25 kbaud, 7-bit MIDI standard, there's insufficient bandwidth to continually transmit the many channels of detailed articulation these instruments generate. Several solutions have been suggested, such as the currently-dormant ZIPI interface standard, proposed several years ago by the CNMAT center at UC Berkeley and Zeta to supersede MIDI. A route many manufacturers seem to be pursuing now is to depart from the modular MIDI standard, and utilize the detailed, fine-grained features from their proprietary guitar interface in a synthesis engine housed together with the guitar controller unit. If desired, a subset of these parameters can be projected through a MIDI output, but the synthesis algorithms running on the native synthesizer have access to the high-bandwidth data coming right off the guitar, enabling very responsive sound mapping.
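The arithmetic behind that bandwidth complaint can be made concrete; the sketch below assumes standard MIDI serial framing (1 start + 8 data + 1 stop bit per byte) and an illustrative 200 Hz update rate per articulation channel.

```python
# How many 3-byte MIDI channel messages fit through the wire each second?
baud = 31250                                   # MIDI serial rate, bits per second
bits_per_byte = 10                             # 1 start + 8 data + 1 stop
capacity = baud / (bits_per_byte * 3)          # ~1042 messages/second total

# Six strings, each wanting continuous pitch-bend, pressure, and amplitude
# updates at an (assumed) 200 Hz control rate:
needed = 6 * 3 * 200                           # 3600 messages/second
print(capacity, needed / capacity)             # ~1042, ~3.5x over budget
```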

 

6) Other Strings; Wiring the Classics

 

The orchestra and synthesizer inhabited entirely independent spheres during their early courtship; the electronic sound systems knew nothing about what the instruments were playing. Musicians essentially kept time to a tape recording, or prerecorded sequence. Over the past decades, this relationship has become much more intimate. The most well-known large-scale examples of computer/orchestra integration were provided by Giuseppe Di Giugno's 4X synthesizers, developed at IRCAM in Paris during the 1980's. The 4X was used by many composers, including Pierre Boulez, to analyze and process the audio from acoustic ensembles, enabling them to produce real-time synthesized embellishment. This trend has continued as more fluent interfaces develop between computers and classic orchestral instruments. The electronic music generator now becomes a virtual accompanist, able to adapt and respond to the detailed nuance of the individual musicians.

The early electronic interfaces for bowed string instruments processed sound from pickups, adding amplification and effects (a famous example is Max Mathews' violin, which used piezoceramic bimorph pickups with resonant equalization). In general, more traditional stringed instruments (e.g., violin, viola, cello) have followed the guitar along a similar, although less trodden, path toward becoming true electronic music controllers. Although the complicated and dynamic nature of a bowed sound makes fast and robust pitch tracking and feature extraction difficult, many researchers have developed software with this aim; commercial MIDI pitch-tracking retrofits are also manufactured for violins, violas, and cellos. Zeta, for instance, has built a full line of MIDI stringed instrument controllers over much of the last decade, and still manufactures MIDI controller electronics and retrofit pickups for both its own and third-party instruments. Motivated by the vast world of sonic expression open to a skilled violinist, many researchers go beyond analyzing the audio signals, and build sensor systems to directly measure the bowing properties. Chris Chafe, of the CCRMA at Stanford, has measured the dynamics of cello bows with accelerometers and the Buchla Lightning IR tracker. Peter Beyls, while at Brussels University in 1990, built an array of IR proximity sensors into a conventional acoustic violin, measuring fingering position and bow depression. Jon Rose, together with designers at STEIM, Amsterdam, built interactive bows with sonar-based position tracking and pressure sensors on the bow hair.

Here at the Media Lab, we have designed several systems for measuring performance gesture on bowed string instruments. These efforts began with the Hypercello, designed by Neil Gershenfeld and Joe Chung in 1991. In addition to analyzing the audio signals from each string and measuring the fretting positions with a set of resistive strips atop the fingerboard, the bow position and placement were measured through capacitive sensing, which is much less sensitive to background interference than most optical or other techniques. A 50 kHz signal was broadcast from an antenna atop the bridge, and received at a resistive strip running the length of the bow. Transimpedance amplifiers and synchronous detectors measured the induced currents flowing from both ends of the bow; their difference indicated the transverse bow position (i.e., the end closer to the transmitter produced proportionally stronger current), while their sum indicated longitudinal bow placement (the net capacitive coupling decreases as the bow moves down the cello, away from the transmitter). In addition, a deformable capacitor atop the frog, where the bowing finger rests, measured the pressure applied by the player, and a "Dexterous Wrist Master" from Exos, Inc. (now part of Microsoft) measured the angle of the bow player's wrist. This setup originally premiered at Tanglewood, where cellist Yo-Yo Ma used it to debut Tod Machover's hyperinstrument composition "Begin Again Again..."; it has since appeared in over a dozen different performances of Machover's music at various worldwide venues.
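The position and placement recovery described here reduces to a difference and a sum; a sketch, with all normalization constants omitted:

```python
def bow_position(a, b):
    """Currents a and b flow from the two ends of the resistive strip on the bow."""
    position = (a - b) / (a + b)   # -1..+1 along the bow: the nearer end couples harder
    placement = a + b              # total coupling falls as the bow moves from the bridge
    return position, placement

print(bow_position(3.0, 1.0))      # bow crossing the string nearer the 'a' end
```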

This bow was wired to the signal-conditioning electronics, which did not significantly interfere with the playing style of most cellists. We have since developed a wireless bow tracker for use with a violin, where a tether can much more significantly perturb the player. Here, we used three small battery-powered transmitters located on the bow, and a receive electrode on the violin, above the bridge. Two of these were CW transmitters broadcasting at different frequencies (50 and 100 kHz), driving either end of the resistive strip. The balance in the components of the received signal at these frequencies indicated the bow position and placement, as with the current-balance scheme in the cello bow. A FSR placed below the player's bow grip caused the frequency of the third oscillator to vary with applied pressure; a PLL (phase-lock-loop) in the receive electronics tracked these changes, producing pressure data. This system has likewise appeared in several concerts, played by the violinist Ani Kavafian in performances of Machover's composition "Forever and Ever".

Of course, inventors all over the world have been adapting technology to turn non-western stringed instruments into electronic music controllers. Perhaps one of the more extreme is Miya Masaoka's "Koto Monster", a 6-foot-long, hollow-bodied, 21-string, harp-like digital instrument that she developed at the Dutch STEIM center.

 

7) Wind

 

Wind instruments followed an analogous path into the domain of electronic music performance. Initially, during the late 60's and early 70's, avant-garde wind players would outfit their instruments with acoustic pickups and run the signals through synthesizers, waveshapers, and envelope followers, triggering sounds and applying distortion and effects of various sorts. Wind players quickly adopted the early pitch-to-voltage converters produced by most synthesizer manufacturers (e.g., Moog, EMS); as wind instruments are essentially monophonic, they were well-suited to driving the single-voice synthesizers of the time, and although these pitch extractors could be readily confused by harmonics, attack transients, and artifacts of expressive playing, some degree of playability could be attained.

In the early 70's, the electronic wind instrument started its metamorphosis into a soundless controller, where the valves and stops became switches, and mouthpieces and reeds were replaced with bite and breath sensors. As in the case of the wired guitar controllers outlined above, this bypassed the need for intermediate pitch extraction, and the player's gestures could be immediately mapped into dynamic synthesis and audio parameters. The first of these devices to gain any notoriety was the Lyricon Wind Synthesizer Driver, made by a Massachusetts company called Computone. This device produced voltages from fingering, lip pressure, and breath flux (measured by a hot-wire anemometer adapted from a small light bulb) that could drive an analog synthesizer; it was initially packaged just as a controller, but a small dedicated analog synthesizer was included with subsequent models to enable stand-alone performance. Envelope generators were not generally used with such wind controllers; the breath controller and lip/bite sensor signals were applied directly to control the amplitude and timbral dynamics, creating the level of intimate sonic articulation that wind players are used to expressing.
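The direct-articulation idea can be sketched as follows; the scaling constants are invented, the point being that continuous sensor values set amplitude and timbre directly, with no triggered envelope in between.

```python
def wind_articulation(breath, lip_pressure):
    """breath and lip_pressure arrive continuously in 0..1 from the mouthpiece sensors."""
    amplitude = breath ** 1.5                 # gentle expansion: soft playing stays soft
    cutoff_hz = 300 + 4000 * breath           # harder blowing -> brighter timbre
    pitch_bend = 0.3 * (lip_pressure - 0.5)   # semitones of lip-controlled bend
    return amplitude, cutoff_hz, pitch_bend

print(wind_articulation(breath=0.7, lip_pressure=0.6))
```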

During the later 70's and early 80's the trumpeter/inventor Nyle Steiner, long working on electronic wind interfaces, developed two of the best-known devices, the Electronic Woodwind Instrument (EWI) and Electronic Valve Instrument (EVI). The EWI has the fingering protocol of a saxophone, while the EVI is designed for trumpet players. In addition to breath and lip pressure sensors, these instruments featured capacitive touch keys for fast pitch fingering, touch plates and levers for adding portamento, vibrato and other effects, and rollers for transposing and sliding pitch. The synthesizer/audio manufacturer Akai began producing these instruments in the late-1980's, packaging the controller with an analog synthesizer and MIDI interface. They still produce a version of the EWI today, and as many purists feel that MIDI can't adequately convey the streams of continuous data produced by this device, it remains optionally packaged with an analog synthesizer.

Yamaha has played an important role in digital wind interfaces, introducing a breath controller (a device which dynamically senses breath pressure) with its pioneering DX-7 FM synthesizer in the early 1980's, opening up another channel of articulation in what was essentially a keyboard instrument. In the later 1980's, they introduced the first real MIDI wind controller, the WX-7, with fingering switches laid out in a saxophone protocol, breath and lip sensors, a pitch wheel, and a set of control buttons; this device has now evolved into the currently-produced WX-5. As a commercial manufacturer pioneering techniques such as physical modeling and waveguide algorithms in their VL-series synthesizers, Yamaha has designed many sound patches for these devices that require breath or wind controllers for fullest expression.

Many other wind controllers have been made by other manufacturers and researchers; for example, Casio has produced the inexpensive "DH" series of hornlike controllers, Martin Hurni of Softwind Instruments manufactures the Synthophone, and John Talbert of Oberlin College has built the MIDI horn. Perry Cook and Dexter Morrill have explored new concepts in brass synthesis controllers at the CCRMA at Stanford, where they mounted acoustic and pressure transducers at various points in standard brass instruments, plus monitored valve position, added additional digital controls, and applied new algorithms for realtime pitch and feature extraction from the audio stream.

Some devices in this family have evolved far from their parents; for instance, the STEIM performer Nicolas Collins has turned a trombone into a multimodal performance controller by putting a keypad onto an instrumented slide, and using this to control, launch and modify a variety of different sounds; the trombone is never "played" in the conventional sense. An altogether different kind of wind synthesizer has been built by California-based Ugo Conti. His "whistle synthesizer" is essentially a signal processing device attached to a microphone into which one whistles; by adjusting its extensive array of sliders (at least one is accessible to each finger on both hands), the sound of the whistle can be dynamically warped and modified through sub-octave generators and delay lines.

 

8) Voice

 

The human voice is certainly an extremely expressive sound source, and has long been integrated into electronic music. For obvious reasons, however, it is quite difficult to abstract the voice mechanism away from the sonic output, as was pursued in the guitar and wind controllers discussed above. Although it may be possible to get some real-time information on vocal articulation from, for instance, EMG muscle sensors or video cameras and machine vision algorithms analyzing facial motion, essentially all voice-driven electronic sound comes from processing the audio signals picked up by microphones of one sort or another. Essentially every signal processing trick ever developed has been used on the human voice in the name of music, perhaps the most common being reverb, echo, pitch shifting, chorusing, ring modulation, harmonizing, and vocoding (filtering one sound with the spectral content of another). Over the last decade, many signal processors have been specifically designed for altering real-time vocals (e.g., the DigiTech Vocalizer series are recent well-known examples), and several pitch-to-MIDI converters have been optimized for the human voice (some old classics are the Fairlight VoiceTracker and the IVL Pitchrider; nowadays, many musicians use the pitch trackers in guitar interfaces for generating MIDI from vocals). As the voice expresses many musical characteristics that go far beyond simple pitch, some groups, following the directions taken in speech research, have written real-time computer software to dynamically analyze the human voice into a variety of musically interesting parameters, in some cases using these quantities to drive complicated models (including those of other musical instruments or other voices). For example, Will Oliver, here at the MIT Media Laboratory, has taken this approach in the Brain Opera's "Singing Tree", a realtime device that breaks the singing voice into 10 different dynamic parameters, which are then used to control an ensemble of MIDI instruments that "resynthesize" the character of the singing voice, but with entirely different sound sources. Academic research is rich with work on realtime and "batch" voice processing; witness, for instance, the well-known composition "Lions Are Growing" by James (Andy) Moorer done at CCRMA, or the LPC work of Paul Lansky at Princeton.

 

9) Noncontact Gesture Sensing

 

In recent years, more musical devices are being explored that exploit noncontact sensing, responding to the position and motion of hands, feet, and bodies without requiring any kind of controller to be grasped or worn. Although these interfaces are seldom played with as much precision as the tactile controllers described earlier, with a computer interpreting the data and exploiting an interesting sonic mapping, very complicated audio events can be launched and controlled through various modes of body motion. These systems are often used in musical performances that have a component of dance and choreography, or in public interactive installations. Many sensing technologies have been brought to bear on these interfaces, from machine vision through capacitive sensing. As each has its advantages and problems, the sensing mechanism chosen is usually the one best tailored to the desired artistic goals and the constraints imposed by the anticipated performance environment.

The Theremin, developed in the early 1920's by the Russian physicist, cellist, and inventor Leon Theremin, was a musical instrument with a radically new free-gesture interface that foreshadowed the revolution that electronics would perpetrate in the world of musical instrument design. The technological basis of his instrument is very simple. The pitch waveform is generated by heterodyning two LC oscillators. Because their free-running frequencies (in the range 100 kHz - 1 MHz) are adjusted to be relatively close to one another, their detected beats are in the audio band. One of these oscillators is isolated, providing a relatively stable reference. The other oscillator is coupled to a sensor plate; when a player moves a hand close to this plate, their body capacitance adds to that of the attached oscillator, correspondingly altering its frequency and producing a pitch shift in the detected heterodyned audio beat. Another sensor plate similarly changes the frequency of yet another oscillator; here, however, a filter network causes the amplitude of this oscillator to vary with frequency (hence with hand position). This signal is amplitude-detected and used to analog-gate the audio heterodyne beat, thereby producing a change in loudness as a hand approaches this second plate. The Theremin is thus played with two hands moving freely through the air above these plates, one controlling pitch and the other determining amplitude.
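The numbers work out neatly. With illustrative tank-circuit values (not Theremin's actual components), a hand adding just a few picofarads detunes the variable oscillator by a few kilohertz, sweeping the heterodyne beat across the audio band:

    import numpy as np

    L = 1e-3       # illustrative tank inductance, 1 mH
    C = 100e-12    # illustrative tank capacitance, 100 pF

    def lc_freq(L, C):
        """Resonant frequency of an LC oscillator."""
        return 1.0 / (2.0 * np.pi * np.sqrt(L * C))

    f_ref = lc_freq(L, C)                    # fixed reference, ~503 kHz
    for dC in (1e-12, 3e-12, 5e-12):         # a nearby hand adds a few pF
        f_var = lc_freq(L, C + dC)
        print(f"+{dC * 1e12:.0f} pF -> beat {abs(f_ref - f_var):.0f} Hz")

Running this prints beats of roughly 2.5, 7.4, and 12.1 kHz, showing how tiny capacitance changes become dramatic pitch sweeps.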

The Theremin was a worldwide sensation in the 20's and 30's. RCA commercially manufactured these instruments, and several virtuosos performers developed, the most famous being Clara Rockmore. Robert Moog began his electronic music career in the 1950's by building Theremins, which had by then descended into more of a cult status, well away from the musical mainstream. Theremins are once again attaining some notoriety, and Moog (through his present company, Big Briar) and others are again producing them.

Here at the Media Lab, we have developed many musical interfaces that generalize capacitive techniques, such as those used in the Theremin, into what we call "Electric Field Sensing". The Theremin works through what we call "Loading Mode"; i.e., it essentially detects the current pulled from an electrode by a nearby capacitively-coupled body. We have, however, based our devices on other modes of capacitive sensing that provide more sensitivity and longer measurement range; namely, "transmit" and "shunt" modes, as described below.

By putting the body very close to a transmit electrode (i.e., sitting or standing on a transmitter plate), the body, being quite conductive, essentially becomes an extension of the transmit antenna. The signal strength induced at a receiver electrode, tuned to the transmitter frequency, thus increases with the body's proximity, building as the capacitive coupling grows stronger. By placing an array of such receiver electrodes around a person attached to a transmitter, the range to his hands, feet, and body can thus be determined at several points in space. Gestural information is then obtained through simple processing of the data from these receive electrodes.
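That processing can be as simple as the sketch below (an illustration of the principle, with made-up electrode positions, not our actual tracking code): since the received amplitude grows with the body's proximity, weighting each receiver's position by its signal strength gives a rough estimate of where a hand is.

    import numpy as np

    # Hypothetical receiver electrode positions (metres) around the player
    electrodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])

    def estimate_hand(signals):
        """Weighted centroid of the electrode positions, weighted by the
        received signal strength at each (stronger signal = closer hand)."""
        w = np.asarray(signals, dtype=float)
        w = w / w.sum()
        return (w[:, None] * electrodes).sum(axis=0)

    print(estimate_hand([0.9, 0.2, 0.3, 0.1]))   # hand nearest electrode 0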

Our sensor chair exploits this "transmit mode". An electrode mounted under the seat drives the body with the transmit signal (a few volts at 50 kHz; well below any environmental or broadcast regulation), and pickup electrodes mounted in front of the performer and on the floor of the chair platform respond to the proximity of his hands and feet (halogen bulbs mounted near the pickup electrodes are driven to give a visual indication of the detected signals). We have used this chair in several different musical performances, mapping the motion of hands and feet into various musical effects that trigger, modify, and otherwise control electronic sound sources. We have adapted this design to other configurations; for instance, the Gesture Wall, used at the Brain Opera, dispenses with the chair, and transmits into the body of standing players through their shoes, with a simple servo circuit adjusting the transmitter amplitude to compensate for the wide range in shoe impedances (before starting, a player must first put his or her hand on a calibration plate, which acts as a reference capacitor). Data from the Gesture Wall receivers, which surround a projection screen, enable the performer to control a musical stream and interact with the projected graphics via free body motion. The receivers, mounted on goosenecks, are pieces of copper mesh in a urethane mold, surrounding a large LED that is driven with the detected signal, again for direct visual feedback.

Another electric-field-sensing mode that we have explored is termed "shunt mode". Here the body, unattached to any electrodes, exhibits a dominant coupling through ambient channels to the room ground. As hands, feet, etc. move between transmit and receive electrodes, the received signal drops, since the body effectively shields the receiver from the transmitter. Although accurate tracking can be more difficult here, we have made several musical gesture interfaces out of shunt-mode electrode arrays. Perhaps the most notorious of these is the Sensor Mannequin, designed in collaboration with the Artist Formerly Known as Prince; several electrodes embedded in this device create descriptive MIDI streams as the body approaches various zones. A simpler embodiment is the "Sensor Frame", an open rectangular structure made from PVC pipe, with copper electrodes at the corners and midway along the horizontal edges.

Several research labs and commercial products have exploited many other sensing mechanisms for noncontact detection of musical gesture. Some are based on ultrasound reflection sonars, such as the EMS Soundbeam and the "Sound=Space" dance installation by longtime Stockhausen collaborator Rolf Gehlhaar. These generally use inexpensive transducers similar to the Polaroid 50 kHz electrostatic heads developed for auto-focus cameras, and are able to range out to distances approaching 35 feet. A multichannel, MIDI-controlled ranging sonar system for interactive music has been developed at the MIT Media Lab using a 40 kHz piezoceramic transducer, which, in contrast to the Polaroid systems, exhibits a much wider beamwidth (although somewhat shorter sensitive range) and produces no audible click when pinged. While these sonars can satisfy many interactive applications, they can exhibit problems with extraneous noise, clothing-dependent reflections, and speed of response (especially in a multi-sensor system); thus their operating environment and artistic goals must be carefully constrained, or more complicated devices must be designed using pulse-compression techniques.
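The ranging principle behind all of these sonars is plain time of flight:

    SPEED_OF_SOUND = 343.0   # m/s in air at about 20 C

    def sonar_range(echo_delay_s):
        """Halve the round-trip delay and multiply by the speed of sound
        to get the one-way distance to the reflector."""
        return SPEED_OF_SOUND * echo_delay_s / 2.0

    # An echo returning 12 ms after the ping puts the reflector ~2 m away:
    print(f"{sonar_range(0.012):.2f} m")   # -> 2.06 m

The round trip itself sets the response-speed limit noted above: an echo from 35 feet takes over 60 ms to return, capping a single transducer's unambiguous update rate at roughly 16 Hz.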

Infrared proximity sensors, most merely responding to the amplitude of the reflected illumination (hence not true rangefinders, as the albedo of the body will also affect the inferred distance), are being used in many modern musical applications. Examples are found in the many musical installations designed by interactive artist Chris Janney, such as his classic SoundStair, which triggers musical notes as people walk up and down a stairway, obscuring or reflecting IR beams directed above the stair surfaces. Commercial musical interface products have appeared along these lines, such as the "Dimension Beam" from Interactive Light (providing a MIDI output indicating the distance from the IR sensor to the reflecting hand), and the simpler "Synth-A-Beams" MIDI controller, which produces a corresponding MIDI event whenever any of eight visible light beams is interrupted. One of the most expressive devices in this class is the "Twin Towers", developed by Leonello Tarabella and Graziano Bertini at CNUCE in Pisa. It consists of a pair of optical sensor assemblies (one for each hand), each containing an IR emitter surrounded by 4 IR receivers. When a hand is placed above one of these "towers", it is IR-illuminated and detected by the 4 receivers. Since the relative balance between receiver signals varies as a function of hand inclination, both range and 2-axis tilt can be determined. The net effect is similar to a Theremin, but with more degrees of sonic expression arising from the extra response to the hand's attitude.
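A plausible reading of that geometry (a sketch of the principle, not Tarabella and Bertini's algorithm) treats each tower like a quadrant detector: the summed receiver signal tracks range, while the normalized differences track the hand's 2-axis tilt.

    def tower_pose(n, s, e, w):
        """Four IR receiver amplitudes around one emitter: the total
        reflected light grows as the hand approaches, and the left/right
        and front/back imbalances follow the hand's inclination."""
        total = n + s + e + w
        range_est = 1.0 / total if total > 0 else float('inf')
        tilt_x = (e - w) / total
        tilt_y = (n - s) / total
        return range_est, tilt_x, tilt_y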

Other noncontact optical tracking devices have been built, such as the "VideoHarp", introduced in 1990 by Dean Rubine and Paul McAvinney at Carnegie Mellon. This is a flat, hollow, rectangular frame, which senses the presence and position of fingers inside the frame boundary as they block the backlighting emanating from the frame edges, thereby casting a corresponding shadow onto a linear photosensor array. Appropriate MIDI events are generated as fingers are introduced and moved about the sensitive volume inside the frame, allowing many interesting mappings.

Here at the Media Lab, we have built a much larger sensitive plane, using an inexpensive scanning laser rangefinder that we have recently developed. This rangefinder (a CW phase-measuring device) is able to resolve and track bare hands crossing the scanned plane within a several-meter sensitive radius. Because the laser detection is synchronously demodulated, it is insensitive to background light. We have used this device for multimedia installations, where performers fire and control musical events by moving their hands across the scanned areas above a projection screen.
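A CW phase-measuring rangefinder infers distance from the phase lag of the returned modulation envelope rather than from a pulse's round-trip time. A minimal sketch of the arithmetic (illustrative modulation frequency, not our device's parameters):

    import math

    C_LIGHT = 3.0e8   # speed of light, m/s

    def cw_range(phase_rad, f_mod):
        """The envelope returns shifted by phase = 2*pi*f_mod*(2d/c),
        so d = c*phase/(4*pi*f_mod); unambiguous only within half a
        modulation wavelength."""
        return C_LIGHT * phase_rad / (4.0 * math.pi * f_mod)

    # A 90-degree shift at a hypothetical 10 MHz modulation -> 3.75 m
    print(cw_range(math.pi / 2.0, 10e6))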

Although they involve considerably more processor overhead and are generally still affected by lighting changes and clutter, computer vision techniques are becoming increasingly common in noncontact musical interfaces and installations. For over a decade now, many researchers have been designing vision systems for musical performance, and steady increases in available processing capability have continued to improve their reliability and speed of response, while enabling recognition of more specific and detailed features. As the cost of the required computing equipment drops, vision systems become price-competitive, since their only "sensor" is a commercial video camera. A straightforward example is the Imaginary Piano by CNUCE's Tarabella: a vision system tracks the hands of a seated player, triggering a note when they move below a vertical position threshold, with pitch determined by their horizontal coordinate.
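The Imaginary Piano mapping reduces to a few lines of edge-triggered logic. The sketch below is a hypothetical reconstruction (normalized image coordinates, an arbitrary five-octave range), not Tarabella's code:

    def imaginary_piano(hand_x, hand_y, was_below, threshold_y=0.5):
        """One tracker frame: strike a note only when the hand crosses
        below the threshold (image y grows downward); pitch follows the
        hand's horizontal position. Returns (midi_note_or_None, below)."""
        below = hand_y > threshold_y
        note = None
        if below and not was_below:               # downward crossing only
            note = 36 + int(hand_x * 60)          # map x onto 5 octaves
        return note, below

    # note, below = imaginary_piano(0.42, 0.61, was_below=False)  # strike

Because the note fires only on the downward crossing, holding a hand below the threshold does not retrigger, just as holding down a piano key does not.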

A package called "BigEye", written by Tom DeMeyer and his colleagues at STEIM, is one of the most recent video analysis environments explicitly designed for live artistic performance applications. BigEye, running in realtime on a Macintosh computer, tracks multiple regions of specified color ranges (ideally corresponding, for instance, to pieces of the performers' clothing or costumes). The output from BigEye (a MIDI or other type of data stream) is determined in a scripting environment, where sensitive regions can be defined, and different responses are specified as a function of the object state (position, velocity, etc.).
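In spirit, the tracking core of such an environment can be approximated in a few lines (a toy illustration, not STEIM's implementation): mask the pixels falling inside a color range and report the blob's centroid, which a script can then test against its user-defined sensitive regions.

    import numpy as np

    def track_color(frame_rgb, lo, hi):
        """Return the centroid (x, y) of the pixels whose RGB values lie
        inside [lo, hi], or None if nothing matches."""
        mask = np.all((frame_rgb >= lo) & (frame_rgb <= hi), axis=-1)
        ys, xs = np.nonzero(mask)
        if len(xs) == 0:
            return None
        return xs.mean(), ys.mean()

    # e.g., follow a red costume piece:
    #   pos = track_color(frame, lo=(150, 0, 0), hi=(255, 80, 80))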

Here at the MIT Media Lab, the Perceptual Computing Group, under Sandy Pentland, has explored many multimedia applications of machine vision. One of these projects, termed "DanceSpace" by Flavia Sparacino, effectively turns the body into a musical instrument, without the need for specific targets, clothing, etc. Using the "Pfinder" package developed by Chris Wren and colleagues, a human body is identified when it enters the field-of-view of a video camera, and an elliptical "blob" model is constructed of the relevant features (head, hands, torso, legs, feet) in realtime; depending on the details of the scene, update rates on the order of 20 Hz can be achieved on a moderate-capacity graphics workstation. DanceSpace attaches a set of musical controls to these features (i.e., head height controls volume, hand positions adjust the pitch of different instruments, feet fire percussive sounds), thus one essentially plays a piece of music and generates accompanying graphics by freely moving through the sensitive space. This has been applied in several dance applications, where the dancer is freed from the constraints of precomposed music, and can now control the music with their improvisational whims.

We have built another musical environment here at the Media Lab that combines both free and contact sensing in an unusual fashion. Termed "The Magic Carpet", it consists of a 4" grid of piezoelectric wires running underneath a carpet and a pair of inexpensive, low-power microwave motion sensors mounted above. The sensitive carpet measures the dynamic pressure and position of the performer's feet, while the quadrature-demodulated Doppler signals from the motion sensors indicate the signed velocity of the upper body. The Magic Carpet is quite "immersive", in that essentially any motion of the performer's body is detected and promptly translated into expressive sound. We have designed several ambient soundscapes for it, and have often installed this system in our elevator lobbies, where passers-by stop for extended periods to explore the sonic mappings.
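The quadrature demodulation is what makes the Doppler velocity signed: the I/Q pair rotates one way for an approaching body and the other way for a receding one. A minimal sketch (illustrative transmitter frequency, not the installation's firmware):

    import numpy as np

    F_TX = 10.525e9      # illustrative microwave sensor frequency, Hz
    C_LIGHT = 3.0e8      # speed of light, m/s

    def signed_velocity(i, q, sr):
        """The average per-sample phase step of the I/Q vector gives the
        signed Doppler shift f_d; target velocity then follows from
        f_d = 2*v*f_tx/c, i.e. v = f_d*c/(2*f_tx)."""
        z = np.asarray(i) + 1j * np.asarray(q)
        dphi = np.angle(z[1:] * np.conj(z[:-1]))   # per-sample phase step
        f_d = np.mean(dphi) * sr / (2.0 * np.pi)   # Hz, sign preserved
        return f_d * C_LIGHT / (2.0 * F_TX)        # m/s, sign = direction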


10) Wearables


Yet another kind of musical controller has appeared over the last couple of decades: essentially "wearable" interfaces, where a sensor system is affixed to the body or clothing of a performer. A very early example comes from composer Gordon Mumma, who inserted accelerometers and wireless transmitters into dancers' belts for performances dating from 1971. Berlin-based artist Benoit Maubrey has been designing "electro-acoustic clothing" since the early 1980's; apparel that incorporates tape recorders, synthesizers, samplers, sensors, and speakers. Some of the best-known examples of wearable controllers were likewise used during the 1980's by the New York performance artist Laurie Anderson, who pulled the triggers off a set of electronic drums and built them into a suit, enabling a percussive performance by tapping different sections of the body. Commercial companies started making such garments in the mid-80's, for instance the "Brocton-X Drum Suit", with attachable Velcro percussion sensors and heel-mounted triggers.

At the turn of the decade, Mark Coniglio, of the dance/theater company Troika Ranch, designed MidiDancer, a body suit instrumented with 8 sensors measuring motion and bend at various joints. Data is offloaded over a wireless link and converted to MIDI; the Interactor MIDI interpreter provides a GUI front-end, enabling a choreographer to quickly define multimedia events to be launched by particular dance gestures. Yamaha has recently introduced its Miburi system, consisting of a vest hosting an array of resistive bend sensors at the shoulders, elbows, and wrists, a pair of handgrips with two velocity-sensitive buttons on each finger, and a pair of shoe inserts with piezoelectric pickups at the heel and toe. Current models employ a wireless datalink between a belt-mounted central controller and a nearby receiver/synthesizer unit. Yamaha has invented a semaphore-like gestural language for the Miburi, where notes are specified through a combination of arm configurations and key presses on the wrist controllers. Degrees of freedom not used in setting the pitch are routed to timbre-modifying and pitch-bending continuous controllers. The Miburi, not yet marketed outside of Japan, has already spawned several virtuosic, somewhat athletic performers.

In contrast to fully-instrumented body suits, some intriguing musical controllers are independently cropping up in various pieces of apparel. Laurie Anderson, again a pioneer in this area, performed a decade ago with a necktie that was outfitted with a fully functional music keyboard. Here at the Media Lab, we have recently built "musical jackets", with a touch-sensitive MIDI keyboard embroidered directly into the fabric using conductive thread. We have also made a set of "expressive footwear": a retrofit to a pair of dance sneakers that inserts a suite of sensors to measure several dynamic parameters expressed at a dancer's foot (differential pressure at 3 points and bend in the sole, 2-axis tilt, 3-axis shock, height off the stage, orientation, angular rate, and translational position). These shoes require no tether; they are battery-powered for up to 3 hours and offload their data via a 20-kbit/s wireless link.

Other researchers have attached electrodes directly to the body, using neurological and other biological signals to control various sound sources in some very unusual performances. Some of the best-known such works were produced by composer David Rosenboom of Mills College during the 1970's. These "biofeedback" pieces generated sounds as a function of the performers' biological states, drawing on heart rate, GSR (skin resistance), body temperature, and of course, EEG (brainwave) data. In most of these pieces, a computer system would monitor these features and direct the sonic output as a function of their states and correlations. This has now become a commercial business of sorts, with products appearing on the market aimed partially at musical control applications. For example, a system marketed by IVBA Technologies of Norwalk, CT consists of a sensor headband, which purports to measure brainwaves of various kinds. The headband-mounted controller wirelessly communicates with a base station that provides a MIDI output. Another, more general, device is the "Biomuse", produced by BioControl Systems, a Stanford University spin-off started by Hugh Lusted and Benjamin Knapp. The Biomuse is able to gather EMG (muscle), EOG (eye movement), EKG (heart), and EEG (brainwave) signals. It also has MIDI output and mapping capability, and has been used in musical research projects. Although the controllability and bandwidth of some of these parameters (especially the brainwaves) may be debated, new musical applications won't lag far behind as researchers at various institutes progress in extracting and identifying new and more precise bioelectric features.

The different sensors developed for the Virtual Reality community have been rapidly pushed into musical applications; various musical researchers have worked with the magnetic tracking systems and datagloves upon which the VR world was built. For example, Jaron Lanier, a well-known pioneer in this field, is still a very active musician, incorporating a host of different VR and musical interfaces (along with traditional and ethnic acoustic instruments) into his live performances. Gloves, in particular, have appeared in many musical performances. One example is "Bug Mudra", composed by Tod Machover here at the Media Lab, in which the conductor wore an Exos "Dexterous Hand Master", gaining complete dynamic control over the audio mix and synthesis parameters through finger positions. Many other composers, such as Richard Boulanger at the Berklee College of Music in Boston, have used Mattel's "Power Glove" (the low-cost brother of the original VPL DataGlove, intended for the home gaming market) as a controller in several pieces.

Some of the most interesting glove and hand controllers have come from STEIM, the Dutch center for electronic performance research in Amsterdam. Michel Waisvisz's "Hands", first built at STEIM in 1984, consist of a pair of plates strapped to the hands, each equipped with keys for fingering and other sensors that respond to thumb pressure, tilt, and the distance between the hands. Waisvisz has written many pieces for this expressive controller, and still uses it in performance. Ray Edgar has built a related controller at STEIM called the "Sweatstick", where the hand interfaces are constrained to slide along a bendable rod; in addition to the keypad switches, the distance between the controllers, their rotation around the rod, and the bend of the rod are all measured. Laetitia Sonami has also built a device at STEIM called the "Lady's Glove", an extremely dexterous system, with bend sensors measuring the inclination of both finger joints on the middle 3 fingers, microswitches at the fingertips for tactile control, Hall sensors measuring the distance of the fingers from a magnet in the thumb, pressure sensing between index finger and thumb, and sonar ranging to emitters in the belt and shoe. Walter Fabeck designed the "Chromasone" while at STEIM; this likewise uses a glove to measure finger bending, together with sonar tracking to measure the position of the hand above an illuminated Lucite dummy keyboard, providing a physical frame of reference for the performer and audience. One of the most impressive things about the STEIM environment is its connection to the "street-smart" musical avant-garde. The STEIM artists don't keep these innovative devices in the laboratory, but regularly gig with them at different performance venues and music clubs throughout Europe and around the world.


11) The Horizon...


General-purpose personal computers are rapidly becoming powerful enough to subsume much of musical synthesis. Essentially all current computers arrive equipped with quality audio output capability, and very capable software synthesizers are now commercially available that run on a PC and require no additional hardware. Over time, the software synthesis capabilities of PC's will expand, and dedicated hardware synthesizers will be pushed further into niche applications. As more and more objects in our environment gain digital identity and are absorbed into the ubiquitous user interface that technology is converging toward, controllers designed to provide generic computer input of various sorts will increasingly be bent toward musical applications. Several indications of this trend are already evident. At a low level, depending on their settings, standard operating systems tend to barrage users with a symphony of complicated sonic events as they navigate through mundane tasks. While this is not necessarily a performance, software packages exist that use the standard computer interface as a musical input device (starting with Laurie Spiegel's Music Mouse, already available in the late 1980's, there are now many different packages, e.g., Pete Rice's "Stretchable Music" program developed here at the Media Lab). In the not-too-distant future, perhaps we can envision quality musical performances being given on the multiple sensor systems and active objects in our smart rooms, where, for instance, spilling the active coffee cup in Cambridge can truly bring the house down in Rio.