1. Project Presentation

Electronic mixing of several sound sources is a relatively recent process. Multi-track recorders date from the late 60s, and it was only with this innovation that it became possible to mix at a different time from the recording session. Even the capability of mixing down separate sound sources while recording is not much older than the early 40s; before this, only one microphone was used, and the relationship between elements was achieved through the physical distribution of performers within the acoustic space. Mixing has historically been a technical process, but over the years it has shifted towards an artistic practice with the incorporation of several creative approaches. Nevertheless, it is an activity that has always depended exclusively on human operation, even though automation systems have been experimented with over the last decades, in order to provide ways for a computer to record and make mixing decisions, resulting in multiple mixes that can be played back.

With the development of techniques associated with adaptive digital audio effects, in which the processing parameters are controlled by automatic analysis of the input rather than by human choice, systems have emerged that can analyze multiple inputs and process all of them as an iterative network. This research work aims to explore the decision-making processes behind traditional mixing practices and to implement algorithms that can assist human performance with digital systems. Even though the term “automatic mixing” seems to point towards an elimination of the human factor, the real benefit of this type of system lies in augmenting the mixing engineer’s capability to undertake complex processes that would be too cumbersome without machine assistance.
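As an illustration of the idea that analysis of one input can drive the processing of another, the following minimal sketch attenuates a target signal whenever a control signal becomes loud. The function names, the block size, the threshold and the amount of attenuation are all assumptions chosen for the example, not parameters of any existing system.

```python
import math


def rms(frame):
    """Root-mean-square level of one block of samples (0.0 for an empty block)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame)) if frame else 0.0


def cross_adaptive_duck(target, control, frame_size=4, threshold=0.1, max_cut=0.5):
    """Attenuate `target` block by block whenever `control` exceeds `threshold`.

    All parameter values are illustrative assumptions.
    """
    out = []
    for start in range(0, len(target), frame_size):
        # Analysis step: measure the level of the control input in this block.
        ctrl_level = rms(control[start:start + frame_size])
        # Decision step: the measured level, not a human, sets the gain.
        gain = max_cut if ctrl_level > threshold else 1.0
        # Processing step: apply the gain to the other input.
        out.extend(s * gain for s in target[start:start + frame_size])
    return out
```

A practical system would smooth the gain over time to avoid audible steps; the sketch omits this to keep the analysis, decision and processing stages visible.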


2. Artistic and Scientific Background

Since the 1970s there has been speculation about the possibilities of automatic mixing [Gonzalez & Reiss, 2009]. An automatic mixing system is to be understood as a mechanism that accepts a variable number of inputs and produces a mono, stereo or multi-channel output that reflects an equilibrium of features between all sound sources. This equilibrium needs to prove stable in terms of perceived volume, panning, equalization and depth, among other features. The most universally accepted criterion for a balanced mix is the intelligibility of all elements, which is the practical measure of the equilibrium mentioned above. Dan Dugan’s work in the late 70s was pioneering in the areas of volume balancing and feedback cancellation [Gonzalez & Reiss, 2009]. His analogue systems are still used in live events that involve a large number of microphones, each of which is active only occasionally. With the emergence of digital audio effects, research work by Joshua Reiss and Udo Zölzer introduced intelligent panning, phase compensation, and intelligibility enhancement based on psychoacoustic principles.
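The notion of an equilibrium in perceived volume can be sketched, under strong simplifying assumptions, as a level-balancing step that brings every input to a common level before summing. The sketch below uses RMS level as a crude stand-in for perceived loudness and the average RMS of the active tracks as the target; both choices, as well as the function names, are assumptions made for illustration and do not reproduce any of the systems cited above.

```python
import math


def rms(track):
    """Root-mean-square level of a track (0.0 for an empty track)."""
    return math.sqrt(sum(s * s for s in track) / len(track)) if track else 0.0


def balance_gains(tracks, eps=1e-12):
    """Return one linear gain per track so that all active tracks reach the same RMS."""
    levels = [rms(t) for t in tracks]
    active = [lv for lv in levels if lv > eps]
    if not active:
        return [1.0] * len(tracks)
    target = sum(active) / len(active)  # assumed target: mean level of active tracks
    # Silent tracks keep unity gain to avoid dividing by zero.
    return [target / lv if lv > eps else 1.0 for lv in levels]


def mix_to_mono(tracks):
    """Apply the balancing gains and sum all tracks into a single mono output."""
    gains = balance_gains(tracks)
    length = max((len(t) for t in tracks), default=0)
    out = [0.0] * length
    for gain, track in zip(gains, tracks):
        for i, sample in enumerate(track):
            out[i] += gain * sample
    return out
```

A perceptually motivated system would replace the RMS measurement with a loudness model and would also balance panning, equalization and depth, which this sketch does not attempt.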

Most of the systems being researched are static: their parameters are determined after an initial analysis of the input audio material and remain unchanged until the user takes an action. Dynamic systems are still at an early stage of development. Some reasonably simple commercial products offering automatic mixing have been made available, in spite of resistance from the professional market.

The present research work aims to explore two topics that are still very open within the area of dynamic mixing systems. The first is dynamic intelligent equalization: a system that can make decisions on the spectral content of a given number of inputs and adapt the frequency contour of each of them in a way that avoids conflicts, in an ever-updating process. The second is automatic temporal processing: by analyzing the reverberation time of a given place, the system tries to adapt artificial reverberation so that the sum of both is in sync with the musical tempo and content of the sound source.
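One possible way to picture the equalization topic is a rule that, for each frequency band, finds the input with the most energy and attenuates the other inputs in proportion to how strongly they compete in that band. The sketch below assumes that per-band energies have already been measured for the current analysis frame; the function name, the proportional rule and the 6 dB maximum cut are hypothetical choices for illustration only, not the method this research will adopt.

```python
def band_conflict_cuts(band_energies, max_cut_db=6.0, eps=1e-12):
    """Suggest per-band cuts in dB (zero or negative) for every track.

    band_energies[t][b] is the energy of track t in band b for one analysis frame.
    The dominant track in each band is left untouched; the others are cut in
    proportion to their share of the band energy (an assumed heuristic).
    """
    n_tracks = len(band_energies)
    n_bands = len(band_energies[0]) if n_tracks else 0
    cuts = [[0.0] * n_bands for _ in range(n_tracks)]
    for b in range(n_bands):
        column = [band_energies[t][b] for t in range(n_tracks)]
        total = sum(column)
        if total <= eps:
            continue  # nothing sounds in this band, so there is no conflict
        dominant = max(range(n_tracks), key=lambda t: column[t])
        for t in range(n_tracks):
            if t == dominant:
                continue
            share = column[t] / total  # at most 0.5 for a non-dominant track
            cuts[t][b] = -max_cut_db * min(1.0, 2.0 * share)
    return cuts
```

Calling such a function once per analysis frame, and smoothing the resulting cuts over time, would give the ever-updating behaviour described above; the smoothing and the filter bank itself are outside the scope of the sketch.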


3. Methodology

The initial development process will be based on a thorough review of primary and secondary literature sources, which will provide general guidelines for the dissertation. This literature will encompass topics related to digital signal processing (particularly adaptive digital audio effects). It will also address themes that are less emergent and more grounded in the areas of physics, acoustics, psychoacoustics, music, music information retrieval, computer science, neuroscience and sound engineering. Within these fields, it is particularly important to explore the perception and reconstruction of realistic aural environments, machine learning and artificial intelligence.

During this stage of the process, questionnaires will be administered to top professionals in the mixing industry, in order to support the design of the main guidelines that will define the parameters of the automatic system. This will also be the period in which to study the validation and evaluation methodologies that may be pursued in later stages. The planned activities include building prototypes that produce testable results, which can be evaluated by groups of listeners.


4. Expected Results and Objectives

This work plan intends to deepen the research on adaptive digital audio effects, which, at present, are still for the most part derived from former analogue systems. To date, there are no widespread adaptive automatic mixing products that can be integrated with the mainstream professional platforms, and research solutions are scarce. This is an area that demands the integration of many different disciplines, such as music, digital signal processing, acoustics, psychology, neurology and a strong component of sound engineering, and consequently it falls into a very specific development niche. However, the results of this kind of research have a very high potential for the music technology market, and broad applications in other fields of knowledge.

As a proof of concept, a working prototype will be developed, and its results will be evaluated. Guidelines and practical insights for the development of workable final products are expected to emerge from this prototype. We shall determine which signal processing methods are more robust for the development of large-scale adaptive digital audio effects, always with the purpose of contributing new methods. This research project will also provide a good framework for achieving relevant new insights into a psychological approach to the study of audio mixing.


5. Starting Bibliography

Ballou, G.M. et al. “Handbook for Sound Engineers”, Third Edition. Focal Press/Elsevier, 2002.

Barchiesi, D. Reiss, J.D. “Automatic Target Mixing Using Least-Squares Optimization of Gains and Equalization Settings”. Proc. of the 12th Int. Conference on Digital Audio Effects (DAFx-09), Como, Italy, September 1-4, 2009.

Barchiesi, D. Reiss, J.D. “Reverse Engineering of a Mix”. J. Audio Eng. Soc., Vol. 58, No. 7/8, 2010 July/August.

Bitzer, J. et al. "Evaluating perception of salient frequencies: Do mixing engineers hear the same thing?". Proceedings of the Audio Engineering Society 124th Convention, 2008.

Boulanger, R. (ed.). “The Csound Book - Perspectives in software Synthesis, Sound Design, Signal Processing and Programming”. Cambridge (MA). The MIT Press, 2000.

Cook, P.R., “Real Sound Synthesis for Interactive Applications”. A K Peters, New York, 2002.

Dugan, D. "Application of Automatic Mixing Techniques to Audio Consoles". Proceedings of the Audio Engineering Society Convention, 1989.

Dugan, D. "Automatic Microphone Mixing". Proceedings of the Audio Engineering Society Convention, 1975.

Heise, S. Hlatky, M. Loviscach, J. “Automatic Adjustment of Off-the-Shelf Reverberation Effects”. Proceedings of the Audio Engineering Society 126th Convention.

Julstrom, S. Tichy, T. "Direction-Sensitive Gating: A New Approach to Automatic Mixing". JAES, vol. 32, pp. 490-506, July/August 1984.

Kolasinski, B.A. “A framework for automatic mixing using timbral similarity measures and genetic optimization”. Proceedings of the Audio Engineering Society 124th Convention, 2008.

Mathews, M. V. Miller, J.E. Moore, F.R. Pierce, J.R. Risset, J. C. “The Technology of Computer Music”. Cambridge (MA). The MIT Press, 1969.

Tsingos, N. “Scalable Perceptual Mixing and Filtering of Audio Signals Using an Augmented Spectral Representation”. Proceedings of the DAFx'05, Sophia Antipolis, France, 2005.

Pachet, F. Delerue, O. "On-the-Fly Multi-Track Mixing". Proceedings of the Audio Engineering Society 109th Convention, Los Angeles, California, USA, 2000.

Perez-Gonzalez, E. Reiss, J.D. "Automatic Gain and Fader Control For Live Mixing". 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics.

Perez-Gonzalez, E. Reiss, J.D. “An automatic gain normalisation technique with applications to audio mixing”. Proceedings of the Audio Engineering Society 124th Convention Amsterdam, The Netherlands, 2008.

Perez-Gonzalez, E. Reiss, J.D. “Automatic mixing: live downmixing stereo panner”. Proceedings of DAFx-07, Bordeaux, France, 2007.

Perez-Gonzalez, E. Reiss, J.D. “Automatic equalization of multi-channel audio using cross-adaptive methods”. Proceedings of the Audio Engineering Society 127th Convention, New York, NY, USA, 2009 October 9–12.

Perez-Gonzalez, E. Reiss, J.D. “Determination and correction of individual channel time offsets for signals involved in an audio mixture”. Proceedings of the Audio Engineering Society 125th Convention, San Francisco, CA, USA, 2008 October 2–5.

Perez-Gonzalez, E. Reiss, J.D. “Improved control for selective minimization of masking using interchannel dependency effects”. Proceedings of the 11th International Conference on Digital Audio Effects (DAFx), September 2008.

Perez-Gonzalez, E. Reiss, J.D. “A Real-Time Semiautonomous Audio Panning System for Music Mixing”. EURASIP Journal on Advances in Signal Processing.

Reed, D. “A perceptual assistant to do sound equalization”, in Intelligent User Interfaces 5th Conference, Jan 2000.

Roads, C. Strawn, J. Abbott, C. Gordon, J. Greenspun, P. “The Computer Music Tutorial”. Cambridge (MA) : The MIT Press, 1996.

Sánchez, S. M. “Extending Automatic Audio Mixing with Dynamics and Equalization Effects”. Master's thesis, UPF, 2009.

Schick, B. Maillard, R. Spenger, C.C. "First investigations on the use of manually and automatically generated stereo downmixes for spatial audio coding". Proceedings of the Audio Engineering Society 118th Convention, Barcelona, Spain, 2005.

Sethares, W.A., Milne, A.J. Tiedje, S. Prechtl, A. and Plamondon J. “Spectral tools for dynamic tonality and audio morphing”. Computer Music Journal, vol. 33, no. 2, pp. 71–84, 2009.

Soo, S. Pang, K.K. “Multidelay block frequency domain adaptive filter”. IEEE Trans. on Acoustics, Speech and Signal Processing, vol. 38, no. 2, Feb. 1990.

Terrell, M.J. Reiss, J.D. “Automatic Monitor Mixing for Live Musical Performance”. J. Audio Eng. Soc., Vol. 57, No. 11, 2009 November.

Terrell, M.J. Reiss, J.D. “Automatic Noise Gate Settings for Multitrack Drum Recordings”. Proc. of the 12th Int. Conference on Digital Audio Effects (DAFx-09), Como, Italy, September 1-4, 2009.

Verfaille, V. Zölzer, U. Arfib, D. “Adaptive digital audio effects (A-DAFx): a new class of sound transformations”. IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 5, pp. 1817-1831, 2006.

Zölzer, U. "DAFX - Digital Audio Effects". John Wiley & Sons, Inc., New York, 2002