Centre for Digital Music

 
 

Proposed PhD Research Topics

The Centre for Digital Music welcomes PhD research in any of our general areas of interest, broadly covering the field of music and audio technology, including informatics, retrieval, signal analysis and music understanding.

In addition, we are advertising PhD research in the following topic areas:

Semantic Audio Systems for Studio and Live Sound applications

Contact: Dr. Josh Reiss, josh.reiss@elec.qmul.ac.uk

Musical audio contains a wealth of information that can be extracted using advanced signal processing techniques and represented as musical information. This project aims to use that information in the studio to dramatically simplify the workflow of recording engineers, and in live sound applications to help set up the PA, so that even small bands can have a professional sound.

Intelligent Mixing of Live Multichannel Sound

Contact: Dr. Josh Reiss, josh.reiss@elec.qmul.ac.uk

This PhD topic aims to intelligently generate an automatic sound mix out of an unknown set of multi-channel inputs. The input channels can be analysed to determine preferred settings for gain, equalization, compression, reverb, and so on. The research explores the possibility of reproducing the mixing decisions of a skilled audio engineer with minimal or no human interaction. This research has application to live music concerts, remote mixing, recording and postproduction as well as live mixing for interactive scenes.
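As a purely illustrative sketch of analysing input channels to derive a gain setting, the Python below measures the RMS level of each channel and computes the gain that would bring it to a common target level. This is an assumption-laden toy, not the method used by researchers at the Centre: the function names (`rms`, `equal_level_gains`) and the `target_rms` parameter are hypothetical, and a real system would work on perceptual loudness, over time, and across many more parameters than gain.

```python
import math


def rms(samples):
    """Root-mean-square level of one channel (0.0 for an empty channel)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def equal_level_gains(channels, target_rms=0.1):
    """Return one linear gain per channel so each channel reaches target_rms.

    Silent channels get a gain of 0.0 to avoid dividing by zero.
    `target_rms` is an arbitrary illustrative value, not a recommendation.
    """
    gains = []
    for samples in channels:
        level = rms(samples)
        gains.append(target_rms / level if level > 0 else 0.0)
    return gains


if __name__ == "__main__":
    loud = [0.5, -0.5, 0.5, -0.5]
    quiet = [0.05, -0.05, 0.05, -0.05]
    print(equal_level_gains([loud, quiet]))
```

Equal RMS is only a starting point; matching levels in this way ignores masking between instruments and the engineer's artistic intent, which is precisely where the research questions lie.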

Current automated mixers can save a timeline of static mix scenes, which can be loaded for later use, but they cannot adapt to a different room or to a different set of inputs. In other words, they lack the ability to make mixing decisions automatically.

This research is motivated by the need of non-expert audio operators and musicians to achieve a quality mix with minimal effort. Currently, mixing is a task that requires great skill and practice, and can sometimes be tedious. For the professional mixing engineer, this kind of tool will reduce sound check time and will prove useful at multi-act events and festivals, where changeovers from one group to the next must be done very quickly. Large audio productions tend to have hundreds of channels, so being able to place some of those channels in an automatic mode will ease the engineer's mixing task. There is also the possibility of applying this technology to remote mixing applications where latency is too large to allow interaction with all aspects of the mix.

This research topic builds on previous, successful work by researchers within the Centre for Digital Music, but is broad enough in scope that it could be taken in new and exciting directions.

Recommended skills

  • Knowledge of audio signal processing
  • Programming skills
  • Knowledge of music and/or sound engineering

 

Development of Interchannel Dependent Audio Effects

Contact: Dr. Josh Reiss, josh.reiss@elec.qmul.ac.uk

Most digital audio effects, whether implemented as plug-ins for mixers and audio editors or implemented as offline audio signal processing techniques, typically take a single channel as input and produce a single channel as output. The exceptions to this are fairly simple, such as ducking (which modifies one channel based on the level of another) and stereo effects (which produce two output channels).
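To make the ducking example concrete, here is a minimal Python sketch of an effect in which one channel's gain depends on the level of another. It is a simplification under stated assumptions: the function name `duck` and its parameters (`threshold`, `reduction`, `window`) are hypothetical, the level detector is a plain moving average of absolute values, and the gain switches instantly, whereas a practical ducker would smooth the gain with attack and release times.

```python
def duck(main, sidechain, threshold=0.1, reduction=0.25, window=4):
    """Attenuate `main` wherever the short-term level of `sidechain` is high.

    main, sidechain: equal-length sequences of float samples.
    threshold: sidechain level above which ducking is applied.
    reduction: linear gain applied to `main` while ducking (0.25 is about -12 dB).
    window: number of most recent sidechain samples averaged for the level.
    """
    if len(main) != len(sidechain):
        raise ValueError("channels must have the same length")
    if window < 1:
        raise ValueError("window must be at least 1")

    out = []
    for n, sample in enumerate(main):
        # Mean absolute value over the last `window` sidechain samples.
        start = max(0, n - window + 1)
        frame = sidechain[start:n + 1]
        level = sum(abs(x) for x in frame) / len(frame)
        gain = reduction if level > threshold else 1.0
        out.append(sample * gain)
    return out


if __name__ == "__main__":
    music = [1.0] * 6
    voice = [0.0, 0.0, 0.8, 0.8, 0.0, 0.0]
    print(duck(music, voice, window=1))
```

Even this simple case shows the structural point: the output of one channel cannot be computed without access to another channel's signal, which is what a single-input plug-in interface does not provide.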

The goal of this research is to develop MIMO (Multiple Input, Multiple Output) audio effects. These can be used to create different versions of a multichannel recording which are tailored to different listeners, or to modify channels based on the content in many other channels. Applications include live sound, where a customized mix may be fed back to each performer.

Current audio editors do not offer the ability to create plug-ins which may analyse or modify the multi-channel content. Thus this work will also involve either submitting modifications to an open source audio editor, or creating your own with the required functionality.

Recommended skills

  • Knowledge of audio signal processing
  • Programming skills
  • Knowledge of music and/or sound engineering

 

Acoustic autofocus

Contact: Dr. Josh Reiss, josh.reiss@elec.qmul.ac.uk

This project will pioneer research and development of an auto-focus for audio. It builds on the proposal of an acoustical zoom using Directional Audio Coding (DirAC). However, DirAC requires B-format microphones, which are large and expensive. The researcher will investigate methods of real-time stereo field decomposition to determine the direction of arrival of sound sources. Focus may be achieved by adjusting gain, delay, reverb and equalisation of the two microphone signals in order to emphasise those sources whose direction of arrival is nearer to the centre of the stereo soundfield. Other solutions, such as microphones with variable directivity, will also be investigated. Outcomes of the research could include improved methods for real-time direction of arrival estimation, psychoacoustic studies of preference for sound scene editing and a demonstrator that includes a stereo microphone synchronized to the zoom lens of a video camera.
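One textbook way to estimate direction of arrival from two microphone signals is to find the inter-channel delay that maximises their cross-correlation and convert it to an angle. The Python sketch below illustrates that idea only; it is not the method proposed for this project. It assumes a spaced microphone pair, a single dominant far-field source, and a speed of sound of 343 m/s, and the function names (`estimate_delay`, `delay_to_angle`) and the sign convention are hypothetical choices made for the example.

```python
import math


def estimate_delay(left, right, max_lag):
    """Return the lag d (in samples) maximising sum(left[n] * right[n + d]).

    A positive result means the right channel lags the left channel.
    Lags are tried in order of increasing magnitude, so silence yields 0.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in sorted(range(-max_lag, max_lag + 1), key=abs):
        score = 0.0
        for n, value in enumerate(left):
            m = n + lag
            if 0 <= m < len(right):
                score += value * right[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle(lag, sample_rate, mic_spacing_m, speed_of_sound=343.0):
    """Far-field model: sin(angle) = c * delay / spacing. Returns degrees.

    0 degrees is straight ahead; with the convention above, positive angles
    point towards the left microphone.
    """
    ratio = speed_of_sound * (lag / sample_rate) / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp against rounding and noise
    return math.degrees(math.asin(ratio))


if __name__ == "__main__":
    left = [0, 0, 1, 0, 0, 0, 0, 0]
    right = [0, 0, 0, 0, 1, 0, 0, 0]  # same click, two samples later
    lag = estimate_delay(left, right, max_lag=4)
    print(lag, delay_to_angle(lag, sample_rate=48000, mic_spacing_m=0.2))
```

With several simultaneous sources, or with a coincident stereo microphone where level differences rather than time differences carry the directional cue, a single broadband delay estimate like this is not sufficient, which is one reason stereo field decomposition is an open research question.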

 

Evaluating Mutual Engagement through New Musical Interfaces

Contact: Nick Bryan-Kinns

Whilst research into designing New Interfaces for Musical Expression has become a research field in its own right, there are still very few reliable methods and approaches to evaluating people's engagement with these interfaces, let alone understanding their mutual engagement with each other. This is a major problem in the field which hampers our ability to understand and inform the design of future musical interfaces. PhD projects in this area would involve designing novel music making devices for multiple participants, and developing suitable evaluation frameworks and methodologies based on existing research in HCI, Interactive Art, and User Experience.

 

Multimodal Location-Based Techniques for Extreme Navigation and Sports Training

Contact: Tony Stockman

Location-based data and services for geographical and navigational information (such as electronic maps and GPS directions) are usually presented using visual displays. With the increasing complexity of information and the variety of contexts of use, it becomes important to consider how non-visual sensory channels, such as audition and touch, can be used to communicate necessary and timely information to users. Running, rock-climbing and cycling are all examples of activities where navigational and geographical information may be needed, but where the visual modality is unsuitable. Additionally, a number of user groups, such as visually impaired people and the emergency services, also require non-visual access to geo-data or to information concerning the location of fellow players and opponents in team sports. This PhD project will explore research ideas and findings about new interaction and perceptualization metaphors, novel application contexts, and multimodal and context-aware technologies for mobility, contributing to a solid foundation for the further development of pervasive extreme navigation.

 

Intelligent Instrument Recognition

Contact: Dr. Josh Reiss, josh.reiss@elec.qmul.ac.uk

Musical Instrument Identification is one of the more well-known tasks in musical signal processing. There are standard procedures and techniques for this, yet the classification rates are often very poor. The techniques are often focused on single instrument sounds, and fail when applied to testbeds whose qualities differ notably from those of the training data. Furthermore, they are rarely adapted to the task of Musical Instrument Segmentation, and thus cannot be easily used to, for instance, identify guitar solos in popular recordings.

Researchers at the Centre for Digital Music have developed more sophisticated instrument identification techniques that focus on the spectral content produced by each instrument. These have yielded exceptionally high classification rates on standard testbeds. The goal of this research would be to assess and implement these techniques, and then to adapt them to the task of Instrument Segmentation and Labelling, with an emphasis on diverse testbeds. The planned outcome of this research is a clear advancement in the state of the art of the performance and usability of instrument recognition techniques.

Prerequisites for this project are programming skills and an understanding of musical signal analysis and processing. This project will also require frequent interaction with other researchers.

Recommended skills

  • Knowledge of audio signal processing and machine learning techniques
  • Programming skills

 

Making Sense of Sounds: Towards Intelligent Machine Listening

Contact: Prof Mark Plumbley

Sound is all around us, coming from every direction. The aim of this project is to analyse audio signals to extract individual sounds from complex mixtures, much as our ears are able to do. This could be used to re-purpose old music recordings (e.g. extract just the guitar), to enhance a speaker whose words are buried in background noise, or to determine which direction a particular sound is coming from.

 

Music Informatics for On-line Music Recommendation Systems

Contact: Dr Simon Dixon, simon.dixon@eecs.qmul.ac.uk

With online music stores offering millions of songs to choose from, users need assistance. Using machine learning, ontologies and the semantic web, this project seeks to assist people in finding the music they want, whether by playing an example of something similar, by humming, or by entering a musicological query (such as 12 Bar Blues in G, with guitar and no drums).

 

Automatic transcription from audio to common music notation

Contact: Dr Simon Dixon, simon.dixon@eecs.qmul.ac.uk

Automatic music transcription systems attempt to extract a representation of the musical content of an audio signal. Ideally, the transcription will capture the musical essence of a performance, and can be played back by a performer. However, most transcription systems attempt to transcribe from audio to MIDI; that is, at best they represent only note pitch, onset, and duration times. Yet common music notation describes music in terms of a score which contains information on metre, rhythm, key, chord symbols, instruments, performance instructions (dynamics, fingering) and so on.
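To illustrate how little an audio-to-MIDI representation carries, the Python sketch below turns a detected fundamental frequency, onset and duration into a MIDI-style note event. It assumes twelve-tone equal temperament with A4 = 440 Hz mapped to MIDI note 69, and the octave-naming convention in which MIDI note 60 is written C4. The function names and the dictionary layout are hypothetical, chosen only for this example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def frequency_to_midi(freq_hz, a4_hz=440.0):
    """Nearest MIDI note number for a frequency, assuming equal temperament."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return round(69 + 12 * math.log2(freq_hz / a4_hz))


def midi_to_name(note):
    """Pitch name with octave, using the convention that MIDI 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"


def make_note_event(freq_hz, onset_s, duration_s):
    """Bundle the three quantities an audio-to-MIDI transcription captures."""
    pitch = frequency_to_midi(freq_hz)
    return {
        "pitch": pitch,
        "name": midi_to_name(pitch),
        "onset": onset_s,
        "duration": duration_s,
    }


if __name__ == "__main__":
    print(make_note_event(261.63, onset_s=0.0, duration_s=0.5))
    print(make_note_event(440.0, onset_s=0.5, duration_s=0.5))
```

Nothing in such an event says whether the note falls on a downbeat, which key it belongs to, or whether the sharp should be spelled as a flat; those are among the elements of common music notation that a richer transcription would have to infer.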

Recent advances in audio signal processing have shown that these features may, to some degree, also be extracted automatically. Thus the goal of this research is to take current research one step further, to go beyond multi-pitch detection and towards a richer transcription of an audio signal in common music notation. This work could be focussed on a specific application, such as music practice, music education, ethnomusicology, or performance analysis. Knowledge of programming, signal processing and music theory is required.

Recommended skills

  • Knowledge of audio signal processing
  • Programming skills
  • Knowledge of music theory

 

Mining Music Metadata

Contact: Dr Simon Dixon, simon.dixon@eecs.qmul.ac.uk

Millions of crowd-sourced transcriptions of musical performances exist, in formats such as MIDI files, chord sheets, and tablature. This project looks at mining the web for this data and using it in one or more of the following ways:

  • to inform audio analysis tasks such as beat tracking, (partial) transcription, instrument identification, and expressive performance analysis
  • to combine different partial transcriptions to create a fuller transcription
  • to understand aspects of human music perception, e.g. via common mistakes in transcription

 

A Comprehensive Music Information System

Contact: Dr Simon Dixon, simon.dixon@eecs.qmul.ac.uk

Vast amounts of music information are available on the Web, as well as in libraries and music databases, including: structured and unstructured text (e.g. biographical metadata); complete or partial scores (e.g. chords, tabs, lead sheets, lyrics); and recordings of performances (e.g. mixed, multi-track, video). For any given piece of music, many such instances might exist, and an ideal music information system would allow navigation between the various representations at both the document and fragment levels, based on the relationships between and within documents that it has discovered. Such a system would enhance both the human experience (e.g. browsing, search) and automatic analysis (e.g. informed transcription, multimodal similarity computation). To realise this vision, the open research questions involve: finding related documents; synchronising them; representing the relationships in ways that are meaningful to users and that allow reuse by other systems; and licensing/IP issues.