VA Mixing Session - Multi-Track Mix

This document covers the production of a fully encoded multi-track mix, using a set of back-catalogue guitar / bass / kit / synth stems as source material for the renders. The stems were exported from Logic Pro and imported into the Unity engine as audio assets, and a material asset was created specifically for this session. For the room model defining the surrounding interior geometry, a live room hall was built from the engine’s GameObjects and tagged with the Steam Audio Geometry component, which was assigned the custom material asset. The material’s absorption / scattering / transmission properties were adjusted to shape the frequency response of the virtual rendering environment, producing a virtual acoustic space (VAS) that is mid-high focused, with a shiny / airy timbre and a low level of bass-frequency reflections.
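
As a rough illustration of why a material that absorbs low frequencies more than highs gives a bright room with little bass reflection, the Sabine rule of thumb can be applied per frequency band. This is only a sketch: the room dimensions and absorption coefficients below are hypothetical placeholders rather than values from the session, and the engine simulates reflections itself rather than using this formula.

```python
# Sabine reverberation-time estimate per frequency band.
# All room dimensions and absorption coefficients below are hypothetical
# placeholders, not values taken from the session.

def sabine_rt60(volume_m3, surface_m2, absorption):
    """Return the Sabine RT60 in seconds: 0.161 * V / (S * alpha)."""
    if not 0.0 < absorption <= 1.0:
        raise ValueError("absorption must be in (0, 1]")
    return 0.161 * volume_m3 / (surface_m2 * absorption)

# Hypothetical shoebox live room: 12 m x 8 m x 5 m.
length, width, height = 12.0, 8.0, 5.0
volume = length * width * height
surface = 2 * (length * width + length * height + width * height)

# Hypothetical three-band material: absorbs lows strongly, highs weakly.
material = {"low": 0.60, "mid": 0.15, "high": 0.10}

for band, alpha in material.items():
    print(f"{band}: RT60 ~ {sabine_rt60(volume, surface, alpha):.2f} s")
```

With coefficients like these, the low band decays much faster than the mid and high bands, which is consistent with the mid-high focused character described above.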

The material properties used can be compared with an impulse response of the VAS, which indicates that these properties are present within the rendering environment [additional impulse response generation and signal data can be found at the end of this document].

Upon rendering, the choice was made to keep instrumentation together according to its grouping: all four guitar stems in the arrangement were rendered together, as were all the drum stems, while the individual bass and synth stems were each rendered separately. As the renders were made within the same model room, with the same HRTF listener perspective, the final layered virtual acoustic (VA) stems are consistent in their VAS room spectra.

Whilst the room model and HRTF positioning were consistent, the creative use of source placement equated to the balancing process of a mixing console, enhanced with additional depth, placement and spectral characteristics. Source placement began with the guitars: distances and angles across the horizontal and vertical dimensions, relative to the listener perspective, were adjusted, monitored and reconsidered until a final arrangement was reached. The spread of guitars remained at a mid-way proximity to the listener. They were arranged in a symmetrical pattern; as each source carried its own unique stem, no directional cancellation took place, which could otherwise arise from the in-phase ILD / ITD of identical signals.
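
To give a sense of scale for the ITD cue referred to here, the sketch below uses Woodworth's spherical-head approximation. It is illustrative only: the HRTF renderer used in the session relies on measured responses rather than this formula, and the head radius is an assumed average.

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s in air at roughly 20 degrees C

def woodworth_itd(azimuth_deg):
    """Approximate ITD in seconds for a far-field source.

    azimuth_deg is measured from straight ahead, 0 to 90 degrees.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (theta + math.sin(theta))

for az in (0, 30, 60, 90):
    print(f"{az:>2} deg -> {woodworth_itd(az) * 1e6:.0f} microseconds")
```

The delay grows from zero for a source straight ahead to well under a millisecond for a source directly to one side, which is why small changes in source angle are audible as shifts in placement.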

The following is the arrangement of these guitar sources. Once rendered and recorded, the result is comparable to a bounce via a group bus or aux: the four original stereo multitracks are rendered into one stereo file containing the stems as well as the additional reflection cues arriving from the walls of the material-assigned geometry. It is worth noting that one guitar was processed with a reverb effect before importing. This was to trial how an existing effect would translate into the VAS and final renders compared with applying it after rendering, and also to mask some hiss on the original stem.

Render recording - VA Rec Guitars Stem.wav https://drive.google.com/file/d/1zcLypBSXgwKdS5mgFdSYmJcfIgBJHzIN/view?usp=share_link

Following the guitar rendering, the bass was rendered separately so that later level adjustments remain possible if necessary. As all stems rendered together are encoded at a fixed level defined at render time, the individual guitar stem levels, for example, cannot be adjusted later, and some instrumentation may therefore benefit from separate renders. With enough computational power, all multitrack stems could be arranged, balanced and rendered together; however, this has a high resource demand and, as rendering is done only in real time, carries a high risk of unwanted artefacts such as audio tearing.
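
The point about fixed levels can be shown with a deliberately simplified sketch. It ignores the room reflections and binaural filtering entirely and treats a grouped render as a plain sum of stems; the stem names and sample values are hypothetical.

```python
# Two hypothetical mono stems, a few samples each.
guitar_a = [0.5, -0.2, 0.1, 0.4]
guitar_b = [0.1, 0.3, -0.3, 0.2]

def render_group(stems, gains):
    """Sum stems sample by sample, each scaled by its own gain."""
    return [sum(g * s[i] for s, g in zip(stems, gains))
            for i in range(len(stems[0]))]

def scale(signal, gain):
    """Apply one gain to an already rendered signal."""
    return [gain * x for x in signal]

grouped = render_group([guitar_a, guitar_b], [1.0, 1.0])

# Turning the grouped render down scales both guitars together...
quieter_group = scale(grouped, 0.5)

# ...which is not the same as lowering only guitar_b before rendering.
rebalanced = render_group([guitar_a, guitar_b], [1.0, 0.5])

print(quieter_group == rebalanced)
```

Once the stems are summed into one file, only the overall level remains adjustable, so the internal balance has to be right at render time or the stem has to be rendered on its own.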

The bass source was positioned centrally, in line with the receiver on both horizontal axes but lower on the vertical axis, placing it directly beneath the receiver. This was done both to position the signal at the base of the HRTF soundfield and to encourage a stronger bass response from reflection cues off the floor surface.

Render recording - VA Rec Bass Stem.wav https://drive.google.com/file/d/1bLiYFC598hgKpbaksZ2hBCdIsEgKSUvr/view?usp=share_link

Moving to the drum kit render, the stems were individually exported so that the kick, snare and cymbals could each be assigned to their own source. Four sources were used in total, with the cymbals spread across two left/right-positioned sources and the snare and kick positioned centrally. Adjustments were made from the initial positioning owing to a lack of cohesion between high and low frequencies, with the cymbals having lower than desired amplitude from their far-field positioning.

The snare was also adjusted: placements in the rear and vertical perspectives were trialled, but these led to a harsh and hollow timbre. The kick drum, like the snare, was trialled in the rear perspective, where it stayed, both to balance with the rendered bass signal and because the slight difference in positioning allowed better distinction between the two. With the cymbals brought in tighter to the listener perspective and the snare adjusted to a central position, the overall timbre of the kit was closer to what was being sought, and once set the render was recorded.

Render recording - VA Rec Kit Stem.wav https://drive.google.com/file/d/1NC6I3yZHq1-nJ3MrCJRsGVTagQiwZOcb/view?usp=share_link

Moving into the padding layers, the first synth bells were also arranged using four sources, despite being only one stereo multitrack. These used the ambisonic component, as the stereo image of the signal is then maintained, with additional sources being additive. Symmetry was considered for these placements to avoid phasing issues; however, with the sources placed above the listener perspective in a skewed pattern, phasing did not occur. The sources were also noclipped through the ceiling surface, as this had a noticeable impact on the timbre, making it brighter. The HRTF listener was then raised on the vertical axis until a fuller midrange accompanying the bright highs was heard. Once placement was satisfactory, these sources were rendered.

Render recording - VA Rec Synth 1 Stem.wav https://drive.google.com/file/d/1IQX0l1GPp3rXKcmA6DhUM6kB60o7YDFE/view?usp=share_link

The final layered VA stems result in a mix with a distinct spatial balance: the overall room timbre is consistent, and the stems are perceived as belonging together, as if recorded within the same acoustic space. There is further work to be done with regard to mixing, however, as this process is not an alternative to the whole mixing process but rather an additional processing method that can be used in tandem for distinct psychoacoustic results.

It is worth noting that there is an initial period of auditory accommodation when listening to the audio clip of layered VA stems straight after the original non-processed multitracks. The perceptual effect could be described as similar to the accommodation of eyesight when shifting focus from near to far, observed here as an audible phenomenon. Speculating on this peculiarity, it is perhaps a receptive adjustment to the audible depth perception created by the many incoming synthesised ITD / ILD cues.

The following is a collection of bounces from the session:

Track15 Unprocessed.wav https://drive.google.com/file/d/1s_QG6S4_FbGelIZlyY52lDmm9TLkUkk9/view?usp=share_link
Unprocessed original stems, in the balance that was then exported to the engine; these contain no VA-encoded renders.

Track15 VA.wav https://drive.google.com/file/d/1BdJR30mV6-q8Y4GfUDoZ3sk1MHU16lY1/view?usp=share_link
All stems processed with VA according to the aforementioned arrangements and rendering techniques.

Track15 VA:Unprocessed.wav https://drive.google.com/file/d/107CMS4prbkydxVFqSyW2bAzcieXg51i_/view?usp=share_link
Both of the above layered together with no alterations such as re-balancing.

Assessing the above, a combination of VA and unprocessed audio works cohesively and adds texture when applied creatively. However, using unprocessed audio alongside encoded audio does lead to a breakdown of the ITD / ILD cues and of the overall illusion of depth and directionality. Therefore, going into the mix of these VA stems, I will use only VA-encoded stems, but process them as if they were untreated audio tracks, to understand how they behave and sound through a usual mixing process involving EQ / compression and creative effects. As these effects can alter phase relationships, determining their possible impact on the HRTF soundfield and ITD / ILD cues will give an understanding of any requirements or limitations they may present.

On mixing the VA audio, it is apparent that these issues do not occur, with the binaural cue soundfield remaining intact and positioning unaltered. Both EQ and compression can be used conventionally, with no phasing or cancellation interference. In use, compression emphasises the panoramic property of the stereo field, which perceptually envelops the listener in a surrounding audio scene, while EQ remains usable for boosting and cutting to mould tone and timbre further. Additionally, creative effects such as reverb and delay maintain the original stereo image of the VA-encoded audio, so the additional signal they generate from the input also maintains the encoded VA properties. There are even instances where, through the use of modulation, the audio image can be pushed into wider, unconventional placements. This has been applied subtly to the synth here, but when tested on the kit it produces a unique doubled image, of the kick especially, on either side of the listener. Creative use of this could be taken much further, with the abstract placement of sounds around the listener being adaptable to many different applications of taste and genre.
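
One possible reason level-based processing leaves the binaural cues intact is that a stereo-linked processor applies the same gain to both channels, so the level ratio between the ears is unchanged. The sketch below illustrates this with a static gain reduction. It is an assumption for illustration: the session notes do not state whether the compressor was stereo-linked, and an unlinked (dual mono) compressor could alter the ILD. The sample values are hypothetical.

```python
import math

def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

def ild_db(left, right):
    """Interaural level difference in dB (positive means left is louder)."""
    return 20.0 * math.log10(rms(left) / rms(right))

def linked_gain(left, right, threshold=0.5, ratio=4.0):
    """Static, stereo-linked gain reduction.

    One gain is derived from the louder channel's peak and applied to
    both channels, so the level ratio between them is unchanged.
    """
    peak = max(max(abs(x) for x in left), max(abs(x) for x in right))
    if peak <= threshold:
        gain = 1.0
    else:
        target = threshold + (peak - threshold) / ratio
        gain = target / peak
    return [gain * x for x in left], [gain * x for x in right]

# Hypothetical binaural snippet: source on the left, so left is louder.
left = [0.8, -0.6, 0.7, -0.5]
right = [0.4, -0.3, 0.35, -0.25]

comp_left, comp_right = linked_gain(left, right)
print(f"ILD before: {ild_db(left, right):.2f} dB")
print(f"ILD after:  {ild_db(comp_left, comp_right):.2f} dB")
```

Both channels are turned down by the same factor, so the printed ILD is the same before and after, while the overall peak level drops.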

As an additional intro / outro textural layer for the piece, a sample of ocean noise has been used, starting from two waves of unprocessed stereo crossfaded into a four-source rotating render from within the VAS model used. This has partly been done to illustrate the shift in depth between the unprocessed and encoded stereo fields, as the final mix contains none of the original unprocessed stems.
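
A crossfade of this kind can be sketched as follows. The equal-power (sine / cosine) curve is an assumption, as the fade shape used in the session is not stated, and the signals are hypothetical placeholders standing in for one channel of each take.

```python
import math

def equal_power_crossfade(a, b):
    """Fade from signal a to signal b over their shared length."""
    n = min(len(a), len(b))
    if n < 2:
        raise ValueError("need at least two samples to crossfade")
    out = []
    for i in range(n):
        t = i / (n - 1)                      # 0.0 at start, 1.0 at end
        gain_a = math.cos(t * math.pi / 2)   # fades out
        gain_b = math.sin(t * math.pi / 2)   # fades in
        out.append(gain_a * a[i] + gain_b * b[i])
    return out

# Hypothetical placeholder signals, not audio from the session.
unprocessed = [1.0] * 5
va_render = [0.5] * 5

mixed = equal_power_crossfade(unprocessed, va_render)
print([round(x, 3) for x in mixed])
```

The output starts as the unprocessed signal alone and ends as the VA render alone; for a stereo file the same gains would be applied to the left and right channels alike.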

Mixdown completed using only VA encoded stems - Track 15 - Ingleton Road - VA Mix.wav https://drive.google.com/file/d/1LsyfNY95CDp8RSJJFTwjNWLFZ_9kB9wg/view?usp=share_link
