Steam Audio integration into Unity Engine for recording and mixing in Logic Pro X
As the aim of this project is to encode virtual acoustic properties into PCM stereo, the encoding processes need to be brought together within a virtual rendering environment in a way that allows virtual acoustic space to be both generated and recorded. This is also summarised in the virtual rendering inquiry as the component functions concept, involving the following encoding / rendering functions:
- Sound source placement in a virtual environment
- HRTF receiver placement
- Virtual acoustic space (VAS) to be present in the environment
- Simulated acoustic modelling through raytracing / binaural cue generation / interacting virtual geometry
- Simulated sound propagation / transmission
- Ambisonics compatibility
- Internal audio routing for recording
The integration of Steam Audio within Unity Engine covers the encoding and functionality required of a virtual environment for generating VAS, as provided by Lakulish et al. (2023a) (a configuration sketch follows the list):
- Render 3D positional audio. Steam Audio binaurally renders direct sound using HRTFs to accurately model the direction of a sound source relative to the listener.
- Render 3D audio for Ambisonic content. Steam Audio can also spatialize Ambisonics audio clips, rotating them based on the listener’s orientation in the virtual world.
- Use custom HRTFs using SOFA files. In addition to its built-in HRTF, Steam Audio can spatialize point sources and Ambisonics sources using any HRTF specified by the user.
- Model occlusion of sound sources. Steam Audio can quickly model raycast occlusion of direct sound by solid objects.
- Model reflections and related environmental audio effects. Steam Audio can model how sound is reflected by solid objects.
- Model propagation of sound along multiple paths. Steam Audio can model how sound propagates from the source to the listener along multiple paths.
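These features map onto per-source settings in the Steam Audio Unity plugin. The following is a minimal sketch rather than the project's actual scripts, assuming the plugin is installed and Steam Audio is selected as the project's spatializer plugin; the SteamAudioSource field names used here are assumptions taken from the plugin's inspector labels and should be verified against the installed plugin version.

```csharp
using UnityEngine;
using SteamAudio;

// Minimal sketch: enabling HRTF-based direct rendering, raycast occlusion and
// reflections on a sound source. Field names (directBinaural, occlusion,
// occlusionType, reflections) are assumptions based on the Steam Audio Unity
// plugin inspector and may differ between plugin versions.
public class SpatialSourceSetup : MonoBehaviour
{
    void Start()
    {
        // SteamAudioSource sits alongside a Unity AudioSource whose spatializer
        // plugin is set to Steam Audio in the project's audio settings.
        var source = gameObject.AddComponent<SteamAudioSource>();

        source.directBinaural = true;                 // HRTF rendering of the direct path
        source.occlusion = true;                      // occlusion by tagged scene geometry
        source.occlusionType = OcclusionType.Raycast; // fast single-ray occlusion test
        source.reflections = true;                    // simulated reflections from the VAS
    }
}
```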
The following pilot implementation of Steam Audio in Unity is routed via the Blackhole audio management software to enable recording within Logic Pro X. The result is a rendering process that works in parallel to a standard mixing session, with the capability to export and re-record audio encoded with spatial definition arising both from the HRTF positioning of sound sources in the stereo field and from the encoded acoustic properties of the VAS.
To render, exported stems from the DAW session are imported as assets into the engine and assigned to an audio source sphere. An HRTF receiver is then positioned within the geometry-tagged structure that defines the present VAS, with audio routed in the engine mixer to an input channel and to a second channel containing a Steam Audio Reverb unit. This unit generates Ambisonic impulse responses to capture additional reflections in the HRTF soundfield, based on the incoming reflection cues from the surrounding geometry.
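A minimal sketch of this setup step is given below, assuming a stem has been imported as an AudioClip and that the engine mixer already contains an input group routed on to a channel hosting the Steam Audio Reverb effect; the inspector-assigned fields stem and inputGroup are hypothetical placeholders for this illustration. The HRTF receiver itself corresponds to the scene's AudioListener (with the plugin's listener component attached), positioned within the VAS geometry in the editor.

```csharp
using UnityEngine;
using UnityEngine.Audio;

// Sketch (assumed names): assigning an imported DAW stem to a source sphere
// placed in the VAS and routing it into the engine mixer, whose second channel
// is assumed to host the Steam Audio Reverb unit.
public class StemPlayback : MonoBehaviour
{
    public AudioClip stem;              // exported DAW stem, imported as an engine asset
    public AudioMixerGroup inputGroup;  // mixer input channel feeding the reverb channel

    void Start()
    {
        // Audio source sphere positioned within the geometry-tagged VAS structure.
        var sphere = GameObject.CreatePrimitive(PrimitiveType.Sphere);
        sphere.transform.position = new Vector3(2.0f, 1.5f, 3.0f); // example placement

        var audioSource = sphere.AddComponent<AudioSource>();
        audioSource.clip = stem;
        audioSource.spatialize = true;                  // hand the source to the Steam Audio spatializer
        audioSource.spatialBlend = 1.0f;                // fully 3D positional
        audioSource.outputAudioMixerGroup = inputGroup; // routed on to the Steam Audio Reverb channel
        audioSource.Play();
    }
}
```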
The direct signal and reflections can be balanced to taste, both from the source GameObject's component parameters and with the reverb unit, allowing creative flexibility over the final sound's spectrum whilst maintaining the localisation and acoustic properties of the sound source in the VAS. Record arming is set on the DAW VA recording channel, which is set to no output to prevent feedback, since CH 19-20 carries the Blackhole output of internal audio. The engine session is then initialised and recording is started to capture the generated audio.
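As a hedged illustration of the balancing step described above, the per-source mix levels can also be set from script rather than only in the inspector; the field names directMixLevel and reflectionsMixLevel are assumptions based on the Steam Audio Unity plugin's "Direct Mix Level" / "Reflections Mix Level" inspector labels and should be checked against the installed plugin version.

```csharp
using UnityEngine;
using SteamAudio;

// Sketch (assumed field names): balancing the HRTF-rendered direct path against
// the simulated VAS reflections on a per-source basis.
public class SourceMixBalance : MonoBehaviour
{
    [Range(0f, 1f)] public float directLevel = 0.8f;      // level of the direct signal
    [Range(0f, 1f)] public float reflectionsLevel = 0.5f; // level of the reflected field

    void Start()
    {
        var source = GetComponent<SteamAudioSource>();
        source.directMixLevel = directLevel;
        source.reflectionsMixLevel = reflectionsLevel;
    }
}
```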
The software workstation in use for VA rendering with Steam Audio / Unity Engine and recording in Logic Pro X.