Work In progress
NOTICE! This project is still a work in progress. Many of the features will change or be improved, and some components are still missing and under development. The aim of this project is to create a functional system. All sounds heard in the following videos serve only as informative feedback for testing and should therefore only be heard that way.
The purpose of this project is to experiment with the way we work with reflections and implement acoustics in video games, more precisely in first-person shooters. This might change later in the process, but for now this is about guns and how we can make a dynamic acoustic reflection system for guns.
The project is made in Unity using Wwise as audio middleware.
First a little background on what led me to spin this project out of control!
The discussion about realism in first-person shooters is pretty old. If the design gets too realistic, it doesn’t sound good. If the design gets too hyper-realistic, it doesn’t sound realistic enough. But what about acoustic realism?
Normally, when designing audio assets for a video game, you want to make the asset as dry as possible and then add reverb dynamically while the game is running. But this isn’t the case when designing guns. The general practice when designing a gun is to separate the body and the tail into different assets: the dry signal and the wet reverb signal. This is a fairly unique approach compared to the majority of game audio assets.
So why not just use the in-game reverb, that is already being used in the target area of the game? This is an option, but your gun is not going to sound cool!
Physically modelling a gun tail in real time is very hard, and the result is often mediocre. The reason is that a gun tail is extremely characteristic and holds a lot of environmental information, which is often hard to replicate.
The details of the gun tail are partly a result of the extremely high dB levels of guns and the way the environment responds to them. To physically model this in a video game, you would need a separate reverb, preferably convolution, using impulse responses (IRs) from real gunshots in similar locations, and you would probably have to build additional delay effects on top. I’m not sure the result would be good, but it would probably be rather expensive, depending on the number of guns in the game.
Instead we design a pre-rendered gun tail with the environmental characteristics we want. We can then change this tail depending on the environment the player is in, and additionally add the reverb already being used in the environment on top. This blend gives the perception of the gun being loud and naturalistic, while it still stays in the same environment as the rest of the soundscape.
The way most games handle reverberation is to divide all areas and rooms in the game into reverb zones. Each reverb zone has a switch or other type of value attached, which is used to communicate which reverb and gun tail should be used in that area or room.
Depending on the size and type of the game, we might be talking about hundreds or thousands of zones. Each and every one needs to be set up manually.
But what if we had a system that could do all of that at runtime, saving the company tons of work? That sounds like a really good idea, Mads!
Okay! In order for us to create a dynamic reverb calculator, the first thing we need to know is the size of the room. So I created a raycast system. This system shoots out rays at a fixed interval (actually spherecasts: they are like rays, just spheres, bigger ball, harder to miss). After every cast it turns (x) degrees, depending on the requested total number of hit points in a rotation. We offset the angle by (x) degrees after each full rotation, to make sure we are precise and cover as much as possible. See figure 1 for an overview of the component settings.
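The sweep pattern described above can be sketched roughly like this. It is plain Python rather than the Unity C# the project actually uses, and the names `ray_angles`, `direction`, `hits_per_rotation` and `offset_deg` are made up for illustration. It only produces the directions; the actual spherecast would be done by the engine.

```python
import math

def ray_angles(hits_per_rotation, offset_deg, rotations):
    """Return the yaw angle (degrees) of every cast, rotation after rotation.

    Assumption: the turn per cast is 360 / hits_per_rotation, and each new
    rotation starts offset_deg further along than the previous one.
    """
    step = 360.0 / hits_per_rotation
    angles = []
    for rotation in range(rotations):
        start = rotation * offset_deg  # shift the whole rotation
        for i in range(hits_per_rotation):
            angles.append((start + i * step) % 360.0)
    return angles

def direction(angle_deg):
    """Unit direction on the horizontal plane for a given yaw angle."""
    a = math.radians(angle_deg)
    return (math.cos(a), math.sin(a))
```

With 4 hit points per rotation and a 10 degree offset, the second rotation lands between the rays of the first one, which is the point of the offset.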
We then calculate the approximate size of the area. We do this by calculating the area of the triangle between each pair of neighbouring raycasts, using the angle between them and the length of each ray. We then add this to a queue. When the queue is full, we remove the oldest entry every time we add a new one. Summing all elements in the queue gives us the approximate size of the room. To make it more stable, we save the last four recordings and take the average.
In action this looks something like the gif in figure 2. In the console you can see the currentArea, which is the size of the last rotation, and the roomSize, which is the average. We only recognize roomSize changes that exceed a margin of (roomSizeMargin)% of the current roomSize, to make sure the value doesn’t change too often.
This gives us a rough but reliable estimate of the room size. We then feed this to the WwiseRoomSize RTPC, which drives the decay time of our reverb.
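The margin check can be sketched as below. This is an illustrative Python version under one assumption about how the margin works: a new average only replaces roomSize when it differs from the current value by more than roomSizeMargin percent. The function name `update_room_size` is hypothetical.

```python
def update_room_size(current, candidate, margin_percent):
    """Return the room size to use after seeing a new averaged area.

    The candidate is ignored unless it differs from the current value by
    more than margin_percent of the current value (assumed behaviour).
    """
    if current <= 0.0:
        return candidate  # nothing meaningful stored yet
    threshold = current * margin_percent / 100.0
    if abs(candidate - current) > threshold:
        return candidate
    return current
```

With a 5% margin, a jump from 100 to 104 is ignored, while a jump from 100 to 110 is accepted. This keeps the value sent to the reverb from jittering on every rotation.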
An example of how this works can be heard in figure 3.
Currently the only parameter this works on is area size. We are not taking height, surface materials, obstacles or other factors into consideration. But this is in development.
Reverb is good at giving the player an understanding and feeling of the space and environment. But it doesn’t really contribute to the player’s perception of their placement in the room. Localized early reflections, on the other hand, do. They work in real life, so they probably also work in video games. But first of all, let’s figure out what defines early reflections.
Early reflections are the echoes of a signal that arrive at the microphone within a stretch of about 30ms after the direct sound. Early reflections are direct copies of the direct sound source, rather than diffuse mixtures as are present in the late reflections, or reverberation, of a sound source.
From this we know that reflections are echoes and direct copies of the sound source, which is basically the same thing as the output of a regular delay effect. We also know that they need to reach the listener within a stretch of about 30ms. I read somewhere else that the window is between 10ms and 80ms, which is what we will be using for now. Sound moves at a speed of ~343 m/s, which gives us a travelling distance of up to ~27.5 meters (343 m/s × 0.080 s ≈ 27.4 m). I’m no expert on this kind of reflection calculation, so we are going to cheat physics a little bit and only take the direct path into consideration. The gun is our emitter and our limit of travel is 27.5m, so we only want to count the objects that are within 13.75m (27.5/2) of the player, because the sound has to travel to the surface and back. Those are the reflections that will fall under the category of early reflections.
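The arithmetic can be checked with a few lines of Python. The constants come from the text (343 m/s and an 80ms window); the function names are only for illustration. Note that the exact product is 27.44 m, which the text rounds up to 27.5 m.

```python
SPEED_OF_SOUND = 343.0  # metres per second
MAX_DELAY_S = 0.080     # upper end of the early-reflection window

def max_path_length():
    """Longest total path (out and back) that still arrives in time."""
    return SPEED_OF_SOUND * MAX_DELAY_S

def max_surface_distance():
    """Furthest surface that counts, assuming a straight out-and-back path."""
    return max_path_length() / 2.0

def reflection_delay_ms(distance_m):
    """Round-trip delay in milliseconds for a surface distance_m away."""
    return 2.0 * distance_m / SPEED_OF_SOUND * 1000.0
```

For example, a wall 5 meters away gives a round trip of 10 meters, which is roughly 29ms of delay.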
We are going to find those with the help of the raycast system we created for the reverberation system. We already know the distance to all surfaces around the player, because we store every raycast from the last rotation in a list.
When we shoot our gun, we access this list and go through each entry. If it hit something and it is in range, we want to play an early reflection from the point of impact.
This is where the hard part begins!
First off, Wwise doesn’t allow us to control the delay time of the built-in delay effect in real time. This is a weird restriction, and in the future I will look into building my own delay effect. But for now we need to do something stupid, which is exemplified in figure 4.
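A minimal sketch of that filtering step, in Python for illustration. The `RayHit` record and the function name are hypothetical stand-ins for whatever the raycast list stores in the real project, and the 13.75m default is the range derived earlier in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float, float]

@dataclass
class RayHit:
    point: Optional[Point]  # None when the cast hit nothing
    distance: float         # distance from the player to the hit point

def early_reflection_points(hits: List[RayHit],
                            max_distance: float = 13.75) -> List[Point]:
    """Return the impact points that should get an early-reflection emitter."""
    return [h.point for h in hits
            if h.point is not None and h.distance <= max_distance]
```

Each returned point is where an emitter would be placed when the gun fires.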
We are going to have four fixed delay settings (because the effect limit is also four). They are going to be fixed at delay times of 15ms, 30ms, 50ms and 70ms. At runtime, we then locally bypass all of them except the one whose delay range fits our emitter distance. This is unfortunate, but it is just one of the limitations that middleware creates.
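Picking the slot could look something like this. It is an illustrative Python sketch, not the Wwise setup itself: the four delay times come from the text, while choosing the slot closest to the round-trip delay is my assumption about what "fits" means. The function names are hypothetical.

```python
FIXED_DELAYS_MS = (15.0, 30.0, 50.0, 70.0)
SPEED_OF_SOUND = 343.0  # metres per second

def pick_delay_slot(distance_m):
    """Index of the fixed delay closest to the round-trip delay (assumed rule)."""
    delay_ms = 2.0 * distance_m / SPEED_OF_SOUND * 1000.0
    return min(range(len(FIXED_DELAYS_MS)),
               key=lambda i: abs(FIXED_DELAYS_MS[i] - delay_ms))

def bypass_flags(slot):
    """True means 'bypass this effect'; only the chosen slot stays active."""
    return [i != slot for i in range(len(FIXED_DELAYS_MS))]
```

A wall 5 meters away has a round trip of about 29ms, so it would use the 30ms slot and bypass the other three.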
The next thing we need to do is make a callback system using the famous but never practically explained method “SetGameObjectAuxSendValues”, as well as the less famous but very smart method “AddListeners”. This took me a long time to get working properly and it might be hard to describe, so I may do that at a later stage. The most important thing to know is that we spawn emitters from an object pool at the hit point. Using the .addlistener method, we make sure that the default listener is listening to the spawned emitters. The emitters are then set as game-defined aux busses for a game-defined aux bus of the gun emitter. This way we can route the gun sound to four different game-defined aux busses, but only use one of the four direct game-defined aux busses available on the gun.
Confused? No worries, I made the system and I’m still confused. S#!t, even Xzibit would be. Instead, take a listen to the video in figure 5, which showcases how it sounds.
The early reflections give a nice spatial experience of the room and the player position, but they don’t contribute much to the characteristics of the room. Can we do something similar with the reflections, like we did with the pre-rendered gun-tail reverb? Like playing back material reflections from surfaces in the game, using pre-rendered assets of vibrating materials like metal, glass, wood, etc.
We already built the raycast system. We are already spawning emitters on the point of impact and we are also getting access to the gameobject we hit, which means we get access to other information like the material. If we take this information and then use it to drive a switch, then the only thing we need to do is to fire the material reflection event on the appropriate emitters, whenever the gun is being triggered.
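The material lookup could be sketched like this, in Python for illustration. The switch names, the fallback value and the `(emitter_id, material)` pairs are all hypothetical; in the real project the material would come from the GameObject that was hit and the switch would be set in Wwise.

```python
# Hypothetical mapping from a surface material name to a switch value.
MATERIAL_SWITCHES = {"metal": "Metal", "glass": "Glass", "wood": "Wood"}
DEFAULT_SWITCH = "Generic"  # assumed fallback for unknown materials

def switch_for(material_name):
    """Look up the switch for a material, ignoring case and stray spaces."""
    return MATERIAL_SWITCHES.get(material_name.strip().lower(), DEFAULT_SWITCH)

def plan_material_reflections(hits):
    """Turn (emitter_id, material_name) pairs into (emitter_id, switch) pairs.

    Each pair says which switch to set on which emitter before the
    material reflection event is posted on it.
    """
    return [(emitter_id, switch_for(material)) for emitter_id, material in hits]
```

When the gun fires, each pair in the returned list would correspond to one emitter that plays the material reflection with the given switch.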
See figure 6 for a video showcasing how this sounds.
The end result is a more spatial experience of shooting a gun. The reverberation tells us something about the size and type of location we are in. The early reflections tell us about the distance to the walls, and the material reflections tell us what the location is made of. When we stand in the corner of a metal shed, it should also sound like we are standing in the corner of a metal shed.
With regard to performance, I still have a lot to polish and optimize. I want this system to be functional in most conditions, and I’m therefore running at 30 FPS when testing it. This is the lowest possible framerate, but also the worst possible working condition, since the system is very dependent on the update speed.
At 30 FPS the system passively benchmarks at ~0 B of GC allocation and 0.06 ms, and shooting the gun adds 2.7 KB of GC allocation and 0.13 ms on top.
At 60 FPS the system passively benchmarks at ~0 B of GC allocation and 0.03 ms, and shooting the gun adds 2.7 KB of GC allocation and 0.13 ms on top.
I have no idea if this is good or bad. But it definitely tells us something.
The next features I’m working on include:
· Custom delay effect
· Include height and volume calculation
· Slapback/echo system
· Material/room type system for reverb
· Occlusion and obstruction
· Sound design
A big thanks goes to the guys from Massive Studios for releasing this video shortly after I started working on my project. I’m inspired by a lot of their approaches and the video helped me solve a lot of problems.