Name

    AL_SOFT_UHJ

Contributors

    Chris Robinson

Contact

    Chris Robinson (chris.kcat 'at' gmail.com)

Status

    Complete.

Dependencies

    This extension is for OpenAL 1.1.
    This extension requires AL_EXT_BFORMAT.

Overview

    This extension adds support for UHJ channel formats and a Super Stereo
    (a.k.a. Stereo Enhance) processor.

    UHJ is a method of encoding surround sound from a first-order B-Format
    signal into a stereo-compatible signal. Such signals can be played as
    normal stereo (with more stable and wider stereo imaging than pan-pot
    mixing) or decoded back to surround sound, which makes it a decent
    choice where 3+ channel surround sound isn't available or desirable.
    When decoded, a UHJ signal behaves like B-Format, which allows it to be
    rotated through AL_EXT_BFORMAT's source orientation property as with
    B-Format formats.

    The standard equation[1] for decoding UHJ to B-Format is:

        S = Left + Right
        D = Left - Right

        W = 0.981532*S + 0.197484*j(0.828331*D + 0.767820*T)
        X = 0.418496*S - j(0.828331*D + 0.767820*T)
        Y = 0.795968*D - 0.676392*T + j(0.186633*S)
        Z = 1.023332*Q

    where j is a wide-band +90 degree phase shift. 2-channel UHJ excludes
    the T and Q input channels, and 3-channel excludes the Q input channel.
    Be aware that the resulting W, X, Y, and Z signals are 3dB louder than
    their FuMa counterparts, and the implementation should account for that
    to properly balance it against other sounds.

    An alternative equation for decoding 2-channel-only UHJ is:

        S = Left + Right
        D = Left - Right

        W = 0.981532*S + j(0.163582*D)
        X = 0.418496*S - j(0.828331*D)
        Y = 0.762956*D + j(0.384230*S)

    Which equation to use depends on the implementation and user
    preferences. It's relevant to note that the standard decoding equation
    is reversible with the encoding equation, meaning decoding UHJ to
    B-Format with the standard equation and then encoding B-Format to UHJ
    results in the original UHJ signal, even for 2-channel.
    The alternative 2-channel decoding equation does not result in the
    original UHJ signal when re-encoded.

    One additional note for decoding 2-channel UHJ is the resulting B-Format
    signal should pass through alternate shelf filters for
    frequency-dependent processing. For the standard equation, suitable
    shelf filters are given as:

        W:   LF = 0.661, HF = 1.000
        X/Y: LF = 1.293, HF = 1.000

    And for the alternative equation, suitable shelf filters are given as:

        W:   LF = 0.646, HF = 1.000
        X/Y: LF = 1.263, HF = 1.000

    3- and 4-channel UHJ should use the normal shelf filters for B-Format.

    [1] This equation is derived from the encoding equations found in
        various online sources, such as Wikipedia and the Xiph.org wiki:
        *
        *
        and then reversed. This provides more precise decoder coefficients
        than what's typically found, including in those same pages.

    Super Stereo (occasionally called Stereo Enhance) is a technique for
    processing a plain (non-UHJ) stereo signal to derive a B-Format
    signal.[2] It's backed by the same functionality as UHJ decoding, making
    it an easy addition on top of UHJ support. Super Stereo has a variable
    width control, allowing the stereo soundfield to "wrap around" the
    listener while maintaining a stable center image (a more naive virtual
    speaker approach would cause the center image to collapse as the
    soundfield widens). Since this derives a B-Format signal like UHJ, it
    also allows such sources to be rotated through the source orientation
    property.

    There are various forms of Super Stereo, with varying equations, but a
    good suggested option is:

        S = Left + Right
        D = Left - Right

        W = 0.6098637*S - j(0.6896511*w*D)
        X = 0.8624776*S + j(0.7626955*w*D)
        Y = 1.6822415*w*D - j(0.2156194*S)

    where w is a variable width factor, in the range [0...0.7]. As with UHJ,
    the resulting W, X, Y, and Z signals are 3dB louder than their FuMa
    counterparts. The normal shelf filters for playing B-Format should
    apply.
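As an illustration only, the passive-matrix part of the suggested Super Stereo equation might be sketched as below. The names BFormatSample and super_stereo_matrix are hypothetical and not part of this extension. The sketch assumes the caller has already produced jS and jD, the sum and difference signals passed through a wide-band +90 degree phase shifter (the filter itself is not shown), and it relies on j being linear so the constant factors can be applied after the shift. Clamping the width to 0.7 follows the range given for this particular equation.

```c
/* Hypothetical output type: one horizontal first-order B-Format sample. */
typedef struct {
    float w, x, y;
} BFormatSample;

/* Sketch of the suggested Super Stereo matrix for a single sample.
 *   s, d   : Left+Right and Left-Right
 *   js, jd : s and d after a +90 degree wide-band phase shift, assumed to
 *            be supplied by the caller (the phase shifter is not shown)
 *   width  : requested width factor, clamped here to [0, 0.7]
 */
static BFormatSample super_stereo_matrix(float s, float d, float js, float jd,
                                         float width)
{
    BFormatSample out;

    /* Keep the width inside the range stated for this equation. */
    if(width < 0.0f) width = 0.0f;
    if(width > 0.7f) width = 0.7f;

    out.w = 0.6098637f*s - 0.6896511f*width*jd;
    out.x = 0.8624776f*s + 0.7626955f*width*jd;
    out.y = 1.6822415f*width*d - 0.2156194f*js;
    return out;
}
```

With a width of 0, only the S terms remain, which matches the description of the signal collapsing to a focused point at the front.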

    [2] There is unfortunately limited information publicly available about
        Super Stereo. Much of it seems to be in Gerzon's papers published
        through the late 70s and 80s. However, some useful information can
        still be found from the Sursound mailing list:
        * (third-party archive)
        A couple references to older material from that thread have also
        been archived at archive.org:
        *
        *

Issues

    Q: 3- and 4-channel UHJ weren't widely, if ever, used, in part due to
       the extra channels not being stereo-compatible (players need to be
       aware to drop them if not decoding them) and making it more practical
       and efficient to use B-Format directly. Why include them here?
    A: UHJ is a hierarchical system, where 3-channel is a subset of
       4-channel, and 2-channel is a subset of 3-channel. There's little
       extra work necessary to support them, and there are techniques for
       getting 3- and 4-channel UHJ into a stereo-compatible stream, so
       having the option is not a bad idea.

    Q: Why include Super Stereo here as it's not strictly UHJ?
    A: Super Stereo is built on the same structure as UHJ, utilizing phase
       shift filters with a passive matrix to generate a B-Format signal
       from pre-existing stereo content. Given the similarity in
       functionality, it provides a good option for handling stereo content.
       Additionally, even in the hardware space it's not uncommon for UHJ
       decoders to have Super Stereo capabilities, so it makes sense to have
       it here too.

    Q: Super Stereo seems to have a width factor limit of 0.7, but the
       AL_SUPER_STEREO_WIDTH_SOFT attribute goes up to 1.0. Why?
    A: For flexibility of implementation. If a method is developed that
       allows using wider factors, an arbitrary 0.7 limit would be
       unnecessary. There is some precedent for this with the source's
       AL_PITCH property being any finite non-negative value, but an
       implementation internally clamps to its own limits.

New Procedures and Functions

    None.

New Tokens

    Accepted by the format parameter of alBufferData:

        AL_FORMAT_UHJ2CHN8_SOFT                  0x19A2
        AL_FORMAT_UHJ2CHN16_SOFT                 0x19A3
        AL_FORMAT_UHJ2CHN_FLOAT32_SOFT           0x19A4
        AL_FORMAT_UHJ3CHN8_SOFT                  0x19A5
        AL_FORMAT_UHJ3CHN16_SOFT                 0x19A6
        AL_FORMAT_UHJ3CHN_FLOAT32_SOFT           0x19A7
        AL_FORMAT_UHJ4CHN8_SOFT                  0x19A8
        AL_FORMAT_UHJ4CHN16_SOFT                 0x19A9
        AL_FORMAT_UHJ4CHN_FLOAT32_SOFT           0x19AA

    Accepted by the param parameter of alSourcei, alSourceiv, alGetSourcei,
    and alGetSourceiv:

        AL_STEREO_MODE_SOFT                      0x19B0

    Accepted by the param parameter of alSourcef, alSourcefv, alGetSourcef,
    and alGetSourcefv:

        AL_SUPER_STEREO_WIDTH_SOFT               0x19B1

    Accepted by the value parameter of alSourcei and alSourceiv for
    AL_STEREO_MODE_SOFT:

        AL_NORMAL_SOFT                           0x0000
        AL_SUPER_STEREO_SOFT                     0x0001

Additions to Specification

    UHJ Buffer Formats

    The formats AL_FORMAT_UHJ2CHN8_SOFT, AL_FORMAT_UHJ2CHN16_SOFT,
    AL_FORMAT_UHJ2CHN_FLOAT32_SOFT, AL_FORMAT_UHJ3CHN8_SOFT,
    AL_FORMAT_UHJ3CHN16_SOFT, AL_FORMAT_UHJ3CHN_FLOAT32_SOFT,
    AL_FORMAT_UHJ4CHN8_SOFT, AL_FORMAT_UHJ4CHN16_SOFT, and
    AL_FORMAT_UHJ4CHN_FLOAT32_SOFT may be used for buffering data to
    functions like alBufferData.

    8-bit data is expressed as an unsigned value over the range 0 to 255,
    128 being an audio output level of zero.

    16-bit data is expressed as a signed value over the range -32768 to
    32767, 0 being an audio output level of zero. Byte order for 16-bit
    values is determined by the native format of the CPU.

    32-bit float data is expressed as a signed value with the normalized
    range -1.0 to +1.0, 0.0 being an audio output level of zero. Byte order
    for 32-bit values is determined by the native format of the CPU.

    These formats are interleaved, with UHJ2 having the left and right
    samples in order, UHJ3 having the left, right, and T samples in order,
    and UHJ4 having left, right, T, and Q samples in order.

    UHJ formats are decoded and played according to the rules of BFORMAT
    buffer formats. When played, such formats may be oriented according to
    the source's AL_ORIENTATION and AL_SOURCE_RELATIVE properties.
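The sample sizes and interleaving described above determine how many bytes one frame of each UHJ format occupies, which an application needs when computing the size passed to alBufferData. A minimal sketch follows; the helper names uhj_channel_count, uhj_bytes_per_sample, and uhj_frame_size are hypothetical and not part of this extension, and the token values are repeated locally from the New Tokens list so the snippet stands alone without OpenAL headers.

```c
#include <stddef.h>

/* Token values as listed in this extension's New Tokens section. */
#define AL_FORMAT_UHJ2CHN8_SOFT        0x19A2
#define AL_FORMAT_UHJ2CHN16_SOFT       0x19A3
#define AL_FORMAT_UHJ2CHN_FLOAT32_SOFT 0x19A4
#define AL_FORMAT_UHJ3CHN8_SOFT        0x19A5
#define AL_FORMAT_UHJ3CHN16_SOFT       0x19A6
#define AL_FORMAT_UHJ3CHN_FLOAT32_SOFT 0x19A7
#define AL_FORMAT_UHJ4CHN8_SOFT        0x19A8
#define AL_FORMAT_UHJ4CHN16_SOFT       0x19A9
#define AL_FORMAT_UHJ4CHN_FLOAT32_SOFT 0x19AA

/* Hypothetical helper: interleaved channels per frame, 0 if not UHJ. */
static int uhj_channel_count(int format)
{
    switch(format)
    {
    case AL_FORMAT_UHJ2CHN8_SOFT:
    case AL_FORMAT_UHJ2CHN16_SOFT:
    case AL_FORMAT_UHJ2CHN_FLOAT32_SOFT:
        return 2; /* left, right */
    case AL_FORMAT_UHJ3CHN8_SOFT:
    case AL_FORMAT_UHJ3CHN16_SOFT:
    case AL_FORMAT_UHJ3CHN_FLOAT32_SOFT:
        return 3; /* left, right, T */
    case AL_FORMAT_UHJ4CHN8_SOFT:
    case AL_FORMAT_UHJ4CHN16_SOFT:
    case AL_FORMAT_UHJ4CHN_FLOAT32_SOFT:
        return 4; /* left, right, T, Q */
    }
    return 0;
}

/* Hypothetical helper: bytes per individual sample, 0 if not UHJ. */
static int uhj_bytes_per_sample(int format)
{
    switch(format)
    {
    case AL_FORMAT_UHJ2CHN8_SOFT:
    case AL_FORMAT_UHJ3CHN8_SOFT:
    case AL_FORMAT_UHJ4CHN8_SOFT:
        return 1; /* unsigned 8-bit, 128 is silence */
    case AL_FORMAT_UHJ2CHN16_SOFT:
    case AL_FORMAT_UHJ3CHN16_SOFT:
    case AL_FORMAT_UHJ4CHN16_SOFT:
        return 2; /* signed 16-bit, native byte order */
    case AL_FORMAT_UHJ2CHN_FLOAT32_SOFT:
    case AL_FORMAT_UHJ3CHN_FLOAT32_SOFT:
    case AL_FORMAT_UHJ4CHN_FLOAT32_SOFT:
        return 4; /* 32-bit float, native byte order */
    }
    return 0;
}

/* Bytes in one interleaved frame; 0 for anything that isn't a UHJ format. */
static size_t uhj_frame_size(int format)
{
    return (size_t)uhj_channel_count(format) *
           (size_t)uhj_bytes_per_sample(format);
}
```

An application could then multiply uhj_frame_size(format) by the number of sample frames to get the byte count for a buffer upload.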

    Super Stereo Processing

    When playing Stereo formats, a source may opt to enable Super Stereo
    processing with the AL_STEREO_MODE_SOFT attribute.

    Name                 Signature  Values                Default
    -------------------  ---------  --------------------  --------------
    AL_STEREO_MODE_SOFT  i,iv       AL_NORMAL_SOFT,       AL_NORMAL_SOFT
                                    AL_SUPER_STEREO_SOFT

    Description: When AL_STEREO_MODE_SOFT is set to AL_NORMAL_SOFT, Stereo
    formats are processed and mixed as normal for multi-channel formats.
    When set to AL_SUPER_STEREO_SOFT, Stereo formats are processed with a
    Super Stereo (sometimes called Stereo Enhance) algorithm. In this mode,
    the stereo sound is converted to B-Format using the width factor
    specified by AL_SUPER_STEREO_WIDTH_SOFT, and is treated as a B-Format
    source which may be oriented with this source's AL_ORIENTATION and
    AL_SOURCE_RELATIVE properties. This attribute cannot be changed while
    the source is in an AL_PLAYING or AL_PAUSED state, and it has no effect
    when the source is not playing a STEREO format.

    Name                        Signature  Values        Default
    --------------------------  ---------  ------------  -------
    AL_SUPER_STEREO_WIDTH_SOFT  f,fv       [0.0f, 1.0f]  I/D

    Description: The width factor for the resulting soundfield using Super
    Stereo processing. The default value is implementation-defined, with a
    suggested value that provides good quality for a wide range of stereo
    content. A value of 0 results in the signal being a focused point at the
    front of the soundfield, a value of 0.5 results in the signal covering
    the front half of the soundfield, and a value of 1.0 results in the
    signal covering all around. An implementation may internally clamp the
    maximum value depending on the limits imposed by the selected algorithm.
    Has no effect when AL_STEREO_MODE_SOFT is not AL_SUPER_STEREO_SOFT or
    the source is not playing a STEREO format.

Errors

    An AL_INVALID_OPERATION error is generated if an attempt is made to set
    AL_STEREO_MODE_SOFT on a source while it's in an AL_PLAYING or AL_PAUSED
    state.
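The state restriction and the optional internal clamp described above could be modelled roughly as follows. This is a sketch against a mock source, not how any particular implementation works: MockSource, mock_set_stereo_mode, and mock_effective_width are hypothetical names, the 0.7 limit is taken from the suggested equation in the Overview, and returning AL_INVALID_VALUE for an unrecognized mode is an assumption since this extension only specifies the AL_INVALID_OPERATION case. The error and state values are the core OpenAL 1.1 ones, repeated locally so the snippet needs no OpenAL headers.

```c
/* Core OpenAL 1.1 values, repeated so the sketch is self-contained. */
#define AL_NO_ERROR          0
#define AL_INVALID_VALUE     0xA003
#define AL_INVALID_OPERATION 0xA004
#define AL_INITIAL           0x1011
#define AL_PLAYING           0x1012
#define AL_PAUSED            0x1013
#define AL_STOPPED           0x1014

/* Values from this extension. */
#define AL_NORMAL_SOFT       0x0000
#define AL_SUPER_STEREO_SOFT 0x0001

/* Hypothetical stand-in for an implementation's source object. */
typedef struct {
    int state;       /* AL_INITIAL, AL_PLAYING, AL_PAUSED or AL_STOPPED */
    int stereo_mode; /* AL_NORMAL_SOFT or AL_SUPER_STEREO_SOFT */
} MockSource;

/* Returns the error that setting AL_STEREO_MODE_SOFT would generate. */
static int mock_set_stereo_mode(MockSource *src, int mode)
{
    /* Specified by the extension: no changes while playing or paused. */
    if(src->state == AL_PLAYING || src->state == AL_PAUSED)
        return AL_INVALID_OPERATION;

    /* Assumption: the extension does not say which error applies here. */
    if(mode != AL_NORMAL_SOFT && mode != AL_SUPER_STEREO_SOFT)
        return AL_INVALID_VALUE;

    src->stereo_mode = mode;
    return AL_NO_ERROR;
}

/* Maps a stored width in [0, 1] to what the algorithm actually uses,
 * assuming an algorithm whose limit is 0.7. */
static float mock_effective_width(float width)
{
    const float limit = 0.7f; /* hypothetical per-algorithm maximum */
    if(width < 0.0f) return 0.0f;
    return (width > limit) ? limit : width;
}
```

Keeping the stored property value separate from the clamped value mirrors the AL_PITCH precedent mentioned in the Issues section, where the application-visible range is wider than what the implementation uses internally.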