crossover phase shift audibility

Serious · May 16, 2020

During simulations for a new speaker project I once again came upon the old problem of optimizing for low group delay/phase shift or optimizing for uniform directivity. I wanted to see for myself how audible the phase shift from typical speaker crossovers is (if at all), so I decided to create some impulse files in RePhase, which can easily be used with JRiver MC. Simply import the files into the JRiver convolver in its DSP studio and activate the convolution. For those of you who do not have JRMC, there's also a convolver plugin for foobar: https://www.foobar2000.org/components/view/foo_convolve

I chose 2048 taps for most files, which seems to be plenty in this case. They all have a flat FR magnitude. I wish RePhase had the option to simulate (and correct) filter types other than Linkwitz-Riley, but for correcting a speaker's excess phase I'd recommend measuring the speakers anyway. You can then use REW to export an excess phase version of the measurement, import it into RePhase and use the 'Filters Linearization' tab as well as the 'Paragraphic Phase EQ' filters to linearize the excess phase response.
I also went for a 44.1kHz sampling rate for the files since pretty much all of my music is 44.1kHz. I wanted to hear the effect with music, not specialized test signals.

The names on the files should be self-explanatory:

TEST_LP1_Q0707_1kHz: A file for testing if the convolution works. First order low pass at 1kHz with a Q of 0.707.

44_perfect_inv: This one simply inverts the impulse/swaps absolute phase.

8_inch_widebander_with_whizzer: This one has a phase shift like the one you'd see introduced by an 8" widebander with a whizzer cone. Based on a measurement of such a driver.

4_inch_widebander_with_whizzer: Same thing here, except a smaller 4" driver. The phase shift happens at a higher frequency here.

LR12_2500_tweeter_inv: A 12dB/octave Linkwitz-Riley filter at 2500Hz. Many bookshelf speakers have a crossover similar to this. In this case wired with the tweeter inverted. As far as 'normal' speaker design goes, this is about the least amount of phase shift you'd encounter.

LR12_2500_midrange_inv: Same thing, but with the midrange/midbass inverted instead.

LR12_250_LR12_2500_tweeter_inv: An added 250Hz LR 12dB/oct filter, similar to many 3-way speakers. Tweeter and bass inverted, midrange in positive polarity.

LR24_250_LR24_2500: Same crossover points, but using steeper 24dB/oct filters (to get less crossover lobing).

3_way_speaker: A real world 3-way speaker, based on a measurement. Similar to the LR12_250_LR24_2500 file. Tweeter and bass inverted, midrange in positive polarity. The tweeter may be slightly ahead of the midrange and the crossover may be slightly asymmetric to compensate. This is despite a small waveguide the speaker has.

LR_48_80_LR12_2500_tweeter_inv: Similar to the above bookshelf, but with a sub crossed at 80Hz at 48dB/octave.

It is quite easy to create them in RePhase, so feel free to experiment with other arrangements. I think it might be interesting to explore how different crossover frequencies affect the sound, e.g. an LR12 filter at 2000Hz vs 2500Hz vs 3000Hz.
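
For anyone without RePhase, the LR12 files can be approximated in a few lines of Python. This is only a minimal sketch under two assumptions: that the summed response of an ideal LR12 crossover with one driver inverted is a first-order all-pass centred on the crossover frequency (this follows from the textbook Linkwitz-Riley transfer functions), and that a bilinear-transform digital all-pass is a close enough stand-in for whatever RePhase does internally. The function name `lr12_allpass_fir` and its parameters are made up for this example.

```python
import numpy as np

def lr12_allpass_fir(fc, fs=44100, taps=2048, invert="tweeter"):
    """Impulse response of a summed LR12 crossover with one driver inverted.

    LP - HP of a 2nd-order Linkwitz-Riley pair reduces to the first-order
    all-pass (w0 - s) / (w0 + s). It is digitised with a pre-warped bilinear
    transform, so the -90 degree point lands exactly on fc.
    """
    t = np.tan(np.pi * fc / fs)      # pre-warped crossover frequency
    c = (t - 1.0) / (t + 1.0)        # all-pass coefficient
    # H(z) = (c + z^-1) / (1 + c z^-1), written out as its impulse response
    h = np.zeros(taps)
    h[0] = c
    h[1:] = (1.0 - c * c) * (-c) ** np.arange(taps - 1)
    # Inverting the midrange instead of the tweeter flips the overall polarity.
    return h if invert == "tweeter" else -h

# Hypothetical comparison set: LR12 at 2000Hz vs 2500Hz vs 3000Hz
variants = {fc: lr12_allpass_fir(fc) for fc in (2000, 2500, 3000)}
```

Each entry in `variants` has a flat magnitude and differs only in where the phase passes through -90°. Writing the arrays to WAV files for a convolver is left out here, since that depends on the tools you have installed.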

These files can also be imported into REW to better visualize the effects they have. For example you can easily view the effects they have on the step response and compare group delay between different filters. In the next post I will do just that.
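
If you don't have REW at hand, the same two views can be roughed out in Python. The sketch below is an idealised stand-in for the LR24_250_LR24_2500 file, not the file itself: it assumes each LR24 crossover sums to a second-order all-pass with a Q of 0.707 (true for ideal Linkwitz-Riley filters) and uses standard bilinear-transform biquad coefficients. All function and variable names are invented for the example.

```python
import numpy as np

FS = 44100

def allpass2_coeffs(fc, q=2 ** -0.5, fs=FS):
    """2nd-order all-pass biquad (bilinear transform, pre-warped at fc)."""
    w0 = 2 * np.pi * fc / fs
    alpha = np.sin(w0) / (2 * q)
    b = np.array([1 - alpha, -2 * np.cos(w0), 1 + alpha])
    a = np.array([1 + alpha, -2 * np.cos(w0), 1 - alpha])
    return b / a[0], a / a[0]

def cascade_response(sections, freqs, fs=FS):
    """Complex frequency response of cascaded biquads at freqs (in Hz)."""
    z1 = np.exp(-2j * np.pi * np.asarray(freqs) / fs)
    H = np.ones(len(z1), dtype=complex)
    for b, a in sections:
        H *= (b[0] + b[1] * z1 + b[2] * z1**2) / (a[0] + a[1] * z1 + a[2] * z1**2)
    return H

sections = [allpass2_coeffs(250), allpass2_coeffs(2500)]

# Group delay in ms: negative derivative of the unwrapped phase
freqs = np.linspace(1, 20000, 20000)
phase = np.unwrap(np.angle(cascade_response(sections, freqs)))
gd = -np.gradient(phase, 2 * np.pi * freqs) * 1000

# Step response: inverse FFT gives the impulse response, cumsum integrates it
N = 8192
H = cascade_response(sections, np.fft.rfftfreq(N, 1 / FS))
step = np.cumsum(np.fft.irfft(H, n=N))
```

Plotting `gd` against `freqs` on a log axis and `step` against sample index should give pictures comparable to what REW shows, with the caveat that a measured speaker will also have its own high pass behaviour on top.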

I've already done experiments like this in the past, so I will say the results are not surprising to me, but I think everyone should really give this a try. Headphones will mask the differences less than speakers, but listening via speakers which are close to minimum phase works as well. Otherwise it makes more sense to listen to a file which corrects the speaker's excess phase.

Serious · May 16, 2020

Okay, @Serious that's all fine and dandy, but what do these files do exactly?

Good question! As promised I visualized the effects the files have on the step response, as well as the resulting group delay. Since there are many images I decided to dump them into an imgur folder and post them here in spoilers.

Note that the CSDs will depend a lot on the visualization. Also note that all filters have a perfectly flat response in the passband (DC to 22.05kHz). What you are seeing here is purely due to the phase shift.
If the response isn't extended to DC there will be some phase shift that looks like "longer decay" in the lower midrange. Having a bass bump will somewhat neutralize this phase shift in the frequency range above.

Since there's a limit of 20 images per post I will just link my imgur album here: https://imgur.com/a/lfgmgxP
Do not interpret too much into the CSDs, though.

Overall nothing new. There's a comparison of various filters here: https://www.ranecommercial.com/legacy/note147.html

I think I will post some of my subjective impressions in the next post. Let's say I'm a bit torn as to how I should continue with my speaker project so far, but I think I will go for the more difficult route.

All but the 80Hz LR48 filter should be inaudible according to Wikipedia: https://en.wikipedia.org/wiki/Group_delay_and_phase_delay#Group_delay_in_audio

Serious · May 17, 2020

Note JRiver MC has a feature to automatically normalize the filter volume. Despite there not actually being any gain at any frequency you may otherwise get clipping due to the superposition of different signals. With music that isn't brickwalled I wouldn't bother checking this box, but with modern brickwalled music with less than 6dB headroom I'd use the feature. Without the box checked it is very easy to quickly A/B test it. Simply check and uncheck the convolution in the JRMC DSP Studio.
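
To illustrate why a filter with no gain at any frequency can still clip, here is a small sketch. It assumes an idealised LR12 at 2500Hz with the tweeter inverted (modelled as a first-order all-pass) and feeds it a full-scale square wave of roughly 100Hz, which is about as brickwalled as a signal gets. It is a worst-case illustration, not a model of JRiver's normalization.

```python
import numpy as np

fs, fc, taps = 44100, 2500, 2048

# First-order all-pass impulse response (idealised LR12 sum, tweeter inverted)
t = np.tan(np.pi * fc / fs)
c = (t - 1) / (t + 1)
h = np.concatenate(([c], (1 - c * c) * (-c) ** np.arange(taps - 1)))

# Full-scale square wave, 220 samples per half cycle (roughly 100Hz)
n = np.arange(fs // 10)
square = np.where((n // 220) % 2 == 0, 1.0, -1.0)

out = np.convolve(square, h)[: len(square)]

peak_db = 20 * np.log10(np.max(np.abs(out)))        # peak relative to full scale
worst_case_db = 20 * np.log10(np.sum(np.abs(h)))    # upper bound for any input
```

With these numbers `peak_db` comes out at roughly 7.6 dB above full scale, even though the magnitude response is flat. Real music rarely contains edges this extreme, so the overshoot will normally be smaller.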

I listened to the files on both my speakers and my headphones and also tried the 3-way speaker mentioned above with a file correcting its excess phase (basically an inverted version of what I posted). Here are my subjective impressions (to prevent biasing others I'll spoiler it). No blind testing yet, so this may be placebo (Also some rambling about speaker crossovers):

LR12 at 2500Hz

I prefer it with the tweeter inverted compared to inverting the midrange. Either the treble or midrange and bass get more diffuse, I prefer a more diffuse treble. This way treble transients also get smoothed as opposed to sharpened, which I prefer.* Occasionally some shout in high pitched vocals regardless of tweeter polarity, but nothing major. Makes it sound a bit more artificial, like it limits timbre. Staging seems to get a bit smaller.
*Note that this applies most of the time, but some sounds seem to be recorded with inverted polarity. I think it's not nearly as much as other people say, because I think most music sounds better when played back in normal polarity, but some does sound less diffuse and sharper in reversed polarity. Sometimes it's only certain instruments within the mix, it really depends IME.

Overall, as far as speaker design is concerned I'm not surprised that this is probably the most popular choice out there. The compromises of a first order crossover are major as far as off-axis performance, FR evenness and the requirements on the drivers are concerned. Plus I don't think you can even pull off a first order crossover with any 1"-1.2" tweeter at 2500Hz and the midrange needs to have good response with not much beaming up to 10kHz.

LR24 at 2500Hz

I don't think it has the softness of the LR12 at 2500Hz*, but vocals and imaging are noticeably worse to me. It's to the point where coherency between midrange and treble is lacking. Treble sounds 'tweetery' to me without there actually being a tweeter. I think it's definitely worse than LR12. It's borderline to me. Past LR24 is where it gets really bad. Seriously, download RePhase and try LR48 at 2500Hz. LR48 at 2500Hz sounds very shouty and sizzly to me, basically unlistenable, tbh. Weird incoherent staging, pulling instruments apart a lot.
*I think @purr1n once called it that accurate studio monitor sound. I think I'm trying to describe the same thing.

Reversing polarity

As mentioned above, sounds that were pin-point become diffuse and the other way around. Using a convolution with a unit impulse sounds transparent to me, as it should. In this case it does add a delay, since the impulse is centered.
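
A quick sketch of why the unit impulse convolution should be transparent apart from the delay. It assumes a 2048-tap file with the impulse in the centre; the random array is just a stand-in for audio.

```python
import numpy as np

taps = 2048
unit = np.zeros(taps)
unit[taps // 2] = 1.0             # centred unit impulse
inverted = -unit                  # same impulse with absolute phase swapped

rng = np.random.default_rng(0)
music = rng.uniform(-1, 1, 1000)  # stand-in for an audio signal

delayed = np.convolve(music, unit)
flipped = np.convolve(music, inverted)

delay_ms = (taps // 2) / 44100 * 1000   # delay added by the centred impulse
```

`delayed` is the input shifted by 1024 samples (about 23 ms at 44.1kHz) and `flipped` is the same thing with the sign reversed, so any audible difference between the two comes down to absolute polarity alone.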

LR48 at 80Hz

Okay, 48dB per octave may be somewhat excessively steep for crossing a sub, but this just sounds ridiculously bad to me and very much reminds me of the worst subwoofer implementations. The bass noticeably lags behind the rest of the music.

LR24 at 80Hz

I think this is closer to a typical subwoofer implementation. Sounds much better to me. It doesn't really sound like a delayed bass anymore, mostly just a sludgier, muddier bass. It does sound slower, but not as incoherent.

LR12 at 80Hz

It's likely unrealistic to pull off an LR12 at 80Hz, but this is how I'd try to do it. It's still a bit incoherent and muddy to me - still sounds a bit like a subwoofer, but between having this kind of bass and having no subbass at all I'd probably go for the sub. Invert the sub, not the mains.

LR12 at 250Hz

Sadly the impact on overall coherence is pretty stark to me here. I would not go steeper than LR12. I'd describe it as sounding slower, even a bit thicker. I think even here bass lines become somewhat harder to follow.

8" Widebander with whizzer cone

Believe it or not, I think it sounds somewhat mellower when listening via my HD800. And listening to an inverted version with my OB speakers with Voxativ drivers (which this profile is based on) in order to correct their phase response faults, I actually preferred the uncorrected version. The drivers seem to be voiced with the phase response in mind and the upper midrange occasionally gets shouty with the excess phase corrected. Treble also gets a bit less natural (some bite) when corrected. Funny how that works, isn't it? Pretty much no change to the imaging, overall seems to be a pretty small effect.
I do prefer the normal sound with the HD800, but it's not a big change to me. Sounds pretty much transparent.

4" widebander with whizzer cone

While it hardly affects the midrange, I think the treble really suffers from this filter. I experimented with correcting the excess phase for these speakers a while ago and told @ultrabike about the results already. Treble to me is much less natural uncorrected (or when listening to this filter with my HD800). Sounds spikier, sharper. Unnatural. I'd rate this effect as much worse than the 8" widebander with whizzer. Imaging is worse, too. Note that (like other drivers) widebanders without whizzers have none of these stark excess phase issues, but beaming is a big issue. A driver with a whizzer cone is best thought of as a coax with a mechanical crossover, usually asymmetrical and I suppose something like 1st order LP and 2nd order HP normally. Depends on the driver, of course.

Note that with the widebanders the excess phase changes depending on the angle. These files are based off of measurements I took at a listening angle, not on-axis. Usually there's less excess phase shift on-axis.

LR12 at different frequencies between 1000Hz and 8000Hz

I seem to prefer either 4500Hz or 1500Hz, but for different reasons:

4500Hz seems to mostly keep the vocal range intact, while also not being as bad on the treble as filters at higher frequencies from 6-8kHz. Comparing the phase shift digitally this would be my choice, but in real life you will get bad beaming and overall it might not be such a good choice.

1500Hz conversely seems to keep the whole upper midrange and treble range intact, while also having the least effect on the imaging. Vocals seem to suffer somewhat less than with higher crossover frequencies and FWIW I think 1000Hz is weirder on the vocals (vs 1500Hz) as well. LR12 at 1500Hz is hard to pull off, though. Even with a large tweeter and waveguide it's pretty hard on the tweeter.

LR24 at 1000Hz:

Sounds more disjointed than LR12 at 1500Hz to me. Vocals suffer. LR24 at 1500Hz isn't much better, though. If possible LR12 at 1500Hz seems like it might be a good choice.

3-way speaker corrected:

As mentioned above, I made an inverted version of the 3-way speaker file to correct its excess phase. This way you end up with an awesome looking right-triangle shape step response.
Well, I definitely prefer it. Imaging is better, vocals are more natural, speed seems to be improved. Bass is less sluggish and it sounds tighter. It's not all awesome, though. It seems that by voicing speakers by ear you sort of correct for the perceived FR differences the phase shift results in automatically. In other words: FR becomes subjectively worse (despite not objectively changing at all) in two ways:

The speakers have somewhat of a BBC dip (mild, 1-2dB) and it seems this works well in combating the shout I heard otherwise with the LR24 filter. With the phase response corrected the upper midrange now sadly sounds noticeably distant.

In a similar vein the lower midrange sounds less thick now and it overall ends up sounding a bit lean. It's weird how that works since the FR magnitude doesn't change, but the phase shift seems to draw attention to itself somehow. The treble FR (unevenness) also sticks out a bit more, it seemed to be somewhat hidden before. It's like once the phase is corrected, there's less space to hide other nasties.
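
For the curious, here is a sketch of what "an inverted version" of an excess phase file amounts to. The actual correction file was made in RePhase from a measurement; the example below instead assumes an idealised 3-way (LR24 at 250Hz and 2500Hz, each summing to a second-order all-pass) and uses plain time reversal as the correction, which conjugates the phase. Names are invented for the example.

```python
import numpy as np

FS, N = 44100, 4096

def allpass2_response(fc, freqs, q=2 ** -0.5, fs=FS):
    """Frequency response of a 2nd-order all-pass biquad centred on fc."""
    w0 = 2 * np.pi * fc / fs
    alpha = np.sin(w0) / (2 * q)
    z1 = np.exp(-2j * np.pi * freqs / fs)
    num = (1 - alpha) - 2 * np.cos(w0) * z1 + (1 + alpha) * z1**2
    den = (1 + alpha) - 2 * np.cos(w0) * z1 + (1 - alpha) * z1**2
    return num / den

grid = np.fft.rfftfreq(N, 1 / FS)
H = allpass2_response(250, grid) * allpass2_response(2500, grid)

speaker = np.fft.irfft(H, n=N)   # idealised excess phase impulse response
correction = speaker[::-1]       # time reversal conjugates the phase

combined = np.convolve(speaker, correction)   # speaker followed by correction
step = np.cumsum(combined)                    # step response of the pair
```

`combined` collapses to a single impulse at sample `N - 1`, i.e. the phase shift is gone and only a pure delay of about 93 ms remains. With a real speaker the high pass behaviour of the woofer stays in place, which is why the measured step response ends up as a right triangle rather than a perfect step.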

Serious · Jun 30, 2020

I mentioned this in @k4rstar's D/S thread:

Serious said:

Microphones will also have some phase shift at the extremes considering they are less linear and more bandwidth-limited than electronics. I'm currently experimenting with using FIR filters to get a close to linear phase response (as opposed to minimum phase) from 20Hz - 20kHz, but I'm still undecided as to if it's worth it or not.

Here's an example of such a file. It's a binaural recording of me sitting in a subway train. Due to the file size limits the snip here is rather short, but I think the part where the doors are closing makes the difference the most obvious anyway. That's the part I attached here.
One of them adds 42° of phase shift at 20Hz, the other subtracts 42° at 20Hz, both have a Q of 0.7. The third leaves the phase response as is. This is from a 6mm omni mic with a flat response to well below 20Hz, so the phase shift should be negligible at 20Hz. Adding 45° phase shift at 20Hz is roughly equivalent to a first order HPF with a -3dB point at 20Hz. Most dynamic driver headphones will have more LF rolloff, speakers normally much more. However some orthos may roughly match the microphone in infrasonic extension. Note that there is no difference in the frequency response magnitude, only the phase response. Can you guess which is which?
Note that there is no high pass filter applied to the recording and as such it contains a lot of infrasonic information. The spectrum peaks at 7Hz! Proceed with caution before trying to reproduce it at realistic volume levels*. Speakers without a HPF may exceed Xmax, potentially causing woofer damage! However I think most headphones should be fine, but I can't guarantee it.
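
The 45° figure can be checked with the textbook first-order high pass, whose phase lead at frequency f is atan(fc/f). A small sketch follows; the helper names are made up, and note that the test files themselves only shift the phase while leaving the magnitude flat.

```python
import math

def hpf1_phase_deg(f, fc):
    """Phase lead of a first-order high pass with corner fc, at frequency f."""
    return math.degrees(math.atan(fc / f))

def hpf1_corner_for_phase(f, phase_deg):
    """Corner frequency that produces the given phase lead at frequency f."""
    return f * math.tan(math.radians(phase_deg))

def hpf1_gain_db(f, fc):
    """Magnitude of the same first-order high pass in dB."""
    return -10 * math.log10(1 + (fc / f) ** 2)

phase_at_corner = hpf1_phase_deg(20, 20)        # 45 degrees at the -3dB point
corner_for_42 = hpf1_corner_for_phase(20, 42)   # corner giving 42 degrees at 20Hz
```

So 42° at 20Hz corresponds to a first-order rolloff with its -3dB point at about 18Hz, which is why it is "roughly equivalent" to the 20Hz case.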

Of course it's best to listen on headphones or IEMs with good LF extension, ideally planars or IEMs. Try to guess first. After I tell you which is which it will become much clearer. Try not to look at the waveforms, as that may give it away.

*While this may not seem so loud in real life, the sound is actually quite loud. The sound of the doors closing in the recording corresponds to an A-weighted peak of 85dB, a C-weighted peak of 100dB and an unweighted peak of 117dB!

EDIT: I should mention that while I already feel quite confident in the absolute dB SPL values, I still want to remeasure the microphones and recheck all my editing steps, just to make sure I made no mistakes. As such it's possible the SPL values are slightly off.
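
The large gap between the A-weighted, C-weighted and unweighted figures is what you'd expect from a recording dominated by infrasonic content. The sketch below uses the standard analog A- and C-weighting curves to show how strongly each one attenuates very low frequencies. It only evaluates the weighting curves at single frequencies and does not reproduce the measurement above; the function names are invented.

```python
import math

def a_weighting_db(f):
    """A-weighting gain in dB at frequency f (Hz), normalised to 0dB at 1kHz."""
    f2 = f * f
    ra = (12194.0**2 * f2 * f2) / (
        (f2 + 20.6**2)
        * math.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20 * math.log10(ra) + 2.00

def c_weighting_db(f):
    """C-weighting gain in dB at frequency f (Hz), normalised to 0dB at 1kHz."""
    f2 = f * f
    rc = (12194.0**2 * f2) / ((f2 + 20.6**2) * (f2 + 12194.0**2))
    return 20 * math.log10(rc) + 0.06

# Attenuation at the 7Hz spectral peak and, for reference, at 1kHz
for freq in (7, 20, 1000):
    print(freq, round(a_weighting_db(freq), 1), round(c_weighting_db(freq), 1))
```

At 7Hz the A curve attenuates far more than the C curve, so a signal with most of its energy down there will read much lower in dBA than in dBC, and lower in dBC than unweighted.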

Azimuth · Jun 30, 2020

A to me seems untouched.
B to me seems like adding the phase shift
C to me seems like subtracting the phase shift

C certainly has slightly less "woomph" from the doors closing. A has slightly the biggest. A also has the best detail of the squeal at the end, while C has less of it.

Serious · Jun 30, 2020

Nope, that's not it.
I do agree with the first part. A seems to have more of a thump. Listening to it on my UERMs the difference is much more obvious than on my HD800s, which makes me think that good bass extension really is needed for judging the files. I don't have any planars, so the UERM is the best I can do.
As for the 2nd part, they're only roughly matched in their length and beginning/end, so I wouldn't read too much into it. Also not sure what/who that sound was. I think the editing as such should be transparent; as mentioned above, I can't tell a unit impulse convolution apart from the original. Well, in that case the output is the same as the input, only with a delay.

skem · Jun 30, 2020

with all due respect, 3 seconds of percussive noises does not make for a good test track.

Serious · Jun 30, 2020

That's true. What would you like to hear?
I think most musical instruments are out since we need significant LF content, but maybe some very large bass drums could work? Although a door slamming may be somewhat excessive as far as LF content is concerned.
There are comparisons online for linear phase vs minimum phase filters, like this one: https://www.audiomasterclass.com/ne...se-eq-on-transient-signals-such-as-snare-drum

I want to add a HPF to these binaural recordings, so it is possible to play them back at realistic volume levels without fearing for driver damage. Technically I don't see a problem with using a minimum phase filter, but I'll give the linear phase variant a try.

Serious · Jul 18, 2020

Update: I have tried different high pass filters and they all seem to significantly affect the sound and transparency; however, I don't think it's safe to keep the files as is. Between a 1st order high pass and a 2nd order high pass at 20Hz with a Q of 0.707 the difference in sound is small enough that I'd use the 2nd order filter, just for extra protection. With transducers that can handle it I would actually listen to the files without a HPF, but I'm not going to upload them like that as it's just guaranteed to kill drivers. Almost every song in my library seems to use a much steeper high pass filter at a much higher frequency than that anyway.

I experimented with linear phase and minimum phase high pass filters and I far preferred the minimum phase variant, so I wouldn't really worry about the phase shift at the low end caused by the high pass behaviour itself. It seems you have to fix both the frequency response magnitude and phase response in order to restore the sound. So for pressure-gradient mics it may make sense to EQ the bass so it becomes linear.
I attached 20Hz high pass filter files for experimentation. Both 1st and 2nd order minimum phase and linear phase filters.
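
If you'd rather generate comparable filters yourself, here is a rough sketch for the 2nd order case. It is not how the attached files were made: it assumes a standard bilinear-transform biquad high pass at 20Hz with a Q of 0.707, and derives the linear phase variant by keeping only the magnitude response. The 65536-tap length is my own choice to get enough resolution around 20Hz, and all names are made up.

```python
import numpy as np

FS, N = 44100, 65536

def hpf2_coeffs(fc, q=2 ** -0.5, fs=FS):
    """2nd-order high pass biquad (bilinear transform, pre-warped at fc)."""
    w0 = 2 * np.pi * fc / fs
    alpha = np.sin(w0) / (2 * q)
    cw = np.cos(w0)
    b = np.array([(1 + cw) / 2, -(1 + cw), (1 + cw) / 2])
    a = np.array([1 + alpha, -2 * cw, 1 - alpha])
    return b / a[0], a / a[0]

def response(b, a, freqs, fs=FS):
    """Complex frequency response of a biquad at freqs (in Hz)."""
    z1 = np.exp(-2j * np.pi * np.asarray(freqs) / fs)
    return (b[0] + b[1] * z1 + b[2] * z1**2) / (a[0] + a[1] * z1 + a[2] * z1**2)

b, a = hpf2_coeffs(20.0)
H = response(b, a, np.fft.rfftfreq(N, 1 / FS))

# Minimum phase: the filter's own causal impulse response
h_min = np.fft.irfft(H, n=N)

# Linear phase: keep only the magnitude, then centre the symmetric result
h_lin = np.roll(np.fft.irfft(np.abs(H), n=N), N // 2)
```

Both impulse responses have the same magnitude response. `h_min` starts at sample 0 and rings only after the impulse, while `h_lin` is symmetric around its centre tap, so it rings before the impulse as well and adds about 0.74 s of latency at this length.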

A = +42°, B = 0°, C = -42°

To get back on topic I also generated files in RePhase with same-ish phase response as three multi-BA IEMs I have measured: the Brainwavz B2, UERM and CF Andromeda. Here's what the excess group delay looks like for these filters. Note that I didn't get the TWFK (B2) behaviour down perfectly as it was a little harder to get right than the others. I created these files by hand; generating the FIR filters in software may be a better idea in this case.

To be honest given how these files sound I'm fairly certain that the excess phase may explain more about an IEM's sound than I expected. Maybe I don't even have to improve my IEM measurement rig (thread on that topic here) as much (update is still going to take a few weeks) as I thought since the excess phase seems to explain some of the tonal differences I attributed to the frequency response magnitude. Seriously, compare the files in JRiver (or another software), the differences are very obvious to my ears.

Serious · Feb 23, 2024

@Ardacer reminded me of this thread recently. I finally made a short test file; actually, I reused one of the files from the Ragnarok distortion analysis thread.

Free Dom

Which file is just audible with music?

They all are.

The widebanders are audible.

LR12 in the midrange is just audible.

LR 12 in both the bass and midrange is just audible.

LR 24 is just audible.

The subwoofer is audible.

What are you talking about? They all sound the same to me.
