FIR filter trilemma with limited tap length

cameng318 · Dec 2, 2023

This thread should be irrelevant to you if you don't know / don't care how the DACs work, or you can't hear above 7 kHz. For the others, just skim through the pictures and maybe try the sample files. Don't read too much into it if it makes your head explode.

cameng318 · Dec 2, 2023

I made a few sample files with the last 1.5 minutes of a CD version of Yes's Roundabout. Link here. Listen for yourself to decide which types of filters you like. All the discussions below are about the 4x filters I made for these files.

The -12 dB file is the original sample, volume-matched to the filtered ones. It will show you what your DAC's internal filter sounds like.

cameng318 · Dec 2, 2023

NOS and Zero Stuffing

To 4x oversample the audio files, we must first create placeholders for the three new samples that follow each original sample.
You could either fill them with zeros (zero stuffing):

Or fill them with a copy of the previous sample (sample-and-hold, like NOS DACs):

Zero stuffing creates images of the audio band above 22.05 kHz (called aliasing in the rest of this thread), which need to be filtered out:
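For anyone who wants to poke at this numerically, here is a rough pure-Python sketch of the two placeholder strategies. The 5 kHz test tone, the 441-sample length and all helper names are assumptions made for illustration; they are not what was used for the sample files. It measures the first image of the tone (at 44.1 - 5 = 39.1 kHz) relative to the tone itself.

```python
import cmath
import math

def zero_stuff(samples, factor):
    """Insert factor-1 zeros after every sample."""
    out = []
    for s in samples:
        out.append(s)
        out.extend([0.0] * (factor - 1))
    return out

def sample_hold(samples, factor):
    """Repeat every sample `factor` times (the NOS-style placeholder)."""
    out = []
    for s in samples:
        out.extend([s] * factor)
    return out

def bin_magnitude(signal, k):
    """Magnitude of DFT bin k, computed directly (slow but dependency-free)."""
    n = len(signal)
    return abs(sum(v * cmath.exp(-2j * math.pi * k * i / n)
                   for i, v in enumerate(signal)))

def image_level_db(signal, tone_bin, image_bin):
    """Level of the image bin relative to the tone bin, in dB."""
    return 20.0 * math.log10(bin_magnitude(signal, image_bin)
                             / bin_magnitude(signal, tone_bin))

# Assumed test signal: 441 samples of a 5 kHz sine at 44.1 kHz (exactly 50 cycles).
tone = [math.sin(2.0 * math.pi * 50 * n / 441) for n in range(441)]

# After 4x upsampling there are 1764 samples at 176.4 kHz, so bins are 100 Hz apart.
TONE_BIN, IMAGE_BIN = 50, 391  # 5 kHz and its first image at 39.1 kHz

stuffed_db = image_level_db(zero_stuff(tone, 4), TONE_BIN, IMAGE_BIN)
held_db = image_level_db(sample_hold(tone, 4), TONE_BIN, IMAGE_BIN)
print(f"zero stuffing: image at {stuffed_db:+.1f} dB relative to the tone")
print(f"sample hold:   image at {held_db:+.1f} dB relative to the tone")
```

With zero stuffing the image should come out as strong as the tone itself. Sample-and-hold knocks it down somewhat, but leaves it far above anything a proper filter would allow.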

Handling it the NOS way has a droopy effect, with the famous dip of about -3.17 dB at 20 kHz. Above 20 kHz it flattens the aliasing noise and blends it somewhat in shape with the music's high frequencies. The music now looks like a 1/f pattern, or -20 dB/decade, above 2 kHz.
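The NOS droop figure follows from the sinc-shaped frequency response of a sample-and-hold. A minimal sketch, assuming an ideal hold running at 44.1 kHz with no analog filter after it (the function name is made up for this example):

```python
import math

def hold_droop_db(freq_hz, fs_hz=44100.0):
    """Gain of an ideal sample-and-hold running at fs_hz, in dB relative to DC."""
    if freq_hz == 0:
        return 0.0
    x = math.pi * freq_hz / fs_hz
    return 20.0 * math.log10(abs(math.sin(x) / x))

for f in (1000, 10000, 20000, 22050):
    print(f"{f:>6} Hz: {hold_droop_db(f):6.2f} dB")
```

Under that assumption the 20 kHz line should land within a few hundredths of a dB of the figure quoted for NOS, and the droop keeps growing toward 22.05 kHz.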

Assume we aren't NOS fans here, and we would like to completely filter out the aliasing bands. We would need an FIR filter to do so. I won't elaborate on it here as I suck at teaching. There should be plenty of videos about them on YouTube if you are interested.

cameng318 · Dec 2, 2023

Typical tap lengths

Here I gathered the tap lengths and bits found in some audio chips:
- SAA7220 - 120 taps, 16 bits. This is the OS chip that goes with TDA1541.
- AK4493 - 32 bit filters. Don't know about how many taps. The 32 bit is probably about the arithmetic unit, not the filter coefficients.
- CS43198 - 54 taps 16 bits, programmable.
- ESS9069 - 128 taps for 2X filter, 32 taps for 4X filter, 24 bits, programmable.

I didn't read too much into the datasheets, so I could be wrong about some of them. I don't know why the 2X filter of ESS9069 has more taps than the 4X filter. Perhaps the 2X filter is actually a 4X filter for 2X content (88.2 kHz music), and the 4X filter is actually a 2X filter for 4X content (176.4 kHz music).

This thread isn't about shaming those chips. It's hard work to make the FIR filters in chips with limited resources. I pretty much failed my FPGA class project to create a microcontroller. All the individual parts worked, but they didn't work when put together. It's pretty PITA to design a chip that wants to do everything.

cameng318 · Dec 2, 2023

Filters with 128 taps

Let's give all those chips the benefit of the doubt. I designed these 4x OS filters with 128 taps and 24 bits. I could come up with three types of filters, each with its own drawback.

Filter A - leaky

This filter has 1 dB of droop at 20 kHz and is pretty much flat to 19 kHz. It filters the aliasing bands down to -110 dB. However, it lets the aliasing through between 22.05 kHz and 30 kHz.

Filter B - grassy

This filter is a little droopy at 20 kHz, but still flat to 19 kHz. It starts mowing down the aliasing right at 22.05 kHz. However, it only mows it down by 50ish dB.

Filter C - droopy

This filter completely filters out the aliasing bands above 22.05 kHz. However, it's super duper droopy, much droopier than NOS. The -3 dB frequency is at 13 kHz, so it's pretty much as droopy as some cassette players or phono cartridges.
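The three compromises can be reproduced in spirit with a Gaussian-windowed sinc, the window type mentioned later in the thread. This is only a sketch: the cutoffs and taper strengths below are guesses picked to mimic the descriptions, the coefficients are double precision rather than 24-bit, and every name is hypothetical. None of it is the actual filter design behind the sample files.

```python
import math

FS_OUT = 176400.0  # 4x 44.1 kHz

def gaussian_sinc_lowpass(num_taps, cutoff_hz, sigma_ratio, fs_hz=FS_OUT):
    """Windowed-sinc lowpass. sigma_ratio = half-length / Gaussian sigma:
    larger means a stronger taper (cleaner stopband, wider transition)."""
    mid = (num_taps - 1) / 2.0
    sigma = mid / sigma_ratio
    taps = []
    for n in range(num_taps):
        k = n - mid
        x = 2.0 * cutoff_hz / fs_hz * k
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        taps.append(sinc * math.exp(-0.5 * (k / sigma) ** 2))
    total = sum(taps)
    return [t / total for t in taps]  # normalise to 0 dB at DC

def gain_db(taps, freq_hz, fs_hz=FS_OUT):
    """Magnitude response at one frequency, in dB."""
    w = 2.0 * math.pi * freq_hz / fs_hz
    re = sum(t * math.cos(w * n) for n, t in enumerate(taps))
    im = sum(t * math.sin(w * n) for n, t in enumerate(taps))
    return 20.0 * math.log10(max(math.hypot(re, im), 1e-12))

def worst_case_db(taps, start_hz, stop_hz, step_hz=250.0):
    """Highest gain found on a frequency grid, i.e. the weakest attenuation."""
    steps = int((stop_hz - start_hz) / step_hz)
    return max(gain_db(taps, start_hz + i * step_hz) for i in range(steps + 1))

# Hypothetical parameter choices, all with 128 taps.
variants = {
    "leaky": gaussian_sinc_lowpass(128, 22200.0, 4.0),   # transition spills past 22.05 kHz
    "grassy": gaussian_sinc_lowpass(128, 21000.0, 2.0),  # weak taper: steeper, shallow stopband
    "droopy": gaussian_sinc_lowpass(128, 14000.0, 4.0),  # cutoff pulled down to clear 22.05 kHz
}

for name, taps in variants.items():
    print(f"{name:7s} 20 kHz: {gain_db(taps, 20000.0):7.2f} dB   "
          f"22.05 kHz: {gain_db(taps, 22050.0):7.2f} dB   "
          f"worst above 30 kHz: {worst_case_db(taps, 30000.0, 88200.0):7.2f} dB")
```

With the tap count fixed, the only knobs are where the cutoff sits and how hard the window tapers, so improving one column of the printout makes another one worse.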

cameng318 · Dec 2, 2023

Filter with 1024 taps

Let's raise the game up a little bit with 1024 taps and 32 bit coefficients:

Now it's pretty much flat to 20 kHz and nukes all the aliasing by 150+ dB. 1024 taps is just long enough to do the job. Going longer can flatten the frequency response even more, but longer filters have other effects on the sound. That's another topic for another day.

Note that this 1024-tap linear-phase filter at 176.4 kHz only introduces 2.9 ms of delay. That's pretty much imperceptible even if it stacks up a few times.
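The delay figure is just half the filter length divided by the sample rate, which holds for any symmetric (linear-phase) FIR. A quick check, with a made-up helper name:

```python
def linear_phase_delay_ms(num_taps, fs_hz):
    """Group delay of a symmetric (linear-phase) FIR: half its length."""
    return (num_taps - 1) / 2.0 / fs_hz * 1000.0

print(f"1024 taps @ 176.4 kHz: {linear_phase_delay_ms(1024, 176400.0):.2f} ms")
print(f" 128 taps @ 176.4 kHz: {linear_phase_delay_ms(128, 176400.0):.2f} ms")
```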

cameng318 · Dec 2, 2023

Gears:
DAC: Gungnir A2 (its internal filter to 352.8 kHz is probably still in play)
Amp: My DIY amp with Noise Nuke and CMLI-15/15B
Headphones: HE6SE with rear screen removed, LCD2 with AHG pads and silver wires
Threshold of hearing: 17 kHz ish
Age: mid 20s (sorry I faked my birthday when registering this site, perhaps I should correct and reveal it)

I like the sounds that give me the least listening fatigue, so I liked the 1024 taps filter and filter C the best. They sound the most smoothed off at the top end, and also least digitus. The 1024 taps filter makes me want to listen more, while filter C gives me the vintagish analogish kind of feel.

To my surprise, I think filter A and B sound worse than NOS and zero stuffing. Zero stuffing sounds kinda piercing in the highs, but it isn't as annoying as the filter A and B.

The NOS filter has some digitus, but it isn't that annoying to me. It also seems to have more air and space than other filters. I like it but still would prefer the 1024 taps filter. The analog filter in my Gungnir A2 is probably doing a great job taming the NOS too.

The internal FIR filter of my Gungnir A2 sounds wet, in the reverb type of way, which means it probably has many more taps than 1024. It also does a great job filtering out the digitus sounds. It's on par with the 1024 taps filter to me. I'll pick between them depending on how much wet I want to add to my headphones.

cameng318 · Dec 2, 2023

Pull out of my ass theory about why ultrasonic matters

All I know about neuroscience comes from a few podcast episodes, so please don't quote me on this theory. I'm a total noob in this field, and everything I wrote below could be wrong.

While the world is crazy about measuring the SINAD of gear, I don't think distortion in our ears has been measured much. I would absolutely love to see a distortion surface of our cochlea. I would bet it's H2 and H3 dominated like every other audio transducer in this physical world. Thus ultrasonics up to 2x or 3x our threshold of hearing might be perceived when their audible fundamental frequencies are present. That's how I came up with the 7 kHz criterion in the first post.

In my theory, it will sound digitus if the ultrasonics keep cutting in and out, making it hard for our brain to find the pattern. Hence we should either completely filter them out, or leave them in the 1/f shape like NOS.

Hit me up if there are clues for my theory to be proved or disproved.

cameng318 · Dec 2, 2023

Relating NOS to Foveon Sensors

The aliasing effect isn't the end of the world. This recent blog post shows that the crunchiness of the Foveon sensors might be caused by sharpening the aliasing artifacts. NOS is kinda like that in this regard. I get how the Foveon sensors look great and how NOS sounds great. Maybe their fans are the same group of people.

NOS probably fools our ears and brain into thinking that the aliasing frequencies are the harmonics of the audible frequencies. To improve this effect, maybe we can train an AI to recreate all the ultrasonics above 20 kHz. Too bad AI won't be a strength of mine any time soon.

cameng318 · Dec 2, 2023

By no means is this thread comprehensive. There are so many other things in digital filters that also have an audible effect, such as minimum-phase filters, IIR filters, windowing functions, number of bits, noise shaping and dithering… Some I may write about in the future, and some I might keep as my secret sauce. Taking care of the FIR filter still didn't completely cure what I don't like about some DS DACs. There are still plenty of mysteries to solve in the audio world. I'm glad I won't ever be bored for the rest of my life.

cameng318 · Dec 2, 2023

Bonus:
ReplayGain scanning the files:

Nearly all the FIR filters bring up the track peak by 10-15%, while the volume remains pretty much the same. This is why we need to leave the FIR filter a few dB of headroom to prevent clipping.
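The peak growth comes from the filter reconstructing waveform peaks that fall between the original samples. A small sketch of the textbook worst-case style example, assuming an 11.025 kHz tone sampled 45 degrees away from its crests and a hypothetical 255-tap Gaussian-windowed sinc (not the filters used for the files):

```python
import math

def gaussian_sinc_lowpass(num_taps, cutoff_ratio, sigma_ratio=4.0):
    """Gaussian-windowed sinc; cutoff_ratio = cutoff / sample rate, 0 dB at DC."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        k = n - mid
        x = 2.0 * cutoff_ratio * k
        sinc = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        taps.append(sinc * math.exp(-0.5 * (k * sigma_ratio / mid) ** 2))
    total = sum(taps)
    return [t / total for t in taps]

def oversample(samples, factor, taps):
    """Zero-stuff by `factor`, then FIR filter. Only fully overlapped outputs are kept."""
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0.0] * (factor - 1))
    out = []
    for m in range(len(taps) - 1, len(stuffed)):
        acc = sum(t * stuffed[m - k] for k, t in enumerate(taps))
        out.append(factor * acc)  # gain of `factor` makes up for the inserted zeros
    return out

# Assumed test tone: fs/4 (11.025 kHz at 44.1 kHz), sampled 45 degrees off its peaks.
tone = [math.sin(math.pi * n / 2.0 + math.pi / 4.0) for n in range(256)]
taps = gaussian_sinc_lowpass(255, 22050.0 / 176400.0)

peak_in = max(abs(s) for s in tone)
peak_out = max(abs(s) for s in oversample(tone, 4, taps))
print(f"sample peak before: {peak_in:.3f}, after 4x filtering: {peak_out:.3f} "
      f"({20.0 * math.log10(peak_out / peak_in):+.2f} dB)")
```

This tone is a deliberately nasty case, so the rise should come out near 3 dB. Real music rarely lines up that badly, which fits the smaller rise seen in the ReplayGain scan.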

purr1n · Dec 2, 2023

cameng318 said: ↑

Gears:
DAC: Gungnir A2 (its internal filter to 352.8 kHz is probably still in play)
Amp: My DIY amp with Noise Nuke and CMLI-15/15B
Headphones: HE6SE with rear screen removed, LCD2 with AHG pads and silver wires
Threshold of hearing: 17 kHz ish
Age: mid 20s (sorry I faked my birthday when registering this site, perhaps I should correct and reveal it)

I like the sounds that give me the least listening fatigue, so I liked the 1024 taps filter and filter C the best. They sound the most smoothed off at the top end, and also least digitus. The 1024 taps filter makes me want to listen more, while filter C gives me the vintagish analogish kind of feel.

To my surprise, I think filter A and B sound worse than NOS and zero stuffing. Zero stuffing sounds kinda piercing in the highs, but it isn't as annoying as the filter A and B.

The NOS filter has some digitus, but it isn't that annoying to me. It also seems to have more air and space than other filters. I like it but still would prefer the 1024 taps filter. The analog filter in my Gungnir A2 is probably doing a great job taming the NOS too.

The internal FIR filter of my Gungnir A2 sounds wet, in the reverb type of way, which means it probably has many more taps than 1024. It also does a great job filtering out the digitus sounds. It's on par with the 1024 taps filter to me. I'll pick between them depending on how much wet I want to add to my headphones.

The Gungnir A2 analog filter is expecting x8 OS, so the filter will probably be closer to 100 kHz to avoid too steep a filter. However, your CMLI-15 line input transformers will also be doing a bit of low pass work as well. I'm curious now about their bandwidth.

Check out Resampler-V for Foobar. Quite a few options to play with in respect to passband width, stopband frequency, stopband attenuation, noting its effect on impulse response ringing. It's a matter of tradeoffs (if you believe impulse response ringing is audible). It makes me barf when I read ASR applauding stopband attenuation numbers because it's more complex than that.

cameng318 · Dec 2, 2023

purr1n said: ↑

The Gungnir A2 analog filter is expecting x8 OS, so the filter will probably be closer to 100 kHz to avoid too steep a filter. However, your CMLI-15 line input transformers will also be doing a bit of low pass work as well. I'm curious now about their bandwidth.

My potentiometer is 10k, so the -3 dB frequency is probably around 50 to 60 kHz according to the datasheet. Yep, that's some filtering too. I don't know where the Gungnir A2 cuts off, but it makes sense to run 8x OS with the analog filter cutting at 50 kHz too. 8x OS just makes the analog filter's work easier, and requires a less steep filter as well.

purr1n said: ↑

Check out Resampler-V for Foobar. Quite a few options to play with in respect to passband width, stopband frequency, stopband attenuation, noting its effect on impulse response ringing. It's a matter of tradeoffs (if you believe impulse response ringing is audible). It makes me barf when I read ASR applauding stopband attenuation numbers because it's more complex than that.

I prefer to look at the absolute value of impulse response on dB scale. Here's a sinc filter without windowing function:

Then you'll see that plenty of the ringing is within the audible range. It's what the PGGB filter or a perfect brickwall filter looks like. It smears the sound in a quite pleasant way that many people like.

Then here's what my 1024 tap filter looks like:

It doesn't really last long enough to cause an audible smearing effect. It's about the same amount of delay as 240 Hz monitors.
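To compare the two impulse responses in numbers rather than pictures, one option is to look at how loud the far tail still is, in dB relative to the centre tap. The sketch below assumes a 21 kHz cutoff and a Gaussian sigma of one quarter of the half-length; both are guesses, not the settings behind the plots.

```python
import math

def sinc_lowpass(num_taps, cutoff_ratio, sigma_ratio=None):
    """Truncated sinc lowpass; sigma_ratio=None means no window at all,
    otherwise a Gaussian window with sigma = half-length / sigma_ratio."""
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        k = n - mid
        x = 2.0 * cutoff_ratio * k
        value = 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)
        if sigma_ratio is not None:
            value *= math.exp(-0.5 * (k * sigma_ratio / mid) ** 2)
        taps.append(value)
    return taps

def impulse_db(taps):
    """|h[n]| in dB relative to the largest tap."""
    peak = max(abs(t) for t in taps)
    return [20.0 * math.log10(max(abs(t) / peak, 1e-15)) for t in taps]

def tail_peak_db(taps, tail_len):
    """Loudest value within the first tail_len taps (the far pre-ringing)."""
    return max(impulse_db(taps)[:tail_len])

FS = 176400.0
CUTOFF = 21000.0 / FS  # assumed cutoff, not taken from the thread
TAIL = 112             # taps more than about 2.3 ms before the centre

plain = sinc_lowpass(1024, CUTOFF)
windowed = sinc_lowpass(1024, CUTOFF, sigma_ratio=4.0)
print(f"no window:       tail peaks at {tail_peak_db(plain, TAIL):6.1f} dB")
print(f"Gaussian window: tail peaks at {tail_peak_db(windowed, TAIL):6.1f} dB")
```

The unwindowed sinc only decays as 1/n, so its tail stays tens of dB louder than the windowed one at the same distance from the centre.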

I haven't fully fleshed out what to look for in an impulse response shown like this. That's why I didn't show it in the first place. Showing the impulse response like this is probably a pain to work with in DAC measurements. All the DC offsets and the analog filter probably mess things up. I tried it with my RME ADI-2, but so far no clear results.

On a side note, windowing functions matter more than passband/stopband etc. I hate to admit it, but that's just what I hear. I could use fancier windowing functions to achieve better passband/stopband performance with fewer taps, but they just don't sound as good as the good old Gaussian window, which I used for all samples in this thread. I have some clues about the cause, but I'm still not quite sure. Still need more experiments.

roshambo123 · Dec 2, 2023

cameng318 said: ↑

the crunchiness of the Foveon sensors might be caused by sharpening the aliasing artifacts. NOS is kinda like that in this regard. I get how the Foveon sensors look great and how NOS sounds great. Maybe their fans are the same group of people.

Interesting observation. I don't know if it's entirely true, as I think Foveon crunch is mainly due to the sensor being low resolution and wildly noisy and, as Jim Kasson points out, some oversharpening from Sigma software.

What makes Foveon appealing has nothing to do with over/under sampling I think, but the tone curve. The Foveon images have a film-like tone response (except when clipping violently) and that's what people largely enjoy, the other piece being handling of high color saturation. For comparison, you can utilize pixel shift on a Bayer sensor to sample each pixel in a manner like Foveon and the tone curve and colors of the final image are still very Bayer looking. We're talking about how light filters through multiple layers of silicon (or not) versus over or undersampling on DACs and it's not really comparable in my mind.

Where your analogy works is saying something like "NOS comes off a bit more analog sounding than 2x/4x/8x, and people like that for the same reasons they like Foveon, which is another digital tech that has some analog-like behaviors." That being said, I don't know the reasons for those individual analog behaviors are directly comparable, although you may have a proof I missed in some of those droop charts, such as the reason for the NOS tone is buried in the droops or something.

Northwest · Dec 2, 2023

Not a question strictly to cameng318, but to anyone who has insight.

I wanted to participate in this using ASIO with ROON -> Schiit Eitr (S/PDIF) -> Modi Multibit 1 -> Amp

But it is absolutely not having it with the 176.4 files unless I down sample to 88.2.

I was wondering with the down sampling if my impressions would have any validity.

cameng318 · Dec 2, 2023

roshambo123 said: ↑

Where your analogy works is saying something like "NOS comes off a bit more analog sounding than 2x/4x/8x, and people like that for the same reasons they like Foveon, which is another digital tech that has some analog-like behaviors." That being said, I don't know the reasons for those individual analog behaviors are directly comparable, although you may have a proof I missed in some of those droop charts, such as the reason for the NOS tone is buried in the droops or something.

Yeah, I think my thoughts aren't fully fleshed out yet. I just saw the blog post yesterday, and I'm trying to draw parallels between them. A large portion of the Foveon look definitely comes from post-processing like tone and sharpening, but I think I can apply the same processing to Bayer sensors and have them look similar, except maybe the color science will be a little off due to different spectral response. Resolution, noise and spectral response are the only real differences from my signal processing perspective.

Going by my gut feeling, I think it's the 1/f characteristic giving the analog feel. To me pink noise sounds more analog than white noise. I mentioned the 1/f somewhere, but didn't draw it out. Here's how both use aliasing to recreate that 1/f response:

cameng318 · Dec 2, 2023

Northwest said: ↑

I was wondering with the down sampling if my impressions would have any validity.

Hmmm, I didn't think about the 88.2 kHz case. Improperly downsampling to 88.2 kHz would fold some of the aliasing back into the audible range, so it would make the effects much more obvious, and sorta defeat the purpose of the test.
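As a rough illustration of the fold-back, here is where the ultrasonic images of a 1 kHz tone would land if every other sample were simply dropped with no low-pass filter first. The tone frequency and the function name are made up for the example, and a decent resampler would filter before decimating, so this is the worst case rather than what any particular player does.

```python
def folded_frequency(freq_hz, new_fs_hz):
    """Where a component lands after dropping samples to new_fs_hz with no filter."""
    f = freq_hz % new_fs_hz
    return min(f, new_fs_hz - f)

# Images of an assumed 1 kHz tone in a 4x zero-stuffed 176.4 kHz file.
for image_hz in (43100.0, 45100.0, 87200.0):
    landed = folded_frequency(image_hz, 88200.0)
    print(f"{image_hz / 1000:5.1f} kHz -> {landed / 1000:5.1f} kHz")
```

The images around 44.1 kHz stay ultrasonic, but the one just under 88.2 kHz drops straight into the audio band.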

I used to have a Modi Multibit too. I think its USB port could take 176.4 kHz, so perhaps use the USB port instead of the Eitr?

roshambo123 · Dec 2, 2023

cameng318 said: ↑

A large portion of the Foveon look definitely comes from post-processing like tone and sharpening, but I think I can apply the same processing to Bayer sensors and have them look similar, except maybe the color science will be a little off due to different spectral response.

The color science is the interesting part.

Foveon does some really interesting things with highly saturated colors that are both filmic and very different from Bayer. Bayer, for example, tends to clip pure reds easily, and even with proper exposure you really struggle to bring back fine gradations in an image with any pure primary colors. Fujifilm has a feature called "Color Chrome Effect" which is meant to address this, and it does to some degree, but it's a nuisance to use and slows down operation of the camera. Foveon and film both seem able to deliver extremely saturated colors with a lot of gradations easily. My thinking is the Foveon stacked CFA layers have vastly different spectral responses vs. Bayer, possibly reduced sensitivity to extremely saturated colors, reducing overexposure and clipping.

cameng318 · Dec 2, 2023

roshambo123 said: ↑

Bayer for example tends to clip pure reds easily and even with proper exposure you really struggle to bring fine gradations of an image with any pure primary colors back.

I haven't looked into how that part is handled, but this could make sense. It feels like another can of worms. Bayer sensors do have less resolution in pure colors, but we also have 61MP sensors with huge dynamic range nowadays. I'd expect the gap to close somewhat.

My other guess about the color gradation issue is the lack of dithering. Modern sensors are way too clean in certain aspects, despite shitty color noise, and then banding could become a problem. Adding some noise during post-processing might help. It would cover up the color noise too. In my experience quantization error is the nastiest thing in both audio and photo.

soumya · Dec 28, 2023

cameng318 said: ↑

Typical tap lengths

Here I gathered the tap lengths and bits found in some audio chips:
- SAA7220 - 120 taps, 16 bits. This is the OS chip that goes with TDA1541.
- AK4493 - 32 bit filters. Don't know about how many taps. The 32 bit is probably about the arithmetic unit, not the filter coefficients.
- CS43198 - 54 taps 16 bits, programmable.
- ESS9069 - 128 taps for 2X filter, 32 taps for 4X filter, 24 bits, programmable.

I didn't read too much into the datasheets, so I could be wrong about some of them. I don't know why the 2X filter of ESS9069 has more taps than the 4X filter. Perhaps the 2X filter is actually a 4X filter for 2X content (88.2 kHz music), and the 4X filter is actually a 2X filter for 4X content (176.4 kHz music).

This thread isn't about shaming those chips. It's hard work to make the FIR filters in chips with limited resources. I pretty much failed my FPGA class project to create a microcontroller. All the individual parts worked, but they didn't work when put together. It's pretty PITA to design a chip that wants to do everything.

Delta-sigma DACs reach the needed MHz sampling rates in two broad stages: 8x or more interpolation using FIR filters, followed by 16x or more ZOH and then a gentler IIR filter of 2nd order or so.

The interpolation stage needs a significant number of multiplications if it is done in a single stage, which most DAC chips don't have the resources for. So it's done in cascaded stages yet again.
In the case of ESS, the first 2x FIR filter is 128 taps if the channels are not clubbed together. If the channels are clubbed (to extract more SNR), the 2x FIR filter can use 256 tap coefficients for a steeper filter.
This is followed by a 32-tap 4x filter, which is obviously less steep.
That's how you get the needed 8x digital filtering.
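A back-of-the-envelope count shows why the cascade is cheaper. The sketch assumes polyphase interpolators (which skip the inserted zeros) and uses the rule of thumb that a single 8x stage needs about four times as many taps as the 2x stage to keep the same transition width in Hz. The 512-tap figure is that assumption, not a number from any datasheet.

```python
def polyphase_macs_per_second(num_taps, input_rate_hz):
    """Multiply-accumulates per second for one channel of a polyphase interpolator.
    The inserted zeros are skipped, so every input sample costs num_taps MACs
    spread over the output samples it produces."""
    return num_taps * input_rate_hz

FS_IN = 44100

# Cascade as described: 128-tap 2x stage, then a 32-tap 4x stage fed at 88.2 kHz.
cascade = (polyphase_macs_per_second(128, FS_IN)
           + polyphase_macs_per_second(32, 2 * FS_IN))

# Assumption: one 8x stage with the same steepness needs roughly 4 x 128 = 512 taps,
# because its taps are spaced four times closer together in time.
single_stage = polyphase_macs_per_second(512, FS_IN)

print(f"cascaded 2x + 4x : {cascade / 1e6:.2f} M MAC/s per channel")
print(f"single-stage 8x  : {single_stage / 1e6:.2f} M MAC/s per channel")
```

Under these assumptions the single stage costs well over twice as much arithmetic, before counting the extra coefficient storage.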

Compare that to the single-stage filtering that DACs like Rockna WaveDream, Soekris (the 8x part of the 32x final rate) and Chord (the 16x part of the final 256x rate) do, which is accomplished using an FPGA.
