HQPlayer vs SOX

Garns · Nov 24, 2021

./configure --with-flac will attempt to compile in flac support. It needs libFLAC to link dynamically against. You could also make a static compile of flac and pass it in to the linker. There are some instructions here.

Scott Kramer · Nov 28, 2021

Thanks! ./configure --with-flac was it.

Woland · Dec 18, 2021

Archimago chips in...

http://archimago.blogspot.com/2021/12/upsampling-native-dac-playback-and-sox.html

ohshitgorillas · Dec 29, 2021

The talk around town here has got me curious about HQPlayer and upsampling in general. Since my main DAC is an Android DAP which can't run HQP, and my other DAC is a Bifrost 2 which is 'limited' to 16-bit / 192 kHz, I think SoX is my best bet. Also, I'm a huge nerd and I almost always prefer the DIY route over pre-paid solutions, and I've been looking for an excuse to get back into learning Linux.

Right now my goal is to create upsampled versions of a handful of albums (offline), upsampled to 16X, to defeat the interpolation filters on my Shanling M8.

I've downloaded the modified version from this thread but can't compile it:
Code:
checking for pkg-config... /usr/bin/pkg-config
checking pkg-config is at least version 0.9.0... yes
./configure: line 10252: syntax error near unexpected token `-fstack-protector-strong'
./configure: line 10252: `AX_APPEND_COMPILE_FLAGS(-fstack-protector-strong)'
I've also downloaded a modified version with DSD support (https://github.com/mansr/sox) and applied @Garns' modifications to the files in src to raise the number of taps, then recompiled, but for some reason I'm still capped at 32767 taps.

Also, using the formula above, I'm successfully able to create 4X upsampled versions but I'm getting an error when I try to create 16X upsampled versions at 705.6kHz (changing the sample rate to 705600 and "upsample 16"):
Code:
/home/viserion/bin/sox-dsd/src/.libs/sox FAIL formats: can't open output file `test16x.flac': FLAC__STREAM_ENCODER_IO_ERROR
edit: I am able to create WAV files at 705.6kHz, just not FLACs... apparently FLAC is limited to 655,350 Hz.

edit2: figured out why I couldn't increase the tap size; I had missed the changes to the rate and sinc files.
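
For anyone following along, the FLAC ceiling mentioned in the first edit can be checked with a quick calculation. This is only a sketch: the 655,350 Hz figure is the one quoted above, and the names `FLAC_MAX_RATE_HZ` and `fits_in_flac` are made up for illustration.

```python
# Which integer multiples of 44.1 kHz fit under the FLAC limit quoted above?
FLAC_MAX_RATE_HZ = 655_350  # limit as quoted in the post (assumption: taken at face value)
BASE_RATE_HZ = 44_100       # CD sample rate


def fits_in_flac(factor, base=BASE_RATE_HZ):
    """Return True if base * factor does not exceed the FLAC limit."""
    return base * factor <= FLAC_MAX_RATE_HZ


for factor in (4, 8, 16):
    rate = BASE_RATE_HZ * factor
    verdict = "fits in FLAC" if fits_in_flac(factor) else "too high for FLAC, use WAV"
    print(f"{factor:>2}x -> {rate} Hz: {verdict}")
```

On these numbers 4X and 8X stay under the limit, while 16X (705,600 Hz) does not, which matches the error above.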

fastfwd · Dec 29, 2021

ohshitgorillas said: ↑

my other DAC is a Bifrost 2 which is 'limited' to 16-bit / 192 kHz

24/192

ohshitgorillas said: ↑

./configure: line 10252: syntax error near unexpected token `-fstack-protector-strong'
./configure: line 10252: `AX_APPEND_COMPILE_FLAGS(-fstack-protector-strong)'

If you're using a VERY old version of GCC (like 4.8.3), you might have to change that to just "-fstack-protector".

ohshitgorillas · Dec 30, 2021

fastfwd said: ↑

24/192

It's definitely capable of accepting 24 bits. I was referring to the fact that Schiit claims their multibit DACs are capable of a certain "real" bit depth below 24 like how people were resampling to 21bit earlier in the thread. I recall reading that the Bifrost 2 has 16 "real bits" although maybe I'm misunderstanding.

ohshitgorillas · Jan 2, 2022

Here are some additional resources I've found helpful for following along with this discussion, for anyone else who is curious:

DAC Digital Filters and their impact in the time and frequency domains
DAC Digital Filters part 2: Deeper dive into the AK4490 and AK4493 filters
These posts demonstrate the trade-offs in oversampling filters. I'm not sure I agree with all of his opinions or interpretations (difficult for most, if not all, to hear? it's simple if you know what to listen for...), but otherwise it's a great explanation.

This is a very clear explanation of FIR design and window functions, plus it demonstrates the custom filter designer for MATLAB/octave referenced earlier.

There are still a few aspects of this that I don't quite understand yet but I'm getting there.

Interesting that many people consider sharp filters to be superior... with headphones, I usually prefer the sense of depth and separation offered by slow filters over the relative 'in your face' aggressiveness of sharp filters. I don't mind sharp filters on speakers, though I haven't really done any A/B testing on speakers.

I will also say after testing that the files I've upsampled using @Garns 64 million tap sinc filter at 4x, 8x, and 16x do sound better than their 16 bit / 44.1 kHz counterparts. The upsampled files make the originals sound somewhat compressed, constrained, congested. What I hear from my Shanling M8 is, in particular, less bloat and better separation in the low end during busy tracks; more realistic and natural textures; improved plankton retrieval, staging, and spatial cues; and generally cleaner sound. The benefits are subtle, e.g. if you can't hear the difference between a slow and sharp filter then it's probably a waste of time... but for those audiophiles dropping major cash chasing 1-2% improvements, this one is solid and free.

I'm still trying to decide where the sweet spot is for me--4X, 8X, or 16X. Probably depends on the quality of the original recording, but so far I don't hear a ton of difference between 8X and 16X... at least not anything that justifies the massive increase in file size from FLAC to WAV. I also have yet to try upsampling to DSD.

soumya · Jan 3, 2022

Happy New Year to all friends here!

Returning here after a while - tight deadlines on work front meant more Java and less (nearly 0) DSP fiddling in Python/Octave.
No updates from Henrik for a new camilladsp release. There are about 16 items open - so quite some items on his plate.
I will wait for some more time, else will start the project using sox as intermediate pipeline for up-sampling.

soumya · Jan 3, 2022

Perfect, on the right track !
Yes it will look daunting at first. But it's also more rewarding in long run.

Some quick responses -
1. Steep filters won't sound bad if they are designed correctly, with the number of taps (the computational resources available), the optimal transition width, and the sampling rate in mind. However, with long, steep filters you do start hearing the flaws of downstream components more vividly.
2. The sweet spot, IMO, varies greatly depending on whether it's an oversampling or a NOS DAC. For Delta Sigma, there is no other option but to defeat the initial 8x digital interpolation filters. You still have zero-order hold or IIR filters after that to take it to the MHz region before the modulator comes into play.
Or convert to DSD, but then the modulator of the DSD encoder followed by the modulator of the DAC chip will still determine how things finally sound.
For NOS R2R DACs, 4x is a very good sweet spot, beyond which improvements become more subtle, if perceptible at all. Speaking from my experience with Holo Spring via IIS. At every sampling rate make sure you are using the same transition width. Assuming other parameters are optimal, it's the steepness of the transition that determines how much transient information gets recovered.
3. There is more to the Kaiser beta parameter than I talked about in this thread.
Here is the thing - a Rectangular Window function not only takes an insane amount of resources for similar attenuation of side-lobes, which others have figured out; the tones will also sound way too soft and fuzzy, and won't convey as much subtle detail and room/ambient information.
We have to, after all, give the dominant energy in a window its own space to distinguish it from other tones + noise. But we should not attenuate all other energy so much that it makes it sound too thin.

In DSP, it's all about balance. We just can't take one extreme.
The other issue is our hearing takes a while to acclimatize and only after an extended run we understand if the changes are good or hurting. Digital filters are not supposed to wow us but present a more subtle improvement which we need to evaluate with as much variety of content before reaching a conclusion.

Coming back to the Kaiser Window - there seems to be a sweet spot for the attenuation of side-lobes (beta parameter).
Below this value, the tones sound thick, peaks sound blunted, soft, fuzzy, and the background is more grey than black.
Above this value, it begins sounding thin and, importantly, grainy.

audiofool · Jan 19, 2022

I tried your coefficients with sox for 44.1 to 176 - very impressive! I need to upsample to 705/768 at 32 bits from each of 44.1, 48, 96, 176, 192. Is it possible you could post the coefficient files? I noticed the volume is lower than using sox rate, is there a gain reduction built in? I have my own workflow adjusting the gain in advance after converting to 64 bit float so I don't need any gain reduction.
Thanks

audiofool · Jan 22, 2022

Thanks for the info, I am starting down the rabbit hole with Octave.
Trying to create an apodizing linear phase filter upsampling 16x.
Found some code to start with, not sure if I have this right. Corner frequency set at 21 kHz, which I think makes it apodizing?

Thanks for any input.

fn=352800 % Nyquist freq. (Hz)
fc=21000 % Corner freq. (Hz)
tbw=331800 % Transition band width (Hz)
attn=300 % Stopband attenuation (dB)

% Make filter:
d=10^(-attn/20)
[n, w, beta, ftype] = kaiserord ([fc-tbw/2, fc+tbw/2], [1, 0], [d d], fn*2);
b = fir1 (n, w, kaiser (n+1, beta), ftype, "noscale");

% Plot magnitude response:
[h f] = freqz(b,1,2^18); plot(f/pi*fn, 20*log10(abs(h))); grid; pause

audiofool · Jan 24, 2022

I have the following code working:
fs=705600 % Sample Rate (Hz)
fc=21000 % Corner freq. (Hz)
tbw=80 % Transition band width (Hz)
attn=300 % Stopband attenuation (dB)

% Make filter:
bands=[fc-tbw/2, fc+tbw/2]
mag=[1,0]
d=10^(-attn/20)
dev=[d,d]
[n, w, beta, ftype] = kaiserord(bands, mag, dev, fs);
b = fir1 (n, w, ftype, kaiser (n+1, beta), "noscale");

I had to delete the last coefficient; it seems to mess things up for some reason.
This creates about 180k coefficients. Does it look OK?
You mentioned something about a 2-stage filter; any example code you can share?
Thank you for the reference to camilladsp - looks promising, will test soon.
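
As a rough sanity check on the "about 180k coefficients" figure, the tap count can be estimated from Kaiser's empirical formula, numtaps ≈ (A − 7.95) / (2.285 · Δω) + 1, where A is the stopband attenuation in dB and Δω the transition width in radians per sample. This is a sketch only: the helper name `kaiser_numtaps` is made up, and Octave's `kaiserord` may differ from this estimate by a tap or two.

```python
import math


def kaiser_numtaps(atten_db, tbw_hz, fs_hz):
    """Kaiser's empirical estimate of the FIR length needed for a given
    stopband attenuation (dB) and transition bandwidth (Hz)."""
    delta_omega = 2.0 * math.pi * tbw_hz / fs_hz  # transition width in rad/sample
    return math.ceil((atten_db - 7.95) / (2.285 * delta_omega) + 1)


# Same parameters as the Octave snippet above.
n = kaiser_numtaps(atten_db=300, tbw_hz=80, fs_hz=705_600)
print(n)
```

The estimate lands just under 180,000 taps, so the filter length reported above looks consistent with the requested 300 dB and 80 Hz.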

soumya · Mar 1, 2022

Hey there,
So when you say apodizing, do you mean smoothing out the transition from 1 to 0?
In general, applying a window function to a convolution kernel is itself often referred to as apodization. The underlying notion is the same.
So, if you are using a Kaiser Window, first determine how much attenuation is possible for a given number of taps and transition bandwidth. IIRC, Octave doesn't have such a util. SciPy does.
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.kaiser_atten.html

Say you want to design a steep low-pass filter for a sampling rate of 176400 Hz (up-sampled 4x from 44100 Hz), with a transition bandwidth of 32 Hz and 64K taps (65536).

In SciPy you can do
nyquist_limit of target_fs = 176400 / 2 = 88200
transition_bandwidth / nyquist_limit = 32 / 88200 = 0.0003628117913832199546485

kaiser_atten(65536, 0.0003628117913832199546485)
gives 178.63319903986257 dB attenuation, which is well below even the 24-bit noise floor.

Next feed these values into kaiserord to compute the optimal beta value
numtaps, beta = kaiserord( 178.63319903986257, transition_bandwidth / (0.5 * target_fs) )

This returns a beta of 18.72663853419286

Now we have all it takes to create an FIR filter using the Kaiser Window as the apodizing function for the convolution kernel.

Happy learning!
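
The two numbers in the post above can be reproduced without SciPy from Kaiser's closed-form estimates. The sketch below re-implements them by hand; the function names mirror `scipy.signal.kaiser_atten` and `scipy.signal.kaiser_beta`, but these are standalone stand-ins written for illustration, not the library code.

```python
import math


def kaiser_atten(numtaps, width):
    """Attenuation (dB) reachable with `numtaps` taps, where `width` is the
    transition width expressed as a fraction of the Nyquist frequency."""
    return 2.285 * (numtaps - 1) * math.pi * width + 7.95


def kaiser_beta(atten_db):
    """Kaiser's piecewise formula for the window's beta parameter."""
    if atten_db > 50:
        return 0.1102 * (atten_db - 8.7)
    if atten_db > 21:
        return 0.5842 * (atten_db - 21) ** 0.4 + 0.07886 * (atten_db - 21)
    return 0.0


target_fs = 176_400
width = 32 / (target_fs / 2)          # 32 Hz transition, relative to Nyquist
atten = kaiser_atten(65_536, width)   # about 178.63 dB
beta = kaiser_beta(atten)             # about 18.73
print(atten, beta)
```

Both values agree with the figures quoted in the post when the width is 32 / 88200.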

soumya · Mar 2, 2022

I might want to move this in to a separate thread later... just apprising others of what I have been up to.

I have been optimizing the coefficients of different lengths and different up-sampling ratios ranging from 4Fs to 16Fs.
Last weekend I had a major breakthrough for 16Fs and sub 1 Million length coefficients.

While at it, I couldn't help appreciating the similarities in the philosophies of the Schiit Closed Form filter and Rob Watts' WTA filter. If you think of it, they are both trying to achieve the same thing - a steep filter that also has very good time domain performance.
Striking this balance is not easy.

From my subjective listening experiences , this is what I observed
I. The Kaiser Window is the best (conventional) window when it comes to excellent stop band rejection. With the right number of taps and beta value, it comes close to a theoretical brickwall LPF.
This is a 256K-tap Kaiser Window offering full 32-bit dynamic range for 4x up-sampling. Looks great in the frequency domain!

Now look at its performance in Time Domain

And therein lies the problem - only a tiny centre region actually contains sinc coefficients. This has several side-effects
1. The loudest sound will overpower other sounds
2. Time domain inaccuracy will translate to poor micro and macro dynamics, especially since a large number of the coefficients are close to 0
3. Because of poor temporal performance, the sound-stage also takes a hit in comparison to, say, listening on a NOS R2R DAC.
4. Natural music can sound thin - again refer to point 1.
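
To put a rough number on "only a tiny centre region", here is a small sketch that builds a Kaiser window by hand and measures how much of it stays above half amplitude. The beta of 18.73 is borrowed from the earlier post in this thread and the 4001-tap length is arbitrary (the shape depends on beta, not on the length); the helper names are made up, and this says nothing about how any of it sounds.

```python
import math


def bessel_i0(x):
    """Modified Bessel function of the first kind, order 0, via its power series."""
    total, term, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_window(numtaps, beta):
    """Symmetric Kaiser window of length `numtaps`."""
    denom = bessel_i0(beta)
    half = (numtaps - 1) / 2.0
    return [
        bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ((n - half) / half) ** 2))) / denom
        for n in range(numtaps)
    ]


win = kaiser_window(4001, 18.73)
edge_value = win[0]                                    # weight of the outermost tap
frac_above_half = sum(w > 0.5 for w in win) / len(win)
print(f"edge weight: {edge_value:.2e}")
print(f"fraction of taps weighted above 0.5: {frac_above_half:.2f}")
```

With a beta this high, the outer taps are scaled down by many orders of magnitude and only roughly the middle quarter of the window keeps more than half of the underlying sinc amplitude.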

II. The Rectangular Window has the best time domain performance, of course.
But the abrupt transition from 1 to 0 causes it to sound soft and, importantly, the transition bandwidth is huge. No brick-wall like steepness. Leakage (due to the Gibbs phenomenon) will be high

Time domain

Notice how the rectangular window never touches 0 within the window. This lack of smoothing to 0, or in other words the abrupt transition, does not make it a good candidate by itself for steep filters or interpolation in general, despite having the best temporal performance.

So the question that I posed some time back is: can we arrive at some form of trade-off, sacrificing a little at the beginning and end of the window in the time domain and some steepness in the frequency domain, to get an overall great-sounding filter?

It turns out we can.
So first, amongst the conventional windows, take a look at the Tukey Window.
https://en.wikipedia.org/wiki/Window_function#Tukey_window
It's a convolution of a rectangular window with a cosine lobe. While this in itself sounded better, it still lacks Kaiser's excellent frequency domain performance and steepness.

The ideal solution would be to have as many sinc coefficients as possible in the middle, smoothed by Kaiser coefficients at the beginning and end of the window to the desired attenuation levels.

And this is what I got so far
I am sacrificing a tiny bit of steepness and allowing just a hint more leakage. So little that the difference from Kaiser is barely perceptible.

But now look at its Time Domain performance

More to continue ....

audiofool · Mar 2, 2022

SciPy looks good, how do I get the FIR coefficients out of it? Can I avoid Octave and do everything in SciPy?

I like either the min phase short or linear long filters from hqplayer in apodizing form. I don't have a full understanding of apodizing since the term is being used to mean different things by different companies. What I am trying to do is what hqplayer does when defining apodizing, i.e. replace the original ringing with the new filter's ringing - can be any phase of filter. I originally thought simply reducing the bandwidth would accomplish this but I think it is more complicated.

audiofool · Mar 4, 2022

Started experimenting with SciPy. Goal is to create linear phase brick wall Chord type filter.
This seems to work but creates messy stuff on each end of the window with high tap numbers, maybe it's a precision issue? Probably better ways to do it in SciPy?
taps = signal.firwin(1025233, 20000, width=14, window='kaiser', pass_zero='lowpass', scale=True, fs=705600)
scipy.io.wavfile.write('coef.wav', 705600, taps.astype(np.float32))

I like your idea of multiple windows, I think WTA is using a combination of rectangular and kaiser. I haven't figured out how to combine them yet.
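
One possible way to combine the two, offered purely as a guess at the idea and not as soumya's or Chord's actual method: keep the window at 1.0 across the middle (the rectangular part) and use the two halves of a short Kaiser window as tapers at each end. The sketch below is plain Python so it runs anywhere; `flat_top_kaiser`, the taper length, and the beta value are all hypothetical choices.

```python
import math


def bessel_i0(x):
    """Modified Bessel function I0 via its power series."""
    total, term, k = 1.0, 1.0, 0
    while term > 1e-17 * total:
        k += 1
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def kaiser_window(numtaps, beta):
    """Symmetric Kaiser window of length `numtaps`."""
    denom = bessel_i0(beta)
    half = (numtaps - 1) / 2.0
    return [
        bessel_i0(beta * math.sqrt(max(0.0, 1.0 - ((n - half) / half) ** 2))) / denom
        for n in range(numtaps)
    ]


def flat_top_kaiser(numtaps, taper_len, beta):
    """Window that is 1.0 in the middle, with Kaiser-shaped tapers of
    `taper_len` samples at each end (hypothetical helper)."""
    if 2 * taper_len > numtaps:
        raise ValueError("tapers are longer than the window")
    k = kaiser_window(2 * taper_len, beta)
    rise, fall = k[:taper_len], k[taper_len:]
    return rise + [1.0] * (numtaps - 2 * taper_len) + fall


win = flat_top_kaiser(numtaps=1001, taper_len=100, beta=18.73)
print(win[0], win[500], win[-1])
```

The resulting list would then be multiplied element-wise with the sinc kernel. Shorter tapers keep more of the sinc intact but give up stopband rejection, so the taper length is the knob to experiment with.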

My other issue is workflow related - I use ffmpeg to convert formats to 64-bit float, then sox to integer upsample, and then apply the FIR. I think sox (and maybe also ffmpeg) will run into limits with large tap numbers, so I will use camilladsp. It would be nice if I could upsample using ffmpeg to avoid sox, but I haven't found a way to do it.

audiofool · Mar 5, 2022

Trying to duplicate what you suggested using Tukey and Kaiser, this seems to work but probably there are more precise ways to do this?

import numpy as np
import scipy.io.wavfile
from scipy import signal
from scipy.signal import firwin, kaiserord

sample_rate = 705600

# The Nyquist rate of the signal.
nyq_rate = sample_rate / 2.0

# The desired width of the transition from pass to stop,
# relative to the Nyquist rate. We'll design the filter
# with a 3 Hz transition width.
width = 3.0/nyq_rate

# The desired attenuation in the stop band, in dB.
ripple_db = 300.0

# Compute the order and Kaiser parameter for the FIR filter.
N, beta = kaiserord(ripple_db, width)

# The cutoff frequency of the filter.
cutoff_hz = 20000.0

# Use firwin with a Tukey window to create a lowpass FIR filter.
taps1 = firwin(N, cutoff_hz/nyq_rate, window='tukey')
# Use firwin with a Kaiser window to create a lowpass FIR filter.
taps2 = firwin(N, cutoff_hz/nyq_rate, window=('kaiser', beta))
# Convolve the two filters (this cascades them; it is not a product of the two windows)
taps3 = signal.fftconvolve(taps1, taps2, mode='same')

# output
scipy.io.wavfile.write('coef.wav', sample_rate, taps3.astype(np.float64))

ohshitgorillas · Oct 19, 2023

I am trying to mess around with a version of SoX that is modded to convert to DSD: https://github.com/mansr/sox with the goal of (offline) upsampling files to DSD using a sinc filter with a stupidly high number of taps.

I've followed these instructions:
Garns said: ↑
So prompted by this I finally figured how to successfully edit the SoX source to increase the maximum filter length. In fft4g.h you increase FFT4G_MAX_SIZE from 262144, e.g.
Code:
#define FFT4G_MAX_SIZE 16777216
and then in fft4g.c, in the bitrv2 and bitrv2conj functions, you need to increase the size of the ip[ ] static arrays. For every power of four you increase FFT4G_MAX_SIZE by you need to increase the size of the ip array by a power of two. So with my example above I increased FFT4G_MAX_SIZE by 4^3, so I increase ip[256] by 2^3 = 8 and end up with the following lines:
Code:
static void bitrv2(int n, int *ip0, double *a)
{
int j, j1, k, k1, l, m, m2, ip[2048];
and
Code:
static void bitrv2conj(int n, int *ip0, double *a)
{
  int j, j1, k, k1, l, m, m2, ip[2048];
Just checked this out and it is quite happy doing an 8 000 000 tap filter.
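
The sizing rule described above (one factor of two on ip[] per factor of four on FFT4G_MAX_SIZE) can be written out as a small helper. This is only a sketch of the rule as stated in the post: the stock values 262144 and 256 are the ones quoted there, and `ip_len_for` is a made-up name, not something in the SoX source.

```python
BASE_MAX_SIZE = 262_144  # stock FFT4G_MAX_SIZE, as quoted in the post
BASE_IP_LEN = 256        # stock size of the ip[] arrays, as quoted in the post


def ip_len_for(max_size):
    """Size of ip[] for a new FFT4G_MAX_SIZE, following the rule above."""
    ratio, remainder = divmod(max_size, BASE_MAX_SIZE)
    if remainder or ratio < 1:
        raise ValueError("max_size must be a multiple of the stock size")
    powers_of_four = 0
    while ratio > 1:
        if ratio % 4:
            raise ValueError("max_size / stock size must be a power of four")
        ratio //= 4
        powers_of_four += 1
    return BASE_IP_LEN << powers_of_four  # one doubling per power of four


print(ip_len_for(16_777_216))
```

For the 16777216 example this gives 2048, matching the ip[2048] declarations shown above.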

EDIT: here's a Github repository with the patched version.

I was able to edit fft4g.c, but fft4g.h doesn't contain the FFT4G_MAX_SIZE parameter:
Code:
/* This library is free software; you can redistribute it and/or modify it
 * under the terms of the GNU Lesser General Public License as published by
 * the Free Software Foundation; either version 2.1 of the License, or (at
 * your option) any later version.
 *
 * This library is distributed in the hope that it will be useful, but
 * WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU Lesser
 * General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public License
 * along with this library; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA
 */

void lsx_cdft(int, int, double *, int *, double *);
void lsx_rdft(int, int, double *, int *, double *);
void lsx_ddct(int, int, double *, int *, double *);
void lsx_ddst(int, int, double *, int *, double *);
void lsx_dfct(int, double *, double *, int *, double *);
void lsx_dfst(int, double *, double *, int *, double *);

void lsx_cdft_f(int, int, float *, int *, float *);
void lsx_rdft_f(int, int, float *, int *, float *);
void lsx_ddct_f(int, int, float *, int *, float *);
void lsx_ddst_f(int, int, float *, int *, float *);
void lsx_dfct_f(int, float *, float *, int *, float *);
void lsx_dfst_f(int, float *, float *, int *, float *);

#define dft_br_len(l) (2 + (1 << (int)(log(l / 2 + .5) / log(2.)) / 2))
#define dft_sc_len(l) (l / 2)

/* Over-allocate h by 2 to use these macros */
#define LSX_PACK(h, n)   h[1] = h[n]
#define LSX_UNPACK(h, n) h[n] = h[1], h[n + 1] = h[1] = 0;
Unfortunately, the modifications to fft4g.c aren't enough as running the command 'sox input.flac -r 2822400 -b 1 output.dsf sinc -22050 -n 1000000 rate -u 2822400' tells me that the number of taps must be between 11 and 32767.

Any ideas how I can hack the DSD-modded SoX to experiment with DSD upsampling? I unfortunately only speak Matlab, so this is way out of my wheelhouse.

fastfwd · Oct 19, 2023

ohshitgorillas said: ↑

'sox input.flac -r 2822400 -b 1 output.dsf sinc -22050 -n 1000000 rate -u 2822400' tells me that the number of taps must be between 11 and 32767.
Click to expand...

Search all the source code for the text of that error message, excluding the 11 and 32767 (i.e., search for "The number of taps must be between" or whatever the exact error message is). If you're lucky, the statement you find will be something like:
Code:
printf("The number of taps must be between %u and %u.\n", MINTAPS, MAXTAPS);
And then you can search the code for MAXTAPS (or whatever the actual name is). If you're lucky enough to find "#define MAXTAPS 32767" or "MAXTAPS = 32767" -- and especially if you find only one line that looks like that -- change the 32767 to 50000 and see whether you can successfully use 50000 taps.

If 50000 works, try 66000. Then if that works, go ahead and try 1000000 or 16777216 or whatever.

ohshitgorillas · Oct 19, 2023

Thanks, it wasn't that easy but 'grep -r "taps"' allowed me to find it in src/sinc.c:
Code:
GETOPT_NUMERIC(optstate, 'n', num_taps[1], 11, 1000000)
Now I just need to learn how to actually use the SoX cli... somehow the first round of files that I made came out with a 352.8 kHz sampling rate at 1 bit... and they did not sound great.
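
It may only be a coincidence, but 352.8 kHz is exactly the DSD64 bit rate divided by eight, which is the figure you get when 1-bit samples are counted in bytes. Whether that is what happened with these files is an assumption and has not been checked against this SoX build; the arithmetic itself is just:

```python
CD_RATE_HZ = 44_100


def dsd_bit_rate(multiple):
    """1-bit sample rate of DSD<multiple>, e.g. 64 for DSD64."""
    return multiple * CD_RATE_HZ


dsd64 = dsd_bit_rate(64)
print(dsd64)       # 1-bit samples per second for DSD64
print(dsd64 // 8)  # the same stream counted in bytes per second
```

So a file reported as "352.8 kHz, 1 bit" is worth checking with another tool before concluding that the upsampling itself went wrong.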

Edit: After some further digging around, I've discovered that this sox implementation has its own filters, although any documentation I can find (in the form of forum posts from the author) is out of date.

According to the man page,
Code:
sdm [-f filter] [-t order] [-n num] [-l latency]
              Apply a 1-bit sigma-delta modulator producing DSD output.  The input should be previously upsampled, e.g. with the rate effect, to a high rate, 2.8224MHz for DSD64.  The -f option selects the noise-shaping filter from the following list where the number indicates the order of the filter:
                 clans-4      sdm-4
                 clans-5      sdm-5
                 clans-6      sdm-6
                 clans-7      sdm-7
                 clans-8      sdm-8

              The noise filter may be combined with a partial trellis/viterbi search by supplying the following options:

              -t     Trellis order, max 32.

              -n     Number of paths to consider, max 32.

              -l     Output latency, max 2048.

              The result of using these parameters is hard to predict and can include high noise levels or instability.  Caution is advised.
however,
Code:
sox input.flac -b 1 -r 2822400 output.dsf sdm -f clans-8
yields the error
Code:
/home/adam/sox/src/.libs/sox FAIL sdm: invalid filter name 'clans-8'
. Fack.
