HQPlayer vs SOX

Discussion in 'Computer Audiophile: Software, Configs, Tools' started by Woland, Aug 7, 2021.

  1. Woland

    Woland Friend

    Pyrate
    Joined:
    Jan 13, 2021
    Likes Received:
    1,323
    Trophy Points:
    93
    Location:
    a friendly land
    HQPlayer does amazing things to music when upsampling PCM.

    SoX does not. At least not SoX as it is implemented in Volumio, Audirvana and everywhere else I've tried upsampling. Your experience may vary.

    I'm starting this thread to document my attempts to get SoX to deliver better results.

    I'm starting with the SoX documentation, and my observation from using HQPlayer that linear-phase sinc filters seem to give the best results into my Gungnir Multibit.

    I'll try to find a free FLAC which benefits greatly from upsampling, and upload upscaled variations with different settings for others to try.


    Please make suggestions if you're further along this path.
     
    Last edited: Aug 7, 2021
  2. fastfwd

    fastfwd Friend

    Pyrate
    Joined:
    Aug 29, 2019
    Likes Received:
    1,010
    Trophy Points:
    93
    Location:
    Silicon Valley
  3. haywood

    haywood Friend

    Pyrate
    Joined:
    Oct 22, 2015
    Likes Received:
    764
    Trophy Points:
    93
  4. Woland

    Thanks @fastfwd & @haywood for excellent pointers.

    My thinking is to find a free FLAC where HQPlayer makes a substantial improvement. Then I'll try and replicate or beat HQPlayer's performance using a bunch of different SOX settings.

    After downloading a few hours of free music, the best candidate track I have so far is this one: Paisley Dark Edits Box 6 by Jezebell. Alternative suggestions are welcome, though.
     
  5. Garns

    Garns Friend

    Pyrate
    Joined:
    Jul 9, 2016
    Likes Received:
    2,484
    Trophy Points:
    93
    Location:
    Sydney, AUS
    Thanks @Woland for this interesting discussion. I just got turned on to upsampling. I'd tried it before a few years ago using SSRC and it seemed to make the sound worse so I thought no more of it. With the talk of HQPlayer I thought I'd give it another go, and the easiest way seemed to be using SOX.

    I don't stream, so it's incredibly easy to upsample entire folders of music offline. With the Yggdrasil A2 I can upsample to at most 176.4kHz (4x). The difference it makes is really noticeable:
    • Top end is a lot more precise, hi-hats and ride cymbals have clearly delineated attacks rather than a fuzzy halo
    • Less "grain" and a more liquid, organic sound
    • Less congestion in the mids
    • Less body and density - some people might not like this
    I can't comment on bass as my system doesn't reproduce bass at the moment (waiting on subs). But it's like a 5-10% increase in quality and makes the Yggdrasil sound like a more expensive DAC - not just removing something that shouldn't be there, but actually changing the sound for the better. My SOX recipe is:
    Code:
    sox in.flac -b 24 out.flac rate -u -b 99.7 176400
    -u is the highest quality preset. The first -b (-b 24) gives 24-bit output, so I don't have to bother thinking about dither. The second (-b 99.7, an option of rate) sets the brickwall cutoff frequency, so the higher it is, the closer the filter is to an ideal sinc function. I can't tell if this makes much difference, but it's possibly 2% more definition, though it's at the nervosa threshold.

    With debug output turned on, this seems to produce a 20,000-tap filter, which is about the same length as the Yggdrasil's built-in one, but I guess this one can be closer to linear phase. In any case it sounds substantially better.

    I use this script to do a whole folder at once:

    Code:
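    #!/bin/bash
    # Usage: pass the music folder as the first argument.
    
    cd "$1" || exit 1            # stop if the folder can't be entered
    mkdir -p "Hires"             # -p: fine if Hires already exists
    for f in *
    do
        [ -f "$f" ] || continue  # skip directories such as Hires itself
        sox "$f" -b 24 "Hires/$f" rate -u -b 99.7 176400
    done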
    
     
  6. Woland

    If you do ever want to play with dither: in HQPlayer I use 19 bits for a Gungnir Multibit, and if I recall correctly, 21 is appropriate for a Yggdrasil. That targets the dither to the bit depth of the DAC chips rather than the 24-bit interface.
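
    For context on those bit-depth targets, here is a rough sketch using the textbook rule of thumb for an ideal quantiser (SNR of about 6.02·N + 1.76 dB for a full-scale sine). The function name is made up for illustration, and real DACs will not reach these figures:

```shell
# Ideal quantisation SNR of an N-bit full-scale sine: 6.02*N + 1.76 dB.
# ideal_snr_db is a hypothetical helper, not part of SoX or HQPlayer.
ideal_snr_db() {
    awk -v bits="$1" 'BEGIN { printf "%.2f\n", 6.02 * bits + 1.76 }'
}

# Compare the bit depths mentioned in this thread.
for bits in 16 19 21 24; do
    echo "$bits bits: $(ideal_snr_db "$bits") dB"
done
```

    Each extra bit is worth about 6 dB, so dithering to 19 or 21 bits instead of 24 places the dither noise roughly 30 dB or 18 dB higher than a 24-bit target would.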
     
  7. Garns

    OK, I couldn't resist modding sox to allow for longer filters. In src/rate.c I changed this line
    Code:
    GETOPT_LOCAL_NUMERIC(optstate, 'b', bw_3dB_pc, 74, 99.7)
    to replace 99.7 by 99.98, and in fft4g.h I changed
    Code:
    #define FFT4G_MAX_SIZE 262144
    to replace 262144 by 524288.

    Now after a recompile I can do
    Code:
    sox in.flac -b 24 out.flac rate -u -b 99.98 176400
    and this gives a 300,000-tap upsampling filter. Much as it pains me to say it, this does add an extra 2% to the sound.
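
    A rough way to see why the filter grows so much, assuming 44.1 kHz source material and reading rate's -b value as the -3 dB point in percent of the source Nyquist frequency (the bw_3dB_pc name in the patched line suggests this). The helper name is hypothetical:

```shell
# Gap in Hz between the -3 dB point and Nyquist for a given rate -b percentage.
# Assumes -b is a percentage of the source Nyquist frequency (fs / 2).
transition_hz() {
    awk -v fs="$1" -v pct="$2" \
        'BEGIN { printf "%.2f\n", (fs / 2) * (1 - pct / 100) }'
}

transition_hz 44100 99.7    # stock maximum   -> 66.15
transition_hz 44100 99.98   # patched maximum -> 4.41
```

    The gap shrinks by a factor of 15 (66.15 / 4.41), which is in line with the jump from roughly 20,000 to 300,000 taps if filter length scales inversely with transition width.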

    In theory you could increase these numbers further, but in practice sox craps its pants, I'm guessing because of an integer overflow which I'm not inclined to hunt down.
     
  8. GoodEnoughGear

    GoodEnoughGear Evil Dr. Shultz

    Pyrate
    Joined:
    Oct 25, 2015
    Likes Received:
    3,070
    Trophy Points:
    113
    Location:
    Cape Town, South Africa
    I'm not sure where exactly this should go - maybe we need a HQPlayer thread.

    Anyway, I've been looking at how one can use JRiver as a front-end for HQPlayer and there are some folks using batch files and such janky solutions. All of a sudden it occurred to me I might be able to use a virtual audio cable to pipe out of JRiver and into HQPlayer, and sure enough it works.

    I'm using VB Cable: https://vb-audio.com/Cable/, but I know there are others.

    It's pretty simple - I have a zone in JRMC which uses WASAPI to output to the input of the Virtual Cable. I can do any DSP I want in JRMC of course, and I do run a few VST plugins for EQ and such.

    In HQPlayer I simply configure the input in Settings to use WASAPI, listening to the output of the Virtual Cable. Using adaptive output rate, it seems to adapt to whatever rate I'm feeding it, even though I specify the input as audio: default/44100/2.

    Voila, I have a JRMC front-end for HQPlayer.

    I just got it working, so the jury's out on whether there are any sonic nasties, but I won't be able to do much critical listening until tonight. In theory, though, it should be fine.

    Note, this should work for most any player, it's not particular to JRiver.
     
  9. Garns

    Had a go at adding dither to my sox upsampling recipe:
    Code:
    sox in.flac -b 24 out.flac rate -u -b 99.98 176400 dither -S -p 21
    This dithers to 21 bits as per Yggdrasil spec. This has both negative and positive effects. Soundstage appears a bit deeper, space around the notes is a bit better defined. However the top octave is kind of screwed up and nasty sounding. I think the dithering algos in sox aren't that great. (There are no noise-shaping algos, or at least, none that run at 176.4k).

    However, it seems like dithering back to 16 bits
    Code:
    sox in.flac -b 16 out.flac rate -u -b 99.98 176400 dither -S
    retains the beneficial effects with less treble fuckery. Makes me want to try some other dithering algos though.
     
  10. Woland

    It's awesome seeing original content, @Garns
     
  11. soumya

    soumya Acquaintance

    Joined:
    Jun 24, 2018
    Likes Received:
    42
    Trophy Points:
    18
    Location:
    Mordor, Middle Earth
    Just adding my $0.02 to this thread.
    If you really want to get great results out of SoX, either use the sinc effect or, better, write your own custom filter and use its coefficients with the fir effect.
    The SoX fir effect supports up to a maximum of 256K taps. I would be happy to provide a custom coefficients file. Be advised, though, that if the DAC is a delta-sigma type, 4x up-sampling to 176.4 / 192 kHz is not enough to completely defeat the internal digital interpolation.
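
    As a minimal sketch of the "write your own filter" route, the function below generates a Blackman-windowed sinc low-pass and prints one coefficient per line, a plain-text layout the fir effect can read. This is a generic textbook design, not the method used for the coefficients offered above; the function name, cutoff and tap count are illustrative assumptions.

```shell
# Print a Blackman-windowed sinc low-pass, one coefficient per line.
# Arguments: sample rate (Hz), cutoff (Hz), tap count (odd, at least 3).
# make_lowpass is a hypothetical helper name.
make_lowpass() {
    awk -v fs="$1" -v fc="$2" -v taps="$3" 'BEGIN {
        pi = atan2(0, -1)
        mid = (taps - 1) / 2
        for (i = 0; i < taps; i++) {
            # Ideal low-pass impulse response (sinc), centred on the middle tap
            x = 2 * pi * (fc / fs) * (i - mid)
            sinc = (x == 0) ? 1 : sin(x) / x
            # Blackman window tapers the truncated sinc to reduce ripple
            a = 2 * pi * i / (taps - 1)
            win = 0.42 - 0.5 * cos(a) + 0.08 * cos(2 * a)
            h[i] = sinc * win
            sum += h[i]
        }
        # Normalise so the coefficients sum to 1 (unity gain at DC)
        for (i = 0; i < taps; i++) printf "%.12e\n", h[i] / sum
    }'
}

# Small demo: a 15-tap filter for 176.4 kHz with a 21 kHz cutoff.
make_lowpass 176400 21000 15

# Hypothetical use with a longer filter (file name is an assumption):
#   make_lowpass 176400 21000 4095 > lowpass_176k.txt
#   sox in.flac -b 24 out.flac upsample 4 fir lowpass_176k.txt vol 4
```

    The vol 4 in the commented usage restores the level lost to 4x zero-stuffing, with no headroom left; a smaller gain is safer against clipping.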
     
  12. Woland

    I'd love to replicate HQPlayer's high-tap sinc filter in SoX. Any suggestions about how to do that?
     
  13. soumya

    Do you want to create up-sampled FLACs or output straight to a USB DAC on a Linux system? In either case, what sampling rate do you need?
    It takes me a while to compute the parameters and generate optimal coefficients.

    One more thing: the arm and arm64 builds of SoX for some odd reason have the fir filter capped at 128K taps instead of the 256K taps in x86-64 builds. Someone needs to notify the SoX maintainers about this bug. Just something to keep in mind.
     
  14. Woland

    My main interest is in 4x (to 176 or 192kHz) for AES/SPDIF connection to DAC.
     
  15. Garns

    So, prompted by this, I finally figured out how to successfully edit the SoX source to increase the maximum filter length. In fft4g.h you increase FFT4G_MAX_SIZE from 262144, e.g.
    Code:
    #define FFT4G_MAX_SIZE 16777216
    and then in fft4g.c, in the bitrv2 and bitrv2conj functions, you need to increase the size of the ip[ ] static arrays. For every factor of four by which you increase FFT4G_MAX_SIZE, you need to double the size of the ip array. In my example above I increased FFT4G_MAX_SIZE by a factor of 4^3 = 64 (262144 to 16777216), so I increase ip[256] by a factor of 2^3 = 8 and end up with the following lines:
    Code:
    static void bitrv2(int n, int *ip0, double *a)
    {
      int j, j1, k, k1, l, m, m2, ip[2048];
    
    and
    Code:
    static void bitrv2conj(int n, int *ip0, double *a)
    {
      int j, j1, k, k1, l, m, m2, ip[2048];
    
    Just checked this out and it is quite happy doing an 8 000 000 tap filter.
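
    To pick the ip[] size for other values of FFT4G_MAX_SIZE, the rule above (one doubling of ip[] per factor of four, starting from the stock 262144 and ip[256]) can be sketched as a small helper. The function name is hypothetical, and sizes that are not an exact power of four above the stock value are rounded up:

```shell
# Size of the ip[] scratch array per the rule in this post:
# every factor of 4 in FFT4G_MAX_SIZE doubles ip[] (stock: 262144 -> 256).
ip_size_for() {
    local max_size=$1 base_size=262144 ip=256
    while [ "$base_size" -lt "$max_size" ]; do
        base_size=$((base_size * 4))
        ip=$((ip * 2))
    done
    echo "$ip"
}

ip_size_for 16777216   # the value used above -> 2048
```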

    EDIT: here's a Github repository with the patched version.
     
    Last edited: Nov 2, 2021
  16. soumya

    Cool, will get a build going later tonight
     
  17. soumya

    Fair enough, just be advised that depending on your DAC the internal 8x interpolation might still be in effect.
    176.4K it is then!
     
  18. soumya

    Here is the coefficients file
    https://drive.google.com/file/d/179r9kMSCDeHXAOmUuzbEtuN6OLklgake/view?usp=sharing

    Code:
     flac -dc <input.flac> | sox --ignore-length -V6 -S --input-buffer 65536 --buffer 65536  -t wav - -r 176400 -b 24 -t wav - upsample 4 fir <path to pcm_176K_256K_taps.txt> vol 2.8317831375365516432088598 | flac -8 -f -o output.flac
    I have given 3 dB of headroom. You may change the vol gain to suit your needs.
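
    The vol figure in that command is consistent with 4 × 10^(-3/20): a gain of 4 to make up for the level lost when upsample 4 inserts zeros, backed off by 3 dB. A small sketch for recomputing it with a different headroom, assuming the coefficient file has unity gain at DC (the post does not say so) and using a made-up function name:

```shell
# Linear gain for the vol effect after zero-stuffing by `factor`,
# leaving `headroom_db` dB of headroom.
# Assumes the FIR coefficients sum to 1; upsample_gain is a hypothetical helper.
upsample_gain() {
    awk -v factor="$1" -v headroom_db="$2" \
        'BEGIN { printf "%.6f\n", factor * 10 ^ (-headroom_db / 20) }'
}

upsample_gain 4 3   # 4x upsampling, 3 dB headroom -> 2.831783
upsample_gain 4 6   # same, with 6 dB headroom
```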
     
    Last edited: Nov 2, 2021
  19. Woland

    Thank you! I'm looking forward to giving this some time.

    Can you explain a little more about how you derive the coefficients?
     
  20. soumya

    Yeah, we can have a discussion on DSP for designing low-pass filters in a separate thread.
    It makes sense, at least for the devs contributing to projects, if not for general readers of such threads, to have some degree of familiarity with the concepts.
    For instance, I read somewhere above in this thread that folks compared the 20K taps of the Schiit Yggdrasil to the taps generated by SoX. That comparison is wrong, since the former is an optimized Parks–McClellan based closed-form filter, whereas SoX rate uses asynchronous window-based resampling, which may or may not be optimal.
    I will mention this again: the number of taps alone is not an indication of how good a filter sounds, nor is it the starting parameter for designing a filter.
    We could start with how steep a roll-off we want, how much ripple is acceptable in the pass band, how much attenuation is needed in the stop band, and what algorithm / window function to use. From there we arrive at the filter length needed to design such a filter.
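
    As one concrete illustration of going from a specification to a length, here is a sketch using Kaiser's empirical estimate for Kaiser-window designs, N ≈ (A − 7.95) / (2.285·Δω) + 1, with A the stop-band attenuation in dB and Δω the transition width in radians per sample. It is only a ballpark, other design methods (Parks–McClellan included) have their own estimates, and the function name and example figures are assumptions:

```shell
# Estimate FIR length from sample rate (Hz), transition width (Hz)
# and stop-band attenuation (dB), using the Kaiser-window rule of thumb.
# kaiser_taps is a hypothetical helper name.
kaiser_taps() {
    awk -v fs="$1" -v width_hz="$2" -v atten_db="$3" 'BEGIN {
        pi = atan2(0, -1)
        dw = 2 * pi * width_hz / fs                  # transition width, rad/sample
        n = (atten_db - 7.95) / (2.285 * dw) + 1
        taps = (n == int(n)) ? n : int(n) + 1        # round up to a whole tap
        printf "%d\n", taps
    }'
}

# 176.4 kHz output, pass band to 20 kHz, stop band from 22.05 kHz, 120 dB down
kaiser_taps 176400 2050 120
# Same attenuation with only 66.15 Hz of transition (0.3% of 22.05 kHz)
kaiser_taps 176400 66.15 120
```

    The second call needs a far longer filter than the first for the same attenuation, which is the point: the length falls out of the specification rather than being chosen first.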

    Cheers!
     
    Last edited: Nov 2, 2021
