The Unified ABX and DBT Church

Discussion in 'Blind Testing and Psychoacoustics' started by MRC01, Dec 12, 2018.

  1. MRC01

    MRC01 Rando

    Joined:
    Dec 10, 2018
    Likes Received:
    15
    Dislikes Received:
    2
    Trophy Points:
    3
    Location:
    Earth
    You posted earlier showing a cymbal strike that should have lots of HF content, then revealed that there wasn't really much HF there when you look at the waveform. I took your point to be that very little natural music approaches the bandwidth limit. That's generally true but there are exceptions.

    Psychoacoustically, we perceive the frequency and time domains differently, so one can't take generalizations about pure tones and apply them to transient response, or vice versa. Why is this relevant? Because it's counterintuitive: FFT a transient, remove the frequencies you can't hear as pure tones, reconstruct the signal without them, and you can hear the difference. That's crazy but true under certain conditions.
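
    For anyone who wants to try the FFT experiment on their own machine, here is a rough sketch of the processing step only. It assumes NumPy is installed; the 16 kHz cutoff, the 4096-sample length, and the single-sample click are illustrative choices, not values from any post in this thread. It shows how the waveform changes, and says nothing by itself about whether the change is audible.

```python
import numpy as np

FS = 44_100          # Redbook sample rate, Hz
N = 4096             # analysis length in samples (arbitrary choice)
CUTOFF_HZ = 16_000   # hypothetical "can't hear it as a pure tone" limit

def brickwall_lowpass(x, fs, cutoff_hz):
    """Zero every FFT bin above cutoff_hz, then transform back."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[freqs > cutoff_hz] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

# A single-sample click: an extreme transient with a flat spectrum up to Nyquist.
click = np.zeros(N)
click[N // 2] = 1.0

filtered = brickwall_lowpass(click, FS, CUTOFF_HZ)
residual = click - filtered

# How much the peak shrinks, how much energy was removed,
# and how much ringing now appears before the click.
peak_ratio = np.max(np.abs(filtered)) / np.max(np.abs(click))
energy_removed = np.sum(residual**2) / np.sum(click**2)
pre_ringing = np.max(np.abs(filtered[: N // 2]))

print(f"peak after filtering: {peak_ratio:.3f} of original")
print(f"fraction of energy removed: {energy_removed:.3f}")
print(f"largest sample before the click: {pre_ringing:.3f}")
```

    Swapping the click for a real castanet or cymbal excerpt and writing both versions to WAV files would give a pair suitable for a blind comparison.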

    Finally, all of this is relevant to discussion about differences in reconstruction filters. Hearing those differences on actual music is difficult, at least in my experience of A/B/X testing over the years. And music is what counts: if you can only hear differences with test signals, what does it matter? I don't listen to test signals for entertainment or artistic enlightenment. To make those differences easier to hear, it's useful to have musical sources with lots of HF energy and sharp transients.

    BTW, castanets were invented long before audiophiles or DSP existed, so they must have had some other purpose...
     
    Last edited: Dec 12, 2018
    Priidik likes this.
  2. purr1n

    purr1n Finding his inner redneck

    Staff Member Friend BWC
    Joined:
    Sep 24, 2015
    Likes Received:
    45,295
    Dislikes Received:
    70
    Trophy Points:
    113
    Location:
    Antarctica
    Umm. Don't you hear a difference among the filters with regular non-castanet music? Or are your ears or gear really so craptastic that you can't, and thus require the crutch of castanets?

    Also, your energy plot doesn't really show any more content past 18kHz than say a random track from Paul Simon's Graceland. Where is this massive high frequency energy near Nyquist which you speak of? See here for example: https://www.superbestaudiofriends.o...ul-simon-diamonds-on-the-music-analysis.4050/

    There are plug-ins for Foobar on PC and Audirvana (native) on Mac that upsample like crazy and allow you to make your own filters. I will say that I dislike minimum phase, and filters that roll off too early or too slowly. And that I prefer linear phase filters with sharp roll off, which @ultrabike has already shown via his math program to be practically spot on.
     
    Last edited: Dec 12, 2018
    ultrabike likes this.
  3. purr1n

    purr1n Finding his inner redneck

    No, it's not relevant because you are making some huge assumptions with your EQ test and about the nature of high frequency hearing. This is not an equivalent process to the application of a good reconstruction filter to Redbook content.

    I can also offer you similar anecdotes about how I can't hear past a pure 12kHz tone, but can somehow detect or sense the loss of air and shimmer if a -5 dB shelf filter is applied past 15kHz. And this has nothing to do with transients.
     
    Last edited: Dec 12, 2018
  4. purr1n

    purr1n Finding his inner redneck

    I'm heading into my TARDIS now and murdering the guy who invented castanets.
     
    SoupRKnowva, Kunlun, MRC01 and 3 others like this.
  5. Priidik

    Priidik Friend

    Friend BWC
    Joined:
    Sep 27, 2015
    Likes Received:
    1,322
    Dislikes Received:
    2
    Trophy Points:
    93
    Location:
    Estonia
    I can't see how this test is suggesting ear nonlinearity. It's well documented that the ear is nonlinear, but that shows in imaging and soundstage more than in brightness or the lack of it.
    I can reliably distinguish tones up to about 16..17kHz, but loud 18..19kHz tones still annoy the hell out of me.
    It tells us more about the soft processing tissue inside our skull than about the ear as an apparatus.
     
    ultrabike likes this.
  6. MRC01

    MRC01 Rando

    Anyone who has done DBTs has experienced failures and successes, and knows it can range from easy to impossible depending on what differences are being tested and whether the source material highlights those differences.
    Castanets would be terrible on craptastic gear because it either won't reproduce the critical frequencies, or would do so with so much distortion it would mask whatever more subtle differences you were trying to hear.

    I posted the above in response to the cymbal crashes you posted earlier. It does show more HF content than that. I have the original Graceland before they ruined it in remastering; it is an unusually good recording.

    True. Listeners improve with training and experience. Their ears aren't getting better; their brains are learning to get more out of what their ears were picking up all along.
     
  7. purr1n

    purr1n Finding his inner redneck

    I haven't done DBTs because they are extremely difficult to conduct. And even then, we cannot be sure if there is telepathic beaming from one person to the next, thus requiring triple or perhaps quadruple anal retentive procedures, or Magneto's helmet*

    I have done many blind tests and have found that with practice, it becomes very easy to reliably discern what used to be difficult to consistently discern. It's all about practice and exposure. We are going to suck, or at least be inconsistent, at a tennis serve without a ton of practice. And even then, some people are always going to suck at it.

    It seems the crux of your arguments is a lot more straightforward: 44.1kHz sampling may be insufficient to capture what we can hear?

    I don't know. You are all over the place. It first started with linear phase filters and moved to the "inadequacy" of reconstruction during DA, and then on to AD, and then on to how you can hear an 18kHz cut of Q=2, which seems normal to me, if you are younger than 40.

    --

    *Blurting out the use of DBTs is asinine. It's even more asinine than shouting the world is going to end because a network scanner indicated that a buffer overflow vulnerability exists on the server.

    Sometimes we need to step back and understand what the need for the control was in the first place. What is the risk, the negative scenario that we are trying to protect against, when we suggest the use of DBT?

    Once we know what we are trying to protect against, we realize there are much simpler procedures that can do the same. This is especially true in regards to blinding tests in audio.

    This is the failure of our higher education system. It teaches kids good concepts and makes them smart, as in knowledgeable. But it doesn't teach kids how to think, to understand why we even do things in the first place. I just learned about DBTs. That sounds awesome! Did you do a DBT? You didn't! Shame on you! All your results are bogus!

    This is exactly why we hear nonsense like AP-555, GRAS, IEC coupler, "Olive-Neutral" Target, Amir-bits, Stradivarius, Antarctic exploration, from seemingly smart people. All this is academic e-peening or reliance on external credentials rather than people actually trying to figure out the why. So much easier to be spoon-fed, or in many cases make up shit using others' work as the foundation.

    Jude has a GRAS. WTF has he done? Next to nothing, other than getting Massdrop to agree to show his measurements on its 'drop pages.
     
    Last edited: Dec 13, 2018
    9suns, Thad E Ginathom and ultrabike like this.
  8. MRC01

    MRC01 Rando

    My point was not what frequency cut anyone can hear, but rather that with certain kinds of sounds, we can detect the removal of frequencies we can't hear as pure tones. Our perception of frequency & time is not as symmetric as those domains are mathematically. This should caution anyone from taking a dogmatic engineering approach to audio. Good engineering leads to good sound, but we also need to listen and trust our perceptions.

    The phrase "trust our perceptions" leads to DBT...

    Exactly! The need for the control is twofold: (1) human perception is variable, even when it is sensitive, and (2) various forms of expectation bias. If you reliably detect differences in a DBT, then they definitely exist. However, the converse and inverse of that statement aren't necessarily true. In that sense, DBT is a high precision, low recall test. It never gives false positives, but it may give false negatives. That's what leads to endless DBT debates. To avoid starting a pointless new one here, I've only answered your question, not offered an opinion.

    Yep, that's it. For practical purposes, 44/16 is transparent for humans. It's not perfect, but its limitations are masked by other distortions from room placement, microphones, and other processing that are usually orders of magnitude higher. However, the intellectually curious wonder which of its limitations might be audible to humans, and how to describe or measure that, whether in math/physics or real-world tests.
     
    AudioNut likes this.
  9. purr1n

    purr1n Finding his inner redneck

    You are not offering an opinion, but you are simply regurgitating and not thinking. What is expectation bias and how does it come into play? There are ways to remove expectation bias with dedicated hardware ABX boxes. There are manual ways to remove it too. I've done several blind tests with an input switcher where my wife or kids come into the room and move the RCA cables randomly, or not at all, on the switcher's inputs. The wires in the back were hidden from me and I left the room when the cables were being switched. This does not constitute a double-blind test, but it sure as hell eliminates expectation bias.

    DBT is used in clinical trials because of the very subjective nature of say the effects of weed or Vicodin. In cases like this, it's totally reasonable to believe that single-blind tests may influence the results. For example, a tester against the use of weed may interpret the answers relating to the negative aspects of its use more negatively on a subjective scale. With audio, we can narrow things down to it is A or B - binary choice. Is it a Modi or D70? Is it a linear phase or minimum phase filter?

    Although the test I've been using isn't strictly DBT, it can be considered DBT in a sense: The switch is the researcher. My wife or kids making the wiring changes at the switch inputs is the neutral third party keeping track of which is which.
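
    The bookkeeping for this kind of switcher test can be sketched in a few lines. This is only an illustration under assumptions: the function names `make_schedule` and `score` are made up for this example, and the procedure (helper follows a random schedule, listener writes down guesses, sheets are compared afterwards) is one plausible way to run it, not a description of the exact routine used above.

```python
import random

def make_schedule(trials, rng=None):
    """Random A/B assignment per trial, for the helper's eyes only."""
    rng = rng or random.SystemRandom()  # OS entropy, so nobody can predict it
    return [rng.choice("AB") for _ in range(trials)]

def score(schedule, answers):
    """Count how many of the listener's guesses match the hidden schedule."""
    if len(schedule) != len(answers):
        raise ValueError("schedule and answers must have the same length")
    return sum(s == a for s, a in zip(schedule, answers))

if __name__ == "__main__":
    hidden = make_schedule(10)
    # The listener fills this in during the session without seeing `hidden`.
    guesses = list("ABABABABAB")  # placeholder answers for the demo
    print(f"{score(hidden, guesses)} of {len(hidden)} correct")
```

    The helper keeps the schedule out of sight until every guess is written down; only then are the two lists compared.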
     
    Last edited: Dec 13, 2018
  10. MRC01

    MRC01 Rando

    Ah that's what you mean. Like the word "balanced", the term "DBT" tends to be used ambiguously in audio to mean any kind of blind test. That was my intention. In many cases I don't think the X in ABX is strictly necessary. I've done a mix of blind testing either way. It's still a high precision low recall test so you can get false negatives.

    The only real issue I've seen over the years is that people often don't actually do the math to compute the test confidence, or realize how many trials that takes. If your goal is at least 95%, that's 5 in a row all correct (96.9%). If you get one wrong you need at least 8 trials: 7 of 8 is 96.5%. Or they only count correct tests, ignoring failures, or start the count over after each failure, which biases the results.
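
    The math in question is a one-sided binomial tail. A minimal sketch, assuming each trial is an independent 50/50 guess under the "no audible difference" hypothesis; the function names are made up for illustration:

```python
from math import comb

def guess_probability(correct, trials, p_chance=0.5):
    """P(at least `correct` hits in `trials`) for a listener who is purely guessing."""
    return sum(
        comb(trials, k) * p_chance**k * (1 - p_chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )

def confidence(correct, trials):
    """1 minus the chance that pure guessing would score this well or better."""
    return 1.0 - guess_probability(correct, trials)

def min_correct(trials, target=0.95):
    """Smallest score out of `trials` that reaches the target, or None if impossible."""
    for correct in range(trials + 1):
        if confidence(correct, trials) >= target:
            return correct
    return None

for correct, trials in [(5, 5), (7, 8)]:
    print(f"{correct}/{trials}: {100 * confidence(correct, trials):.1f}%")
# prints 5/5: 96.9% and 7/8: 96.5%
```

    Note that `min_correct(4)` returns `None`: four trials can never reach 95%, even with a perfect score. The calculation also only holds if the number of trials is fixed before the test starts.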

    PS: now that I've broached the topic of confidence %, I should clarify that when I said above that a DBT can't have false positives, I was assuming sufficiently high confidence. That is, if you use the standard 95% then there's a 5% chance for a false positive. My point is, you can shrink the likelihood of false positives as low as you want by using a high number of trials. But false negatives are always likely and you can't control for them.
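
    To put a number on the false-negative side, here is a sketch that assumes a hypothetical listener who really does hear the difference but only gets each trial right 70% of the time. The 70% figure is invented for illustration; in a real test the true hit rate is unknown, which is the sense in which the miss rate can't be pinned down.

```python
from math import comb

def pass_probability(true_hit_rate, trials, needed):
    """P(scoring at least `needed` of `trials`) for a given per-trial hit rate."""
    p = true_hit_rate
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(needed, trials + 1)
    )

# Pass mark of 7 out of 8, i.e. roughly 96.5% confidence against guessing.
TRIALS, NEEDED = 8, 7

false_positive = pass_probability(0.5, TRIALS, NEEDED)       # pure guesser passes
false_negative = 1.0 - pass_probability(0.7, TRIALS, NEEDED)  # real listener fails

print(f"guesser passes:       {100 * false_positive:.1f}%")
print(f"70% listener fails:   {100 * false_negative:.1f}%")
```

    Under these assumptions the guesser passes about 3.5% of the time, while the listener with a genuine but imperfect ability fails roughly three times out of four.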
     
    Last edited: Dec 13, 2018
  11. purr1n

    purr1n Finding his inner redneck

    The term DBT is too ambiguous, used in the wrong context (socially retarded use), and too often used as a cheap rhetorical technique to discount individuals' subjective observations (or those who use less certain methods more prone to error or placebo). Because of this, it's become a trigger word that makes people on audio forums very mad. I am convinced that 99% of people who carelessly use the word DBT on audio forums have absolutely zero idea what its use is trying to protect against. They are just parroting DBT and associated phrases such as expectation bias.

    Confidence levels may not be applicable for results relating to an individual, say to be able to tell the difference between a Modi 3 and Topping D30 (both DACs measure with results way better than the human ear can detect, at least with the typical / primitive measurements used.) As an aside, in the tests that I do, I look for 9 of 10 correct if random means 50 / 50 chance.

    However, CLs might be useful if the question is this: what percentage of people can distinguish between a Topping D30 and Modi 3. We can conduct 20 sets of 20 individual tests. Even then, I highly doubt 95% confidence level is possible when it comes down to a percentage. Populations and hearing acuity of people are too disparate.
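
    As a rough illustration of how wide the uncertainty on a population percentage is with 20 listeners, here is a Wilson score interval sketch. The outcome "8 of 20 listeners passed" is a made-up number for the example, not a result from any test.

```python
from math import sqrt

def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (z=1.96)."""
    p_hat = successes / n
    denom = 1 + z * z / n
    centre = (p_hat + z * z / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width

# Hypothetical outcome: 8 of 20 listeners pass their individual 20-trial test.
low, high = wilson_interval(8, 20)
print(f"share of listeners who can tell: {100 * low:.0f}% to {100 * high:.0f}%")
```

    With these invented numbers the interval spans roughly 22% to 61%, which is the kind of spread to expect from a panel this small.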

    Finally, this is audio for fun, not medicines to save lives or missiles to precisely take lives. 70-80% confidence is good enough. Making audio into science purely for the sake of science is not fun and outside the domain of SBAF.

    P.S.

    Opinion polls on gear, "bright", "soft", "bassy" (say 1 to 5), from a decent population size are not a half-bad way to go. I'm sure the confidence levels might be poor, but it may be an interesting exercise. Even if results had a high standard deviation, I'd be more apt to trust these results, or at least make a better educated / informed purchase decision. The DBT dweebs barf at this, but this is how they do it for medicines, just on a much larger scale, and in many instances studies don't agree.
     
    Last edited: Dec 13, 2018
  12. purr1n

    purr1n Finding his inner redneck

    LPF (anti-alias filter) before / during A to D and the use of higher sampling rates. No issue. The use of higher than Redbook sampling rates for AD was around before Redbook / consumer CD format.

    There is a huge engineering aspect to this beyond the science / math. Feel free to read material on the chipmakers' websites or ask what the guys who set up the mics and racks do.
     
    Last edited: Dec 13, 2018
  13. MRC01

    MRC01 Rando

    In this example, your wife or kids aren't necessarily a neutral third party. Your wife knows that A is the new megabuck preamp and if you pick it, you'll be spending a lot of money, so she'd really prefer that you can't tell them apart or that you prefer B. Your kids want to play with the new gear so they want you to tell them apart and pick A. As you know, that's what the DBT eliminates: expectation bias also exists in the tester and could affect the outcome.

    I tend to agree with that. Tester expectation bias is a real thing, but it seems reasonable to disregard it if the test is just a sanity check of your own perceptions. You acknowledge it's imperfect and you're not going to submit the results to the AES.
     
    Last edited: Dec 14, 2018
  14. purr1n

    purr1n Finding his inner redneck

    I get gear for free and the rest of my family really doesn't care. And even if they did, the procedures implemented (not knowing which wire came from which source, with the source outputs occluded by a piece of cardboard) would further eliminate expectation bias. So I guess in a sense, this is / might be triple-blind testing.

    Test subject | mechanical switch | cardboard | tabulators.
     
    Last edited: Dec 19, 2018
  15. purr1n

    purr1n Finding his inner redneck

    And this is exactly why people who come into SBAF and demand DBT are immediately banned.
     
    taisserroots, briskly and ultrabike like this.
