Blind test. I constantly fail...

Discussion in 'Blind Testing and Psychoacoustics' started by murphythecat, Jun 3, 2017.

  1. Ringingears

    Ringingears Honorary BFF

    Friend
    Joined:
    Sep 26, 2015
    Likes Received:
    3,019
    Dislikes Received:
    2
    Trophy Points:
    113
    Location:
    Northern Californium Valley
    Harman has a very expensive A/B set-up. Their double blind tests have shown that trained listeners can hear differences between different speakers. I have purchased several Harman speakers over the years, and enjoyed them. My only complaint has been reliability and customer service. Enough said about that.
     
  2. Priidik

    Priidik Friend

    Friend BWC
    Joined:
    Sep 27, 2015
    Likes Received:
    1,322
    Dislikes Received:
    2
    Trophy Points:
    93
    Location:
    Estonia
    I could barely tell the difference between a Clip and BMC Puredac while switching quickly, although this was with HDVA600 as amp to HD800.
    Even worse, neither I nor my brother could easily tell the difference between Yggdrasil and Puredac from Apex Pinnacle in this manner. (FWIW, Puredac is garbage next to Yggdrasil, although again it must be noted that the Pinnacle is not the most transparent of amps.)
    Moral of the story for us: some things need to sink in for a bit.

    By M. Moffat: "Back in the early 1970s, before I founded Theta Electronics, the tube audio products company, I had a busy part time biz rebuilding Dynaco Tube Amplifiers. At that time I had converted to the tube based practice for my own system, convinced that tubes sounded better than the solid state gear of that era. In my ramblings, I met John Koval, a man who had designed a modification for the old Quad ESL loudspeakers which made them sound much better. “The mod gets rid of a 5 dB bump in the 200-400 Hz region, which makes them much flatter,” he explained. I told him that I was enchanted with the sound of tube amplifiers and preamplifiers. He explained that as long as the frequency response was the same and the levels were precisely matched, there was no way anyone could tell any amps/preamps apart in blind A/B tests. He had built a custom box that matched levels and randomized any two amplifiers or preamplifiers with a pushbutton to switch between them. Bullschiit, I thought, what about the solid state A/B box and its sonic signature?

    Intrigued, I built a similar box with passive relays and a passive attenuator. Damn, if he wasn't right. It is really difficult to tell differences in an instantaneous blind A/B test between tube gear that I built versus some commercial gear that I was not particularly fond of. I used to bet John beers that I could tell the difference. Usually, I won at 7 out of 10 picks or so – the best I ever did was 9 out of ten. But it was really hard.

    This whole deal made me wonder if I was crazy hearing differences between amps. If what John said was true, and many others have said in the passing 40 years or so, there is no point for an audio hobby involving anything other than transducers. WTF?

    So I tried something new – I still did the A/B tests, matched levels, but allowed long-term listening to each; at least an hour or two with known recordings. Guess what! Suddenly I knew which was what. I tried it out on John B and Mike and Dave and all my other audio buddies. They called it too – tubes vs a bad solid state preamp. Every friggin' time. My enthusiasm had returned. This taught me that the human ear is an integral, NOT differential device.

    So much for the blind A/B instantaneous naysayers. All that matters is frequency response, they say. People can't hear anything much above 20 kHz in their prime, less later. The ear has a short memory, it is all bias, blah, blah. They should take up a different hobby, say stamp collecting.

    Thanks to Dr. Heil, the inventor of the Heil AMT speaker, who shared this experiment with me over 40 years ago, consider this: I am 67 years old – my high end extends to just under 15 kHz (not bad for an old fart). I can play back two pulses 200 microseconds in length separated by 20 microseconds and clearly hear two pulses. Not unusual until one considers that 20 microseconds corresponds to the period of a 50 kHz square wave. And then, there is the time domain – home of spatial cues which audio measurement traditionalists ignore. I believe that in the quest for the best sound, an open mind is the most important asset. I will even listen to cables, even though I believe in my heart that all technology about cables is well known. Who knows, even an old fart like me could be surprised.

    Until then, yet another retelling of my old John Koval saga is 40 year old news to me."
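To put the "7 out of 10" and "9 out of ten" scores from the quote above in perspective, here is a minimal sketch of the chance of doing at least that well by pure guessing. It assumes each pick is an independent 50/50 guess between two devices; the function name `p_at_least` is made up for illustration and is not from the quoted text.

```python
from math import comb


def p_at_least(correct: int, trials: int, p_guess: float = 0.5) -> float:
    """Probability of getting at least `correct` picks right out of `trials`
    when every pick is an independent guess with success chance `p_guess`."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# Assumption: two devices, so a blind guess is right half of the time.
print(f"7/10 or better by guessing: {p_at_least(7, 10):.3f}")  # 0.172
print(f"9/10 or better by guessing: {p_at_least(9, 10):.3f}")  # 0.011
```

Under these assumptions a single 7/10 run is not strong evidence by itself (a guesser manages it roughly one time in six), while 9/10 is much harder to get by luck. Winning "usually" across many such bets is a different matter, since repeated sessions add up.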
     
    Cryptowolf, RakiRaki, elguapo and 2 others like this.
  3. landroni

    landroni Friend

    Friend
    Joined:
    Aug 12, 2016
    Likes Received:
    1,292
    Dislikes Received:
    8
    Trophy Points:
    93
    Can't blind test transducers, cause physics. I'll give a golden brownie point full of enchanted epeen filling to anyone who can set up a valid DBT between an Apple earbud vs an HD650 vs an Elac B5.

    Why stop there then? There are other things that most definitely turn us on, or off, like fatigue, sleep deprivation, work related issues, birth in the family, job promotion, etc., which collectively can be considered under the umbrella of neurological factors, and which DBT doesn't do zilch to control for. So "pure" it is not. Do a large sample DBT at 5am on people who've just got out from a disco, wasted, and that examination -- while formally DBT -- will still have zero scientific validity.

    'Inevitable' is a strong assumption here. These biases can creep up, always, but it will depend on the individual whether they can control for them outside formal investigations. Just like when you're comparing headphones; or cars.

    And although superficially intuitive, the assumption of "pure" sound is wrong-headed. Instantaneous DBT is a very unnatural setting for someone used to enjoying their gear at home, and it requires a strong mental adjustment (think of the neurological factors above) to achieve the same perceptions as in a more familiar, relaxed setting. What is effectively an exam setting will also introduce biases of its own, like the need to perform. By its very format instantaneous DBT puts pressure on the subject to identify differences consciously and immediately, which is a very different proposition from sitting back in your chair at home, relaxing, and wondering idly about music while letting your subconscious do the magic.

    Again, 'pure' it is not, and if unfamiliar, DBT can mess with your head (and consequently with your performance).

    Personal preference, sure. But your preference may just as easily be for the arguably inferior sound. So it doesn't get you any closer to which is better, in an "absolute", more universal or if you will scientific sense. Nor can it tell you why you're detecting reliably a difference; that is an area for fertile post-hoc speculation*.

    And think about it for a second. Suppose you can easily detect the difference between Yggdrasil and the S19 in "biased" sighted conditions (which seems to have been the case with several here at SBAF). Then you determine that you can confirm, easily and with no sweat -- because one sounds like ass -- in DBT that you can reliably tell Yggdrasil apart from S19 (like several have done here). So what would be the superior validity of choosing your preference under DBT conditions, if you know what you're listening to? Even in blind conditions, the moment you know when you're hearing Yggdrasil sound and you know when you're hearing S19 sound, how does choosing your preference under blind conditions make it any more valid or "unbiased" than doing the same in sighted conditions?


    * I should note that DBT in audio, like the ones we're discussing now, has vanishingly little, nay, nothing to do with science (unless of course the subject of investigation is the psychological and sociological aspects of "subjectivism vs objectivism" debates). Science deals in factors (or effects), not product comparisons. A DBT of Yggdrasil vs S19 is of scientific interest if and only if it can be established that there is a single, verifiable design or implementation difference between the two, i.e. the factor under investigation. Otherwise it's all just forum-based pissing contest, and has nothing to do with Science as such.
     
    GTABeancounter, RakiRaki and Dino like this.
  4. murphythecat

    murphythecat Friend

    Friend
    Joined:
    Apr 4, 2016
    Likes Received:
    622
    Dislikes Received:
    17
    Trophy Points:
    93
    One thing I notice immediately, and that I'm sure I could spot easily in a blind test, is the difference between SS mode and tube+ mode on the iFi Pro iCAN.

    I read most of you guys' advice and really respect your opinions, so I think I may take a step back and test my stuff over a longer period of time. When I started the thread, I did not know about the common opinion that some components take time before an obvious difference is noticed. I'll test myself and my friends for, say, 10 minutes and see if my results change.

    one thing important to notice though, ive been trying to hear difference between:
    1- IC's
    2- coupling caps
    3- a very transparent pass b1 vs direct input into head amp

    Therefore, it's not totally absurd to think that the differences are very subtle.

    I do hear a difference when I compare DACs, but I haven't yet found a way to perfectly match their SPL levels, and I do hear an obvious difference when I switch between my Sony TA 707 and my DHT 6B4G amp.
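On the level-matching problem mentioned above, one common approach is to play the same test tone through both devices, measure the RMS voltage at the amp output with a multimeter, and convert the ratio to decibels. The sketch below only does that arithmetic. The readings, the 0.1 dB tolerance, and the function names are assumptions for illustration, and this matches electrical level at the output rather than acoustic SPL at the ear.

```python
from math import log10


def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of device A relative to device B in dB, from RMS voltages
    measured with the same test tone and the same load."""
    if v_a <= 0 or v_b <= 0:
        raise ValueError("RMS voltages must be positive")
    return 20 * log10(v_a / v_b)


def is_level_matched(v_a: float, v_b: float, tolerance_db: float = 0.1) -> bool:
    """True if the two levels differ by no more than `tolerance_db`.
    The 0.1 dB default is an assumed target, not a measured threshold."""
    return abs(level_difference_db(v_a, v_b)) <= tolerance_db


# Hypothetical multimeter readings (volts RMS) for two DACs.
dac_a, dac_b = 2.00, 1.90
print(f"A is {level_difference_db(dac_a, dac_b):+.2f} dB relative to B")
print("matched" if is_level_matched(dac_a, dac_b) else "not matched")
```

With these made-up readings the two sources are a bit under half a decibel apart, which would count as unmatched under the assumed tolerance; the louder one would need to be attenuated until the readings agree.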
     
  5. landroni

    landroni Friend

    When debating instantaneous vs extended blind testing with myself, I like to think of this in terms of an analogy that statisticians will find familiar.

    When examining a cross-section sample (e.g. instantaneous ABX), you can only examine phenomena that happen at that point in time. As such, instantaneous DBT by its very nature cannot answer questions relating to things like long-term perception or listening fatigue.

    To examine such things one would need a time-series or panel sample, which tracks the evolution of the phenomenon at several points in time. Only this approach (e.g. extended ABX) can allow one to examine trends.

    So the question really boils down to what is of interest to you when comparing gear?

    @baldr has posted this elsewhere, but seems germane here as well:
     
    RakiRaki and Dino like this.
  6. Merrick

    Merrick A lidless ear

    Friend
    Joined:
    Jan 6, 2016
    Likes Received:
    5,022
    Dislikes Received:
    5
    Trophy Points:
    113
    Location:
    Portland, OR
    If there were truly no differences between gear, then indeed long term listening would not reveal any differences, because none would exist.

    However, saying your instantaneous ABX sessions prove there are no differences and therefore long term ABX is pointless is faulty logic.
     
    Dino likes this.
  7. murphythecat

    murphythecat Friend

    Where did I say that I believe that it takes long listening sessions to hear differences? This is what most of you guys are saying, not me. Where did I say that I intend to "prove" that there are no differences?

    I'll take the hypothesis that it may take longer sessions to reliably identify subtle differences between the gear I've tested, and hear for myself.

    Personally, I'd like to know of controlled tests that support what you guys are saying. Mike Moffat's test is very interesting, but it's about tube amps vs SS...
    Is it really far-fetched to even suggest that differences between ICs, two excellent coupling caps (Mundorf ZN and silver mica) or a Pass B1 are very subtle and possibly impossible to detect under normal listening conditions? I don't think so.

    Can anyone cite me other controlled tests (other than Moffat's) where:
    first, an instantaneous DBT was conducted and no participants could reliably detect a difference or identify component A, B or C;
    second, a DBT conducted over a longer period of time made it possible for the participants to reliably tell a difference and identify component A, B or C?
     
    Last edited: Jun 5, 2017
  8. Thad E Ginathom

    Thad E Ginathom Friend

    Friend
    Joined:
    Sep 27, 2015
    Likes Received:
    5,300
    Dislikes Received:
    10
    Trophy Points:
    113
    Location:
    India
    Mr Oliver can have machines silently and invisibly exchange speakers. Of course it is understood that that is a whole reality-field harder with headphones, let alone in-ears.

    So you have picked one of the hardest, perhaps impossible, blind tests to set up. Have you invalidated the concept of blind testing? I don't think so.



    Which means what? You can apply all those to sighted testing or, errmmm... anything.



    Can and do. Always. But you think otherwise, then fine. Hey, you might even have yours under control. Even the unconscious ones. Probably humanly possible.

    So do the blind test in the comfort of your home, with long listening periods. You have set up a definition of the listening test, which is somewhat extreme. Does that invalidate blind testing?

    Finding out that the world is not as one thought, in whatever way, certainly messes with one's head.

    Better was never the question. The questions went...

    1. Is there a difference?

    2. If so, which do I prefer?

    But if you need to change the goal posts to win the arguments... sure: you can do whatever you want with your own goal posts.

    Think about this for a second. Sometimes blind tests prove that there was a difference and that the listener was right and could identify the components, or whatever. If you choose to say, "Well, that was a waste of time, then," fair enough. In a way, I suppose it was.


    Disclosure. I find the theory and principles of this stuff interesting, although I have run out of steam for extended argument on it. I don't think that the principles can be knocked down although many have tried and will continue to do so.

    Blind tests take people and time to set up, and even more so if listeners want extended listening periods (and why not) and I have never been part of such an experiment.

    The closest I came to actually doing it was with sample rates. I set myself up to be thoroughly confused, if not entirely blind, about which sample was which and compared my own digitisations in long and in short. I often heard different stuff, which, when I checked with the other sample, was not different at all. It was enough for my ears which were quite a lot better back then, to be confident in my findings.


    Certainly both. I don't doubt the reality of fatigue, and anybody that does should try some 32 kbit/s MP3 stuff (yes: 32: I didn't leave off a zero). Even for voice only. E.g., comedy isn't even funny any more. Not confident, though, that I could identify two or three seconds of a 32 kbit/s MP3 sample.

    Probably the last thing I have to say on this topic is that there are many, many aspects to ownership satisfaction, and that even just the listening experience does not start and end with listening. And nobody is forcing blind testing on anyone. But I still think that my reason for not doing it is one of the best: laziness!
     
    crenca and a44100Hz like this.
  9. anetode

    anetode Moderator

    Staff Member Friend
    Joined:
    Sep 25, 2015
    Likes Received:
    556
    Dislikes Received:
    1
    Trophy Points:
    93
    Location:
    How did I get here
    This is not an assumption, it is as close to fundamental truth as we can get to with science. Visual cues routinely alter your auditory perception.

    Not that I'm pining for your peen brownie, but this wouldn't be all that difficult to accomplish in a lab setting with the aid of an anesthesiologist.

    I think you are underestimating the breadth of scientific inquiry more than just a tad in order to ridicule disagreeable findings. Here's an experiment you can try at your leisure: go to Google Scholar/arXiv/Elsevier/JSTOR/PubMed and check out the many thousands of studies of human perception which have successfully employed double blind testing to useful ends.
     
    Grahad2, Xen, Thad E Ginathom and 4 others like this.
  10. landroni

    landroni Friend

    If this is your takeaway from my various irrelevant tirades on this topic, I don't think you're genuinely appreciating the subtleties that I'm raising when it comes to the scientific method and its limitations. No harm done, either way, and I have no interest in flaring tempers here. But I do find important to share my perspective on these issues.


    Depends on how you want to look at this, and which is your vantage point. But I wouldn't get into "fundamental truths" here, as it's a matter of putting things into their proper context.

    I will agree with you that visual cues can and do routinely alter your auditory perception. No problems here. It does NOT automatically mean, though, that every single auditory perception influenced by visual cues is biased. ALL human perceptions are noisy, all of them, like super, ridiculously noisy (e.g. think about this next time you look out the window), yet we still come up with a great many estimates which are largely consistent and unbiased (think statistics). Humans have simply evolved in this way, and some/many are really good at it. So just because visual cues can alter auditory perception doesn't mean that one cannot obtain a consistent, unbiased estimate of e.g. a given piece of audio gear, which can later be confirmed in DBT (and once more, in case I failed to mention it, DBT in audiophile applications can only tell you how often you have or haven't correctly identified a given piece of gear, and little else).
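The statistical distinction being leaned on here, noise versus bias, can be illustrated with a small simulation. This is only a sketch under made-up numbers: a "true" value of 10, Gaussian noise with a standard deviation of 5, and an arbitrary systematic offset of 2 standing in for something like a visual cue. None of these figures come from the thread, and nothing here says which case real listening falls into.

```python
import random
from statistics import mean


def simulate_judgements(true_value: float, noise_sd: float, bias: float,
                        n: int, seed: int = 0) -> list[float]:
    """Draw `n` noisy judgements of `true_value`.
    `noise_sd` spreads individual judgements; `bias` shifts all of them."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]


TRUE_VALUE = 10.0  # hypothetical quantity being judged

noisy_unbiased = simulate_judgements(TRUE_VALUE, noise_sd=5.0, bias=0.0, n=10_000)
noisy_biased = simulate_judgements(TRUE_VALUE, noise_sd=5.0, bias=2.0, n=10_000)

# Averaging washes out the noise but leaves any systematic offset in place.
print(f"noisy, unbiased mean: {mean(noisy_unbiased):.2f}")
print(f"noisy, biased mean:   {mean(noisy_biased):.2f}")
```

Individual judgements in both lists scatter widely, yet the first average lands near the true value while the second stays shifted by roughly the offset. The open question in the discussion is which of the two sighted listening resembles, and that is something only a controlled comparison can settle.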

    Now does it matter? Not sure. If you are so concerned about biases and how various factors affect perceptions, do you DBT headphones? I mean seriously, if anyone here who leans objectivist has DBT'ed even once, say, an HD650 against a PortaPro, please raise your hand, share your methodology, and explain how your approach is valid. And if you haven't, do you assume that any and all differences that you perceive between the two headphones are biased, imaginary, placebo, what-have-you? And if you don't assume these things, how can you trust your perceptions if you've only ever evaluated the gear in biased, uncontrolled conditions?
    But why stop at headphones... Are you -- just a generic 'you' -- as zealously concerned about DBT'ing when determining preferences and choosing TVs, monitors, computers, keyboards, mice, skis, cars, planes, etc., etc.? If not, then I challenge why you would be so concerned about audio gear in particular.
    Again, all human perceptions are noisy, and we most of the times choose a great many gears over others based on raw perceptive estimates in uncontrolled conditions, and in a large majority of times we do not doubt those estimates just because "you didn't DBT it". (Many of these things can't be DBT'ed, anyways.)

    Oh, I beg to differ. This would work only if you assume that being anesthetized doesn't affect your auditory perceptions. Big assumption. But even then it won't work. You simply can't trick someone that they might be listening to an earbud plugged in their ears when in fact a pair of speakers 3 meters away are playing, or the other way around. The physical constraints alone prohibit these things, with existing technology anyways. (But if you have some serious ideas on how these things could be set up, I'd be thrilled and I'm all ears. I for one cannot come up with even a hypothetical valid setting that doesn't involve Star Trek technology, and even then...)

    Double-blind testing is useful alright. I don't think we disagree there! But before placing too much faith in it (hint: it's not a panacea), one should understand when it is useful, how, and what are its limitations. As with all things scientific method, various tools come with various constraints, limited scope, and assumptions. To come up with valid research, all need to be understood and acknowledged by the researcher.

    But please don't presume too much about my intentions (or please do read more carefully what I'm writing, even if you may find my arguments disagreeable). The scientific method is very useful, and I'm all for it. If fuses or caps or cables or DACs or amps or headphones or speakers don't do zilch for us, I want to be the first to know about it!

    However careful attention should be paid to what you're doing, how, and what it is that you're inquiring, and what it is that your inquiry can answer. And I'm not seeing this in audiophilia, at least not in the so-called Sound Science quarters. To put it crudely, there is good science, there is bad science, but what I see in the Sound Science on HF and similar comes off as a poor mockery of science (and before jedi-ing out dislikes, see the Cult of Science section here to see what I mean). Science is harder than many seem to realize, and if you're not paying attention you're just contributing to noise, however statistically significant your results may be, and not to signal. This works even harder when you're espousing principles that are patently unscientific, and Sound Science is a basket case (and I won't get into examples lest this becomes a manuscript of sorts, but anyone feel free to PM me if interested).

    To come back to what you term as my ridiculing disagreeable findings. Comparing two DACs (say Yggdrasil vs S19) has no intrinsic scientific value, because there are far too many implementation differences to isolate any one factor in particular. I mean, it's fun for us OCDs in audiophilia, and can be decidedly helpful in confirming if you can reliably detect a difference (if that's your thing), but Science it ain't. Same with comparing an Audioquest cable with a DIY silver one -- you may detect a difference, or not, but if you do you still won't know with any certainty why you're hearing differences. Could be cable length, cable width, sheathing, materials, crystal structure, cryo treatment, EMI sensitivity, etc., etc. Again, this is not science.
    Science would be if someone came up with two otherwise identical cables that differed on one single characteristic, and tested that for audibility. Same for DACs, amps, etc. Of course, for some components doing such inquiries becomes prohibitively complicated if not impossible, again given the current scientific tools and technology at our disposal. And that's OK: Some things Science can't give an answer to right now. Always has been, always will be.


    And before anyone else starts assigning nefarious intentions to me or misinterprets what I am trying to convey, please carefully consider the issues that I'm raising here. Science is hard, comes with assumptions and limitations, so don't take it the wrong way when I'm pointing out that jumping into these things headlong with "let's just DBT it!" or "prove it!" won't necessarily end up with a valid, satisfactory conclusion. If I'm doing this in such a dry manner it is precisely because I'm profoundly sceptical of both subjectivist and "objectivist" claims in audiophilia, at least the way most of them are usually pitched, and do intend to investigate some of these claims as time and finances permit (and as long as I have an interest in these things as opposed to simply enjoying music).
    And in case anyone thinks I'm some sort of a religious anti-DBT zealot and bigot, think again: I intend to grab one of Torq's DBT assistants as soon as they become available. (I'm still very [!] curious to know if I can tell Modi Multibit and X3 apart under controlled DBT conditions, but I will do that only if there is a simple and straightforward way to approach this in home conditions.) But any and all results from my investigations will still be necessarily subject to the limitations I've highlighted here and elsewhere... It cannot be otherwise.

    I hope this makes my perspective clearer, and I won't get pinned in either the obsessively zealot anti-objectivist or in the religiously bigot anti-subjectivist camps. Or in both.
     
    Thad E Ginathom and RakiRaki like this.
  11. Kattefjaes

    Kattefjaes Mostly Harmless

    Friend
    Joined:
    Sep 5, 2016
    Likes Received:
    5,075
    Dislikes Received:
    41
    Trophy Points:
    113
    Location:
    London, UK
    Yep, it's probably time to spam this one again:



    The senses are really not inputs, they are mostly to do with "perceiving", a post-hoc thing you do in your brain. Hell, the colour of plates also affects the taste of food. Go figure. Blind testing is important precisely because external stimuli can bias what you perceive in ways that feel absolutely authentic. It's not a sign of weakness, there are no macho or e-peen points on offer, it's just how the brain works.




    Oh, and @landroni, having been a guinea pig in proper laboratory conditions* double blind listening tests, I can tell you this.. They're as boring as shit to take part in. Really. Tiring too, you wouldn't think so, but it's quite gruelling if you're really concentrating. Be careful what you wish for!


    * Proper treated listening rooms, one-way glass, bespoke DBT software and sinister anonymous control surfaces. Weird stuff. Very "Prisoner".
     
    Last edited: Jun 5, 2017
    Thad E Ginathom and SSL like this.
  12. SSL

    SSL Friend

    Friend
    Joined:
    Nov 12, 2015
    Likes Received:
    1,053
    Dislikes Received:
    6
    Trophy Points:
    93
    Kind of like this thread?
     
  13. anetode

    anetode Moderator

    I'm disagreeing with your assertion that these biases are mere assumptions or that they are irrelevant even with experienced and/or trained listeners. Our evidence-based understanding of human perception runs contrary to your thesis. As per Thad, "the aim is to listen minus a few inevitable biases"; testing always requires controls. To do otherwise is worse than to simply introduce noise to the data, it will introduce false positives as well as false negatives. Now the main thrust of your criticism, as I understand it, is that listening DBTs skew the balance towards false negatives -- and I agree with that. We've been long overdue for shifts in testing protocol which introduce listener training and long-term listening.
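The point about short tests skewing towards false negatives can be made concrete with a rough power calculation. The sketch below assumes a listener who genuinely picks correctly 70% of the time (a figure borrowed loosely from the Moffat anecdote earlier in the thread), a 10-trial session, and a pass mark of 9 correct. All three numbers and the function name are illustrative assumptions, not anyone's actual protocol.

```python
from math import comb


def pass_probability(p_correct: float, trials: int, pass_mark: int) -> float:
    """Chance of scoring at least `pass_mark` out of `trials` when each
    trial is answered correctly with independent probability `p_correct`."""
    return sum(
        comb(trials, k) * p_correct**k * (1 - p_correct) ** (trials - k)
        for k in range(pass_mark, trials + 1)
    )


TRIALS, PASS_MARK = 10, 9  # assumed session length and pass criterion

# A pure guesser rarely passes: this is the false positive rate.
false_positive = pass_probability(0.5, TRIALS, PASS_MARK)

# A listener with a real but modest ability also rarely passes.
power = pass_probability(0.7, TRIALS, PASS_MARK)
false_negative = 1 - power

print(f"false positive rate (guessing): {false_positive:.3f}")   # 0.011
print(f"false negative rate (70% hit rate): {false_negative:.3f}")  # 0.851
```

Under these assumptions the strict pass mark keeps lucky guessers out, but it also fails a listener with a real 70% hit rate about 85% of the time. More trials, or training that raises the hit rate, is what brings the false negative rate down without loosening the guard against false positives.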

    A brief note on human perception being noisy, one way to think of it is in terms of higher SNR, since the act of perception automatically discards a lot of data and introduces artificial permanence to features our brains judge as most relevant. That does not mean, however, that your neural activity is uncoordinated, activating some random/noisy metabolic arrangement. Just the opposite, perception is really a form of cognition rather than a precursor, as such it depends heavily on established neural networks which involve all manner of activation patterns, feedback loops and brain regions; all of which vary by individual. We do all share a tendency towards a specific arrangement of neural development, but only in the broadest terms.

    What happens in the brain of an audiophile who listens to their shrine (replete with cable risers, glowing tubes, etc.) is drastically different than what happens to someone who is listening to the same system without any expectations or past experiences. While you appear inclined to put very much stock in past experiences tuning one's mind to better discern minute aural differences that may prove too subtle to pick up on traditional DBTs, it is much more likely that these experiences are a simple form of operant conditioning whereby prior biases are either reinforced or slowly shifted. I still think that there's a great deal to be learned about the audiophile experience, but not because there's going to be some uncovering of the proverbial other 90% of the iceberg as having to deal with the mechanics of sound reproduction. Instead the evidence is decidedly on the side of product presentation on the consumer side and preferred engineering methods on the producer side.

    What you're saying is that a human being is very likely to be able to discern between a speaker and an in-ear-monitor, which is true. Nonetheless, it is possible to set up blind conditions up to complete paralysis and actual visual blindness. Throw all ethical restraint out of the window and we can go even further and use transcranial stimulation and brain surgery to home in on the exact biological processes involved in audio perception. We've done so inadvertently with people who have had to undergo various types of brain surgery, and intentionally in medical trials with animals as close to us as monkeys. Again, you are promoting an artificially constricted view of scientific inquiry.

    Of course that doesn't mean that the NIH is itching to approve your grant proposal to study the listening preferences of audiophiles undergoing unnecessary medical procedures.
     
    Last edited: Jun 5, 2017
    crenca, Xen, Thad E Ginathom and 2 others like this.
  14. landroni

    landroni Friend

    Sure, but those controls should not start and stop at level-matching and DBT. It is trickier than that to come up with valid methodology, and careful attention can and should be paid to ensuring that the testing setting, while controlled, is as true as possible to the traditional setting of the audiophile in their natural habitat. I can't begin to imagine why this would be in any way controversial. We don't want to prime our test subjects (unless this is a specific goal), just like we don't want to skew the testing conditions to yield a mostly pre-determined result because of the constraints they impose on the test subjects.
    Also, testing of the double-blind kind usually requires the researcher to set up a control group and establish a baseline performance, which is again not something that I often see being done or even discussed in audiophile circles. So again, "just DBT it!" doesn't pass my sniff test, not that it matters.

    These are your priors, and that's fine. Mine are somewhat different, insofar as while I'm keenly aware of and worry about placebo effects in uncontrolled conditions, I also have a strong disbelief that random factors always result in severely biased perceptions, as in perceptions that are systematically way off from the true effect and are not 'useful' (think in evolutionary terms). But at this level academics trade in fundamental (and personal) beliefs about how the world works, or should work. And sometimes we can agree on a methodology protocol that can sift through these priors and yield effects of interest, which may confirm or disprove some of those priors. (A boringly academic way to say: Let's test this!)

    I never proposed this. You just did. Such an approach could severely bias results, and doesn't come anywhere close to what I could ever justify as valid methodology.

    So we're back to square one: How do we test* transducers in blind conditions? And absent a sane way to do this, and given the assumption -- or as you would put it, it seems, "fundamental truth" -- of visual cues systematically biasing our perceptions, how can we trust any of our perceptions when it comes to e.g. transducers if we are so incredibly suspicious of e.g. cables, amps or DACs**?
    * apparently needed implicit disclaimer: in a sane setting where test subjects and other poor animals would be guaranteed to survive unscathed, and that would not introduce hugely artificial constraints on the test subjects.
    ** Under these assumptions I don't buy the measurement story as lab testing results don't have much useful to say about audibility or subjective perception, only listening tests do. So arguments to the tune of "we know transducers sound different" vs "we don't know cables sound different" don't hold water here.***
    *** In these cases my sniff'o'meter simply points me towards a latent case of cognitive dissonance.


    As you can see I have a problem with these assumptions (or fundamental truths) -- like no long term auditory memory in humans or visual cues always biasing auditory perceptions -- being usually pitched in absolute terms, as if they always held in any setting, rather than, more appropriately, being presented in the context of their discovery, which would qualify them appropriately. Otherwise it is easy to construct reductio ad absurdum scenarios that expose the obvious flaws in taking such assumptions for granted and following them to their extreme implications.
    And I still don't see why these things would be oh so important for average Joe in the audio world (i.e. all human perceptions are noisy), but would routinely not be worried about when it comes to other senses, say in photography, video, food, etc. Nor do I see why there is all this fascination with 'DBT as a holy grail' in audio, since like any tool it comes with limitations and can only tell you so much (and depending on the specific setting could tell you much less than what you'd hope for). So yeah, there are various aspects of the "objectivist" pitch that don't seem to make a whole lot of sense to me and that I simply couldn't justify when subjecting them to scrutiny, which is why I tend to speak in terms of 'assumptions' and not 'truths' or 'facts'.

    (Anyways, all rhetorical questions and statements, all in good faith, so no need to shoot something back if it's snide.)


    Oh, I don't doubt that! Which is why I don't understand why so many argue that it's oh so imperative to do these things in audio (before making spending decisions or affirming any kind of firm preferences), yet so very few seem to have actually set up or participated in such things regularly*.

    As I mentioned earlier, science is hard and -- as you point out -- boring. I've never seen prose more boring than in academia, not unlike my abominations above. And when academics start debating methodological fine points and validity of assumptions, I can most definitely see "death by boredom" as a valid cause of death in a coroner's report. Some may even conflate it with noise.

    * I recall last year when @Torq completed his DBT tool and went out on HF to seek volunteers to assist with his first DBT trials or even partake in them. The number of "subjectivists" or "objectivists" who took him up on the offer was apparently exactly zero. This even with a relatively large local community of audiophiles.
     
  15. anetode

    anetode Moderator

    Staff Member Friend
    Joined:
    Sep 25, 2015
    Likes Received:
    556
    Dislikes Received:
    1
    Trophy Points:
    93
    Location:
    How did I get here
    It's not an issue of personal beliefs, it's a consideration of what established science already tells us. Even in the case of experts personal beliefs are constantly challenged to facilitate a consensus with data from new experiments and the cumulative expertise of others in their respective fields. The concept of a placebo encompasses a wide range of biological mechanisms, many of which can be thought of in terms of behavioral and cognitive psychology as well as more grounded neuroscience. Evolutionary psychology deals with an overlapping but broader understanding of a human being as a genome influenced by natural and artificial selection processes. Too often reference to evolutionary psychology becomes some sort of catchall explanation for phenomena already studied in detail by other branches.

    I wholeheartedly agree -- for a while now I've been toying around with a DBT setup that listeners might actually find enjoyable. Hoping to have something ready to try out at next year's RMAF. (Though sadly it would mean not participating directly and having to hire confederates)

    I have no doubt that you have a problem with how the brain works, unfortunately "they don't think it be like it is but it do". It's not an issue of formal logical argument, it's biology. There's an argument to be made that the balance of multisensory integration varies with regard to environment and state of mind, but in absolute terms, yes, the understood neurophysiological mechanism of perception absolutely requires that inputs are mixed. For more on the topic, read The Merging of the Senses, by Stein.

    That was my attempt to lighten the mood a little after a morbid tangent :)
     
    Last edited: Jun 6, 2017
  16. RakiRaki

    RakiRaki Acquaintance

    Joined:
    Oct 1, 2015
    Likes Received:
    28
    Dislikes Received:
    0
    Trophy Points:
    13
    Location:
    New Zeal-and
    I have enjoyed the exchange between iandroni and anetode above. Having some experience of human subjects research, I do think the difficulties of setting up robust audiophile oriented DBT may have been addressed only lightly.

    With human subjects I want to highlight a measurement dichotomy. Consider a drug trial and suppose the measurement instrument is something which - for argument's sake - is independent of subjective self-assessment. Let's say a blood test of a kind with high validity and reliability and well-established test protocols. In this case we have a 'simpler' or more objective measurement scenario than what has been discussed in this thread. Why? Because the measurement instrument is not the human subject. Instead it is something over which we have a high degree of control: we can baseline its performance throughout the trial and recalibrate as needed.

    These are substantial experimental advantages which vanish once the measuring instrument becomes human subjects. Unlike a typical laboratory instrument, a human has many things going on. Particularly hard to control are order effects such as learning (once I hear something I didn't previously notice it's essentially impossible to reset to the state of not having heard it), fatigue and attentional issues such as boredom/search for novelty/intrusive thoughts. Mechanical instruments - I assume - are not subject to these problems.
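    One common way to keep order effects from lining up with the variable under test is to randomize and balance the presentation order. This does not remove learning or fatigue; it only spreads them evenly across A and B. Below is a minimal sketch, assuming an ABX-style session where each trial's hidden item is either A or B. The function name and parameters are hypothetical, not taken from any tool mentioned in this thread.

```python
import random


def make_trial_schedule(n_trials, seed=None):
    """Return a shuffled list of 'A'/'B' labels with equal counts of each.

    Equal counts mean learning and fatigue effects fall on A and B
    about equally; shuffling keeps the subject from predicting the order.
    """
    if n_trials % 2:
        raise ValueError("n_trials must be even to balance A and B")
    rng = random.Random(seed)  # a seeded generator makes the schedule reproducible
    schedule = ["A", "B"] * (n_trials // 2)
    rng.shuffle(schedule)
    return schedule


# Hypothetical 16-trial session; the operator logs the seed with the results.
schedule = make_trial_schedule(16, seed=1)
print(schedule)
```

    A strictly balanced schedule has one known weakness: a subject who keeps count could infer the last few answers, so some designs draw each trial independently instead.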

    Another effect I suspect (based on my field of social psychology) is a reputational one: an audiophile (but perhaps not nonaudiophiles) risks damage to their reputation and so the payoff matrix makes a hedging strategy optimal; i.e. "I don't really hear a difference" minimizes one's losses. If true, null findings are the most likely outcome.
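    To make "null finding" concrete: in a forced-choice test such as ABX, a subject who hedges by guessing is expected to score around 50%, and a score is usually judged by how likely it would be under pure guessing. A minimal sketch of that calculation follows, assuming independent trials and a 0.5 chance of a correct guess on each. The function name is hypothetical.

```python
from math import comb


def p_value_at_least(correct, trials, p_guess=0.5):
    """One-sided exact binomial tail: P(X >= correct) under pure guessing."""
    return sum(
        comb(trials, k) * p_guess**k * (1 - p_guess) ** (trials - k)
        for k in range(correct, trials + 1)
    )


# 12 of 16 correct would be fairly unlikely from guessing alone (about 0.038).
print(round(p_value_at_least(12, 16), 3))
# 9 of 16 correct is entirely consistent with guessing (about 0.40).
print(round(p_value_at_least(9, 16), 3))
```

    Note that a high p-value only says the result is compatible with guessing; it cannot distinguish "cannot hear a difference" from "chose not to commit", which is the reputational problem described above.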

    Then, as iandroni has pointed out, DBT may be substantially different from real-world listening (I hope so!). Careful experimental setup establishes baselines. Is there a difference between real world listening and a DBT scenario? If there is, other questions have to be addressed. For example, what length of time can 'adequate' (whatever we operationalize this to be) listening attention be maintained before order effects come into play, assuming they do? Note that further methodological issues instantly arise: we can't answer this last question or the one before by asking the subject. Careful design means we prefer not to validate and baseline any instrument using itself. Some independent and validated measure is needed to properly address questions such as how long can a single session last; is rest needed between sessions and if so how long?

    Use of human subjects as measuring instruments usually requires training, retraining to correct observer drift, and some evidence-based notion of how often this has to be applied.

    In essence, use of human subjects - like use of any other instrument - produces meaningless data if we can't or don't quantify their properties, capabilities and limits with respect to the tests proposed. This is needed in order to justify a claim that our instrument is fit for purpose. Use of human subjects where the domain is subjective perception is particularly difficult to get right because we're fumbling in the dark from the outset. Even the little I've pointed to above is littered with unknowns and strikes me as formidable.
     
    crenca, gixxerwimp, Lasollor and 10 others like this.
  17. Thad E Ginathom

    Thad E Ginathom Friend

    Friend
    Joined:
    Sep 27, 2015
    Likes Received:
    5,300
    Dislikes Received:
    10
    Trophy Points:
    113
    Location:
    India
    Definitely no flaring tempers! I do like, sometimes, to get my teeth into an argument, tear the thing to shreds, or show that it is irrelevant or fallacious. I guess that, on your side, maybe you feel similar. Nothing personal!

    Bias is a dirty word. Perhaps we should use another one! Although sight itself is one of the big, immediate, influences on hearing, as per the McGurk thing, what blind testing seeks to do is to remove a lot more than just what we see. It seeks to narrow down the amalgamation of visual clues, prior impressions, knowledge, suppositions and a long list of et ceteras which amount to any one of our perceptions, into being, as close as can be, just about sound.

    I completely agree with this. And the impression that I have picked up from hydrogenaudio (yes, I check out that site; no, it is not my religion) is that test "subjects" should be comfortable. In fact the thing does not have to be a challenge: it can be as much an enquiry.

    It often becomes a challenge, either because people are very confident to show that they are right, or because they want to take themselves on in practice, improving skills etc.

    I half-remember one audiophile-ego your-test-can't-beat-me thing; I think it is on youtube, maybe. Part of the setup was that the testee should be entirely comfortable and should have unlimited time to listen sighted to the A and the B. No pressures. No intimidation. The A/B switching was then entirely at the request of the "subject." There was no getting one second of this and one second of that and half a second to tick the box.

    There might be occasions on which such forensic testing of sound against sound might be useful and might serve a purpose. I can't imagine buying an amplifier or a DAC like that even if they sound so alike as to barely make a difference. I might be doing this if I was an engineer (although plenty of engineers wouldn't) but I am never going to do it as a music listener. On the other hand, if an honest salesman was prepared to set up testing behind a curtain, with an easy/quick switching method, I'd be chuffed and delighted. Sadly, I expect the salesmen to be thinking of their profit margins, and making those tiny, doesn't sound louder but does sound better tweaks with the volume control.

    I do think there is an element of bullshit reveal in blind testing, and that is where the real controversy is inevitable.

    Look, we all know that a one-inch radio-alarm speaker sounds different to a Tannoy Turnberry. And on that scale there are numerous shades. But Audioquest's range of cables? Well, I suppose one man's bullshit is another man's great place to lay eggs.

    To my mind, it should be, and should have been, part of the industry's review process. It could have been used, as per J Gordon Holt's comments (although it has been said that he never used blind testing), as an industry honesty check. But the time has been lost and it probably never will be.

    Extreme claims require extreme proofs. So, the next time someone tries to sell you stuff to hang your speaker cables off, or "audiophile" network cables, don't even think of double-blind testing: get them to show that they can!
     
    crenca likes this.
  18. anetode

    anetode Moderator

    Staff Member Friend
    Joined:
    Sep 25, 2015
    Likes Received:
    556
    Dislikes Received:
    1
    Trophy Points:
    93
    Location:
    How did I get here
    From the birth of recorded sound and audio engineering we've had many experiments to quantify absolute perceptual thresholds for just about every type of signal distortion. We've identified psychoacoustic masking properties in human hearing and established that, to be audible, distortion has to be up to 20 dB higher in music recordings than in test signals optimized to the geometry and sensitivity of the human ear. We've come up with a plethora of solutions which consistently remain at least 20 dB lower than established test signal thresholds in their distortion of audio reproduction.

    Yet in relatively recent times we have completely revolutionized our understanding of brain functioning and have only begun to get close to quantifying levels of complexity inherent therein. We've nonetheless uncovered a physiological description of key cognitive-perceptual processes, namely the brain's internal mapping of the outside world by processing sensory information (from way more than the colloquially termed five senses). We've pecked away at the problem by studying larger order psychological phenomena and identified the consistent emergence of a common set of sensory illusions and perceptual changes influenced by both short-term and long-term memories.

    To me, and I'm sure to you and many others here, it comes as no surprise that the commonly employed DBT methods used to establish perceptual thresholds inadequately encapsulate the human experience of critical listening. It makes sense therefore that to study what really matters to audiophiles requires us to look not only, nor even primarily, at the measurements of gear that already buries distortion products many orders of magnitude below absolute perceptual thresholds, but to document the full context of the listening experience, the self-reporting from listeners, the listeners' experiential history and then to study all of these in relation to both physiological observations and the variables under test.

    That is to say, to progress we need to study people, not just gear. Here we all seem to be in agreement.
     
    Thad E Ginathom and Wilson like this.
  19. Thad E Ginathom

    Thad E Ginathom Friend

    Friend
    Joined:
    Sep 27, 2015
    Likes Received:
    5,300
    Dislikes Received:
    10
    Trophy Points:
    113
    Location:
    India
    Absolutely. Because what happens on the inside is actually more interesting than what happens on the outside. But when the tenth hearing of Hotel California sounds different to the first five, in some way, and I suggest that this may be as much to do with physical/psychological mechanisms as with change of gear, I get told to "trust my ears." And I am not talking about random impressions of internet people: I am talking about actual people, actual friends, real people, in whose company I have thoroughly enjoyed listening to both gear and music. And without the experience, I would never have had the thought: "Wait, I have heard this ten times, of course my brain is going to find some new nuance."

    Let's explore...
     
  20. murphythecat

    murphythecat Friend

    Friend
    Joined:
    Apr 4, 2016
    Likes Received:
    622
    Dislikes Received:
    17
    Trophy Points:
    93
    DBT removes a couple of well-known psychological biases. Is there any other known way to remove those psychological biases without the use of DBT?
     
