555 Transcript


Dr. Jeremy Sharp (00:00)
Hello y’all. Welcome back to The Testing Psychologist. This is episode three in the March sprint, three of 12, with a capstone 13th episode before the month ends. Today we are talking about performance validity testing. We’ve talked about PVTs on the podcast before, but I’m going to do a little recap and refresh our memories about PVTs and why they are important. This is, I guess, a little bit of the elephant in the assessment room.

Before we dive into the data, I want to offer a professional disclaimer of sorts. As psychologists and evaluators, we have a strict ethical and legal obligation to maintain test security. Under the APA ethics code, we’re tasked with protecting the integrity of our measures. That means I will not be naming specific test stimuli, showing specific testing materials, or providing the answer keys, so to speak, or the scoring rules for the tools we use. If you are a clinician who is listening, you know these tools. If you’re a member of the public, please understand that these are cloaked measures that are only effective when they are novel and unknown, so exposing them ruins the value for everyone. Today we’re going to move beyond the binary label of quote-unquote malingering and into a more sophisticated understanding of why effort fails and how we use validity indicators, both standalone and embedded, to protect the truth of our diagnostic data.

As in the last couple of episodes, I want to mention Crafted Practice before we move to the full discussion. Crafted Practice is my in-person business retreat just for testing psychologists. This will be the fourth year that I do it. It is all-inclusive, which means your registration covers meals, lodging, and all the coaching; all you have to do is get here. It’s a four-day event here in Colorado over the summer where we do small-group coaching, implement our ideas, and have spaciousness and time to connect. We do happy hours at the end of every day, which are a blast. It’s a time to slow down, actually work on your business, and put some of those ideas into place. If that sounds interesting, go to thetestingpsychologist.com slash crafted practice to get more info and to register. I would love to see you there.

For now, let’s jump to our discussion of PVTs.

Dr. Jeremy Sharp (02:49)
All right, everybody, we are back. We are talking about validity testing. Why are we talking about validity testing? Because clinical intuition is essentially a coin flip; we are not good at determining when someone is giving optimal effort. I think it is relatively well established in the research that our clinical intuition is not great in general, and this applies to validity testing and optimal effort as well. Just as an example, a 2022 meta-analysis showed that in clinical ADHD evaluations, failure rates on at least one performance validity indicator range from 9 percent to 27 percent. At the high end, that is nearly one out of every four assessments where the scores you see on the page may not be a valid representation of the client’s true ability.

In clinical practice, this is rarely about faking or malingering, I don’t think. It is often more of a “cry for help” profile: the patient or client is so desperate for their struggles to be validated that they subconsciously overemphasize their deficits. They aren’t trying to trick you, necessarily, but they are trying to ensure that they aren’t ignored or dismissed. There are times, of course, when malingering or faking bad may be present, but in the majority of cases that is not what’s going on. What we can say is that there is suboptimal effort for one reason or another, and we’re going to talk about how to dive in and actually recognize some of that suboptimal effort.

For a truly thorough discussion of this topic, I have interviewed Dr. David Baker and Dr. Cecil Reynolds in past podcast episodes, where we get into this concept in more detail. But like I said, it’s always nice to keep things top of mind. I certainly forget things and lapse back into old habits in my typical practice, and this is no exception.

So let’s move to choosing the right tools. To do this part of the discussion, we need a little conversation about sensitivity and specificity, those qualities that are so important in our measures. When we’re selecting our validity measures, we have to balance sensitivity, which is catching low effort, against specificity, which is making sure we don’t accidentally mislabel a truly impaired person. That is the line validity measures are trying to walk.
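To make that balance concrete, here is a minimal sketch in Python. Every count in it is invented purely for illustration; it does not reflect any real instrument’s validation data.

```python
# Sensitivity/specificity sketch for a hypothetical PVT cutoff.
# All counts are made up for illustration only.

def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of genuinely low-effort examinees the test flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives: int, false_positives: int) -> float:
    """Proportion of credible (good-effort) examinees the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical validation sample: 50 low-effort cases, 100 credible cases.
flagged_low_effort = 35   # low-effort cases the cutoff caught
missed_low_effort = 15    # low-effort cases the cutoff missed
cleared_credible = 95     # credible cases correctly passed
flagged_credible = 5      # credible cases wrongly flagged

print(f"Sensitivity: {sensitivity(flagged_low_effort, missed_low_effort):.2f}")  # 0.70
print(f"Specificity: {specificity(cleared_credible, flagged_credible):.2f}")     # 0.95
```

Making a cutoff stricter catches more low-effort cases (higher sensitivity) but wrongly flags more truly impaired patients (lower specificity); that is exactly the trade-off described above.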

For years, I think the field has relied on well-known recognition-based memory tasks. These are excellent for specificity, making sure we don’t mislabel someone who is truly impaired: if someone fails one of these quote-unquote easy tasks, you can be statistically confident that effort is suboptimal, because they are so easy. That said, there is some recent research, I mean very recent, 2025 and 2026, suggesting that some of these legacy measures may be a little too easy, so they lack sensitivity; they don’t reliably catch low effort. There are some newer nonverbal measures, which I will generically refer to as multi-stage symptom validity tests, that are showing much higher sensitivity. They are harder to beat because they require consistent effort across different types of tasks.

In terms of what to avoid, the research would say to avoid older, quote-unquote face-valid measures that were designed before the current era of sophisticated performance validity research. If the client can guess what the test is measuring, it’s probably not a good validity test. And again, I am being purposefully vague here to maintain test security.

So there are a couple of things to consider here. One is the two-failure rule, and the other is the zero-dollar indicator. What do these mean?

So the American Academy of Clinical Neuropsychology, or AACN, and most major boards now lean toward something called the two-failure rule. Because human effort can fluctuate, a single failure is a red flag, but not necessarily a conviction. Two independent indicators, on the other hand, create a little bit of a chain of evidence, I suppose, that is statistically hard to ignore, especially if you’re using well-validated measures that get at poor effort.
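The statistical logic behind the rule is worth spelling out with numbers. Here is a quick sketch assuming a 10 percent per-test false-positive rate; that figure is made up for illustration, and real rates depend on the specific measures and population.

```python
# Why two failures carry more weight than one: a hypothetical illustration.
# The 10% per-test false-positive rate below is assumed for this sketch only.

per_test_false_positive = 0.10  # chance a credible patient fails one indicator

# If the indicators were fully independent, the joint false-positive rate is:
both_fail = per_test_false_positive ** 2
print(f"P(credible patient fails one):  {per_test_false_positive:.0%}")  # 10%
print(f"P(credible patient fails both): {both_fail:.1%}")                # 1.0%
```

Real indicators are correlated, so the true joint rate sits somewhere between those two figures, but the direction of the argument holds: two independent failures are far harder to explain away than one.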

So one of the most valuable tools here is an embedded indicator you are likely already using: Reliable Digit Span. This is a calculation derived from the standard working memory tests I’m sure you’re familiar with. By simply summing the longest span forward and the longest span backward, you get a validated indicator of effort. If that sum falls at or below the established 2026 normative cutoff, generally seven or lower in adult samples, your quote-unquote integrity alarm should be going off a little bit. It costs zero extra minutes and zero dollars to calculate; you’re already collecting the data, and it’s one of the most robust data points in your battery. That is, again, a single failure point, and you then compare it with any number of newer, well-developed standalone validity measures.
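Since Reliable Digit Span is just arithmetic, here is a minimal sketch. The span values are invented, and the cutoff of seven simply echoes the figure mentioned above; verify any threshold against current published norms before relying on it.

```python
# Reliable Digit Span (RDS) sketch: longest forward span + longest backward span.
# Example spans are invented; the cutoff of 7 echoes the figure discussed above.

def reliable_digit_span(longest_forward: int, longest_backward: int) -> int:
    """Sum of the longest digit spans achieved forward and backward."""
    return longest_forward + longest_backward

ADULT_CUTOFF = 7  # flag when RDS <= 7, per the cutoff mentioned in this episode

rds = reliable_digit_span(longest_forward=4, longest_backward=3)
if rds <= ADULT_CUTOFF:
    print(f"RDS = {rds}: embedded validity flag -- weigh alongside other indicators")
else:
    print(f"RDS = {rds}: no concern from this embedded indicator")
```

One embedded flag like this counts as a single failure point under the two-failure rule; it only becomes compelling alongside a second, independent indicator.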

Dr. Jeremy Sharp (08:54)

So what do we do if we determine that effort is not great? That is important; a lot of people ask this question. What exactly do we do? There is a thing called the feedback sandwich, which you have probably all heard of. You don’t have to be the effort police; you can go about this in a kinder way.

The feedback sandwich has three stages: the affirmation, the calibration, and the reset.


Now, before we move to how to actually discuss suboptimal effort with the client in the moment, let’s talk about the schedule of administration for these measures. If we’re going by the two-failure rule I just mentioned, that indicates we need to administer multiple validity indicators across the course of testing. For kids, there is a robust, well-validated PVT with five separate subtests that you can administer throughout the day; we love that measure. If you’re working with adults, you also have a few things at your disposal. You don’t want to wait until four hours into testing to administer your first validity measure. With the two-failure rule in mind, I would suggest administering something pretty close to the beginning of the battery, because that gives you an idea of how this person is approaching testing from the very start. Then, depending on the length of your battery, I would do another one toward the end, maybe an hour before you finish. And if your battery is long enough, you can add a third measure when you think effort may be lagging. Again, human effort and energy fluctuate. That’s the rationale, coupled with the two-failure rule, for administering different validity measures throughout the assessment process.
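If it helps to picture that spacing, here is a toy scheduling sketch. The task names and durations are placeholders for illustration, not a recommended protocol.

```python
# Toy sketch of PVT placement in a battery, following the spacing described
# above: one check near the start, one near the end, an optional mid check.
# Task names and durations are placeholders, not a recommended protocol.

battery = [
    ("Standalone PVT #1", 10),     # near the beginning: how is the client approaching testing?
    ("Cognitive tasks, block A", 90),
    ("Optional mid-battery PVT", 10),  # add if the battery is long and effort may lag
    ("Cognitive tasks, block B", 90),
    ("Standalone PVT #2", 10),     # roughly an hour before the end
    ("Wrap-up tasks", 45),
]

elapsed = 0
for task, minutes in battery:
    print(f"{elapsed:>3} min: {task} ({minutes} min)")
    elapsed += minutes
```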

So let’s transition to talking about how to talk with clients about poor effort in the moment. This is generically labeled the feedback sandwich: how do you handle a quote-unquote failing patient, a suboptimal performance, without destroying the alliance you have built? You don’t have to be the effort police, necessarily, and this little structure helps.

Dr. Jeremy Sharp (12:48)

Part one is the affirmation. This is essentially where you say, “Hey, I can see that this is really taxing work, and I very much appreciate your persistence in sticking with these tasks.” Then comes the calibration. This is where you might say something like, “However, I’m seeing some patterns in the data that are inconsistent. It’s like your brain is going into idle mode, or it’s not quite engaging with the tasks at your full potential.” So you’re acknowledging, “Hey, I see this, and I’m going to talk with you about it.” Then you present the third part, the reset. This is something like, “Let’s take a five-minute movement break. When we come back, I’m going to need your absolute game-day effort, because I want to make sure this is an accurate map of your strengths and abilities.”

So you don’t have to totally abandon the battery the instant the first validity indicator suggests suboptimal effort. You can give some feedback in the moment and then try to reset and recalibrate. In doing so, you have protected the test’s integrity, you have protected the clinical relationship, and you have given the client a bit of a graceful exit, so to speak, and a chance to try harder. You have also let them know that you notice they are not trying as hard as they possibly could, not giving optimal effort.

Now, again, there are a lot of branches we could follow on this topic. There is plenty of research on malingering and deliberate faking bad and what that looks like, but we’re not getting into that here. This is just, again, a primer, or maybe a reminder, about the importance of validity measures.

So, you know, poor effort is present in somewhere between 9 and 27 percent of adult ADHD evaluations, and the information we have suggests that kids are not much better; even in a pediatric population, a substantial minority of kids give poor effort for one reason or another. So make sure you’re picking the right tools. I did allude to a few tools you can use to look at validity, and we covered how to present that information to the client in the moment to try to ensure more valid, optimal performance.

Thanks for listening. Tomorrow we wrap up this first pillar with episode four, on writing good reports. We really can’t talk about this enough. We will discuss how to weave validity concerns, fatigue, diagnostic data, and all sorts of things into one coherent, meaningful story.

I will see you then.

