
Guest post by Dr. Lyndsey Zurawski

Assessments Are NOT Created Equal

For most Speech-Language Pathologists (SLPs), completing an evaluation is a relatively small part of the job. According to the 2020 ASHA Schools Survey, SLPs across all facility settings reported spending an average of 4 hours per week on diagnostic activities.

So, what are diagnostic activities? They can include formal evaluations, informal evaluations, scoring, analysis of results, screenings, observations, and report writing. Does that sound like more or less time than you spend on diagnostic activities? For SLPs with high caseloads and limited workload time allotted in their schedules, completing an evaluation can seem tedious.

That is to say, when an evaluation is required, whether an initial or a reevaluation, convenience and/or speed can sometimes override best practices.

We’ve all been there: an evaluation needs to get done, so we grab the quickest evaluation tool on our shelf and administer it. We score it, write up the report, and, along with observations, parent input, and classroom/intervention data, we go to eligibility or ineligibility.

So, what about best practices? Administering that one quick test is NOT best practice. However, we have all been there, and when I say “we,” I mean me!

What Are Best Practices? Why Assessments Are NOT Equal

First, let’s look at what should be included in a comprehensive evaluation:

  • Case History/File Review
  • Hearing and Vision Screenings
  • Observations in more than one setting
  • Assessments in all areas of suspected disability 
  • Culturally and linguistically sensitive assessments
  • Recommendations and Summary

Second, let’s jump to assessments in all areas of suspected disability. Do you have a favorite assessment tool? I don’t just have a favorite tool; I have a favorite battery of assessments. To clarify, part of my job is as a diagnostician conducting in-depth language and literacy evaluations.

My assessment battery has to be more comprehensive than what I would typically use for my caseload evaluations and reevaluations. However, that should not stop me (or you) from using best practices when conducting an evaluation.

It is also important for us, as SLPs, to consider diagnostic accuracy, which includes sensitivity and specificity. I’m not going to get super nerdy and discuss the ins and outs of psychometric properties; instead, I’d like to refer you to a page on The Informed SLP that is easy to read and digest.

Table: All Assessments Are NOT Equal

Test      Age               Description
OWLS-II   (not listed)      Oral and written language scales
TILLS     6 thru 18         Comprehensive language and literacy assessment
OPUS      5-0 thru 21-0     Measures listening (auditory) comprehension and memory skills
LPT-3     5-0 thru 11-11    Measures strengths and weaknesses of the language hierarchy
ITPA-3    5-0 thru 12-11    Assessment of spoken analogies, morphology, syntax, reading/written language, decoding, and encoding
TOAL-4    12-0 thru 24-11   Measures spoken and written language
CAPs      7 thru 18         Assessment of social language

As you read, I’m sure you’re thinking I should hurry up and get to the part where I share which tests to use and not use. It is not as simple as that, so I am going to share what I use and why.

To clarify, this is based on my own clinical experience, along with information related to the factors above. It is best practice to administer more than one assessment when conducting an evaluation, because all assessments are NOT equal.


The important thing to know about sensitivity and specificity, simply speaking, is that while no test will be at 100%, a test is judged to be “good” if it is 90% to 100% accurate, “fair” if it is accurate 80% to 89% of the time, and anything less than 80% is considered unacceptable (Plante & Vance, 1994).
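Since those two percentages carry so much weight in test selection, here is a minimal sketch of how they are calculated. The counts below are my own hypothetical illustration, not data from any real test manual:

```python
# Sketch: computing sensitivity and specificity from a validation sample.
# All numbers here are hypothetical, for illustration only.

def sensitivity(true_positives, false_negatives):
    """Proportion of children WITH a disorder whom the test correctly flags."""
    return true_positives / (true_positives + false_negatives)

def specificity(true_negatives, false_positives):
    """Proportion of typically developing children the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)

# Suppose a validation sample has 100 children with a confirmed language
# disorder and 100 typically developing peers:
sens = sensitivity(true_positives=92, false_negatives=8)    # 0.92
spec = specificity(true_negatives=85, false_positives=15)   # 0.85

print(f"Sensitivity: {sens:.0%}")  # 92% -> "good" (90-100%)
print(f"Specificity: {spec:.0%}")  # 85% -> "fair" (80-89%)
```

Using the Plante & Vance cutoffs, this hypothetical test would be “good” at catching true disorders but only “fair” at avoiding over-identification of typically developing children.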

Would you like a resource that puts this information at your fingertips? Our friends over at the Virginia Department of Education created an SLP Test Comparison Card that includes it. In 2018, they added a few additional tests as a supplemental card.

As diagnosticians, which each of us is, it is our job to pick and choose the assessment tools that can reliably and validly identify a speech, language, literacy, and/or social communication disorder.

How do we do that? There’s no simple answer, but it comes down to using best practices, including evidence-based practice, clinical judgment, and experience, while considering cultural and linguistic diversity and sensitivity in all of our evaluations.

Oral and Written Language Scales-II (OWLS-II)

One of the most commonly used assessment tools, the Oral and Written Language Scales-II (OWLS-II), does not report sensitivity and specificity. Information for other common assessments, such as the Test of Integrated Language and Literacy Skills (TILLS), the Comprehensive Assessment of Spoken Language-2nd Edition (CASL-2), and the Clinical Evaluation of Language Fundamentals-5th Edition (CELF-5), is included on the Virginia DOE card.

Another factor is cultural and linguistic sensitivity/bias in our assessments. One way we can ensure that we don’t under- or over-identify students is to utilize dynamic assessment, especially with students from diverse backgrounds. Ireland (2019) reported that sensitivity and specificity for dynamic assessment have been documented at up to 100%.

TILLS (for students 6 thru 18)

Most frequently, I utilize the TILLS (for students 6 thru 18) because it is a comprehensive language and literacy assessment with solid sensitivity and specificity. It provides detailed information for interpreting the results, including the relation/impact to curriculum-based content.

As an added bonus, there’s a report-writing template available to use for your own reports. While it is lengthy to administer, I could use this test alone and feel comfortable with my results. However, I typically supplement with other assessment tools that are not considered global language assessments.

Oral Passage Understanding Scale (OPUS)

In addition, another assessment tool I use, which is often little known or underrated, is the Oral Passage Understanding Scale (OPUS) for students 5-0 thru 21-0. The publisher, Western Psychological Services (WPS), describes it in the overview as “…a new measure of listening (auditory) comprehension.

It evaluates a person’s ability to listen to passages that are read aloud and recall information about them.

This ability is therefore key to success in the classroom, as well as in social and occupational settings.

Furthermore, the OPUS also measures memory skills, which are integral to listening comprehension.” What I like most about this assessment are the length and complexity of the passages, along with the types of questions, which closely align with the kinds of questions students may be asked in a classroom environment.

Language Processing Test-3rd Edition (LPT-3)

If you are looking to delve into where a student’s breakdown occurs within the language hierarchy, you may want to administer the Language Processing Test-3rd Edition (LPT-3) for students 5-0 thru 11-11.

This test is fairly quick to administer and can provide insight into where a child’s strengths and weaknesses are along the language hierarchy.

Most importantly, it is a great tool to re-administer after a child has received therapy for some time to demonstrate the growth and progress in the weak areas. 

Illinois Test of Psycholinguistic Abilities-3rd Edition (ITPA-3)

For another global language test (that is lesser known), I like to administer the Illinois Test of Psycholinguistic Abilities-3rd Edition (ITPA-3) for students 5-0 thru 12-11. I got my hands on this assessment when my boss was changing positions. She told me, “This is an oldie but a goodie.” I did not understand how true that statement was until I administered the test.

Importantly, this test includes the assessment of spoken analogies, spoken vocabulary, morphology, syntax, reading comprehension, written language, decoding, and encoding. It also does not take a significant amount of time to administer in its entirety.

Test of Adolescent and Adult Language-4th Edition (TOAL-4)

For older students, similar to the ITPA-3, I like to administer the Test of Adolescent and Adult Language-4th Edition (TOAL-4) for students 12-0 thru 24-11. This test measures spoken and written language, including word opposites, word derivations, spoken analogies, word similarities, sentence combining, and orthographic usage. Again, this test provides a solid overview of students’ strengths and weaknesses without taking too long to administer.

Clinical Assessment of Pragmatics (CAPs)

If you are looking to assess social language, consider the new Clinical Assessment of Pragmatics (CAPs), for students 7 thru 18. Why do I like this assessment?

  • It is hard to find a solid standardized pragmatic assessment.
  • It has digital video scenes that depict real-life scenarios (as much as is feasibly possible).
  • It looks at a variety of aspects of social language skills, including pragmatic judgment (comprehension), performance of pragmatic language (expression), understanding context and emotions (intent of the speaker through inference, sarcasm, and indirect requests), and nonverbal cues (facial expressions, prosody, and gestures).
  • It helps determine a student’s strengths and weaknesses.

School-Age Language Measures (SLAM)

A lesser-known tool, an informal assessment measure, is the School-Age Language Measures (SLAM) from the LEADERS Project, which I use in almost every language assessment I give. So, why do I like the SLAM cards? Straight from the LEADERS Project, SLAM cards are “meant to elicit a language sample that can be analyzed in the context of typical language development as well as the child’s background (e.g., educational experiences, family, linguistic and cultural background, etc). So, for this reason, no scores are included.” They have even updated the site to include guidelines for analysis. And, the best part: IT IS FREE!

Other Tests You Should Know: All Assessments Are NOT Equal

In conclusion, some additional tests that I’m not going to go in-depth about, but that I love having in my assessment toolkit, are the School Motivation and Learning Strategies Inventory (SMALSI), the Gray Oral Reading Test-5th Edition (GORT-5), and the Phonological Awareness Test-2nd Edition: NU (PAT-2:NU).

Now, you may have noticed I did not mention the OWLS-II, CELF-5, or CASL-2. I will and have administered all of them, but I have found weaknesses within each of these assessments.

Let me mention a few. For the OWLS-II, there is often not enough information that can be gleaned simply from administering the Listening Comprehension and Oral Expression subtests. In regard to the reading subtest, the rigor is just not there, and the same goes for the written language subtest.

That is to say, the information that can be obtained from administering these subtests is limited; given the choice, there are better options. In regard to the CELF-5, the primary reason I do not administer it often is that it relies heavily on auditory input and memory.

Many of the students we work with or are assessing have difficulty with auditory memory, short-term memory, and attention, which can negatively impact the scores obtained on the assessment. As for the CASL-2, I do like the assessment, but it is lacking in some areas, including diversity.

It also can take quite a long time to administer in its entirety. While the subtests can stand alone, if I am going to administer the CASL-2, I prefer to give the entire assessment so I can look at the strengths and weaknesses across subtests. 


So, if all assessments are not equal, how do you find the best ones? If you’ve read this far, I’d love to hear what your favorite assessment tools are and why! Maureen and I will be doing a live chat on Instagram where we will candidly discuss assessments and try to answer any questions you have!

You can follow me at @speechtothecore on IG, FB, and Twitter. 

Guest post by Lyndsey Zurawski, SLP.D, CCC-SLP


American Speech-Language-Hearing Association. (2020). 2020 Schools survey. Survey summary report: Numbers and types of responses, SLPs.

Brydon, M. (2018, September). How do we judge the quality of a standardized test? The Informed SLP.

Ireland, M. (2019, July). Dynamic Assessment. Paper presented at FLASHA Annual Convention, At Sea, FL.

LEADERS Project. (2020). SLAM The Crayons Picture. LeadersProject.Org.

Plante, E. & Vance, R. (1994). Selection of preschool language tests: A data-based approach. Language, Speech, and Hearing Services in Schools, 25, 15-24.

Virginia Department of Education. (2015). SLP Test Comparison. 

Western Psychological Services. (2020). Clinical Assessment of Pragmatics. WPS.

Western Psychological Services. (2020). Oral Passage Understanding Scale. WPS.



15 Responses

  1. Oh! Thank you for this post. I am pretty much only an evaluator, and I like that you’ve included information on assessments I do AND don’t have (OPUS, CAPs, ITPA). I avoid the CELF for the same reasons as you do, and I use the LPT for the same reasons you do. I’ve just started to use the TILLS but am not sure if I can use it to meet our state regs yet. I do like using the CASL best for an overall receptive/expressive, and I also like the TAPS-4, the Listening Comprehension Test, and the CELF-Meta. I’m facilitating a discussion on assessments in April; you’ve given me some food for thought!

  2. It is interesting that many of the assessments spoken about include decoding/encoding and other very specific reading skills. In our district we have reading specialists that specifically deal with reading needs, and therefore do the assessments and treatments for this area. I think it is very important to “stay in your lane” as an SLP as our scope of practice is so incredibly large that we could end up servicing the entire school’s IEP caseload if we did not. I found it somewhat curious that the author supports the use of the OPUS and not the CELF, with her reasoning being that the CELF relies too much on auditory stimuli. I have found that any students with any memory, attention, or auditory differences/disabilities do extremely poorly on the OPUS. Very interesting post- thank you!

    1. We know all tests have their ups and downs. The skills for encoding and decoding do fall into our scope, which we know is vast. Encoding, decoding, and phonological awareness tasks are becoming more common in assessments as more and more research shows the ties between poor phonological skills and negative impacts on things like vocabulary. While we can’t service everyone, because that would be nuts 😉 we can easily incorporate activities to address these needed skills while putting our focus on other language areas of need.

  3. Great information!
    If you had a chance to purchase just one of the mentioned general language tests, which one would it be?
    Thank you for the comparison chart.

  4. Thanks for your opinions. I agree with the OPUS and the CAPs. Love those two instruments. Would love to use the TILLS, too. However, I don’t like the LPT-3 at all. Besides, it’s pretty old for a test. It was last published in 2005 and there are currently no plans on updating the norms. There are more current instruments available.

  5. Thanks so much for your post and for explaining your reasoning behind choosing assessments. I especially appreciate your emphasis on cultural and linguistic sensitivity.

    1. Thank you so much for this post!
      I was wondering if you have an updated link for the test comparison chart from Virginia? I’ve seen the chart before and it’s fantastic but can’t seem to find it anymore. Thanks!

  6. This is great information for SLPs working with school age children. Do you have any recommendations for preschool assessments?

  7. Hi 🙂 While I appreciate the review of newer and little-known tests, as well as the nod toward psychometrics, I think it’s important to frame those within a more in-depth discussion re: how they relate to the sampled demographics. While psychometric sensitivity and specificity at the thresholds mentioned are important when evaluating which tests to use, it is also important to recognize that these measures themselves are skewed and highly inaccurate within the context of local norms vs. US demographics because they are based on averages heavily weighted in favor of Mainstream American English-speaking and predominantly Caucasian students. In other words, if a test is normed on a sample that is 52% Caucasian (US 2012 demographics), then this will shift the average “normative” performance toward a Caucasian average performance, wherein lies the difficulty of determining actual language impairment vs. dialectal difference. Furthermore, because the standard deviation is based on averages across the broader, national demographic, this will further disadvantage students in regions where local norms may trend lower due to a host of specific demographic differences (e.g., SES, race/ethnicity, English language proficiency, dialectal differences, etc.). I am concerned that these issues are still only infrequently being addressed and, when they are acknowledged, they are not fully discussed, such as in the above review. Without this information the above review cannot adequately guide SLPs in test selection when the populations in which they work vary considerably from a predominantly Caucasian or middle-class, Mainstream American English-speaking student body. Also, while it has been neglected from the above review, it should be noted that there ARE in fact assessments designed to minimize cultural/linguistic/racial/SES/etc. bias in testing, such as the DELV series (Diagnostic Evaluation of Language Variation, Norm Referenced and Screening Test).
It truly is of the utmost importance that we step outside of the limitations of our own cultural, racial, SES, gender, and academic biases in order to be present and aware for our students where their needs actually lie. Regardless, thank you for the information that you have provided.

    1. I’m just a parent prepping for an IEP meeting, who wandered through this space, but I’m so grateful for this perspective/reminder. Seriously, I appreciate you taking the time for that.
