Consumer’s Guide to Social and Emotional Assessment, Part I: How To Know if an Assessment is Up to Snuff

A Consumer’s Guide to Social and Emotional Assessment

Before getting into the business of assessing student social and emotional competence, most people would agree that it is wise to critically evaluate assessment options. Here’s the catch: Most people don’t have years of graduate training in psychometrics, so it’s hard to know what’s up to snuff and what is not. I wrote my book on SEL assessment to provide educators with guidance about how to choose and use SEL assessments wisely, and it includes information you can use to be a smart consumer of assessment products.

Blog Series Overview

This blog series lays out key information on how to know whether an SEL assessment has the technical requirements to accomplish your assessment goals. Upcoming blog posts will demystify the concepts of reliability and validity and help you understand how to evaluate whether an assessment’s scores are reliable (enough) and valid (for an intended use).

Judging Whether an SEL Assessment is Up to the Job

This post focuses on how to judge an assessment’s ability to do what you want it to. The most important thing is to be clear about what you want your assessment to do. When you have assessment scores in hand, how will you use them? What decisions will you make with the information? Only if you are clear about your assessment goals can you judge the quality of an assessment.

Why is that? Like any other tool, every single assessment out there is good at some things and not good at others. For example, a high-quality standardized math test is a wonderful tool for assessing math competency. It’s a lousy measure of reading competence.

TIP

You can find good information about available SEL assessments at these two websites:

Fortunately, there are a small number of social and emotional assessment goals that educators most commonly pursue. Those include using assessment data to:

Guide instruction—what to teach to whom and when
Assess student growth
Measure program impact
Compare student competence to age-mates around the country
Identify students at risk for a condition

Consider carefully which of these is most important. There’s no right or wrong answer. Actually, there is a wrong answer. If you say, “We want to do all of these things,” that’s a red flag, particularly if you say that all of these goals are equally important, that you want one simple assessment to do it all, and that the assessment has to be free.

Remember: No one assessment can do it all. Figure out what you want the assessment to do before you select the one you’ll use.

Example 1: The Right SEL Tool for the Job

So let’s say you want an SEL assessment to guide your teachers as they decide what skills to teach to whom at what point in the year. Great. That’s a specific goal. The next step is to evaluate the ability of your assessment options to accomplish this goal. Table 1 below summarizes the desirable characteristics of assessments for each of the common goals listed above. For example, to guide what skills to teach to whom, the assessment should measure the competencies you want to teach with enough specificity that you can use scores to make those decisions. In addition, scores should be reliable.

The table also recommends what you should do before deciding on a particular assessment. To use an assessment to decide what to teach to whom, for example, you would be well-advised to understand the assessment content, and confirm that the content is related to what you want to teach. In addition, you should review evidence of score reliability, which I’ll talk about in an upcoming blog post.

Finally, the table provides general guidance about how you’ll know when an assessment has the characteristics it needs to be used for each specific goal.

TABLE 1: Matching the assessment to the goal

If you want to…	The assessment should…	You should…	You’ll know it’s okay if…
Use scores to guide what you teach to whom	Measure competencies you want to teach, with enough specificity that you know what to teach to whom. Scores are reliable, meaning repeatable and consistent.	Be familiar with the assessment content and verify it is designed to measure what you intend to teach. Review evidence of score reliability.	It’s clear how assessment scores are related to what you intend to teach. There is evidence that each score measures a distinct skill. Score reliabilities are .80 or above.
Evaluate student growth	Be sensitive to change in student skill level over time.	Understand whether scores change over time.	Data shows that performance improves with age.
Evaluate program impact	Be sensitive to program effects; be designed to measure what the program teaches.	See “…guide what you teach to whom.” Review evidence that students exposed to high-quality instruction score better than a control group.	See “…guide what you teach to whom.” There are field trials that included the assessment as an outcome measure.
Use scores to understand how your students’ performance compares to the general population.	Be nationally normed—that is, students around the country should have completed the assessment to determine how to compute student scores.	Review documentation of the characteristics of the norming group to see how similar they are to your students.	The norming group is large, reflects the country’s diversity, and includes students like yours.
Screen to identify students at risk for a disorder or condition	Efficiently identify students who may have a specific condition or disorder.	Know the rates of false positives and false negatives you can expect from the screener. Know how you will follow up with those who screen positive.	The documented false positive and false negative rates are acceptable given your goals.

So let’s say you’re reviewing our assessment, SELweb. You are using the Sanford Harmony program to teach SEL skills. First question—does SELweb assess the competencies you intend to teach? The program alignment below shows considerable overlap between the content of the assessment and the curriculum. Check.

Caveat Emptor

Some SEL assessments claim to measure several distinct competencies, but not all of them show evidence that this is true. I suspect that in some cases these scores, which supposedly measure distinct competencies, are so highly correlated that the better conclusion is that they are all measuring one thing.
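
If you have students’ scores in hand, one rough way to kick the tires on a “distinct competencies” claim is to look at how strongly two scores correlate. Here is a minimal Python sketch. The score lists are invented for illustration, the subscale names are placeholders, and the 0.85 cutoff is an assumed rule of thumb rather than a published standard. A high correlation in a small sample is a prompt to ask the publisher for better evidence, not proof on its own.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six students on two subscales (made-up numbers).
subscale_a = [92, 105, 88, 110, 97, 101]
subscale_b = [93, 104, 90, 109, 96, 102]

REDUNDANCY_CUTOFF = 0.85  # assumption for illustration, not a standard

r = pearson_r(subscale_a, subscale_b)
looks_redundant = r > REDUNDANCY_CUTOFF
print(f"r = {r:.2f}; the two scores may be measuring one thing: {looks_redundant}")
```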

Do the scores give you specific enough information to differentiate instruction? Well, we have really good data showing that SELweb scores measure four correlated but mostly distinct competencies, and each competency is associated with different lessons in the Harmony program. Check.

Are the scores reliable enough for this purpose? SELweb’s score reliabilities are in the .80 to .90 range. In addition, SELweb reports show confidence intervals, which means that you can see the range of scores any given child is likely to achieve on repeated measurement. Check.
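
To see how reliability connects to a confidence interval, here is a minimal Python sketch using the standard error of measurement from classical test theory: SEM = SD × sqrt(1 − reliability). The score scale (mean 100, SD 15) and the child’s score of 100 are assumptions for illustration. They are not SELweb’s actual values, and SELweb’s reports may compute intervals differently.

```python
from math import sqrt

def confidence_interval(score, sd, reliability, z=1.96):
    """Approximate 95% interval around an observed score.

    Uses the classical standard error of measurement:
    SEM = SD * sqrt(1 - reliability).
    """
    sem = sd * sqrt(1 - reliability)
    return (score - z * sem, score + z * sem)

# Assumed scale (SD = 15) and observed score (100), for illustration only.
for reliability in (0.80, 0.90):
    low, high = confidence_interval(score=100, sd=15, reliability=reliability)
    print(f"reliability {reliability:.2f}: about {low:.0f} to {high:.0f}")
```

Under these assumptions, a reliability of .80 gives an interval of roughly 87 to 113, and .90 narrows it to roughly 91 to 109. That is why higher reliability matters more when you are making decisions about individual students.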

So SELweb holds up favorably under scrutiny as a tool for deciding what to teach to whom in this situation.

Example 2: The Wrong SEL Tool for the Job

Let’s say you wanted a screener to identify students at risk for an emotional or behavioral disorder. Let’s look at SELweb again. Does it identify students who may have a disorder? Not really; at least we have no evidence that it does so. Do we know the rates of false positives and false negatives to expect if we use a SELweb score cutoff to identify students as at risk? Nope. We have no data to that effect.
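
For a screener that does publish validation data, the false positive and false negative rates come from comparing screening results against a trusted follow-up evaluation. A minimal Python sketch of that arithmetic follows. The counts are invented for illustration and do not describe SELweb or any real screener.

```python
def screening_error_rates(true_pos, false_pos, true_neg, false_neg):
    """Return (false positive rate, false negative rate) for a screener.

    False positive rate: share of students WITHOUT the condition who were flagged.
    False negative rate: share of students WITH the condition who were missed.
    """
    false_positive_rate = false_pos / (false_pos + true_neg)
    false_negative_rate = false_neg / (false_neg + true_pos)
    return false_positive_rate, false_negative_rate

# Hypothetical validation study of 500 students (made-up counts):
# 50 have the condition (40 flagged, 10 missed);
# 450 do not (45 flagged in error, 405 correctly cleared).
fpr, fnr = screening_error_rates(true_pos=40, false_pos=45,
                                 true_neg=405, false_neg=10)
print(f"false positive rate: {fpr:.0%}, false negative rate: {fnr:.0%}")
```

In this made-up study the screener wrongly flags 10% of students without the condition and misses 20% of students who have it. Whether those rates are acceptable depends on your goals and on how you plan to follow up with students who screen positive.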

In this case, the very same assessment fails to measure up, so to speak, for this particular measurement goal.

Conclusion

As you consider what SEL assessment to adopt, be clear about what decisions you’ll make with the scores, and really kick the tires—check to make sure that the assessment you’re considering is up to the job. Use Table 1 as a starting point. For a more in-depth treatment, read my book.