Blog Entry 4
Rubrics and Scoring Guides: What are they, and how can they be both reliable and valid?
Reliability: Testing each participant using the same equipment every time; Validity: Using a tree's growth rings to measure the age of the tree.
According to Carol Boston (2002), a rubric has essentially two uses: to evaluate students and to enhance their learning. Using a rubric or scoring guide to evaluate students helps teachers be more consistent in scoring, and it also helps the teacher define exactly what will be evaluated and the quality that is being sought, thus weeding out unnecessary information that could cloud the activity. Rubrics can also provide students with the exact information their work will be judged on, allowing them to focus on and improve their learning of the task at hand without unwarranted guessing.
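To make the consistency point concrete, here is a minimal sketch, in Python, of a rubric as a simple data structure: every criterion is named explicitly and every performance level carries a fixed point value, so anyone scoring from the same table is choosing among the same options. The criteria, level names, and point values below are hypothetical, invented purely for illustration; they are not drawn from Boston (2002).

```python
# A minimal sketch of a rubric as a data structure.
# The criteria, level names, and point values are hypothetical,
# invented for illustration; they are not from Boston (2002).

RUBRIC = {
    "thesis":       {"excellent": 4, "adequate": 3, "developing": 2, "missing": 1},
    "evidence":     {"excellent": 4, "adequate": 3, "developing": 2, "missing": 1},
    "organization": {"excellent": 4, "adequate": 3, "developing": 2, "missing": 1},
}

def score(ratings):
    """Total score for one student, given the level chosen for each criterion."""
    return sum(RUBRIC[criterion][level] for criterion, level in ratings.items())

# Example: one student's work rated against the rubric.
print(score({"thesis": "excellent", "evidence": "adequate", "organization": "developing"}))  # 9
```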
Boston (2002) suggests using a metarubric, that is, a rubric to judge the quality of the rubrics being used in a classroom or course. She says a good metarubric includes the following four areas: 1) Content/Coverage, 2) Clarity, 3) Practicality, and 4) Technical Quality/Fairness. Using this metarubric to judge our own rubrics will help tighten them up and clean out any redundancy or ambiguity. Once a rubric is ready to roll, she suggests an activity in which students use the rubric themselves to judge sample work, familiarizing them with how it will function.
Two other areas of concern when developing rubrics for evaluating student performance are validity and reliability. The validity of a rubric or other scoring device is the degree to which the device actually assesses what it was meant to assess. Boston (2002) describes three types of evidence that are commonly used to examine the validity of an assessment tool: content-related, construct-related, and criterion-related evidence.
When looking at content-related evidence, ask the question, “Are students being assessed on their knowledge of the subject at hand, or are they being assessed on their ability to interpret and understand the question?” If they are being assessed on both their subject knowledge and their grammar skills, then the categories of the rubric should reflect both.
A student’s reasoning processes are the focus of construct-related evidence. Reasoning is a very personal process, internal to every individual, so when looking at the construct, both the correct response and the reasons for that response must be viewed together.
Criterion-related evidence is based on, and should reflect, the outcomes of a current or future situation. If a course is preparing a student to work in a certain field, then the assessment rubric should reflect the same criteria as the future workplace.
To determine which of these three types of evidence should be gathered, consider what type of material is being assessed. If the material is knowledge based, then content-related evidence is warranted. If the material involves reasoning, then construct-related evidence should be utilized. And, finally, if the purpose of the material is to show how a student will perform outside of the learning environment, then criterion-related evidence is warranted. It is possible, however, that all three could be used.
An assessment tool that is reliable is one that will produce the same score for each student regardless of who does the scoring or when the scoring is done. Two types of reliability are interrater and intrarater reliability, the rater being the person doing the rating. Interrater reliability refers to the variability between the persons doing the scoring, including the amount of subjectivity allowed to each rater. Intrarater reliability refers to a single rater's consistency over time, which can be undermined by outside influences, such as fatigue, that affect the rater's ability to score consistently.
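To put a rough number on interrater reliability, two raters' scores can be compared directly. Below is a minimal sketch, in Python, of two common measures: simple percent agreement and Cohen's kappa, which corrects agreement for chance using the formula κ = (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e is the agreement expected by chance. The raters' scores here are made-up data for illustration, and neither measure comes from Boston (2002).

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which two raters gave the same score."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohens_kappa(a, b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    ca, cb = Counter(a), Counter(b)
    # Chance agreement: probability both raters pick the same category at random.
    p_e = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Hypothetical scores from two raters grading the same ten papers on a 1-4 scale.
rater1 = [4, 3, 3, 2, 4, 1, 2, 3, 4, 2]
rater2 = [4, 3, 2, 2, 4, 1, 3, 3, 4, 2]
print(percent_agreement(rater1, rater2))          # 0.8
print(round(cohens_kappa(rater1, rater2), 3))     # 0.722
```

A kappa near 1 indicates strong agreement beyond chance, while a value near 0 suggests the raters agree no more often than random scoring would.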
Boston (2002) asserts that no matter what is included in the rubric or what is being assessed, the scoring device should always be shared with the persons being scored prior to the evaluation, so that students have the opportunity to prepare their work in a way that shows they have met the criteria.
I found the three types of evidence commonly used to examine the validity of an assessment tool to be very interesting. I have not been exposed to this before, but it makes sense to choose an instrument depending on whether you are assessing knowledge, reasoning, or future outcomes. I would like to explore creating each of these types of rubrics a bit further.
Boston, C. (2002). Understanding scoring rubrics: A guide for teachers. College Park, MD: ERIC Clearinghouse on Assessment and Evaluation, University of Maryland.