Online Book Reader


5 Steps to a 5 AP Psychology, 2010-2011 Edition - Laura Lincoln Maitland [138]

conditions as all other proctors. All scorers must use the same scoring system, applying the same standards to rate responses as all other scorers. Thus, we should earn the same test score no matter where we take the test or who scores it.

Reliability and Validity


Not only must a good test be standardized, it must also be reliable and valid.

Reliability

If a test is reliable, we should obtain the same score no matter where, when, or how many times we take it (if other variables remain the same). Several methods are used to determine whether a test is reliable.

In the test-retest method, the same exam is administered to the same group on two different occasions and the two sets of scores are correlated. The closer the correlation coefficient is to 1.0, the more reliable the test. The problem with this method of determining reliability, or consistency, is that performance on the second test may be better because test takers are already familiar with the questions.

In the split-half method, the score on one half of the test questions is correlated with the score on the other half of the questions to see if they are consistent. One way to do that is to compare the score on all the odd-numbered questions with the score on all the even-numbered questions.

In the alternate form method, also called the equivalent form method, two different versions of a test on the same material are given to the same test takers, and the scores are correlated. The SAT given on Saturday is different from the SAT given on Sunday in October; there are different questions on each form. Although this does not happen, if the same people took both exams and the tests were highly reliable, the scores should be the same on both tests. Consistent scores would also require high interrater reliability, the extent to which two or more scorers evaluate the responses in the same way.
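The test-retest and split-half methods both come down to computing a correlation coefficient, which can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`pearson_r`, `retest_reliability`, `split_half_reliability`) and the small score lists are made up for the example and do not come from any real test or from the book.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return covariation / sqrt(spread_x * spread_y)


def retest_reliability(first_scores, second_scores):
    """Test-retest method: correlate scores from two sittings of the same exam."""
    return pearson_r(first_scores, second_scores)


def split_half_reliability(item_scores):
    """Split-half method using an odd/even split.

    item_scores holds one list per test taker, with 1 for a correct answer
    and 0 for an incorrect answer on each question, in question order.
    """
    # Index 0 is question 1, so the slice [0::2] picks the odd-numbered questions.
    odd_totals = [sum(answers[0::2]) for answers in item_scores]
    even_totals = [sum(answers[1::2]) for answers in item_scores]
    return pearson_r(odd_totals, even_totals)


# Hypothetical data: four test takers who sat the same exam twice.
first_sitting = [10, 20, 30, 40]
second_sitting = [12, 22, 32, 42]  # everyone gained 2 points on the retest
retest_r = retest_reliability(first_sitting, second_sitting)

# Hypothetical data: three test takers answering a four-question test.
answers_by_taker = [
    [1, 1, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
split_half_r = split_half_reliability(answers_by_taker)

print(f"test-retest r = {retest_r:.2f}")
print(f"split-half r  = {split_half_r:.2f}")
```

In the hypothetical retest data every score rises by the same amount, so the correlation is still 1.0 even though no one earned an identical score twice. The correlation coefficient captures how consistently test takers are ranked across sittings, which is why a uniform practice effect does not lower it.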

Validity

Tests can be very reliable, but if they are not also valid, they are useless for measuring the particular construct or behavior. Psychometricians must present data to show that a test measures what it is supposed to measure accurately, and that the results can be used to make accurate decisions. Because there are no universal standards against which test scores can be compared, validation is most frequently accomplished by obtaining high correlations between the test and other assessments. Validity is the extent to which an instrument accurately measures or predicts what it is supposed to measure or predict. Just as there are several methods for measuring reliability, there are also several methods for measuring validity.

• Face validity is a measure of the extent to which the content of the test appears, in the judgment of the test takers, to measure all of the knowledge or skills that are supposed to be included within the domain being tested. For example, we expect the AP Psychology exam to ask between five and seven questions dealing with testing and individual differences on the multiple-choice section of the test, as defined by the content outline for the course, which sets the structure and boundaries for the content domain.

• Content validity is a measure of the extent to which the content of the test measures all of the knowledge or skills that are supposed to be included within the domain being tested, according to expert judges.

• Criterion-related validity is a measure of the extent to which a test’s results correlate with other accepted measures of what is being tested.

• Predictive validity is a measure of the extent to which the test accurately forecasts a specific future result. For example, the SAT is designed to predict how well someone will succeed in his/her freshman year in college. High scores on the SAT should predict high grades for the first year in college.

• Construct validity, which some psychologists consider the true measure of validity, is the extent to which the test actually measures the hypothetical construct or behavior it is designed to assess. The MMPI-2 (described in Chapter 14) has a clinical trial set of questions for schizophrenia. This test has construct validity if this subset
