Assessment & Testing Flashcards

Question

An IQ test has an SEM of 3, and Tom scores 106. About 68% of the time, his score will fall between: **What range?**

* A. 100 and 103
* B. 100 and 106
* C. 103 and 109
* D. Higher than someone with 139

Answer 1

C. 103 and 109 ## Footnote 68% confidence interval = score ±1 SEM → 106 ± 3 = 103–109.
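
The ±1 SEM arithmetic in this card can be sketched in Python. This is a minimal illustration rather than part of the original deck; the function name `sem_band` and its `n_sem` parameter are assumptions made for the example.

```python
def sem_band(observed, sem, n_sem=1):
    """Return (low, high) = observed score ± n_sem × SEM."""
    half_width = n_sem * sem
    return observed - half_width, observed + half_width


# Tom's score of 106 with an SEM of 3, at roughly 68% confidence (±1 SEM).
print(sem_band(106, 3))  # (103, 109)
```

Passing `n_sem=2` widens the band to the roughly 95% range used in other cards.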

Answer 2

A. Spiral test ## Footnote Spiral tests arrange items in order of increasing difficulty, which is appropriate for gauging ability across a range of skill levels.

Answer 3

B. Power test ## Footnote Power tests measure ability without time constraints, focusing on item difficulty rather than speed.

Answer 4

B. Olivia scored 100, Lucas scored 130 ## Footnote Correct answer: B.

Answer 5

A. 82–98 ## Footnote Since the question asks for about 95% of the time, use ±2 SEM around the obtained score, which gives 82–98.

Answer 6

B. Equivalent forms reliability ## Footnote Equivalent (or alternate) forms reliability involves administering different versions of the same test to the same group and correlating the results. Similar results suggest strong reliability.

Answer 7

B. Forced-choice format ## Footnote Forced-choice tests require the test-taker to choose from limited, fixed options. The NCE is multiple-choice, making it a forced-choice instrument, unlike projective or ipsative measures.

Answer 8

A. Raymond Cattell ## Footnote Raymond Cattell developed the 16 Personality Factor Questionnaire.

Answer 9

A. Reliable but not valid ## Footnote Reliability refers to consistency, while validity refers to accuracy. A test can be consistent but still measure the wrong construct.

Answer 10

B. Any form of mental testing or measurement ## Footnote Psychometrics is the science of measuring mental functions and abilities through testing.

Answer 11

B. Cohort effects related to educational/cultural differences ## Footnote Cross-sectional studies can reflect generational (cohort) differences, not actual cognitive decline.

Answer 12

D. Accuracy ## Footnote Validity is the degree to which an instrument measures what it claims to measure—its accuracy.

Answer 13

A. Sophia scored 70, Jake scored 50. ## Footnote Sophia: z = 0 → score = 70; Jake: T = 30 → z = -2.0 → 70 - 20 = 50.
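
The conversion chain in this footnote (T-score to z-score to raw score) might be sketched as follows. The helper names are hypothetical, and the test mean of 70 and SD of 10 are inferred from the footnote's arithmetic rather than stated in it.

```python
def t_to_z(t_score):
    """Convert a T-score (mean 50, SD 10) to a z-score."""
    return (t_score - 50) / 10


def z_to_raw(z, mean, sd):
    """Convert a z-score to a raw score on a test with the given mean and SD."""
    return mean + z * sd


# Assumed test parameters: mean 70, SD 10 (inferred from the footnote).
sophia = z_to_raw(0, 70, 10)          # z = 0 lands on the mean
jake = z_to_raw(t_to_z(30), 70, 10)   # T = 30 is 2 SDs below the mean
print(sophia, jake)  # 70 50.0
```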

Answer 14

D. .25 ## Footnote Difficulty index of .25 means only 25% are expected to answer correctly.

Answer 15

C. Informal assessment ## Footnote Journaling is a nonstandardized, informal tool for self-reflection and counselor insight.

Answer 16

B. Appraisal ## Footnote Appraisal is the systematic process of assessing or estimating attributes, which fits the counselor’s purpose.

Answer 17

B. The test will produce consistent results that do not measure true career interests. ## Footnote Reliability means consistency, not accuracy. A reliable but invalid test will consistently measure the wrong construct, making it ineffective for assessing career interests.

Answer 18

A. 81% ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.90). Squaring 0.90 gives 0.81, or 81% of the variance is explained by the correlation. The remaining variance is 100% - 81% = 19%.
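
Several cards in this deck rely on the same rule: square the correlation to get explained variance, then subtract from 100% for the unexplained share. A small sketch, assuming a hypothetical helper named `variance_split`:

```python
def variance_split(r):
    """Return (explained %, unexplained %) for a correlation coefficient r."""
    explained = round(r ** 2 * 100, 2)
    return explained, round(100 - explained, 2)


# The sign of r does not matter because r is squared.
for r in (0.90, 0.70, -0.50):
    explained, unexplained = variance_split(r)
    print(f"r = {r:+.2f}: {explained:.2f}% explained, {unexplained:.2f}% unexplained")
```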

Answer 19

A. More reliance on tests while others advocate less ## Footnote There is debate between expanding and reducing reliance on standardized testing.

Answer 20

B. Construct validity ## Footnote Construct validity assesses whether a test truly measures the theoretical trait it claims to measure, such as resilience.

Answer 21

C. 52–60 ## Footnote 56 ± 1 SEM (4 points) = 52–60 for a 68% confidence interval.

Answer 22

B. 64% ## Footnote Explained variance = r² = (.80)² = .64 = 64%.

Answer 23

B. Social desirability bias ## Footnote Social desirability bias occurs when respondents give socially acceptable rather than truthful answers.

Answer 24

C. .90 or –.90 ## Footnote r = √.81 = .90 (sign depends on direction of relationship).

Answer 25

B. It has no measurement error. ## Footnote A coefficient of 1.00 means perfect consistency and zero measurement error.

Answer 26

A. The client scored above 97% of the population. ## Footnote Two standard deviations above the mean (130 on WAIS-IV) is roughly the 98th percentile, meaning about 97–98% scored lower.
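
The percentile figure can be checked against the normal curve with Python's standard library. This sketch assumes a mean of 100 and an SD of 15, which is consistent with the footnote's statement that 130 is two standard deviations above the mean.

```python
from statistics import NormalDist

# Assumed scale: mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)

# Percentage of the population expected to score below 130.
percentile = iq.cdf(130) * 100
print(f"{percentile:.1f}")  # 97.7
```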

Answer 27

B. Low predictive validity ## Footnote Predictive validity concerns whether the test can forecast relevant outcomes, such as a clinical diagnosis.

Answer 28

D. Galton ## Footnote Sir Francis Galton held that intelligence was hereditary and normally distributed, similar to physical traits.

Answer 29

D. Kim; Beverly ## Footnote Beverly’s T-score of 70 is 2 SDs above the mean (best). Kim’s raw score of 70 is 3 SDs below the mean (worst).

Answer 30

B. 25% ## Footnote The sign doesn’t matter for shared variance; (−.50)^2 = .25 → 25%.

Answer 31

C. 51% ## Footnote The percentage of variance explained by the correlation is the square of the correlation coefficient (r = 0.70). Squaring 0.70 gives 0.49, which means 49% of the variance is explained. The remaining variance, or the percentage not explained, is 100% - 49% = 51%.

Answer 32

D. 90.25% of the variance is shared ## Footnote .95^2 = .9025 → 90.25%.

Answer 33

B. 58% ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.65). Squaring 0.65 gives 0.4225, or 42.25% of the variance is explained by the correlation. The remaining variance is 100% - 42.25% = 57.75% (rounding to 58%).

Answer 34

A. Higher anxiety is associated with lower performance ## Footnote Strong negative correlation means as anxiety increases, performance decreases.

Answer 35

C. Face validity ## Footnote Face validity refers only to how a test appears, not whether it actually measures what it claims to measure.

Answer 36

A. Predictive validity ## Footnote Predictive validity evaluates how well a test forecasts future performance.

Answer 37

A. .50 ## Footnote r = √.25 = .50.

Answer 38

A. 68% ## Footnote ±1 SD in a normal distribution includes about 68% of scores.
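
The 68% figure is the area under the standard normal curve between z = -1 and z = +1. A quick check, using only the standard library:

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

# Proportion of scores within one standard deviation of the mean.
within_1sd = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"{within_1sd:.4f}")  # 0.6827
```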

Answer 39

D. All of the above ## Footnote FERPA gives students access to their own records and the right to request corrections.

Answer 40

B. Attitude toward IQ testing ## Footnote While socioeconomic status and race often show measurable correlations with IQ scores, attitudes toward testing have little predictive relationship.

Answer 41

B. The test’s test-retest reliability ## Footnote Consistent results over time with the same instrument indicate strong test-retest reliability, even when external factors may have changed.

Answer 42

D. Degree to which scores can be generalized for the same inference across tests ## Footnote Validity concerns whether a score supports the same interpretation across measures.

Answer 43

A. .80 or –.80 ## Footnote r = √.64 = .80; correlation could be positive or negative.

Answer 44

B. Better for older populations ## Footnote The Wechsler scales were better suited for adults than the Binet, which was more child-focused.

Answer 45

B. The test’s stability over time is questionable ## Footnote .42 is low for reliability; generally > .80 is considered acceptable.

Answer 46

B. Focus heavily on professional jobs and overlook blue-collar roles ## Footnote Many inventories emphasize professional careers, reducing relevance for those pursuing skilled trades.

Answer 47

C. Both compare the examinee’s score to scores of others in the norm group. ## Footnote Both percentile ranks and standard scores reference the norm group’s performance, although they express results differently.

Answer 48

A. 64% ## Footnote True variance = r² = (.80)² = .64 or 64%.

Answer 49

A. Logan scored 75, Lily scored 125. ## Footnote Logan: 100 - 25 = 75; Lily: T = 60 → z = +1.0 → 100 + 25 = 125.

Answer 50

D. A test is only one source of data ## Footnote Ethical practice requires explaining that test results are not infallible and should be used alongside other data.

Answer 51

D. Kim; Beverly ## Footnote Beverly’s T-score of 70 = +2 SD (highest); Kim’s raw 70 is 3 SD below mean (lowest).

Answer 52

B. Subjective ## Footnote Subjective tests require scoring based on the evaluator’s interpretation, unlike objective tests with predetermined correct answers.

Answer 53

B. .22 ## Footnote Lowest absolute value indicates weakest relationship.

Answer 54

D. Achievement test ## Footnote The NCE measures mastery of counseling knowledge, fitting the achievement test category.

Answer 55

B. DSM or ICD ## Footnote DSM or ICD diagnoses are required for most third-party payments.

Answer 56

C. Army Alpha and Beta in WWI ## Footnote The Army Alpha/Beta tests were the first large-scale group IQ measures.

Answer 57

C. Use alternative assessments that are culturally fair and normed on diverse populations. ## Footnote Ethical practice requires selecting culturally fair instruments when available, especially if bias could affect validity.

Answer 58

A. Carlos scored 100 and Mia scored 120. ## Footnote Carlos: 120 + (-1.0 × 20) = 100; Mia: T = 50 → z = 0 → 120 + 0 = 120.

Answer 59

B. Cyclical test ## Footnote A cyclical test contains multiple sections, each section organized like a spiral (from easy to difficult), but difficulty resets at the start of each new section.

Answer 60

B. .60 ## Footnote √.36 = .60; could be positive or negative, but with absolute value .60.

Answer 61

D. All of the above ## Footnote All three sources provide valuable data on test validity, reliability, and use.

Answer 62

A. Concurrent validity ## Footnote Concurrent validity means the new test correlates strongly with other validated tests given at the same time.

Answer 63

C. Eighth-grade student with IQ of 136 ## Footnote Interest inventories are less accurate for very young adolescents due to limited exposure to occupations.

Answer 64

A. The child scored better than 82% of students in the norm group. ## Footnote Percentile rank refers to the proportion of the norm group scoring lower—not percentage correct or objectives met.

Answer 65

A. O*NET Ability Profiler and MCAT ## Footnote Both measure potential performance or abilities relevant to future training or careers.

Answer 66

C. Standard error of the estimate ## Footnote The standard error of the estimate reflects accuracy in predicting criterion scores from predictor scores.

Answer 67

B. Henry scored 95 and Olivia scored 75. ## Footnote Henry: 90 + (0.5 × 10) = 95; Olivia: T = 35 → z = -1.5 → 90 + (-1.5 × 10) = 75.

Answer 68

A. Reliability ## Footnote Reliability ensures consistent measurement over time, conditions, and populations.

Answer 69

B. Cyclical test ## Footnote The hallmark of a cyclical test is repeating cycles of increasing difficulty in each new section. Spiral tests only have one continuous progression without resetting difficulty.

Answer 70

C. Mental Measurements Yearbook ## Footnote The Mental Measurements Yearbook provides professional reviews and validity data, which is essential before using a standardized test for high-stakes decisions.

Answer 71

B. Clinical psychologist ## Footnote The Rorschach requires specialized training, typically held by clinical psychologists.

Answer 72

C. 36% ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.80). Squaring 0.80 gives 0.64, or 64% of the variance being explained. The remaining variance (unexplained) is 100% - 64% = 36%.

Answer 73

C. High predictive validity ## Footnote The ability to predict actual future job performance reflects predictive (criterion-related) validity.

Answer 74

B. Predictive validity ## Footnote Predictive validity is the test’s ability to forecast future outcomes, such as academic performance.

Answer 75

B. Inter-rater reliability ## Footnote Inter-rater reliability is agreement between different evaluators scoring the same performance.

Answer 76

B. Modify the test format and language to reduce cultural loading before further use. ## Footnote While predictive validity is important, fairness is also crucial. Cultural bias can lower scores for non-English speakers, so revising the test to minimize cultural loading supports both validity and ethical standards. Ipsative measures (C) are not comparable across individuals, and eliminating all tests (D) is not a balanced solution.

Answer 77

C. GRE ## Footnote The GRE assesses both learned knowledge and predicts graduate school performance.

Answer 78

D. Projective ## Footnote Word association is a projective technique revealing underlying thought patterns.

Answer 79

A. 120 ## Footnote IQ = (MA ÷ CA) × 100 → (12 ÷ 10) × 100 = 120.
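
The ratio IQ formula from this footnote, written as a short function. The name `ratio_iq` is an assumption for illustration.

```python
def ratio_iq(mental_age, chronological_age):
    """Ratio IQ = (mental age / chronological age) x 100."""
    return 100 * mental_age / chronological_age


# A 10-year-old performing at the level of a typical 12-year-old.
print(ratio_iq(12, 10))  # 120.0
```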

Answer 80

A. Alternate forms reliability ## Footnote Alternate forms reliability assesses consistency between different but equivalent versions of the same test.

Answer 81

B. Standard deviation ## Footnote Standard deviation shows how scores spread around the mean.

Answer 82

B. Select only the highest-performing candidates ## Footnote A difficulty level of .25 means only 25% are expected to answer correctly, which is appropriate when the goal is to admit only top performers.

Answer 83

A. Construct validity ## Footnote Construct validity refers to whether a test accurately measures a theoretical trait or concept.

Answer 84

A. WAIS-IV ## Footnote The WAIS-IV is appropriate for adults and includes verbal and performance components.

Answer 85

A. Grace scored 82.5, Elijah scored 67.5 ## Footnote Grace: 75 + (0.5 × 15) = 82.5; Elijah: T = 45 → z = -0.5 → 75 - 7.5 = 67.5.

Answer 86

D. Americanizing the Binet test ## Footnote Terman adapted the Binet-Simon test into the Stanford-Binet for U.S. use.

Answer 87

B. 49% ## Footnote Shared variance is the square of the correlation: (.70)² = .49, or 49%.

Answer 88

A. 70% of the score variance is accurate measurement. ## Footnote A coefficient of .70 means 70% true variance and 30% measurement error.

Answer 89

B. Less reliable than adult tests ## Footnote Developmental changes make early IQ measures less stable over time.

Answer 90

A. Difficulty index ## Footnote The difficulty index measures the percentage of individuals who answer correctly; higher percentages mean easier items.
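
As a rough illustration, the difficulty index for a single item is the proportion of examinees who answer it correctly. The response data below is hypothetical and exists only to show the calculation.

```python
def difficulty_index(responses):
    """Proportion of examinees answering the item correctly (True = correct)."""
    return sum(responses) / len(responses)


# Hypothetical item: 5 of 20 examinees answer correctly.
responses = [True] * 5 + [False] * 15
print(difficulty_index(responses))  # 0.25
```

A value of .25 marks a hard item; values closer to 1.0 mark easier ones.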

Answer 91

D. Standardized personality inventory ## Footnote The MMPI-2 is a standardized measure of personality and psychopathology.

Answer 92

D. Americanizing the Binet test ## Footnote Terman adapted the Binet-Simon test into the Stanford-Binet for U.S. use.

Answer 93

A. Reliable but not valid ## Footnote Reliability means consistency; validity means accuracy. A test can be consistent but still inaccurate.

Answer 94

C. Are generally reliable and non-threatening ## Footnote They tend to have good reliability and are easy for clients to complete without stress.

Answer 95

C. Educating the public on testing ## Footnote Increasing public understanding of testing can reduce misuse and misinterpretation.

Answer 96

B. Difficulty index ## Footnote The difficulty index measures the percentage of individuals who answer an item correctly. A lower percentage means a more difficult item.

Answer 97

A. 106–114 ## Footnote SEM = SD × √(1 – r) = 10 × √(.16) = 4; CI = 110 ± 4.
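
The SEM formula in this footnote can be sketched as below. The reliability of .84 is inferred from the footnote's √(.16) step, and the function name is an assumption for the example.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD x sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)


# SD = 10; reliability = .84 is inferred from 1 - r = .16 in the footnote.
sem = standard_error_of_measurement(10, 0.84)
low, high = 110 - sem, 110 + sem
print(round(sem, 2), round(low, 2), round(high, 2))  # 4.0 106.0 114.0
```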

Answer 98

B. 53–57 ## Footnote Solution: For two-thirds of the time (~68%), we use ±1 SEM: 55 ± 2 = (53, 57).

Answer 99

A. Interval ## Footnote Standard scores have equal intervals and an arbitrary zero, which fits the definition of an interval scale.

Answer 100

D. Decrease ## Footnote Shortening a test generally decreases reliability because fewer items reduce measurement stability.

Answer 101

D. Mastered 83% of the content objectives ## Footnote Criterion-referenced scores reflect mastery of specified content, not comparison to others.

Answer 102

B. Isabella scored 72, Mason scored 60. ## Footnote Isabella: 60 + (2 × 6) = 72; Mason: T = 50 → z = 0 → score = 60.

Answer 103

A. How accurate a score is likely to be ## Footnote SEM estimates the range in which the true score is likely to fall due to measurement error.

Answer 104

B. Attitude toward IQ testing ## Footnote Research consistently shows strong correlations between mental ability scores and socioeconomic status, moderate ones with race, and negligible with attitudes toward testing.

Answer 105

D. Construct validity ## Footnote Construct validity includes convergent and discriminant evidence to confirm a test measures its intended construct.

Answer 106

D. Behavioral checklist ## Footnote Checklists are informal, nonstandardized appraisal tools.

Answer 107

B. Current performance ## Footnote Aptitude is predictive; achievement reflects what has already been learned or mastered.

Answer 108

A. Intelligence quotient ## Footnote IQ = (Mental Age ÷ Chronological Age) × 100 in the original formula.

Answer 109

B. Carl Jung ## Footnote The MBTI is rooted in Jung’s personality typology.

Answer 110

A. Bender-Gestalt II ## Footnote The Bender-Gestalt II assesses visual-motor integration, often used in neuropsychological screening.

Answer 111

B. Having fewer negative than positive statements ## Footnote Imbalanced item wording can worsen bias rather than reduce it.

Answer 112

A. Power tests allow for unlimited time, measuring the depth of comprehension rather than speed. ## Footnote Power tests focus on the complexity of items and depth of understanding rather than quick responses, making them more appropriate for evaluating comprehension.

Answer 113

A. 49% is explained by the correlation, and 51% is unexplained ## Footnote The percentage of variance explained is the square of the correlation (0.70), which equals 49%. Therefore, the remaining 51% of the variance is not explained by the correlation.

Answer 114

C. .80 ## Footnote For employment decisions, a reliability of .80 or higher is generally acceptable.

Answer 115

C. Identify children with intellectual disabilities ## Footnote It was designed to distinguish children with learning difficulties from those without.

Answer 116

B. 51% ## Footnote The shared variance between the two tests is the square of the correlation coefficient (r = 0.70), which is 0.49 or 49%. The remaining 51% is the unexplained variance.

Answer 117

C. Jensen ## Footnote Arthur Jensen questioned causes of group differences in IQ scores.

Answer 118

B. The tests explain 49% of the variance in problem-solving ability ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.70). Squaring 0.70 results in 0.49, or 49% of the variance being explained. Therefore, the remaining variance (unexplained) is 51%.

Answer 119

C. Guilford ## Footnote J.P. Guilford emphasized convergent (one answer) and divergent (many answers) thinking as key components of intelligence.

Answer 120

B. WAIS-IV ## Footnote The WAIS-IV is the adult Wechsler test, appropriate for ages 16 and up.

Answer 121

A. 84% ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.40). Squaring 0.40 gives 0.16, or 16% of the variance being explained. The remaining variance (unexplained) is 100% - 16% = 84%.

Answer 122

A. Profile analysis ## Footnote Profile analysis interprets the relative elevations of different MMPI scales to assess patterns of functioning.

Answer 123

A. Ella scored 82.5 and James scored 72.5. ## Footnote Ella: 75 + (1.5 × 5) = 82.5; James: T = 45 → z = -0.5 → 75 + (-0.5 × 5) = 72.5.

Answer 124

A. WPPSI-IV ## Footnote The WPPSI-IV is for children ages 2 years 6 months to 7 years 7 months.

Answer 125

B. Pictures ## Footnote The TAT presents ambiguous pictures for storytelling to reveal underlying motives.

Answer 126

A. Approximately 40% of the variance in scores is shared between the instruments ## Footnote .63^2 = .3969 → ~40% shared variance.

Answer 127

B. Reliability ## Footnote Agreement among raters reflects inter-rater reliability.

Answer 128

C. WPPSI-IV ## Footnote The WPPSI-IV is designed for children aged 2 years, 6 months to 7 years, 7 months.

Answer 129

D. Psychodynamic clinician ## Footnote Projective tests are aligned with psychodynamic approaches that explore unconscious processes.

Answer 130

C. WISC-IV ## Footnote The WISC-IV is intended for ages 6–16.

Answer 131

A. Yes, because unreliable tests cannot be valid. ## Footnote Reliability is a prerequisite for validity.

Answer 132

D. Both B and C ## Footnote r = √.36 = .60; correlation could be positive or negative.

Answer 133

A. 64% ## Footnote Shared variance = r² → (.80)² = .64 → 64%.

Answer 134

C. Z-score ## Footnote Z-scores are directly tied to the standard normal distribution.

Answer 135

B. .70 ## Footnote Shared variance = r^2 → r = √.49 = .70.

Answer 136

A. Predictive validity ## Footnote Predictive validity measures how well a test forecasts future performance, such as college success.

Answer 137

B. Standardized ## Footnote It is a standardized measure with consistent administration and scoring procedures.

Answer 138

A. 70 ## Footnote A raw score of 130 is 2 SD above the mean, so T = 10z + 50 = 10(2) + 50 = 70.
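
The raw-to-T conversion can be sketched as below. The footnote only says that 130 sits 2 SD above the mean; the mean of 100 and SD of 15 used here are assumptions chosen to match that statement, and `raw_to_t` is a hypothetical helper name.

```python
def raw_to_t(raw, mean, sd):
    """Convert a raw score to a T-score: T = 10z + 50."""
    z = (raw - mean) / sd
    return 10 * z + 50


# Assumed scale: mean 100, SD 15, so a raw score of 130 has z = 2.
print(raw_to_t(130, 100, 15))  # 70.0
```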

Answer 139

A. Items are familiar to all examinees regardless of culture ## Footnote Culture-fair tests minimize language and cultural content to reduce bias.

Answer 140

B. Test-retest reliability ## Footnote Test-retest reliability measures the stability of scores over time.

Answer 141

A. Recognition-based forced choice ## Footnote Multiple-choice items require recognition and are classified as forced-choice questions.

Answer 142

B. Frequency distribution ## Footnote Frequency distribution is a common first step before summarizing data.

Answer 143

B. Alternate forms reliability ## Footnote Alternate forms reliability compares scores from two equivalent versions of the same test.

Answer 144

A. Prediction, monitoring, evaluation, discrimination ## Footnote Standard texts identify prediction, monitoring, evaluation, and discrimination as primary functions.

Answer 145

B. Subjective format ## Footnote Essays are graded using judgment and interpretation, making them subjective. Objective formats have predetermined correct answers (e.g., multiple-choice).

Answer 146

A. Test-retest reliability ## Footnote Test-retest reliability checks consistency across repeated administrations.

Answer 147

C. Mental Measurements Yearbook ## Footnote The Mental Measurements Yearbook contains professional reviews and technical validity data, making it the best resource for evaluating an instrument before use.

Answer 148

C. –.81 ## Footnote Strength is based on absolute value; –.81 is strongest.

Answer 149

C. Read the test manual ## Footnote The test manual specifies intended populations and administration guidelines.

Answer 150

C. Neurological impairment ## Footnote A large discrepancy can indicate neurological issues, though context is needed.

Answer 151

A. Split-half reliability ## Footnote Split-half reliability assesses internal consistency by correlating two halves of the same test.

Answer 152

B. Construct validity ## Footnote Construct validity ensures an abstract psychological concept is being accurately measured.

Answer 153

A. Concurrent validity ## Footnote Concurrent validity is shown when scores on different tests measuring the same construct correlate closely when administered at the same time.

Answer 154

A. “She performed better than 82% of students in the norm group.” ## Footnote Percentile rank reflects the percentage of norm group members who scored lower, not the percentage correct.

Answer 155

B. It must be reliable. ## Footnote Validity requires reliability; an accurate measure must also produce consistent results.

Answer 156

A. They share 36% of their variance ## Footnote .60^2 = .36 → 36% shared variance.

Answer 157

A. More reliance on computer-assisted testing and scoring ## Footnote Technology has expanded computerized administration and scoring.

Answer 158

B. 25% ## Footnote The percentage of variance explained is the square of the correlation coefficient (r = 0.50). Squaring 0.50 gives 0.25, or 25% of the variance is explained by the correlation. The remaining variance is 100% - 25% = 75%.

Answer 159

C. More reliable ## Footnote Speeded tests can increase reliability because they reduce random error due to guessing or pacing differences.

Answer 160

A. Sophie scored 80 and Noah scored 90. ## Footnote Sophie: z = 0 → 80 + (0 × 10) = 80; Noah: T = 60 → z = +1.0 → 80 + (1 × 10) = 90.

Answer 161

B. Ipsative ## Footnote Ipsative measures require test-takers to compare their own responses rather than being compared to a norm group.

Answer 162

B. It allows each content area to be assessed across a range of difficulty levels independently. ## Footnote Cyclical design lets each content area be tested in depth without the earlier domains affecting the difficulty progression of later ones.

Answer 163

C. Using structured questionnaires to estimate psychological attributes for the entire population. ## Footnote Appraisal refers to systematically assessing or estimating attributes, often across a group. While observation (D) and interviews (B) can be part of the process, a structured questionnaire for all students fits the formal appraisal definition.

Answer 164

B. The more reliable test, because it will provide more consistent results. ## Footnote Between two equally valid tests, the more reliable instrument is preferable because it will yield more consistent scores.

Answer 165

B. The true score is likely between 101 and 107 about 68% of the time. ## Footnote The SEM indicates the likely range for the true score within a certain confidence interval (±1 SEM for 68% confidence).

Answer 166

B. No linear relationship between the variables ## Footnote A zero correlation means no linear relationship; other non-linear relationships may still exist.

Answer 167

D. Digital scale ## Footnote Mechanical devices like a precise digital scale have near-perfect reliability compared to psychological tests.

Answer 168

A. Low reliability means low validity. ## Footnote Reliability is a prerequisite for validity; an unstable measure cannot be valid.


Learn the principles of assessment and testing, including test construction, psychometrics, DSM-based diagnosis of mental disorders, and the ethical use of evaluation tools in counseling. (192 cards)