@gmdprpm
Psychological Assessment
Mr. Ivan Dellosa
October 6, 2023
Psychological Testing vs. Psychological Assessment

Psychological Testing
● The process of measuring psychology-related variables by means of devices or procedures designed to obtain a sample of behavior
● Numerical in nature
● Individual or group administration
● Administrators can be interchanged without affecting the evaluation
● Requires technician-like skills in terms of administration and scoring

Psychological Assessment
● The gathering and integration of psychology-related data for the purpose of making a psychological evaluation
● Answers a referral question through the use of different tools of evaluation
● Individual administration
● The assessor is the key to the process of selecting tests and/or other tools of evaluation
● Requires an educated selection of tools of evaluation, skill in evaluation, and thoughtful organization and integration of data
Assumptions in Psychological Assessment
● Assumption 1: Traits (long-term; characteristic disposition) and States (short-term; mood) exist
● Assumption 2: Traits and States can be quantified and measured
● Assumption 3: Test-Related Behavior Predicts Non-Test-Related Behavior
● Assumption 4: Test and other Measurement Techniques have strengths and weaknesses
● Assumption 5: Various Sources of Error are part of the Assessment Process
● Assumption 6: Testing and Assessment can be conducted in a Fair and Unbiased Manner
● Assumption 7: Testing and Assessment Benefit Society
Tools for Psychological Assessment
1. Interview - a method of gathering information through direct communication involving a
reciprocal exchange (e.g., intake interviews)
2. Documents - Case History Data: Records, transcripts, and other accounts in written, pictorial, or
other form that preserve archival information, official and informal accounts, and other data and
items relevant to an assessee
3. Behavioral Observation - monitoring of actions of others or oneself by visual or electronic
means while recording quantitative and/or qualitative information regarding those actions
4. Psychological Test - Classification of tests: by test user classification, ability tests (e.g., IQ tests,
achievement tests), and typical performance tests (e.g., personality tests)
Classification of Psychological Tests
1. According to number of test takers
a. individual
b. group
2. According to test user classification
a. Level A - anyone with the knowledge of test administration
b. Level B - psychologists and psychometricians
c. Level C - psychologists
3. According to variable measured
a. Ability Test / Maximum Performance
i. Achievement, Intelligence, Aptitude
b. Typical Performance
Examples:
Level A - achievement tests
Level B
● Group Administered Ability Test
○ RPM, CFIT, SPM, PNLT
● Group Administered Typical Performance
○ BPI, 16PF, MBTI
Level C
● Individual Administered Intelligence Test
○ SB-5, WAIS
● Projective Test
○ TAT, DAP, Rorschach
● Diagnostic Test
○ MMPI-2, Bender-Gestalt II
Ability Test
Achievement
● Measures previous learning
● Content validity
● Example: tests in schools for grading

Intelligence
● Measures current general mental ability
● Content validity & construct validity
● Examples: WAIS, SB-5

Aptitude
● Measures potential for learning
● Content validity & criterion validity
● Example: Differential Aptitude Test
Speed and Power Tests
Speed Tests
● Complete as many items as possible in a limited amount of time
● Items are easy or of the same difficulty

Power Tests
● Exhibit depth of understanding and/or skills
● Items vary in difficulty
Flynn Effect - the progressive rise in intelligence test scores that is expected to occur on a normed intelligence test from the date when the test was first normed

Omnibus Spiral Format - items in an ability test are arranged in increasing difficulty
Norm and Criterion Referenced
Norm Referenced
● Compares a person's knowledge or skills to the knowledge or skills of the norm group

Criterion Referenced
● Compares a person's knowledge or skills against a predetermined standard, learning goal, performance level, or other criterion
Primary Scales of Measurement
Properties: magnitude, equal intervals, absolute zero
● Nominal: none of the three properties
● Ordinal: magnitude
● Interval: magnitude, equal intervals
● Ratio: magnitude, equal intervals, absolute zero
Measure of Central Tendency
Mode - the most commonly occurring value in a distribution

Median - the middle value in a distribution when the values are arranged in ascending or descending order

Mean - the sum of the value of each observation in a data set divided by the number of observations; also known as the arithmetic average
Type of Data and Appropriate Measure
● Nominal data: Mode
● Ordinal data: Median
● Interval/Ratio (not skewed): Mean
● Interval/Ratio (skewed): Median
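As a quick check on these definitions, here is a minimal Python sketch (the score values are hypothetical):

```python
import statistics

scores = [70, 75, 75, 80, 85, 90, 95]  # hypothetical test scores

print(statistics.mode(scores))    # 75   -> most frequently occurring value
print(statistics.median(scores))  # 80   -> middle value of the ordered list
print(statistics.mean(scores))    # 81.43 (approx.) -> arithmetic average
```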
Skewness
● Skewness is demonstrated when data points are not distributed symmetrically to the left and right of the median on a bell curve
Negatively Skewed
● Negative direction (tail on the left side)
● Many examinees passed, or all scored high

Normal (No Skew)
● The normal curve represents a perfectly symmetrical distribution

Positively Skewed
● Positive direction (tail on the right side)
● Many examinees failed
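A small sketch of why the median is preferred for skewed interval/ratio data: in a positively skewed set of scores, the mean is pulled toward the high tail while the median stays with the bulk of the scores. The data are hypothetical and the skew call assumes SciPy is installed.

```python
import statistics
from scipy.stats import skew  # requires SciPy

# Hypothetical positively skewed scores: most examinees score low, a few score very high
scores = [10, 12, 13, 14, 15, 15, 16, 18, 45, 60]

print(statistics.median(scores))  # 15.0
print(statistics.mean(scores))    # 21.8 -> pulled toward the right (high) tail
print(skew(scores))               # positive value -> positively skewed
```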
Standard Scores
● Z-Score: mean 0, SD 1
● T-Score: mean 50, SD 10
● IQ Score: mean 100, SD 15 (16 if Stanford-Binet)
● Stanine: mean 5, SD 2
● Sten: mean 5.5, SD 2
● GRE/SAT/CEEB: mean 500, SD 100
● A standard score is a raw score that has been converted from one scale to another scale
2 RULES!
● Raw Scores are meaningless
● Z Scores are gold
Converting raw scores to z-scores: z = (X - μ) / σ, i.e., z = (score - mean) / SD
Converting Z score to another SS: SS = z (SD) + Mean
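A minimal sketch of the two conversion rules above, using the means and SDs from the standard-scores table; the raw score and the test's mean and SD are hypothetical:

```python
def to_z(raw, mean, sd):
    """Convert a raw score to a z-score: z = (raw - mean) / SD."""
    return (raw - mean) / sd

def z_to_standard(z, new_mean, new_sd):
    """Convert a z-score to another standard scale: SS = z * SD + mean."""
    return z * new_sd + new_mean

# Hypothetical raw score of 65 on a test with mean 50 and SD 10
z = to_z(65, mean=50, sd=10)       # 1.5
print(z_to_standard(z, 50, 10))    # T-score      = 65.0
print(z_to_standard(z, 100, 15))   # IQ score     = 122.5
print(z_to_standard(z, 500, 100))  # GRE/SAT/CEEB = 650.0
```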
Psychometric Properties: Reliability and Validity
Reliability (consistency)
● Consistency in test measurement
● Reliability underlies the computation of the error of measurement of a single score
● Necessary but not sufficient for validity
● Stability and dependability

Validity (accuracy)
● Determines whether the test is able to measure what it purports to measure
Types of Reliability
● Test-Retest Reliability
● Parallel Forms / Alternate Forms
● Internal Consistency
● Inter-Rater

Types of Validity
● Face Validity
● Content Validity
● Criterion-Related Validity
● Construct Validity
RULE: A valid test is always reliable, but a reliable test is not always valid.
Reliability
Test-Retest Reliability
● An estimate of reliability obtained by correlating pairs of scores from the same people on two
different administrations
● Time element; one test administered to the same group on more than one occasion
● Coefficient of Stability
● Time Sampling Error
● Statistical Tool: Pearson R
● Carry Over Effect
○ Test Sophistication
○ Practice Effect
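A minimal sketch of how the coefficient of stability might be estimated by correlating scores from two administrations with Pearson R; the score lists are hypothetical and SciPy is assumed to be available:

```python
from scipy.stats import pearsonr  # requires SciPy

# Hypothetical scores of the same group on two administrations of the same test
time_1 = [22, 30, 27, 35, 41, 25, 33, 38]
time_2 = [24, 29, 28, 36, 40, 27, 31, 39]

r, p = pearsonr(time_1, time_2)
print(f"Coefficient of stability (test-retest r) = {r:.2f}")
```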
Parallel/Alternate Forms Reliability
● Parallel Forms: each form of the test, the means, and the variances are equal
● Alternate Forms: simply different versions of a test that have been constructed
● Two tests that measure the same thing administered to the same group
● Coefficient of Equivalence
● Time Sampling Error and Item Sampling Error
● Statistical Tool: Pearson R
● 2 Types of Alternate Forms:
○ Immediate Form - Set B is answered right after Set A
○ Delayed Form - Set A is taken now, Set B is taken after a week
Internal Consistency Reliability
● Measures the internal consistency of the test which is the degree to which each item measures the
same construct
● If all items measure the same constructs, then it has a good internal consistency
● A single administration of a single form of a test
● Error: Item Sampling Homogeneity
Types of Internal Consistency
Split-Half Reliability
● Obtained by correlating two pairs of scores obtained from equivalent halves of a single test administered once
● Error associated is item sampling error
● Statistical Tool: Pearson R (& Spearman-Brown) - see the sketch after this list
● 2 methods to use:
○ Top-Bottom Method
○ Odd-Even Split (most used)

KR-20 / KR-21
● Dichotomous (right-or-wrong) orientation; for tests with a right-or-wrong format
● KR-20: used for dichotomous formats when the degree of difficulty varies per item
● KR-21: used for dichotomous formats when all items have the same degree of difficulty

Cronbach's Alpha / Coefficient Alpha
● Used for tests with a non-dichotomous format (e.g., Likert format)
● Best used for personality tests
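A sketch of the odd-even split-half estimate with the Spearman-Brown correction, plus coefficient alpha, assuming a made-up 0/1 item-response matrix; for dichotomous items, coefficient alpha corresponds to KR-20:

```python
import numpy as np

# Hypothetical item-response matrix: rows = examinees, columns = items (1 = correct, 0 = wrong)
X = np.array([
    [1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 1, 1],
    [0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 1, 0, 1, 1, 1],
])

# Split-half reliability using the odd-even split, corrected with the Spearman-Brown formula
odd_total = X[:, 0::2].sum(axis=1)
even_total = X[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd_total, even_total)[0, 1]
spearman_brown = (2 * r_half) / (1 + r_half)

# Coefficient alpha: (k / (k - 1)) * (1 - sum of item variances / variance of total scores)
k = X.shape[1]
item_variances = X.var(axis=0, ddof=1)
total_variance = X.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

print(f"Split-half r = {r_half:.2f}, Spearman-Brown corrected = {spearman_brown:.2f}")
print(f"Coefficient alpha = {alpha:.2f}")
```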
Inter-Rater Reliability
● The degree of agreement or consistency between two or more scorers with regard to a particular
measure
● Error: Scorer Differences
● Fleiss' Kappa: determines the level of agreement between 2 or more raters when the method of
assessment is measured on a categorical scale
● Cohen's Kappa: 2 raters only (see the sketch below)
● Krippendorff's Alpha: two or more raters; based on observed disagreement corrected for
disagreement expected by chance
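A minimal sketch of Cohen's kappa for two raters on a categorical scale; the rating categories and values are hypothetical:

```python
import numpy as np

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement)."""
    a = np.asarray(rater_a)
    b = np.asarray(rater_b)
    categories = np.union1d(a, b)
    p_observed = np.mean(a == b)
    # Chance agreement: product of the two raters' marginal proportions, summed over categories
    p_chance = sum(np.mean(a == c) * np.mean(b == c) for c in categories)
    return (p_observed - p_chance) / (1 - p_chance)

# Hypothetical categorical ratings (one label per case) from two raters
rater_1 = ["anxious", "depressed", "anxious", "none", "depressed", "anxious", "none", "none"]
rater_2 = ["anxious", "depressed", "none", "none", "depressed", "anxious", "none", "anxious"]

print(f"Cohen's kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```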
Reliability Ranges (0.95 to 0.99 - Redundancy)
1.00 PERFECT
> 0.9 EXCELLENT (clinical test)
> 0.8 - < 0.9 GOOD
> 0.7 - < 0.8 ACCEPTABLE (psychometric prop.)
> 0.6 - < 0.7 QUESTIONABLE (research)
> 0.5 - < 0.6 POOR
< 0.5 UNACCEPTABLE
0.00 NO RELIABILITY
Validity
Face Validity
● If the test appears to be valid in the eyes of the test user and the test taker, it elicits motivation
in the test taker
● Face validity builds rapport
Content Validity
● Concerned with the extent to which the test is representative of a defined body of content
consisting of the topics and processes
● Like face validity, content validity is more logical than statistical
● It relies on the expertise of people in the field
● Test Blueprint: a plan regarding the types of information to be covered by the items, the number of
items tapping each area of coverage, the organization of items, and so forth
● Issues:
○ Construct underrepresentation - the test fails to cover important parts of the intended coverage; it is incomplete
○ Construct-irrelevant variance - the items measure something unrelated to the intended coverage
Criterion Related Validity
● A judgment of how adequately a test score can be used to infer an individual's most probable
standing on some measure of interest - the measure of interest being the criterion
● The bases are other standard measures that are external to the test you have created
● Criterion: a standard on which a judgment or decision may be made
● Incremental Validity: the degree to which an additional predictor explains something about the
criterion measure that is not explained by predictors already in use
● Types of Criterion Related Validity
○ Concurrent Validity
○ Predictive Validity
Construct Validity
● Concerns how well the test score relates to other measures or behaviors in a theoretically
expected fashion
● Logical and Statistical
● Types of Construct Validity
○ Convergent
■ Correlate measures that are expected to correlate; they should correlate in
your analysis (see the sketch below)
○ Divergent
■ Correlate unrelated constructs; the statistical value should be low or show no
relationship
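A sketch of how convergent and divergent evidence might be checked by correlating scale scores; the scale names and score values are hypothetical:

```python
import numpy as np

# Hypothetical scale scores for the same group of examinees
new_anxiety_scale   = [12, 18, 25, 30, 22, 15, 28, 20]
established_anxiety = [14, 17, 27, 29, 20, 16, 30, 19]  # expected to correlate (convergent evidence)
vocabulary_test     = [55, 40, 62, 48, 51, 59, 45, 50]  # expected to be unrelated (divergent evidence)

print(np.corrcoef(new_anxiety_scale, established_anxiety)[0, 1])  # should be high
print(np.corrcoef(new_anxiety_scale, vocabulary_test)[0, 1])      # should be low
```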