
Article

Evaluative Judgment: A Validation Process to Measure Teachers' Professional Competencies in Learning Assessments

José Miguel Olave Astorga * and Félix González-Carrasco

Facultad de Filosofía y Humanidades, Pontificia Universidad Católica de Valparaíso, Valparaíso 2340025, Chile; felix.gonzalez@pucv.cl
* Correspondence: jose.olave@pucv.cl

Abstract: This article deals with teachers' professional development, focusing specifically on their competencies to assess learning. Research in this field has shown a lack of instruments for measuring such competencies in practicing teachers. In this context, we present the validation process of such an instrument, called Classroom Evaluative Judgment, which is designed to assess teachers' competencies in evaluating their students' school work. We adopt a quantitative approach with a non-experimental, sequential design. First, the instrument was subjected to content validation through expert judgment. Subsequently, a pilot test was carried out with a non-probabilistic sample, applying statistical reliability analysis and confirmatory factor analysis to ensure the internal consistency of the instrument with respect to its theoretical basis. Finally, we validated the instrument with 266 participants, obtaining high levels of internal consistency and statistical reliability. The results support the soundness of the proposed model and its usefulness for measuring professional teaching competencies in the field of learning assessment. Its application in real contexts of professional practice could open new lines of research on the evaluative judgment of teachers and the strengthening of their evaluative identity.

Keywords: learning assessment; teachers' professional development; evaluative identity; instrument validation

1. Introduction

The international literature widely recognizes that assessment influences teaching and learning (Wylie, 2020; Baird et al., 2017); therefore, it is important to link the increase in professional competencies in learning assessment with their eventual transfer to the classroom. In addition, it is affirmed that formative assessment practices carried out by teachers not only improve their students' learning but are also effective for professional development, as they favor inclusion and strengthen participation among colleagues (Ainscow et al., 2024; DeLuca et al., 2023; Hoefflin & Allal, 2007).

The need to implement training processes for teachers in learning assessment parallels the emergence of professional standards, making it necessary to implement a situated and progressive professional development approach in educational communities to improve and support teachers' assessment practices (DeLuca et al., 2016a, 2016b). In Chile, Law 20.903 of 2016 establishes a regulatory framework for teachers' professional development, emphasizing collaboration and situated learning as fundamental pillars to improve educational quality (Organisation for Economic Co-operation & Development [OECD], 2015, 2017). Complementarily, in order to promote specific competencies in learning assessment among teachers, the Chilean Ministry of Education has recently enacted new
Professional Teaching Standards (CPEIP, 2022). The standards outline the necessary professional expertise regarding key knowledge, skills, and attitudes for educational processes (Wyatt-Smith et al., 2017). A criticism voiced in the literature on standards is that these professional norms are not sufficient to represent the complexity of teachers' evaluative work (Wyatt-Smith & Looney, 2016). From this perspective, this article seeks to describe the validation process of an instrument employed to measure professional competencies in learning assessment in the context of teachers' professional practice, allowing us to obtain valid and reliable information that enables sound decision making in support of their professional development.
Coinciding with the current understanding of assessment as a sociocultural activity (Broadfoot, 2021), influenced by policies affecting the professional practice of teachers and, in particular, their ability to evaluate their students' work in the classroom (Looney et al., 2018), we interpret the assessment of learning as a complex process in which teachers and students actively participate. Therefore, the design of an assessment tool should not only respond to the intentions of teachers but should also consider students' differing trajectories and ways of learning. Specifically, the assessment perspective assumed by this study coincides with Heritage's (2007) definition of formative assessment as "a systematic process for gathering evidence about learning, in which students participate actively with their teachers, sharing learning objectives, understanding how their learning progresses, what steps they should take and how to do it" (p. 142). This approach to assessment emphasizes that teachers and students develop evaluative judgments about different school tasks, as evidence of learning, in order to make pedagogical decisions relevant to the overall process and not merely about isolated outcomes (Black & Wiliam, 2018). According to Sadler (1989), evaluative judgment is a central element in formative assessment processes, as it helps students and teachers to estimate how much has been learned and what remains to be taught. An accepted definition in the specialized literature is that of Tai et al. (2018), who describe evaluative judgment as the process of assessing the quality of one's own and others' work. Recent research on evaluative judgment has mainly been conducted with higher education students (Sun et al., 2024); as a result, there are few studies on the process of evaluative judgment among teachers.
In light of the identified research gap concerning the development of professional competencies in learning assessment, the specialized literature highlights assessment literacy (AL) as the knowledge and skills required by teachers to implement learning assessment processes (Brookhart, 2023). Moving away from theoretical approaches that reduce assessment to a technical and decontextualized event, we seek to reconceptualize AL by integrating the "knowledge, beliefs, feelings and skills of teachers in their roles as evaluators of student learning" (Adie et al., 2020, p. 4). In this reconceptualization, the concept of teachers' evaluative identity is included as a fundamental aspect, underscoring that when teachers construct evaluative judgments about their students' work, they not only draw on a body of evaluative knowledge and skills but also deploy an evaluative expertise shaped by the emotional and contextual dimensions of their professional experience (Wyatt-Smith et al., 2024; Adie et al., 2020; Looney et al., 2018).
The concept of assessment literacy among practicing teachers has become increasingly important for both initial and continuing teacher education (Xu & Brown, 2016), as it can serve as the basis for their professional development (DeLuca et al., 2023). Consequently, constructing instruments that can measure these competencies is an important challenge for researchers. To date, research in this area has relied on instruments that provide weak psychometric evidence (Gotch & French, 2014; DeLuca et al., 2016a). In particular, these instruments focus on measuring general knowledge about learning assessment, leaving ample room to explore which items matter when assessing learning and how assessment literacy affects students' learning outcomes (Yan & Pastore, 2022a; Wylie, 2020).
In response to this need, we reviewed instruments developed from the AL perspective. DeLuca et al. (2016a) presented the Approaches to Classroom Assessment Inventory (ACAI), based on an analysis of professional standards for learning assessment in fifteen countries. The results of their study propose a research agenda for the field, highlighting the need to investigate teachers within classroom assessment spaces. Along these lines, in Chile, Meckes (2018) validated an online instrument for measuring the evaluative competencies of primary education teachers based on four theoretical dimensions: collecting evidence about students' learning; analyzing and interpreting evidence of learning; providing formative feedback; and certifying and grading student learning. Although this instrument is focused on measuring professional competencies for developing the evaluation process, its theoretical link with evaluation practices remains weak. Seeking to integrate the conceptual, practical, and socioemotional aspects present in the evaluation process, Yan and Pastore (2022a, 2022b) presented the Teacher Formative Assessment Literacy Scale (TFALS) and the Teacher Formative Assessment Practice Scale (TFAPS), which aim to measure teaching practices in formative assessment. Both scales have been validated through statistical analysis, confirming their structure and psychometric quality. These instruments align with the perspective of the present study, as they integrate the theoretical, emotional, and attitudinal knowledge involved in classroom learning assessment. Similarly, as mentioned above, the concept of evaluative identity has gained strength in studies with teachers, as the perceptions teachers hold of themselves as evaluators, and how these perceptions affect their evaluative judgments about their students' work, have become relevant in current research (Olave & Orrego, 2025). Thus, Estaji and Ghiasvand (2021) and Jan-nesar Moqaddam et al. (2021) have presented scales for measuring competencies in learning assessment that incorporate dimensions of teachers' evaluative identity. Both works are based on the model proposed by Looney et al. (2018), who suggested five key dimensions in the construction of evaluative identity. The results of these studies point to the importance of professional practice and students' learning trajectories as key elements when assessing students' work.
Following this line of research, the present study focused on validating an instrument for measuring teachers' professional competencies in learning assessment, grounded in the assessment literacy framework and situated in the context of teachers' professional practice. Accordingly, the present study aims to describe the validation process of this instrument, discuss its usefulness for measuring professional competencies in the field of learning assessment, and propose its application in real contexts of pedagogical practice oriented toward teachers' professional development.

2. Methods
The instrument was designed to be applied in a pilot study and was subsequently implemented to answer the following research question: How can an instrument that measures the evaluative competencies of teachers in the context of their professional practice be validated? The research followed a quantitative approach with a non-experimental, sequential design (Arévalo-Chávez et al., 2020). In the first stage of the study, the instrument was subjected to expert judgment evaluation. After this evaluation, a pilot was administered to a non-probabilistic sample of 66 teachers, whose inclusion criterion was experience in classroom evaluation at the primary and secondary education levels. The results of the pilot application were analyzed by means of statistical reliability criteria and confirmatory factor analysis, after which internal coherence adjustments were made to the initial theoretical proposal.

The second stage of the study consisted of the final application of the instrument, taking into account the adjustments made after the first stage. This final application involved 266 elementary and middle school teachers working in 19 educational establishments that report to a local public education service (SLEP) within the Chilean public education system.

3. Results
In the first stage, the Classroom Evaluative Judgment instrument was constructed based on the study by Wyatt-Smith et al. (2024), who consider evaluative practices to be associated with three dimensions: (a) evaluating the quality of work; (b) considering the trajectory of students during evaluation; and (c) making professional decisions in favor of future learning. Based on these dimensions, 21 items were elaborated, each accompanied by five alternatives scored from 1 to 5, with the highest score corresponding to the response that best reflects the expected characteristics of formative evaluation in schools. The scores assigned to the alternatives were as follows: Never (1), Almost never (2), Occasionally (3), Almost always (4), and Always (5).
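To make this scoring scheme concrete, the following is a minimal sketch in Python (not the authors' code) of how the verbal anchors map to the 1-5 scores described above; the item column names and the example data frame are hypothetical placeholders.

```python
import pandas as pd

# Scoring map described in the text: higher scores reflect practices
# closer to the expected characteristics of formative assessment.
LIKERT = {"Never": 1, "Almost never": 2, "Occasionally": 3,
          "Almost always": 4, "Always": 5}

def score_responses(raw: pd.DataFrame) -> pd.DataFrame:
    """Convert verbal Likert anchors to their 1-5 numeric scores."""
    return raw.replace(LIKERT)

# Two hypothetical respondents on three of the 21 pilot items.
raw = pd.DataFrame({
    "D1_P2": ["Always", "Occasionally"],
    "D1_P3": ["Almost always", "Never"],
    "D1_P4": ["Always", "Almost never"],
})
print(score_responses(raw))
```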
Regarding content validity, the pilot instrument was validated by five expert judges in the area of learning assessment: specifically, holders of doctorates in education with experience in teacher training. They were asked to analyze the instrument against the criteria of relevance and clarity; that is, whether each item is appropriate to its dimension, whether it fulfills the objective or purpose of the instrument, and whether each item is formulated and written clearly. The judges evaluated each of the items and added qualitative observations for those items whose clarity and/or relevance caused disagreement. This information was examined through thematic analysis and discussed with the research team, allowing adjustments to be made to the items before the final application of the pilot instrument.
After content validation, a pilot test was administered to a non-probabilistic sample of 66 teachers, of whom 22.7% taught exclusively at the elementary school level, 27.3% exclusively at the middle school level, and 50% at both levels. The objective of the analysis of the pilot application was to adjust the proposed theoretical model, contrasting it with an analysis of statistical reliability and of the standardized estimates of each item in relation to its respective dimension. For this purpose, a confirmatory factor analysis was carried out. We opted for this type of analysis because the structure of the instrument (dimensions, sub-dimensions, and items) was defined according to the researchers' hypotheses and theoretical assumptions. The analytical work aimed to control for a number of previously established factors and variables between which relationships are observed (Ferrando & Anguiano-Carrasco, 2010). As a result of the pilot phase, three items were eliminated because they overlapped with other items, and the instrument was restructured according to the statistical relationships between items and dimensions. This process helped to consolidate the instrument's theoretical affinity and allowed us to review the standardized estimates and the significance tests of the factor loadings, leading to a model with six dimensions of three items each, as shown in the following table (Table 1).
The table describes the model fit and the selected items. The factor loadings are statistically significant (p < 0.05), indicating that each item is associated with its proposed dimension, with standardized estimates above the reference value of 0.4, representing the proportion of variance in the item explained by the factor (Ventura-León, 2019). For example, in Dimension 6, item D4_P19 presents one of the highest standardized estimates (0.837), reinforcing its relevance in the measurement of that factor. Similarly, in Dimension 2, item D2_P5 exhibits a standardized loading of 0.419, lower than the others but still statistically significant (Z = 3.165; p = 0.00155). This range of standardized factor loadings (approximately 0.4 to 0.84) suggests good explanatory power of the items for their respective factors, reflecting the consistency and robustness of the proposed factor model.

Table 1. Factor dimensions and loadings.

| Factor | Indicator ¹ | Estimator | SE | Z | p | Standardized Estimator |
|---|---|---|---|---|---|---|
| Dimension 1 | D1_P2 | 0.316 | 0.091 | 3.491 | <0.001 | 0.527 |
| | D1_P3 | 0.360 | 0.089 | 4.057 | <0.001 | 0.655 |
| | D1_P4 | 0.262 | 0.069 | 3.800 | <0.001 | 0.560 |
| Dimension 2 | D2_P5 | 0.265 | 0.084 | 3.165 | <0.001 | 0.419 |
| | D2_P6 | 0.546 | 0.089 | 6.147 | <0.001 | 0.779 |
| | D2_P7 | 0.583 | 0.088 | 6.629 | <0.001 | 0.834 |
| Dimension 3 | D3_P12 | 0.257 | 0.061 | 4.242 | <0.001 | 0.556 |
| | D3_P13 | 0.470 | 0.123 | 3.818 | <0.001 | 0.511 |
| | D3_P14 | 0.739 | 0.137 | 5.411 | <0.001 | 0.682 |
| Dimension 4 | D4_P15 | 0.357 | 0.100 | 3.553 | <0.001 | 0.478 |
| | D4_P16 | 0.590 | 0.104 | 5.692 | <0.001 | 0.713 |
| | D4_P20 | 0.479 | 0.118 | 4.078 | <0.001 | 0.525 |
| Dimension 5 | D2_P9 | 0.234 | 0.051 | 4.626 | <0.001 | 0.586 |
| | D2_P10 | 0.473 | 0.100 | 4.736 | <0.001 | 0.639 |
| | D2_P11 | 0.212 | 0.069 | 3.087 | <0.001 | 0.432 |
| Dimension 6 | D4_P17 | 0.572 | 0.117 | 4.906 | <0.001 | 0.605 |
| | D4_P18 | 0.578 | 0.106 | 5.441 | <0.001 | 0.653 |
| | D4_P19 | 0.797 | 0.108 | 7.360 | <0.001 | 0.837 |

¹ According to the statistical analysis, questions Q1, Q8, and Q21 were eliminated.
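As a concrete illustration, the six-factor structure in Table 1 could be re-estimated from the item-level pilot data along the following lines. This is a minimal sketch using the open-source semopy package; the paper does not report which CFA software was used, so the package choice and the data frame name are assumptions.

```python
import pandas as pd
import semopy

# Hypothesized six-factor measurement model (indicators as in Table 1).
MODEL_DESC = """
Dim1 =~ D1_P2 + D1_P3 + D1_P4
Dim2 =~ D2_P5 + D2_P6 + D2_P7
Dim3 =~ D3_P12 + D3_P13 + D3_P14
Dim4 =~ D4_P15 + D4_P16 + D4_P20
Dim5 =~ D2_P9 + D2_P10 + D2_P11
Dim6 =~ D4_P17 + D4_P18 + D4_P19
"""

def fit_cfa(data: pd.DataFrame) -> semopy.Model:
    """Fit the six-factor CFA to a respondents-by-items score matrix."""
    model = semopy.Model(MODEL_DESC)
    model.fit(data)  # Wishart maximum-likelihood objective by default
    return model

# model = fit_cfa(pilot_df)            # pilot_df: 66 rows x 18 item columns
# print(model.inspect(std_est=True))   # estimates, SEs, z- and p-values
```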

The proposed model based on six dimensions has statistically significant factor loadings (see Table 1), in addition to satisfactory fit indices (see Table 2: RMSEA = 0.078; TLI = 0.789; CFI = 0.834). Thus, the soundness and relevance of the proposed theoretical model are supported. In light of these results, we decided to extend the model proposed by Wyatt-Smith et al. (2024) by disaggregating the dimensions related to task quality and resolution (D1 and D2); incorporating the recognition of individual and group progress among students (D3) and the assessment of attitudinal elements (D5); and, finally, disaggregating the dimension corresponding to professional decision making into two dimensions: decision making for teaching and decision making for learning (D4 and D6). The final instrument is presented in Table 3.

Table 2. Fit indices.

| CFI | TLI | RMSEA | RMSEA 90% CI (Lower) | RMSEA 90% CI (Upper) |
|---|---|---|---|---|
| 0.834 | 0.789 | 0.078 | 0.047 | 0.104 |
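Continuing the assumed semopy workflow sketched above, the global fit indices in Table 2 can be read from the fitted model; the model variable carries over from the previous sketch.

```python
import semopy

def report_fit(model: semopy.Model) -> None:
    """Print comparative (CFI, TLI) and absolute (RMSEA) fit indices."""
    stats = semopy.calc_stats(model)  # one-row DataFrame of fit statistics
    print(stats[["CFI", "TLI", "RMSEA"]])

# report_fit(model)  # should approximate CFI = 0.834, TLI = 0.789, RMSEA = 0.078
```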

As a result of this first stage, it was found that the instrument offers internal consistency and construct validity adequate for its application as a definitive instrument, reinforcing its usefulness for measuring professional competencies in the context of student learning assessment. The process of content validation, application, and analysis of the pilot phase allowed us to complete the factor loading analysis and define the definitive dimensions and items for the second stage. In this way, the final instrument was constructed with six dimensions and 18 items.

Table 3. General description of the final instrument.

| Dimension | Description |
|---|---|
| D1. Task resolution | Recognize in the students' work the resolution of the task according to established criteria. |
| D2. Qualities in the performance of tasks | Identify the qualities observed when solving the proposed tasks. |
| D3. Learning path | Identify individual progress stages according to the learning trajectory of each student, in relation to himself/herself and his/her course group. |
| D4. Decision making for teaching | Determine decisions that involve next steps to improve student learning and to improve one's teaching tools or strategies. |
| D5. Task implications | Assess attitudinal aspects of the students involved in the performance of the tasks. |
| D6. Decision making for learning | Identify pedagogical decisions derived from evaluative judgment to improve learning support for students. |

As in the pilot instrument, each item of the final instrument presents five alternatives scored from 1 to 5: Never (1), Almost never (2), Occasionally (3), Almost always (4), and Always (5). The questionnaire was distributed through a digital platform and disseminated with the support of a local public education service belonging to the Chilean public education system. The participants in this stage were 266 teachers working in 19 schools belonging to a local public education service in Chile. Purposive sampling was used to select the participants following contact with each educational institution. The statistical power of the sample was corroborated using the G*Power 3.1 software, obtaining a power (1 − β) of 0.964. Of the participants, 60% were elementary school teachers and 35% middle school teachers, while 5% did not indicate the level at which they work. The participants gave their consent to participate in the research, formalized via an informed consent form prepared within the framework of the research and validated by the ethics committee of the university sponsoring the study under code BIOEPUCV-H 628-2023.
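The reported power can also be cross-checked in code. The manuscript does not state which test family was configured in G*Power 3.1, so the sketch below, using statsmodels, assumes a two-sided one-sample t-test with an illustrative effect size; it shows the mechanics of the calculation rather than reproducing the exact 0.964 figure.

```python
from statsmodels.stats.power import TTestPower

# Power of a two-sided one-sample t-test (illustrative assumptions only;
# the paper does not report the test family or effect size used in G*Power).
power = TTestPower().power(
    effect_size=0.25,  # assumed small-to-medium standardized effect
    nobs=266,          # final-stage sample size reported above
    alpha=0.05,        # conventional significance level
)
print(f"Achieved power under these assumptions: {power:.3f}")
```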
The second stage yielded a Cronbach's alpha of 0.896, which is interpreted as a good coefficient (Frias-Navarro & Pascual-Soler, 2022). This corroborates the internal consistency of the construct as well as its applicability to the proposed case. Based on the information obtained in this application, we proceeded to perform a confirmatory factor analysis with the purpose of validating the instrument's theoretical affinity and reviewing the standardized estimates, as well as the significance tests of the factor loadings (see Table 4).
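For reference, Cronbach's alpha has a standard closed form, sketched below; the data frame and column names are hypothetical, and per-dimension alphas follow by passing only that dimension's three items.

```python
import pandas as pd

def cronbach_alpha(items: pd.DataFrame) -> float:
    """Alpha = k/(k-1) * (1 - sum of item variances / variance of total score)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)     # per-item sample variances
    total_var = items.sum(axis=1).var(ddof=1) # variance of the summed score
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# cronbach_alpha(final_df)                               # all 18 items: ~0.896
# cronbach_alpha(final_df[["D1.P1", "D1.P2", "D1.P3"]])  # one dimension
```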
The analysis of the factor loadings of the scale allowed us to confirm the model's validity and reliability for its intended use. All of the factor loadings are high (0.723 to 0.935), and the Cronbach's alpha values vary between 0.886 and 0.895. The measures of fit (see Table 5), particularly the Root Mean Square Error of Approximation, indicate that the model fits the data well (RMSEA = 0.078), while the Tucker-Lewis Index indicates that the proposed model is acceptable with respect to the independence model (TLI = 0.789). These values, specifically the pattern of factor loadings, support the use of the general instrument, in accordance with the sampling criteria employed for its validation, and the application of the proposed theoretical model.

Table 4. Factor loadings and instrument validation.

| Factor | Indicator | Factor Loading | p | Alpha |
|---|---|---|---|---|
| D1. Task resolution | D1.P1 As I review students' schoolwork, I recognize what is expected of the learning objective. | 0.864 | <0.001 | 0.889 |
| | D1.P2 As I review students' schoolwork, I identify the skills employed for its resolution. | 0.862 | <0.001 | 0.889 |
| | D1.P3 While reviewing students' schoolwork, I identify the knowledge involved in the assignment. | 0.768 | <0.001 | 0.892 |
| D2. Qualities in the performance of tasks | D2.P4 As I review students' schoolwork, I assess how they integrate cross-cutting skills in their resolution. | 0.810 | <0.001 | 0.889 |
| | D2.P5 While reviewing schoolwork, I value the integration of knowledge related to other contexts. | 0.934 | <0.001 | 0.886 |
| | D2.P6 As I review the task, I assess the application of content and skills in different contexts. | 0.855 | <0.001 | 0.888 |
| D3. Learning path | D3.P7 As I review the assignment, I recognize the student's progress in achieving the assignment. | 0.723 | <0.001 | 0.891 |
| | D3.P8 When reviewing schoolwork, I compare individual work with the group's progress. | 0.756 | <0.001 | 0.890 |
| | D3.P9 When I review the assignment, I compare similar assignments (from other years or from the same year) that my students have solved. | 0.764 | <0.001 | 0.892 |
| D4. Decision making for teaching | D4.P10 After reviewing the assignment, I make adjustments to the instruments (e.g., clarify instructions, adjust scores, etc.). | 0.779 | <0.001 | 0.890 |
| | D4.P11 After reviewing schoolwork, I propose or create new instruments that reflect new learning. | 0.825 | <0.001 | 0.888 |
| | D4.P12 After reviewing the task, I adjust my planning according to the results obtained. | 0.806 | <0.001 | 0.890 |
| D5. Task implications | D5.P13 While reviewing the task, I value the responsibility for its completion. | 0.935 | <0.001 | 0.892 |
| | D5.P14 When reviewing schoolwork, I assess the order and clarity of the task. | 0.764 | <0.001 | 0.895 |
| | D5.P15 While reviewing the task, I value creativity (or originality) in solving the task. | 0.843 | <0.001 | 0.893 |
| D6. Decision making for learning | D6.P16 After reviewing the assignment, I develop individual recommendations for each student. | 0.798 | <0.001 | 0.890 |
| | D6.P17 After reviewing the assignment, I make recommendations to the course group for the development of future assignments on the subject. | 0.724 | <0.001 | 0.891 |
| | D6.P18 After reviewing the assignment, I propose new challenges based on each student's achievement. | 0.852 | <0.001 | 0.889 |

Table 5. Measures of instrument fit.

| CFI | TLI | RMSEA | RMSEA 90% CI (Lower) | RMSEA 90% CI (Upper) |
|---|---|---|---|---|
| 0.834 | 0.789 | 0.078 | 0.047 | 0.104 |

In summary, the application of the final version of the instrument, supported by the results of the confirmatory factor analysis and its high internal consistency (Cronbach's alpha = 0.896), confirms its soundness and viability. The factor loadings, which range from 0.723 to 0.935, together with the fit values (RMSEA = 0.078; TLI = 0.789), demonstrate the coherence of the proposed structure with the theoretical model. Likewise, the high internal reliability within each dimension (alpha between 0.886 and 0.895) reinforces the instrument's capacity to measure the defined factors accurately. Taken together, these findings support the relevance of the tool and its usefulness as a valid and reliable resource for assessing the professional practice of teachers who make evaluative judgments about their students' work.

4. Discussion
Firstly, the instrument was constructed based on the scales reviewed above in order to link theoretical approaches with evaluative practices. In this regard, the instrument advances on practical dimensions such as the assessment of students' individual and group trajectories, together with the assessment of attitudinal dimensions involved in the construction of evaluative judgments by practicing teachers. In this way, the Classroom Evaluative Judgment instrument goes beyond tools intended only to measure knowledge related to learning assessment (Yan & Pastore, 2022a, 2022b; Meckes, 2018; DeLuca et al., 2016a). In addition, it strongly incorporates dimensions of assessment context and evaluative identity development (Estaji & Ghiasvand, 2021; Jan-nesar Moqaddam et al., 2021; Looney et al., 2018).

Secondly, the instrument provides a structured framework that allows teachers to reflect on their own evaluative practices, especially in the area of formative assessment.
This aspect is important in a context where learning assessment, in many cases, focuses
on measuring performance rather than promoting learning. In summary, the validated
instrument can help to support peer professional development (DeLuca et al., 2023) by using
a situated (DeLuca et al., 2016b) and inclusive (Ainscow et al., 2024; DeLuca et al., 2023)
approach that can contribute to the consolidation of a community in which teachers share
and discuss their evaluative experiences.
Thirdly, the instrument responds to demands for reliable data on how teachers understand the process of evaluating their students, providing timely data on their evaluative identity and their role as expert evaluators (Adie et al., 2020; Looney et al., 2018). The instrument includes not only items that identify the quality of school tasks, but also items assessing teachers' ability to recognize the educational trajectories of their students, along with the attitudinal dimensions that students develop when performing such tasks. In addition, the instrument incorporates dimensions concerning the decisions derived from evaluative judgment, such as professional decision making in favor of improving teaching and learning. In summary, the instrument advances the evaluative judgment model proposed by Wyatt-Smith et al. (2024).
Finally, given that professional standards have not sufficiently represented the complexity of the learning assessment process (Wyatt-Smith & Looney, 2016), the proposed instrument was validated as a useful tool for evaluating teachers' assessment of student learning for accountability purposes (DeLuca et al., 2023), while offering timely evidence from a model of classroom evaluative judgment that enables a complex understanding of the process.

5. Conclusions
The validated instrument represents a significant contribution to the line of research in educational evaluation, specifically in the context of teachers who are implementing formative evaluation strategies. Its contribution can be analyzed along three main dimensions: the strengthening of validity and reliability in the measurement of professional competencies in learning assessment; the generation of empirical evidence to strengthen teachers' evaluative identity based on analysis of their evaluative judgments regarding their students' work; and the possibility of replication in other educational contexts in order to support teachers' professional development in the field of learning assessment.

In addition, the instrument opens a space for future research linking teaching practice and student learning. Its systematic application can provide empirical data on specific competencies of the evaluation process and, consequently, help improve teaching and student learning.

Regarding the limitations of this study, we acknowledge that, given the Chilean regulations on learning assessment, this study was conducted only with teachers from the public sector; therefore, future research should address this challenge and seek to adapt the instrument for application with teachers from different educational levels, as well as with teachers from charter and private schools. Addressing these challenges could open a new research agenda that considers substantive elements in the evaluation of student learning, as well as the effects of training for teachers in the context of their professional practice.

Author Contributions: J.M.O.A.: conceptualization, methodology, investigation, data curation, writing (original draft preparation), writing (review and editing), visualization, supervision, and project administration. F.G.-C.: methodology, software, validation, formal analysis, investigation, resources, data curation, and writing (original draft preparation). All authors have read and agreed to the published version of the manuscript.

Funding: This work was funded by ANID FONDECYT INICIACIÓN project number 11230410.

Institutional Review Board Statement: This study was approved by the ethics committee of the university (code BIOEPUCV-H 628-2023).

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data used as a basis for the analyses developed in this article are available at https://data.mendeley.com/preview/r7j8jfm7jf?a=e0d67456-110c-4170-ba93-4f37eb07b37f (accessed on 1 May 2025).

Conflicts of Interest: The authors declare no conflicts of interest.

References
Adie, L., Stobart, G., & Cumming, J. (2020). The construction of the teacher as expert assessor. Asia-Pacific Journal of Teacher Education,
48(4), 436–453. [CrossRef]
Ainscow, M., Calderón-Almendros, I., Duk, C., & Viola, M. (2024). Using professional development to promote inclusive education in
Latin America: Possibilities and challenges. Professional Development in Education, 51, 149–166. [CrossRef]
Arévalo-Chávez, P., Cruz-Cárdenas, J., Guevara Maldonado, C., Palacio Fierro, A., Bonilla Bedoya, S., Estrella Bastidas, A., Guadalupe
Lanas, J., Zapata Rodríguez, M., Jadán Guerrero, J., Arias Flores, H., & Ramos Galarza, C. (2020). Actualización en metodología de la
investigación científica. Universidad Tecnológica Indoamérica.
Baird, J. A., Andrich, D., Hopfenbeck, T. N., & Stobart, G. (2017). Assessment and learning: Fields apart? Assessment in Education: Principles, Policy and Practice, 24(3), 317–350. [CrossRef]
Black, P., & Wiliam, D. (2018). Classroom assessment and pedagogy. Assessment in Education: Principles, Policy & Practice, 25(6), 551–575.
[CrossRef]
Broadfoot, P. (2021). The sociology of assessment: Comparative and policy perspectives. The Selected Works of Patricia Broadfoot. Routledge.
Brookhart, S. M. (2023). Assessment literacy in a better assessment future. Chinese Journal of Applied Linguistics, 46(2), 162–179.
[CrossRef]
CPEIP. (2022). Pedagogical and disciplinary standards for pedagogy careers. CPEIP.
DeLuca, C., LaPointe-McEwan, D., & Luhanga, U. (2016a). Approaches to classroom assessment inventory: A new instrument to
support teacher assessment literacy. Educational Assessment, 21(4), 248–266. [CrossRef]
DeLuca, C., LaPointe-McEwan, D., & Luhanga, U. (2016b). Teacher assessment literacy: A review of international standards and
measures. Educational Assessment, Evaluation and Accountability, 28(3), 251–272. [CrossRef]
DeLuca, C., Willis, J., Cowie, B., Harrison, C., & Coombs, A. (2023). Cultivating teacher evaluation skills. In Learning to assess: Teacher
education, learning innovation and accountability. Springer. [CrossRef]
Estaji, M., & Ghiasvand, F. (2021). Assessment perceptions and practices in academic domain: The design and validation of an
assessment identity questionnaire (TAIQ) for EFL teachers. International Journal of Language Testing, 11(1), 103–131.
Ferrando, P. J., & Anguiano-Carrasco, C. (2010). Factor analysis as a research technique in psychology. Papeles del Psicólogo, 31(1), 18–33.
Frias-Navarro, D., & Pascual-Soler, M. (2022). Research design, analysis and writing of results. Palmero Ediciones. [CrossRef]
Gotch, C. M., & French, B. F. (2014). A systematic review of assessment literacy measures. Educational Measurement: Issues and Practice,
33(2), 14–18. [CrossRef]
Heritage, M. (2007). Formative assessment: What do teachers need to know and do? Phi Delta Kappan, 89(2), 140–145. [CrossRef]
Hoefflin, G., & Allal, L. (2007). Assessment in the context of professional development: The implementation of a portfolio project. In S. Frankland (Ed.), Enhancing teaching and learning through assessment. Springer.
Jan-nesar Moqaddam, Q., Khodabakhshzadeh, H., Motallebzadeh, K., & Khajavy, G. H. (2021). [Construction and validation of an assessment identity questionnaire for English language teachers; article in Persian]. Journal of Language and Translation, 1(1), 29. [CrossRef]
Looney, A., Cumming, J., Van Der Kleij, F., & Harris, K. (2018). Reconceptualising the role of teachers as assessors: Teacher assessment
identity. Assessment in Education: Principles, Policy & Practice, 25(5), 442–467. [CrossRef]
Meckes, L. G. (2018). An online instrument to assess evaluative competencies of Basic Education teachers. Final Report FONIDE: FX11668. FONIDE Technical Secretariat. Available online: https://centroestudios.mineduc.cl/wp-content/uploads/sites/100/2018/10/Informe-final-FONIDE-FX11668-Meckes_ap-convertedDU.pdf (accessed on 1 April 2024).
Olave, J. M., & Orrego, R. (2025). Formative assessment strategies for elementary and middle school teachers: Decisions to improve
teaching and learning. Pages of Education, 18(1), 1. [CrossRef]
Organisation for Economic Co-operation and Development [OECD]. (2015). Education at a glance 2015. OECD Indicators. OECD
Publishing.
Organisation for Economic Co-operation and Development [OECD]. (2017). Education in Chile. Reviews of national policies for
education. OECD.
Sadler, D. R. (1989). Formative assessment and the design of instructional systems. Instructional Science, 18(2), 119–144. [CrossRef]

Sun, W., Ding, Y., Wang, R., Liu, Y., Wang, Y., Zhu, B., & Liu, Q. (2024). Bibliometric analysis of assessment and evaluation in higher
education: 2012–2023. Assessment & Evaluation in Higher Education, 49(8), 1121–1135. [CrossRef]
Tai, J. M., Ajjawi, R., Boud, D., Dawson, P., & Panadero, E. (2018). Developing evaluative judgement: Enabling students to make
decisions about the quality of work. Higher Education, 76(3), 467–481. [CrossRef]
Ventura-León, J. (2019). Two easy ways to interpret the famous factor loadings. Gaceta Sanitaria, 33(6), 599. [CrossRef]
Wyatt-Smith, C., Adie, L., & Harris, L. (2024). Supporting teacher judgement and decision-making: Using focused analysis to help
teachers see students, learning, and quality in assessment data. British Educational Research Journal, 50, 1420–1448. [CrossRef]
Wyatt-Smith, C., Alexander, C., Fishburn, D., & McMahon, P. (2017). Standards of practice to standards of evidence: Developing
assessment capable teachers. Assessment in Education: Principles, Policy and Practice, 24(2), 250–270. [CrossRef]
Wyatt-Smith, C., & Looney, A. (2016). Professional standards and the assessment work of teachers. In L. Hayward, & D. Wyse (Eds.),
Handbook on curriculum, pedagogy and assessment (pp. 805–820). Routledge.
Wylie, E. C. (2020). Observing formative assessment practice: Learning lessons through validation. Educational Assessment, 25(4),
251–258. [CrossRef]
Xu, Y., & Brown, G. T. L. (2016). Teacher assessment literacy in practice: A reconceptualization. Teaching and Teacher Education, 58,
149–162. [CrossRef]
Yan, Z., & Pastore, S. (2022a). Are teachers literate in formative assessment? The development and validation of the teacher formative
assessment literacy scale. Studies in Educational Evaluation, 74, 101183. [CrossRef]
Yan, Z., & Pastore, S. (2022b). Assessing teachers’ strategies in formative assessment: The teacher formative assessment practice scale.
Journal of Psychoeducational Assessment, 40(5), 592–604. [CrossRef]

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual
author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to
people or property resulting from any ideas, methods, instructions or products referred to in the content.
