Title: Enhancing Comparability of Standards through Validation and Moderation A study funded by the National Quality Council
1Enhancing Comparability of Standards through
Validation and Moderation
A study funded by the National Quality Council
- Shelley Gillis
- Andrea Bateman
- Berwyn Clayton
2Rationale
- Some key stakeholders have raised concerns with
the quality and consistency of assessments being
undertaken by RTOs. That is, concerns have been
raised about comparability of standards.
3Aim
- To develop a series of products that would:
- Improve the consistency in assessment decisions within VET
- Increase the level of confidence in industry in assessment in VET
- Increase awareness of, and consistency in, the application of reasonable adjustments in making assessment decisions
- Increase capability in RTOs to demonstrate compliance with AQTF 2007 Essential Standards for Registration, Standard 1.
4Products
- Guide for Developing Assessment Tools
- Code of Professional Practice for Validation and Moderation
- Implementation Guide Validation and Moderation
http://www.nqc.tvetaustralia.com.au/nqc_publications
5Changes to the AQTF User Guide
- Validity
- Reliability
- Assessment tool
- Validation
- Moderation
6- The Guide for Developing Assessment Tools
7Essential Characteristics of an Assessment Tool
- An assessment tool includes the following components:
- The learning or competency unit(s) to be assessed
- The target group, context and conditions for the assessment
- The tasks to be administered to the candidate
- An outline of the evidence to be gathered from the candidate
- The evidence criteria used to judge the quality of performance (i.e., the assessment decision making rules)
- The administration, recording and reporting requirements.
8Ideal Characteristics
- The context
- Competency mapping
- The information to be provided to the candidate
- The evidence to be collected from the candidate
- Decision making rules
- Range and conditions
- Materials/resources required
- Assessor intervention
- Reasonable adjustments
- Validity evidence
- Reliability evidence
- Recording requirements
- Reporting Requirements
9Competency Mapping
- The components of the Unit(s) of Competency that
the tool should cover should be described. This
could be as simple as a mapping exercise between
the components within a task (e.g., each structured
interview question) and components within a Unit
or cluster of Units of Competency. The mapping
will help determine the sufficiency of the evidence
to be collected as well as the content validity.
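The mapping exercise described above can be sketched in a few lines of Python. This is only an illustration: the component names, the interview questions and the helper functions `coverage` and `gaps` are hypothetical and do not come from the Guide. The sketch records which tasks address each component of a unit and lists any component that no task covers, which is one simple check on sufficiency and content validity.

```python
# Hypothetical components of a Unit of Competency (illustrative only).
unit_components = ["Element 1", "Element 2", "Element 3", "Required knowledge"]

# Hypothetical mapping from each task component (here, structured
# interview questions) to the unit components it is meant to address.
task_mapping = {
    "Interview Q1": ["Element 1"],
    "Interview Q2": ["Element 1", "Element 2"],
    "Interview Q3": ["Required knowledge"],
}


def coverage(unit_components, task_mapping):
    """Return, for each unit component, the list of tasks mapped to it."""
    covered = {component: [] for component in unit_components}
    for task, components in task_mapping.items():
        for component in components:
            if component not in covered:
                raise ValueError(f"{task} maps to unknown component: {component}")
            covered[component].append(task)
    return covered


def gaps(unit_components, task_mapping):
    """Return the unit components that no task is mapped to."""
    report = coverage(unit_components, task_mapping)
    return [component for component, tasks in report.items() if not tasks]


report = coverage(unit_components, task_mapping)
missing = gaps(unit_components, task_mapping)
print("Coverage:", report)
print("Not yet covered:", missing)
```

In this made-up example "Element 3" has no task mapped to it, so the tool would need another task or question before the evidence could be considered sufficient.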
10Decision Making Rules
- The rules to be used to:
- Check evidence quality (i.e., the rules of evidence).
- Judge how well the candidate performed according to the standard expected.
- Synthesise evidence from multiple sources to make an overall judgement.
11Reasonable Adjustments
- This section should describe the guidelines for
making reasonable adjustments to the way in which
evidence of performance is gathered without
altering the expected performance standards (as
outlined in the decision making rules).
12Validity Evidence
- Validity is concerned with the extent to which an
assessment decision about a candidate, based on
the performance by the candidate, is justified.
It requires determining conditions that weaken the
truthfulness of the decision, exploring
alternative explanations for good or poor
performance, and feeding them back into the
assessment process to reduce errors when making
inferences about competence.
- Evidence of validity (such as face, construct,
predictive, concurrent, consequential and
content) should be provided to support the use of
the assessment evidence for the defined purpose
and target group of the tool.
13Reliability Evidence
- Reliability is concerned with how much error is included in the evidence.
- If using a performance based task that requires
professional judgement of the assessor, evidence
of reliability could include providing evidence of:
- The level of agreement between two different
assessors who have assessed the same evidence of
performance for a particular candidate (i.e.,
inter-rater reliability).
- The level of agreement of the same assessor who
has assessed the same evidence of performance of
the candidate, but at a different time (i.e.,
intra-rater reliability).
- If using objective test items (e.g., multiple
choice tests) then other forms of reliability
should be considered, such as the internal
consistency of a test (i.e., internal
reliability) as well as the equivalence of two
alternative assessment tests (i.e., parallel
forms).
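One way to quantify the "level of agreement" mentioned above is sketched below. The slides do not prescribe a statistic; this sketch assumes simple percent agreement plus Cohen's kappa, which corrects for agreement expected by chance. The judgement data and the labels "C" (competent) and "NYC" (not yet competent) are invented for illustration.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of candidates on whom the two sets of judgements agree."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must judge the same, non-empty set of candidates")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = counts_a.keys() | counts_b.keys()
    # Chance agreement: probability both raters pick the same category
    # if each judged independently at their own observed rates.
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2
    if p_expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (p_observed - p_expected) / (1 - p_expected)


# Invented judgements of the same eight pieces of candidate evidence.
assessor_1 = ["C", "C", "NYC", "C", "NYC", "C", "C", "NYC"]
assessor_2 = ["C", "C", "NYC", "NYC", "NYC", "C", "C", "C"]

agreement = percent_agreement(assessor_1, assessor_2)
kappa = cohens_kappa(assessor_1, assessor_2)
print(f"Percent agreement: {agreement:.2f}, Cohen's kappa: {kappa:.2f}")
```

The same functions could be applied to intra-rater reliability by passing one assessor's judgements of the same evidence made at two different times.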
14Examples
- Portfolio
- Interview
- Observation
- Product
15Quality Checks
16A Code of Professional Practice for Validation
and Moderation
17Assessment Quality Management
- Quality Assurance
- Quality Control
- Quality Review
19Validation Versus Moderation
20Focus - Tool
- Has clear, documented evidence of the procedures
for collecting, synthesising, judging and
recording outcomes (i.e., to help improve the
consistency of assessments across assessors:
inter-rater reliability).
- Has evidence of content validity (i.e., whether
the assessment task(s), as a whole, represent the
full range of knowledge and skills specified
within the Unit(s) of Competency).
- Reflects work-based contexts, specific enterprise
language and job-tasks and meets industry
requirements (i.e., face validity).
- Adheres to the literacy and numeracy requirements
of the Unit(s) of Competency (construct
validity).
- Has been designed to assess a variety of evidence
over time and contexts (predictive validity).
- Has been designed to minimise the influence of
extraneous factors (i.e., factors that are not
related to the unit of competency) on candidate
performance (construct validity).
21Focus - Tool
- Has clear decision making rules to ensure
consistency of judgements across assessors
(inter-rater reliability) as well as consistency
of judgements within an assessor (intra-rater
reliability).
- Has a clear instruction on how to synthesise
multiple sources of evidence to make an overall
judgement of performance (inter-rater
reliability).
- Has outlined appropriate reasonable adjustments
that could be made to the gathering of assessment
evidence for specific individuals and/or groups.
- Has evidence that the principles of fairness and
flexibility have been adhered to.
- Has been designed to produce sufficient, current
and authentic evidence.
- Is appropriate in terms of the level of
difficulty of the task(s) to be performed in
relation to the skills and knowledge specified
within the relevant Unit(s) of Competency.
- Has adhered to the relevant organisation
assessment policy.
22Focus - Judgement
- Check whether the judgement was too harsh or too
lenient by reviewing samples of judged candidate
evidence against the:
- Requirements set out in the Unit(s) of Competency
- Benchmark samples of candidate evidence at
varying levels of achievement (including
borderline cases)
- Assessment decision making rules specified within
the assessment tools.
- Desirable for validation, mandatory for moderation
23Types of Approaches - Assessor Partnerships
- Validation only
- Informal, self-managed, collegial
- Small group of assessors
- May involve
- Sharing, discussing and/or reviewing one
another's tools and/or judgements
- Benefit
- Low costs, personally empowering, non-threatening
- Weakness
- Potential to reinforce misconceptions and mistakes
24Types of Approaches - Consensus
- Typically involves assessors reviewing their own and
colleagues' assessment tools and judgements as a
group
- Can occur within and/or across organisations
- Strength
- Professional development, networking, promotes
collegiality and sharing
- Weakness
- Less quality control than external and
statistical approaches as they can also be
influenced by local values and expectations
- Requires a culture of sharing
25Types of Approaches - External
- Types
- Site visit versus central agency
- Strengths
- Offer authoritative interpretations of standards
- Improve consistency of standards across locations
by identifying local bias and/or misconceptions
(if any)
- Educative
- Weakness
- Expensive
- Less control than statistical
26Types of Approaches - Statistical
- Limited to moderation
- Yet to be pursued at the national level in VET
- Requires some form of common assessment task at
the national level
- Adjusts level and spread of RTO based assessments
to match the level and spread of the same
candidates' scores on a common assessment task
- Maintains RTO-based rank ordering but brings the
distribution of scores across groups of
candidates into alignment
- Strength
- Strongest form of quality control
- Weakness
- Lacks face validity, may have limited content
validity
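The adjustment of "level and spread" described above can be illustrated with a minimal sketch. It assumes a simple linear rescaling, taking "level" as the mean and "spread" as the standard deviation; the slides do not specify the procedure, and the function name `moderate` and all scores are invented for illustration.

```python
from statistics import mean, pstdev


def moderate(rto_scores, common_scores):
    """Linearly rescale RTO-based scores so that their mean and standard
    deviation match those of the same candidates on the common task."""
    if len(rto_scores) != len(common_scores) or len(rto_scores) < 2:
        raise ValueError("Need paired scores for at least two candidates")
    rto_mean, rto_sd = mean(rto_scores), pstdev(rto_scores)
    target_mean, target_sd = mean(common_scores), pstdev(common_scores)
    if rto_sd == 0:
        raise ValueError("RTO scores have no spread to rescale")
    scale = target_sd / rto_sd
    # A positive linear transformation keeps the RTO-based rank order.
    return [target_mean + scale * (score - rto_mean) for score in rto_scores]


# Invented scores for the same four candidates, in the same order.
rto_scores = [60, 70, 80, 90]      # RTO-based assessment
common_scores = [50, 65, 55, 70]   # common assessment task

moderated = moderate(rto_scores, common_scores)
print([round(score, 2) for score in moderated])
```

In this example the second and third candidates are ranked differently on the common task, yet the moderated scores keep the RTO's ordering; only the overall level and spread of the group change.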
27Summary of major distinguishing features
- Validation is concerned with quality review
whilst moderation is concerned with quality
control
- The primary purpose of moderation is to help
achieve comparability of standards across
organisations whilst validation is primarily
concerned with continuous improvement of
assessment practices and outcomes
- Whilst validation and moderation can both focus
on assessment tools, moderation requires access
to judged (or scored) candidate evidence. The
latter is only desirable for validation
- Both consensus and external approaches to
validation and moderation are possible.
Moderation can also be based upon statistical
procedures whilst validation can include less
formal arrangements such as assessor
partnerships
- The outcomes of validation are in terms of
recommendations for future improvement to the
assessment tools and/or processes whereas
moderation may also include making adjustments to
assessor judgements to bring standards into
alignment, where determined necessary.
28Principles
- Transparent
- Representative
- Confidential
- Educative
- Equitable
- Tolerable
29Tolerable
30CONTACT DETAILS
- Andrea Bateman
- Director Education Consultant
- Bateman Giles Pty Ltd
- Email andrea_at_batemangiles.com.au
- Phone 0418 585 754
- Associate Professor Shelley Gillis
- Deputy Director,
- Work-based Education Research Centre
- Ph 61 3 9689 3280
- Mobile 0432 756 638
- email shelley.gillis_at_vu.edu.au
- web www.werc.vu.edu.au
WWW.VU.EDU.AU