Title: Show Me How to Get Past MCQs: Emerging Opportunities in Measurement. Carol O'Byrne, PEBC; Karen S. Flint and Jaime Walla, AMP; Drs. Frank Hideg, Paul Townsend and Mark Christensen, NBCE; Alison Cooper, CAPR; Lila Quero-Muñoz, Consultant
1. Show Me How to Get Past MCQs: Emerging Opportunities in Measurement
- Carol O'Byrne, PEBC
- Karen S. Flint and Jaime Walla, AMP
- Drs. Frank Hideg, Paul Townsend and Mark Christensen, NBCE
- Alison Cooper, CAPR
- Lila Quero-Muñoz, Consultant
- Presented at the 2004 CLEAR Annual Conference
- September 30 - October 2, Kansas City, Missouri
2. Goals
- Gain an overview of performance assessment
- Observe and try out electronic standardized patient simulations
- Consider exam development, implementation and administration issues
- Consider validity questions and research needs
- Create computer-administered standardized patient simulations with scoring rubrics
- Set passing standards
3. Part 1 - Presentations
- Introduction to performance assessment
- Purposes and objectives
- Models
- Issues, successes and challenges
- 15-minute presentations
- Four models, including their unique aspects, with two participatory demonstrations
- Developmental and ongoing validity issues and research studies
4. Part 2 - Break-out Sessions
- Identify steps in the development and implementation of a new performance assessment, and develop a new station
- Create a new electronic simulation and set passing standards
- Create a new standardized patient simulation and scoring rubrics
- Participate in a standard-setting exercise using the Competence Standard Setting Method
- And all the while, ask the hard questions
5. Performance Assessment - WHY?
- To assess important problem-solving, critical thinking, communication, hands-on and other complex skills that:
- Impact clients' safety and welfare if not performed adequately, and
- Are difficult to assess in a multiple-choice question format
6. HOW?
- "Pot luck" direct observation (e.g., medical rounds, clerkships and internships)
- Semi-structured assessments (e.g., orals and Patient Management Problems)
- Objective Structured Clinical Examinations (OSCEs), combining standardized client interactions with other formats
- Other standardized simulations (e.g., airline pilots' simulators)
- Electronic simulations (e.g., real estate, respiratory care, architecture)
7. Does it really work?
- Links in the chain of evidence to support the validity of examination results:
- Job Analysis
- Test Specifications
- Item Writing
- Examination Construction
- Standard Setting
- Test Administration
- Scoring
- Reporting Test Results
8. PEBC Qualifying Examination
- Based on national competencies
- Two parts: MCE and OSCE
- Must pass both to be eligible for pharmacist licensure in Canada
- Offered spring and fall in multiple locations
- 1400 candidates/year
- $1350 CDN
- 15-station OSCE
- 12 client interactions (SP or SHP) and 3 non-client stations
- 7-minute stations
- One expert examiner
- Checklist to document performance
- Holistic ratings to score exam
- Standard setting
- Reports results and feedback
9. Competencies Assessed by PEBC's MCE and OSCE
10. Comparing PEBC's OSCE (PS04) and MCE (QS04) Scores
11. Comparing PEBC's OSCE and MCE Scores
12. Holistic Rating Scales
- COMMUNICATION Skills (1)
- Rapport
- Organization
- Verbal and nonverbal expression
- Problem-solving OUTCOME (2)
- Information processing
- Decision making
- Follow-up
- Overall PERFORMANCE (3)
- Communication and outcome
- Thoroughness (checklist)
- Accuracy (misinformation)
- Risk
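As a purely illustrative reading of this slide, an examiner's documentation for one station could be modeled as below. This is a sketch only: the field names, types and rating ranges are assumptions, not PEBC's actual data model.

```python
# Hypothetical record for one OSCE station, assuming the three holistic
# scales on this slide; field names and integer ratings are invented.
from dataclasses import dataclass
from typing import List

@dataclass
class StationRecord:
    checklist_done: List[str]  # items the examiner checked (thoroughness)
    communication: int         # scale 1: rapport, organization, expression
    outcome: int               # scale 2: problem-solving outcome
    performance: int           # scale 3: overall, informed by the others
```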
13. (No transcript)
14. Validity: an ascent from Practice Analysis to Test Results
- Job/practice analysis
- Who? What contexts?
- How?
- Test specifications and sampling
- Which competencies?
- Which tasks/scenarios?
- Other parameters?
- Item writing and review
- Who and how?
- Scoring
- Analytic (checklists) and/or holistic (scales)?
15. Validity: an ascent from Practice Analysis to Test Results
- Detect and minimize unwanted variability, e.g.:
- Items/tasks: does the mix matter?
- Practice effect: how can we avoid it?
- Presentation/administration: what is the impact of different SPs, computers, materials/equipment?
- Scores: how do we know how accurate and dependable they are? What can we do to improve accuracy?
- Set defensible pass-fail standards
- How should we do this when different standard-setting methods lead to different standards?
- How do we know if the standard is appropriate?
- Report results
- Are they clear? Interpreted correctly?
- Are they defensible?
16. Validity: flying high
- Evidence:
- Strong links from job analysis to interpretation of test results
- Relates to performance in training and other tests
- Reliable, generalizable and dependable:
- Scores
- Pass-fail standards and outcomes
- Feasible:
- Large and small scale programs
- Economic, human, physical and technological resources
- Ongoing research
17. Wild Life
- Candidate diversity
- Language
- Training
- Format familiarity, e.g. computer skills
- Accommodations
- Logistics
- Technological requirements
- Replications (fatigue, attention span)
- Security
18. Computer-Based Simulations
- Karen S. Flint, Director, Internal Development and Systems Integration, Applied Measurement Professionals, Inc.
- Presented at the 2004 CLEAR Annual Conference
- September 30 - October 2, Kansas City, Missouri
19. Evolution of Simulation Exam Format
- AMP's parent company, NBRC, provided oral exams from 1961 to 1978
- Alternative sought due to:
- Limited number of candidates that could be tested each administration
- Cost to candidates, who had to travel to the location
- Concern about potential oral examiner bias
20. Evolution of Simulation Exam Format
- Printed simulation exam format introduced in 1978, using latent image technology
- Latent image format used by NBRC from 1978 to 1999
- NBRC decided to convert all exams to computer-based testing
- Proprietary software, developed by AMP to administer simulation exams in a comparable format via computer, introduced in 2000
- Both latent image test booklets and the computerized format are being used
21. How Simulation Exams Differ from MCQs
- Provide accurate assessment of higher-order thinking related to a content area of interest (testing more than just recall)
- Challenge test takers beyond the complexity of MCQs
- Simulation problems allow test takers to assess their skills against test content drawn from realistic situations or clinical events
22. Sample relationship between multiple-choice and simulation scores assessing similar content
23. Simulation Utility
- Continuing competency examinations
- Self-assessment/practice examinations
- High-stakes examinations
- Psychometric characteristics comparable to other assessment methodologies
- That is, good reliability and validity
24. Professions Using This Simulation Format
- Advanced-Level Respiratory Therapists
- Advanced-Level Dietitians
- Lighting Design Professionals
- Orthotist/Prosthetist Professionals
- Health System Case Management Professionals (beginning 2005)
- Real Estate Professionals
- Candidate fees range from $200 to $525 for a full-length certification/licensure simulation exam
25. Structure of Simulations
- Opening Scenario
- Information Gathering (IG) Sections
- Decision Making (DM) Sections
- Single or multiple DM sections
- All choices are weighted (-3 to +3)
- Passing scores relate to the judgment of content experts on minimal competence
26. Simulation Development (graphic depiction of the path through a simulation problem)
27. IG Section Details
- IG section: a section in which test takers choose information that will best help them understand a presenting problem or situation
- Facilitative options may receive scores of +3, +2, or +1
- Uninformative, wasteful, unnecessarily invasive, or potentially illegal options may receive scores of -1, -2, or -3
- Test takers who select undesirable options accumulate negative section points
28. IG Section Details
- IG Section Minimum Pass Level (MPL)
- Among all options with positive scores in a section, some should be designated as REQUIRED for minimally competent practice
- The sum of points for all REQUIRED options in a section equals the MPL (see the sketch below)
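A minimal sketch of the IG scoring rules just described, assuming option weights of -3 to +3 and a REQUIRED flag. The option layout and names are invented for illustration; this is not AMP's actual scoring code.

```python
# Illustrative IG-section scoring per slides 27-28; all data is invented.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Option:
    text: str
    weight: int             # +3..+1 facilitative, -1..-3 undesirable
    required: bool = False  # REQUIRED for minimally competent practice

def section_score(options: List[Option], chosen: Set[int]) -> int:
    """Candidate's section score: sum of the weights of selected options."""
    return sum(opt.weight for i, opt in enumerate(options) if i in chosen)

def section_mpl(options: List[Option]) -> int:
    """Section MPL: sum of points for all REQUIRED options."""
    return sum(opt.weight for opt in options if opt.required)
```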
29. DM Section Details
- DM section: a section of typically 4-6 options in which the test taker must make a decision about how to handle the presenting situation
- Facilitative options may receive scores of +3, +2, or +1
- Harmful or potentially illegal options may receive scores of -1, -2, or -3
- Test takers who select undesirable options accumulate negative section points and are directed to select another option
30. DM Section Details
- DM Section Minimum Pass Level (MPL)
- A section may contain two correct choices, but one must be designated as REQUIRED for minimally competent practice
- The REQUIRED option's point value in the section equals the MPL
31. Minimum Passing Level
- DM MPL: the sum of all DM section MPLs
- IG MPL: the sum of all IG section MPLs
- Overall simulation problem MPL: candidates must achieve the MPL in both Information Gathering and Decision Making (see the pass/fail sketch below)
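Continuing the earlier sketch, the aggregation rule on this slide might look like the following. Again, a hedged illustration with invented numbers, not AMP's implementation.

```python
# Illustrative pass/fail aggregation per slide 31: the candidate must reach
# the summed MPL separately in Information Gathering and Decision Making.
from typing import List

def passes_problem(ig_scores: List[int], ig_mpls: List[int],
                   dm_scores: List[int], dm_mpls: List[int]) -> bool:
    return sum(ig_scores) >= sum(ig_mpls) and sum(dm_scores) >= sum(dm_mpls)

# Invented numbers: IG sections scored 5 and 4 against MPLs 4 and 3 (pass);
# DM sections scored 3 and 2 against MPLs 3 and 3 (fail) -> overall fail.
print(passes_problem([5, 4], [4, 3], [3, 2], [3, 3]))  # False
```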
32. Simulation Exam Development
- 8 to 10 simulation problems per examination
- Each problem assesses a different situation typically encountered on the job
33. Let's Attempt a Computerized Simulation Problem!
34.
- Karen S. Flint, Director, Internal Development and Systems Integration
- Applied Measurement Professionals, Inc.
- 8310 Nieman Road
- Lenexa, KS 66214
- 913.541.0400 (Fax 913.541.0156)
- KFlint@goAMP.com
- www.goAMP.com
35. Practical Testing
- Dr. Frank Hideg, DC; Dr. Mark Christensen, PhD; Dr. Paul Townsend, DC
- Presented at the 2004 CLEAR Annual Conference
- September 30 - October 2, Kansas City, Missouri
36. NBCE History
- The National Board of Chiropractic Examiners was founded in 1963
- The first NBCE exams were administered in 1965
- Prior to 1965, chiropractors were required to take chiropractic state boards and medical state basic science boards for licensure
37. NBCE Battery of Pre-licensure Examinations
- Part I: Basic Sciences Examinations
- Part II: Clinical Sciences Examinations
- Part III: Written Clinical Competency
- Part IV: Practical Examination for Licensure
38. Hierarchy of Clinical Skills (top to bottom)
- DO - PRACTICE
- SHOW HOW - PART IV
- KNOW HOW - PART III
- KNOWLEDGE - PARTS I and II
39. NBCE Practical Examination
- Content Areas
- Diagnostic Imaging
- Chiropractic Technique
- Chiropractic Case Management
40. Content Weighting
- TEC (Chiropractic Technique): 17%
- DIM (Diagnostic Imaging): 16%
- CAM (Chiropractic Case Management): 67%
41. Diagnostic Imaging
- 10 four-minute stations
- Candidate identifies radiological signs on plain-film x-rays
- Candidate determines most likely diagnoses
- Candidate makes most appropriate initial case management decisions
42. Chiropractic Technique
- 5 five-minute stations
- Candidate demonstrates two adjusting techniques per station:
- Cervical spine
- Thoracic spine
- Lumbar spine
- Sacroiliac articulations
- Extremity articulations
43. Chiropractic Case Management
- 10 five-minute patient encounter stations
- 10 linked post-encounter probe (PEP) stations
- Candidate performs focused case histories
- Candidate performs focused physical examinations
- Candidate evaluates patient clinical database
- Candidate makes differential diagnoses
- Candidate makes initial case management decisions
44. Key Features of NBCE Practical Examination
- Use of standardized patients
- Use of OSCE format and protocols
45. Case History Stations
- Successful candidates use an organized approach while obtaining case history information
- Successful candidates communicate effectively with patients
- Successful candidates respect patient dignity
- Successful candidates elicit adequate historical information
46. Perform a Focused Case History
47. Post-Encounter Probe Station
48. Part IV Candidate Numbers
49. Part IV State Acceptance
50. (No transcript)
51. Candidate Qualifications
- Candidates must pass all basic science and clinical science examinations before applying
- Candidates must be within 6 months of graduation from an accredited chiropractic college
- $1,075 examination fee
52. Let's See How This Works!
53. Contact Information
- National Board of Chiropractic Examiners
- 901 54th Avenue
- Greeley, CO 80634
- 970-356-9100, 970-356-1095
- ptownsend@nbce.org
- www.nbce.org
54. Station Development
- Alison Cooper, Manager of Examination Operations, Canadian Alliance of Physiotherapy Regulators
- Presented at the 2004 CLEAR Annual Conference
- September 30 - October 2, Kansas City, Missouri
55. First Principles
- If it's worth testing, it's worth testing well
- it is possible to test anything badly
- this is more expensive
- Some things are not worth testing:
- trivia
- infrequently used skills
56. Overview
- Write
- Review
- Dry run
- Approve
57. Write
- Focus of station
- SP portrayal - general
- Checklist scoring
- Instructions to candidate
- Details of SP instructions
- Review everything
- References
58. Focus of Station
- Each station must have a clear focus
- establish the focus in one sentence
- take time to get this right
- you can't write a good station without a clear focus
- Example: Perform passive range of motion of the arm for a client who has had a stroke.
59. SP Portrayal - General
- Consider SP movement and behaviour
- a picture in your head
- use real situations to guide you
- Not detailed yet
- Example: Client is 55 years old, is disoriented, and has no movement in the left arm or leg.
60. Checklist Scoring
- What is important to capture?
- Consider the level of the candidates
- Group items logically
- Assign scores to items
- Scoring scales
61. Checklist Example
- Explains purpose of interaction (1)
- Corrects client's position (2)
- Performs passive ROM of scapula (1)
- Performs passive ROM of shoulder (1)
- Performs passive ROM of elbow (1)
- Performs passive ROM of wrist (1)
- Performs passive ROM of hand and fingers (1)
- Performs passive ROM of thumb (1)
- Uses proper body mechanics (3)
- Uses proper handling (3)
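To make the weighting concrete, here is a small illustrative scorer for the checklist above (total possible score 15). The item keys are abbreviated and the code is a sketch, not the Alliance's actual scoring tool.

```python
# Point values copied from the checklist above; keys are shortened labels.
CHECKLIST = {
    "explains purpose of interaction": 1,
    "corrects client's position": 2,
    "passive ROM of scapula": 1,
    "passive ROM of shoulder": 1,
    "passive ROM of elbow": 1,
    "passive ROM of wrist": 1,
    "passive ROM of hand and fingers": 1,
    "passive ROM of thumb": 1,
    "uses proper body mechanics": 3,
    "uses proper handling": 3,
}

def station_score(items_observed):
    """Sum the weights of checklist items the examiner marked as done."""
    return sum(w for item, w in CHECKLIST.items() if item in items_observed)

print(station_score({"explains purpose of interaction",
                     "uses proper handling"}))  # 4 of a possible 15
```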
62. Instructions to Candidate
- Information the candidate needs:
- age and sex of client
- pertinent information and assumptions
- The task for the candidate:
- exactly what they are to do and not do
63. Example
- Eric Martin, 55 years old
- This client had a right middle cerebral artery haemorrhage resulting in a left-sided hemiplegia two (2) weeks ago.
- The client presents with confusion and left-sided flaccidity. His cardiovascular status is stable.
- Perform passive range of motion on the client's left upper extremity.
- Perform only one (1) repetition of each movement.
- Assume that you have the client's consent.
64. Details of SP Instructions
- History, onset, changes
- Initial position, movements, demeanor, must say/ask
- anticipate strong AND weak candidates
- Cover the checklist and candidate instructions
- SP prompts
65. SP Instructions...
- Use plain language
- Include
- what to wear/not wear
- features of the SP (height, scars)
- Diagrams are often helpful
66. Example
- Presenting complaint
- Initial position, general mobility, affect
- Comments you must make
- Medical, social history
- Medications
- Activities and areas affected
- Sensation
- Pain
- Muscle quality
- Responses to candidate
- Emotions
67. Check Everything
- Go back and check:
- does it make sense?
- is there still a clear focus?
- is anything missing?
- Edit/revise as needed
- add notes to examiner for clarification
- Check for plain language
68. References
- Use references you expect candidates to know
- Umphred, 2nd edition, page 681
69. Next Steps
- Review by others
- Dry run
- Approve for use
70. Thank you
- Canadian Alliance of Physiotherapy Regulators
- 1243 Islington Ave., Suite 501
- Toronto, ON, Canada M8X 1Y9
- (W) 416-234-8800, (F) 416-234-8820
- acooper@alliancept.org
- www.alliancept.org
71. OSCE Research: The Key to a Successful Implementation
- Lila J. Quero-Muñoz, PhD, Consultant
- Presented at the 2004 CLEAR Annual Conference
- September 30 - October 2, Kansas City, Missouri
72. Prior to the OSCE: CPBC and PEBC
- Need for assessing communication, counseling, and interpersonal skills to provide pharmaceutical care to patients
- The PEBC MC examination was not assessing the full scope of pharmacy practice as profiled by NAPRA (National Association of Pharmacy Regulatory Authorities of Canada)
73. Generalizability Data Analyses
- Psychometrically, OSCEs are complex phenomena, producing scores with potential errors from multiple sources, including:
- Examiners (pharmacists and non-pharmacists)
- Cases (context, complexity, number of stations)
- Scoring methods (global vs. checklists)
- Standard setting
- Differential grading practices
74. Research Question 1
- How many examiners are required to obtain consistent and dependable candidate scores?
75. Results 1 - 1998
- 1 examiner per case yielded similar consistency to 2 (G = .82, .81; D = .81, .79), indicating that examiners agreed highly on their scores
- Examiners contributed little to the scoring error of candidates' performance
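For readers unfamiliar with the notation, the G and D coefficients cited here are commonly defined as follows for a persons (p) by raters (r) design with n_r raters per candidate. This is standard generalizability theory, not necessarily the exact model used in this study.

```latex
% Relative (G) and absolute (D, or Phi) coefficients, p x r design.
G = \frac{\sigma^2_p}{\sigma^2_p + \sigma^2_{pr,e}/n_r},
\qquad
\Phi = \frac{\sigma^2_p}{\sigma^2_p + \left(\sigma^2_r + \sigma^2_{pr,e}\right)/n_r}
```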
76. 1 vs. 2 Global - 1999
77. Research Question 2
- How many cases are required to maintain consistency, validity and generalizability of scores?
- Adequate and representative sampling of professional practice is necessary to capture a candidate's abilities.
- Multiple observations of abilities yield more consistent and content-valid inferences.
- Logistical constraints restrict the number of cases that are timely and economically feasible to administer within one OSCE examination.
78. Results 2 - 1998
- Compared with 5 or 10 cases, 15 cases dramatically reduced the error in candidates' scores due to case-sampling variability, and improved the consistency of scores from G = .60 to .81
- 15 cases also reduced the case-by-rater interaction variance, indicating that raters agreed on their scores across cases
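A toy D-study projection illustrates the pattern reported here. The variance components below are invented, chosen only so that the projected G roughly matches the slide's .60-to-.81 improvement; they are not the study's estimates.

```python
# Toy D-study: projected relative G as the number of cases grows, for a
# persons-by-cases design. Variance components are invented for illustration.
def projected_g(var_person: float, var_residual: float, n_cases: int) -> float:
    return var_person / (var_person + var_residual / n_cases)

for n in (5, 10, 15):
    print(n, round(projected_g(0.30, 1.05, n), 2))  # ~0.59, 0.74, 0.81
```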
79. Results 2 - 1998
- Candidates' scores varied mostly due to their differential performance across cases.
- Sampling of the cases might affect candidates' performance on an OSCE.
- We suggest, however, that differential performance across cases might be due to candidates' differential levels of skill across the pharmacy competencies assessed
80. Profile of Sources of Errors - 1998
81. Research Question 3
- How do different scoring methods, such as checklists or global grading, affect candidates' scores?
82. Results 3 - 1998
- Low correlations between checklist and global scores suggest the two methods might not be interchangeable
- If used in isolation, they would yield different end results, particularly for borderline candidates
- Global grading yields higher mean scores than checklist grading (p values .81 and .59)
83. Global vs. Checklist - 1999
84. Research Question 4
- What is the validity and defensibility of standard-setting procedures and pass/fail decisions?
85. Results 4 - 1998
- SMEs agreed highly on the minimum standard necessary for safe pharmacy practice by borderline qualified pharmacists
- On different occasions, SMEs had similar entry-to-practice standards for the core cases
- Standards varied little between 26 and 20 cases, and were consistent enough with 15 cases (G = .74, .74, .71)
86. Results 4 - 2003
87. Research Question 5
- Are there differential grading practices among Canadian provinces?
- Are candidates' pass/fail decisions affected by provincial differences in scoring practices?
88. Results 5 - Videos 2003
- Variability in scores between sites is due mostly to true-score variance
- Differences between exam sites are in the magnitude of scores, not in pass/fail status
- Differences between assessors are mostly in the magnitude of scores, not in pass/fail status
- Pass/fail decisions did not vary between sites and assessors
- There is more variance between assessors than between exam sites
89. Results 5 - 2003
90. Results 5 - 2003
91. Results 5 - 2003
92. Conclusions 1998-2004
- Development of cases should follow templates, guidelines and a detailed blueprint
- Selection of cases must follow a detailed blueprint to mirror OSCE forms between exam administrations, to control for differences in cases such as complexity and content
93. Conclusions 1998-2004
- The multiple sources of error in OSCEs force us to do more extensive and non-traditional research than for MC exams
- OSCEs require continuous vigilance to assess the impact of the many sources of error
- OSCE research must be planned and implemented beyond exam administrations
94. Conclusions 1998-2004
- OSCE infrastructure must support both design research and exam administration research
- Successful implementation and continuous improvement of OSCEs go hand in hand with research
- More collaborative efforts among OSCE users are needed to build on each other's successes and avoid pitfalls
95. Conclusions 1998-2004
- Although OSCE research is costly, it is a deterrent to litigation and to wasted exam administration resources
- Similar conclusions may apply to other performance assessments
96.
- Carol O'Byrne, BSP, OSCE Manager
- John Pugsley, PharmD, Registrar, PEBC
- obyrnec@pebc.ca
- 416-979-2431, 1-416-260-5013 Fax
- Lila J. Quero-Muñoz, PhD, Consultant
- 787-431-9288, 1-888-663-6796 Fax
- lila@insidetraveling.com