Title: Performance-based Testing: The Reality Show of the Examinations World
1Performance-based Testing The Reality Show of
the Examinations World
- Clarence Buck Chaffee
- President The Caviart Group
2Reality in Television
- America's Next Top Model
- Dancing With the Stars
- Project Runway
- The Biggest Loser
- Laguna Beach
- The Contender
- Survivor
- The Amazing Race
- Flavor of Love
- Pimp My Ride
3People Stranded on a Desert Island?
4Not Exactly!
5Reality in Testing
Measuring individual competence by having the individual do the actual tasks or performances of their work environment.
Testing by doing
Real-world work in a real-world environment
Well... almost
6Like reality television, performance-based testing:
- Looks like real-world problems
- Simulates reality in a controlled fashion
- Must be carefully designed to entertain or to accurately measure ability
- Reality is a set of stimuli in a restricted environment
7Defining Content for Performance Exam
- The Job Analysis defines the knowledge and skills required in practice
- Knowledge areas are tested in multiple-choice fashion; competencies or skills are better tested using performance-based measures
8The Landscape Architect Registration Examination
- Section A - Project and Construction Administration
- Section B - Inventory, Analysis and Program Development
- Section C - Site Design
- Section D - Design and Construction Documentation
- Section E - Grading, Drainage and Stormwater Management
9Creating the Exam
- The exam specification identifies the competencies to be tested through performance problems
- Important competencies are tested in several problems to improve reliability
- The examination committee creates problems that incorporate competencies
10Section C - Competencies
- conveying information through text and in drawings
- synthesis of and connections between aspects of landscape architecture and disciplines outside of landscape architecture, including consultant studies
- development of conceptual design, planning, and management solutions considering on-site and off-site influences
- prediction of the implications of design, planning, and management proposals on natural and cultural systems both within the site and in the larger context
- creation of design alternatives to demonstrate the range of options
- evaluation of design alternatives to determine the appropriate solution
- design of circulation systems (e.g., equestrian, bicycle, pedestrian, vehicular)
11The Landscape Architect Registration Examination
(L.A.R.E.) Site Design Vignette Problem
12The Landscape Architect Registration Examination
(L.A.R.E.) Site Design
13Competencies
- conveying information through text and in drawings
- synthesis of and connections between aspects of landscape architecture and disciplines outside of landscape architecture, including consultant studies
- design for protection and management of land resources (e.g., land forms, grading, drainage, vegetation, habitat, erosion and sedimentation control)
- design for protection and management of water resources (e.g., storm water, water supply, ground water)
14The Landscape Architect Registration Examination
(L.A.R.E.) Grading and Drainage Vignette
15Scoring the Exam
- Evaluation criteria are created to assess the competencies being tested
- Two evaluators review each exam solution to ensure accuracy of scores
- Discrepant scores are resolved by a master evaluator (one of the most experienced evaluators, who is also responsible for training other evaluators)
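A minimal sketch of the double-scoring rule above. The slides do not say what counts as "discrepant" or how two agreeing scores are combined, so the `tolerance` parameter and the averaging step are assumptions made for illustration only.

```python
from typing import Optional


def resolve_score(first: int, second: int,
                  master: Optional[int] = None,
                  tolerance: int = 1) -> float:
    """Final score for one competency rated by two evaluators.

    Assumptions (not from the slides): ratings within `tolerance`
    points of each other are averaged; a wider gap is treated as
    discrepant and the master evaluator's rating becomes final.
    """
    if abs(first - second) <= tolerance:
        return (first + second) / 2
    if master is None:
        raise ValueError("discrepant scores need a master evaluator rating")
    return float(master)


print(resolve_score(4, 5))            # evaluators agree closely: 4.5
print(resolve_score(2, 5, master=4))  # discrepant, master decides: 4.0
```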
16Scoring the Exam
- Each drawing tests two competencies
- Evaluators assign scores to each competency based on a six-point scale
- The point scale is weighted based on the importance rating given that competency and how many times that competency is tested across the exam
17Tabulating the Scores
- Possible point values on the exam range from 0 to 1000 points
- The large scale reduces the error of judgment and improves score reliability
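One way the weighting and the 0 to 1000 scale could fit together is sketched below. The actual L.A.R.E. weighting formula is not given in the slides. This sketch assumes the six-point scale runs from 0 to 5, that each competency receives a share of the 1000 points in proportion to its importance rating, and that the share is split evenly across the problems that test it. The competency names and ratings are made up.

```python
def tabulate(ratings, importance, max_rating=5, total_points=1000):
    """Convert six-point ratings into a 0..total_points exam score.

    ratings:    {competency: [rating, ...]}, one rating per problem
                in which the competency is tested (assumed 0..5)
    importance: {competency: importance rating from the job analysis}
    """
    total_importance = sum(importance[c] for c in ratings)
    score = 0.0
    for competency, rs in ratings.items():
        # Points available for this competency, by importance.
        share = total_points * importance[competency] / total_importance
        # Split the share evenly over each time it is tested.
        per_problem = share / len(rs)
        score += sum(r * per_problem / max_rating for r in rs)
    return round(score)


ratings = {"circulation systems": [5, 3], "conveying information": [4]}
importance = {"circulation systems": 3, "conveying information": 1}
print(tabulate(ratings, importance))  # 800
```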
18Setting the Passing Point
- Each time the exam is given, a Subject Matter Experts Committee is convened to set the cut score
- Experts rate 10 practice exams that have been selected to represent the full range of scores
- Experts discuss their decisions and come to consensus on a holistic evaluation approach
19Setting the Passing Point
- Experts then rate 30 problems individually, identifying passing and failing candidates
- A monotonically increasing regression function is used to determine each judge's recommended cut score
- The Beuk Adjustment procedure is used as a secondary confirmation of the recommended cut score
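The regression step can be illustrated with a small sketch. The slides do not name the fitting method or the rule that turns the fitted curve into a cut score. This example assumes an isotonic (pool-adjacent-violators) fit to one judge's pass/fail decisions and takes the lowest score whose fitted pass probability reaches 0.5. The scores and decisions are invented, and the Beuk adjustment is not shown.

```python
def isotonic_fit(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit."""
    blocks = []  # each block holds [sum of values, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while a block's mean exceeds its successor's.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, n = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += n
    fitted = []
    for s, n in blocks:
        fitted.extend([s / n] * n)
    return fitted


def judge_cut_score(scores, passed, threshold=0.5):
    """Lowest score whose fitted pass probability reaches `threshold`."""
    pairs = sorted(zip(scores, passed))
    fitted = isotonic_fit([p for _, p in pairs])
    for (score, _), prob in zip(pairs, fitted):
        if prob >= threshold:
            return score
    return None  # the judge passed no one at this threshold


scores = [520, 560, 600, 640, 680, 720]   # hypothetical exam scores
passed = [0, 0, 1, 0, 1, 1]               # one judge's pass/fail calls
print(isotonic_fit(passed))               # [0.0, 0.0, 0.5, 0.5, 1.0, 1.0]
print(judge_cut_score(scores, passed))    # 600
```

The monotone fit smooths out the judge's one inconsistent call (passing 600 but failing 640) without assuming any particular curve shape.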
20Confirming the Decision
- A cross-section of exams is selected representing each point value across the Standard Error of Judgment (SEJ) estimate created from the analysis of the experts' decisions
- Experts review multiple examinations and adjust their decision within the SEJ estimate as required to achieve an accurate score
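The slides do not show how the SEJ is computed. A common definition is the standard deviation of the judges' recommended cut scores divided by the square root of the number of judges. The sketch below uses that definition and, as a further assumption, a band of one SEJ on either side of the mean. The cut scores are invented.

```python
import math
import statistics


def sej_band(judge_cut_scores):
    """Mean cut score, SEJ, and a +/- 1 SEJ band around the mean."""
    mean_cut = statistics.mean(judge_cut_scores)
    # Sample standard deviation across judges, scaled by sqrt(n).
    sej = statistics.stdev(judge_cut_scores) / math.sqrt(len(judge_cut_scores))
    return mean_cut, sej, (mean_cut - sej, mean_cut + sej)


mean_cut, sej, (low, high) = sej_band([640, 650, 660, 670, 680])
print(mean_cut, round(sej, 2))        # 660 7.07
print(round(low, 1), round(high, 1))  # 652.9 667.1
```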
21Candidate Feedback
- Candidate performance is reported directly on the competencies obtained from the Job Analysis
- Candidate feedback provides a 3-level analysis of their performance for each competency tested:
  - At or above the level necessary to pass
  - Slightly below the level necessary to pass
  - Well below the level necessary to pass
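As a rough illustration, the three feedback levels could be assigned per competency as follows. The slides do not say where "slightly below" ends and "well below" begins, so `slight_margin` and the example numbers are hypothetical.

```python
def feedback_level(score, passing_level, slight_margin):
    """Map a competency score to one of the three feedback levels.

    `slight_margin` (how far below passing still counts as
    "slightly below") is an assumed parameter, not from the slides.
    """
    if score >= passing_level:
        return "At or above the level necessary to pass"
    if score >= passing_level - slight_margin:
        return "Slightly below the level necessary to pass"
    return "Well below the level necessary to pass"


for score in (72, 61, 40):
    print(score, feedback_level(score, passing_level=65, slight_margin=10))
```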
22Competencies follow through each step of the
process
- From the job analysis
- To the exam spec
- To the test construction
- To the scoring criteria
- To the weighting
- To the feedback
23Performance-based Testing The Reality Show of
the Examinations World
- Cristina Goodwin
- Certiport, Inc.
24Microsoft Office Specialist Certification Exams
- Microsoft Office Word - Word Processing
- Microsoft Office Excel - Spreadsheets
- Microsoft Office PowerPoint - Presentations
- Microsoft Office Outlook - E-Mail Communications, Calendars, Personal and Team Management
- Microsoft Office Access - Databases
- Microsoft Office Vista - Operating System (coming in 2007)
25Target Audience
- Secondary and post-secondary students, including those working toward an AA or equivalent degree, and those in continuing education classes to improve and validate their skills.
- People in workforce development or retraining programs seeking to gain and validate new skills.
- Administrative and executive assistants seeking to improve and validate skills.
26Skill Level of Target Audience
- An end user who can create moderately complex deliverables in the application with little or no direction
- A person who is familiar with the tools available within, and the capabilities of, the application and knows how to use them to efficiently create deliverables
- Not an expert
27Languages
- English
- German
- Italian
- Spanish
- Greek
- French
- Portuguese
- Japanese
- Korean
- Chinese Traditional
- Chinese Simplified
28Live Application Testing
29Scoring
- Items are made up of multiple tasks; each task is dichotomously scored
- Scores are based on end result, not method used to complete the task
- There is no limit on number of clicks or time in each item
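A minimal sketch of result-based, dichotomous scoring follows. It is not Certiport's scoring engine: the document state is a made-up dictionary, and the task names and checks are hypothetical. The point it illustrates is that each task is checked against the final state only, so any method that produces the right result earns the point.

```python
def score_item(final_state, task_checks):
    """Score each task in an item as 1 (met) or 0 (not met).

    final_state: snapshot of the document after the candidate finishes
    task_checks: {task name: function(final_state) -> bool}
    Only the end result is inspected; clicks and time are ignored.
    """
    return {name: int(bool(check(final_state)))
            for name, check in task_checks.items()}


# Hypothetical end state of a spreadsheet item.
final_state = {"A1_bold": True, "orientation": "portrait", "row_count": 12}

task_checks = {
    "Bold the heading in A1": lambda s: s["A1_bold"],
    "Set page to landscape": lambda s: s["orientation"] == "landscape",
    "Delete the blank row": lambda s: s["row_count"] == 12,
}

result = score_item(final_state, task_checks)
print(result)                # per-task 0/1 scores
print(sum(result.values()))  # 2 of 3 tasks completed
```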
30Performance Testing Challenges
- Cost: development and delivery costs are substantially higher than for traditional tests
- Detail: both live-application and simulation exams require exhaustive documentation of all reasonable methods for completing tasks
- Technical: the software development aspect of test production and the hardware requirements for test delivery add complexity
31Performance Testing Challenges
- Training: SMEs require specialized training in performance-based item development
- Database management: specialized item banking tools and protocols are needed
- Understanding/documenting differences in localized versions of products
- Absence of industry standards