George T. Milkovich and Alexandra K. Wigdor, Editors, with Renae F. Broderick
and Anne S. Mavor; Committee on Performance Appraisal for Merit Pay, National
Research Council
Distribution, posting, or copying of this PDF is strictly prohibited without written permission of the National Academies Press.
Unless otherwise indicated, all materials in this PDF are copyrighted by the National Academy of Sciences.
NATIONAL ACADEMY PRESS 2101 Constitution Avenue, N.W. Washington, D.C. 20418
NOTICE: The project that is the subject of this report was approved by the Governing Board of
the National Research Council, whose members are drawn from the councils of the National
Academy of Sciences, the National Academy of Engineering, and the Institute of Medicine. The
members of the committee responsible for the report were chosen for their special competences and
with regard for appropriate balance.
This report has been reviewed by a group other than the authors according to procedures
approved by a Report Review Committee consisting of members of the National Academy of
Sciences, the National Academy of Engineering, and the Institute of Medicine.
This project was supported by the U.S. Office of Personnel Management.
Library of Congress Cataloging-in-Publication Data
Pay for performance : evaluating performance appraisal and merit pay / George T. Milkovich
and Alexandra K. Wigdor, editors, with Renae F. Broderick and Anne S. Mavor ; Committee on
Performance Appraisal for Merit Pay, Commission on Behavioral and Social Sciences and
Education, National Research Council.
p. cm.
Includes bibliographical references (p. ) and index.
ISBN 0-309-04427-8
1. Compensation management—United States. 2. Merit pay—United States. 3. Employees—
United States—Rating of. 4. United States—Officials and employees—Salaries, etc. 5. United
States—Officials and employees—Rating of. I. Milkovich, George T. II. Wigdor, Alexandra K. III.
National Research Council (U.S.).
Committee on Performance Appraisal for Merit Pay.
HF5549.5.C67P38 1991 90-25995
658.3'125—dc20 CIP
Copyright © 1991 by the National Academy of Sciences
No part of this book may be reproduced by any mechanical, photographic, or electronic
process, or in the form of a phonographic recording, nor may it be stored in a retrieval system,
transmitted, or otherwise copied for public or private use, without written permission from the publisher,
except for the purposes of official use by the U.S. Government.
Printed in the United States of America
First Printing, January 1991
Second Printing, December 1991
Contents
Preface
Executive Summary
1 Introduction
References
Appendixes
A Survey Descriptions
B Biographical Sketches
Index
Preface
Executive Summary
THE CHARGE
This report reviews the research on performance appraisal and on its use in
linking pay to performance. It was written to assist federal policy makers as they
undertake a revision of the federal government's system of performance appraisal
and merit pay for mid-level managers, called the Performance Management and
Recognition System. Specifically, the Committee on Performance Appraisal for
Merit Pay was asked by the Office of Personnel Management to review current
research on performance appraisal and merit pay and to supplement the research
findings with an examination of the practices of private-sector employers. Our
investigation expanded beyond a restricted examination of merit pay plans to
include pay for performance plans more generally, as well as the organizational
and institutional conditions under which such plans are believed to operate best.
PERFORMANCE APPRAISAL
Performance appraisal has two ostensible goals: to create a measure that
accurately assesses the level of a person's performance in a job, and to create an
evaluation system that will advance one or more operational functions in an
organization. These two goals are represented in the literature by two distinct, yet
overlapping, approaches to theory and research. The measurement tradition
emphasizes standardization, objective measurement, and psychometric
properties. The applied tradition emphasizes the organizational context and the
usefulness of performance appraisal for promoting communication, clarifying
organizational goals, informing pay-based decisions, and motivating employees.
Conclusions
The search for a high degree of precision in measurement does not appear to
be economically viable in most applied settings; many believe that there is little to
be gained from such a level of precision.
• The committee concludes that federal policy makers would not be well
served by a commitment of vast human and financial resources to job
analyses and the development of performance appraisal instruments and
systems that can meet the strictest challenges of measurement science.
• The committee further concludes that, for most personnel management
decisions, including annual pay decisions, the goal of a performance
appraisal system should be to support and encourage informed
managerial judgment, and not to aspire to the degree of standardization,
precision, and empirical support that would be required of, for example,
selection tests.
Finding
• The evidence on the effects of pay for performance, pieced together from
research, theory, clinical studies, and surveys of practice, suggests that,
in certain circumstances, variable pay plans produce positive effects on
individual job performance. The evidence is insufficient, however, to
determine conclusively whether merit pay can enhance individual
performance or to allow us to make comparative statements about merit
and variable pay plans.
Conclusion
• On the basis of analogy from the research and theory on variable pay
plans, the committee concludes that merit pay can have positive effects
on individual job performance. These effects may be attenuated by the
facts that, in many merit plans, increases are not always clearly linked to
employee performance, agreement on the evaluation of performance
does not always exist, and increases are not always viewed as
meaningful. However, we believe the direction of effects is nonetheless
toward enhanced performance.
1
Introduction
plans" has been highly publicized as a means for improving U.S. labor
productivity. Public policy analysts have been exploring the impact that pay for
performance plans might have on labor productivity in preparation for
recommendations about national tax incentives for these plans (Blinder, 1990).
Their interest was sparked by theoretical arguments that certain types of pay for
performance plans (particularly profit-sharing) might stabilize national
employment without inflation (Weitzman, 1984). Many employers, having
already trimmed their work forces, are exploring the potential of these plans for
making their remaining work forces more productive while continuing labor cost
control (TPF & C/Towers Perrin, 1990; Wallace, 1990). Consultants, academics,
and employee advocate groups (including unions) are also beginning to seriously
discuss the effects of pay for performance plans—and to explicate the potential
downsides, in particular the high costs of organizational changes required for
effective plan implementation, and the equity problems associated with asking
employees to place a larger part of their pay at risk when they have little control
over many factors influencing organizational outcomes. In other words, it is a
time of reassessment in the private as well as the public sector.
Amidst widespread dissatisfaction with PMRS and the current celebration of
pay for performance plans in the private sector, the question presents itself: Are
there things to be learned from private-sector organizations that can improve
human resource management in the federal bureaucracy? The government has
many sources of advice on these issues, from blue ribbon groups like the Volcker
Commission, to federal employee associations, to a variety of consulting firms.
Our task was to supply one perspective to the coming policy deliberations—that
is, to bring together the best scientific evidence and knowledge derived from
practice on performance appraisal and on linking pay to performance.
Merit Pay
In merit pay plans, the locus of attention is individual performance. As one
element in a meritocratic personnel system, merit pay plans link pay level or
annual pay increases, at least in part, to how well the incumbent has performed on
the job. Just as ability or skill is intended to rule employee selection in such
systems, so the quality of each employee's job performance should, according to
merit principles, be recognized through the pay system.
The most recent survey data indicate that 94 to 95 percent of private-sector
companies have merit pay programs to provide individual pay increases to their
eligible ("exempt") employees, and 71 percent of companies have merit pay
programs for their nonunion hourly employees. Performance appraisal is at the
heart of merit pay plans. Although there are numerous variations in systems
labeled as merit plans, some sort of rating of each employee's performance
precedes compensation decisions. In some firms, the rating of performance is
informal, with very little committed to paper; some firms undertake detailed job
analyses, which provide the underpinnings of the appraisal system; a majority of
firms appear to base the performance appraisal on a set of goals established by
the supervisor or negotiated by the supervisor and the employee.
The committee's review of private-sector compensation surveys suggests
that the dominant model of merit pay plan has roughly the
characteristics listed below, which are discussed in more detail in later
chapters of the report:
1. The plan is tied to a management-by-objectives system of performance
appraisal for exempt employees and a work standards or graphic rating scale
performance appraisal for salaried nonexempt employees.
2. The typical appraisal summary format has four to five levels of
performance.
3. Pay increases are administered via a matrix (merit grid) that uses both an
employee's performance level and position in the pay grade to determine a
prespecified percentage increase (or increase range) in base salary. The
other components of the merit grid are the organization's pay increase
budget and the time between pay increases.
4. Merit increases usually are permanent increases, which are added into an
individual employee's salary and are funded from a central compensation
group. These funds are allocated to divisions or units as a percentage of
payroll. Because merit pay increases are added to base pay and compounded
into the earning stream, they can result in significant changes in pay levels
over time.
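As a rough illustration of items 3 and 4, the sketch below shows how a merit grid might be looked up and how an increase added to base pay compounds, in contrast to a one-time payment of the same percentage. The grid values, performance labels, and function names are hypothetical and are not drawn from the surveys reviewed by the committee.

```python
# Hypothetical merit grid. Every percentage and label here is an
# illustrative assumption, not a figure from the report or its surveys.
# Key: (performance level, position in pay grade) -> percent increase.
MERIT_GRID = {
    ("outstanding", "low"): 8,
    ("outstanding", "middle"): 7,
    ("outstanding", "high"): 6,
    ("fully successful", "low"): 5,
    ("fully successful", "middle"): 4,
    ("fully successful", "high"): 3,
    ("unsatisfactory", "low"): 0,
    ("unsatisfactory", "middle"): 0,
    ("unsatisfactory", "high"): 0,
}


def merit_increase(salary, performance, grade_position):
    """Return the dollar increase the grid prescribes for one year."""
    percent = MERIT_GRID[(performance, grade_position)]
    return salary * percent / 100


def base_after_merit(salary, percent, years):
    """Add the same percentage to base pay each year, so it compounds."""
    for _ in range(years):
        salary += salary * percent / 100
    return salary


def total_one_time_payments(salary, percent, years):
    """Pay the same percentage as a yearly lump sum; base pay never changes."""
    return years * salary * percent / 100


if __name__ == "__main__":
    print(merit_increase(50_000, "outstanding", "low"))
    print(base_after_merit(50_000, 5, 3))
    print(total_one_time_payments(50_000, 5, 3))
```

With these assumed values, the growth in base pay after three years of merit increases exceeds the sum of three one-time payments of the same percentage, which is the compounding effect described in item 4.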
Variable Pay
Variable pay plans fall into two categories, individual incentive plans and
group plans. Piece work and sales commissions are the best known of the
individual incentive plans. In recent years, a variety of group incentive plans have
come into vogue. These pay plans are specifically designed to influence
aggregate organization measures. They typically tie a significant portion of
annual pay to organization-wide productivity or financial outcomes. For
example, profit-sharing plans or equity plans link individual employees' pay to
the overall fortunes of the firm as measured by some indicator of its financial
health. Hence, one important distinction between merit pay plans and group
incentive plans is that the latter base compensation decisions in whole or in part
on organizational performance rather than individual performance. In addition,
the portion of pay associated with the variable plans is usually a one-time
payment, not an increase to base pay.
Variable pay plans have taken on an importance in our report that they
would otherwise not have had, given our mandate to look at performance
appraisal and merit pay, because virtually all of the research on the effectiveness
of pay for performance plans deals with these compensation plans. The enormous
difficulty of trying to link individual performance in most jobs to productivity
(the grand exception being manufacturing piece work and sales) may have turned
the attention of social scientists to system-level indicators of effectiveness, and
hence to the variable pay incentive plans.
Advocates of variable pay plans argue that their implementation can help to
revitalize organizations and control labor costs. They believe that the link
between pay and organization outcomes is likely to motivate employees to work
more creatively, smarter, harder, and as teams to achieve these outcomes. If
the outcomes are achieved, they fund sizable payouts; if they are not, employee
pay in addition to base would be small or nonexistent. In either case, the ratio of
labor costs to total costs stays about the same, making the organization more
competitive.
The actual impact these variable pay plans can have on an organization's
productivity and financial competitiveness is just beginning to be seriously
examined. But it is a fact that, by design, these plans either require system
changes—such as redefinition of jobs, creation of teams, and changes in work
methods and standards as is typical in gainsharing programs—or provide
powerful monetary incentives to employees to experiment with changes in their
own jobs (individual bonus and profit-sharing plans). There is disagreement
about whether it is the broader system changes (Deming, 1986; Beer et al.,
1990), or the presence of the variable pay plans themselves (Schuster, 1984a)
that are most critical to improvements in organizational effectiveness. No one
denies, however, that broader system or context changes will influence the
impact of a variable pay plan on an organization's performance.
The potential of variable pay plans to control labor costs and improve an
organization's effectiveness has received the most attention in the press. Since
such plans pay out only when they are funded by improvements in system
measures, making a larger portion of a lower-level employee's pay dependent on
them shifts management risks to those who have little say in management
decisions. The potential abuse of employee equity with these plans is thus high.
ISSUES
With this background in mind, the committee has interpreted its charge from
OPM as requiring the investigation of whether and under what conditions
performance appraisal and merit pay can assist the federal government in
regulating labor costs, managing performance, and fostering employee equity.
We interpreted the managing of performance to include improvements in
organization effectiveness, thus requiring some examination of variable pay plans
and comparisons of their intended effects with those of merit pay plans. We
broadly defined employee equity to include, not only employee perceptions of the
legitimacy and fairness of performance appraisal and merit plans, but also
incentives for managers to administer these plans equitably. By defining
expectations for performance appraisal and merit pay plans in this way, our
investigation was of necessity expanded beyond a restricted examination of the
plans themselves to include an exploration of organizational and institutional
conditions under which the plans are believed to operate best.
We ask the reader to keep in mind several caveats in reviewing this report.
Most important is that there is no commonly accepted theory of pay for
performance or performance appraisal. Therefore, we have to consider the
proposition that pay for performance plans affect performance, given certain
2
The History of Civil Service Reform
For nearly 50 years, the federal government has operated with some form of
performance appraisal procedure intended to strengthen the
link between pay and performance. Since 1978, specific pay for performance
programs have been in place for mid- and upper-level federal managers. There is
general agreement that these programs have not attained the desired objectives;
their troubled history has included a series of adjustments and changes, differing
levels of financial support, and little evidence of success. The ability to
demonstrate a link between performance and pay—to both the employee and the
public—remains problematic for the federal government.
As we approach the year 2000, the questions surrounding pay for
performance in the public sector have assumed a new importance—indeed, a
central position—in new proposals for federal civil service reform. Many of the
questions raised in the debate about the 1978 reforms are being raised again. Why
is this so? What accounts for the intransigence of problems surrounding effective
pay for performance systems in the federal government? Is there evidence to
support the validity of the effort, despite its problems?
In the federal government, the answers to these questions are made more
difficult by the nature of the federal personnel system, by the intermingling of
issues of political responsiveness with issues of effective management, and by the
need to marshal very scarce resources for a policy activity that never ranks very
high on the national agenda. It is our intent in this chapter to provide the
historical and contextual information necessary to understand these constraints
and their implications for performance-based pay schemes in the federal
government.
issues long after passage of the Pendleton Act. President McKinley, for example,
included 1,700 new positions in the classified service, but exempted 9,000 that
had previously been covered (Skowronek, 1982). Congress excluded entire
agencies from the classified service. In the New Deal years, Franklin D.
Roosevelt successfully urged that the new agencies created be staffed by persons
with policy expertise congruent with the President's interests, rather than the
"neutral competents" produced by the civil service examination system. When
Roosevelt was elected president in 1932, about 80 percent of federal
employees were in the competitive civil service. By 1936, that proportion was
about 60 percent (U.S. Civil Service Commission, 1974).
Of equal significance, new provisions and procedures were layered on
incrementally as the system grew. Until the time of the New Deal, most of the new
provisions, with their emphasis on economy, efficiency, and standardization,
reflected the scientific management principles in vogue in the business and public
administration communities. One such effort was the creation, in 1912, of the
skeleton of a performance appraisal system. In that year, the Civil Service
Commission (CSC) was directed by Congress to establish a uniform efficiency
rating system for all federal agencies. The commission established a Division of
Efficiency to carry out this task (U.S. Civil Service Commission, 1974).
The passage of the Classification Act in 1923 represented a more ambitious
attempt to bring scientific management principles to the federal merit system. In
words that have a familiar ring, the Joint Commission on Reclassification of
Salaries had concluded in 1920 that the United States government, the largest
employer in the world, needed a "modern classification of positions to serve as a
basis for just standardization of compensation" (quoted in Gerber, 1988). The
Classification Act established in law the principle of nationally uniform
compensation levels, providing for the standard classification of duties and
responsibilities by occupations and positions with salary levels assigned to the
resulting positions.
In addition, the Classification Act legalized the principle of rank in position.
Unlike the more common European practice of rank in person, the U.S. system
provided that wages and/or salary for each position were to be determined solely
by the position description and the qualifications for it, not by the personal
qualifications of the person who would occupy the position. Finally, the
Classification Act of 1923 led to the creation of a standard rating scale, which
required supervisors to rate employees for each "service rendered." This was the
first government-wide effort to describe job requirements and employee
performance.
The Classification Act came under almost immediate attack. Evaluations in
1929 and 1935 found major problems with the classification system that it
established. Primary criticisms focused on the extremely narrow and complex
nature of the classification process. The 1935 inquiry noted, for example, that
"what seem to be the most trifling differences in function or difficulty are
formally recognized and duly defined …" (Wilmerding, 1935). Nonetheless, the
Classification Act was not reformed until 1949, following the release of the first
Hoover Commission report. That report had been blunt about the state of the
federal merit system:
Act also authorized an additional step increase or quality step increase (QSI) for
"high-quality performance." This system guided performance management in the
federal government until the passage of the Civil Service Reform Act in 1978.
These incremental changes and 100 years' accretion of laws and procedures
have resulted in an enormously complex federal merit system. Entrance to the
system can now be through "competitive," "noncompetitive," or "excepted"
authority. Veterans have preference in hiring and, until 1953, did not have to pass
an examination to be considered for employment. There are direct hiring
authorities for hard-to-hire and specialized occupations, for outstanding scholars,
for returned Peace Corps Volunteers, for Vietnam-era veterans, and many others.
Examinations are not required in these cases. There is extensive use of temporary
and part-time hiring; there are 35 different ways to hire temporary employees
alone (for additional discussion, see Ingraham and Rosenbloom, 1990).
At the time the Civil Service Reform Act of 1978 was passed, over 6,000
pages of civil service law, procedure, and regulation governed the federal merit
system. There were at least 30 different pay systems in place; there were over 900
occupations in the federal civil service. This complexity was one of the problems
addressed by civil service reform; the history and development of the complexity
profoundly influenced the reform's potential for success. It is significant that the
1978 act did not for the most part address basic entrance procedures, the
classification system, or the basic federal compensation systems. In many
respects, it reformed at the fringes of the system.
Performance Appraisal
The general logic of the SES performance appraisal provisions was applied
to non-SES employees as well. But while the primary emphasis of the SES system
appeared to be on linking individual performance to organizational objectives, the
program for mid-level managers (GS 13–15 supervisors and management
officials) emphasized the link between individual performance and pay. Under
the performance appraisal provisions of the Civil Service Reform Act, each
agency was required to develop performance appraisal systems that "(1) provide
for periodic appraisals of job performance of employees; (2) encourage employee
participation in establishing performance standards; and (3) use the results of
performance appraisals as a basis for training, rewarding, reassigning, promoting,
reducing in grade, retaining and removing employees." These systems were
required to meet criteria prescribed in OPM regulations and were required to be
implemented by October 1, 1981, three years after the act was passed. The
designers of the act believed that this time lag would permit the SES reforms to
become institutionalized before other pay for performance reforms were
implemented.
The OPM regulations were intended to develop job-related and objective
performance appraisal systems consistent with the dictates of the statute. The
regulations required that performance standards and critical elements be
consistent with the duties and responsibilities covered in an employee's position
description. OPM guidance suggested that performance standards be based on a
job analysis to identify critical elements of a position, and that each agency
develop a method for evaluating its system to ensure its validity.
This identification of critical elements of a job was a key component of the
performance appraisal reforms. A critical element was defined by OPM as "any
requirement of the job which is sufficiently important that inadequate
performance of it outweighs acceptable or better performance in other aspects of
the job." Employees who failed to perform at a satisfactory level on a critical
element were to be subject to performance-based actions, including dismissal if
performance did not improve.
Merit Pay
The Civil Service Reform Act also created a new pay for performance system
for middle managers, GS 13–15. The merit pay provisions represented a break
from the long tradition of essentially automatic salary increases based on length
of service. Borrowing from private-sector practices, Title V of the Civil Service
Reform Act contained provisions intended to motivate mid-level managers to
perform at higher levels by tying performance to financial incentives.
The Merit Pay System (MPS), which became mandatory on October 1,
more trust in the organization and demonstrated greater satisfaction with their
jobs."
It is also important to note, though with less empirical foundation, that the
political rhetoric surrounding the Civil Service Reform Act influenced
expectations and created both positive and negative perspectives on its likely
outcomes. The positive expectations are reflected in the objectives for
performance appraisal and merit pay contained in OPM's evaluation plan:
Performance Appraisal
Short-Term Objectives:
1. Increase employees' understanding of performance standards.
2. Ensure effective appraisal of performance.
3. Ensure equitable appraisal of performance.
4. Link performance to personnel actions through the performance appraisal
process.
Long-Term Objectives:
1. Increase the effectiveness of employees and supervisors.
2. Improve the quality of federal working life.
3. Contribute to agency productivity.
Merit Pay
Short-Term Objectives:
1. Relate pay to performance.
2. Provide flexibility in recognizing and rewarding good performance with
cash awards.
Long-Term Objectives:
1. Motivate merit pay employees by making pay increases contingent on
performance; clarifying job expectations, i.e., defining goals and objectives,
increasing competition for recognition and rewards.
2. Improve the productivity, timeliness, and quality of work in the federal
government through better management and more effective programs.
The negative expectations resulted from the punitive tone—the "bureaucrat
bashing," as it came to be known—that accompanied descriptions of the need for
reform. New whistleblower protections were said to be necessary to ferret out
waste and fraud; greater managerial flexibilities were needed to eliminate
deadwood; performance appraisal and pay for performance were necessary
because federal employees were not productive and did not measure up to their
private-sector counterparts (see Ingraham and Barrilleaux, 1983). This, coupled
with the characterization of the federal bureaucracy as the "giant Washington
performance.
In the past I have been aware of what standards have been used to evaluate my performance. 23 16 61 27 20 54
The Record
The record of the Civil Service Reform Act has been turbulent. The orderly
implementation of the act envisioned by the Carter administration was interrupted
by the election of Ronald Reagan in 1980. President Reagan was not a supporter
of the civil service; cutting back the size and cost of government was high on the
Reagan policy agenda. OPM's human resource function was redefined; most
planning, evaluation, and research activities were eliminated; the organization
was downsized and restructured. Because political control of key components of
executive branch agencies was considered critical to policy success, a specifically
political role emerged for OPM.
Donald Devine, the director of OPM for the first Reagan term, explicitly
espoused the Weberian view of organizations; under his direction, OPM emerged
as a political management arm of the White House, rather than an agency
concerned with broader human resource management issues (Newland, 1983).
The organization was not so overtly political in the second Reagan term, and
serious efforts were made to address some of the most pressing federal personnel
and management problems. Nonetheless, many of the reforms created by the
Civil Service Reform Act had been deferred, eliminated, or redefined. Many
observers have noted that the reforms were simply overwhelmed by the
dramatically changed political environment in which federal agencies existed in
the 1980s.
Pay for performance and performance appraisal were also affected by the
turbulence of implementation. The experience of the Senior Executive Service is
notable in a number of respects. Because it was the first to be implemented, the
SES performance appraisal and bonus system was carefully watched by most
federal employees. It did not serve as a positive model.
The first SES payouts occurred in the year following passage of the reform.
The first agency to complete the process paid out the full amount allowable under
the law; not only was the number who received bonuses considered excessive in
the view of Congress and some other external observers, the proportion of
Performance Review Board members who themselves received a bonus was
much too high. As a result, six months into the implementation of the SES
system, Congress altered the provisions of the act. Under the new provisions, the
percentage of SES positions in the agency eligible for a bonus was reduced from
50 to 25 percent. OPM, using its rulemaking authority in an effort to demonstrate
its good faith to Congress, further lowered that percentage to 20 percent of the
total approved positions.
This dramatic change in the SES pay for performance system had an
immediate and negative impact. Members of the SES, who had viewed the bonus
system as an escape from the federal pay cap, were disillusioned with the new
system. The formation of the Senior Executive Association to lobby Congress for
the interests of the SES was one indicator of the disenchantment and
dissatisfaction with the reform very early in the implementation process.
difficulties. For example, a dispute between OPM and the General Accounting
Office concerning the permissible size of payout led, in September 1981 (one
month before payout), to a determination that the OPM formula for calculating
the merit pay fund was not in conformance with the statute. The ruling resulted in a
modified payout that provided only small differentials among the mid-level
managers covered, again undercutting pay for performance principles and
diminishing the incentives for supervisors to differentiate among employees.
Because the Merit Pay System was not perceived as fair in some
fundamental ways, it failed to establish credible links between pay and
performance. Managers who performed satisfactorily often found themselves
receiving lesser rewards than their nonmanagerial counterparts at grades 13-15,
whose pay was set under the General Schedule. The perceptions of employees
that nonperformance factors (e.g., the composition of the pay pool) affected
payout and that ratings were arbitrarily modified also diminished the
effectiveness of the pay for performance aspects of the system. Employees in
most agencies perceived no greater likelihood that their performance would be
recognized with a cash award after the establishment of the Merit Pay System
than had previously been the case (U.S. General Accounting Office, 1984).
The reported successes of the Merit Pay System in motivating employees
emanated primarily from the performance appraisal requirements of the Civil
Service Reform Act. Gaertner and Gaertner (1984) reported that developmental
appraisals—those that focused on planning for the coming year and clarifying
expectations—were more effective than appraisals that focused only on past
performance. However, developmental appraisal strategies were seldom used, and
the pay administration role for appraisals tended to undermine this function. In
fact, one study reported a significant drop in the organizational commitment of
employees who received satisfactory, but not outstanding, ratings (Pearce and
Porter, 1986).
has three monetary components: (1) employees who are rated fully successful or
better are assured of receiving the full general pay or comparability increase. (2)
They are also eligible for merit increases, which are equivalent to within-grade
increases. The size of the merit increase depends on an employee's position in the
pay range and performance rating. (3) In addition to these monies, employees
rated fully successful or above also qualify for performance awards or bonuses.
Beginning in fiscal 1986, performance awards of no less than 2 percent and no
more than 10 percent became mandatory for employees rated two levels above
fully successful. Moreover, an agency may give a performance award of up to 20
percent of base salary for unusually outstanding performance. An upper limit of
1.5 percent-of-payroll for all performance awards was placed on agency payout
under the system.
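
The award rules described above reduce to simple arithmetic, which the following Python sketch illustrates. It is a minimal sketch only: the function names and the salary and payroll figures are hypothetical and do not come from the statute or from OPM guidance. It covers the award floor and ceiling for employees rated two levels above fully successful and the agency-wide cap of 1.5 percent of payroll. Merit increases are left out because the text does not give their formula.

```python
# Illustrative sketch of the PMRS performance award limits described in the text.
# All names and example figures are hypothetical.

def mandatory_award_range(base_salary, unusually_outstanding=False):
    """Return (minimum, maximum) award in dollars for an employee rated
    two levels above fully successful.

    The floor is 2 percent of base salary. The ceiling is 10 percent,
    or 20 percent when performance is judged unusually outstanding.
    """
    floor_pct = 2
    ceiling_pct = 20 if unusually_outstanding else 10
    return base_salary * floor_pct / 100, base_salary * ceiling_pct / 100


def within_agency_cap(awards, payroll):
    """Check that total awards do not exceed 1.5 percent of agency payroll."""
    cap = payroll * 15 / 1000  # 1.5 percent of payroll
    return sum(awards) <= cap


if __name__ == "__main__":
    low, high = mandatory_award_range(50_000)
    print(f"Award range on a $50,000 salary: ${low:,.0f} to ${high:,.0f}")
    print(within_agency_cap([5_000, 5_000, 4_000], payroll=1_000_000))
```

On a hypothetical base salary of $50,000 the mandatory award falls between $1,000 and $5,000, with the ceiling rising to $10,000 for unusually outstanding performance. The cap check shows how a fixed payroll percentage limits the total an agency can distribute regardless of how many employees earn high ratings.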
PMRS also created Performance Standards Review Boards, modeled after
the Performance Review Boards in the Senior Executive Service, to review
performance standards within an agency to ensure their validity and to perform
other oversight functions. At least half of each board is required to be made up of
employees eligible for merit pay. Although the number and functioning of these
boards were left to agency discretion, they are required to report annually to the
agency head.
Although the evidence is thin, there are some indications that PMRS has
functioned better than the Merit Pay System. The Merit Systems Protection Board
(MSPB) conducted surveys of employee attitudes at three-year intervals
beginning in 1983. The report of the most recent survey (Merit Systems
Protection Board, 1990) says that, in 1986 and 1989, 32 and 36 percent,
respectively, of the federal employees surveyed believed they would receive more
pay for performing better. This represents a substantial increase over the 17
percent of employees surveyed in 1983 who perceived a link between pay and
performance and provides an interesting comparison to the Wyatt Company's
1989 report on employee attitudes in private-sector firms that about 28 percent of
those surveyed saw a link between their pay and their job performance.
It nevertheless remains true that conceptual support for pay for
performance is far stronger among federal employees—the report of the
1989 MSPB survey says that 72 percent of respondents endorse the proposition
—than their support of existing pay for performance systems. Only 42 percent
indicated that they would choose to be under a pay for performance system if
given the choice; about the same proportion of respondents indicated that they
would not so choose, many of them citing the shortcomings of the present system
as the grounds for their disinclination. The most commonly registered
reservations involved (1) the ability and freedom of managers to make
meaningful distinctions among levels of performance and (2) the availability of
enough money to reward the best performers. The monetary concern coincides
with a more general dissatisfaction with pay expressed by 60 percent of
respondents to the 1989 MSPB survey.
It is not clear that PMRS has provided the hoped-for motivational stimuli. It
is unlikely that pay for performance devices such as merit increases, bonuses, and
awards would produce performance effects in the context of a deep, generalized
dissatisfaction with pay levels of the kind reported in each of the three MSPB
surveys. In addition, even though most merit system employees have received
performance awards (U.S. Office of Personnel Management, 1989), the General
Accounting Office found that 50 percent of the employees surveyed in the first
year of PMRS felt the size of the awards was inadequate. Insofar as performance
may be affected by the communication of performance standards, the
Performance Management and Recognition System appears to be functioning
well. Nine out of ten respondents to the 1989 survey said that they understand the
performance standards for their jobs.
A somewhat more negative picture of PMRS emerges from informal surveys
of their membership conducted recently by two federal managers' associations.
Most of the managers responding to the surveys indicated support for the concept
of basing pay on performance. Only 3 percent, however, felt that PMRS should
be maintained in its current form and approximately 40 percent said that PMRS
should be completely abolished. More than 75 percent of the managers indicated
that they believed that their ratings were influenced by officials above their
supervisors, that their performance evaluations were of little guidance for
development purposes, and that insufficient funds have resulted in meaningless
performance awards. Given that the current system is viewed as so unfair and
ineffective, there is a concern over whether any new pay for performance system
could function effectively.
The evaluations of PMRS to date have been silent with respect to the
influence of PMRS on agency effectiveness. The Merit Systems Protection Board
has identified a tentative relationship between turnover and performance ratings
that suggests that poor performers are more likely than good performers to leave
federal service (Merit Systems Protection Board, 1988). However, no such
relationship was found between turnover and performance ratings in an earlier
study by the General Services Administration (Perry and Petrakis, 1987).
IMPLICATIONS
This brief account of civil service reform is a record of modest changes and
frequently conflicting objectives, accompanied perhaps by unrealistic
expectations about the effects of the reforms on the performance and productivity
of federal personnel. Neither the Merit Pay System nor the Performance
Management and Recognition System has been able to counteract what, since at
least the early 1980s, has come to be called the "quiet crisis" in the federal
government. That crisis, according to the National Commission on the Public
Service and others, is marked by below-market public-sector salaries, an inability
to recruit new employees for many federal occupations, an inability to retain
seasoned federal managers, and a perceived decline in the overall quality of the
federal work force (National Commission on the Public Service, 1990).
The uncompetitiveness of the Civil Service is particularly noticeable in
certain fields, for example, law and the scientific and engineering professions. A
recent National Research Council report noted that recruitment of scientific and
engineering personnel was a problem for the National Institutes of Health, the
Environmental Protection Agency, the National Institute of Standards and
Technology, the Department of Health and Human Services, the Social Security
Administration, and the National Science Foundation, among many others
(National Research Council, 1990). However, the overall problem of recruiting
and retaining a well-qualified work force is being felt throughout the federal
government.
While there is no reason to believe that the present malaise cannot be
reversed, there are important tensions between the potential benefits of pay for
performance and the reality of the federal personnel and compensation systems.
We describe these tensions below.
1. The tension between the principle of neutral competence and pay for
performance. We have described the centrality of the principle of neutral
competence to the modern civil service. In turning away from the spoils
system, the founders of the merit system in the late nineteenth century
envisioned federal employees as dispassionate servants to the body politic
who, to function properly, needed to be shielded from invidious political
influences. Many of the most characteristic elements of the merit system—
entry by competitive examination, retention rights, limitations on partisan
activities—derive from this vision of neutral competence. Efforts to ensure
that political neutrality could be maintained for the career service, however,
have created an extremely complex system of constraints that have come to
place severe limits on the discretion of career managers, in addition to
controlling partisanship. Two outcomes of a merit system built on the
concept of neutral competence are directly related to the potential success of
pay for performance. First, the managerial constraints and legalistic
environment that have come to characterize federal management are
antithetical to the managerial discretion necessary for effective pay for
performance processes (National Academy of Public Administration, 1983).
Second, merit pay carries far more meaning in the context of the civil
service than in the private sector. The objective of any merit pay system is to
relate pay to individual or group contributions to organizational purposes.
But in the public sector, the possible impact of political influence on ratings
of individual performance will inevitably be of concern. Moreover, the
definition of organizational purpose will always be complicated in the public
sector because of the frequent turnover of the political leadership. These
considerations at the very least raise questions about the transferability of
private-sector practice.
this context, the utility of pay for performance plans in contributing to equitable
compensation systems appears to be very limited.
Our purpose in this chapter was to provide a general flavor of the
complexities of the federal sector and to introduce some of the more salient issues
to be considered as policy makers turn to the redesign of the merit pay system in
the federal government. We turn now to an examination of the scientific and
clinical evidence on performance appraisal and pay for performance and an
assessment of the implications of this evidence for a merit pay system for federal
managers.
3
The Nature of the Evidence
We have been asked to assess the role of performance appraisals and pay for
performance systems in promoting excellence at work and to identify promising
models for potential application to the federal work force. A number of major
evidentiary obstacles impede scientific study of these issues, for reasons that go
well beyond the scholar's perennial lament that more data are needed. As this
chapter articulates, there are some conceptual and methodological minefields
implicit in this charge. At the same time, there are a number of strengths in this
literature in terms of methodological rigor and relevance to organizational
practices. These strengths and limitations need to be made explicit so that readers
of this report can accurately gauge the existing scientific evidence bearing on
performance appraisal and pay for performance systems.
This chapter does not aim to provide a comprehensive introduction to
methodology in the social and behavioral sciences. Rather, it briefly reviews
some of the evidentiary issues that arose in pursuing the committee's charge and
summarizes the different kinds of research methods and data that have been
brought to bear on performance appraisal and pay for performance plans. The
diverse and fragmentary nature of the research evidence available to us turned
out to have important implications for how we carried out the study and
formulated our conclusions.
operating at numerous levels. The issues involved range from the intrapsychic
(e.g., memory and attention allocation) to the interpersonal (e.g., affect, group
dynamics) to the organizational, interorganizational, and even societal level (e.g.,
organizational structure, the role of money, legal constraints on performance
appraisal and pay systems). Accordingly, the kinds of research relevant to our
charge also run the gamut: research on the nature of jobs and job performance;
investigations into the accuracy and context of human judgment; analyses of the
impact of pay on motivation and behaviors; research on how organizational
structure and environment influence personnel practices; studies of the effects of
performance appraisal and pay systems on organizational functioning; proprietary
surveys on attitude and climate undertaken by specific companies; and everything
in between.
Because the issues of interest to the committee lie at the interstices between
different theories, disciplines, audiences, and levels of analysis, there is not a
single predominant type of research evidence for us to evaluate. Rather, we are
faced with the task of trying to compare, contrast, and synthesize very different
kinds of evidence relevant to the charge.
THE EVIDENCE
All the different kinds of evidence do not address the same issues or even
employ the same standards of proof. Each type has its strengths and its
limitations, and each brand of research implies its own definition of what kinds
of evidence are most relevant and useful. In this section, we briefly summarize
the quality of existing evidence and discuss a number of challenges faced by the
committee in reviewing, synthesizing, and drawing inferences from such diverse
strands of research.
One of the clear areas of strength is the research on performance appraisal.
There is an enormous literature, stretching back well over half a century, on the
assessment of work performance. Although the particular topics that have
captured the attention of researchers have changed from time to time, the sheer
accumulation of empirical work, laboratory studies, surveys of practice, and
analytical models provides a rich backdrop for contemporary thinking about the
use of performance appraisal. An additional, although sometimes unrecognized,
virtue of the work on performance appraisal in recent decades derives from the
pressures of litigation under Title VII of the Civil Rights Act of 1964.
Performance appraisal systems have had to be defended in high-stakes situations,
a fact that has made researchers in the field more cognizant of actual practice and
the problems of evaluating performance in applied settings.
Pay for performance is a much younger research field. Although there is a
good deal of suggestive theory, there is not an equivalent cumulation of empirical
research. The field is, however, energetic and protean. Pay for performance
compensation strategies have begun to draw the attention of
EVIDENTIARY CHALLENGES
The evidence relating to performance appraisal and to pay for performance
compensation systems is discussed in detail in Chapters 4 through 7. On a more
general plane, however, there are a number of issues and evidentiary challenges
that merit the reader's attention, ranging from how to gauge the effectiveness of
performance-based pay to questions of causality.
all, if only large organizations were to permit researchers to study them, what
could be said scientifically about small organizations?
It is important to emphasize that the mere fact that a sample is nonrandom in
some respects does not make it unrepresentative or useless. The extent of bias
depends on the population to which researchers wish to make generalizations.
Results from the above-mentioned hypothetical survey outside the Veterans
Administration may be perfectly appropriate for making generalizations to some
populations.
In our case, much of the relevant organizational evidence bearing on
performance-based pay comes from private-sector corporations. Leaving aside all
the complications of measurement, causal inference, and the like discussed
throughout this chapter, even if we could make perfectly valid and precise
inferences about corporations, we would still face the difficult issue of whether
those conclusions can safely be generalized to workers in federal agencies. (That
question, of course, is hardly idiosyncratic to the work of this committee; after
all, scientific debates occur constantly about the relevance of specific evidence
from animal studies for human health and behavior.)
In addition, as we noted earlier, much of the data derive from clinical
knowledge and experience. Although this sort of data can be informative, it is
important to acknowledge the potential limits of clinical expertise. The opinions
of managers about their companies or the assessments of paid consultants about
organizations for whom they have consulted can be illuminating, but the
potential for bias and conflict of interest must also be recognized. Furthermore,
relying on the "excellent company" method to make inferences about the
effectiveness of organizational practices is perilous. The mere observation that
many organizations with a reputation for success appraise performance or
allocate pay in a particular way does not constitute scientific evidence or a basis
for prescription—any more than would the fact that most successful companies
have male chief executive officers justify the recommendation that women should
not be promoted to the top.
Two other related concerns should be noted about the sources and quality of
available data bearing on performance-based pay. First, experimental control or
random assignment of subjects to treatments is often difficult or impossible to
obtain in studying organizational phenomena. Firms typically do not design or
alter their appraisal or pay systems randomly over time, but rather in response to
real or perceived dilemmas.
Second, it has been well documented that organizational intervention as such
has effects on the behavior of organizational members. Physical scientists have
documented that even physical phenomena are altered by the very process of
scientific observation and measurement. However, in the organizational world,
this problem, frequently called the Hawthorne effect, is much more severe and
more difficult to disentangle. The mere entry of researchers or consultants into an
enterprise or a change by the organization in its personnel
system can be enough to occasion large attitudinal and behavioral changes. The
reactivity of organizations to policy changes and to external scrutiny further
obscures inferences about the consequences of performance appraisal and pay
systems for organizational effectiveness.
Attributing Causality
Much of the evidence concerning differences in performance appraisal
systems, pay systems, the relationship between them, and their link to
performance, which we summarize in this report, is based on studies that are
cross-sectional or nearly cross-sectional (i.e., very short time series). This
evidence is thus of limited power in making statements about causal
relationships. Yet even if these difficulties could be surmounted and a causal link
established between performance-based pay and some dimension of
organizational performance, tricky issues remain that cloud the interpretation of
the findings and their practical relevance.
First, inferences about the effects of performance-based pay plans on
organization- or individual-level outcomes are only as valid as the statistical
model used to look at the question. Any judgment about performance is always a
judgment about performance compared with something. In statistical studies, that
something is specified by control variables. If important control variables are
omitted, or if the effects of the variables of interest are confounded with included
or omitted control variables, then it can be perilous to make inferences about how
some factor affects performance.
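
The danger of an omitted control variable can be made concrete with a small simulation. The Python sketch below is illustrative only and uses invented data: it assumes a hypothetical unobserved "quality" score for each organization that raises both the chance of adopting pay for performance and the level of measured performance, while the true effect of the pay plan itself is set to zero. None of the numbers or names come from the studies reviewed in this report.

```python
import random


def slope(x, y):
    """Ordinary least squares slope of y on x (single regressor)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def residuals(x, y):
    """Residuals of y after removing its linear relationship with x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    b = slope(x, y)
    return [yi - (mean_y + b * (xi - mean_x)) for xi, yi in zip(x, y)]


def simulate(n=5000, seed=0):
    """Return (naive, adjusted) estimates of the pay plan's effect.

    Assumed data-generating process (hypothetical):
      quality  ~ Normal(0, 1), the omitted control variable
      adopts   = 1 if quality + noise > 0, else 0
      perf     = 0 * adopts + 2 * quality + noise  (true effect is zero)
    """
    rng = random.Random(seed)
    quality = [rng.gauss(0, 1) for _ in range(n)]
    adopts = [1.0 if q + rng.gauss(0, 1) > 0 else 0.0 for q in quality]
    perf = [2.0 * q + rng.gauss(0, 1) for q in quality]

    # Naive comparison: performance regressed on adoption alone.
    naive = slope(adopts, perf)
    # Adjusted comparison: remove the influence of quality from both
    # variables first, then regress the residuals on each other.
    adjusted = slope(residuals(quality, adopts), residuals(quality, perf))
    return naive, adjusted


if __name__ == "__main__":
    naive, adjusted = simulate()
    print(f"naive estimate:    {naive:.2f}")
    print(f"adjusted estimate: {adjusted:.2f}")
```

Under these assumptions the naive estimate is large and positive even though the pay plan has no effect, and the estimate moves close to zero once quality is controlled. In real studies the confounding factor is often unmeasured, so the adjustment shown here is not available, which is the reason for caution.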
Another reason why empirical evidence regarding the effects of pay for
performance can be misleading concerns unobserved heterogeneity. Even in
IMPLICATIONS
In reviewing these various issues, we do not wish to overstate the
complexities involved in weighing the evidence on performance appraisal and
pay for performance. The issues raised in this chapter are generic to studies of
social and organizational phenomena. Indeed, in some respects, there is a larger
and higher-quality body of research bearing on these concerns than is often the
case
in studying applied social science concerns. We have surveyed the nature of the
evidence simply to underscore the need for caution (and additional research) in
drawing policy inferences from the scientific evidence and prevailing practice and
to explain the general approach we take throughout this report in weighing the
evidence and drawing conclusions from it.
In carrying out the study we built upon our own diversity, which went well
beyond simple differences in disciplinary training or occupation, to encompass
fundamental differences in approach to issues in human motivation and behavior,
the nature of organizations, and the relevant questions to be asked about
performance appraisal and pay for performance. Some of us viewed the problem
at the individual level of analysis; others were concerned with organizational
effectiveness and change. Some employed criteria of individual or organizational
performance, while others interpreted the issues in terms of procedural justice or
the role of performance appraisal and pay for performance in legitimizing
organizations.
We have been catholic in pulling together evidence and information that
might bear on the effectiveness of performance appraisal and performance-based
compensation systems, taking account of theory, empirical research, and clinical
studies not only from many disciplines but also from any research topics that
seemed relevant. We have supplemented formal evidence with as much
information about current practices in private-sector firms as we could reasonably
gather in the limited time available for the study.
For example, our findings about performance appraisal and pay for
performance rely on and exploit existing knowledge about organizations available
from related areas. We know a great deal about how organizations vary along a
number of other dimensions of their personnel systems, as well as some of the
consequences of those differences. For instance, we know what types of organizations tend to pay
higher wages, to promote more from within, to provide on-the-job training, to
emphasize seniority more in pay and promotion decisions, and so on. We also
know that personnel practices tend to be part of a larger system governing
employment. Accordingly, it would be surprising if the insights we have gleaned
from this other research were irrelevant to understanding the determinants and
consequences of performance appraisal and performance-based pay systems.
Not all of this evidence will meet rigorous standards of scientific proof. We
have been careful throughout the text to identify the type of evidence and the
level of confidence we feel that it merits. But the fact is that managers in the
private and public sector routinely have to make choices about management
practice in the absence of definitive evidence. Federal leaders are currently
working on compensation policy and will soon revise the Performance
Management and Recognition System. In the end, we judged it better to paint as
rich a picture as possible. We felt that a careful weaving together of the many
kinds of evidence and experiential data would provide useful insights into
general
4
Performance Appraisal: Definition,
Measurement, and Application
INTRODUCTION
The science of performance appraisal is directed toward two fundamental
goals: to create a measure that accurately assesses the level of an individual's job
performance and to create an evaluation system that will advance one or more
operational functions in an organization. Although all performance appraisal
systems encompass both goals, they are reflected differently in two major
research orientations, one that grows out of the measurement tradition, the other
from human resources management and other fields that focus on the
organizational purposes of performance appraisal.
Within the measurement tradition, emanating from psychometrics and
testing, researchers have worked and continue to work on the premise that
accurate measurement is a precondition for understanding and accurate
evaluation. Psychologists have striven to develop definitive measures of job
performance, on the theory that accurate job analysis and measurement
instruments would provide both employer and employee with a better
understanding of what is expected and a knowledge of whether the employee's
performance has been effective. By and large, researchers in measurement have
made the assumption that if the tools and procedures are accurate (e.g., valid and
reliable), then the functional goals of organizations using tests or performance
appraisals will be met. Much has been learned, but as this summary of the field
makes explicit, there is still a long way to go.
In a somewhat different vein, scholars in the more applied fields—human
1 The reason for this imbalance in the research literature is obvious: managerial jobs are
difficult to define and assess at a specific level—not only are they fragmented, diverse, and
amorphous, but many of the factors leading to successful outcomes in such jobs are not
directly measurable. Moreover, in
issues, such as the effects of rater training and the contextual sources of rating
distortion.
the critical incident technique (Flanagan, 1954). This method involves obtaining
reports from qualified observers of exceptionally good and poor behavior used to
accomplish critical parts of a job. The resulting examples of effective and
ineffective behavior are used as the basis for developing behaviorally based
scales for performance appraisal purposes. Throughout the 1950s and 1960s,
Flanagan and his colleagues applied the critical incident technique to the
description of several managerial and professional jobs (e.g., military officers, air
traffic controllers, foremen, and research scientists). The procedure for
developing critical incident measures is systematic and extremely time-
consuming. In the case of the military officers, over 3,000 incident descriptions
were collected and analyzed. Descriptions usually include the context, the
behaviors judged as effective or ineffective, and possibly some description of the
favorable or unfavorable outcomes.
There is general agreement in the literature that the critical incident
technique has proven useful in identifying a large range of critical job behaviors.
The major reservations of measurement experts concern the omission of
important behaviors and lack of precision in working incidents, which interferes
with their usefulness as guides for interpreting the degree of effectiveness in job
performance.
Moreover, there is some research evidence—and this is pertinent to our
study of performance appraisal—suggesting that descriptions of task behavior
resulting from task or critical incident analyses do not match the way supervisors
organize information about the performance of their subordinates (Lay and
Jackson, 1969; Sticker et al., 1974; Borman, 1983, 1987). In one of a few studies
of supervisors' "folk theories" of job performance, Borman (1987) found that the
dimensions that defined supervisors' conceptions of performance included: (1)
initiative and hard work, (2) maturity and responsibility, (3) organization, (4)
technical proficiency, (5) assertive leadership, and (6) supportive leadership.
These dimensions are based more on global traits and broadly defined task areas
than they are on tightly defined task behaviors. Borman's findings are supported
by several recent cognitive models of the performance appraiser (Feldman, 1981;
Ilgen and Feldman, 1983; Nathan and Lord, 1983; De Nisi et al., 1984).
If, as these researchers suggest, supervisors use trait-based cognitive models
to form impressions of their employees, the contribution of job analysis to the
accuracy of appraisal systems is in some sense called into question. The
suggestion is that supervisors translate observed behaviors into judgments about
general traits or characteristics, and it is these judgments that are stored in
memory. Asking them via an appraisal form to rate job behaviors does not mean
that they are reporting what they saw. Rather, they may be reconstructing a
behavioral portrait of the employee's performance based on their judgment of the
employee's perseverance, maturity, or competence. At the very least, this research
makes clearer the complexity of the connections between
attributes that determine whether a person will do the job. Second, tasks were
chosen as the central unit of analysis, rather than worker attributes or skill
requirements. It follows logically that the performance measures were job-
specific and that the measurement focus was on concrete, observable behaviors.
All of these decisions made sense. The jobs studied are entry-level jobs assigned
to enlisted personnel—jet engine mechanic, infantryman, administrative clerk,
radio operator—relatively simple and amenable to measurement at the task
level. Moreover, the enviable trove of task information virtually dictated the
economic wisdom of that approach. And finally, the objectives of the research
were well satisfied by the design decisions. During the 1980s the military was
faced each year with the task of trying to choose from close to a million 18- to
24-year-olds, most with relatively little training or job experience, in order to fill
perhaps 300,000 openings spread across hundreds of military occupations. It was
important to be able to demonstrate that the enlistment test is a reasonably
accurate predictor of which applicants are likely to be successful in a broad
sample of military jobs (earlier research focused on success in training, not job
performance). For classification purposes, it was important to understand the
relationship between the aptitude subtests and performance in various categories
of jobs.
In other words, the picture of job performance that emerged from the JPM
research was suited to the organizational objectives and to the nature of the jobs
studied. The same job analysis design would not necessarily work in another
context, as the following discussion of managerial performance demonstrates.
Implications
In sum, virtually all of the analysis of managerial performance has been at a
global level; little attention has been given to the sort of detailed, task-centered
definition that characterized the military JPM research. (One exception is the
work of Gomez-Mejia et al. [1982], which involved the use of several job
analysis methods to develop detailed descriptions of managerial tasks.) This
focus on global dimensions conveys a message from the research community
about the nature of managerial performance and the infeasibility of capturing its
essence through easily quantified lists of tasks, duties, and standards. Reliance on
global measures means that evaluation of a manager's performance is, of
necessity, based on a substantial degree of judgment. Attempts to remove
subjectivity from the appraisal process by developing comprehensive lists of
tasks or job elements or behavioral standards are unlikely to produce a valid
representation of the manager's job performance and may focus raters' attention
on trivial criteria.
In a private-sector organization with a measurable bottom line, it is
frequently easier to develop individual, quantitative work goals (such as sales
volume or the number of units processed) than it is in a large bureaucracy like the
federal government, where a bottom line tends to be difficult to define. However,
the easy availability of quantitative goals in some private-sector jobs may actually
hinder the valid measurement of the manager's effectiveness, especially when
those goals focus on short-term results or solutions to immediate problems. There
is evidence that the incorporation of objective, countable measures of
performance into an overall performance appraisal can lead to an overemphasis
on very concrete aspects of performance and an underemphasis on those less
easily quantified or that yield concrete outcomes only in the long term (e.g.,
development of one's subordinates) (Landy and Farr, 1983).
It appears that managerial jobs fit less easily within the measurement
tradition than simpler, more concrete jobs, if one interprets valid performance
measurement to require job-related measures and a preference for "objective"
measures (as the Civil Service Reform Act appears to do). It remains to be seen
whether any approaches to performance appraisal can be demonstrated to
be reliable and valid in the psychometric sense and, if so, how global ratings
compare with job-specific ratings.
Approaches to Appraisal
As is true of standardized tests, performance evaluations can be either
norm-referenced or criterion-referenced. In norm-referenced appraisals,
employees are ranked relative to one another based on some trait, behavior, or
output measure—this procedure does not necessarily involve the use of a
performance appraisal scale. Typically, ranking is used when several employees
are working on the same job. In criterion-referenced performance evaluations, the
performance of each individual is judged against a standard defined by a rating
scale. Our discussion in this section focuses on criterion-referenced appraisal
because it is relevant to more jobs, particularly at the managerial level, and
because it is the focus of the majority of the research.
In criterion-referenced performance appraisal the "measurement system" is a
person-instrument couplet that cannot be separated. Unlike counters on
machines, the scale does not measure performance; people measure performance
using scales. Performance appraisal is a process in which humans judge other
humans; the role of the rating scale is to make human judgment less susceptible to
bias and error.
Can raters make accurate assessments using the appraisal instruments? In
addressing this question, researchers have studied several types of rating error,
each of which was believed to influence the accuracy of the resulting rating.
Among the most commonly found types of errors and problems are (1) halo:
raters giving similar ratings to an employee on several purportedly different
independent rating dimensions (e.g., quality of work, leadership ability, and
planning); (2) leniency: raters giving higher ratings than are warranted by the
employee's performance; (3) restriction in range: raters giving similar ratings to
all employees; and (4) unreliability: different raters rating the same employee
differently or the same rater giving different ratings from one time to the next.
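The four error types listed above can be made concrete with a few simple summary statistics. The following is a minimal sketch, not a procedure taken from the studies reviewed here: the ratings are hypothetical, the function names are illustrative, and the indices are deliberately crude. A low value on the halo or range index may reflect true performance rather than rater error, so these numbers are prompts for further inquiry rather than proof of bias.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical ratings on a 1-5 scale (not data from any study cited here).
# ratings[rater][employee] = scores on (quality, leadership, planning)
ratings = {
    "rater_1": {"A": (5, 5, 5), "B": (4, 5, 4), "C": (5, 4, 5)},
    "rater_2": {"A": (3, 4, 2), "B": (5, 4, 4), "C": (2, 3, 3)},
}
SCALE_MIDPOINT = 3.0


def overall(scores):
    """Average an employee's scores across the rating dimensions."""
    return mean(scores)


def halo_index(by_employee):
    """Mean within-employee spread across dimensions; near 0 suggests possible halo."""
    return mean(pstdev(scores) for scores in by_employee.values())


def leniency_index(by_employee, midpoint=SCALE_MIDPOINT):
    """Mean overall rating minus the scale midpoint; large positive suggests leniency."""
    return mean(overall(scores) for scores in by_employee.values()) - midpoint


def range_index(by_employee):
    """Spread of overall ratings across employees; near 0 suggests restriction in range."""
    return pstdev([overall(scores) for scores in by_employee.values()])


def pearson(xs, ys):
    """Plain Pearson correlation; raises if either list has no variance."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a rater gives identical scores")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sqrt(sxx * syy)


def interrater_agreement(first, second):
    """Correlation between two raters' overall ratings of the same employees."""
    employees = sorted(first)
    return pearson(
        [overall(first[e]) for e in employees],
        [overall(second[e]) for e in employees],
    )


for name, by_employee in ratings.items():
    print(name, round(halo_index(by_employee), 2),
          round(leniency_index(by_employee), 2),
          round(range_index(by_employee), 2))
print("agreement", round(interrater_agreement(ratings["rater_1"], ratings["rater_2"]), 2))
```

In this invented example the first rater shows little spread across dimensions or employees and a high average, the pattern associated with halo, leniency, and restriction in range, while the low agreement between the two raters illustrates unreliability.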
Over the years, a variety of innovations in scale format have been introduced
with the intention of reducing rater bias and error. Descriptions of various
formats are presented below prefatory to the committee's review of research on
the psychometric properties of performance appraisal systems.
Scale Formats
The earliest performance appraisal rating scales were graphic scales—they
generally provided the rater with a continuum on which to rate a particular trait
or behavior of the employee. Although these scales vary in the degree of
explicitness, most provide only general guidance on the nature of the underlying
dimension or on the definition of scale points along the continuum. Some scales
present mere numerical anchors:
definitions of the performance dimensions are eliminated from the rating form.
The actual performance score is computed by someone other than the rater.
Forced-choice scales represent an even more extreme attempt to disguise the
rating continuum from the rater. This method is based on the careful
development of behavioral examples of the job that are assigned a preference
value based on social desirability estimates made by job experts. Raters are
presented with three or four equally desirable behaviors and asked to select the
one that best describes the employee. The employee's final rating is calculated by
someone other than the rater.
We turn now to a discussion of the validity, reliability, and other
psychometric properties of performance appraisals, pointing out (as the literature
allows) any evidence as to the relative merits of particular scale formats.
Validity
Validity is a technical term that has to do with the accuracy and relevance of
measurements. Since the validity of performance appraisals is a critical issue to
measurement specialists and a basic concern to practitioners who must withstand
legal challenges to their performance appraisal tools and procedures, we are
presenting the following discussion of validation strategies and how they apply to
the examination of performance appraisal.
Cronbach (1990:150-151) describes validation as an "inquiry into the
soundness of an interpretation." He sees the validation process as one of posing
hypotheses, testing them, and supporting or revising the interpretation based on
the findings. He makes the point that challenge to a proposition or hypothesis is
as important as the collection of evidence supporting the interpretation. Within
this framework, the researcher is continually recognizing rival hypotheses and
testing them—the result is a greater understanding of the inferences that can be
made about the characteristics of the individuals who take a test or who are
measured on a performance appraisal scale.
If the discussion seems rarified thus far, a practical example drawn from one
of the biggest success stories of the measurement tradition—testing to select
aircraft crew members during World War II—may be of interest. In an article
with the pithy title "Validity for What," Jenkins (1946) describes the
development and use of a test to select pilots, navigators, and bombardiers. For
each position, military psychologists found that those who scored well on the test
were also the most successful in technical training, so the test was put into use to
select aircrews. Several years into the war, uneasiness with the hit ratios on
bombing runs led to Jenkins's follow-up study, which revealed that scores on the
selection test, though they predicted success in bombardier training, were not
correlated with success in hitting the target—and this, ultimately, was the
performance of greatest interest.
At least three major validation strategies have been proposed in the area
been judged by the subject matter experts to be so. Several researchers have used
this approach (e.g., Campbell et al., 1970; DeCotiis, 1977; Borman, 1978).
However, any simple reliance on content validity to justify a measurement
system has long since been dismissed by measurement specialists. Even if the
accomplishment of particular tasks is linked to effective job performance, a
comprehensive enumeration of all job tasks and rating on each of them does not
give any guidance on what is important to effective job performance and what is
not. For example, at a nonmanagerial level, Bialek et al. (1977) reported that
enlisted infantrymen spent less than half of their work time performing the
technical tasks for which they had been trained; in many cases, only a small
proportion of a soldier's time was devoted to accomplishing the tasks contained in
the specific job description. These results are reinforced by the work of Campbell
et al. (1970) and Christal (1974). What is needed is to go beyond the list of
behaviors to a testable hypothesis about the behaviors that constitute effective
task performance for a specific job construct.
Moreover, for some jobs, such as those involving managerial performance,
the content validity approach is not particularly useful because a large portion of
the employee's time is spent in behaviors that are either not observable or are not
related to the accomplishment of a specific task. This is particularly true for
managers who do many things that cannot be linked unambiguously to the
accomplishment of specific tasks (Mintzberg, 1973, 1975). Thus it appears that a
content approach is not likely to be sufficient for establishing measurement
validity for any job, and for some jobs it will be of little value in making the link
between job behaviors and effective performance.
Criterion Evidence The criterion-related approach to validation is not as
useful for evaluating performance appraisals as it is with selection tests used to
predict later performance. The strength of the approach derives largely from
showing a relationship (often expressed as a correlation coefficient) between the
measure being validated and some independent, operational performance
measure. The fact that course grades are moderately correlated with the SAT or
American College Testing (ACT) examinations lends credibility to the claim that
the tests measure verbal and quantitative abilities that are important to success in
college. The crucial factor is the independence of the operational measure, and
that is where difficulty arises. When the measure being studied is a behavioral
one, it is difficult to find operational measures for comparison that have the
essential independence.
So-called objective behavioral measures—attendance, tardiness, accidents,
measures of output, or other indices that do not involve human judgment—
appear to provide the best approximation of criteria for performance measures,
but studies using such indices are rare. Heneman (1984) was able to locate only
23 studies with a total sample size of 3,178 workers, despite a literature search
covering more than 50 years of published research. His meta-analysis
day in and day out. It may well be that most supervisory ratings are more
influenced by typical performance than the occasional best efforts. Or it may
simply be that supervisors are more influenced by job knowledge because the
direct contact of the supervisor with the employee to be rated is usually some sort
of discussion, and discussion is likely to be more informative about job
knowledge than actual performance. Whatever the exact cause, Guion suggests an
important implication of Hunter's analysis that has special salience for this study:
supervisor ratings, if they are more influenced by what employees have learned
about their jobs than what they actually do on a day-to-day basis, may be more
accurately viewed as trainability ratings than performance ratings.
The Army Selection and Classification Project (Project A) offers another
study of the relationship between performance ratings and other measures of job
proficiency, including hands-on performance, job knowledge tests, and training
knowledge tests (Personnel Psychology, 1990). One of the purposes of this
large-scale project was to develop a set of criteria for evaluating job performance
in 19 entry-level army jobs. There were five performance factors identified for
the criterion model: (1) core technical proficiency (tasks central to a particular
job); (2) general soldiering proficiency (general military tasks); (3) effort and
leadership; (4) personal discipline; and (5) physical fitness and military bearing.
All types of proficiency measures, including performance ratings, were provided
for each factor. The results, as reported by Campbell et al. (1990), show that
overall performance ratings correlated .20 with a total hands-on score; when
corrected for attenuation, the correlation increases to .36. This finding is
consistent with the results presented in Hunter's meta-analysis (1983).
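The correction for attenuation mentioned above follows the classical formula in which an observed correlation is divided by the square root of the product of the reliabilities of the two measures. The sketch below is illustrative only: the reliabilities used in Project A are not reported in this passage, so the code works backward from the two published figures (.20 and .36) to the reliability product they imply, and the function names are assumptions made for the example.

```python
from math import sqrt


def correct_for_attenuation(r_observed, reliability_x=1.0, reliability_y=1.0):
    """Classical disattenuation: r_observed / sqrt(reliability_x * reliability_y)."""
    for rel in (reliability_x, reliability_y):
        if not 0 < rel <= 1:
            raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_observed / sqrt(reliability_x * reliability_y)


def implied_reliability_product(r_observed, r_corrected):
    """Reliability product that would turn r_observed into r_corrected."""
    return (r_observed / r_corrected) ** 2


# Figures from the text: .20 observed, .36 after correction.
product = implied_reliability_product(0.20, 0.36)
print(round(product, 2))  # 0.31

# Round trip: applying the implied product recovers the corrected value.
# Treating all unreliability as sitting in one measure is only a convenience here.
print(round(correct_for_attenuation(0.20, reliability_x=product), 2))  # 0.36
```

Under these assumptions, the reported correction corresponds to a combined reliability product of roughly .31 for the ratings and the hands-on score; how that product divides between the two measures cannot be recovered from the figures given.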
Convergent and Discriminant Evidence Since other measures of the job
performance construct have not been readily available in most settings, it has
been necessary for researchers in performance appraisal to rely on agreement
among raters or to develop special study designs that produce more than one
measure of performance. Campbell and Fiske (1959) proposed the multimethod-
multirater method for the purpose of determining the construct validity of trait
ratings. Using this approach, two or more groups of raters are asked to rate the
performance of the same employees using two rating methods. Examples of
methods include BARS, graphic scales, trait scales, and global evaluation.
Convergent validity is demonstrated by the agreement among raters across rating
methods; discriminant validity is demonstrated by the degree to which the raters
are able to distinguish among the performance dimensions.
Campbell et al. (1973) used the multimethod-multirater technique to
compare the construct validity of behaviorally based rating scales with a rating of
each behavioral example separated from its dimension (like a Mixed Standard
Scale approach). In the summated rating method, raters provided one of four
descriptors about the behavior ranging from "exhibiting almost never" to "almost
always." The dimension score was the average of the item responses for that
dimension. Both rating procedures were used for 537 managers of department
stores within the same company. Ratings were provided by store managers and
assistant store managers using each method. The behaviorally anchored scales
were based on critical incidents collected and analyzed by study participants and
researchers.
The results indicated significant convergent validity between rating methods
and high discriminant validity between dimensions. That is, the raters agreed
about ratees and about their perceptions of the dimensions as they were defined
on the instruments. This suggests that the scales provided clear definitions of
behaviors, which allowed the raters to discriminate among the behaviors with
some degree of consistency. The behavioral rating scales were superior to the
summated ratings in terms of halo (similarity of ratings across performance
dimensions), leniency (inflated ratings), and discriminant validity. It is not
surprising to find agreement between the rating methods, as they are based on the
same dimension definitions and the raters were participants in the development of
the rating instruments. It is worth noting that developing the behavioral scales
was extremely time-consuming, but that the managers felt they gained a better
understanding of critical job behaviors—those that could contribute to effective
performance. This could be useful if the results were integrated into the
management development process.
The weakness of this study is that it does not really compare substantially
different methodologies. As Landy and Farr (1983) remarked with reference to a
different set of studies, when a common procedure is used to develop the
dimensions and/or examples, then the study is really only about different
presentation modes—that is, the type of anchor.
Kavanagh et al. (1971) and Borman (1978) also used the multimethod-
multirater method to examine convergent and discriminant validity. Kavanagh et
al. (1971) compared ratings of managerial traits and job functions made by the
superior and two subordinates of middle managers. The traits rated included
intellectual capacities, concern for quality, and leadership, while job functions
included factors like planning, investigating, coordinating, supervising, etc. The
results showed agreement among raters about ratees (.44) but did not demonstrate
the ability of the raters to discriminate among the rating dimensions. Raters were
more consistent when evaluating personal traits than job functions and, according
to Kavanagh et al. (1971), that finding suggests that ratings based on personality
traits are more reliable than ratings based on job functions. However, they also show an
increased level of halo over ratings based on job functions.
Borman (1978) examined the construct validity of BARS under highly
controlled laboratory conditions for assessing the performance of managers and
recruiting interviewers. Different groups of raters provided ratings for videotaped
vignettes representing different levels of performance effectiveness on selected
rating dimensions. Performance effectiveness on each dimension
ability tests and supervisor ratings was .26 and between ability tests and work
samples it was .40 (corrected for sampling error only). All of these studies
demonstrate the existence of moderate correlations between employment test
scores and supervisor ratings of employee job performance.
There are several studies that have examined the effects on performance
appraisal ratings of the demographic characteristics of the ratee and the rater
(e.g., race, gender, age). The hypothesis to be tested here is that these
demographic characteristics do not influence performance appraisal ratings. On
one hand, rejection of the hypothesis would mean that the validity of the
performance ratings was weakened by the existence of these systematic sources
of bias. On the other hand, if the hypothesis is not rejected, it can be assumed that
the validity of the performance ratings is not being compromised by these sources
of rating error.
There are meta-analyses of the research dealing with both race and gender
effects. Kraiger and Ford's (1985) survey of 74 studies reported that the race of
both the rater and the ratee had an influence on performance ratings; in 14 of the
studies, both black and white raters were present. Across all studies, supervisors
gave higher ratings to same-race subordinates than to subordinates of a different
race. The results showed that white raters rated the average white ratee higher
than 64 percent of black ratees and black raters rated the average black ratee
higher than 67 percent of white ratees. (The expected value, if there were no race
effects, would be 50 percent in both cases.) In this analysis, ratee race accounted
for 3.3 percent and 4.8 percent of the variance in ratings given by white and black
raters, respectively. In a later study, the authors (Ford et al., 1986) attempted to
assess the degree to which black-white differences on performance appraisal
scores could be attributed to real performance differences or to rater bias. They
looked at 53 studies that had at least one judgment-based and one independent
measure (units produced, customer complaints) of performance. Among other
things, they found that the size of the effects attributable to race was virtually
identical for ratings and independent measures, which led the authors to conclude
that the race effects found in judgment-based ratings cannot be attributed solely to
rater bias—i.e., there were also real performance differences.
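The two ways of expressing the rating differences reported by Kraiger and Ford (1985), as a percentage of the other group exceeded and as a percentage of variance accounted for, can be related to one another by a standard conversion. The sketch below is a reconstruction under stated assumptions (equal group sizes, normally distributed ratings with equal variances); it is not claimed to be the procedure the authors used, and the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def share_of_other_group_exceeded(variance_explained):
    """Convert variance explained into the share of the other group that
    the average member of the higher-rated group exceeds."""
    r = sqrt(variance_explained)       # point-biserial correlation
    d = 2 * r / sqrt(1 - r * r)        # standardized difference; assumes equal group sizes
    return NormalDist().cdf(d)         # assumes normal, equal-variance distributions


for share in (0.033, 0.048):
    print(share, round(100 * share_of_other_group_exceeded(share)))
# 0.033 64
# 0.048 67

# With no effect at all, the expected value is 50 percent.
print(round(100 * share_of_other_group_exceeded(0.0)))  # 50
```

Under these assumptions, 3.3 and 4.8 percent of variance correspond to the 64 and 67 percent figures in the text, which shows how an effect that accounts for a small share of variance can still appear as a noticeable shift between distributions.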
Carson et al. (1990) conducted a meta-analysis on 24 studies of gender
effects in performance appraisal. In this review, gender effects were extremely
small—the gender of both the ratee and the rater accounted for less than 1
percent of the variance in ratings. Although there was some evidence of a ratee-
gender by rater-gender interaction (higher ratings for same gender versus mixed
gender pairs), the interaction was not statistically significant. Murphy et al.
(1986) reached similar conclusions in their review.
Age has also been shown to have a minimal effect on performance ratings.
McEvoy and Cascio (1988) reported a meta-analysis of 96 studies relating ratee
age to performance ratings. On average, the age of the ratee accounted for less
than 1 percent of the variance in performance ratings. In addition, Landy and Farr
(1983) suggest that if age effects exist at all, they are likely to be small.
Another source of indirect evidence for suggesting that under some
conditions supervisors can make accurate performance ratings is the strength of
the relationship between performance appraisal feedback and worker
productivity—by inference, if feedback results in increased productivity, then the
performance appraisal must be accurate. There are several studies that have
shown that performance feedback does have a positive impact on worker
productivity as measured in terms of production rates, error rates, and backlogs
(Guzzo and Bondy, 1983; Guzzo et al., 1985; Kopelman, 1986). Landy et al.
(1982) have shown that performance feedback has utility that far exceeds its cost,
and that a valid feedback system can lead to substantial performance gains. They
reviewed several studies showing that individual productivity increased as much
as 30 percent as a function of feedback. In one of the studies (Hundal, 1969), a
correlation of .52 (p < .01) was found between the level of feedback specificity
and productivity. The subjects of Hundal's research were 18 industrial workers
whose task was to grind metallic objects. This evidence is particularly interesting
because it gets to the relevance of the appraisal, whereas much of the evidence of
interrater reliability does not.
Interrater Reliability
There have been several studies suggesting that two or more supervisors in a
similar situation evaluating the same subordinate are likely to give similar
performance ratings (Bernardin, 1977; see Bernardin and Beatty, 1984 for a
review of research on interrater reliability). For example, Bernardin et al. (1980)
reported an interrater reliability coefficient of .73 among raters at the same level
in the organization.
Other studies have examined the agreement among raters who occupy
different positions in the organization. Although there is evidence that ratings
obtained from different sources often differ in level—for example, self-ratings are
usually higher than supervisory ratings (Meyer, 1980; Thornton, 1980)—there is
substantial agreement among ratings from different sources with regard to the
relative effectiveness of the performance of different ratees. Harris and
Schaubroeck (1988), in a meta-analysis of research on rating sources, found an
average correlation of .62 between peer and supervisory ratings (correlations
between self-supervisor and self-peer ratings were .35 and .36, respectively).
One question of scale format that has received a good deal of attention in the
reliability research concerns the number of scale points or anchors. According to
Landy and Farr (1980), there is no gain in either scale or rater reliability when
more than five rating categories are used. However, the reliability drops with the
use of fewer than 3 or more than 9 rating categories. Recent work
indicates that there is little to be gained from having more than 5 response
categories.
Implications
There are substantial limitations in the kinds of evidence that can be brought
to bear on the question of the validity of performance appraisal. The largest
constraint is the lack of independent criteria for job performance that can be used
to test the validity of various performance appraisal schemes. Given this
constraint, most of the work has focused on (1) establishing content evidence
through applying job analysis and critical incident techniques to the development
of behaviorally based performance appraisal tools, (2) demonstrating interrater
reliability, (3) examining the relationship between performance appraisal ratings,
estimates of job knowledge, work samples, and performance predictors such as
cognitive ability as a basis for establishing the construct validity of performance
ratings, and (4) eliminating race, age, and gender as significant sources of rating
bias. The results show that supervisors can give reliable ratings of employee
performance under controlled conditions and with carefully developed rating
scales. In addition, there is indirect evidence that supervisors can make
moderately accurate performance ratings; this evidence comes from the studies in
which supervisor ratings of job performance have been developed as criteria for
testing the predictive power of ability tests and from a limited number of studies
showing that age, race, and gender do not appear to have a significant influence
on the performance rating process.
It should be noted that the distinction between validity and reliability tends
to become hazy in the research on the construct validity of performance
appraisals. Much of the evidence documents interrater reliabilities. While
consistency of measurement is important, it does not establish the relevance of
the measurement; after all, several raters may merely display the same kinds of
bias. Nevertheless, the accretion of many types of evidence suggests that
performance appraisals based on well-chosen and clearly defined performance
dimensions can provide modestly valid ratings within the terms of psychometric
analysis. Most of the research, however, has involved nonmanagerial jobs; the
evidence for managerial jobs is sparse.
The consensus of several reviews is that variations in scale type and rating
format have very little effect on the measurement properties of performance
ratings as long as the dimensions to be rated and the scale anchors are clearly
defined (Jacobs et al., 1980; Landy and Farr, 1983; Murphy and Constans, 1988).2
In addition, there is evidence from research on the cognitive processes of raters
suggesting that the distinction between behaviors and traits as bases for
rating is less critical than once thought. Whether rating traits or behaviors, raters
appear to draw on trait-based cognitive models of each employee's performance.
The result is that these general evaluations substantially affect raters' memory for
and evaluation of actual work behaviors (Murphy et al., 1982; Ilgen and
Feldman, 1983; Murphy and Jako, 1989; Murphy and Cleveland, 1991).
In litigation dealing with performance appraisal, the courts have shown a
clear preference for job-specific dimensions. However, there is little research that
directly addresses the comparative validity of ratings obtained on job-specific,
general, or global dimensions. There is, however, a substantial body of research
on halo error in ratings (see Cooper, 1981, for a review) that suggests that the
generality or specificity of rating dimensions has little effect. This research shows
that raters do not, for the most part, distinguish between conceptually distinct
aspects of performance in rating their subordinates. That is, ratings tend to be
organized around a global evaluative dimension (i.e., an overall evaluation of the
individual's performance; see Murphy, 1982), and ratings of more specific
aspects of performance provide relatively little information beyond the overall
evaluation. This suggests that similar outcomes can be expected from rating
scales that employ highly general or highly job-specific dimensions.
research comparing behaviorally anchored rating scales with other types of rating scales.
In particular, the performance dimensions for the scales to be compared were generated by
the same BARS methodology in some
studies, so that what was really being tested was different presentation modes, not
different scaling approaches (see Kingstrom and Bass, 1981; Landy and Farr, 1983).
between the person's own behavior and his or her performance. In Vroom's
(1964) Expectancy X Valence model, these beliefs are labeled expectancies and
described as subjective probabilities regarding the extent to which the person's
actions relate to his or her performance. The second contingency is the belief
about the degree of association between performance and pay. This belief is less
about the person than it is about the extent to which the situation rewards or does
not reward performance with pay, where performance is measured by whatever
means is used in that setting. When these two contingencies are considered
together, so goes the theory, it is possible for the person to establish beliefs about
the degree of association between his or her actions and pay, with performance as
the mediating link between the two.
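The way the two contingencies combine can be illustrated with a small calculation. The sketch below assumes the multiplicative form in which expectancy-type models are commonly summarized, with each belief expressed as a subjective probability between 0 and 1; the function name, parameter names, and numbers are hypothetical and are not drawn from the studies discussed here.

```python
def action_to_pay_belief(expectancy, performance_pay_belief):
    """Combine two subjective probabilities into a belief linking action to pay.

    expectancy:             belief that one's actions lead to performance
    performance_pay_belief: belief that performance is rewarded with pay
    The product form is an assumption made for illustration.
    """
    for p in (expectancy, performance_pay_belief):
        if not 0.0 <= p <= 1.0:
            raise ValueError("subjective probabilities must lie between 0 and 1")
    return expectancy * performance_pay_belief


# An employee confident that effort improves performance (0.8) but doubtful
# that performance is rewarded (0.25) ends up with a weak action-to-pay belief.
print(round(action_to_pay_belief(0.8, 0.25), 2))  # 0.2

# If either link is believed to be absent, the combined belief is zero.
print(action_to_pay_belief(0.9, 0.0))  # 0.0
```

The point of the multiplicative form is that a weak belief at either link limits the whole chain, however strong the other belief may be.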
The second mechanism through which performance information is believed
to affect motivation at work is that of intrinsic motivation. All theories of intrinsic
motivation related to task performance (e.g., Deci, 1975; Hackman and Oldham,
1976, 1980) argue that tasks, to be intrinsically motivating, must provide the
necessary conditions for the person performing the task to feel a sense of
accomplishment. To gain a sense of accomplishment, the person needs to have
some basis for judging his or her own performance. Performance evaluations
provide one source for knowing how well the job was done and for subsequently
experiencing a sense of accomplishment. This sense of accomplishment may be a
sufficient incentive for maintaining high performance during the time period
following the receipt of the evaluation.
The third mechanism served by the performance evaluation is that of cueing
the individual into the specific behaviors that are necessary to perform well. The
receipt of a positive performance evaluation provides the person with information
that suggests that whatever he or she did in the past on the job was the type of
behavior that is valued and is likely to be valued in the future. As a result, the
evaluation increases the probability that what was done in the past will be
repeated in the future. Likewise, a negative evaluation suggests that the past
actions were not appropriate. Thus, from a motivational standpoint, the
performance evaluation provides cues about the direction in which future efforts
should or should not be directed.
The motivational possibilities of performance appraisal are qualified by
several factors. Although the performance rating/evaluation is treated as the
performance of the employee, it remains a judgment of one or more people about
the performance of another with all the potential limitations of any judgment. The
employee is clearly aware of its character, and furthermore, it is only one source
of evaluation of his or her performance. Greller and Herold (1975) asked
employees from a number of organizations to rate five sources of information
about how well they were doing their job: performance appraisals, informal
interactions with their
supervisors, talking with coworkers, specific indicators provided by the job itself,
and their own personal feelings. Of the five, performance appraisals
were seen as the least likely to be useful for learning about performance. To the
extent that many other sources are available for judging performance and the
appraisal information is not seen as a very accurate source of information,
appraisals are unlikely to play much of a role in encouraging desired employee
behavior (Ilgen and Knowlton, 1981).
If employees are to be influenced by performance appraisals (i.e., are to attempt to
modify their behavior in response to their performance appraisal), they must
believe that the performance reported in the appraisal is a reasonable estimate of
how they have performed during the time period covered by the appraisal. One
key feature of accepting the appraisal is their belief in the credibility of the person
or persons who completed the review with regard to their ability to accurately
appraise the employee's performance. Ilgen et al. (1979), in a review of the
performance feedback literature, concluded that two primary factors influencing
beliefs about the credibility of the supervisor's judgments were expertise and
trust. Perceived expertise was a function of the amount of knowledge that the
appraisee believed the appraiser had about the appraisee's job and the extent to
which the appraisee felt the appraiser was aware of the appraisee's work during
the time period covered by the evaluation. Trust was a function of a number of
conditions, most of which were related to the appraiser's freedom to be honest in
the appraisal (Padgett, 1988) and the quality of the interpersonal relationship
between the two parties.
A difficult motivational element related to acceptance of the performance
appraisal message is the fact that the nature of the message itself affects its
acceptance. There is clear evidence that individuals are very likely to accept
positive information about themselves and to reject negative information. This effect is often
credited for the frequent finding that subordinates rate their own performance
higher than do their supervisors (e.g., see Holzbach, 1978; Zammuto et al., 1981;
and Shore and Thornton, 1986). Although this condition is not a surprising one, if
the focus is on the nature of the response that employees will make to
performance appraisal information, then the existence of the discrepancy means
that the employee is faced with two primary methods of resolving the
discrepancy: acting in line with the supervisor's rating or denying the validity of
that rating. The fact that the latter alternative is very frequently chosen, especially
when the criteria for good performance are not very concrete (as is often the case
for managerial jobs), is one of the reasons that performance appraisals often fail
to achieve their desired motivational effect.
Rater Training
Research findings on the effects of training on rating quality are mixed. A recent
review by Feldman (1986) concluded that rater training has not been shown to be
highly effective in increasing the validity and accuracy of ratings. Murphy et al.
(1986) reviewed 15 studies (primarily laboratory studies) dealing with the effects
of training on leniency and halo and found that average effects were small to
moderate. In a more recent study, Murphy and Cleveland (1991) suggest that
training is most appropriate when the underlying problem is a lack of knowledge
or understanding. For example, training is more necessary if the performance
appraisal system requires complicated procedures, calculations, or rating
methods. However, these authors also suggested that the accuracy of overall or
global ratings will not be influenced by training.
Taking the other position, Fay and Latham (1982) proposed that rater
training is more important in reducing rating errors than is the type of rating scale
used. They compared the rating responses of trained and untrained raters on three
rating scales (one trait and two behaviorally based scales). The results showed
significantly fewer rating errors for the trained raters and for the behaviorally
based scales compared with the trait scales. The rating errors were one and one
half to three times as large for the untrained group.
The training was a four-hour workshop consisting of (1) having trainees rate
behaviors presented on videotape and then identifying similar behaviors in the
workplace, (2) a discussion of the types of rating errors made by trainees, and (3) a
group brainstorming session on how to avoid errors. The workshop contained no
examples of appropriate rating distributions or scale intercorrelations; the focus
was on accurate observation and recording. Researchers have found that
instructing raters to avoid giving similar ratings across rating dimensions or
giving high ratings to several individuals may not be appropriate; some
individuals do well in more than one area of performance and many individuals
may perform a selected task effectively (Bernardin and Buckley, 1981; Latham,
1988). Thus, these instructions could result in inaccurate ratings.
Other researchers have shown that training in observation skills is beneficial
(Thornton and Zorich, 1980) and that training can help raters develop a common
frame of reference for evaluating ratee performance (Bernardin and Buckley,
1981; McIntyre et al., 1984). However, the training effects documented in these
laboratory studies are typically not large, and it is not clear whether they persist
over time.
Rating Sequence
Supervisors rating many individuals on several performance dimensions
could either complete ratings in a person-by-person sequence or in a dimension-
by-dimension sequence (rate all employees on dimension I and then go on to
dimension II, etc.). Presumably, a person-by-person procedure focuses the rater's
attention on the strengths and weaknesses of the individual, while the
dimension-by-dimension procedure focuses attention on the differences among
individuals on each performance dimension. A review of this research by Landy
and Farr (1983) indicates that identical ratings are obtained with either strategy.
Implications
Although the results are mixed, the most promising approach to increasing
the quality of ratings appears to be a combination of factors including good
scales, well-trained raters, and a context that supports and encourages the
appraisal process. With respect to training, Latham (1988) and Fay and Latham
(1982) found that training in the technical aspects of the performance appraisal
process, if done properly, can lead to more accurate ratings. Their results suggest
that if raters are trained to recognize effective and ineffective performance and
are informed about pitfalls such as the influence of false first impressions, they
can provide more reliable and accurate ratings than raters who have not received
training.
The implication is that training in the use of performance appraisal
technology can lead to both a more acceptable and a more effective system.
However, training is only one among several factors with potential influences on the
performance appraisal process. As mentioned earlier, the rater's approach to the
process is affected by organizational goals, degree of managerial discretion,
management philosophy, and external political and market forces, to name a few.
Even if raters have been trained properly and have a good grasp of the rating
process, they may distort their ratings on the basis of their perceptions of
organizational factors. There is also evidence to suggest that the purpose of the
rating may lead to rating distortion.
within the work group who perform similar jobs. In both cases, attaining or
maintaining parity might be viewed as more important than rewarding present
performance.
While these predictions of instrumentality theories are reasonable, empirical
research on motivational factors in rating distortion is rare. For example, there is
some disagreement about the extent to which negative reactions on the part of
ratees will actually affect the rater's behavior (Napier and Latham, 1986). More
fundamentally, little is known about the factors actually considered by raters
when they decide how to complete their rating forms (Murphy and Cleveland,
1991).
FINDINGS
Job Analysis
1. Job analysis and the specification of critical elements and standards can
inform but not replace the supervisor's judgment in the performance
appraisal process.
Managerial Performance
1. Most of the research on managerial performance describes broad
categories of managerial tasks such as leadership, communication, and
planning.
2. Managerial performance does not lend itself to easily quantifiable job-
specific measurement: many of the tasks performed by managers are
amorphous and not directly observable. The bulk of the existing research on
job performance and performance appraisal deals with jobs that are more
concrete and with clearer outcome measures—research that is not directly
relevant to managerial jobs.
Psychometric Properties
1. Within the framework of the psychometric tradition, research establishes
that performance appraisals show a fairly high degree of reliability and
moderate validities.
2. There is some evidence that performance appraisals can motivate
employees and can improve the quality and quantity of their work when the
supervisor is trusted and perceived as knowledgeable by the employee.
3. Real-world influences such as organizational culture, market forces, and
rating purposes can work to distort performance appraisals.
4. The research does not provide clear guidance on which scale format to use
or whether to rely on global or job-specific ratings, although a consensus
seems to be building that scale type and scale format are matters of
indifference, all things being equal. For example, one line of research suggests that rating
scale format and the number of rating categories are not critical as long as
the dimensions to be rated and the scale anchors are clearly defined. Another
line of research suggests that raters tend to rely on broad traits in making
judgments about employee performance, making the old distinctions
between trait scales and behavioral scales appear less important.
5. Although behaviorally based scales have not been shown to be superior to
other scales psychometrically, some researchers suggest that behaviorally
anchored rating scales offer advantages in providing employees with
feedback and in establishing the external and internal legitimacy of the
performance appraisal system.
6. There is some evidence that rater training in the technology of
performance appraisal tools and procedures can lead to more accurate
performance ratings.
In sum, the research examined here does not provide the policy maker with
strong guidance on choosing a performance appraisal system. Instead, the
literature presents the complexities and pitfalls of attempting to quantify and
assess what employees, particularly managers and professionals, do that
contributes to effective job performance. All of the appraisal systems that are
behaviorally based require a significant amount of initial development effort and
cost, are not easily generalizable across jobs, apparently offer little if any
psychometric advantage, and require significant additional effort as jobs change.
The primary value of behaviorally based appraisal is that it appears relevant to
both the supervisor and the employee and it may provide an effective basis for
corrective feedback.
appraisal. The existing body of research deals with different (i.e., lower-level)
jobs, and more important, different types of appraisal systems. The federal system
has characteristics of both the traditional top-down system and management-by-
objective systems (e.g., the use of elements, standards, and objectives that are
defined by the supervisor represents a mix of concepts from both types of
systems). It is not clear whether either the body of research at lower levels in the
private sector or research on managerial appraisal and management-by-objective
systems is fully relevant to the federal system.
A third gap has to do with the implications of the reliability, validity, and
other psychometric properties of appraisal systems for the behavior of employees
and the organization's effectiveness. With few exceptions, the research does not
establish any performance effects of performance appraisal. The preponderance
of evidence relates to the consistency of measurement, not the relevance.
Research documenting the impact of appraisal systems on organizations and their
members is sparse, fragmented, and often poorly done. Empirical evidence is
needed to determine whether organizations or their members actually benefit in
any substantial way when appraisals are done, other than to the extent that
legitimacy is provided and belief systems reinforced.
5
Pay for Performance: Perspectives and Research
our conclusions drawn from this research and discuss their implications for
federal policy makers.
added to base salaries—cell b—include piece rate and sales commission plans.
Piece rate plans involve engineered standards of hourly or daily production.
Workers receive a base wage for production that meets standard and incentive
payments for production above standard. Piece rate plans are most commonly
found in hourly, clerical, and technical jobs. Sales commission plans tie pay
increases to specific individual contributions, such as satisfactory completion of a
major project or meeting a quantitative sales or revenue target. These plans are
most commonly found among sales employees. Payouts under individual
incentive plans are typically larger than those found under merit plans
(HayGroup, Inc., 1989) and are often made more frequently (piece rate plans, for
example, can pay out every week).
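The piece rate arrangement described above (a base wage for meeting standard, plus incentive payments for production above standard) can be illustrated with a minimal sketch. The function name, the dollar figures, and the per-unit incentive rate are hypothetical and are not drawn from any plan discussed in this chapter; the sketch also assumes that the base wage is still paid when output falls below standard, which the text does not specify.

```python
def piece_rate_pay(units_produced: int,
                   standard_units: int,
                   base_wage: float,
                   incentive_per_unit: float) -> float:
    """Pay for one period under a simple, hypothetical piece rate plan.

    Assumption: the base wage is paid in full even when output is below
    standard; only the incentive portion depends on output.
    """
    # Only units above the engineered standard earn an incentive payment.
    units_above_standard = max(0, units_produced - standard_units)
    return base_wage + units_above_standard * incentive_per_unit


# Hypothetical worker: standard of 100 units, base wage of 80.00,
# incentive of 0.50 for each unit above standard.
print(piece_rate_pay(120, 100, 80.0, 0.5))  # 20 units above standard
print(piece_rate_pay(90, 100, 80.0, 0.5))   # below standard: no incentive payout
```

In this sketch the incentive portion drops to zero whenever the threshold is not reached, which is the source of the earnings risk for employees noted in the discussion of individual incentive plans.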
It is important to note that, although individual incentive plans can offer
relatively large payouts that increase as an employee's performance increases,
they also carry the risk of no payouts if performance thresholds are not reached.
Thus, unless employers make market or cost-of-living adjustments to base
salaries, individual incentives pose the risk of lower earnings for employees and
the potential advantage of lower proportional labor costs for employers. The same
is true of group incentive plans.
The matrix in Figure 5-1 helps to simplify and guide our discussion of
research on pay for performance plans, but it is difficult to classify all plans
neatly into one cell or another. Bonus plans—particularly those typical for
managerial and professional employees—are a good example. These plans often
combine both individual- and group-level measures of performance, with an
emphasis on the latter. For example, a managerial bonus plan may combine
measures of departmental productivity and cost control with individual
behavioral measures, such as "develops employees." Like the other individual and
group incentive plans, these bonus plans offer relatively large payments that are
not added into base salaries (HayGroup, Inc., 1989), but they do not necessarily
pay out more than once a year. We consider these types of bonus plans under
research on group incentives.
Pay for performance plans tied to group levels of measurement can, in
principle, also be divided into those that add payouts to base salaries and those
that do not. However, few examples of group plans that add payouts into base
salaries exist (cell d in Figure 5-1). More common are plans that tie payouts to
work group, facility (such as a plant or department), or organization performance
measures and do not add pay into base salaries (cell c). There are many variations
on profit-sharing plans, but most link payouts to selected organization profit
measures and often pay out quarterly. A cash profit-sharing plan, for example,
might specify that each employee covered will receive a payout equal to 15
percent of salary if the company's profit targets are met. Gainsharing plans, like
profit-sharing, come in many forms, but all tie payouts to some measure of work
group or facility performance, and most pay out more than once a year.
Traditional gainsharing plans, such as Scanlon, Rucker, or
will be enhanced, and the likelihood of desired performance increased, under pay
for performance plans when the following conditions are met:
(1) Employees understand the plan performance goals and view them as
"doable" given their own abilities, skills, and the restrictions posed by task
structure and other aspects of organization context;
(2) There is a clear link between performance and pay increases that is
consistently communicated and followed through; and
(3) Employees value pay increases and view the pay increases associated with
a plan as meaningful (that is, large enough to justify the effort required to
achieve plan performance goals).
Goal-setting theory (Locke, 1968; Locke et al., 1970), also well tested,
complements expectancy theory predictions about the links between pay and
performance by further describing the conditions under which employees see plan
performance goals as doable. According to Locke et al. (1981) the goal-setting
process is most likely to improve employee performance when goals are specific,
moderately challenging, and accepted by employees. In addition, feedback,
supervisory support, and a pay for performance plan making pay increases—
particularly "meaningful" increases—contingent on goal attainment appear to
increase the likelihood that employees will achieve performance goals.
Taken together, expectancy and goal-setting theories predict that pay for
performance plans can improve performance by directing employee efforts
toward organizationally defined goals, and by increasing the likelihood that those
goals will be achieved—given that conditions such as doable goals, specific
goals, acceptable goals, meaningful increases, consistent communication and
feedback are met.
also typically offer larger, and thus potentially more meaningful, payouts than
most merit pay plans.
Given that individual incentive plans meet several of the ideal motivational
conditions prescribed by expectancy and goal-setting theories, it is not surprising
that related empirical studies tend to focus on individual rather than merit or
group incentive plans. In reviews of expectancy theory research, Campbell and
Pritchard (1976), Dyer and Schwab (1982), and Ilgen (1990) all agree that these
studies establish the positive effect of individual incentive plans on employee
performance. The studies reviewed include both correlational field studies and
experimental laboratory studies, with the correlational studies predominating.
While these studies were primarily designed to test specific components of
expectancy theory models, they all show simple correlations, ranging from .30
to .40, between expectancy theory conditions and individual performance
measures; this means that, when these conditions are met, 9 to 16 percent of the
variance in individual performance can be explained by differences in incentives.
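The step from correlations of .30 to .40 to "9 to 16 percent of the variance" rests on the standard result that the proportion of variance accounted for by a simple correlation is its square. A minimal sketch of that arithmetic follows; the function name is illustrative only.

```python
def variance_explained(r: float) -> float:
    """Proportion of variance accounted for by a simple correlation r (r squared)."""
    return r ** 2


# The two endpoints of the correlation range reported in the reviews.
for r in (0.30, 0.40):
    print(f"r = {r:.2f}: about {variance_explained(r):.0%} of variance explained")
```

Squaring the endpoints gives .09 and .16, which correspond to the 9 and 16 percent figures cited above.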
Cumulative studies (primarily laboratory) also support goal-setting theory
predictions that specific goals, goal acceptance, and so forth, will increase
employee goal achievement—in some cases, by as much as 30 percent over
baseline measures (Locke et al., 1981). A laboratory study by Pritchard and Curts
(1973) also reported that individual pay incentives increased the probability of
goal achievement, but only if the incentive amount was meaningful. In this study
"meaningful" was three dollars versus fifty cents versus no payment for different
levels of goal achievement on a simple sorting task. Only the three-dollar
incentive had a significant effect on individual goal achievement. Similar
findings have been reported by others (see Terborg and Miller, 1978).
There are also some early field studies of piece-rate-type individual
incentive plans conducted in the wake of claims made by Frederick W. Taylor
(1911), the prophet of "scientific management" and inventor of the time and
motion study. The more methodologically sound studies generally compared the
productivity of manufacturing workers paid by the hour and those paid on a piece
rate plan, reporting that workers paid on piece rates were substantially more
productive—between 12 and 30 percent more productive—as long as 12 weeks
after piece rates were introduced (Burnett, 1925; Wyatt, 1934; Roethlisberger and
Dickson, 1939).
Viewed as a whole, these studies establish that individual incentives can
have positive effects on individual employee performance. But it is also
important to understand the restricted organizational conditions under which
these results are observed without accompanying unintended, negative
consequences. Case studies suggest that individual incentive plans are most
problem-free when the employees covered have relatively simple, structured
jobs, when the performance goals are under the control of the employees, when
performance goals are quantitative and relatively unambiguous, and when frequent, relatively large
payments are offered for performance achievement.
There are a number of case studies that document the potentially negative,
unintended consequences of using individual incentive plans outside these
restricted conditions. Lawler (1973) summarizes the results of these case studies
and their implications for organizations. He points out that individual incentive
plans can lead employees to (1) neglect aspects of the job that are not covered in
the plan performance goals; (2) engage in gaming or the reporting of invalid data
on performance, especially when employees distrust management; and (3) clash
with work group norms, resulting in negative social outcomes for good
performers.
Babchuk and Goode (1951) reported an example of neglecting aspects of a
job not covered by plan performance goals. Their case study of retail sales
employees in a department store showed that when an individual incentive plan
tying pay increases to sales volume was introduced, sales volume increased, but
work on stock inventory and merchandise displays suffered. Employees were
uncooperative, to the point of "stealing" sales from one another and hiding
desirable items to sell during individual shifts. Whyte (1955) and Argyris (1964)
provided examples of how individuals on piece rate incentives or bonus plans tied
to budget outcomes distorted performance data. Whyte described how workers on
piece rate plans engaged in games with the time study man who was trying to
engineer a production standard; Argyris described how managers covered by
bonus plans tied to budgets bargained with their supervisors to get a favorable
budget standard. Many studies of individual incentive plans—from the
Roethlisberger and Dickson field experiments to case studies like those of
Whyte—have shown clashes between work group production norms and high
production by individual workers, which led to negative social sanctions for the
high performers (for example, social ostracism by the group).
These studies also suggested that development of restrictive social norms
had some economic foundation: employees feared that high levels of production
would lead to negative economic consequences such as job loss, lower incentive
rates, or higher production standards. Restrictive norms were also more common
when employee-management relations were poor, and employees generally
distrusted managers.
These findings suggest the dangers of using individual incentive plans for
employees in complex, interdependent jobs requiring work group cooperation; in
instances in which employees generally distrust management; or in an economic
environment that makes job loss or the manipulation of incentive performance
standards likely. Indeed, a recent study by Brown (1990) reported that
manufacturing organizations were less likely to use piece rate incentives for
hourly workers when their jobs were more complex (a variety of duties) or when
their assigned tasks emphasized quality over quantity. Since many modern
since the majority of employees are rated as acceptable. The relatively smaller
payouts and their addition to base salaries could also make merit plans seem less
economically threatening than individual incentive plans.
Merit plan design characteristics, intended to diminish the potentially
negative consequences of individual incentive plans, can, however, also dilute
their motivation and performance effects. Performance appraisal objectives are
typically less specific than the quantitative ones found under individual incentive
plans. Employees may thus see them as less doable and more subject to multiple
interpretations, and their attainment may be less clearly linked to employee
performance. Pay increases are smaller and may be viewed as less meaningful;
the addition of pay increases into base salaries may also dilute the pay for
performance link (Lawler, 1981; Krzystofiak et al., 1982). Many management
theorists have suggested that employers focus on the process aspects of
performance appraisal and merit plans in order to enhance their motivational
potential (see Hackman et al., 1977; Latham and Wexley, 1981; and Murphy and
Cleveland, 1991, for reviews). For example, employee-supervisor interaction and
bargaining during performance appraisal objective-setting could increase an
employee's commitment and understanding of goals and feelings of trust toward
management. Training both supervisors and employees in how to use
performance appraisal objective-setting, feedback, and negotiation effectively is
recommended. Communication of merit pay plans as a means of differentiating
individual base salaries according to long-term career performance is also
suggested as a means of helping employees to see these plans as providing
meaningful pay increase potential. Our review of merit pay practices in the next
chapter shows that some organizations are following these recommendations.
There is very little research on merit pay plans in general or on the
relationship between merit pay plans and performance—either individual or
group—in particular.
In a recent review of research on merit plans, Heneman (1990) reported that
studies examining the relationship between merit pay and measures of individual
motivation, job satisfaction, pay satisfaction, and performance ratings have
produced mixed results. The field studies comparing managers and professionals
under merit plans with those under seniority-related pay increase plans, or no
formal increase plan, suggest that the presence of a merit plan positively
influences measures of employee job satisfaction and employee perceptions of
the link between pay and performance. In several of these studies, the stronger
measures of job satisfaction and of employee perceptions of pay-to-performance
links found under merit pay plans were also correlated with higher individual
performance ratings (Kopelman, 1976; Greene, 1978; Allan and Rosenberg,
1986; Hills et al., 1988). However, other field studies, notably those of Pearce and
Perry (1983) and Pearce et al. (1985), reported that over the three years following
merit plan implementation among Social Security office
Summary
Most of the research examining the relationship between pay for
performance plans and performance has focused on individual incentive plans
such as piece rates. By design, these plans most closely approximate the ideal
motivational conditions prescribed by expectancy and goal-setting theories, and
the research indicates that they can motivate employees and improve individual-
level performance. However, the contextual conditions under which these plans
improve performance without negative, unintended consequences are restricted;
these conditions include simple, structured jobs in which employees are
autonomous, work settings in which employees trust management to set fair and
accurate performance goals, and an economic environment in which employees
feel that their jobs and basic wage levels are relatively secure. Because these
conditions—especially the job conditions—are not found collectively in many
organizations and do not apply to many jobs, some researchers suggest that
organizations might adopt merit pay plans or group incentive plans in an effort to
avoid the potentially negative consequences of individual incentive plans while
still reaping some of their performance-enhancing benefits.
Merit pay plans have some design features, such as the addition of pay
increases to base salary, and the use of individual performance measures,
including both quantitative and qualitative objectives, that can help avoid some
of the negative consequences of individual incentive plans; these characteristics
may also dilute the plans' potential to motivate employees. Organizations,
however, can take steps to strengthen the motivational impact of merit plans.
While there is not a sufficient body of research on merit pay plans to confirm it,
we think it likely that to the extent merit pay plans approximate the motivational
strengths of individual incentive plans, they will, at minimum, sustain individual
performance and could improve it. Our conclusion is based on inference from the
research on individual incentives.
Given the restricted conditions under which individual incentive plans work
best, some organizations have adopted group incentive plans. Gainsharing and
profit-sharing plan designs retain many of the motivational features of individual
incentive plans—quantitative performance goals, relatively large, frequent
payments—but it is not as easy for individuals to see how their performance
contributes to group-level measures, and the motivational pay-to-performance
link is thus weakened. At the same time, group-level performance measures may
be more appropriate than individual measures when work group cooperation is
needed and when new technology or other work changes make it difficult to
structure individual jobs, although there is little theory or research to substantiate
this claim.
The research evidence (all based on private-sector experience) suggests that
gainsharing and profit-sharing plans are associated with improved group- or
organizational-level productivity and financial performance. This research does
not, however, allow us to disentangle the effects of group plans on performance
from the effects of many other contextual conditions usually associated with the
design and implementation of group pay plans. Consequently, we cannot say that
group plans cause performance changes or specify how they do so. Indeed, some
researchers believe that it is the right combination of contextual conditions that is
critical to improved performance, not the performance plans themselves.
This research provides us with at least a partial list of contextual conditions
that may influence pay for performance plan effects. These include task,
organizational, and environmental conditions. Task conditions reflect the nature
of the organization's work, including the complexity and interdependence of jobs,
the diversity of occupations and skills required, and the pace of technological
change. Organizational conditions include work force size and diversity, levels of
employee trust, the degree of participative management, existing performance
norms, and levels of work force skill and ability (including those of
management). Organizational conditions are all influenced by the organization's
history, strategic goals, and personnel policies and practices. Environmental
conditions include economic pressures and opportunities for growth, which
influence the organization's ability to fund performance plans and the extent to
which employees may feel economically threatened by the use of pay for
performance plans. The presence of unions is another environmental factor that
may influence pay for performance plan effects.
should help attract and retain better performers. This framework assumes the
importance of context; it also emphasizes that individuals will assess pay for
performance plans and other payments relative to everything else the organization
offers, thus placing pay in a potentially less prominent position than does the
research on performance motivation. For example, some individuals, though
opposed to pay for performance plans, might still be willing to stay with an
organization offering a challenging job, pleasant working conditions, and
opportunities for promotion. Unfortunately, although a conceptual case can be
made for the ability of pay for performance plans to help an organization attract
and retain the best performers, the research does not allow us to confirm it.
of pay offered, the pay offered for different types of jobs, and the amount of pay
increase received.
Distributive justice theories also predict that some employees, particularly
those managing or administering pay systems, will be concerned with distributing
pay increases according to rules that the majority will view as fair, thereby
reducing conflict (Greenberg and Levanthal, 1976). These distribution concerns
encompass employee perceptions of the fairness of basic pay policies, especially
those about how pay increases are allocated. Examples of pay increase policies
include increases tied to performance, increases based on seniority, across the
board (or equality) increases, and higher increases for those with greater needs.
Procedural justice theories suggest that employees have expectations about
how organization procedures will influence their ability to meet their own goals,
and that these expectations will be shaped by both individual preferences and
prevailing moral and ethical standards (Walker et al., 1979; Brett, 1986). Work
in procedural justice also suggests that the consistency with which procedures
designed to ensure justice are followed in practice is an important determinant of
their perceived fairness (Leventhal et al., 1980). In application to pay, procedural
concerns would involve employee perceptions about the fairness of procedures
used to design and administer pay. The extent to which employees have the
opportunity to participate in pay design decisions, the quality and timeliness of
information provided them, the degree to which the rules governing pay
allocations are consistently followed, the availability of channels for appeal and
due process, and the organization's safeguards against bias and inconsistency are
all thought to influence employees' perceptions about fair treatment (Greenberg,
1986a).
Research examining distributive and procedural theories in a pay context is
scarce; there are no studies that can directly answer questions about the perceived
fairness of different types of pay for performance plans. The existing research on
distributive justice does suggest that employee perceptions about the fairness of
pay distributions do affect their pay satisfaction. Research on procedural justice
suggests that employee perceptions about the fairness of pay design and
administration procedures can also affect their pay satisfaction, as well as the
degree to which they trust management and their commitment to the
organization. None of this research, however, allows us to determine causality.
Early research (mostly case studies and laboratory experiments) examining
employee perceptions of the fairness of pay distribution focused on differences in
pay for different jobs or specific tasks (Whyte, 1955; Livernash, 1957; Jaques,
1961; Adams, 1965; Lawler, 1971). It supported theoretical predictions that
employees do judge the ratio of their pay outcomes to their work contributions
against selected comparison groups, and that negative reactions—primarily pay
dissatisfaction—can occur if comparisons are unfavorable. It also suggested at
least three major pay comparison groups—employees in similar jobs outside the
organization, employees in similar jobs within the organization, and employees
in the same job within the organization—to which pay designers should be
sensitive. Recent reviews of work on pay satisfaction (Heneman, 1985; Miceli
and Lane, 1990) also suggest that pay satisfaction is multidimensional; that
employees make judgments about their satisfaction with multiple distributive
outcomes: base salaries, pay increases, and so forth. This research does not,
however, allow us to determine whether dissatisfaction with one type of pay
outcome (such as base salary) affects satisfaction with other pay outcomes (such
as merit increases).
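The ratio comparison described in this paragraph is commonly written, following Adams (1965), as a comparison of outcome-to-input ratios. The notation below is only a sketch: the symbols O (pay outcomes) and I (work contributions) and the subscripts are our labels, not the report's, and "other" stands for any of the three comparison groups named above.

```latex
\frac{O_{\text{self}}}{I_{\text{self}}} \;\text{compared with}\; \frac{O_{\text{other}}}{I_{\text{other}}},
\qquad
\frac{O_{\text{self}}}{I_{\text{self}}} < \frac{O_{\text{other}}}{I_{\text{other}}}
\;\Rightarrow\; \text{the comparison is unfavorable (pay dissatisfaction may follow)}
```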
There have been a few correlational field studies on employee perceptions
of procedural fairness—most of them examining the procedures surrounding
performance appraisal ratings used to allocate pay (Landy et al., 1978, 1980;
Dipboye and de Pontbriand, 1981; Greenberg, 1986b; Folger and Konovsky,
1989). These studies suggest that the opportunity for employees to have input into
performance evaluations is a key determinant of their perceptions about the
fairness of those evaluations. For example, when employees are able to interact with supervisors in
setting performance objectives, when they have some recourse for changing
objectives due to unforeseen circumstances, and when there are channels for
appealing ratings and pay increase decisions, they will be more likely to see
performance appraisals and any pay allocations based on them as fair. Other
studies suggest the importance of explanations about how performance appraisal
works, basing appraisals on accurate information (for example, current job
descriptions), and good interpersonal relationships between supervisor and
employee in determining employee perceptions of fairness.
As we noted earlier, this research is all focused on employee perceptions of
procedural fairness, but the findings are consistent with the body of judicial cases
shaping the legal definition of fair (and nondiscriminatory) performance
appraisal practices (Feild and Holley, 1982). These findings are also consistent
with some of the research on pay satisfaction suggesting the importance of pay
administration procedures (communication of pay policies, employee
participation in job evaluation, and so forth) to higher pay satisfaction (Dyer and
Theriault, 1976; Weiner, 1980; Heneman, 1985).
There is no body of research on employee perceptions of the fairness of
different pay increase policies—those based on performance, seniority, or
equality/across the board or according to need. Several studies (Dyer et al., 1976;
Fossum and Fitch, 1985; Hills et al., 1987) suggest that private-sector managers
believe that pay increases should be tied to performance; the perceptions of other
employee groups are not well documented. Evidence from public-sector
professional and managerial employees suggests that their beliefs differ from
those of private-sector managers. Although there appears from attitude surveys of
federal workers to be support of merit pay in principle, there is other evidence of a
disinclination to differentiate among employees. For example, several managers'
associations have proposed that performance appraisals have but two scale
points, satisfactory and unsatisfactory (Professional Managers
propose that employees' assessments of what is and is not fair depend on their
expectations about organization procedures. These expectations will be shaped by
their individual preferences, their organizational experiences, and their moral and
ethical beliefs.
structure and skill demands, the more appropriate the quantitative measures of
performance. The higher the labor intensity, the less costly it will be to
implement and monitor piece rate plans and still maintain the benefits of their
accurate measurement. As occupational diversity increases, job structures become
more complex, skill demands more varied, and quality measures of performance
more important; as labor intensity decreases, merit plans may represent the best
trade-off between accuracy of performance measurement and cost. Brown
proposes that unionization may be the best predictor of a firm's adoption of
seniority or across-the-board plans. He also suggests that, in some firm contexts,
job complexity and interdependence will make measurement of individual
performance so difficult that only group-level measures will be accurate. He does
not, however, speculate about the organization conditions that would make group
plans the cost-effective choice, and we know of no economic models that do.
Brown finds support for most of his predictions about the relationships
between firm context and choice of pay plans. His study focused on
manufacturing firms and production workers. We can only speculate that these
predictions might be applicable to professional and managerial jobs and a firm's
choice of individual bonus (based on mostly quantitative measures), merit, or
seniority or across-the-board pay increase plans. A simulation study by Schwab
and Olsen (1990) suggests that, in firms with highly developed internal labor
markets and in managerial and professional jobs, supervisory estimates of
individual performance used with conventional merit plans may provide a higher
level of accuracy for the cost than previously thought. One simulation, however,
is not enough to enable us to generalize about performance and cost trade-offs for
management and professional jobs.
Economic models provide some conceptual basis for describing the
potential trade-offs between performance and cost that an organization faces in
choosing a pay increase policy and selecting pay for performance plans. We have
no similar conceptual foundation for potential trade-offs between fair treatment
or equity and costs. It seems reasonable to think that contextual arguments about
these trade-offs could also be made. That is, the costs of ensuring that different
types of pay for performance plans are viewed as fair and equitable will be
influenced by firm context (Milkovich and Newman, 1990). However, the
arguments for cost and equity trade-offs quickly become complicated when
multiple organization stakeholders are considered. For example, when
organization conditions all favor the use of individual incentives, investments in
such procedural protections as appeals may be lower than under merit plans
because it is easier for employees to accept quantitative performance measures as
fair. Yet unions and associations often consider individual incentive plans unfair
unless they are involved in the development of individual performance measures
and in monitoring when measures should change. Some organizations
may consider that the costs of union participation cancel out the benefits from
individual incentive plan use.
Our discussion of pay for performance plan costs and trade-offs has thus far
dealt with the indirect labor costs that might be associated with plan design and
implementation. There are, in addition, the direct labor costs that merit,
individual, and group incentive plans like gainsharing and profit-sharing pay out
in increases. It has often been claimed that individual and group incentive plans
that do not add payments into base salaries will, over time, make an employer's
direct labor costs more competitive. These claims, however, depend on many
other factors, such as the employer's competitive wage policies and tax treatment
of these variable payments. They also do not consider the potentially high
indirect costs associated with successful individual and group incentive plan
design and implementation. To date, no research has convincingly supported
these claims (see Mitchell et al., 1990).
In summary, the research on cost regulation and the cost-benefit trade-offs
associated with pay for performance plans is sparse and limited to production
jobs and manufacturing settings. The research available does suggest that certain
contextual conditions believed to reflect indirect labor costs are associated with
organization decisions about adopting a pay for performance policy and selecting
among merit, individual, or group incentive plans. The more contextual
conditions depart from those considered most cost-effective in the
implementation of individual incentive plans (structured, independent jobs, low
occupational diversity, high labor intensity, and so forth), the more likely it is
that merit or group plans will be considered. We have no evidence that any
particular pay for performance plan is superior to another or to no pay for
performance plan in regulating direct labor costs.
There is no research on cost and fairness or equity trade-offs, so the most
precise summary we can offer is that we believe they exist. In adopting a merit
plan or any other pay for performance plan, organizations should consider the
likely equity perceptions of their various stakeholders, the process and procedural
changes that might be required to improve them, and the resulting costs
(economic, political, and social) of making those changes.
to any detailed analyses of the federal work forces and working conditions, so we
cannot discuss research implications exhaustively or specifically. We can,
however, discuss general implications.
Although virtually no research on the performance effects of merit pay
exists, we conclude by analogy from research that examines the impact of
individual and group incentive plans on performance that merit pay plans could
sustain, and even improve, individual performance to the extent that they
approximate the ideal motivational conditions prescribed by expectancy and
goal-setting theories. There are some features of merit plan design that depart
from these conditions, namely the use of less specific, less quantitative measures
of performance (typically performance appraisal measures) that employees may
find unclear and thus undoable, and the relatively small pay increases that are
added to base salary. Employees may view such increases as too small to warrant
additional effort, and their addition to base salary may make them seem less
linked to performance.
However, organizations can and do take steps to strengthen the motivational
impact of merit plans. For example, they can emphasize joint employee-
supervisor participation in setting performance goals, thus increasing employee
understanding about what is expected. They can emphasize the long-term pay
growth potential offered under a merit program, thus making each pay increase
seem more meaningful. This suggests that performance appraisal formats that
allow some give and take between employees and supervisors, that make
investments in training managers and employees in how to jointly set clear
performance objectives, and that implement pay communication programs
stressing the links between merit payouts, individual performance, and long-term
pay growth could enhance the performance improvement potential of federal
merit programs. The research on performance also led us to conclude that merit
pay plans might best be adopted under certain contextual conditions. (Group
incentive plans such as profit-sharing or gainsharing might also be considered,
but our focus here is on merit plans.) There is evidence that, when jobs are
complex, require work group cooperation, and are undergoing rapid
technological change, employees are less likely to find specific, quantitative
measures of performance—such as those typical of individual incentive plans—
acceptable. There is also evidence that when the organization is facing economic
pressures and reduced growth, tying relatively large payments to performance—
as is more common with individual and group incentives—is especially threatening
to employees. Moreover, the research suggests that when individual incentive
plans are adopted under these conditions, they are often associated with negative
consequences, such as employees' ignoring important aspects of their jobs,
falsifying performance data, and actively restricting work group performance by
"punishing" high performers.
The federal government obviously represents a diverse set of job and
organization conditions, and individual agencies face different economic
pressures and growth projections, but when jobs are complex and require work group
cooperation (as is true of many professional and managerial jobs), and when there
are significant economic and growth constraints, merit plans may deliver some of
the individual performance improvements associated with individual incentive
plans, yet have fewer of the negative consequences. The emphasis on the
importance of context in organizational decisions to adopt different types of pay
for performance plans also implies that an organization as diverse as the federal
government might adopt several types of pay for performance plans (merit,
individual, or group incentives) or, in some agencies, no pay for performance
plans, depending on its agency-by-agency analysis of context.
Although the research on cost regulation and the cost-performance trade-offs
associated with pay for performance plans is sparse, it is consistent with the
research on performance effects in that both support the importance of contextual
conditions in an organization's decision to adopt different types of pay for
performance plans. It suggests that firms will adopt merit plans (or perhaps group
incentive plans) when their occupational diversity, job complexity, and labor
intensity are higher than would be ideal for individual incentive plans such as
piece rates. Though piece rates offer the most potential for accurate performance
measurement (and are thus the best indicator of actual individual performance),
the cost of successfully implementing them under these organization conditions
might be prohibitive. Merit plans offer the next best level of accurate individual
performance measurement at a reasonable cost.
This research also suggests that firms that are heavily unionized tend to
adopt seniority-based or across-the-board pay increase plans, presumably because
unions are opposed to merit plans and this increases the cost of their adoption.
The federal government may face higher costs in implementing merit plans than
less unionized organizations.
There is no research that examines the relationship between different pay for
performance plans and an organization's ability to attract and retain high-
performing employees. We know that pay influences employees' decisions to join
and to stay in an organization, but we cannot disentangle the influence of overall
pay—let alone pay for performance plans—from all the other inducements
(working conditions, promotions, job security, etc.) the organization has to offer.
This suggests that the federal government consider the entire work experience
offered to employees in its efforts to attract and retain the best performers; it
should probably not expect a merit pay program alone to have a substantial
effect.
Like the research on employee attraction and retention, research on fairness
and equity does not allow us to distinguish among different types of pay for
performance plans. Our conclusions from this research do, however, have some
implications for an organization's adoption of pay for performance plans. First,
the research suggests that there are different beliefs about how pay increases
should be allocated—pay for performance, seniority, across the board, etc. The
fact that these different beliefs exist suggests potential problems
for organizations like the federal government that are trying to change their
allocation policies. Since increases are seldom doubled or denied, it appears that,
in practice, the federal government used an automatic step increase policy for
years. Although survey data indicating wide support for merit pay exist, a sizable
portion of the work force may view the automatic step system as most fair and
will thus be dissatisfied with any pay distributions based on performance criteria
(Advisory Committee on Federal Pay, 1990). Managers who try to implement a
pay for performance policy in this situation will be strongly tempted to
manipulate pay for performance plans to maintain the status quo.
At the same time, the research suggests that organizations investing in
measures to assure employees about the fairness of the procedures surrounding
pay for performance plan design and implementation can positively influence pay
satisfaction, perceptions of pay fairness, and employee trust and commitment. In
application to merit plans, certain procedures would be included: providing
employees with information about the way appraisal works, training managers in
conducting appraisals, employee participation in setting performance objectives,
and channels for appealing ratings and pay increases. Procedural fairness is also a
concern of other organization stakeholders, such as regulatory agencies and
unions or associations. When employees believe pay for performance procedures
are fair, managers administering these programs may face less hostility, despite
employee dissatisfaction with ratings or increases. We know that the federal
government has many procedural protections in place for its employees, but given
the historical precedent for seniority-based pay increases, the representation of
unions and associations in the federal work force, and the regulatory and public
scrutiny that agencies face, an examination of how those procedures are operating
and a focus on employee perceptions of fairness may be an important aspect of
merit pay reform.
We began this chapter by observing that the pay systems of organizations
have multiple objectives reflecting the various interests of multiple stakeholders.
An organization's ability to meet those objectives will not depend on pay for
performance plans alone. It depends on many organizational factors including
other pay decisions, its human resource systems, its job structures, its
management style, its work force, and its institutional goals. It will also be
influenced by external conditions such as economic pressures, unionization, and
pressures from regulations and public opinion. Switching to a pay for
performance policy, adoption of a particular pay for performance plan, or change
in current plans is unlikely to help an organization meet and balance its pay
system objectives unless the changes make sense within the total pay system, the
personnel system, and the broader organizational context. No one pay for
performance plan will be right for every organization.
The implications of the federal context for merit plan adoption are taken up
in Chapter 7. We turn next to a review of performance appraisal and pay for
performance practices in the private sector.
6
Private-Sector Practice and Perspectives
point noted in our earlier chapter on the nature of the evidence, and one to keep in
mind throughout this chapter.
The chapter is organized into two major sections. The first reviews
performance appraisal practices: the predominant types of appraisal used, the
typical objectives of performance appraisal, common performance appraisal
design and administrative characteristics, and measures of plan effectiveness. The
second provides a similar review of merit pay and individual and group
incentives. We focused on the individual and group incentive plans that do not
add pay increases into base salaries, and we have labeled these variable pay plans.
In each section we describe general trends concerning performance appraisal and
merit and variable pay plans to provide a profile of "average" practice; we then
use information from our interviews with the personnel managers of the five
Fortune 100 firms to provide richer detail about performance appraisal, merit, and
variable pay plan practices that are generally considered successful. Such details
are not available in survey reports of performance appraisal practice. Each section
ends with a brief discussion of the convergence or divergence between practice
and the research findings presented in earlier chapters.
1985, 1989; Bretz and Milkovich, 1989). The prevalence and distribution of
performance appraisal plans appear to have increased since the mid-1970s. In
particular, small companies are now more likely to use these plans, and executive
and hourly employees are more likely, in all companies, to be covered by them
(Bureau of National Affairs, 1974; Conference Board, 1977; Bretz and
Milkovich, 1989).
Organizations have historically used performance appraisal to accomplish
multiple organization objectives (Conference Board, 1977; Bretz and Milkovich,
1989; Wyatt Company, 1989b). Improvements to work performance, tying pay to
performance (via merit plans), and communicating work expectations to
employees are the three objectives that were consistently rated as the highest
priorities among the surveys we reviewed. There was more interest in using
performance appraisal results to validate selection and promotion decisions in the
late 1970s—especially for hourly and nonexempt salaried employees—but this
use is not a high priority today (Bureau of National Affairs, 1974; Conference
Board, 1977; Bretz and Milkovich, 1989).
Design Characteristics
The Conference Board (1977) reported that, despite the fact that most
personnel managers believe job analysis, description, and evaluation provide
necessary foundations to effective performance appraisal plans, less than half the
companies in its survey even reviewed job descriptions prior to plan development
or revision. Only about one-fourth of the larger organizations in the Conference
Board sample had conducted any sort of pilot testing of performance appraisal
plans prior to their implementation.
Management by objective (MBO) or "objective" work standard approaches
were the performance appraisal formats most commonly reported for executives,
managers, and professionals (Bretz and Milkovich, 1989; Wyatt Company,
1989b). The MBO format is a very loosely defined one and thus difficult to
compare across organizations. MBO is really both a planning and an appraisal
process in which the organization's strategic plans are supposed to shape broad
goals that are passed down to employees through the management hierarchy.
Both employees and their supervisors then participate in setting individual
performance objectives against these goals. For work standard appraisals,
management defines important job factors or dimensions that may be applied
uniformly throughout a major job group, such as managers, or may be customized
for particular jobs. Factors may be either qualitative (such as "provides group
leadership") or more quantitative (such as "finishes projects within days
assigned"), and they are scaled to denote different levels of performance ("well
above average" to "well below average"). Both these performance appraisal
approaches require raters to assess an employee against performance objectives
or factors. The employee's ratings on these factors are then combined into one
overall rating. The typical appraisal rating includes three to five performance
intervals or "buckets," ranging from "below expected standards" to "meets
expected standards" to "far exceeds expected standards" (Conference Board,
1977; Wyatt Company, 1987, 1989b; Bretz and Milkovich, 1989).
Most organizations reported skewing in their performance appraisal ratings
—that is, ratings that do not follow the normal distribution; most employees are
rated as fully satisfactory or above (Wyatt Company, 1987; Bretz and Milkovich,
1989). About 20 to 25 percent of the organizations in the Bretz and Milkovich
survey required that summary appraisal ratings be either ranked (that is,
individual employees in similar jobs are ranked top to bottom) or forced to
approximate a normal distribution; others may suggest informally that managers
rank or force distributions (Conference Board, 1984).
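To illustrate what ranking and forcing a distribution involve in practice, the following is a minimal sketch. The function name, the three-bucket scheme, and the 20/60/20 target shares are our assumptions for illustration; the surveys cited above do not report specific target proportions.

```python
from typing import Dict, List, Tuple

# Hypothetical target shares, lowest bucket first. The labels come from the
# rating intervals quoted in the text; the proportions are assumed.
TARGET_SHARES: List[Tuple[str, float]] = [
    ("below expected standards", 0.20),
    ("meets expected standards", 0.60),
    ("far exceeds expected standards", 0.20),
]


def force_distribution(scores: Dict[str, float]) -> Dict[str, str]:
    """Rank employees by raw score, then assign buckets by target share."""
    ranked = sorted(scores, key=scores.get)  # lowest raw score first
    n = len(ranked)
    assigned: Dict[str, str] = {}
    cumulative = 0.0
    start = 0
    for label, share in TARGET_SHARES:
        cumulative += share
        end = min(n, round(cumulative * n))  # cut point for this bucket
        for name in ranked[start:end]:
            assigned[name] = label
        start = end
    # Guard against rounding leaving anyone unassigned: place them in the top bucket.
    for name in ranked[start:]:
        assigned[name] = TARGET_SHARES[-1][0]
    return assigned


if __name__ == "__main__":
    raw = {f"employee_{i}": float(i) for i in range(10)}
    for name, bucket in force_distribution(raw).items():
        print(name, "->", bucket)
```

Note that the sketch assigns buckets purely by rank order, so two employees with nearly identical raw scores can land in different buckets. That is one reason forced distributions are sensitive to how the comparison group is defined.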
Administration Characteristics
The typical organization, as reflected in our survey review, used different
performance appraisal plans for different employee groups (executives,
managers/professionals, clerical employees, etc.) and has used the same plan,
without major revisions, for about nine years (Bretz and Milkovich, 1989). Most
organizations also reported that policy guidelines for performance appraisal
design and administration were centralized (Hewitt Associates, 1989; committee's
survey of Conference Board firms, 1990).
The Bretz and Milkovich survey (1989) reported that most organizations
required an employee's immediate supervisor to conduct performance appraisals
annually. Appraisals for managers and professionals were likely to be reviewed
by a second level of management. There was no evidence that organizations are
making increasing use of peer, subordinate, or self-review for performance
appraisals, despite the reported popularity of these practices in the business press
(Kiechel, 1989:201). Formal evaluations of managers' use of performance
appraisal and penalties for poor use were rarely reported. The average time that
managers spent on annual appraisals per employee was four to six hours.
Several surveys (Bretz and Milkovich, 1989; Wyatt Company, 1989b;
committee's survey of Conference Board firms, 1990) reported that employee
participation in performance appraisal design and administration was mostly
limited to personnel staff; line managers were involved in administration only via
actual assessments of their employees; employees were involved only if there
was joint manager-employee setting of performance objectives for the appraisal
and, in some cases, the appeals process. Only about one-fourth of the
organizations in the Bretz and Milkovich survey (1989) had a formal employee
appeals process.
The Wyatt Company (1989b) reported that most organizations did not
provide managers and employees with much assistance in understanding and
using performance appraisal. When assistance was provided, it was to the managers
who are expected to conduct appraisals, not to the employees being appraised.
Only a small proportion of companies have written objectives for their
performance appraisal systems or provide written instructions for supervisors
about how to use performance appraisal plans. Performance appraisal training
was typically conducted for managers only when a new performance appraisal
plan was first implemented. Training focused on using forms, measuring
performance, conducting interviews, providing feedback, and setting performance
objectives. There was more training emphasis on avoiding bias (perceptual,
memory, and racial/ethnic types of bias) in the 1970s than there is today.
Measures of Success
Fewer than half the organizations participating in the surveys the committee
reviewed reported any formal measurement of performance appraisal success.
Among those who did measure, managerial and employee opinion surveys were
typical measurement approaches. These surveys ask personnel managers, other
managers who administer the plans, and employees covered by the plans about
how effective they perceive the plan to be both overall and in accomplishing
specific plan objectives (improving performance, tying pay to performance, and
communicating work expectations). Personnel managers (the designers of
performance appraisal plans) were the most likely employees to be questioned in
opinion surveys, as well as the most likely to view plans as "very effective" or
"partially effective," but even they recognized problems. In general, fewer than 20
percent of personnel managers polled in recent surveys gave their performance
appraisal plans an overall rating of "very effective"; another 60 to 70 percent,
however, rated their plans "partially effective." Other managers and employees
were similarly unenthusiastic. On average, fewer than one-third rated their
organization's performance appraisal plans as "effective" in tying pay to
performance or in communicating organizational expectations about work
(HayGroup, Inc., 1989; Wyatt Company, 1989b).
Note: Table reports survey results for a sample of 3,052 companies, which are also broken out for
companies with effective and ineffective plans. Percentages in each cell are based on total number of
companies in the column. Of the personnel managers responding, 14 percent (427) considered their
firm's plan effective; 17 percent (518) considered their firm's plan ineffective; and 66 percent
considered it partly effective.
Source: The Wyatt Communicator: Results of the 1989 Wyatt Performance Management Survey,
Fourth Quarter, 1989 (Chicago: The Wyatt Company) pp. 7-8.
and that employees would perceive personnel practices with no reference to merit
to be unfair.
Consistent with this meritocratic personnel philosophy, the five personnel
managers we interviewed emphasized that their firms were perceived as good,
even elite, places to work. Both formal organization communications and
informal social norms reinforced these perceptions. Indeed, all the personnel
managers we interviewed considered the identification of employees far above or
below acceptable performance norms as a primary performance management
objective of their plans.
This private-sector view of a meritocratic personnel philosophy and the role
that performance appraisal plays in it appear to differ from the federal
government's meritocracy, especially in practice. For example, the importance of
identifying top and bottom performers in order to sustain high levels of work
force contributions is accompanied in the private sector by relative discretion of
managers to promote top performers and dismiss employees with consistently
poor performance. Federal managers have more limited discretion to make such
decisions. This lack of discretion may reduce the potential organizational benefits
of performance appraisal and make its role in the federal meritocracy less clear.
All five of the personnel managers we interviewed indicated that their firms
regularly canvassed employee opinions regarding performance appraisal plans;
they were more concerned with indicators related to specific objectives (such as
"the plan helps communicate work expectations" or "the plan links pay to
performance") than with overall satisfaction ratings. They believed these more
specific indicators provide a better yardstick against which personnel managers
can judge whether employees perceive that performance appraisal plans are
operating as intended. They believed that employees' sense of consistency
between what the organization says performance appraisal is supposed to do, and
what it does, is basic to their perceptions about its fairness and that employees'
sense of fairness about personnel programs in general is basic to a meritocratic
personnel philosophy.
Four of the five firms had a centrally developed performance appraisal plan
for exempt employees. The fifth, itself a major division (45,000 employees) of a
larger corporation, had traditionally decentralized all performance appraisal
decision making within the division, but had recently proposed a common
MBO-type plan for all the division's exempt employees. In all cases,
centralization meant that headquarters personnel staff provided managers with
sample communications defining the firm's performance appraisal philosophy and
its relationship to other personnel practices, a set of broadly defined performance
areas (such as people management and development or customer satisfaction), a
set of administrative guidelines, and training materials. Managers then had
considerable discretion to adapt these to their own departments.
The prevalence and distribution of variable pay plans are difficult to gauge
via surveys. There is such a variety of plans that it is difficult to tell exactly which
plans are being counted in any given survey. A recent survey (O'Dell, 1987)
reported phenomenal growth in organizational interest in variable pay plans, but
such growth must be assessed against a 40- to 50-year history of scant use of or
interest in variable pay in U.S. industry (Mitchell et al., 1990). There is no doubt
that variable pay plans are much less prevalent and less widely distributed across
employee groups than merit plans. For example, O'Dell's 1987 survey of
incentives (conducted by the American Compensation Association and the
American Productivity Center) indicated that 13 percent of the firms they
surveyed (n = 1,598 private-sector firms) were using gainsharing plans. The
Hewitt Associates 1989 compensation survey reported that 16 percent of their
survey respondents (n = 705 private-sector firms) were using gainsharing; the
Conference Board's 1990 survey of variable pay (n = 435 private-sector firms)
reported 13 percent. Hewitt noted that two-thirds of the gainsharing plans in their
survey had been in place less than three years. Similarly, Hewitt reported that 16
percent of the firms they surveyed reported using cash profit-sharing plans; the
Conference Board reported 19 percent. Hewitt also reported that half of the cash
profit-sharing plans in their survey had been in place for less than three years.
Executives have traditionally had profit-sharing and bonus plans, sales
people are often on commission plans, and a limited number of hourly employees
work on piece rate plans (such as in the garment industry); however, the vast
majority of employees have not been covered by variable plans. The 1989 Hewitt
survey suggests that variable pay plans may now be covering some nontraditional
employee groups. For example, 35 percent of the organizations in the survey (n =
435) reported using gainsharing plans for exempt employees, although these
plans have been more commonly used for nonexempt employees. Profit-sharing,
traditionally used for executives, covered nonunion hourly employees in 47
percent of the organizations surveyed. TPF & C/Towers Perrin (1990) reported
that variable pay plans are less likely to be found in union environments.
The objectives claimed for variable pay plans are legion. The 1990 TPF & C
report on group incentives indicates that 73 percent of the organizations they
surveyed (n = 144) gave "supports personnel strategy as it relates to competitive
or revitalization business strategies" as their most important reason for adopting
variable pay plans. They noted that this objective encompassed other goals:
encouraging employee participation, increasing organization productivity and
quality, increasing employees' sense of ownership in the organization, and
moving employees away from a sense of entitlement to automatic annual pay
increases. O'Dell's 1987 report (700 organizations in their sample used
gainsharing or profit-sharing plans) listed increasing organization productivity
and financial performance as one of the most important reasons for
organizations' adoption of variable plans. Controlling costs was another
important reason given.
The Conference Board (1990) reported that, of the 57 organizations with
gainsharing plans in their survey, over half thought organization productivity and
quality improvements were the most important reasons for adopting the plans;
between 25 and 30 percent indicated that they used gainsharing to increase
employee involvement and promote teamwork; 19 percent reported controlling
labor costs as important. Taken as a whole, these reports suggest that
organizations adopt variable pay plans to improve organization performance,
increase employee acceptance and involvement in organization goals, and
regulate costs.
spectrum of variable pay plan designs, there are some common design issues that
must be addressed in all such plans: determining the performance measure to be
used, identifying employee eligibility, specifying the payout distribution rules,
and setting payout form and frequency (Milkovich and Newman, 1990). As under
merit plans, the distribution rules help to regulate costs or the distribution of the
plan funds and ensure that individual employees are treated consistently.
1984). This suggests that organizations define the role of merit pay in their pay
communications to employees. Yet we found no recent surveys detailing what
information organizations provide their employees about pay. A 1976 Conference
Board survey indicated that up to 70 percent of the organizations in their survey
had a policy of telling employees their pay range, but fewer than 20 percent
discussed the organization's overall pay structure, let employees know what other
organizations were used as a comparison group for determining salary market
competitiveness, or told employees the size (percentage) of the average merit
increase. This is certainly in direct contrast to the federal government, where this
pay information is available to employees.
There is no average set of administrative guidelines for variable pay plans.
Unlike merit plans, however, variable pay plan administration—or perhaps the
better word here is implementation—often goes hand in hand with much broader
organization changes such as job redesign, team development, changes in
management style, increased investments in employee participation, major
communication efforts, more sharing of information with employees, more
explicit provisions for job security, training for plan administration, and so forth
(Conference Board, 1990; TPF & C, 1990; Wallace, 1990). (For more
information on variable pay plan implementation and further references, see
Milkovich and Newman, 1990.)
7
The Importance of Context
Our reviews of performance appraisal and merit plan research and practice
indicate that plan success or failure is substantially influenced by the context
within which the plan is embedded. Research on performance appraisal now
encompasses a broader set of organizational factors, along with the individual and
task factors that it has traditionally studied (Murphy and Cleveland, 1991).
Research on pay now stresses the importance of viewing pay and pay for
performance plans in the context of an organization's personnel system, its
structure and managerial styles, and its strategic goals (Balkin and Gomez-Mejia,
1987a, 1990; Carroll, 1987). Managers of performance appraisal, merit pay, and
variable pay plans stress that these plans must fit or be consistent with the
organization's personnel practices, culture, and strategic mission or goals if they
are to work as the organization intends. Both researchers and managers
acknowledge the influence of environmental conditions on organization decisions
about adopting and implementing these plans.
The rationale underlying this concern with context is a simple one. In
Chapter 5, we noted that theory and research on individual motivation show that
individuals are motivated by pay to the extent that they value pay, understand
performance goals, and believe that pay is contingent on that performance.
Variations in an organization's context attributable to its strategy, structure, job
design, culture, management systems, personnel systems, and work force culture
and characteristics can strengthen or attenuate the links between pay and
individual motivation. Design and implementation of performance appraisal and
merit plans that fit or are consistent with context factors tend to strengthen these
links.
Schendel and Hofer, 1979; Lamb, 1984; Porter, 1985; and Harrigan, 1988). Two
primary strategic postures have been applied in studies of the association between
strategy and performance evaluation and pay systems in private-sector firms: a
dynamic, growth-oriented model and a steady-state model. Most of these studies
are cast at the headquarters level and examine executive compensation systems.
For our purposes they are interesting because they suggest that different strategic
goal orientations are associated with different emphases on performance
evaluation and pay for performance plans. We emphasize the word suggest here,
since these studies are, at best, descriptive and cannot be viewed as
generalizable.
A 1985 study by Kerr, for example, used a multiple case study methodology
to classify firms according to their corporate strategies and to distinguish patterns
of performance evaluation and pay plans within strategic classes. The 20 firms
that Kerr classified were pursuing either an "evolutionary/dynamic growth" or a
"steady-state/maintenance" strategic approach to their market environments.
Evolutionary strategies were defined as emphasizing increasing market growth
through active pursuit of new markets via acquisitions, joint ventures or mergers,
and innovative products or services. Steady-state strategies were defined as
emphasizing holding onto current market positions through internal development
of technology, improvements in products or services, increasing work force
productivity, internal coordination, and economies of scale.
Kerr found that executives in firms successfully pursuing evolutionary
strategies were more likely to be evaluated strictly on quantitative, organization-
level measures of strategic performance tied to bonus plans that offered high
returns (40 percent or more of base salary). Executives in firms successfully
pursuing steady-state strategies were more likely to be evaluated against a mix of
subjective and quantitative performance measures cast at both the individual and
the organization levels. Their bonuses paid out at a lower rate (20 percent of base
salary). (Kerr's results were consistent with earlier work on corporate strategy and
executive pay, such as Berg [1965] and Pitts [1976].)
In a 1984 study of electronics manufacturing firms, Balkin and Gomez-
Mejia found that firms pursuing strategic innovation and growth goals through
new research and development were more likely to offer their engineers and
scientists a higher proportion of their pay in the form of incentives (bonuses,
profit-sharing, and stock) than firms with less investment in innovation and new
development.
In both these studies, organizations pursuing riskier (i.e., evolutionary,
innovative) strategies were evaluating their managers or professionals on
quantitative, specific, organization-level performance goals. They offered them
pay incentives that would be paid out only if the organization was successful, but
would then pay out very well. We can speculate that by tying performance
evaluations strictly to strategic goal attainment and by offering high payouts,
organizations are sending a signal to current and prospective employees about
and do so in order to leave the impression among members that things are not
done arbitrarily.
Size
The work discussed so far does not capture the size or scale, the scope of
operations, the complexity of joint working arrangements, and the diversity of
work forces typical of many large, modern organizations. In particular, in large
organizations with diverse operating units and work forces, there is always the
question of where in the organization's structure decentralization of performance
appraisal and pay systems is most likely to facilitate the achievement of strategic
objectives. We know, for example, that even within a discrete business unit,
personnel systems, including performance evaluation and pay for performance
plans, may vary by employee group (Hewitt Associates, 1989).
The business policy studies of the 1960s and 1970s illustrated two basic
approaches to corporate structuring and control of large, diverse businesses: one
in which corporate management took a hands-off or holding company approach to
managing business divisions; the other in which the corporate management tried
to set basic policy guidelines and used both performance evaluation and pay
systems to tie division managers to corporate as well as divisional goals
(Chandler, 1962; Berg, 1965; Pitts, 1976). Recent case studies of globalizing or
transnational firms have noted that, while some firms try to manage and
coordinate diverse businesses and work forces by developing more elaborate
bureaucratic and centralized structures and controls, most have moved to global
statements of corporate values that are intended to guide, but not dictate, business
unit actions at a decentralized level (Doz and Prahalad, 1981; Galbraith and
Kazanjian, 1986; Bartlett and Ghoshal, 1988; Evans, 1989). The work of Vancil
and Buddrus (1979) also supports decentralized control of performance appraisal
and pay for performance plans based on the nature of the work being performed
(e.g., team-based, task interdependence, task concreteness, stability of
technology, etc.).
ENVIRONMENTAL FACTORS
While there are a host of environmental factors that may influence
organizational arrangements, we focus here on three sets of institutional forces of
particular interest to performance evaluation and pay for performance systems:
economic pressures and growth; the presence of unions and professional
associations; and the pressure of laws and regulations governing personnel
systems.
motivated by them) when they believed the plans might eventually result in
reducing the organization's demand for people in their jobs. Likewise, the case
studies of gainsharing plans suggest that employees are more likely to accept
these plans when there is some form of job guarantee attached or the
organization's future economic success and growth look promising (Schuster,
1984b). Conventional merit plans also offer more incentive potential for
employees when the organization is growing. As we noted in our review of
practice, the opportunity to promote high-performing employees who are also
high in their salary range makes it more likely that merit plans will, over time,
provide higher-performing employees with higher pay levels. Some
organizations, faced with limited employment growth, are now considering
avoiding restrictions on merit allocations for employees already high in their
salary ranges by offering some portion of merit increases as lump sums (i.e., not
added into base salaries). In short, some assurances that pay for performance plan
payouts are feasible and that job security is not jeopardized by the plan appear to
be important to employee acceptance and motivation under pay for performance
plans. Both may be influenced by the organization's economic and growth
prospects.
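The difference between a merit increase added to base salary and one paid as a lump sum can be illustrated with simple arithmetic. The sketch below is only an illustration under assumed figures: the function name project_base_salary, the 4 percent merit rate, the $50,000 starting salary, and the three-year horizon are all hypothetical and are not drawn from any plan or survey discussed in this report.

```python
def project_base_salary(base, merit_rate, lump_sum_share, years):
    """Return (ending base salary, total lump sums paid) after `years` of awards.

    Each year's merit award is merit_rate * current base. The fraction
    lump_sum_share of the award is paid as a one-time lump sum; the
    remainder is added into base salary and so compounds in later years.
    """
    total_lump = 0.0
    for _ in range(years):
        award = base * merit_rate
        lump = award * lump_sum_share
        total_lump += lump
        base += award - lump  # only the non-lump portion raises base pay
    return base, total_lump


# Hypothetical employee: $50,000 base, 4 percent merit award, three years.
all_in_base = project_base_salary(50000.0, 0.04, 0.0, 3)
all_lump_sum = project_base_salary(50000.0, 0.04, 1.0, 3)
half_and_half = project_base_salary(50000.0, 0.04, 0.5, 3)
```

Under these assumed figures, paying the whole award as a lump sum leaves base salary unchanged, so an employee already high in the salary range can still receive a merit payment without moving further up the range.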
FINDINGS
1. Using very precise individual performance measures and incentives
systems for managerial and professional jobs can have potentially negative
consequences for the organization; many organizations use more global
appraisals combined with merit plans for such jobs.
2. Organizations differ in their ability to articulate strategic goals that
provide direction throughout the management hierarchy in setting
meaningful performance appraisal goals. Some organizations—especially
public-sector organizations—find it difficult to articulate overall mission or
strategic goals.
3. Public-sector organizations may use more formal, precise performance
appraisals in an effort to make management decisions appear legitimate both
to employees and to other constituents. While this may be useful in
satisfying some constituents (for example, Congress) it may make
employees skeptical of their performance appraisals and any pay system
based on them, and it may reduce management incentives to administer the
systems as the organization intends.
4. The literature related to fit suggests that there is a general match between
certain patterns of organization strategy, structure, and management on the one hand
and performance evaluation and pay systems on the other. For example,
traditional performance appraisal and merit pay plans appear to be most
suited to steady-state organizations, which emphasize skill development and
work force norms. Group incentive systems appear better suited to
innovative entrepreneurial organizations.
5. Many large firms with diverse goals and work forces have moved towards
decentralized management strategies, with the home office providing policy
and audit functions and the local units designing and implementing
performance evaluation and pay systems.
This general discussion of contextual factors shaping performance appraisal
and pay practices suggests not only that performance appraisal and pay practices
must be aligned with the rest of an organization and its environment but also,
presumably, that the reverse is true. In other words, to the extent that the federal
government is seriously devoted to pay for performance, success in implementing
it is unlikely unless the broader context supports it.
8
Findings and Conclusions
Reliability
Reliability analysis provides an index of the consistency of measurement,
from occasion to occasion, from form to form (if there are several versions of a
test or measure that are all intended to measure the same thing), or from rater to
rater. The first- and last-mentioned types of reliability analysis are particularly
pertinent to performance appraisal. If the measurements are to
have any meaning, one would expect the rater to reach the same judgment from
one week to the next (assuming the employee's performance did not change
significantly), just as one would hope that several raters would reach substantially
the same decision about a single individual's performance. Data on reliability
derive in part from operational settings and in part from laboratory experiments
or from research projects undertaken in field settings, using special rating
instruments developed for the purpose and administered with the proviso that no
operational decisions will be based on the results.
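As a rough illustration of what consistency from rater to rater means in practice, the sketch below computes a correlation between two supervisors' ratings of the same employees. The ratings, the names rater_a and rater_b, and the helper pearson_r are hypothetical and do not come from the studies reviewed here. The Pearson correlation is used only as one common index of agreement, not as the method of any particular study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 ratings")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("ratings must vary for the correlation to be defined")
    return cov / sqrt(var_x * var_y)


# Hypothetical 5-point ratings of the same six employees by two supervisors.
rater_a = [3, 4, 2, 5, 4, 3]
rater_b = [3, 5, 2, 4, 4, 2]

interrater_r = pearson_r(rater_a, rater_b)  # roughly 0.79 for these made-up data
```

The same function could be applied to one rater's judgments of the same employees on two occasions to express consistency from occasion to occasion. A high value in either case indicates consistency only, not accuracy.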
Findings: Reliability
1. There is substantial evidence in the research literature to support the
premise that supervisors are capable of forming reasonably reliable
estimates of their employees' overall performance levels. For the mostly
nonmanagerial jobs studied over the years, raters show substantial
agreement in rating workers' performance. There are also some data showing
interrater agreement on managerial performance.
It is important to remember, however, that consistency among raters cannot
be taken simply at face value as proof of the accuracy of performance appraisal
procedures; it can also cloak systematic bias or systematic error in valuing
performance. Systematic bias is difficult to detect, the more so if it is the product
of unexamined views and conventional assumptions. There is evidence of such
bias, fragmentary but suggestive, in a small number of studies showing that white
supervisors tend to rate white employees as a group somewhat higher than black
employees and, conversely, that black supervisors rate black employees higher on
average. The studies have not been able to distinguish between real performance
differences and rater bias but suggest the presence of both, although the variance
accounted for by bias appears to be quite small.
Validity
From the psychometric perspective, the central question posed by any
measurement system is whether it produces an accurate assessment of relevant
performance. Validity is the technical term used to refer to the degree of accuracy
and relevance that characterizes a measurement procedure. It is not meant to
imply a static characteristic of a test or rating scale; rather, the term has to do with
the structure of meaning that can be built up to support the assessment results.
Validity, therefore, is an accretion of evidence from many sources; it describes a
research process that gradually lends confidence to the interpretations or
judgments made on the basis of the measure.
In the realm of job performance, validation begins in an important sense with
an analysis of the job or category of jobs for which performance measures are to
be developed. If an employment test or appraisal system can be linked to
important aspects of the job—say typing accuracy and speed or a sonar
Findings: Validity
1. Performance appraisal does not lend itself to the full complement of
validation strategies that have been found useful for standardized tests.
Criterion-related validity, for example, is rarely as useful for evaluating
performance appraisals as it is with selection tests. The strength of the
approach lies in showing that a healthy relationship exists between, say, test
results and some independent, operational performance measure (e.g.,
college admissions test and grade-point average). When the measure being
validated is itself a behavioral measure, it is difficult to find relevant
operational measures for comparison that have the essential independence.
As a consequence, what is frequently considered a compelling type of
evidence in validation research is usually not possible for performance
appraisals. Furthermore, in those limited conditions in which independent
criteria do exist, the jobs themselves tend to be much more simple and
straightforward than those for which appraisals are typically used.
2. It is, however, possible to compare performance appraisals to other
measures of job performance using the conventional statistical methods of
psychometric analysis. Recent military job performance measurement
research, for example, demonstrated moderate correlations between
supervisor ratings and each of the other types of criterion measures
developed (hands-on test scores, training grades, written job knowledge
tests), which lends credibility to the claim that carefully developed
performance appraisals can bear a meaningful degree of relationship to
actual job performance.
3. Supervisor ratings have been used in thousands of studies designed to
examine the power of cognitive and other ability tests to predict job
performance—in other words, they have been used to validate employment
tests. These studies consistently show a low to moderate observed
correlation between employment tests and supervisor ratings; job
incumbents who score well on the test tend also to receive good ratings and
those with low test scores tend to be rated as mediocre performers. While
admittedly circular, this relationship provides further indirect evidence that
supervisors can rate their employees with some degree of (but by no means
perfect) accuracy; whether they will do so in an operational setting is
another matter.
Scale Characteristics
A wide variety of rating scale formats, defining performance dimensions at
varying levels of specificity, exist. Commonly used rating dimensions include
personal traits (e.g., initiative, leadership, perseverance), job behaviors (e.g.,
follows safety procedures in engine room, financial management, interpersonal
relations), and performance results (e.g., quality of work, quantity of work). The
number of scale points has ranged as high as 11, but most appraisal scales have
between 3 and 5.
In terms of scale format, a general distinction can be made between scales
that include specific behavioral examples of good, average, and inadequate
performance and those that do not. The latter, called graphic scales, simply list
the dimension of interest and present a number of scale points along a
continuum. The scale points, or anchors, can be numerical or adjectival (e.g.,
consistently superior, average, consistently unsatisfactory).
Behaviorally anchored rating scales (BARS) were developed to reduce some
of the rating error typical of graphic scales. Proponents thought that BARS would
help to clarify the meaning of the performance dimensions used and would help
calibrate various raters' definitions of what constitutes superior, average, and
unsatisfactory performance on the dimension. It was also felt that the behavioral
descriptions would discourage the tendency to rate on broad, general traits by
focusing attention on specific work behaviors. Mixed standard scales, also
behaviorally based, went one step further in trying to control rater error,
particularly bias and leniency. These scales present the behavioral descriptions in
random order and not in conjunction with a particular performance dimension.
The rater's responses are computed by someone else into a performance score for
each dimension measured.
was noted by Landy and Farr (1983). It is, namely, that in many studies the scales
compared were actually developed in the same way. The performance dimensions and
behavioral examples were developed according to BARS methodology. This means that
only the presentation modes were actually compared. Many authors have also pointed to
the lack of rigor in the selection and scaling of anchors, which suggests that the final word
has not been spoken on the merits of behavioral approaches to rating scales.
It is also the case that the choice of approach (traits or behaviors) and format (BARS or
graphic format) may make a difference in the usefulness, if not the accuracy, of the
ratings. Scales containing specific behavioral examples may be more useful for providing
feedback to employees; trait scales may be more useful for ranking those rated.
that explicitly guide the rater through both performance observation and
performance assessment.
in a position to know their employees well and to have far more information
available to them than the consumers of standardized test results—say, a college
admissions committee.
1. These considerations lead us to conclude that for most personnel
management decisions, including annual pay decisions, the goal of a
performance appraisal system should be to support and encourage
informed managerial judgment and not to aspire to a degree of
standardization, precision, and empirical support that would be
required of, for example, selection tests.
In this context, informed judgment means that there are demonstrable and
credible links between the performance of the individuals being rated and the
supervisor's evaluation of that performance.
on commission—payouts are not added to base salary. Although the payouts can
be large, they also carry the risk to the individual of no payout if performance
thresholds are not met.
Group incentive plans differ from the two preceding types in basing
compensation decisions on unit or system performance rather than individual
performance. Thus profit-sharing plans or equity plans link employees' payouts to
the overall fortunes of the firm as measured by some indicator of its financial
health. Although payouts can be large in good times, they are not usually added to
base pay—hence the designation variable pay plan.
All pay for performance plans are designed to deliver pay increases to
employees based, at least in part, on some measure of performance. In theory,
such plans offer several potential benefits:
• They can support the organization's personnel philosophy by helping to
communicate the organization's goals to its employees. For example, if
financial goals are paramount, then a pay for performance plan tied to
the achievement of financial goals (e.g., a profit-sharing plan) helps
reinforce their importance for employees.
• Goal theory also suggests that performance-based pay plans can support a
certain level of performance that is consistent with the organization's
mission. For example, a plan that pays out when financial goals are
almost met (80 percent) sends a different message to employees than one
that pays out only when goals are completely met (100 percent).
Likewise, if employees receive no pay increase when their performance
appraisal is below some work force norm, then they are more likely to
attend to that norm.
• They can help ensure consistency in the distribution of pay increases.
For example, under a plan that ties pay increases to a specific financial
goal, payouts are distributed only when that goal is met. Under a merit
plan, pay increases are distributed consistently to employees who are in
the same pay grade, who are in the same position in grade, and who have
the same performance appraisal ratings. This helps the organization
predict and regulate the price tag for merit increases.
• Motivation theory suggests that pay for performance can positively
influence individuals to achieve goals that are rewarded. To the extent
that these goals contribute to organizational effectiveness, we can infer
that pay for performance can influence individual and organizational
effectiveness.
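Two of the distribution rules described above can be sketched in a few lines: a payout that depends on whether goal attainment reaches a threshold, and a merit grid that gives the same increase to employees with the same rating and the same position in the pay range. This is a minimal sketch under assumed values. The function names, the rating labels, the "low" and "high" range positions, and every percentage in MERIT_GRID are hypothetical, and pay grade is left out of the grid for brevity.

```python
def goal_based_payout(target_bonus, attainment, threshold):
    """Pay the full target bonus only if goal attainment reaches the threshold."""
    return target_bonus if attainment >= threshold else 0.0


# Hypothetical merit grid: (appraisal rating, position in pay range) -> increase rate.
MERIT_GRID = {
    ("exceeds", "low"): 0.07,
    ("exceeds", "high"): 0.05,
    ("meets", "low"): 0.04,
    ("meets", "high"): 0.03,
    ("below", "low"): 0.0,
    ("below", "high"): 0.0,
}


def merit_increase(base_salary, rating, range_position):
    """Same rating and same range position always yield the same increase rate."""
    return base_salary * MERIT_GRID[(rating, range_position)]


# The same 85 percent goal attainment under two hypothetical plan designs.
payout_at_80_threshold = goal_based_payout(2000.0, 0.85, 0.80)   # pays out
payout_at_100_threshold = goal_based_payout(2000.0, 0.85, 1.00)  # no payout
```

Because every increase is read from one grid, an organization using such a rule could estimate the total cost of merit increases in advance from the distribution of ratings and range positions.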
Before turning to the research findings, it is important to note that
performance-based pay is only one dimension of employee compensation; other
dimensions include competitiveness of salaries with the marketplace, benefits
packages, cost-of-living considerations, and others. The effects of merit or
variable pay plans will depend in good measure on this larger compensation
context.
Employee Motivation
The research most directly related to questions about the impact of
performance-based pay plans on individual and organizational performance
comes from theory and empirical study of work motivation. Motivation theories
that have been well tested empirically predict that employee motivation is
enhanced, and the likelihood of desired performance increased, under pay for
performance plans when: (1) employees understand performance goals and view
them as "doable" given their own abilities and skills and the restrictions posed by
organization context; (2) there is a clear link between performance and pay
increases, consistently communicated and followed; and (3) the pay increase is
viewed as meaningful.
studies suggesting that managers and professionals under a merit pay system
(as opposed to a straight seniority system or no formal system) express more
job satisfaction and perceive a stronger tie between pay and performance.
Other studies suggest that these effects may be tenuous.
6. Some group incentive plans retain many of the motivational features of
individual incentive plans (quantitative performance goals, relatively large
and frequent payouts), but it is not easy for individuals to see how their
performance contributes to group- or organizational-level measures, so the
motivational link is weakened. More to the point, payouts may occur only in
good times and are dependent on larger environmental and economic forces
beyond the control of the individual employee.
7. There is a modest body of research evidence drawn from private-sector
experience that suggests that gainsharing and profit-sharing plans are
associated with improved group- or organizational-level productivity and
financial performance. This research does not, however, allow us to
disentangle the effects of the pay plans on performance from many other
contextual conditions. We cannot say that group plans cause performance
changes or specify how they do.
Overall Findings
1. There is a broad consensus among practitioners—as well as some research
evidence—that personnel systems in general and performance appraisal and
pay systems in particular must exhibit "fit" or congruence to be effective.
2. Three categories of contextual factors of particular relevance to
performance appraisal and pay for performance emerged from our reviews
of research and practice: (a) the nature of the organization's work, or what
might be called technological fit; (b) the broad features of the organization's
structure and culture; and (c) external factors such as economic climate, the
presence of unions, and legal or political forces exerted by external
constituents.
Technological Fit
The strongest evidence on congruence has to do with the fit between
appraisal and pay systems and the nature of work. The literature on the
links between pay and individual motivation, for example, demonstrates the
importance of job independence, concrete and easily measured products, and
production standards that are perceived as fair (doable) to effective individual
incentive pay plans. Only a limited number of jobs, mainly in some executive,
sales, and manufacturing work, have proved to be amenable to this sort of
performance measurement and incentive pay. Conversely, it has been shown that
using highly specific individual performance appraisals and incentives with jobs
that are complex, interdependent, and have multiple and amorphous goals can
result in employees' ignoring important aspects of their jobs or distorting
performance in order to meet the appraisal goals. This sort of gaming is a
particular danger with objectives-based appraisal systems. Group incentives avoid
some of these problems. They recognize the interdependent nature of work and
focus on organization-level performance. However, they suffer from unclear links
between individual actions and organization-level results.
association between performance appraisal and pay systems on the one hand and
organizational strategy and structure on the other. However, all of this work is
theoretical or descriptive and should be viewed as suggestive, but not necessarily
generalizable.
External Forces
The final dimension of congruence has to do with external factors that
constrain an organization's choice of evaluation and pay systems. One of the
most relevant to federal policy makers is the widespread resistance of unions in
the private sector to performance appraisal and pay for performance systems.
Most surveys show that unionized employees are far less likely than
nonunionized employees to be covered by incentive systems (including merit
plans). To the extent that this changed in the 1980s, the incentive pay
arrangements accepted by unions (e.g., profit-sharing) were not ones that
differentiate among individual employees.
Also of particular salience to the issue of pay for performance is the role of
external laws and regulations. Fair labor standards, occupational health and
safety, and equal employment opportunity are a few of the areas of law that
prescribe internal structures, policies, and procedures that may be more or less
compatible with an organization's chosen evaluation and pay systems. Federal
equal employment opportunity policy has had an enormous impact on personnel
management in every organization of any size in the nation.
In addition to these requirements, the federal government as an employer
faces a set of constraints imposed by the laws and regulations surrounding its
merit system. The desire to shield civil servants from the exigencies of politics
has placed serious constraints on the managerial flexibility needed to make pay
for performance work.
the fact that Congress retained statutory control over development of the federal
government's performance appraisal system, rather than delegating both the
development and implementation components to the Office of Personnel
Management. The rationale was to balance managerial discretion with employee
rights in the context of a system that made it easier for agencies to fire
incompetent employees; the result was to hobble the decision making of
managers. On one hand, Civil Service Reform Act legislation provided the
requirement for detailed performance appraisal standards that could be used by
managers as proof of unsatisfactory performance. On the other hand, the
managers' ability to act regarding unsatisfactory performance was limited in the
statute by providing employees with strong substantive rights, such as the
opportunity to improve before an unacceptable performance action can be taken
and the ability to appeal performance appraisal ratings both within the agency and
externally to the Merit Systems Protection Board. This has led to situations in
which, at best, a number of years are required to release an inadequate employee,
and the costs borne by managers serve as a strong disincentive against appraising
mediocre performance accurately.
Another feature of the federal context that warrants consideration is whether
the dominant motivations of federal employees are comparable to those of
private-sector workers in organizations where pay for performance has been implemented.
Although there has been a long tradition of simply applying private-sector
motivation theory and techniques to the public sector, some recent studies are
finding different sources for motivation and different motivational patterns among
public employees. Perry and Wise (1990) explore the role of public service as a
motivator; Rainey (1990) documents a fairly consistent pattern of differences between
public and private managers in relation to money, job satisfaction and security,
and organizational commitment. In a 1982 review article, Perry and Porter noted
that public-sector employees had higher achievement needs and tended to value
economic wealth less than did entrants into the private sector.
Furthermore, there is some evidence that public managers, particularly those
at the highest levels of the organization, are keenly attuned to public perceptions
of their effectiveness and the overall usefulness of the policies and programs they
administer (Ingraham and Barrilleaux, 1983). Federal Employee Attitude Surveys
in 1979 and 1980 demonstrated that upper-level managers perceived generalized
"bureaucrat bashing" as a personalized attack. More recent studies by the Merit
Systems Protection Board (1989) and the U.S. General Accounting Office (1987)
indicate that managers continue to tie their overall job satisfaction to their
perceptions of "appreciation" by the public. These findings suggest that policy
makers would do well to give their attention to nonmonetary motivators in
concert with their plan to strengthen the ties of pay to performance.
Finally, one of the most important contextual factors that governs how any
new performance appraisal or pay for performance system is likely to function
is the less than satisfactory experience of federal employees with the merit pay
systems implemented during the last 12 years.
CONCLUSIONS
We have conducted a wide-ranging study of performance appraisal and pay
for performance in the private sector to help the director of the Office of
Personnel Management and other federal policy makers as they rethink the
Personnel Management and Recognition System. What we have learned does not
provide a blueprint for linking pay to performance in the federal sector or even
any specific remedy for what ails PMRS. Instead, we conclude with some
general suggestions about priorities.
1. Performance appraisal ratings can influence many personnel decisions, and
thus care in the development and use of performance appraisal systems is
warranted. There is, however, no obvious technical (psychometric) solution
to the performance management issues facing the federal government.
Further refinements in the technology of performance appraisal (e.g.,
extensive new job analysis, modifications of existing rating scales or rater
training programs) are unlikely to provide substantially more valid and
accurate appraisals than those currently in force, particularly for managerial
and professional jobs. There is also no evidence that one particular appraisal
format is clearly superior to all others. For example, we do not know that the
objective-based format for managerial appraisal, so popular in the private
sector, yields more (or less) valid appraisals than the supervisory ratings
used in the government.
There appears to be at least as much effort expended on performance
appraisal in the federal government as elsewhere. More generally, the
pursuit of further psychometric sophistication in the performance appraisal
system used in the federal government is unlikely to contribute to enhanced
individual or organizational performance.
2. Where performance appraisal is viewed as most successful in the private
sector, it is firmly embedded in the context of management and personnel
systems that provide incentives for managers to use performance appraisal
ratings as the organization intends. These incentives include managerial
flexibility or discretion in rewarding top performers and in dismissing those
who continually perform below standards. When performance appraisal
ratings are used to distribute pay (as in a merit plan), the size of the merit pay
offered allows managers to differentiate outstanding performers from good
and poor performers, and thus provides them with incentives to
differentiate. For example, top performers may receive 10 percent of their
base salary in merit pay, good performers, 5 percent, and poor performers,
no merit increase. Finally, managers are themselves assessed on the results
of their performance appraisal activities.
contextual factor, the issue of comparability of federal base salaries with pay
for equivalent private-sector jobs may pose severe problems for the
acceptance of merit pay or any other pay for performance system if the
promise of recently enacted legislation proves illusory. We realize that the
broader changes suggested by an analysis of context can be costly, but we
suggest that making programmatic changes to the Performance Management
and Recognition System in isolation is unlikely to enhance employee
acceptance of the system or improve individual and organizational
effectiveness significantly and, in the long run, may prove no less costly.
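The illustrative merit percentages in conclusion 2 (10 percent of base salary for top performers, 5 percent for good performers, and no merit increase for poor performers) can be turned into a small worked calculation. The sketch below is only an illustration under stated assumptions: the rating labels, the function names, and the sample salaries are hypothetical and do not describe any actual federal or private-sector plan.

```python
from decimal import Decimal

# Merit rates from the illustrative example in conclusion 2.
# The rating labels used as keys are hypothetical.
MERIT_RATES = {
    "top": Decimal("0.10"),
    "good": Decimal("0.05"),
    "poor": Decimal("0.00"),
}


def merit_increase(base_salary, rating):
    """Return the merit payment for one employee, rounded to cents."""
    amount = Decimal(base_salary) * MERIT_RATES[rating]
    return amount.quantize(Decimal("0.01"))


def total_merit_cost(employees):
    """Sum merit payments over (base_salary, rating) pairs."""
    return sum(
        (merit_increase(salary, rating) for salary, rating in employees),
        Decimal("0.00"),
    )


# Hypothetical three-person work unit (salaries are invented).
staff = [(60000, "top"), (50000, "good"), (40000, "poor")]

for salary, rating in staff:
    print(rating, merit_increase(salary, rating))
print("total", total_merit_cost(staff))
```

Because every employee with the same salary and rating receives the same amount, the total cost of a merit cycle can be projected once the expected distribution of ratings is known.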
REFERENCES 167
References
Abowd, J. 1990 Does performance based management compensation affect corporate performance?
Industrial and Labor Relations Review 43(3):52-73.
Adams, J. 1965 Inequity in social exchange. Pp. 272-283 in L. Berkowitz, ed., Advances in
Experimental Social Psychology. New York: Academic Press.
Advisory Committee on Federal Pay 1990 Advisory Committee on Federal Pay: 19th Annual Report
to the President of the United States. Washington, D.C.: Advisory Committee on Federal
Pay.
Allan, P., and Rosenberg, S. 1986 An assessment of merit pay administration under New York City's
managerial performance evaluation system: three years of experience. Public Personnel
Management 15:297-309.
Allison, G. 1983 Public and private management: are they fundamentally alike in all unimportant
ways? In J. Perry and K. Kraemer, eds., Public Management: Public and Private
Perspectives. Palo Alto, Calif.: Mayfield.
American Compensation Association 1987 Report on the 1987 Survey of Salary Management
Practices. Scottsdale, Ariz.: American Compensation Association.
American Educational Research Association, American Psychological Association, and National
Council on Measurement in Education 1985 Standards for Educational and Psychological
Testing. Washington, D.C.: American Psychological Association.
Angoff, W. 1988 Validity: an evolving concept. Pp. 19-32 in H. Wainer and H. Braun, eds., Test
Validity. Hillsdale, N.J.: Erlbaum.
Argyris, C. 1964 Integrating the Individual and the Organization. New York: Wiley.
Babchuk, N., and Goode, W. 1951 Work incentives in a self-determined work group. American
Sociological Review 16:679-687.
Balkin, D., and Gomez-Mejia, L. 1984 Determinants of R and D compensation strategies in the high
tech industry. Personnel Psychology 37(4):635-650.
1987a An integrated framework for the compensation system. Pp. 1-6 in D. Balkin and L. Gomez-
Mejia, eds., New Perspectives on Compensation. Englewood Cliffs, N.J.: Prentice-Hall.
1987b Toward a contingent theory of compensation strategy. Strategic Management Journal
8:169-182.
1990 Matching compensation and organizational strategies. Strategic Management Journal
11:153-169.
Banks, C., and Murphy, K. 1985 Toward narrowing the research-practice gap in performance
appraisal. Personnel Psychology 38:335-345.
Bann, C., and Johnson, J. 1984 Federal employee attitudes toward reform: performance evaluation
and merit pay. P. 79 in P. Ingraham and C. Ban, eds., Legislating Bureaucratic Change: The
Civil Service Reform Act of 1978. Albany, N.Y.: SUNY Press.
Baron, J., Dobbin, F., and Jennings, P. 1986 War and peace: the evolution of modern personnel
administration in U.S. industry. American Journal of Sociology 92:350-383.
Bartlett, C., and Ghoshal, S. 1988 Organizing for worldwide effectiveness: the transnational solution.
California Management Review Fall:54-74.
Beer, M., Eisenstat, R., and Spector, B. 1990 The Critical Path: Mobilizing Human Resources for
Corporate Renewal. Cambridge, Mass.: Harvard Business School Press.
Beer, M., Spector, B., Lawrence, P., Mills, D., and Walton, R. 1985 Human Resource Management: A
General Manager's Perspective. New York: The Free Press.
Berg, N. 1965 Strategic planning in conglomerate companies. Harvard Business Review May/
June:79-92.
Berkshire, J., and Highland, R. 1953 Forced-choice performance rating—a methodological study.
Personnel Psychology 6:355-378.
Bernardin, H. 1977 Behavior expectation scales versus summated scales: a fairer comparison.
Journal of Applied Psychology 62:422-427.
Bernardin, H., and Beatty, R. 1984 Performance Appraisal: Assessing Performance at Work. Boston:
Kent Press.
Bernardin, H., and Buckley, M. 1981 Strategies in rater training. Academy of Management Review
6:205-212.
Bernardin, H., Morgan, B., and Winne, P. 1980 The design of a personnel evaluation system for
police officers. JSAS Catalog of Selected Documents in Psychology 10:1-280.
Bialek, H., Zapf, D., and McGuire, W. 1977 Personnel Turbulence and Time Utilization in an
Infantry Division. Report #FR-WD-CA 77-11. Alexandria, Va.: Human Resources Research
Organization.
Bjerke, D., Cleveland, J., Morrison, R., and Wilson, W. 1987 Officer Fitness Report Evaluation
Study. Navy Personnel Research and Development Center Report, TR 88-4.
Blanz, F., and Ghiselli, E. 1972 The mixed standard scale: a new rating system. Personnel Psychology
25:185-199.
Blau, P. 1955 The Dynamics of Bureaucracy. Chicago: University of Chicago Press.
Blinder, A., ed. 1990 Paying for Productivity. Washington, D.C.: Brookings Institution.
Borman, W. 1978 Exploring upper limits of reliability and validity in job performance ratings.
Journal of Applied Psychology 63(2):135-144.
1983 Implications of personality theory and research for the rating of work performance in
organizations. Pp. 127-172 in F. Landy, S. Zedeck, and J. Cleveland, eds., Performance
Measurement and Theory. Hillsdale, N.J.: Erlbaum.
1987 Personal constructs, performance schemata, and "folk theories" of subordinate effectiveness:
explorations in an Army officer sample. Organizational Behavior and Human Decision
Processes 40:307-322.
Borreson, H. 1967 The effects of instructions and item content on three types of ratings. Educational
and Psychological Measurement 27:855-862.
Brett, J. 1986 Commentary on procedural justice papers. Pp. 81-90 in R. Lewicki, B. Sheppard, and
M. Bazerman, eds., Research on Negotiation in Organizations. Greenwich, Conn.: JAI
Press.
Bretz, R., and Milkovich, G. 1989 Performance Appraisal in Large Organizations: Practice and
Research Implications. Working paper #89-17. Center for Advanced Human Resource
Studies, Cornell University, Ithaca, N.Y.
Brown, C. 1990 Firms' choice of pay method. Industrial and Labor Relations Review 43(3):165-182.
Buchanan, B., II 1975 Red tape and the service ethic: some unexpected differences between public
and private managers. Administration and Society 6:423-438.
Bullock, R., and Lawler, E.E., III 1984 Gainsharing: a few questions and fewer answers. Human
Resource Management 23:23-40.
Bureau of National Affairs 1974 Management performance appraisal programs. Personnel Policies
Forum Survey No. 104. Washington, D.C.: Bureau of National Affairs.
1981 Wage and salary administration. Personnel Policies Forum Survey No. 131. Washington, D.C.:
Bureau of National Affairs.
1984 Productivity Improvement Programs. Personnel Policies Forum Survey No. 138. Washington,
D.C.: Bureau of National Affairs.
Burnett, F. 1925 An experimental investigation into repetitive work. Industrial Fatigue Research
Board Report No. 30. London: H.M. Stationery Office.
Burns, T., and Stalker, G. 1961 The Management of Innovation. London: Tavistock.
Campbell, A. 1988 Statement Before the U.S. Senate Subcommittee on Government Affairs. U.S.
General Accounting Office, Hearings on Design of the Civil Service Reform Act of 1978.
Campbell, D., and Fiske, D. 1959 Convergent and discriminant validation by the multitrait-
multimethod matrix. Psychological Bulletin 56:81-105.
Campbell, J., and Pritchard, R. 1976 Motivation theory in industrial and organizational psychology.
Pp. 63-130 in M. Dunnette, ed., Handbook of Industrial and Organizational Psychology.
Chicago: Rand McNally.
Campbell, J., Dunnette, M., Lawler, E., III, and Weick, K., Jr. 1970 Managerial Behavior,
Performance, and Effectiveness. New York: McGraw-Hill.
Campbell, J., Dunnette, M., Arvey, R., and Hellervik, L. 1973 The development and evaluation of
behaviorally based rating scales. Journal of Applied Psychology 57:15-22.
Campbell, J., McHenry, J., and Wise, L. 1990 Modeling job performance in a population of jobs.
Personnel Psychology 43:313-333.
Carroll, S. 1987 Business strategies and compensation systems. Pp. 343-355 in D. Balkin and L.
Gomez-Mejia, eds., New Perspectives on Compensation. Englewood Cliffs, N.J.: Prentice-Hall.
Carson, K., Sutton, C., and Corner, P. 1990 Gender Effects in Performance Appraisal: A Meta
Analysis. Unpublished manuscript, Arizona State University.
Cascio, W. 1987 Costing Human Resources: The Financial Impact of Behavior in Organizations.
Boston: Kent.
Chandler, A. 1962 Strategy and Structure: Chapters in the History of American Industrial Enterprise.
Cambridge, Mass.: MIT Press.
Christal, R. 1974 The United States Air Force Occupational Research Project (AFHRL-TR-73-75).
Brooks Air Force Base, Tex.: Air Force Human Resources Laboratory.
Clark, C., and Primoff, E. 1979 Job elements and performance appraisal. Management: A Magazine
for Government Managers 1:3-5.
Cleveland, J., Murphy, K., and Williams, R. 1989 Multiple uses of performance appraisal: prevalence
and correlates. Journal of Applied Psychology 74(1):130-135.
Cohen, R., and Greenberg, J. 1982 The justice concept in social psychology. Pp. 1-41 in J. Greenberg
and R. Cohen, eds., Equity and Justice in Social Behavior. New York: Academic Press.
Conference Board 1977 Appraising managerial performance: current practices and future directions.
Conference Board Report No. 723. New York: Conference Board.
1984 Pay and performance: the interaction of compensation and performance appraisal. Conference
Research Bulletin No. 155. New York: Conference Board.
1990 Variable pay: new performance rewards. Conference Board Research Bulletin No. 246. New
York: Conference Board.
Cook, F. 1981 When long-term incentives are not long-term incentives. Proceedings of the American
Compensation Association's Regional Conferences. Scottsdale, Ariz.: American
Compensation Association.
Cooper, W. 1981 Ubiquitous halo. Psychological Bulletin 90:218-244.
Cornelius, E., Hakel, M., and Sackett, P. 1979 A methodological approach to job classification for
performance appraisal purposes. Personnel Psychology 32:283-297.
Cronbach, L. 1990 Essentials of Psychological Testing. New York: Harper and Row.
Deci, E. 1975 Intrinsic Motivation. New York: Plenum.
DeCotiis, T. A. 1977 An analysis of the external validity and applied relevance of three rating
formats. Organizational Behavior and Human Performance 19:247-267.
Deming, E. 1986 Out of the Crisis. Cambridge, Mass.: Massachusetts Institute of Technology Center
for Advanced Engineering.
DeNisi, A., Cafferty, T., and Meglino, B. 1984 A cognitive model of the performance appraisal
process: a model and research propositions. Organizational Behavior and Human
Performance 21(3):358-367.
Dipboye, R., and de Pontbriand, R. 1981 Correlates of employee reactions to performance appraisals
and appraisal systems. Journal of Applied Psychology 66:248-251.
DiPrete, T. 1989 The Bureaucratic Labor Market: The Case of the Federal Civil Service. New York:
Plenum.
Doeringer, P., and Piore, M. 1971 Internal Labor Markets and Manpower Analysis. Lexington,
Mass.: Lexington Books.
Dornbusch, S., and Scott, W. 1975 Evaluation and the Exercise of Authority. San Francisco: Jossey-
Bass.
Doz, Y., and Prahalad, C. 1981 Headquarters influence and strategic control in MNCs. Sloan
Management Review Fall:15-29.
Dyer, L., and Schwab, D. 1982 Personnel/human resource management research. Pp. 87-120 in T.
Kochan, D. Mitchell, and L. Dyer, eds., Industrial Relations Research in the 1970s: Review
and Appraisal. Madison, Wis.: Industrial Relations Research Association.
Dyer, L., and Theriault, R. 1976 The determinants of pay satisfaction. Journal of Applied Psychology
61(5):596-604.
Dyer, L., Schwab, D., and Theriault, R. 1976 Managerial perceptions regarding salary increase
criteria. Personnel Psychology 29:233-242.
Edell, J., and Staelin, R. 1983 The information processing of pictures in print advertisements. Journal
of Consumer Research 10(June):45-61.
Ehrenberg, R., and Milkovich, G. 1987 Compensation and firm performance. Pp. 87-122 in M.
Kleiner, R. Block, M. Roomkin, and S. Salsberg, eds., Human Resources and the
Performance of the Firm. Madison, Wis.: Industrial Relations Research Association.
Ellig, B. 1982 Executive Compensation: A Total Pay Perspective. New York: McGraw-Hill.
England, P., and Dunn, D. 1988 Evaluating work and comparable worth. Annual Review of Sociology
14:227-248.
Evans, P. 1989 Organizational development in the transnational enterprise. Pp. 1-38 in R. Woodman
and W. Pasmore, eds., Organizational Change and Development. Greenwich, Conn.: JAI
Press.
Fay, C., and Latham, G. 1982 Effects of training and rating scales on rating errors. Personnel
Psychology 35:105-116.
Feild, H., and Holley, W. 1982 The relationship of performance appraisal system characteristics to
verdicts in selected employment discrimination cases. Academy of Management Journal 25
(2):392-406.
Feldman, J. 1981 Beyond attribution theory: cognitive processes in performance appraisal. Journal of
Applied Psychology 66:127-148.
1986 Instrumentation and training for performance appraisal: a perceptual-cognitive viewpoint. In K.
Rowland and G. Ferris, eds., Research in Personnel and Human Resources Management.
Greenwich, Conn.: JAI Press.
Flanagan, J. 1954 The critical incident technique. Psychological Bulletin 51(4):327-358.
Flanders, L., and Utterback, D. 1985 The management excellence inventory: a tool for management
development. Public Management Forum May-June:403-410.
Folger, R., and Konovsky, M. 1989 Effects of procedural and distributive justice on reactions to pay
raise decisions. Academy of Management Journal 32(1):115-130.
Ford, J., Kraiger, K., and Schectman, S. 1986 Study of race effects in objective indices and subjective
evaluations of performance: a meta-analysis of performance criteria. Psychological Bulletin
99:330-337.
Fossum, J., and Fitch, M. 1985 The effects of individual and contextual attributes on the size of
recommended salary increases. Personnel Psychology 38:587-603.
Freeman, R., and Medoff, J. 1984 What Do Unions Do? New York: Basic Books.
Gaertner, K., and Gaertner, G. 1984 Performance evaluation and merit pay results in the
Environmental Protection Agency and the Mine Safety and Health Administration. Pp.
87-112 in P. Ingraham and C. Ban, eds., Legislating Bureaucratic Change. Albany, N.Y.:
SUNY Press.
Galbraith, J. 1977 Organization Design. Reading, Mass.: Addison-Wesley.
Galbraith, J., and Kazanjian, R. 1986 Organizing to implement strategies of diversity and
globalization. Human Resource Management 25(1):37-54.
Gawthrop, L. 1984 Public Sector Management Systems and Ethics. Bloomington, Ind.: Indiana
University Press.
Gerber, A. 1988 Historical background: classification and compensation in the Federal Service.
Unpublished staff paper, U.S. Office of Personnel Management.
Gerhart, B., and Milkovich, G. 1990 Organizational differences in managerial compensation and
financial performance. Academy of Management Journal.
Gomez-Mejia, L., Page, R., and Tornow, W. 1982 A comparison of the practical utility of traditional,
statistical, and hybrid job evaluation approaches. Academy of Management Journal 25(4):
790-809.
Graham-Moore, B., and Ross, T. 1983 Productivity Gainsharing: How Employer Incentive Programs
Can Improve Business Performance. Englewood Cliffs, N.J.: Prentice-Hall.
Green, B., Wigdor, A., and Shavelson, R., eds. 1991 Measuring Performance in the Workplace.
Committee on Performance of Military Personnel, Commission on Behavioral and Social
Sciences and Education, National Research Council. Washington, D.C.: National Academy
Press.
Greenberg, J. 1986a Determinants of perceived fairness of performance evaluations. Journal of
Applied Psychology 71:340-342.
1986b Organizational performance appraisal procedures: what makes them fair? Pp. 25-41 in R.
Lewicki, B. Sheppard, and M. Bazerman, eds., Research on Negotiation in Organizations.
Greenwich, Conn.: JAI Press.
1987 A taxonomy of organizational justice theories. The Academy of Management Review 12
(1):9-22.
1990 Organizational justice: yesterday, today, and tomorrow. Journal of Management 16(2):399-432.
Greenberg, J., and Leventhal, G. 1976 Equity and the use of overreward to motivate performance.
Journal of Personality and Social Psychology 34:179-190.
Greene, C. 1978 Causal connections among managers' merit pay, job satisfaction, and performance.
Journal of Applied Psychology 58:95-100.
Greller, M., and Herold, D. 1975 Sources of feedback: a preliminary investigation. Organizational
Behavior and Human Performance 13:244-256.
Guion, R. 1983 Comments on Hunter. Pp. 267-275 in F. Landy, S. Zedeck, and J. Cleveland, eds.,
Performance Measurement and Theory. Hillsdale, N.J.: Erlbaum.
Guzzo, R., and Bondy, J. 1983 A Guide to Worker Productivity Experiments in the United States.
Elmsford, N.Y.: Pergamon Press.
Guzzo, R., Jette, R., and Katzell, R. 1985 The effects of psychologically-based intervention programs
on worker productivity. Personnel Psychology 38:275-293.
Hackman, J., and Oldham, G. 1976 Motivation through the design of work. Organizational Behavior
and Human Performance 16:250-279.
1980 Work Redesign. Reading, Mass.: Addison-Wesley.
Hackman, R., Lawler, E., and Porter, L. 1977 Perspectives on Behavior in Organizations. New York:
McGraw-Hill.
Halaby, C. 1986 Worker attachment and workplace authority. American Sociological Review
51:634-649.
Hammer, T. 1988 New developments in profit-sharing, gainsharing and employee ownership. In J.
Campbell, ed., Individual and Group Productivity in Organizations. San Francisco: Jossey-
Bass.
Hannan, M., and Freeman, J. 1984 Structural inertia and organizational change. American
Sociological Review 49:149-164.
Harrigan, K. 1988 Joint ventures and competitive strategy. Strategic Management Journal 9:149-158.
Harris, M., and Schaubroeck, J. 1988 A meta-analysis of self-supervisor, self-peer, and peer-
supervisor ratings. Personnel Psychology 41:43-62.
Hartigan, J., and Wigdor, A., eds. 1989 Fairness in Employment Testing: Validity Generalization,
Minority Issues, and the General Aptitude Test Battery. Committee on the General Aptitude
Test Battery, Commission on Behavioral and Social Sciences and Education, National
Research Council. Washington, D.C.: National Academy Press.
Havemann, J. 1990 Overhaul of federal pay urged. The Washington Post, March 22, 1990.
HayGroup, Inc. 1989 The Hay Report: Compensation and Benefits Strategies for 1990 and Beyond.
Philadelphia: HayGroup, Inc.
Heclo, H. 1978 Issue networks and the executive establishment. Pp. 87-124 in A. King, ed., The New
American Political System. Washington, D.C.: American Enterprise Institute.
Hemphill, J. K. 1959 Job descriptions for executives. Harvard Business Review 37:55-67.
Heneman, H., III 1985 Pay satisfaction. Pp. 113-115 in K. Rowland and G. Ferris, eds., Research in
Personnel and Human Resources Management Volume 3. Greenwich, Conn.: JAI Press.
Heneman, R. 1984 Pay for Performance: Exploring the Merit-Pay System. New York: Work in
America Institute.
1990 Merit pay research. Pp. 115-139 in K. Rowland and G. Ferris, eds., Research in Personnel and
Human Resources Management Volume 8. Greenwich, Conn.: JAI Press.
Heron, A. 1956 The effects of real-life motivation on questionnaire response. Journal of Applied
Psychology 40:65-68.
Hewitt Associates 1985 An Overview of Productivity Based Incentives. Lincolnshire, Ill.: Hewitt
Associates.
1989 Compensation Trends and Practices. Lincolnshire, Ill.: Hewitt Associates.
Hills, F., Scott, K., and Markham, S. 1988 Pay System Structure as a Moderator of Pay-Performance
Relationship and Employee Pay Increase Satisfaction. Paper presented at the National
Academy of Management Meetings, Anaheim, Calif.
Hills, F., Scott, K., Markham, S., and Vest, M. 1987 Merit pay: just or unjust desserts. Personnel
Administrator 32(1):53-59.
Holzbach, R. 1978 Comparisons of self- and superior ratings of managerial performance. Journal of
Applied Psychology 63:579-588.
Hundal, P. 1969 Knowledge of performance as an incentive in repetitive industrial work. Journal of
Applied Psychology 53:224-226.
Hunter, J. 1983 A causal analysis of cognitive ability, job knowledge, job performance, and
supervisor ratings. Pp. 257-266 in F. Landy, S. Zedeck, and J. Cleveland, eds., Performance
Measurement and Theory. Hillsdale, N.J.: Erlbaum.
Hunter, J., and Hunter, R. 1984 Validity and utility of alternate predictors of job performance.
Psychological Bulletin 96:72-98.
Ilgen, D. 1990 Pay for Performance: Motivational Issues. Working paper prepared for the Committee
on Performance Appraisal for Merit Pay. National Research Council, Washington, D.C.
Ilgen, D., and Favero, J. 1985 Limits in generalization from psychological research to performance
appraisal processes. Academy of Management Review 10:311-321.
Ilgen, D., and Feldman, J. 1983 Performance appraisal: a process focus. In L. Cummings and B.
Staw, eds., Research in Organizational Behavior (Vol. 5). Greenwich, Conn.: JAI Press.
Ilgen, D., and Knowlton, W., Jr. 1981 Performance attributional effects on feedback from superiors.
Organizational Behavior and Human Performance 25:441-456.
Ilgen, D., Barnes-Farrell, J., and McKellin, D. 1989 Performance Rating Accuracy. Paper presented
at the 21st International Congress of Applied Psychology, Jerusalem.
Ilgen, D., Fisher, C., and Taylor, S. 1979 Consequences of individual feedback on behavior in
organizations. Journal of Applied Psychology 64:347-371.
Ingraham, P. 1987 Building bridges or burning them? The president, the appointees and the
bureaucracy. Public Administration Review September/October:425-435.
1989 The design of civil service reform: lessons in politics and rationality. Policy Studies Journal
Winter.
Ingraham, P., and Barrilleaux, C. 1983 Motivating managers for retrenchment. Public Administration
Review 43: 393-402.
Ingraham, P., and Rosenbloom, D. 1990 The State of Merit in the Federal Government. An
Occasional Paper of the National Commission on the Public Service, Washington, D.C.
Jacobs, R., Kafry, S., and Zedeck, S. 1980 Expectations of behaviorally anchored rating scales.
Personnel Psychology 33:595-640.
Jaques, E. 1961 Equitable Payment. New York: Wiley.
Jenkins, J. 1946 Validity for what? Journal of Consulting Psychology 10:93-98.
Kahn, L., and Sherer, P. 1990 Contingent pay and managerial performance. Industrial and Labor
Relations Review 43(3):107-120.
Kalleberg, A., and Lincoln, J. 1988 The structure of earnings inequality in the United States and Japan.
American Journal of Sociology 94(Supplement):S121-S153.
Kanfer, R. 1990 Motivational theory and industrial and organizational psychology. In M. Dunnette,
ed., Handbook of Industrial and Organizational Psychology. Palo Alto, Calif.: Consulting
Psychologists Press.
Katz, R. 1974 Skills of an effective administrator. Harvard Business Review Sept.-Oct.:90-102.
Kaufman, H. 1954 The growth of the federal personnel system. P. 106 in the American Assembly,
The Federal Government Service. New York: Columbia University.
1978 Reflections on administrative reorganization. Pp. 214-233 in F. Lane, ed., Current Issues in
Public Administration. New York: St. Martin's Press.
Kavanagh, M., MacKinney, A., and Wolins, L. 1971 Issues in managerial performance: multitrait-
multimethod analyses of ratings. Psychological Bulletin 75:34-49.
Kerr, J. 1985 Diversification strategies and managerial rewards: an empirical study. Academy of
Management Journal 28(1):155-179.
Kiechel, W., III 1989 When subordinates evaluate the boss. Fortune 119(13):201.
Kingstrom, P., and Bass, A. 1981 A critical analysis of studies comparing behaviorally anchored
rating scales (BARS) and other rating formats. Personnel Psychology 34:263-289.
Kopelman, R. 1976 Organizational control system responsiveness, expectancy theory constructs, and
work motivation: some interrelations and causal connections. Personnel Psychology
29:205-220.
1986 Objective feedback. In E. Locke, ed., Generalizing From Laboratory to Field Settings.
Lexington, Mass.: Lexington Books.
Kraiger, K., and Ford, J. 1985 A meta-analysis of ratee race effects in performance ratings. Journal of
Applied Psychology 70(1):56-65.
Kraut, A., Pedigo, P., McKenna, D., and Dunnette, M. 1989 The role of the manager: what's really
important in different management jobs. The Academy of Management Executive 3
(4):286-293.
Krzystofiak, F., Newman, J., and Krefting, L. 1982 Pay meaning, satisfaction, and size of a
meaningful pay increase. Psychological Reports 51:660-662.
Lamb, R., ed. 1984 Competitive Strategic Management. Englewood Cliffs, N.J.: Prentice-Hall.
Landy, F., and Farr, J. 1980 Performance rating. Psychological Bulletin 87(1):72-107.
1983 The Measurement of Work Performance. New York: Academic Press.
Landy, F., Barnes-Farrell, J., and Cleveland, J. 1980 Perceived fairness and accuracy of performance
evaluation: a follow-up. Journal of Applied Psychology 65:355-356.
Landy, F., Barnes, J., and Murphy, K. 1978 Correlates of perceived fairness and accuracy of
performance evaluation. Journal of Applied Psychology 63:751-754.
Landy, F., Farr, J., and Jacobs, R. 1982 Utility concepts in performance measurement.
Organizational Behavior and Human Performance 30:15-40.
Latham, G. 1988 Human Resource Training and Development. Annual Review of Psychology
39:545-582.
Latham, G., and Wexley, K. 1977 Behavioral observation scales. Personnel Psychology 30:255-268.
1981 Increasing Productivity Through Performance Appraisal. Reading, Mass.: Addison-Wesley.
Latham, G., Fay, C., and Saari, L. 1979 The development of behavioral observation scales for
appraising the performance of foremen. Personnel Psychology 32:299-311.
Lawler, E., III 1971 Pay and Organizational Effectiveness: A Psychological View. New York:
McGraw-Hill.
McEvoy, G., and Cascio, W. 1988 Cumulative evidence of the relationship between employee age
and job performance. Journal of Applied Psychology 74:11-17.
McIntyre, R., Smith, D., and Hassett, C. 1984 Accuracy of performance ratings as affected by rater
training and perceived purpose of rating. Journal of Applied Psychology 69:147-156.
Merit Systems Protection Board 1988 Toward Effective Performance Management in the Federal
Government. Washington, D.C.: U.S. Government Printing Office.
1989 Government Documents and Agency Reports: Personnel Management Simplification Efforts in
the Federal Government. Washington, D.C.: U.S. Government Printing Office.
1990 Working for America: A Federal Employee Survey. Washington, D.C.: U.S. Government
Printing Office.
Metzger, B. 1978 Profit-Sharing in 38 Large Companies. Evanston, Ill.: Profit-Sharing Research
Foundation.
Meyer, H. 1980 Self-appraisal of job performance. Personnel Psychology 33:291-295.
Meyer, H., Kay, E., and French, J. 1965 Split roles in performance appraisal. Harvard Business
Review 43:123-129.
Meyer, J., and Rowan, B. 1977 Institutionalized organizations: formal structure as myth and ceremony.
American Journal of Sociology 83:340-363.
Miceli, M., and Lane, M. 1990 Antecedents of pay satisfaction: a review and extension. In K.
Rowland and G. Ferris, eds., Research in Personnel and Human Resources Management
(Vol. 9). Greenwich, Conn.: JAI Press.
Miles, R., and Snow, C. 1978 Organizational Strategy, Structure, and Process. New York: McGraw-
Hill.
1983 Designing strategic human resource systems. Organizational Dynamics.
Milkovich, G. 1986 Gainsharing in managing and compensating human resources. Presented at
Conference on Participation and Gainsharing Systems. Johnson Foundation Wingspread
International Conference Center.
Milkovich, G., and Newman, J. 1990 Compensation. Boston: Richard Irwin.
Mintzberg, H. 1973 The Nature of Managerial Work. New York: Harper and Row.
1975 The manager's job: folklore and fact. Harvard Business Review July-August:49-61.
Mitchell, D. 1985 Shifting norms in wage determination. Brookings Papers on Economic Activity 2.
Washington, D.C.: Brookings Institution.
Mitchell, D., and Broderick, R. 1991 Flexible pay systems in the American context: history, policy,
research, and
implications. In D. Lewin, D. Lipsky, and D. Sockell, eds., Advances in Industrial and Labor
Relations. Greenwich, Conn.: JAI Press.
Mitchell, D., Lewin, D., and Lawler, E.E. III 1990 Alternative pay systems, firm performance, and
productivity. Pp. 15-94 in A. Blinder, ed., Paying for Productivity. Washington, D.C.:
Brookings Institution.
Mohrman, A., and Lawler, E. 1983 Motivation and performance appraisal behavior. In F. Landy, S.
Zedeck, and J. Cleveland, eds., Performance Measurement and Theory. Hillsdale, N.J.:
Erlbaum.
Mowday, R. 1987 Equity theory predictions of behavior in organizations. Pp. 89-110 in R. Steers and
L. Porter, eds., Motivation and Work Behavior (4th Edition). New York: McGraw-Hill.
Murphy, K. 1982 Difficulties in the statistical control of halo. Journal of Applied Psychology
67:161-164.
Murphy, K., and Cleveland, J. 1991 Performance Appraisal: An Organizational Perspective. Boston:
Allyn and Bacon.
Murphy, K., and Constans, J. 1988 Psychological issues in scale format research: behavioral anchors
as a source of bias in rating. In R. Cardy, S. Peiffer, and J. Newman, eds., Advances in
Information Processing in Organizations (Vol. 3). Greenwich, Conn.: JAI Press.
Murphy, K., and Jako, B. 1989 Under what conditions are observed intercorrelations greater than or
smaller than true intercorrelations. Journal of Applied Psychology 74:827-830.
Murphy, K., Balzer, W., Kellam, K., and Armstrong, J. 1984 Effects of the purpose of rating on
accuracy in observing teacher behavior and evaluating teaching performance. Journal of
Educational Psychology 76:45-54.
Murphy, K., Herr, B., Lockhart, M., and Maguire, E. 1986 Evaluating the performance of paper
people. Journal of Applied Psychology 72:573-579.
Murphy, K., Martin, C., and Garcia, M. 1982 Do behavioral observation scales measure observation?
Journal of Applied Psychology 67:562-567.
Napier, N., and Latham, G. 1986 Outcome expectancies of people who conduct performance
appraisals. Personnel Psychology 39(4):827-837.
Nathan, B., and Alexander, R. 1988 A comparison of criteria for test validation: a meta-analytic
investigation. Personnel Psychology 41:517-535.
Nathan, B., and Lord, R. 1983 Cognitive categorization and dimensional schemata: a process
application
Perry, J., and Porter, L. 1982 Factors affecting the context for motivation in public organizations.
Academy of Management Review 7(1):89-98.
Perry, J., and Rainey, H. 1988 The public-private distinction in organization theory: a critique and
research strategy. The Academy of Management Review 13(2):182-201.
Perry, J., and Wise, L. 1990 The motivational bases of public service. Public Administration Review
50:367-373.
Perry, J., Petrakis, B., and Miller, T. 1989 Federal merit pay, round II: an analysis of the performance
management and recognition system. Public Administration Review January/
February:29-37.
Personnel Psychology 1990 Special issue on the Army Selection and Classification Project (Project
A). Vol. 43.
Pfeffer, J., and Baron, J. 1988 Taking the workers back out: recent trends in the structuring of
employment. Pp. 257-303 in B. Staw and L. Cummings, eds., Research in Organizational
Behavior, Volume 10. Greenwich, Conn.: JAI Press.
Pfiffner, J. 1988 Hitting the Ground Running: The Strategic Presidency. Chicago: Dorsey Press.
Pinder, C. 1984 Work Motivation. Glenview, Ill.: Scott Foresman.
Pitts, R. 1976 Diversification strategies and organizational policies of large, diversified firms.
Journal of Economics and Business 8:181-188.
Porter, M. 1985 Competitive Advantage. New York: Free Press.
Pritchard, R., and Curts, M. 1973 The influence of goal setting and financial incentives on task
performance. Organizational Behavior and Human Performance 10:175-183.
Professional Managers Association 1989 Legislation introduced to reform PMRS. OMA Update
August:1-5.
Profit-Sharing Council of America 1984 Profit-Sharing: Philosophy, Practices, and Benefits to Society.
Evanston, Ill.: Profit-Sharing Council.
Rainey, H. 1990 Public management: recent developments and current prospects. Pp. 157-184 in N.
Lynn and A. Wildavsky, eds., Public Administration: The State of the Discipline. Chatham,
N.J.: Chatham House.
Reilly, C., and Balzer, W. 1988 Effect of Purpose on Observation and Evaluation of Teaching
Performance. Unpublished manuscript, Bowling Green University.
Roethlisberger, F., and Dickson, W. 1939 Management and the Worker. Cambridge, Mass.: Harvard
University Press.
Smith, E. 1982 Strategic business planning and human resources: part I. Personnel Journal
August:606-609.
Smith, P., and Kendall, L. 1963 Retranslation of expectations: an approach to the construction of
unambiguous anchors for rating scales. Journal of Applied Psychology 47:149-155.
Spenner, K. 1990 The Measurement of Skill: Strategies and Dilemma with Special Reference to the
Dictionary of Occupational Titles. Paper presented at a conference, "Changing Occupational
Skill Requirements: Gathering and Assessing the Evidence," Taubman Center for Public
Policy, Brown University.
Stricker, L., Jacobs, C., and Kogan, N. 1974 Trait interrelations in implicit personality theories and
questionnaire data. Journal of Personality and Social Psychology 30:198-207.
Stone, K. 1974 The origins of job structure in the steel industry. Review of Radical Political
Economics 6(Summer):113-173.
Taylor, E., and Wherry, R. 1951 A study of leniency in two rating systems. Personnel Psychology
4:39-47.
Taylor, F. 1911 The Principles of Scientific Management. New York: Harper and Row.
Terborg, J., and Miller, H. 1978 Motivation, behavior, and performance: a closer examination of goal
setting and monetary incentives. Journal of Applied Psychology 63:29-39.
Thornton, G. 1980 Psychometric properties of self-appraisals of job performance. Personnel
Psychology 33:263-271.
Thornton, G., and Zorich, S. 1980 Training to improve observer accuracy. Journal of Applied
Psychology 65:351-354.
Tolbert, P., and Zucker, L. 1983 Institutional sources of change in the formal structure of
organizations: the diffusion of civil service reform. Administrative Science Quarterly
28:22-39.
TPF & C/Towers Perrin 1990 Achieving Results Through Sharing: Group Incentive Program Survey
Report. New York: TPF&C/Towers Perrin.
U.S. Civil Service Commission 1974 Biography of an Ideal. Washington, D.C.: U.S. Government Printing Office.
U.S. General Accounting Office 1981 Productivity Sharing Programs: Can They Contribute to
Productivity Improvement? Washington, D.C.: U.S. Government Printing Office.
1984 A Two Year Appraisal of Merit Pay in Three Agencies. Washington, D.C.: U.S. Government
Printing Office.
1987 Status of Personnel Research and Demonstration Programs. Washington, D.C.: U.S.
Government Printing Office.
1988 The Senior Executive Service: Executives' Perspectives on Their Federal Service. Washington,
D.C.: U.S. Government Printing Office.
U.S. Office of Personnel Management 1981 Merit Pay Systems Design. Washington, D.C.: U.S.
Office of Personnel Management.
1988a Performance Management and Recognition System: FY 1986 Performance Cycle.
Washington, D.C.: U.S. Office of Personnel Management.
1988b Turnover in the Navy Demonstration Laboratories, 1980-1985. Washington, D.C.: U.S. Office
of Personnel Management.
Vancil, R., and Buddrus, L. 1979 Decentralization: Managerial Ambiguity by Design. Homewood,
Ill.: Dow-Jones/Irwin.
Van Riper, P. 1958 History of the United States Civil Service. Evanston, Ill.: Row, Peterson.
Vaughn, R. 1989 The United States Merit Systems Protection Board and the Office of Special
Counsel. Policy Studies Journal Winter.
Vroom, V. 1964 Work and Motivation. New York: Wiley.
Wagner, J., Rubin, P. and Callahan, T. 1988 Incentive payment and nonmanagerial productivity: an
interrupted time series analysis of magnitude and trend. Organizational Behavior and Human
Decision Processes 42:47-74.
Wainer, H., and Braun, H., eds. 1988 Test Validity. Hillsdale, N.J.: Erlbaum.
Waldo, D., ed. 1971 Public Management in a Time of Turbulence. New York: Chandler.
Walker, L., Lind, E., and Thibaut, J. 1979 The relation between procedural justice and distributive
justice. Virginia Law Review 65:1410-1420.
Wallace, M. 1990 Rewards and Renewal: America's Search for Competitive Advantage Through
Alternative Pay Strategies. Scottsdale, Ariz.: American Compensation Association.
Walton, R. 1979 Work Innovations in the United States. Boston: Division of Research, Harvard
Business Review.
1984 From Control to Commitment: Transforming Workforce Management in the United States.
Boston: Division of Research, Harvard Business Review.
Weiner, N. 1980 Determinants of the behavioral consequences of pay satisfaction: a comparison of
two models. Personnel Psychology 33:741-757.
Weitzman, M. 1984 The Share Economy: Conquering Stagflation. Cambridge, Mass.: Harvard
University Press.
Whyte, W. 1955 Money and Motivation. New York: Harper and Row.
William Mercer, Inc. 1983 Employer Attitudes Toward Compensation Change and Corporate Values.
New York: William M. Mercer, Inc.
Williams, K., Wickert, P., and Peters, R. 1985 Appraisal salience: effects of instructions to
subjectively organize information. In Proceedings of the Southern Management Association
Meetings, Orlando, Fla.
Wilmerding, L. 1935 Government by Merit. New York: McGraw-Hill.
Wyatt, S. 1934 Incentives in repetitive work: a practical experiment in a factory. Industrial Health
Research Board Report No. 69. London: H.M. Stationery Office.
Wyatt Company 1987 The 1987 Wyatt Performance Management Survey. Chicago: Wyatt Company.
1989a The 1989 Survey of Locality Pay Practices in Large U.S. Corporations. Philadelphia: Wyatt
Company.
1989b Results of the 1989 Wyatt survey: getting your hands around performance management. The
Wyatt Communicator Fourth Quarter:4-18.
Zammuto, R., London, M., and Rowland, K. 1981 Organization and rater differences in performance
appraisal. Personnel Psychology 35:643-658.
Zedeck, S., and Cascio, W. 1982 Performance appraisal decisions as a function of rater training and
purpose of appraisal. Journal of Applied Psychology 67(6):752-758.
Appendixes
A
Survey Descriptions
Hewitt Associates
1989 Compensation Trends and Practices Survey, 1989. Lincolnshire, Ill.:
Hewitt Associates.
No. of Organizations: 705
Type of Organizations: 33% manufacturing; 67% services
Size (employees): 33% < 1,000; 67% ≥ 1,000
Respondents: Compensation managers
Response rate: Not reported
Committee on Performance Appraisal for Merit Pay, National Research
Council
1990 The committee solicited additional information on performance
appraisal from 28 Conference Board member firms. The respondents represented
all major industrial sectors and are generally considered leading firms in human
resource management. A draft summary of the responses of these firms is
available through the committee's staff files.
O'Dell, C.
1987 People, Performance, and Pay: A Full Report on the American
Productivity Center/American Compensation Association National Survey of
Non-Traditional Reward and Human Resource Practices. Houston: American
Productivity Center.
No. of Organizations: 1,598 (some multiple units of firm)
Type of Organizations: 46% goods; 46% services; 8% government
Size (employees): Not reported
Respondents: 83% personnel; 17% other managers
Response rate: 36%
TPF & C/Towers Perrin
1990 Achieving Results Through Sharing: Group Incentive Program Survey
Report. New York: TPF & C/Towers Perrin.
No. of Organizations: 144 companies (177 variable plans)
Type of Organizations: 77% manufacturing; 23% services and retail/
wholesale
Size (employees): Median = 2,600; Range = 26 to 300,000
Size (sales): Median = $500 million
Respondents: Variable plan designers
Response rate: Not reported
U.S. General Accounting Office
1981 Productivity Sharing Programs: Can They Contribute to
Productivity Improvement? AFMD-81-22. Washington, D.C.: U.S. Government
Printing Office.
No. of Organizations: 54
Type of Organizations: 93% manufacturing; 7% services and retail/
wholesale
Size (employees): Range from 100 to 100,000
Respondents: Reported only as "officials" of firms
Response rate: 56%
Wallace, M.
1990 Rewards and Renewal: America's Search for Competitive Advantage
Through Alternative Pay Strategies. Scottsdale, Ariz.: The American
Compensation Association.
No. of Organizations: 46
Type of Organizations: 83% manufacturing; 17% services/utilities
Size (employees): Mean = 19,362; Range = 55 to 90,000
Respondents: Wallace conducted case studies; interviewed key executives,
managers, and employees
Response rate: Not applicable
The Wyatt Company
1989 Results of the 1989 Wyatt survey: getting your hands around
performance management. Pp. 4-18 in The Wyatt Communicator Fourth Quarter,
1989.
No. of Organizations: 3,052
Type of Organizations: 30% manufacturing; 40% services; 5% utilities/
transportation/oil; 6% retail/wholesale; 19% government/nonprofit/other
Size (employees): 65% < 1,000; 35% ≥ 1,000 (25% > 10,000)
Respondents: 93% senior and middle personnel managers
Response rate: Not reported
This survey has a broad geographic representation, with 24 percent in the
Northeast; 20 percent in the Southeast; 21 percent in the Great Lakes; and 15
percent in the Pacific states (north and south).
The Wyatt Company
1989 The 1989 Survey of Locality Pay Practices in Large U.S.
Corporations. Philadelphia: The Wyatt Company.
No. of Organizations: 80
Type of Organizations: 44% manufacturing; 19% services; 37% utilities/other
Size (employees): 67% ≤ 50,000; 33% > 50,000
Respondents: Top compensation managers
Response rate: Not reported
B
Biographical Sketches
Index
A
Advisory Committee on Federal Pay, 95
Age effects on performance ratings, 64-65
Air Force task inventory, 49, 139
American College Testing, 59
Applied tradition, 3, 45-46, 137-138,
145-146, 150
Army Selection and Classification
Project, 61
Automatic step system, 101
B
Behaviorally anchored rating scales, 56,
66-67n.2, 75, 78
validity of, 62-63, 71, 142, 143, 149
Behavioral measures, objective, 59
Bias, demographic, 64-65, 66, 106, 141
Bonus plans
civil service, 20, 27, 29
executives and managers, 79, 87, 88,
114, 125-126, 157
negative effects of, 83
Brownlow Commission, 17
Bureau of National Affairs, 117
C
Campbell, Alan, 18
Carter, Jimmy, 17-19
Civil Rights Act (1964), 35, 138
Civil Service Commission, 15, 19
Civil Service Reform Act (1978)
employee expectations of, 22-26
merit pay, 14, 17, 21-22, 27-28, 135-136
Merit Pay System, 8, 21-22, 28
Merit Systems Protection Board, 19,
29-30, 32, 163
performance appraisal, 21, 54, 133,
135-136, 138, 140, 163
D
Defense Department, 51
Demographic bias, 64-65, 66, 106, 141
Devine, Donald, 26-27
Discriminant validity, 58, 61-63
Distributive justice, 92, 93, 95, 154, 155
Due process requirements, 10, 133
E
Economic environment, 83, 89, 90,
130-131
Employee motivation
group incentive plans, 10-11, 86-89, 115
individual incentive plans, 81-84
merit pay and, 5, 84-86, 99, 165
organizational context and, 122, 129,
130-131, 158-159
pay for performance and, 5, 36-37,
80-81, 89-90, 136, 153-154, 165
information sharing with, 10, 118-119,
120, 125-126, 156-157
legal protections, 5, 132-133, 163
organizational commitment, 88
participation in setting performance
goals, 99, 104, 105, 108
performance appraisal feedback, 63, 65,
69, 72, 75, 146
personal characteristics, 56, 62
supervisors' knowledge of, 50-51,
60-61, 66-67, 142, 150-151
See also Federal employees;
Individual job performance;
Performance appraisal
Employment discrimination, 132
Environmental factors
economic climate, 83, 89, 90, 130-131
laws and regulations, 132-133, 160, 162
rating distortion, 147-148
unionization, 90, 131-132, 160
Environmental Protection Agency, 31, 126
Equity and fair treatment, 32-33, 92-96,
154-155, 165
cost trade-offs, 97, 98
employee perceptions of, 11, 92, 95-96,
100-101, 112, 129, 148, 150, 155, 156
Equity pay plans, 10, 152
Equity theory, 73-74
Evolutionary (dynamic) strategies in organizations, 125-126, 159
Executive Position Description Questionnaire, 49, 52
Executives, 38, 88, 113, 114, 125
See also Managers
Expectancy theory, 80-81, 82, 86, 88, 89,
99, 146, 153
Expectancy X Valence model, 67-68
External factors. See Environmental factors
F
Federal Employee Attitude Survey, 22
Federal employees
attitudes toward Civil Service Reform,
22-26, 28, 136
attitudes toward employment conditions, 7, 26, 29-30, 32
attitudes toward pay for performance,
94-95, 101, 111, 155
managers, 5, 13, 19-20, 21, 27, 28,
30-31, 118-119, 124, 155
merit system, 7, 8, 14-17, 21-22, 23,
27-30, 31, 44
organizational commitment, 28
pay information, 10, 118-119, 157
performance appraisal, 13, 16, 23, 28,
38, 54, 76, 126, 133, 135
private-sector pay gap, 7, 30, 32, 136,
165-166
recruitment, 30-31
regulatory protections, 132-133, 163
unions and professional associations,
131-132
Federal Labor Relations Authority, 19
Federal policy implications
merit pay, 42-44, 160-166
pay for performance research, 98-101,
134
performance appraisal, 3, 5, 138, 150,
160, 164-165
private-sector practice, 5-6, 7-8, 31, 40,
119-121, 135
Federal Reorganization Act (1939), 17
Forced-choice scales, 57, 147-148
Fortune 100 firms survey, 102, 106, 108,
113, 118, 120, 156
G
Gainsharing plans
employee acceptance of, 131
and organizational performance, 8, 11,
79-80, 86, 87-88, 90, 114, 115, 116,
154, 157, 158
Garfield, James Abram, 14
Gender effects, 64
General Accounting Office, 28, 30, 32,
132, 163
General Aptitude Test Battery, 63
General Schedule, 16, 22, 27, 28, 136
General Services Administration, 30
Global ratings, 54-55, 67, 74-75, 144, 149
Goal-setting theory, 81, 82, 86, 88, 89, 99,
152, 153
Graphic scales, 55-56, 143
Group incentive plans
context, 97
pay increases, 10, 79-80, 95, 155
performance effects, 86-89, 154
private-sector practice, 90, 114, 134,
151, 157, 159
H
Halo error, 55, 62, 67, 144, 147
Hawthorne effect, 40-41
Hay Company, 10
Health and Human Services Department,
31
Hewitt Associates survey, 114, 115
Hoover Commission, 16
Hourly employees, 103, 113, 114
I
Incentive Awards Act (1954), 16
Individual incentive plans
definition, 10, 151-152
economic pressures and, 130-131
negative consequences of, 83-84, 89, 99,
133
performance effects, 81-84, 153, 158-159
unions and, 97-98
Individual job performance
measurement of, 45, 48-55, 58, 66, 78,
126, 132-133, 137-138, 140-141,
149-150
merit pay and, 4, 9, 99, 100, 157-158
and organizational effectiveness, 21, 76,
112-113
perceived link with pay increases, 5, 21,
27-28, 29, 32, 68, 81, 85, 117, 120,
153, 161, 165
See also Performance appraisal
Information sharing about performance
and pay, 88, 118-119, 120, 156-157
Instrumentality models of motivation, 73
Internal Revenue Service, 126
Intrinsic motivation, 68
J
Job analysis, 2, 49-52, 74, 124, 138,
139-140, 150
validity of measures, 58-59, 66, 148
Job complexity and interdependence, 97,
99-100, 123
Job element method, 49, 139
Job knowledge, 60-61, 66
Job (work) samples, 60, 63-64, 66
Job satisfaction, 85, 154
Job security, 131
Job-specific ratings, 54-55, 67, 74-75,
144, 149
Joint-Service Job Performance Measurement Project, 51-52, 54
L
Labor cost regulation, 8, 10-11, 79, 80,
96-98, 100, 113, 115, 120, 155-156
Labor productivity. See Employees;
Productivity
Labor regulations, 132-133, 160, 162
Office of Personnel Management, 19,
21, 132, 138
Labor relations, 87, 130
Labor unions, 7, 97-98, 100, 131-132, 160
Leniency error, 55, 62, 147
Litigation, 31, 35, 67, 94, 144
M
McKinley, William, 15
Management
labor relations, 87, 130
and pay risks, 11, 95
systems, 127-129
Management by objective, 9, 47-48n.1,
76, 84, 104, 108, 124
Management Excellence Inventory, 53
Managers (supervisors)
beliefs about performance and pay, 94,
95, 106-107, 110-111, 118-119, 155
employee appraisal, 2, 3, 50-51, 60-61,
63-65, 66-67, 74, 96, 108-110, 142,
148, 149, 150-151, 164
employee trust in, 69, 83, 89, 95, 130,
133, 153
federal, 5, 13, 19-20, 21, 27, 28, 30-31,
118-119, 124, 155
flexibility and discretion, 5, 23, 31, 32,
120-121, 133, 156, 161, 163, 164
group incentive plans, 88-89
merit pay plans, 38, 84, 85-86, 156, 157
performance of, 47, 49, 52-54, 59, 66,
74, 75-76, 105, 133, 139, 140, 159
rater training, 70-72, 75, 106, 108,
146-147
rating distortion, 72-74, 145-146
use of performance appraisal systems, 5,
10, 11, 84-85, 105-106, 108, 149,
164-165
Measurement
errors in, 55, 56;
see also Rating errors
individual job performance, 45, 48-55,
58, 66, 78, 126, 132-133, 137-138,
140-141, 149-150
organizational performance, 116
performance appraisal system success,
3, 106, 112
quantitative, 54, 81, 86, 90, 96, 97, 99,
124
rating scale formats, 55-57
validity and reliability of, 37-38, 57-67
Mechanistic organization, 127-128
Merit grid, 9, 78
Merit pay
effectiveness of, 4, 117-118, 119
employee attraction and retention, 91,
100, 154
employee motivation, 5, 84-86, 89,
99-100, 131, 153-154, 165
federal civil service, 7, 8, 14-17, 21-22,
23, 27-30, 31, 44
performance ratings, 81-82, 96-97,
109-110, 149, 151-152, 164-165
O
Objectivity in measurement, 48, 59, 140
Office of Personnel Management, 1, 7, 36,
135
federal employee survey, 22, 24-25
Management Excellence Inventory, 53
politicization of, 26-27, 28
regulations, 19, 21, 132, 138
Opinion surveys, 106, 112, 120
Organic organization, 127
Organizational context
boundary, 162-163
culture and personnel practice, 39, 43,
110-111, 112, 118, 119, 120, 152
effect on employees, 86-88, 93
labor cost control, 96, 98
labor relations, 130
rater training, 70-72, 75, 106, 108
rating sequence, 71
Rating scale formats
behaviorally based, 56, 62, 71, 75
evaluation of, 66-67, 74-75, 143-144
federal civil service, 16
forced-choice, 57, 147-148
graphic, 55-56, 143
mixed standard, 56-57, 61, 143
number of anchors, 3, 65-66, 75, 144
Reagan, Ronald, 26-27
Reliability, 139, 140-141, 148, 149
interrater, 55, 65-66, 74, 76
See also Performance appraisal
Reorganization Plan No. 2, 17-18, 19
Research findings
convergence with private-sector practice, 102, 112-113, 119
cost regulation, 96-98, 155-156
employee motivation, 67-69, 80-90,
153-154
employee retention, 90-92
fair treatment and equity, 92-96, 154-155
pay for performance, government implications, 42-44, 98-101
performance appraisal, 35, 46, 67-69,
74-75, 149-151
quality of rating data, 69-72
rating distortion, 72-74
Restriction in range error, 55
Roosevelt, Franklin D., 15, 17
S
Salary Reform Act (1962), 16-17
Sales commissions, 10, 38, 78-79, 114,
151-152
Scholastic Aptitude Test, 58, 59
Selection tests, 59
Self-rating, 65, 69, 146
Senior Executive Association, 27
Senior Executive Service, 16, 19-20, 21,
26, 27
Social Security Administration, 31, 85-86
Standard Descriptive Rating Scale, 63
Standardized tests, 46, 138-139, 142
Statistical analysis, 41, 142, 147
Steady-state organizations, 125-126, 134,
159
Supervisors. See Managers
Surveys, 39, 42
Task inventory, 49, 50, 51, 52, 90
Technological fit, 123-124, 158-159
Temporary employees, 17
Traits, 53-54, 66-67, 144
Trait scales, 56, 61, 75
U
Unionized employees, 103, 113
V
Validity, 37-38, 57-58, 67, 74, 76, 133,
139, 141-142, 148, 149, 150
construct, 58, 63-65, 66
content, 58-59, 66
convergent and discriminant, 58, 61-63
criterion, 59-61, 63, 142
See also Performance appraisal
Variable pay plans, 3, 103, 151-152, 155
performance effects, 4, 10-11, 119,
157-158
private-sector practice, 113-118, 156
W
Whistleblower protections, 23
Work climate, 130
Work group cooperation, 83, 87
Work (job) samples, 60, 63-64, 66
Wyatt Company, 29, 105-106, 108, 109,
112, 117