Performance Appraisal


What is 360-degree feedback? Pros? Cons?

360 feedback: used for development; less concern about violating hierarchy; includes many levels. But it takes a lot of time and a lot of people, there may still be discomfort, and should you combine the ratings? How would you even do that?
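
One way to approach the combining question, as a minimal sketch (the sources, weights, and numbers here are illustrative assumptions, not from the literature): average within each source first so the larger groups don't dominate, then take a weighted average across sources.

```python
# Hypothetical 360 ratings on one dimension, 1-5 scale.
ratings = {
    "self": [5],
    "supervisor": [4],
    "peers": [4, 5, 3],
    "subordinates": [3, 4],
}
# Assumed weights; an organization would have to justify its own.
weights = {"self": 0.1, "supervisor": 0.4, "peers": 0.3, "subordinates": 0.2}

# Average within each source first so no group dominates by headcount.
source_means = {src: sum(vals) / len(vals) for src, vals in ratings.items()}
combined = sum(weights[s] * m for s, m in source_means.items())
print(source_means)        # {'self': 5.0, 'supervisor': 4.0, 'peers': 4.0, 'subordinates': 3.5}
print(round(combined, 2))  # 4.0
```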

Bernardin & Smith 1981: A clarification of some issues regarding the development and use of BARS

BARS was developed to standardize observation (the Smith-Kendall method). Usually raters just read the definition of the dimension, read each anchor, and mark the choice that is the most typical expected behavior. But this ignores a lot of behavioral data; the anchors may overlap; a ratee may have behaved exactly like two different anchors, making it hard to choose; or they may not behave like any of the anchors. The researchers suggest that, per the SK method, you should record behaviors frequently to inform your decisions, and that anchors should be more general so that raters don't fall into rating traps and can answer in line with the spirit of the performance dimension.

Wiersma & Latham 1986: The practicality of BOS, BES, and trait scales

BOS was considered superior to both BES and trait scales (which tended to be equal) on every dimension (ease of feedback, differentiation, objectivity, position differences, training, corporate-wide standards, ease of use) among managers and programmers. Lawyers also believed that BOS would be the easiest to defend. The researchers suggest focusing on the practicality of the instrument, because great psychometrics don't matter if people perceive it as too hard to use.

List and describe the components that are included in the four-part model of PA and what are the underlying assumptions of this model?

Components: 1) the rating context (which may include intraorganizational factors and organization-environment factors, or distal (customer reports) vs proximal (direct observation) factors); 2) performance judgment (the factors that affect a rater's judgment of performance); 3) performance rating (which is distinct from the judgment event and is the actual rating assigned to a ratee); and 4) evaluation of the PA process (an evaluation of the PA's accuracy or effectiveness). Assumptions: 1) rater behavior is goal-directed; 2) performance appraisals are social interactions; 3) PAs function primarily as a tool for effective management rather than as a measurement instrument.

Williams, DeNisi, Blencoe, & Cafferty 1985: The role of appraisal purpose: effects of purpose on information acquisition and utilization

Deservedness vs designation decisions (deservedness: everyone who deserves a raise gets one; designation: only the top 2 get a raise because you only have 2 to give out, so you have to rank everyone). Kelley's attribution theory model for assigning causality uses consistency, distinctiveness, and consensus information. The study found that ratings increase when consistency is high and when distinctiveness and consensus are low (the latter two effects not significant). For deservedness decisions, people seek out distinctiveness information most (then consistency, then consensus); for designation decisions, people seek less consensus information because they are already actively making comparisons.
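
A toy sketch of how the Kelley pattern maps to attributions (the high/low coding and the rating implication follow the standard reading of the model; they are my assumptions, not results from the study):

```python
def attribute(consistency: str, distinctiveness: str, consensus: str) -> str:
    """Classic Kelley patterns: high consistency with low distinctiveness
    and low consensus points at the person (internal attribution), the
    pattern associated above with higher ratings."""
    if consistency == "high" and distinctiveness == "low" and consensus == "low":
        return "internal (person) attribution"
    if consistency == "high" and distinctiveness == "high" and consensus == "high":
        return "external (stimulus/situation) attribution"
    return "mixed/ambiguous attribution"

print(attribute("high", "low", "low"))  # internal (person) attribution
```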

What are the different ways that raters get information about ratees?

Direct observation, indirect observation, observation of results, customer reports, reports from coworkers, reports from members of other units, and self-reports.

Kluger & DeNisi 1998: Feedback interventions: toward the understanding of a double-edged sword

A feedback intervention (FI) provides people with information about their performance; it can be positive or negative. If you want to do 360, focus on the task and less on the self (self-focus diverts cognitive energy from the task; FIs that contain cues to the self, whether positive or negative, are less effective for performance than FIs with no cues to the self). If you do focus on the self, it is better to focus on the ideal self (people will strive to meet the ideal and keep improving; if you only ever get positive feedback, you stop improving). This supports pairing goal-setting strategies with feedback on progress toward goals.

Describe the steps of the judgment process?

First you obtain information about performance (observation, reports, inspection of results, prior ratings, reputation of the ratee); then you apply a judgment strategy (often not conscious, relying on heuristics; judges don't use the same strategies consistently, and the strategy may inform information gathering); the product of your information and judgment strategy is the actual evaluation (the judgment).

Sackett, Zedeck, & Fogli 1988: Relations between measures of typical and maximum job performance.

Found low correlations between typical and maximal performance. When do you see maximal performance? When there is awareness of being observed/evaluated, awareness of instructions to maximize effort, and measurement over a short duration; typical performance appears under the opposite conditions. Example: cashiers' speed and accuracy averaged over 4 weeks = typical, vs a maximal speed-and-accuracy test. Supervisor ratings correlated more with maximal than with typical scores.

Roch, Sternburgh, & Caputo 2007: Absolute vs relative performance rating formats: implications for fairness and organizational justice

Found not only that the relative vs absolute distinction affects fairness perceptions (absolute formats (BOS) are seen as more fair even though relative formats are more psychometrically sound), but that specific formats within each category also affect fairness perceptions. The authors suggest avoiding relative formats if you are concerned with fairness perceptions. Among relative measures, RPM was perceived the most favorably. You should consider the purpose of the performance appraisal (maybe fairness perceptions matter more when ratings are used for administrative decisions?). The study also found that rating format had a greater influence on procedural justice perceptions in the context of high distributive justice (people care more about whether the rating procedure is fair when the organization has high distributive justice).

What aspects of cognitive psychology are important when studying PA systems?

Information acquisition: what the rater attends to; encoding: prototypes and schemas; storage and retrieval: what the rater remembers; and integration of different pieces and types of information to make a rating decision.

Bartram 2007: Increasing validity with forced-choice criterion measurement formats

Ipsative = forced choice, used to try to reduce social desirability effects because you have to choose between two equally socially desirable options. The study found that ipsative formats beat normative ones on validity coefficients (the criterion-related validity of a predictor can be increased by 50% by switching to an ipsative format). But there are practical limitations: you need a lot of items, and you need every possible pair. The researchers suggest combining normative and ipsative formats (ipsative works better with a large number of constructs and when you have a lot of time).
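
To see why the item count blows up, here is a minimal sketch of the "every possible pair" requirement (the construct names are made up): k constructs need k(k-1)/2 pairings, which grows quadratically.

```python
from itertools import combinations

# Hypothetical constructs; any real instrument would have its own.
constructs = ["planning", "teamwork", "initiative", "accuracy", "leadership"]

pairs = list(combinations(constructs, 2))  # every possible pair
print(len(pairs))                          # 10 = 5 * 4 / 2

# The quadratic growth is the practical limitation:
for k in (5, 10, 20, 30):
    print(k, "constructs ->", k * (k - 1) // 2, "pairs")
```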

What are some of the advantages and disadvantages to direct observation and what are some observation aids that can be used?

Accurate direct observation is possible, but it is almost impossible to devote the time; managers may have preset observation goals that affect what they attend to; job features may make behavior hard to observe; a turbulent environment may make it hard to observe with certainty; and observation itself affects behavior. Observation aids: training, experience with the job at hand, BARS, and (Martinko & Gardner) time sampling, behavior diaries, self-observation, multiple raters, and large behavioral samples.

Describe the lead and lag relationships of PA

Lead relationships: performance leads to the actual ratings. Lag: you decide you want to fire someone, so you give them low ratings to reflect that decision. Lead-lag: you made a decision, so you give low ratings, but then the ratee gets training or something (the lead), so the rating goes up later.

What are some of the reasons that PA systems have moved away from being solely objective measures?

Objective measures tend to have low reliability. They also tend to be available for only a few types of jobs. And they may be increasingly inappropriate with the changing nature of work (criterion deficiency).

Murphy 2008: Explaining the weak relationship between job performance and ratings of job performance

Objective measures tend to have low reliability, tend to be available for only a few types of jobs, may be increasingly inappropriate given the changing nature of work (criterion deficiency), and can also suffer from criterion contamination (measuring something that isn't true performance). Ways to improve performance appraisals: choosing a rating scale format, reducing error (leniency, halo, central tendency), and rater training, but these aren't very effective. Three proposed models: one-factor models assume a true score + measurement error, so all you have to do is eliminate the error (but there may not be a single true score, and this accounts only for measurement error, not rater error); multifactor models treat ratings as a composite of relevant criteria and attempt to explain the error, taking individual (rater goals, personality) and system (purpose, source) characteristics into account; the mediated model suggests that rater goals mediate the relationship between performance, system, and individual characteristics and the ratings, so you have to convince raters to be neutral. No one really likes one-factor models anymore, multifactor models are popular, and the mediated model needs more research.
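
Roughly, the contrast between the first two models can be written like this (the notation is assumed for illustration, not Murphy's own):

```latex
% One-factor (classical) view: a rating R is true performance P plus
% random noise e, so reducing e is the whole job.
R = P + e
% Multifactor view: systematic rater and system effects enter as well,
% so eliminating random error alone cannot close the gap.
R = f(P,\ \text{rater characteristics},\ \text{system characteristics}) + e
```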

Hooft, Flier, & Minne 2006: Construct validity of multi-source performance ratings: an examination of the relationship of self-, supervisor-, and peer-ratings with cognitive and personality measures

Only found weak evidence for construct validity, except between personality tests and self-ratings (because you complete both yourself!). The weak findings may be because the external constructs (in-basket, general intelligence) are results-oriented (performance is more than that) and tap maximal rather than typical performance. Also, supervisor ratings were not more predictive.

What is the difference between focusing on behaviors instead of results?

Organizations want to measure results, but it is better to measure behaviors. An exclusive focus on results leads to dysfunctional behaviors; results are the outcome of complex processes, so you don't know what you're actually measuring; and focusing on results means employees and managers may ignore behaviors that are beneficial to long-term survival rather than tied to a product.

How was PA initially conceptualized and how is it conceptualized now? What is the advantage to shifting the conceptualization?

PA was initially conceptualized as a measurement problem, with research focused on scale development, scale formats, and reducing bias. PA is now conceptualized as a social and communication process. The advantage is that the context of performance gets more attention (a measurement focus ignores the context in which performance occurs) (Murphy & Cleveland).

DeNisi & Sonesh 2008: The appraisal and management of performance at work

People need to think the rating system is fair (distributive, procedural, and interactional (also informational) justice); then they are less likely to sue you and more likely to change based on their feedback. Research started emphasizing the difference between the judgment and rating processes. The goal also shifted from being "accurate" (which may lower perceptions of fairness) to helping employees improve (so feedback acceptance is really important). FOR training should be used not for "accuracy" but for aligning all raters with the company's goals. Ratee satisfaction is based on perceptions of justice, timeliness, accuracy, and utility, as well as a preference for appraisals that foster future improvement, outline strengths and weaknesses, and are more encouraging than critical. Ratees also need to trust the system, have knowledge about it, be involved in the process, and receive feedback at the end. Rater motivations to give lower scores: proving there's no favoritism, feeling threatened by the employee, wanting to motivate the employee to improve. Performance management: you want to improve performance, not just measure it; the employee needs goal setting, revision, feedback, and a timeline and framework (SMART). Benefits of PMSs: meaningfulness (links between performance and rewards), specificity (clear communication), reliability (consistent and free of error), validity (relevant criteria = good for motivation), and increased motivation and justice perceptions.

Levy & Williams 2004: The social context of performance appraisal: a review and framework for the future

A review of the PA literature; research is focusing more and more on performance context. The study developed a model of the social context of the appraisal process that includes distal variables, process proximal variables (relating directly to how the PA is conducted), structural proximal variables (related to the structure of the PA system), and rater/ratee behavior. There isn't much research on distal variables (it would be hard to isolate their effects). Many process proximal variables affect rater and ratee behavior: raters may be affected by similarity to the ratee (more similar = higher rating), individual differences (high Agreeableness, low Conscientiousness = higher ratings), their own affect (better mood = better ratings), and their motivations (avoiding discomfort = higher ratings; being held accountable = more accurate ratings). Ratee reactions may be affected by fairness, trust between rater and ratee, which format is used (BOS preferred), and work group dynamics. There should always be a feedback component with qualitative information. For structural proximal variables: knowledge of and experience with the multisource feedback system improved acceptance of multisource feedback, the purpose of the PA affected reactions (raters were more lenient for administrative purposes), and raters with FOR training were more accurate.

Boswell & Boudreau 2002: Separating the developmental and evaluative performance appraisal uses.

Since there is conflict in PA purposes when PA is used for both evaluative and developmental ends, the researchers studied giving the evaluative role to the supervisor-once-removed, expecting this would increase employees' satisfaction with the appraisal and their supervisor, increase awareness of development, improve employees' feelings toward their supervisor, and affect employees' intentions toward development. The only significant difference was that employees in the separated-use PA reported intending to use fewer future development options than those in the traditional system (because they feel development has already been addressed in the separated PA system). The researchers suggest more research.

What are some variables that impact indirect observations?

Source of the report, initiator of the report, form of the report (descriptive vs evaluative, long vs short), medium, timing, motive, and consistency of the reported observations.

Fox, Bizman, & Garti 2005: Is distributional appraisal more effective than the traditional performance appraisal method?

TAM = traditional appraisal methods; DAM = distributional appraisal methods. TAM ignores that performance is dynamic; DAM emphasizes that performance can vary and that this variability is important to measure. DAM records the estimated frequency of different levels of behavior in each domain. The categories should be concrete, have anchors, and be known ahead of time (to guide observation). The study sought to find out whether PDA (a DAM method) would reduce error and increase accuracy. They found no difference in leniency, but DAM beat TAM on interrater agreement. The authors suggest not using a summated score (you lose the variability), note that DAM requires more careful observation of ratee behavior, and conclude that DAM is preferable to TAM (even though it is more expensive).
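
A minimal sketch of what a distributional record might look like for one dimension (the levels and percentages are invented for illustration), including why a summated or mean score loses information:

```python
# Estimated % of time the ratee performed at each level (1 = worst).
distribution = {1: 5, 2: 10, 3: 50, 4: 25, 5: 10}
assert sum(distribution.values()) == 100  # frequencies must cover all time

mean = sum(level * pct for level, pct in distribution.items()) / 100
spread = sum((level - mean) ** 2 * pct for level, pct in distribution.items()) / 100
print(f"mean level: {mean:.2f}, variability: {spread:.2f}")  # 3.25, 0.89
# Two ratees can share the same mean yet differ wildly in spread,
# which is exactly what a single summated score throws away.
```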

Who should or usually obtains the information needed to make a rating?

Task acquaintance is important (the amount of work contact someone has with the ratee). Direct supervisors may be the best source, but consider others (access to behaviors depends on level of contact, and others may have more access to results; direct supervisors may have more reason to be accurate in their appraisals because they may be held accountable). It is hard to use other sources because that doesn't fit the typical hierarchical model, but you don't necessarily have to stick to a hierarchical model.

Jawahar & Williams 1998: Where all the children are above average: A meta-analysis of the performance appraisal purpose effect

The PAP effect (more lenient/less accurate ratings for administrative purposes) varies with several moderators: it is stronger in the field than in the lab, stronger with organizational raters than student raters, stronger for direct observation than video and for video than paper records, and stronger for downward than upward appraisal. The researchers suggest holding raters accountable, and not collecting ratings for one purpose and then using them for administrative decisions.

Tziner & Kopelman 2002: Is there a preferred performance rating format? A non-psychometric perspective.

There is no evidence that any PA system is psychometrically superior to all others. Some research says forced choice is better for administrative decisions and behavioral scales are better for development. A PA system may go unused if it does not elicit positive reactions among both raters and ratees. The study found that BARS was the least preferred in pretty much every case compared to BOS and GRS. BOS was superior to GRS for reliability and validity, minimizing communication barriers, setting clear goals, increasing commitment to reaching goals, giving more focused feedback, being more objective and less biased, and measuring individual performance, but the differences between BOS and GRS were slight, so do a cost-benefit analysis.

What is going on when sources disagree?

Sources are more likely to disagree about evaluation than description. Different sources may be more or less useful depending on the purpose of the evaluation. Some disagreement is good: why use multisource feedback if all sources agreed with each other? Combining sources might be easier or harder depending on the structure of the organization (less hierarchical and looser = easier); disagreement is more likely with rigid hierarchies.

Hannum 2007: Measurement equivalence of 360 degree assessment data: are different raters rating the same constructs?

Found adequate (though only marginally so) structural equivalence across the three rater types, meaning each group was thinking of the 7 scales in the same way. You could possibly average across the sources, but the results aren't conclusive, and it's worth asking why you would want to average them at all, since averaging works against the point of multisource feedback.

What should you consider when evaluating criteria?

Criteria have to be relevant, and you don't want them to be contaminated or deficient.

What is the difference between typical and maximal performance?

Typical performance is how well employees actually do; maximal is how well they can do. Those with higher maximal performance don't necessarily have higher typical performance. Employees perform maximally when they are motivated and their work situation allows it. Given this distinction, we should measure not just someone's average performance but also their consistency, and remember that raters pay more attention to extreme (positive or negative) performance events.

What are some criteria for determining effectiveness of rating systems?

Utilization criteria: the purpose for which the PA is being used; qualitative criteria: relevance of the appraisal to job performance, ease of use, practicality; quantitative criteria: accuracy of measurement without halo, leniency, or central tendency errors.
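
A minimal sketch of how the classic quantitative criteria could be indexed from one rater's data (the scale, ratings, and dimension names are assumptions for illustration):

```python
import statistics as st

# One rater's overall ratings of six ratees on a 1-7 scale (hypothetical).
overall = [6, 6, 7, 6, 5, 6]
MIDPOINT = 4

leniency = st.mean(overall) - MIDPOINT   # > 0: lenient; < 0: severe
central_tendency = st.stdev(overall)     # near 0: everyone rated "average"

# Halo: the same rater's ratings on two supposedly distinct dimensions;
# a correlation near 1.0 suggests one overall impression drove both.
quality = [6, 5, 7, 4, 6, 5]
teamwork = [6, 5, 7, 5, 6, 5]
halo = st.correlation(quality, teamwork)

print(f"leniency={leniency:.2f}, spread={central_tendency:.2f}, halo={halo:.2f}")
```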

Borman, Bryant, & Dorio 2010: The measurement of task performance as criteria in selection research.

Objective criteria you can measure: production rates (though not everyone has the same opportunity to produce: criterion deficiency), sales (same issue), work samples (maximal, not typical, performance), and job knowledge tests (may not work for certain jobs). Subjective criteria: rating formats (BARS: anchors that you want to be relevant and inclusive; GRS: not as comprehensive, more subjective, simplest to use; behavior summary scales: include what not to do as well, with high-, mid-, and low-effectiveness summaries; BOS: looks at the frequency of behaviors). The article combined several dimensional models and came out with 6 dimensions: communicating, productivity, useful personal qualities, problem solving, organizing/planning, and leadership and supervision.

What kinds of behaviors define performance (so you should measure them)?

You should measure conceptual behaviors over task behaviors: workers spend very little time performing tasks, many measures include things only tangentially related to tasks, and job performance is more long-term than task performance. Some conceptual criteria are short-term org goals (task completion), long-term org goals (relationships), OCBs, etc.

What is the difference between ratings and rankings and what are the different ways to rank employees?

Ratings compare employees to a standard (absolute/criterion-referenced); rankings compare employees to each other (relative/norm-referenced). In a full ranking, you order every employee from best to worst. In paired comparisons, you consider all possible pairs to determine who is better (this takes forever; see the sketch below). In a forced distribution, you place a set number of employees in each level of performance (useful for administrative decisions).
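
A minimal sketch of the last two methods (the names and quotas are invented): full paired comparison needs n(n-1)/2 judgments, and a forced distribution just slots a ranked list into quota buckets.

```python
def n_pairs(n: int) -> int:
    """Number of judgments a full paired comparison requires."""
    return n * (n - 1) // 2

def forced_distribution(ranked, quotas):
    """quotas: (label, count) tuples from the best bucket to the worst."""
    buckets, i = {}, 0
    for label, count in quotas:
        buckets[label] = ranked[i:i + count]
        i += count
    return buckets

ranked = ["Ana", "Ben", "Cal", "Dee", "Edo", "Fay", "Gus", "Hal", "Ida", "Jo"]
print(n_pairs(len(ranked)))  # 45 comparisons for just 10 employees
print(forced_distribution(ranked, [("top 20%", 2), ("middle 60%", 6), ("bottom 20%", 2)]))
```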

Murphy & Constans 1987: Behavioral anchors as a source of bias in rating

Anchors may bias the observation process by priming raters to attend to specific behaviors (and ignore ones not included), or by increasing the likelihood that raters will recognize and attend to certain behaviors. Even if all of the anchor behaviors are relevant and representative, for an employee who is average but occasionally shows some of the good anchor behaviors, raters may attend to just those behaviors if the anchors have directed their attention there. Anchors may also bias recall by changing retrieval from a free-recall to a cued-recall task: behaviors that match the anchors will be remembered even if they are not representative of the employee's typical behavior. Results: ratings of a video of average performance shifted in the expected directions when high-anchor or low-anchor behaviors were included, and anchors biased recall, not observation. One reason: a 6 on a 9-point scale may be perceived as average with some anchors but well above average with others; also, subjects in the different groups may have remembered the same information but given more weight to behaviors matching the anchors on the scale. The researchers suggest using the SK method so that anchors represent typical, not unusual, behavior.

Discuss some of the legal concerns/history of PA systems?

Because PA systems can be validated (either by how well they predict future performance or by how well they measure past performance), PA systems have become more and more measurement-focused while ignoring the communication and social aspects of PA, even though research suggests a hybrid approach.

Describe the different rating scale formats

GRS: the most common, easiest to use, easiest to develop, but you don't have clear anchors and definitions of performance. BARS/BES: use behavioral anchors (BARS uses more actual behaviors, BES more expected behaviors); the behaviors are supposed to make ratings less subjective, but they might not be representative, might be biased, and are time-consuming (expensive) to develop. BOS: frequency-based (how often the employee performs each behavior: sometimes/always/never, or a percentage); meant to gauge typical more than maximal behavior, and supposed to be even less subjective. MSS: supposed to reduce bias; shows the anchors and dimensions in a mixed-up order, and you respond to each statement separately, but it is complicated to develop and score. PDA (performance distribution assessment): every behavioral anchor at each level is rated for frequency (the frequencies add up to 100), so you capture more variability.

What is Management by Objectives and what is its strength as a PA system?

In MBO, the system involves goal setting, participation in decision making, and objective feedback. Its strength as a PA system is that even though it sounds purely objective, it includes very subjective elements, such as goal setting and negotiation, so it is a good mix of objective and subjective. You can add it to any PA system you have.

What are the 2 main purposes of PA and how do they affect rating outcomes?

Purposes: administrative and developmental. Ratings are more likely to be lenient for administrative PA (to avoid conflict; the decision is more permanent; you may want the person out of your department; you want to keep the culture positive; low ratings reflect on the manager; leniency is found more in the field than in labs). There are mixed results on purpose's effect on accuracy, because true performance is hard to know (easier in lab studies).

Blanz & Ghiselli 1972: The mixed standard scale: a new rating system

Rater error can be reduced by using the MSS. For each dimension there are benchmark statements, and raters say whether the ratee performs the same as, worse than, or better than each standard. Instead of presenting the statements in order within each dimension, the statements are mixed up so the rater can't tell which statement is the best or which dimension it belongs to. If the rater is rating logically, they should never say a ratee behaves better than the best standard for a dimension but worse than the average standard for that dimension, so this provides a check on whether raters are rating accurately, or at least logically (see the sketch below). They found high interrater reliability, lower leniency error, high differentiation between ratees, avoidance of halo, and that accurate raters' ratings correlated with the scores workers got on tests. However, constructing, completing, and scoring the MSS is laborious.
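
A minimal sketch of that logical-consistency check (the response encoding is my assumption; Blanz & Ghiselli's actual scoring scheme is more elaborate):

```python
# For one dimension, the rater judges the ratee against three standards
# ordered best to worst. '+' = ratee is better than the standard,
# '0' = about the same, '-' = worse.
ORDER = ["good", "average", "poor"]
VALUE = {"+": 1, "0": 0, "-": -1}

def is_logical(responses: dict) -> bool:
    """A logical rater's standing can only rise as the standards get worse:
    beating the 'good' standard while losing to the 'average' one is an
    illogical (error-prone) response pattern."""
    vals = [VALUE[responses[s]] for s in ORDER]
    return all(a <= b for a, b in zip(vals, vals[1:]))

print(is_logical({"good": "-", "average": "0", "poor": "+"}))  # True
print(is_logical({"good": "+", "average": "-", "poor": "+"}))  # False
```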

What are some of the different sources of information and the advantages and disadvantages to each?

Subordinates as sources: this can make everyone uncomfortable; subordinate and supervisor ratings tend to agree more with each other than with self-ratings; subordinates have frequent access to interpersonal behaviors and results; and with more employees there is more trust in anonymity. Self as source: well informed of all behaviors; self-ratings come close to supervisor ratings if performance feedback is given; they are less lenient if checked against objective criteria; but there may be self-serving bias and fundamental attribution error involved. Peers as sources: they can observe both task and interpersonal behaviors, the ratee doesn't feel like they are being observed, and ratings can be pooled, but there may be resistance and leniency. Customers/clients as sources: can be internal or external; you can improve your behavior based on client needs; results-oriented; but they may only notice extreme behaviors. Upper management as sources: they can't do a lot of observing and don't have much contact, but they may be less biased (above the politics), and they can have a significant impact on what should be observed.

Cleveland, Murphy, & Williams 1989: Multiple uses of performance appraisal: Prevalence and Correlates

There are conflicting needs/uses of PA systems. Orgs use them to make decisions (between-subjects comparisons) and also want to use them for development (within-subjects). They can also be used for system maintenance (evaluating the HR systems in place) and documentation (documenting or justifying decisions being made). The study found that information from PA had the greatest impact on salary administration, performance feedback, and identification of employee strengths and weaknesses. Most organizations used the PA system for multiple purposes, so more research is needed on how this affects outcomes.

Smither, London, & Reilly 2005: Does performance improve following multisource feedback? A theoretical model, meta-analysis, and review of empirical findings.

Very small effect sizes were found for performance improvement after receiving feedback. However, effects were larger when feedback was used for developmental purposes (maybe because development is a more continuous viewpoint, unlike PA for administrative decisions, which is over when the decision is made). Performance is most likely to improve when the feedback indicates change is necessary and recipients have a positive feedback orientation, perceive a need to change their behavior, react positively to the feedback, believe change is feasible, set appropriate goals to regulate their behavior, and take actions that lead to skill and performance improvement. Small changes may take a long time to add up, so maybe measure over multiple administrations.

Viswesvaran & Ones 2000: Perspectives on models of job performance

There have been different models that try to establish a standard set of job dimensions instead of developing them for one job through job analysis. Models fall into a 2x2: dimensions are either stand-alone or developed as part of a set, and either limited to a specific occupation or applicable across jobs. Stand-alone dimensions that apply across jobs: task performance, OCBs, and counterproductive work behaviors, with positive correlations across all of these dimensions. Dimensions developed as a set for specific occupations: e.g., entry-level jobs (9 dimensions), military personnel (5), managers (4); these models would say any manager's job has those same 4 dimensions. Dimensions developed as a set that apply to all jobs: e.g., Campbell's 8 dimensions (statistically distinct), where the salience of each differs by job.

Steele-Johnson, Osburn, & Pieper 2000: A review and extension of current models of dynamic criteria.

There is an assumption that performance is stable, yet tasks change over time, and so do people and raters. Cognitive ability (CA) is usually called the best predictor of performance, yet many variables can affect this: job variables (CA predicts more when job complexity and task interdependence are high, less when job consistency is high), organizational variables (CA predicts more with technology and work-process changes, less with high situational strength and high situational constraints), learning-related variables (less predictive with greater skill acquisition and work experience), and task variables (more predictive if the task is complex, less if it is consistent and well defined/structured).

Atwater, Brett, & Charles 2007: Multisource feedback: lessons learned and implications for practice

You should consider several factors: before feedback (organizational context, perception of the process, the actual process, individual differences); about the feedback (format, characteristics, reactions to it); after feedback (method of feedback distribution, org support, individual attitudes and behaviors); and outcomes of the feedback process (changes in self and other ratings, relationship to assessment center ratings, relationship to PA ratings, employee satisfaction, intent to leave, engagement). Don't implement MSF while the company is restructuring and downsizing. Integrate MSF with HR processes. Establish a clear purpose and communicate to employees how it fits with org goals. Use it for developmental purposes before evaluative and administrative ones. Make sure it is anonymous and inform employees of the process to ensure this. Think of individual differences when considering reactions to MSF. Discuss strategies with leaders to overcome initial reactions and motivate leaders to focus on the developmental aspects. Consider individual differences when delivering feedback. Consider interventions to boost self-efficacy. Consider having managers rate only some leaders each year, weigh the value of feedback from external raters against time and cost, and use electronic means to collect information and increase confidentiality, but don't use electronic means to deliver feedback.

