Kirkpatrick Level 3 Evaluation

This is the fourth in a series of posts on the Kirkpatrick Model of Evaluation. Previous posts have provided an overview of the Kirkpatrick model and a detailed look at both Kirkpatrick level 1 and Kirkpatrick level 2. This post will focus on the detail of level 3.

What is Kirkpatrick level 3 evaluation all about?

The keyword that Kirkpatrick uses to describe this level of evaluation is ‘Behaviour’ (Kirkpatrick, D. & Kirkpatrick, J. (2006). Evaluating Training Programs, Third Edition. San Francisco: Berrett-Koehler Publishers, Inc.). Donald and Jim Kirkpatrick define level 3 evaluation as determining how much transfer of knowledge, skills and attitudes has occurred following a training programme – i.e. how behaviour in the workplace has changed as a consequence of training.

If you’ve read my Kirkpatrick level 1 and Kirkpatrick level 2 summaries you may have already concluded that evaluation at this level is more complicated and resource intensive than the earlier levels.

When should I be conducting evaluation at Kirkpatrick level 3?

One of the reasons that evaluation at Kirkpatrick level 3 is more complex is that the process is harder for trainers to control. In order to determine whether individuals have transferred the training they have received, you have to wait until the trainees have spent some time back in the workplace. The amount of time you have to wait will depend on numerous factors, including:

  • the nature of the training.
  • the opportunity available to implement the knowledge/skills.
  • the level of encouragement from line management.

How and what should I evaluate at Kirkpatrick level 3?

Once you have decided how long to wait post-training, you will need to decide which methodology you’re going to use to collect your evaluation data. In my opinion it is important to have a systematic methodology, although research by ESI suggests that many organisations take a more ad-hoc approach when measuring the impact of learning.

Questionnaires and interviews are common methods, but whichever techniques you choose will need to be appropriate to the amount of resource you’re prepared to commit. Remember, it is not always necessary to conduct evaluation at all levels for all your training – I have developed my own system called EVAL-IT to help me prioritise training for evaluation.

Kirkpatrick Level 3 – opportunities to exploit:

There are a number of aspects of evaluation at level 3 that you can take advantage of:

  • Key data can be collected to show whether (and how) training is being implemented. Decisions can then be made on how to improve the transfer rate.
  • Enables direct engagement with the operational side of the business, which assists in developing relationships and maintaining the links between training and business requirements.
  • Identifies areas of training that may no longer be relevant (removing these elements of training can reduce the length of the training and therefore reduce costs).
  • Identifies gaps in the training where skills or knowledge need to be incorporated in order to more fully prepare individuals for their role.

Kirkpatrick Level 3 – threats to mitigate:

Whilst there are advantages, there are also some issues that you should be aware of:

  • Lack of consistent methodology can create difficulties in comparing results across training programmes.
  • There can be a tendency to rely on ‘anecdotal’ evidence – this may not be sufficient for your board and should be reinforced with more scientific data collection.
  • Can use excessive amounts of resource if not focussed efficiently, resulting in the potential for costs to outweigh the benefits – use a technique like EVAL-IT to prioritise.
  • The wider organisation may not ‘buy in’ to your evaluation efforts, concluding that “completing your questionnaires is a waste of time”. To overcome this problem you’ll need to actively engage across the workforce – check out who should be involved in training evaluation for more info.

Linking Kirkpatrick level 3 data with level 1 and 2 results…

One of the themes of the Kirkpatrick model is the concept of ‘generating a chain of evidence’ between the levels. The idea behind this principle is that you should be able to compare the different results that you have collected across levels 1, 2 and 3 for a particular training programme to see if you can identify any themes.

Participants who were dissatisfied with the training (level 1) and failed to demonstrate that they had assimilated the subject matter (level 2) are unlikely to then demonstrate effective transfer (level 3).

However, you may generate high satisfaction results (level 1) and effective knowledge gain (level 2) but still not achieve effective transfer (level 3). At this point you can then look to analyse the reasons why. Perhaps you’re training the wrong things? Perhaps line managers don’t agree with the ‘new way of doing things’ and are denying trainees the opportunity to implement their new skills? When you come up against negative results that give you cause for concern you’ll need to intensify your research to determine the true cause before you can implement an action plan to improve the situation.
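This chain-of-evidence check can be sketched in a few lines of Python. The programme names, scores and thresholds below are entirely made up for illustration – it simply flags the high-satisfaction, high-learning, low-transfer pattern described above.

```python
# Hypothetical chain-of-evidence check: flag programmes where satisfaction
# (level 1) and learning (level 2) scores are high but transfer (level 3)
# lags behind. All names, scores and thresholds are illustrative assumptions.

programmes = {
    "Leadership 101": {"level1": 4.6, "level2": 4.4, "level3": 2.1},
    "Equipment Ops":  {"level1": 4.2, "level2": 4.5, "level3": 4.3},
    "Sales Skills":   {"level1": 2.3, "level2": 2.8, "level3": 2.0},
}

def transfer_gaps(results, high=4.0, low=3.0):
    """Return programmes with strong level 1/2 results but weak level 3."""
    return [
        name for name, r in results.items()
        if r["level1"] >= high and r["level2"] >= high and r["level3"] <= low
    ]

print(transfer_gaps(programmes))  # ['Leadership 101']
```

Programmes flagged this way are the ones worth investigating further – the training appears to work in the classroom but isn’t reaching the workplace.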

It’s more complicated to achieve but that doesn’t mean we shouldn’t do it…

The Kirkpatricks have claimed in the past that level 3 is the “missing link” in evaluation. Trainers will attempt to do the things they can control at levels 1 and 2, and may even attempt to demonstrate some links at level 4, but they fail to consider how effectively training is being transferred.

Yes it takes time and energy. Yes it means engaging with the wider workplace. Despite these facts, it should be conducted for those programmes that you (and your major stakeholders) believe are key for organisational success. If your employees are spending time on expensive training programmes but are not behaving differently at work as a consequence then what is the point?

To conduct level 3 evaluation effectively, I believe you need to decide which methodologies you’re going to use, engage with the relevant people and explain what you’re going to do (and why) – then get on and do it.

Do you have any experience of conducting level 3 evaluations? Do you think level 3 is a waste of time? Please share your thoughts in the comments below.

11 Responses to Kirkpatrick Level 3 Evaluation

  1. Mark T Lawrence at 3:25 pm #

    Hi Richard,
    Thanks for posting an interesting and thought-provoking article.
    I was interested by your comment that “it is not always necessary to conduct evaluation at all levels for all your training”, as it seems to me that this may be the root of some of the other problems you mention. A consistent approach toward measurement of learning offerings should mitigate the complexities encountered when comparing results; reliance upon anecdotal evidence; inefficient resource deployment; and lack of management buy-in.
    Treating offerings differently opens up the complexity of statistical analysis and adds pressure to resource bandwidth. Anecdotal evidence is always useful, if only to provide some textual support to the stats, but management are unlikely to offer their support without assurances that any findings are confirmed, repeatable and consistent.
    As a final thought, I’d ask whether it’s really necessary to link all the Kirkpatrick levels…? With resource pressures ever-abounding, I’d suggest simplification and prioritisation of those performance measures which are most pertinent to the organisation would be more appropriate.

    • Richard at 4:59 pm #

      Hi Mark, thanks very much for your comment!

      On reflection I wasn’t very clear with my comment on it not always being necessary to conduct evaluation at all levels for all training. Let me explain.

      What I meant to say was that you might treat different training programmes differently. For example you may deliver training in X. This training programme is expensive to deliver but deemed to be business critical to succeed in a new area of the business. For these reasons, you decide that it is important to evaluate the training at all 4 levels – ensuring that you have the right data to demonstrate that this training is required (or indeed to conclude that it isn’t or it needs amending). In this case I agree with you 100% that you need to be consistent in the measurement methodologies that you select and implement for all the reasons you describe. Ensuring that you apply those methods consistently for different iterations of the training you deliver on X is vital.

      However, you may also deliver training in Y. You can identify the business need for this training and senior management fully accept the requirement for its delivery. Whilst there is a need, it’s not business critical or expensive to deliver. In this case, you may decide not to apply levels 3 and 4 evaluation. Some may argue that you don’t need to evaluate it at all. This may be the case, but periodic confirmation that participants agree with the delivery methods and are consistently achieving the training objectives may be appropriate. Where you do carry out this activity, I agree again that you need to be consistent so that you can analyse trends over different iterations of Y to make sure you’re staying on track. I guess it becomes a resource balancing act.

      I hope this provides a better explanation of what I meant when I said that it is not always necessary to conduct evaluation at all levels for all of your training. On your final point, I’d agree that it’s not always necessary to link all the levels. Sometimes though this may help to provide a more persuasive argument. If that’s what you need it might be worth considering.

      Thanks again for your comment. I hope you call by the blog in the future and share your views and observations further.

  2. robert tremaine at 8:37 pm #

    What’s the ideal time to start measuring for Level III after the training took place? I’ve read the optimal range runs between 60-180 days. Closer to 60 days, some think it allows for less learning decay. Closer to 180 days, some argue it gives the learner more time to assimilate and practice what they learned. Where’s the sweet spot? Thanks! :)

    • Richard at 9:11 pm #

      Hi Robert – thanks for your question! I’m afraid in my opinion there is no set answer for this – it really depends on the type of learning programme. I think a key question is to ask ‘how long is likely to be required to be able to put the learning into practice?’.

      For example, following some form of leadership and management training I might want to wait for 6 months to enable the opportunity for the learner to be exposed to a range of environments in which they could implement what they have learnt. Alternatively, if the training programme was focussed on a particular piece of equipment that the learner would be using on a daily basis following training then I might decide to only wait for a month before conducting the evaluation.

      One technique I have used in the past is, rather than evaluating a particular learning programme after each iteration, to wait and only subject the programme to periodic evaluation. When that particular programme comes round on the schedule I would then include individuals that have completed the training within, say, a 12 month period. Within this cohort, some individuals will have completed the programme only a month ago (this would be my minimum time) and others may have completed the programme 12 months ago (often my maximum time). If you organise your data properly, you should then be able to examine whether there are any differences in views depending on the amount of time that has passed since the learning.

      This approach also takes the pressure off continually trying to evaluate every individual instance of a learning programme – depending on the resources available, this may not be practical. I hope this helps, let me know if you have any supplementary questions. Regards, Richard.

  3. Sarah at 12:05 am #

    Hi Richard,

    I am interested to know the wording used for questions in the level 3 evaluation?



    • Richard at 11:08 am #

      Hi Sarah, thanks for getting in touch! The types of question will need to reflect the learning in question and your specific aims of the evaluation. However, some possible options might include:

      Relevance: How relevant do participants think the training was to their role? If it wasn’t relevant, they’re not going to use it!

      Preparedness: How prepared were participants to implement the learning? You can link this to relevance or some other measure (perceived importance perhaps) – obviously, if there are areas that participants are rating as highly relevant or important then we would want to be confident that they feel fully prepared in those areas.

      Impact: How much impact did the learning have on their ability to perform their role? They may feel prepared but this could be as a result of other factors (e.g. mentoring in the workplace) rather than the learning programme itself.

      Frequency: How often are participants using a particular aspect of the learning? If not very often, perhaps there is not a strong requirement – does it need to be included in the programme at all? Or should the participant be using it more than they indicate, meaning you need to do more to encourage Line Managers to support this aspect of the learning? Once again, you would hope that people feel very prepared (or confident perhaps) in those areas of the learning that they need to use frequently.

      These are just a few examples. With these types of questions you can provide a scale of responses (e.g. if looking at frequency you may use daily, weekly, monthly, annually) so that respondents can select the most appropriate answer; this will generate some useful quantitative data. You can then also provide some open text boxes and encourage comments in each area to generate some qualitative data – this will hopefully add some colour to your analysis.
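As a rough illustration of turning scaled responses into quantitative data, here is a small Python sketch. The frequency scale and the responses are invented for the example – the point is simply to summarise how respondents distribute across the scale.

```python
from collections import Counter

# Illustrative sketch: tallying responses to a level 3 frequency question
# ("How often do you use skill X?"). The scale and responses are made up.

SCALE = ["daily", "weekly", "monthly", "annually", "never"]

responses = ["daily", "weekly", "daily", "monthly", "daily", "never", "weekly"]

def summarise(responses, scale=SCALE):
    """Percentage of respondents choosing each point on the scale."""
    counts = Counter(responses)
    total = len(responses)
    return {point: round(100 * counts[point] / total, 1) for point in scale}

print(summarise(responses))
```

A summary like this makes it easy to spot, say, an element of the learning that almost nobody reports using – a candidate for removal or for a conversation with line managers.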

      I hope this is useful – reply to this comment if you want to follow anything up!

      • Sarah at 9:54 pm #

        Hi Richard,

        Thanks so much for your response! I will now go ahead and put forward my Evaluation Strategy to my manager and see how we go!



        • Richard at 8:08 am #

          Good luck Sarah. My advice would be to start small, test your approach and then make adjustments as necessary. Be open with your manager in that you may not have got it exactly right first time. However, I think it’s better to start with something and then improve it, rather than to spend forever trying to document a perfect approach that never actually gets implemented. Remember that the aims of the evaluation should be aligned to what the organisation wants to see (what does success look like for the learning programme?) rather than simply what the L&D department wants to see. Once you’ve started, let me know if you’d like to write a guest post to share how things go and help others learn from your experience.

  4. Sarah at 10:04 pm #

    Thanks Richard. Good advice and yes I agree with that approach. I am having another meeting with our L&D Manager on Monday, so will see what that brings!

    Thanks again.


  5. Karinka Priskila at 10:17 am #

    Hi Richard,
    I am a last year student at ITHB Bandung, Indonesia.
    A couple of months ago I had my internship at the Learning & Talent Development department of one of the banks in Indonesia.
    They use Kirkpatrick analysis for their training evaluation.
    One of their training is Customer Service School.
    This training is for new customer service staff.
    They asked me to create an instrument so they can measure level 3 (currently they can only measure this training up to level 2).
    I already have some of data: the customer service job description and the training material.
    What should I do?

    • Richard at 7:13 am #

      Hi Karinka, sorry for the delay in responding to your question. I think one thing you should aim to do is to try and gather some baseline data. By this I mean getting the data that captures the number of customer service related complaints that the bank has been receiving. I am assuming that the decision to deliver new customer service training is in response to the perception that the way bank employees handle customer service could be improved. Therefore we should aim to demonstrate the impact that the new customer service training is having.

      If you compare the number of customer service complaints within a period (say 3 months) prior to the training, and then measure the number of customer service complaints in the same timeframe after employees have received training, you will hopefully be able to identify a reduction. You will need to consider any other organisational changes that have occurred during the period but should hopefully be able to build a case that the new customer service training has contributed towards reducing customer service complaints.
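As a minimal sketch of that before/after comparison, assuming invented monthly complaint counts over a 3-month window either side of the training:

```python
# Rough sketch of the before/after comparison described above. The complaint
# counts and the 3-month window are invented for illustration only.

complaints_before = [41, 38, 44]  # monthly complaints in the 3 months pre-training
complaints_after = [33, 29, 27]   # same window after employees were trained

def percent_reduction(before, after):
    """Percentage fall in total complaints between the two periods."""
    b, a = sum(before), sum(after)
    return round(100 * (b - a) / b, 1)

print(percent_reduction(complaints_before, complaints_after))  # 27.6
```

A single headline figure like this is easy to present to senior management, though (as noted above) you still need to rule out other organisational changes before attributing the reduction to the training.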

      Depending on how the training is being rolled out across the organisation, you may be able to compare the number of customer complaints in one area of the business that has received training against another area that is still waiting for training. If you adopt this approach you will need to try and make sure that your two groups are as similar as possible in terms of size and function (one area of the business may historically generate more customer complaints than another, so comparing these two areas has the potential to provide an inaccurate picture).

      This approach should generate some robust quantitative data that will illustrate the extent of the impact. You may also wish to consider gathering some more qualitative data by contacting a number of employees who have undergone training. You could look to gain their view on the extent to which they feel the training has improved (or otherwise) their ability to handle customer service issues – are they doing anything different as a result of the training? To what extent is the rest of the organisation enabling what they learnt in training to be implemented?

      This combination of quantitative and qualitative data should enable you to establish a picture of the impact that the training is having. You should take some time to condense this information down into a short overview (a selection of graphs with a series of quotes perhaps) that can then be presented to senior management.

      I hope this helps. Good luck and let me know if you have any questions about my response. Regards, Richard
