Technical publications concerning ILC

How to assure reliability in the determination of uncertainties of test results

Results of CompaLab ILCs (interlaboratory comparisons) show that participants significantly underestimate their uncertainties. Test methods can be graded from mainly metrological ones to those whose sources of uncertainty are mainly qualitative. Uncertainties are generally well determined for the former, while they are generally underestimated by a factor of 10 or more for the latter. This probably comes from the widespread choice of GUM method B to determine them, whatever the test method. However, method B is effective in metrology but not when significant qualitative sources of uncertainty are present. The GUM also lacks guidance on some issues specific to testing. Furthermore, ILC and laboratory quality surveillance results can be re-used for GUM method A, which provides considerably better estimates of uncertainties and requires significantly less time and money than method B. When accurate determination of uncertainties is important, collaborative method A experiments (i.e. specifically designed ILCs) should be organised, whose results can afterwards be used in very effective internal quality surveillance programs. Determining uncertainties should always begin with a clarification of their intended use and a collection of the available information on the precision of testing. The most appropriate method to determine uncertainties depends heavily on this and, in most cases, the answer is not method B.
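As an illustration of how ILC results can feed a method A evaluation, the sketch below estimates the repeatability, between-laboratory and reproducibility standard deviations from a balanced set of replicate results, using the usual one-way analysis of variance. It is a minimal sketch and not the procedure of the full text: the function name `precision_from_ilc` and the sample data are hypothetical, and no outlier screening is performed.

```python
from math import sqrt
from statistics import mean, variance


def precision_from_ilc(results):
    """Estimate (s_r, s_L, s_R) from a balanced ILC data set.

    `results` maps a laboratory name to its list of replicate results.
    Hypothetical helper for illustration: every laboratory must report
    the same number n >= 2 of replicates, and at least 2 laboratories
    are needed.
    """
    replicates = list(results.values())
    n = len(replicates[0])
    if len(replicates) < 2 or n < 2 or any(len(r) != n for r in replicates):
        raise ValueError("need >= 2 labs, each with the same n >= 2 replicates")

    # Repeatability variance: average of the within-laboratory variances.
    s_r2 = mean(variance(r) for r in replicates)
    # Variance of the laboratory means.
    s_d2 = variance([mean(r) for r in replicates])
    # Between-laboratory variance, floored at zero.
    s_L2 = max(0.0, s_d2 - s_r2 / n)
    # Reproducibility variance.
    s_R2 = s_L2 + s_r2
    return sqrt(s_r2), sqrt(s_L2), sqrt(s_R2)


# Hypothetical data: three laboratories, two replicates each.
example = {"A": [10.0, 10.2], "B": [10.4, 10.6], "C": [9.8, 10.0]}
s_r, s_L, s_R = precision_from_ilc(example)
```

The reproducibility standard deviation obtained this way can serve as a first estimate of the standard uncertainty of a single test result, provided the participants and test conditions are representative of the laboratory's own practice.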

Download the full text:

Reliability of uncertainties of test results

Interlaboratory comparisons for hardness tests: interpolation of assigned values according to loading charges

This paper investigates whether, in a laboratory proficiency test, hardness results for a given Brinell or Vickers scale can be assessed when a sufficient number of test results is available for adjacent scales. Five different methods are identified to determine the assigned value, and two different methods to determine the proficiency standard deviation, the repeatability standard deviation and the uncertainty of the assigned value. The best option depends on the interlaboratory testing conditions. A procedure is described to handle the different possible options and to propose parameters that check the adequacy of each of them, to help choose the most suitable one. This procedure was assessed on CompaLab ILC results obtained from 2017 to 2023, and led to very small differences in the scoring of participants for available scales. When the input data set is large, the resulting scoring is even likely to be more efficient than the usual one.

Download the full text:

ILC about hardness: Interpolation of VA according to load

Appropriate rankits to use for normal probability plots and Standard deviation probability plots

Normal probability plots are usually used to check whether a distribution can be regarded as Gaussian, to visualise whether some values are likely to be outliers and, using a linear regression, to estimate its mean value and its standard deviation. In the same way, “SD probability plots”, based on the distribution of standard deviation estimates, could be quite useful to reach similar goals: check whether a hypothesis of homoscedasticity can be accepted or not, visualise estimates that are likely to be outliers, and estimate the true underlying standard deviation. In practice, a change of variable is necessary to convert the rank of each value into a corresponding cumulative probability, followed by an inverse Gaussian transformation to get a “rankit” to be used as ordinate for these plots. Equations of the form (i-a)/(N+1-2a) with 0 ≤ a ≤ 1 are usually used to determine the appropriate cumulative probabilities. In fact, at least for small values of N, the choice of the “a” value has an important impact on the conclusions that are drawn afterwards. This document:

  • Discusses the grounds of these equations;
  • Evaluates their adequacy for a series of situations and types of distribution laws;
  • Proposes equations to determine “a” values as a function of N, which provide better rankits than those usually used and make it possible to estimate mean values and/or standard deviations without any bias for a series of situations;
  • Proposes an accurate way to determine confidence envelope curves for normal probability plots and for probability plots of any distribution whose cumulative function is known.
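The rankit equation quoted above can be sketched as follows. This is only an illustration of the general form (i-a)/(N+1-2a): the default a = 0.375 is a commonly used constant, not one of the “a” values proposed in the full text, and the function names `rankits` and `fit_normal` are hypothetical. The regression of the sorted values on the rankits is one possible convention among others.

```python
from statistics import NormalDist, mean


def rankits(n, a=0.375):
    """Return the n rankits obtained from p_i = (i - a) / (n + 1 - 2a).

    a = 1 is excluded here because it gives p = 0 and p = 1, whose
    rankits are infinite. The default a is an assumption for illustration.
    """
    if n < 1 or not 0 <= a < 1:
        raise ValueError("need n >= 1 and 0 <= a < 1")
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf((i - a) / (n + 1 - 2 * a))
            for i in range(1, n + 1)]


def fit_normal(values, a=0.375):
    """Estimate (mean, standard deviation) by least-squares regression
    of the sorted values on their rankits: x = mean + sd * rankit."""
    xs = sorted(values)
    if len(xs) < 2:
        raise ValueError("need at least 2 values")
    zs = rankits(len(xs), a)
    z_bar, x_bar = mean(zs), mean(xs)
    slope = (sum((z - z_bar) * (x - x_bar) for z, x in zip(zs, xs))
             / sum((z - z_bar) ** 2 for z in zs))
    intercept = x_bar - slope * z_bar
    return intercept, slope


# Hypothetical data set of 6 test results.
mu_hat, sd_hat = fit_normal([9.7, 10.1, 10.4, 9.9, 10.0, 10.6])
```

Calling `rankits` with different values of a for a small N shows how strongly the extreme rankits, and therefore the fitted slope, depend on that choice.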

Download the full text:

Appropriate rankits for probability plots

Beta risk in proficiency testing in relation with the number of participants


The Monte Carlo method was applied to PT schemes to investigate their efficiency. The probabilities that a computed z value is over 3 while the true value is less than 2, and that a computed z value is less than 2 while the true value is over 3, are computed for a series of situations: number of participants from 5 to 30, various ratios of repeatability to reproducibility and numbers of test results per participant, with or without the introduction of outliers with z from 3.5 to 10. For each situation, the probabilities of not detecting true outliers and of triggering false alerts are discussed. Guidance and keys are proposed to check and improve the efficiency of real PT programs.
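A simulation of this kind can be sketched as follows. This is a minimal sketch under stated assumptions, not the model of the full text: the function name `simulate_pt` is hypothetical, the “true” z is taken as the participant's mean divided by the true standard deviation of a participant mean, the computed z uses the plain mean and standard deviation of the participants' means (no robust algorithm), and both risks are reported as conditional frequencies.

```python
import random
from statistics import mean, stdev


def simulate_pt(n_labs, n_results, sigma_L, sigma_r,
                outlier_z=None, n_rounds=2000, seed=0):
    """Estimate (alpha, beta) for a simulated PT scheme.

    alpha: share of participants with |true z| < 2 scored |z| > 3.
    beta:  share of participants with |true z| > 3 scored |z| < 2.
    All modelling choices are assumptions made for illustration.
    """
    rng = random.Random(seed)
    # True standard deviation of a participant's mean (true value = 0).
    sigma_true = (sigma_L ** 2 + sigma_r ** 2 / n_results) ** 0.5
    good = false_alerts = bad = missed = 0
    for _ in range(n_rounds):
        lab_means = [
            rng.gauss(0.0, sigma_L)
            + mean(rng.gauss(0.0, sigma_r) for _ in range(n_results))
            for _ in range(n_labs)
        ]
        if outlier_z is not None:
            # Force the first participant to a chosen true z value.
            lab_means[0] = outlier_z * sigma_true
        centre, spread = mean(lab_means), stdev(lab_means)
        for x in lab_means:
            z_true = abs(x) / sigma_true
            z_computed = abs(x - centre) / spread
            if z_true < 2:
                good += 1
                false_alerts += z_computed > 3
            elif z_true > 3:
                bad += 1
                missed += z_computed < 2
    alpha = false_alerts / good if good else float("nan")
    beta = missed / bad if bad else float("nan")
    return alpha, beta


# Hypothetical situation: 30 participants, 2 results each, one outlier at z = 10.
alpha, beta = simulate_pt(30, 2, sigma_L=1.0, sigma_r=0.5, outlier_z=10.0)
```

With the plain mean and standard deviation used here, a computed |z| can never exceed (n-1)/√n for n participants, so no score above 3 is possible with 10 participants or fewer. Replacing `mean` and `stdev` by robust estimators would be the natural next step to explore.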

Abstract of conclusions:

This study demonstrates that:

  1. The ratio λ=σr/(σL×Nr) is of prime importance for the efficiency of a PT scheme, even more than the number of participants. PT providers should therefore pay attention to Nr, the number of test results per participant that they request;
  2. Even in adverse conditions, the α-risk is always very low (less than 0.7%);
  3. Robust algorithms improve the efficiency of the PT program (i.e. β-risk) at a slight expense in α-risk (which always remains very low). This comes from a significantly better estimation of the reference standard deviation when these algorithms are used and an outlier is present among the participants;
  4. Six participants are enough to detect a strongly outlying participant, provided that good PT conditions (i.e. a low value of λ) are present;
  5. PT with a low number of participants is (almost) always better than no PT at all.

ISO 5725-1 and ISO 13528 recommend not organising an ILC with fewer than 12 participants. This makes sense for ISO 5725-1, whose goal is to determine the performance of a test method. It makes less sense for ISO 13528, whose goal is to check the performance of a lab. Obviously, when no PT is organised, the β-risk is 100%: a lab that has a problem can never become aware of it. Consequently, for test methods that are performed by a small number of labs, it is clearly better to organise a PT with 6 participants than nothing. In those cases, the PT provider should pay special attention to the Nr it requests, to ensure a proper λ value and consequently the best possible efficiency.

Download the full text:

Beta risks in proficiency testing EN

Corresponding scientific publication