
Measuring Intergroup Agreement and Disagreement


Cytel's Madhusmita Panda presented at this year’s PSI Conference in the Innovative Methodology session on the topic of ‘Measuring Intergroup Agreement and Disagreement’.

In this blog, we share the context, abstract and slides from Panda’s presentation. 

Why is this a hot topic? 

Evaluating the extent of agreement between two or more raters is common in the social, behavioural and medical sciences. Agreement between raters is not only a subject of scientific research but also a problem frequently encountered in practice. There are numerous situations where agreement needs to be assessed. Examples include calibration of instruments, testing the reliability of scales or measures, assessing the equivalence of measurement tools, judgment in tests of ability, tests of repeatability or reproducibility, diagnostic analysis (interpersonal and intra-individual agreement), and agreement between a group of expert (in-house) assessors and a group of lay consumers.

How does measuring intergroup agreement and disagreement help in clinical trials? 

The biopharma industry faces many challenges when conducting multicentre trials to evaluate efficacy and identify safety issues for candidate drugs effectively and efficiently, while also addressing the requirements of regulatory authorities across the globe.

Several factors make a trial a complex process, including scale, design, geography and safety. With many investigators often involved, it is very important that they arrive at a consensus around the study findings. Sometimes agreement also needs to be checked between groups of investigators from different regions, or between physicians with different affiliations or experience.

Panda's talk on “Measuring Intergroup Agreement and Disagreement”

This work is motivated by the need to assess the degree of agreement between two independent groups of raters. The literature offers several measures of agreement. Cohen’s kappa (Cohen, 1960) is a popular measure of agreement between two raters who assign values on a categorical scale. Fleiss’ kappa (Fleiss, 1981) for categorical scales and Krippendorff’s alpha (Krippendorff, 2004) for continuous scales are suitable for evaluating agreement within a group of raters. Measuring agreement between two groups of raters requires a generalization, and two existing measures address this problem. Van Hoeij et al. (2004) considered a consensus method in which the ratings of a subject by the raters in a group are replaced by a single (most common) rating, thus reducing the problem to a two-rater problem. The other method, proposed by Vanbelle et al. (2009), is a generalization of Cohen’s kappa for qualitative data. We propose two new methods to measure agreement between two groups of raters for qualitative as well as quantitative data. The first is an agreement measure: the cube root of the product of the agreement values within each group and in the combined group of raters. The second is a disagreement measure, which is a quadratic form of difference vectors. The properties of the two measures are investigated using a resampling approach. The two proposed measures seem to have smaller variability than the measures available in the literature.
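To make the ideas in the abstract more concrete, here is a minimal Python sketch of Cohen's kappa, the consensus reduction, and a cube-root agreement measure for categorical ratings. It is an illustration under stated assumptions, not the implementation from the talk. The abstract does not say which within-group agreement value is used, so Fleiss' kappa is assumed here, and all function names are hypothetical.

```python
from collections import Counter

import numpy as np


def cohen_kappa(a, b):
    """Cohen's kappa for two raters rating the same subjects on a categorical scale."""
    a, b = np.asarray(a), np.asarray(b)
    p_o = np.mean(a == b)  # observed proportion of agreement
    # agreement expected by chance, from each rater's marginal frequencies
    p_e = sum(np.mean(a == c) * np.mean(b == c) for c in np.union1d(a, b))
    return (p_o - p_e) / (1.0 - p_e)


def fleiss_kappa(ratings):
    """Fleiss' kappa; `ratings` has shape (n_subjects, n_raters)."""
    ratings = np.asarray(ratings)
    n_subjects, n_raters = ratings.shape
    cats = np.unique(ratings)
    # counts[i, j] = number of raters assigning category j to subject i
    counts = np.stack([(ratings == c).sum(axis=1) for c in cats], axis=1)
    p_j = counts.sum(axis=0) / (n_subjects * n_raters)
    P_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    P_bar, P_e = P_i.mean(), (p_j ** 2).sum()
    return (P_bar - P_e) / (1.0 - P_e)


def consensus(ratings):
    """Most common rating per subject (ties go to the rating seen first)."""
    rows = np.asarray(ratings).tolist()
    return np.array([Counter(row).most_common(1)[0][0] for row in rows])


def consensus_kappa(group1, group2):
    """Consensus approach: reduce each group to one rating, then use Cohen's kappa."""
    return cohen_kappa(consensus(group1), consensus(group2))


def cube_root_agreement(group1, group2):
    """Cube root of the product of agreement within each group and in the
    combined group. ASSUMPTION: Fleiss' kappa is the agreement value."""
    g1, g2 = np.asarray(group1), np.asarray(group2)
    k1, k2 = fleiss_kappa(g1), fleiss_kappa(g2)
    k12 = fleiss_kappa(np.hstack([g1, g2]))  # all raters pooled
    return np.cbrt(k1 * k2 * k12)


if __name__ == "__main__":
    # Made-up ratings: 5 subjects, 3 raters in group 1 and 2 raters in group 2
    group1 = [[0, 0, 1], [1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 0, 0]]
    group2 = [[0, 0], [1, 1], [0, 1], [1, 1], [0, 0]]
    print("consensus kappa:    ", consensus_kappa(group1, group2))
    print("cube-root agreement:", cube_root_agreement(group1, group2))
```

The cube root keeps the combined value on the same scale as its three components, so it equals 1 only when agreement is perfect within each group and across the pooled raters. The disagreement measure is not sketched because the abstract does not define its difference vectors or weighting matrix.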

To access Panda’s slides from the PSI conference click the button below.

Download Slides 





Cytel biostatisticians and programmers are active and well regarded in industry associations and communities around the world. Would you like to join them?

We have opportunities at all levels to join our expanding team across our global locations. To find out more about rewarding careers with us, click below.

Explore Careers 




With many thanks to Madhusmita Panda, Associate Biostatistician at Cytel.


