Measuring Human Rights (9): When “Worse” Doesn’t Necessarily Mean “Worse”

I discussed in this older post some of the problems related to the measurement of human rights violations, and to the assessment of progress or deterioration. One of the problems I mentioned is caused by improvements in measurement methods. Such improvements can in fact result in a statistic showing increasing numbers of rights violations, whereas in reality the numbers may not be increasing, and perhaps even decreasing. Better measurement means that you now compare current data that are more complete and better measured, with older numbers of rights violations that were simply incomplete.

The example I gave was about rape statistics: better statistical and reporting methods used by the police, combined with less social stigma etc. result in statistics showing a rising number of rapes, but this increase was due to the measurement methods (and other effects), not to what happened in real life.
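To make the arithmetic behind this effect concrete, here is a minimal sketch. All the figures are made up purely for illustration, and the function name `reported_cases` is hypothetical; the only assumption is that the number showing up in the statistics is roughly the real number of cases multiplied by the share of cases that gets reported.

```python
def reported_cases(true_cases: int, reporting_rate: float) -> int:
    """Cases visible in official statistics, given the share that is reported."""
    return round(true_cases * reporting_rate)


# Hypothetical figures, not real data: the real number of violations falls,
# but the share of violations that reaches the statistics rises.
earlier = reported_cases(true_cases=1000, reporting_rate=0.10)
later = reported_cases(true_cases=800, reporting_rate=0.25)

print(f"earlier period: {earlier} reported cases")
print(f"later period:   {later} reported cases")
```

In this made-up case the statistic doubles (from 100 to 200 reported cases) while the real number of violations drops by a fifth.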

I have now come across another example. Collateral damage – or the unintentional killing of civilians during wars – seems to be higher now than a century ago (source). This may also be the result of better monitoring hiding a totally different trend. We all know that civilian deaths are much less acceptable now than they used to be, and that journalism and war reporting are probably much better (given better communication technology). Hence, people may now believe that it’s more important to count civilian deaths, and have better means to do so. As a result, the numbers of civilian deaths showing up in statistics will rise compared to older periods, but perhaps the real numbers don’t rise at all.

Of course, the increase in collateral damage may be the result of something other than better measurement: perhaps the lower level of acceptability of civilian deaths forces the army to classify some of those deaths as unintentional, even if they’re not (and then we have worse rather than better measurement). Or perhaps the relatively recent development of precision-guided munitions has made the use of munitions more widespread so that there are more victims: more bombs, even if they are more precise, can cause more victims than fewer but less precise bombs. Or perhaps the current form of warfare, with guerrilla troops hiding among populations, does indeed produce more civilian deaths.

Still, I think my point stands: better measurement of human rights violations can give the wrong impression. Things may look as if they’re getting worse, but they’re not.

Measuring Human Rights (8): Measurement of the Fairness of Trials and of Expert Witnesses

An important part of the system of human rights consists of the rules intended to offer those accused of crimes a fair trial in court. We try to treat everyone, even suspected criminals, with fairness, and we have two principal reasons for this:

  • We only want to punish real criminals. A fair trial is one in which everything is done to avoid punishing the wrong persons. We want to avoid miscarriages of justice.
  • We also want to use court proceedings only to punish criminals and deter crime, not for political or personal reasons, as is often the case in dictatorships.

Most of these rules are included in, for example, articles 9, 10, 14 and 15 of the International Covenant on Civil and Political Rights, article 10 of the Universal Declaration, article 6 of the European Convention on Human Rights, and the Sixth Amendment to the United States Constitution.

Respect for many of these rules can be measured statistically. I’ll mention only one here: the rule regarding the intervention of expert witnesses for the defense or the prosecution. Here’s an example of the way in which this aspect of a fair trial can be measured:

In the late 1990s, Harris County, Texas, medical examiner [and forensic specialist] Patricia Moore was repeatedly reprimanded by her superiors for pro-prosecution bias. … In 2004, a statistical analysis showed Moore diagnosed shaken baby syndrome (already a controversial diagnosis) in infant deaths at a rate several times higher than the national average. … One woman convicted of killing her own child because of Moore’s testimony was freed in 2005 after serving six years in prison. Another woman was cleared in 2004 after being accused because of Moore’s autopsy results. In 2001, babysitter Trenda Kemmerer was sentenced to 55 years in prison after being convicted of shaking a baby to death based largely on Moore’s testimony. The prosecutor in that case told the Houston Chronicle in 2004 that she had “no concerns” about Moore’s work. Even though Moore’s diagnosis in that case has since been revised to “undetermined,” and Moore was again reprimanded for her lack of objectivity in the case, Kemmerer remains in prison. (source)

Measuring Human Rights (7): Don’t Let Governments Make it Easy on Themselves

In many cases, the task of measuring respect for human rights in a country falls on the government of that country. It’s obvious that this isn’t a good idea in dictatorships: governments there will not present correct statistics on their own misbehavior. But if not the government, who else? Dictatorships aren’t known for their thriving and free civil societies, or for granting access to outside monitors. As a result, human rights protection can’t be measured.

The problem, however, of depending on governments for human rights measurement isn’t limited to dictatorships. I also gave examples of democratic governments not doing a good job in this respect. Governments, including democratic ones, tend to choose indicators they already have. For example, they count the number of people benefiting from government food programs (they have numbers for that), while neglecting private food programs for which information isn’t readily available. In this case, but in many other cases as well, governments choose indicators which are easy to measure, rather than indicators which measure what needs to be measured but which require a lot of effort and money.

Human rights measurement also fails to measure what needs to be measured when the people whose rights we want to measure don’t have a say on which indicators are best. And that happens a lot, even in democracies. Citizen participation is a messy thing and governments tend to want to avoid it, but the result may be that we’re measuring the wrong thing. For example, we think we are measuring poverty when we count the number of internet connections for disadvantaged groups, but these groups may consider the lack of cable TV or public transportation a much more serious deprivation. The reason we’re not measuring what we think we are measuring, or what we really need to measure, is not – as in the previous case – complacency, lack of budgets etc. The reason is a lack of consultation. Because there hasn’t been consultation, the definition of “poverty” used by those measuring human rights is completely different from the one used by those whose rights are to be measured. And, as a result, the indicators that have been chosen aren’t the correct ones, or they don’t show the whole picture. Many indicators chosen by governments are also too specific, measuring only part of the human right (e.g. free meals for the elderly instead of poverty levels for the elderly).

However, even if the indicators that are chosen are the correct ones – i.e. indicators that measure what needs to be measured, completely and not partially – it’s still the case that human rights measurement is extremely difficult, not only conceptually, but also and primarily on the level of execution. Not only are there many indicators to measure, but the data sources are scarce and often unreliable, even in developed countries. For example, let’s assume that we want to measure the human right not to suffer poverty, and that we agree that the best and only indicator to measure respect for this right is the level of income.* So we cleared up the conceptual difficulties. The problem now is data sources. Do you use tax data (taxable income)? We all know that there is tax fraud. Low income declared in tax returns may not reflect real poverty. Tax returns also don’t include welfare benefits etc.

Even if you manage to produce neat tables and graphs you always have to stop and think about the messy ways in which they have been produced, about the flaws and lack of completeness of the chosen indicators themselves, and about the problems encountered while gathering the data. Human rights measurement will always be a difficult thing to do, even under the best circumstances.

* This isn’t obvious. Other indicators could be level of consumption, income inequality etc. But let’s assume, for the sake of simplicity, that level of income is the best and only indicator for this right.

Measuring Human Rights (6): Don’t Make Governments Do It

In the case of dictatorial governments or other governments that are widely implicated in the violation of the rights of their citizens, it’s obvious that the task of measuring respect for human rights should be – where possible – carried out by independent non-governmental organizations, possibly even international or foreign ones (if local ones are not allowed to operate). Counting on the criminal to report on his crimes isn’t a good idea. Of course, sometimes there’s no other way. It’s often impossible to estimate census data, for example, or data on mortality, healthcare providers etc. without using official government information.

All this is rather trivial. The more interesting point, I hope, is that the same is true, to some extent, of governments that generally have a positive attitude towards human rights. Obviously, the human rights performance of these governments also has to be measured, because there are rights violations everywhere, and a positive attitude doesn’t guarantee positive results. However, even in such cases, it’s not always wise to trust governments with the task of measuring their own performance in the field of human rights. An example from a paper by Marilyn Strathern (source, gated):

In 1993, new regulations [required] local authorities in the UK … to publish indicators of output, no fewer than 152 of them, covering a variety of issues of local concern. The idea was … to make councils’ performance transparent and thus give them an incentive to improve their services. As a result, however,… even though elderly people might want a deep freeze and microwave rather than food delivered by home helps, the number of home helps [was] the indicator for helping the elderly with their meals and an authority could only improve its recognised performance of help by providing the elderly with the very service they wanted less of, namely, more home helps.

Even benevolent governments can make crucial mistakes like these. This example isn’t even a measurement error; it’s measuring the wrong thing. And the mistake wasn’t caused by the government’s will to manipulate, but by a genuine misunderstanding of what the measurement should be all about.

I think the general point I’m trying to make is that human rights measurement should take place in a free market of competing measurements – and shouldn’t be a (government) monopoly. Measurement errors are more likely to be identified if there is a possibility to compare competing measurements of the same thing.

Measuring Democracy (3): But What Kind of Democracy?

Those who want to measure whether countries are democratic or not, or want to measure to what degree countries are democratic, necessarily have to answer the question “what is democracy?”. You can’t start to measure democracy until you have answered this question, as in general you can’t start to measure anything until you have decided what it is you want to measure.

Two approaches to measuring democracy

As the concept of democracy is highly contestable – almost everyone has a different view on what it means to call a country a democracy, or to call it more or less democratic than another – it’s not surprising to see that most of the research projects that have attempted to measure democracy – such as Polity IV, Freedom House etc. – have chosen a different definition of democracy, and are, therefore, actually measuring something different. I don’t intend to give an overview of the differences between all these measures here (this is a decent attempt). What I want to do here is highlight the pros and cons of two extremely different approaches: the minimalist and the maximalist one. The former could, for example, view democracy as no more than a system of regular elections, and measure simply the presence or absence of elections in different countries. The latter, on the other hand, could include in its definition of democracy stuff like rights protections, freedom of the press, division of powers etc., and measure the presence or absence of all of these things, and aggregate the different scores in order to decide whether a country is democratic or not, and to what extent.

When measuring the democratic nature of different countries (and of course comparing them), should we use a minimalist or maximalist definition of democracy? Here are some pros and cons of either approach.


A minimalist definition makes it very difficult to differentiate between countries. It would make it possible to distinguish democracies (minimally defined) from non-democracies, but it wouldn’t allow us to measure the degree of democracy of a given country. I believe an ordinal scale with different ranks for different levels of quality of democracy in different countries (ranging from extremely poor quality, i.e. non-democracies, to perfect democracies) is more interesting than a binary scale limited to democracy/non-democracy. The use of a maximalist definition of democracy would make it possible to rank all types of regimes on such an ordinal scale. A maximalist definition of democracy would include a relatively large number of necessary attributes of democracy, and the combination of presence/absence/partial development of each attribute would almost make it possible to give each country a unique rank on the ordinal scale. Such a wide-ranging differentiation is an advantage for progress analysis. A binary scale does not give any information on the quality of democracy. Hence, it would be better to speak of measuring democratization rather than measuring democracy. And democratization not only in the sense of a transition from authoritarian to democratic governance, but also in the sense of progress towards a deepening of democratic rule.
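The difference between the two scales can be sketched in a few lines. This is only an illustration under stated assumptions: the two countries and their scores are invented, the four attributes are the ones mentioned earlier in this post, and the equal-weight sum is my own simplification rather than the method of Polity IV, Freedom House or any other existing index.

```python
from dataclasses import dataclass, astuple


@dataclass
class Attributes:
    # Each attribute is scored 0.0 (absent), 0.5 (partially developed)
    # or 1.0 (present). This scoring scheme is an assumption.
    regular_elections: float
    rights_protection: float
    press_freedom: float
    division_of_powers: float


def minimalist(a: Attributes) -> bool:
    """Binary scale: democracy means no more than regular elections."""
    return a.regular_elections == 1.0


def maximalist(a: Attributes) -> float:
    """Ordinal scale: equal-weight sum of all attributes, from 0.0 to 4.0."""
    return sum(astuple(a))


# Two invented countries that both hold regular elections.
country_a = Attributes(1.0, 1.0, 1.0, 1.0)
country_b = Attributes(1.0, 0.5, 0.0, 0.5)

for name, c in (("A", country_a), ("B", country_b)):
    print(name, minimalist(c), maximalist(c))
```

Under these assumptions the minimalist measure gives both countries the same result, whereas the maximalist measure separates them (4.0 versus 2.0). Any disagreement about which attributes belong in the class, or how they should be weighted, changes the maximalist ranking.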

A minimalist definition of democracy necessarily focuses on just a few attributes of democracy. As a result, it is impossible to differentiate between degrees of “democraticness” of different countries. Moreover, the chosen attributes may not be typical of or exclusive to democracy (such as good governance or citizen influence), and may not include some necessary attributes. For example, Polity IV, perhaps the most widely used measure of democracy, does not sufficiently incorporate actual citizen participation, as opposed to the mere right of citizens to participate. I think it’s fair to say that a country that gives its citizens the right to vote but doesn’t actually have many citizens voting, can hardly be called a democracy.

Acceptability of the measurement vs controversy

A disadvantage of maximalism is that the measurement will be more open to controversy. The more attributes of democracy are included in the measure, the higher the risk of disagreement on the model of democracy. As said above, people have different ideas about the number and type of necessary attributes of a democracy, even of an ideal democracy. If the only attribute of democracy retained in the analysis is regular elections, then there will be little controversy since few people would reject this attribute.


So we have to balance meaning against acceptability: a measurement system that is maximalist offers a lot of information and the possibility to compare countries beyond the simple dichotomy of democracy/non-democracy, but it may be rejected by those who claim that this system is not measuring democracy as they understand the word. A minimalist system, on the other hand, will measure something that is useful for many people – no one will contest that elections are necessary for democracy, for instance – but will also reduce the utility of the measurement results because it doesn’t yield a lot of information about countries.

Measuring Human Rights (5): Some (Insurmountable?) Problems

If you care about human rights, it’s extremely important to measure the level of protection of human rights in different countries, as well as the level of progress or deterioration. Measurement in the social sciences is always tricky; we’re dealing with human behavior and not with sizes, volumes, speeds etc. However, measuring human rights is especially difficult.

Some examples. I talked about the so-called catch-22 of human rights measurement. In order to measure whether countries respect human rights, one already needs respect for human rights. Organizations, whether international organizations or private organizations (NGOs), must have some freedom to monitor, to engage in fact finding, to enter countries and move around, to investigate “in situ”, to denounce etc. Victims should have the freedom to speak out and to organize themselves in pressure groups. So we assume what we want to establish.

The more violations of human rights, the more difficult it is to monitor respect for human rights. The more oppressive the regime, the harder it is to establish the nature and severity of its crimes; and the harder it is to correct the situation.

So a country which does a very bad job protecting human rights may not have a low score, because the act of giving the country a correct score is made impossible by its government. On the other hand, a low score for human rights (or certain human rights) may not be as bad as it seems, because at least it was possible to determine a score.

Another example: suppose a country shows a large increase in the number of rapes. At first sight, this is a bad thing, and would mean giving the country a lower score on certain human rights (such as violence against women, gender discrimination etc.). But perhaps the increase in the number of rapes is simply the result of a larger number of rapes being reported to the police. And better reporting of rape may be the result of a more deeply and widely ingrained human rights culture, or, in other words, it may be the reflection of a growing consciousness of women’s rights and gender equality.

So, a deteriorating score may actually hide progress.

The same can be said of corruption or police brutality. A deteriorating score may simply be a matter of perception, a perception created by more freedom of the press.

I don’t know how to solve these problems, but I think it’s worth mentioning them. They are probably the reason why there is so little good measurement in the field of human rights, and so much anecdotal reporting.

Measuring Democracy (2): Polity IV, and Some of Its Problems

Polity IV is, like Freedom House and others, a project ranking countries according to their political regime type. It’s extensively used in comparative and causal analyses that require a distinction between democracies and non-democracies, partly because its time series start from the year 1800.


[…] perspective envisions a spectrum of governing authority that spans from fully institutionalized autocracies through mixed, or incoherent, authority regimes (termed “anocracies”) to fully institutionalized democracies. The “Polity Score” captures this regime authority spectrum on a 21-point scale ranging from -10 (hereditary monarchy) to +10 (consolidated democracy). (source)

The Polity Score is the aggregate of 6 component measures that aim to record what are called key qualities of democracies: executive recruitment, constraints on executive authority, and political competition.

However, it seems that Polity IV doesn’t adequately measure what it claims to measure. Its concept of democracy is quite thin, resulting in a fair number of “perfect democracies”, whereas we all know that there is no such thing in the world we live in. And other countries, which are obviously dictatorial, are classified as fairly democratic. A quote from this paper (which is an attempt to improve Polity IV):

Polity’s 21-point democracy/autocracy scale, illustrated by the dashed line [in the figure below], tracks the major changes in British political history, but only roughly. The Reform Bill of 1832 revised a complicated system of determining the franchise by increasing the number of voters from 500,000 to 813,000. Despite the modesty of this expansion, changes in the Polity Score for Britain give a sense of greatly expanded democracy, moving from a -2 (democracy=4, autocracy=6) to a +3 (democracy=6, autocracy=3).

However, … only six percent of the adult population voted even after the reform.

While the male franchise had broadened considerably by 1884, suffrage still excluded agricultural workers and servants. Actual voter turnout reached 12% of the population only in the election of 1885 before falling, and didn’t return to that level again until 1918. All the while, Polity scores for executive recruitment and competition increased while institutionalized autocracy decreased. In 1880 the Polity democracy score stood at 7 (autocracy=0). By 1901 the democracy score rose to 8 and by 1922 Polity suggests that Britain was a “perfect 10” democracy, even though full male suffrage was not achieved until 1918 and full female suffrage until 1928.

Britain has received the highest democracy rating ever since, even though the voting rate has never exceeded 60% of the adult population.

The high scores that Britain receives from 1880 on are misleading and, with respect to changes in participation, mistimed. As Figure 1 illustrates, participation doubled during a period Polity records as unchanged and doubled again during a modest 2 point move in Polity.

The racial exclusion in South Africa also demonstrates the danger of conceiving democracy without taking account of the breadth of citizen participation. According to Polity, South Africa was a relatively stable democracy from 1910 until 1989. It was coded a 7 out of 10 on democracy and a 3 of 10 on autocracy, bringing its score to +4. A positive score is surprising because it ignores the exclusion of the 90 percent of the population that did not – most could not – vote.

Switzerland, our final example, has scored a perfect 10 out of 10 on democracy in the Polity dataset since 1848, even though women – roughly half the population – were not granted the right to vote until 1971, 123 years later. Furthermore, electoral turnout has hovered around 30% recently, despite virtually universal suffrage. One reason is that Switzerland’s collective executive is an organizational form that diminishes voter motivation by minimizing the significance of election outcomes. Surely such a system should be regarded as less democratic than one in which most citizens participate in elections that actually make a difference in the leadership and policies of the nation.
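The scores quoted above are consistent with a simple subtraction: the combined Polity Score is the democracy score minus the autocracy score, each on a 0 to 10 scale, which gives the 21-point range from -10 to +10. Here is a minimal sketch that reproduces the quoted figures; the function name `polity_score` is my own, and the sketch doesn't attempt to model how the component measures produce the two sub-scores.

```python
def polity_score(democracy: int, autocracy: int) -> int:
    """Combined score on the -10..+10 scale: democracy minus autocracy."""
    for name, value in (("democracy", democracy), ("autocracy", autocracy)):
        if not 0 <= value <= 10:
            raise ValueError(f"{name} score must be between 0 and 10, got {value}")
    return democracy - autocracy


# (democracy, autocracy) pairs taken from the passages quoted above.
quoted_cases = {
    "Britain before the 1832 reform": (4, 6),
    "Britain after the 1832 reform": (6, 3),
    "South Africa, 1910-1989": (7, 3),
}

scores = {label: polity_score(d, a) for label, (d, a) in quoted_cases.items()}
for label, score in scores.items():
    print(f"{label}: {score:+d}")
```

Nothing in this calculation refers to how many people are allowed to vote or actually do vote, which is exactly the gap the quoted paper points at.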