Democratic societies are built around the principle of free and fair elections, that each citizen’s

vote should count equally. National elections can be regarded as large-scale social experiments, where

people are grouped into usually large numbers of electoral districts and vote according to their

preferences. The large number of samples implies certain statistical consequences for the polling

results which can be used to identify election irregularities. Using a suitable data collapse, we find that vote distributions of elections with alleged fraud show a kurtosis a hundred times larger than that of

normal elections. As an example we show that reported irregularities in the 2011 Duma election are

indeed well explained by systematic ballot stuffing and develop a parametric model quantifying to what extent fraudulent mechanisms are present. We show that if specific statistical properties are

present in an election, the results do not represent the will of the people. We formulate a parametric

test detecting these statistical properties in election results. For demonstration the model is also

applied to election outcomes of several other countries.

Free and fair elections are the cornerstone of every

democratic society [1]. A central characteristic of elec-

tions being free and fair is that each citizen’s vote counts

equally. However, Joseph Stalin already believed that

”The people who cast the votes decide nothing. The

people who count them decide everything.” How can it

be distinguished whether an election outcome represents

the will of the people or the will of the counters?

Elections are fascinating, large scale social experi-

ments. A country is segmented into a usually large

number of electoral districts. Each district represents

a standardized experiment where each citizen articulates

his/her political preference via a ballot. Despite differences in e.g. income levels, religions, ethnicities, etc.

across the populations in these districts, outcomes of

these experiments have been shown to follow certain uni-

versal statistical laws [2, 3]. Huge deviations from these

expected distributions have been reported for the votes

for United Russia, the winning party in the 2011 Duma

election [4, 5].

In general, using an appropriate re-scaling of election data, the distributions of votes and turnout are approximately Gaussian [3]. Let $W_i$ be the number of votes for the winning party and $N_i$ the number of voters in electoral district $i$; then the logarithmic vote rate is $\nu_i = \log\frac{N_i - W_i}{W_i}$. In figure 2 we show the distribution of $\nu_i$ over all electoral districts. To first order the data from different countries collapse to a Gaussian. Clearly

the data for Russia and Uganda fall well out of line. Skewness and kurtosis are listed for each data-set in table SII, confirming these observations quantitatively. Most strikingly, the kurtosis of the distributions for Russia (2003, 2007 and 2011) and Uganda deviates by two orders of magnitude from that of every other country. The only reasonable conclusion from this is that the voting results in Russia and Uganda are driven by other mechanisms or processes than those in other countries.
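The rescaling and the moment comparison can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy district data are assumptions, and the logarithmic vote rate is taken here as log((N_i − W_i)/W_i). Districts in which the winner received none or all of the votes are skipped, because the logarithm is undefined there.

```python
import math
import random

def log_vote_rates(winner_votes, voters):
    """nu_i = log((N_i - W_i) / W_i); districts with W_i = 0 or W_i = N_i are skipped."""
    return [math.log((n - w) / w)
            for w, n in zip(winner_votes, voters)
            if 0 < w < n]

def standardize(values):
    """Rescale to zero mean and unit variance (population variance)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))
    return [(x - mean) / std for x in values]

def skewness(values):
    """Third standardized moment; 0 for a symmetric distribution."""
    z = standardize(values)
    return sum(x ** 3 for x in z) / len(z)

def kurtosis(values):
    """Fourth standardized moment; equals 3 for a Gaussian."""
    z = standardize(values)
    return sum(x ** 4 for x in z) / len(z)

# Toy data (hypothetical): 2000 districts with 1000 voters each
rng = random.Random(0)
voters = [1000] * 2000
winner = [min(999, max(1, round(1000 * rng.gauss(0.4, 0.05)))) for _ in voters]
nu = log_vote_rates(winner, voters)
print(skewness(nu), kurtosis(nu))
```

For the toy data the kurtosis stays close to the Gaussian value; a cluster of districts with almost all votes for one party would push it far above that.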

However, such distributions only reveal part of the

story, and a different representation of the data becomes

helpful to gain a deeper understanding. Figure 1 shows

a 2-d histogram of the number of electoral districts for a

given fraction of voter turnout (x-axis) and for the per-

centage of votes for the winning party (y-axis). Results

are shown for recent parliamentary elections in Austria,

Finland, Russia, Spain, Switzerland, and the UK, and

presidential elections in the USA and Uganda. Data were obtained from official election homepages of the respective countries; for more details and more election results, see SOM. These figures can be interpreted as fingerprints of several processes and mechanisms leading to the overall election results. For Russia and Uganda the shape of these fingerprints is immediately seen to differ from the other countries. In particular there is a large number of districts (thousands) with a 100% turnout and at the same time 100% of votes for the winning party.
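As a rough sketch of how such a fingerprint can be built, the following Python snippet bins districts by turnout and by the winner's vote share. The function names, the bin count and the example values are assumptions made for illustration only.

```python
def fingerprint(turnouts, vote_shares, bins=100):
    """Count districts per (turnout, vote share) cell; inputs are fractions in [0, 1].

    Returns a bins x bins grid indexed as grid[vote_bin][turnout_bin].
    """
    grid = [[0] * bins for _ in range(bins)]
    for a, v in zip(turnouts, vote_shares):
        col = min(int(a * bins), bins - 1)   # turnout on the x-axis
        row = min(int(v * bins), bins - 1)   # winner's vote share on the y-axis
        grid[row][col] += 1
    return grid

def corner_count(grid, cells=1):
    """Number of districts in the top-right cells (near 100% turnout and 100% votes)."""
    return sum(sum(row[-cells:]) for row in grid[-cells:])

# Hypothetical example: five districts, two of them in the suspicious corner
turnouts = [0.62, 0.58, 1.0, 1.0, 0.71]
shares = [0.41, 0.39, 1.0, 0.99, 0.45]
grid = fingerprint(turnouts, shares, bins=10)
print(corner_count(grid))
```

Counting the districts in the upper right corner gives a quick numerical check of the second peak that is visible in the plots.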

The shape of these irregularities can be understood by assuming the presence of the fraudulent action of ballot stuffing. This means that bundles of ballots with votes for one party are stuffed into the

urns. Videos purportedly documenting these practices

are openly available on online platforms [6–8]. In one

case the urn is already filled with ballots before the elec-

tions start, e.g. [6], in other cases members of the elec-

tion commission are caught filling out ballots, e.g. [7].

Yet in another case the pens in the polling stations are

shown to be erasable, e.g. [8]. Are these incidents non-

representative exceptions or the rule?

We develop a parametric model to quantify the extent of ballot stuffing for a given party to explain the election fingerprints in figure 1. The distributions for Russia and Uganda are clearly bimodal. One peak lies at intermediate levels of turnout and votes and is smeared towards the upper right parts of the plot. The second peak is situated in the vicinity of the 100% turnout, 100% votes point. This

suggests two modes of fraud mechanisms, incremental

and extreme fraud. Incremental fraud means that with a

given rate ballots for one party are added to the urn and

votes for other parties are replaced. This occurs within

a fraction fi of electoral districts. In the election finger-

prints in figure 1 these districts are shifted to the upper

right. Extreme fraud corresponds to reporting nearly all

votes for a single party with an almost complete voter

turnout. This happens in a fraction fe of districts, which

form a second cluster near 100% turnout and votes for

the incumbent party.

For simplicity in the model we assume that within each

electoral district turnout and voter preferences follow a

Gaussian distribution with the mean and standard devi-

ation taken from the actual sample, see figure S2. With

probability fi (fe) the incremental (extreme) fraud mech-

anisms are then applied. Note that if more detailed as-

sumptions are made about possible mechanisms leading

to large-scale heterogeneities in the data such as city-country differences in turnout (UK) or coast–non-coast (USA) (see SOM), this will have an effect on the estimate of fi. Figure 3 compares the observed and modeled fingerprint plots for the winning parties in Russia,

Uganda and Switzerland. Model results are shown for

fi = fe = 0 (fair elections) and for best fits to the data

(see SOM) for fi and fe. To describe the smearing from

the main peak to the upper right corner, an incremental

fraud probability around fi = 0.64 is needed for the case

of United Russia. This means fraud in about 64% of the

districts. In the second peak, close to 100% turnout, there are roughly 3,000 districts with 100% of votes for United Russia, representing an electorate of

more than two million people. Best fits yield fe = 0.05,

i.e. five percent of all electoral districts experience ex-

treme fraud. A more detailed comparison of the model

performance for the Russian parliamentary elections of

2003, 2007 and 2011 is found in the figure S3. The fraud

parameters for the Uganda data in figure 3 are fi = 0.45

and fe = 0.01.
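The two fraud modes can be illustrated with a small simulation. The sketch below is a simplified reading of the description above and not the fitted model from the SOM. It rests on several assumptions: the function and parameter names are hypothetical, the two mechanisms are applied exclusively, the same fraction x of non-voters' ballots and of opposition ballots is shifted to the winner, and the incremental width of 0.2 is an arbitrary choice. The extreme width 0.075 is borrowed from the value quoted in the supporting material.

```python
import random

def _clip(x):
    """Restrict a value to the interval [0, 1]."""
    return min(1.0, max(0.0, x))

def simulate_district(rng, mean_turnout, sd_turnout, mean_share, sd_share,
                      f_i, f_e, width_incremental=0.2, width_extreme=0.075):
    """Return (turnout, winner vote share) for one simulated district."""
    turnout = _clip(rng.gauss(mean_turnout, sd_turnout))
    share = _clip(rng.gauss(mean_share, sd_share))

    u = rng.random()
    if u < f_e:
        # extreme fraud: nearly all remaining ballots end up with the winner
        x = _clip(1.0 - abs(rng.gauss(0.0, width_extreme)))
    elif u < f_e + f_i:
        # incremental fraud: a moderate fraction of ballots is shifted
        x = _clip(abs(rng.gauss(0.0, width_incremental)))
    else:
        return turnout, share              # fair district

    # all quantities are fractions of the district's electorate
    winner = turnout * share
    opposition = turnout * (1.0 - share)
    stuffed = x * (1.0 - turnout)          # ballots added on behalf of non-voters
    switched = x * opposition              # opposition votes replaced
    new_turnout = turnout + stuffed
    new_winner = winner + stuffed + switched
    return new_turnout, (new_winner / new_turnout if new_turnout > 0 else 0.0)

# Hypothetical usage with the fraud probabilities quoted for United Russia
rng = random.Random(42)
districts = [simulate_district(rng, 0.6, 0.1, 0.4, 0.1, f_i=0.64, f_e=0.05)
             for _ in range(10000)]
```

Plotting the simulated pairs as a 2-d histogram gives a main cluster smeared towards the upper right and a second cluster near full turnout, which is the qualitative pattern the text attributes to incremental and extreme fraud.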

The dimension of election irregularities can be visual-

ized with the cumulative number of votes as a function

of the turnout, figure 4. For each turnout level the total number of votes from districts with this or a lower turnout is shown. Each curve corresponds to the respective election winner in a different country. Normally these cumulative distributions level off and form a plateau from the party’s maximal vote count on. Again this is not the case for

Russia and Uganda. Both show a boost phase of in-

creased extreme fraud toward the right end of the distri-

bution (red circles). Russia never even shows a tendency

to form a plateau.
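The cumulative curve is straightforward to compute from district-level data. The sketch below is an illustration under assumptions: the function names are made up for this example, and `tail_share` is a hypothetical summary number, not a statistic used in the text.

```python
def cumulative_votes(turnouts, winner_votes, levels=100):
    """Total winner votes from districts with turnout <= each level (equal steps up to 1)."""
    totals = []
    for k in range(1, levels + 1):
        threshold = k / levels
        totals.append(sum(w for a, w in zip(turnouts, winner_votes) if a <= threshold))
    return totals

def tail_share(cumulative, from_level=0.9):
    """Fraction of all winner votes contributed by districts above `from_level` turnout."""
    idx = int(round(from_level * len(cumulative))) - 1
    return 1.0 - cumulative[idx] / cumulative[-1]

# Hypothetical example: most of the winner's votes come from very high turnout districts
turnouts = [0.55, 0.65, 0.95, 1.0]
votes = [100, 200, 300, 400]
curve = cumulative_votes(turnouts, votes, levels=10)
print(curve, tail_share(curve))
```

A curve that has reached its plateau gives a tail share close to zero, whereas a boost near complete turnout shows up as a large tail share.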

It is imperative to emphasize that the shape of the fin-

gerprints in figure 1 will deviate from pure 2-d Gaussian

distributions due to non-fraudulent mechanisms, such as

heterogeneities in the population or voter mobilization,

see SOM. However, these can under no circumstances ex-

plain the mode of extreme fraud. A bad forgery is the

ultimate insult1.

It can be said with near certainty that an election

does not represent the will of the people, if a substantial

fraction (fe) of districts reports a 100% turnout with al-

most all votes for a single party, and/or if any significant

deviations from the sigmoid form in the cumulative dis-

tribution of votes versus turnout are observed. Another

indicator of systematic fraudulent or irregular voting be-

havior is a kurtosis of the logarithmic vote rate distribu-

tion of the order of several hundreds.

Should such signals be detected it is tempting to in-

voke G.B. Shaw who held that ”[d]emocracy is a form of

government that substitutes election by the incompetent

many for appointment by the corrupt few.”

 

FIG. 1. Election fingerprints: 2-d histograms of the num-

ber of electoral districts for a given voter turnout (x-axis)

and the percentage of votes (y-axis) for the winning party (or

candidate) in recent elections from eight different countries

(from left to right, top to bottom: Austria, Finland, Russia,

Spain, Switzerland, Uganda, UK and USA) are shown. Color

represents the number of electoral districts. Districts usually

cluster around a given turnout and voting level. In Uganda

and Russia these clusters are ’smeared out’ to the upper right

region of the plots, reaching a second peak at a 100% turnout

and a 100% of votes (red circles). In Finland the main cluster

is smeared out into two directions (indicative of voter mo-

bilization due to controversies surrounding the True Finns).

In the UK the fingerprint shows two clusters stemming from

rural and urban areas (see SOM).

 

 

FIG. 2. A simple way to compare data from different elections in different countries is to present the distributions of

the logarithmic vote rates νi of the winning parties as rescaled

distributions with zero-mean and unit-variance [3]. Large de-

viations from other countries can be seen for Uganda and

Russia with the naked eye.

 

FIG. 3. Comparison of observed and modeled 2-d histograms

for (top to bottom) Russia, Uganda and Switzerland. The

left column shows the actual election fingerprints, the middle

column shows a fit with the fraud model. The column to

the right shows the expected model outcome of fair elections

(i.e. absence of fraudulent mechanisms fi = fe = 0). For

Switzerland the fair and fitted model are almost the same.

The results for Russia and Uganda can only be explained by

the model assuming a large number of fraudulent districts.

 

FIG. 4. The ballot stuffing mechanism can be visualized by

considering the cumulative number of votes as a function of

turnout. Each country’s election winner is represented by a

curve which typically takes the shape of a sigmoid function

reaching a plateau. In contrast to the other countries, Russia

and Uganda do not tend to develop this plateau but instead

show a pronounced increase (boost) close to complete turnout.

Both irregularities are indicative of the two ballot stuffing

modes being present.

 

 

SUPPORTING ONLINE MATERIAL

The data

Descriptive statistics and official sources of

the election results are shown in table SI. The

raw data will be made available for download at

http://www.complex-systems.meduniwien.ac.at/.

Each data set reports election results of parliamentary

(Austria, Finland, Russia, Spain, Switzerland and UK)

or presidential (Uganda, USA) elections on district

level. In the rare circumstances where electoral districts

report more valid ballots than registered voters, we work

with a turnout of 100%. With the exception of the US

data, each country reports the number of registered

voters and valid ballots for each party and district. For

the US there is no exact data on the voting eligible

population on district level, which was estimated to be

the same as the population above 18 years, available

at http://census.gov. Fingerprints for the 2000 US

presidential elections are shown in figure S1 for both

candidates for districts from the entire USA and Florida

only. There are no irregularities discernible.

Model

A country is separated into n electoral districts i, each

having an electorate of Ni people and in total Vi valid

votes. The fraction of valid votes for the winning party

in district i is denoted vi. The average turnout over all

districts, $\bar{a}$, is given by $\bar{a} = \frac{1}{n}\sum_i (V_i/N_i)$ with standard deviation $s_a$; the mean fraction of votes $\bar{v}$ for the winning party is $\bar{v} = \frac{1}{n}\sum_i v_i$ with standard deviation $s_v$. The mean values $\bar{a}$ and $\bar{v}$ are typically close to but

not identical to the values which maximize the empirical

distribution function of turnout and votes over all dis-

tricts. Let v be the number of votes where the empirical

distribution function assumes its (first local) maximum

(rounded to entire percents), see figure S2. Similarly a

is the turnout where the empirical distribution function

of turnouts ai takes its (first local) maximum. The dis-

tributions for turnout and votes are extremely skewed to

the right for Uganda and Russia which also inflates the

standard deviations in these countries, see table SI. To account for this a ’left-sided’ (’right-sided’) mean deviation $\sigma^L_v$ ($\sigma^R_v$) from v is introduced. $\sigma^R_v$ can be regarded as the incremental fraud width, a measurable parameter quantifying how intense the vote stuffing is. This contributes to the ’smearing out’ of the main peaks in the election fingerprints, see figure 1 in the main text. The larger $\sigma^R_v$, the more inflated the vote results due to urn stuffing, in contrast to $\sigma^L_v$ which quantifies the scatter of the voters’ actual preferences. They can be estimated from the data by

$$\sigma^L_v = \sqrt{\langle (v_i - v)^2 \rangle_{v_i < v}} , \qquad (1)$$

$$\sigma^R_v = \sqrt{\langle (v_i - v)^2 \rangle_{v_i > v}} . \qquad (2)$$
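Equations (1) and (2) translate directly into code. The sketch below is a minimal illustration under assumptions: the function names are hypothetical, and the position v of the maximum is taken as the global maximum of the histogram in whole percent (ties resolved towards the smaller value), which is a simplification of the "first local maximum" described in the text.

```python
import math
from collections import Counter

def mode_percent(shares):
    """Most frequent vote share after rounding to whole percent, returned as a fraction."""
    counts = Counter(round(100 * s) for s in shares)
    best = max(counts.items(), key=lambda kv: (kv[1], -kv[0]))[0]
    return best / 100

def one_sided_widths(shares, v):
    """Root mean square deviation from v, separately for shares below and above v."""
    left = [(s - v) ** 2 for s in shares if s < v]
    right = [(s - v) ** 2 for s in shares if s > v]
    sigma_l = math.sqrt(sum(left) / len(left)) if left else 0.0
    sigma_r = math.sqrt(sum(right) / len(right)) if right else 0.0
    return sigma_l, sigma_r

# Hypothetical vote shares: one district far to the right of the peak inflates sigma_r
shares = [0.30, 0.40, 0.40, 0.40, 0.50, 0.90]
v = mode_percent(shares)
sigma_l, sigma_r = one_sided_widths(shares, v)
print(v, sigma_l, sigma_r)
```

In this toy example the right-sided width is several times the left-sided one, which is the asymmetry the text associates with inflated vote results.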

Similarly the extreme fraud width σx can be estimated,

i.e. the width of the peak around 100% votes. We found

that σx = 0.075 describes all encountered vote distribu-

 

FIG. S1. Turnout against percentage of votes for Bush (left column) and Gore (right) in the 2000 US presidential elections.

Results are shown for all districts in the USA (top row) and

for districts from Florida (bottom). There are no traces of

fraudulent mechanisms discernible in the fingerprints.

 

FIG. S2. A stylized version of an empirical vote distribution function shows how v, $\sigma^L_v$, $\sigma^R_v$ and $\sigma_x$ are derived from the election results. v is the position of the maximum of the distribution function. $\sigma^L_v$ measures the distribution width of values to the left of v, i.e. smaller than v. The incremental fraud width $\sigma^R_v$ measures the distribution width of values to the right of v,

i.e. larger than v. The extreme fraud width σx is the width

of the peak at 100% votes.