Citizens' Initiatives Not So Private
“Liu Shaoqi subjected to public criticism” by Unknown Author is in the Public Domain, via Wikimedia Commons.
Finland’s leading daily newspaper, Helsingin Sanomat, recently ran a hit piece on people who had supported citizens’ initiatives that the newspaper considers verboten. These included initiatives to ban political strikes and to ban Extinction Rebellion. One initiative asked for budget cuts to YLE, the Finnish equivalent of the BBC, with similar problems.
For many, getting a phone call from a journalist and having their photo published in the newspaper together with their employer came as a surprise. After all, the kansalaisaloite.fi service advertises itself like this:
“The system does not store the social security number of the person signing a statement of support. For each statement of support, an initiative specific oneway hash is calculated using the personal identity number, which securely prevents the same initiative from being supported multiple times. Based on this information, it is not possible to determine which different initiatives an individual person has supported in the service. The data of the statements of support are stored in the database strongly encrypted. The actions of the system’s administrators are recorded in a log.”
So how was the article possible? Let’s look at the code.
The Hash
The FAQ claims that the service uses a one-way hash to store the personal identity number, i.e. the SSN. This seems to be true. The hash is created by first concatenating the initiative id, the SSN and a secret salt (the password field in the code). This concatenation is then hashed with SHA-256. The digest is binary, so it is converted to a Base64 string before being saved to the database.
    public String initiativeSupportHash(Long initiativeId, String ssn) {
        Assert.notNull(initiativeId, "initiativeId");
        Assert.hasText(ssn, "ssn");
        StringBuilder message = new StringBuilder(256)
                .append(initiativeId).append(MAC_DELIM)
                .append(ssn).append(MAC_DELIM)
                .append(password);
        return base64Encode(sha256(message.toString(), DEFAULT_ENCODING), DEFAULT_ENCODING);
    }
So far so good. This prevents duplicate votes while keeping the actual SSN private. However, anyone with access to the secret salt can brute-force the SSN from the hashes. The set of valid personal identity numbers is small, so this would not take long on modern hardware.
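To make the brute-force point concrete, here is a minimal sketch of why the search space is so small. It is not the service's code. The class and method names are hypothetical, the "&" delimiter is an assumption taken from the schema comment (the real MAC_DELIM value is not shown here), and UTF-8 is assumed for DEFAULT_ENCODING. For a single birth date and century sign there are at most 1,000 candidate identity numbers, because the final check character is fully determined by the other nine digits.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class SupportHashSketch {

    // Assumption: stands in for MAC_DELIM, whose real value is not shown.
    static final String DELIM = "&";

    // Check characters of the Finnish personal identity code, indexed by
    // (nine-digit number formed from DDMMYY and the individual number) mod 31.
    static final String CHECK_CHARS = "0123456789ABCDEFHJKLMNPRSTUVWXY";

    /** Rebuilds base64(sha256(initiativeId & ssn & secret)) as described above. */
    public static String supportHash(long initiativeId, String ssn, String secret) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            String message = initiativeId + DELIM + ssn + DELIM + secret;
            byte[] digest = sha256.digest(message.getBytes(StandardCharsets.UTF_8));
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is required on every Java platform", e);
        }
    }

    /**
     * Tries every individual number (000-999) for one birth date and returns
     * the identity code whose hash matches, or null if none does.
     */
    public static String recoverSsn(long initiativeId, String storedHash, String secret,
                                    String ddmmyy, char centurySign) {
        for (int individual = 0; individual <= 999; individual++) {
            String nnn = String.format("%03d", individual);
            int remainder = (int) (Long.parseLong(ddmmyy + nnn) % 31);
            String candidate = ddmmyy + centurySign + nnn + CHECK_CHARS.charAt(remainder);
            if (supportHash(initiativeId, candidate, secret).equals(storedHash)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // 131052-308T is a made-up sample value, not a real person's number.
        String stored = supportHash(42L, "131052-308T", "not-the-real-secret");
        System.out.println(recoverSsn(42L, stored, "not-the-real-secret", "131052", '-'));
    }
}
```

Even without knowing the birth date, looping over every date in a century only multiplies the work by a few tens of thousands, which is still a trivial number of SHA-256 evaluations. The secret is the only thing standing between the hash and the SSN.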
The Database
Looking at the schema, there are three interesting columns: initiative_id, support_id and details.
    create table support_vote (
        initiative_id bigint NOT NULL,
        -- base64(sha256(initiative_id & ssn & sharedSecret))
        support_id varchar(64) NOT NULL,
        -- aes( name & birthDate & municipality)
        details varchar(4096) NOT NULL,
        created timestamp NOT NULL DEFAULT now(),
        batch_id bigint,
        PRIMARY KEY (initiative_id, support_id),
        FOREIGN KEY (initiative_id) REFERENCES initiative(id),
        FOREIGN KEY (batch_id) REFERENCES support_vote_batch(id)
    );
The initiative_id column identifies the initiative being supported. The hash from above is saved to the support_id column. The most interesting part is the details column. It contains the full name, date of birth and home municipality of the person casting the vote, along with the time of the vote.
    private String buildVoteDetails(User user, Locale locale, DateTime now) {
        return new StringBuilder(128)
                // Last name
                .append(user.getLastName())
                .append(VRK_CSV_DELIM)
                // First names
                .append(user.getFirstNames())
                .append(VRK_CSV_DELIM)
                // Date of birth
                .append(user.getDateOfBirth().toString(VRK_DTF))
                .append(VRK_CSV_DELIM)
                // Home municipality
                .append(user.getHomeMunicipality().getTranslation(locale))
                .append(VRK_CSV_DELIM)
                // Voting time
                .append(now.toString(VRK_DTF))
                .toString();
    }
At least it is encrypted with AES before being saved to the database.
    public EncryptionService(String password, String vetumaSharedSecret, int secureRandomReseedInterval, int encryptorPoolSize) {
        if (Strings.isNullOrEmpty(password)) {
            throw new IllegalArgumentException("registeredUserSecret was null or empty");
        }
        this.vetumaSharedSecret = vetumaSharedSecret;
        this.password = password;
        this.secureRandomReseedInterval = secureRandomReseedInterval;
        aesEncryptor = new PooledPBEStringEncryptor();
        aesEncryptor.setProvider(new BouncyCastleProvider());
        aesEncryptor.setAlgorithm(DEFAULT_ALGORITHM);
        aesEncryptor.setPassword(password);
        aesEncryptor.setKeyObtentionIterations(DEFAULT_KEY_OBTENTION_ITERATIONS);
        aesEncryptor.setPoolSize(encryptorPoolSize);
    }

    public String encrypt(String message) {
        // Add encryptor id to encrypted message
        return new StringBuilder(message.length() * 2)
                .append(DEFAULT_ENCRYPTOR_ID)
                .append(aesEncryptor.encrypt(message)).toString();
    }
Observant readers have probably already spotted the problem.
No, it is not the fact that only one PBKDF iteration is used for key derivation. It is the fact that each user's full name, date of birth and home municipality are saved in the same row of the same database table as the actual vote. You do not even need the SSN to find all the initiatives an individual person has supported in the service. Once the details column is decrypted, you just query by name and birthday.
Yes, if we are nitpicking, the sentence “Based on this information, it is not possible to determine which different initiatives an individual person has supported in the service” is technically correct, since “this information” refers to the SSN hash. It is still misleading, because anyone with access to the database and the encryption password can easily find the different initiatives supported by each individual.
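As a rough illustration of the lookup by name and birthday, here is a sketch that groups already decrypted rows by person. Everything in it is hypothetical: the DecryptedVote holder, the ";" delimiter standing in for VRK_CSV_DELIM, and the date format in the sample data are assumptions, since the real values are not shown here. It only assumes the field order produced by buildVoteDetails: last name, first names, date of birth, municipality, voting time.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Pattern;

public class VoteLookupSketch {

    // Assumption: stands in for VRK_CSV_DELIM, whose real value is not shown.
    static final String DELIM = ";";

    /** Hypothetical holder for one support_vote row after decrypting details. */
    public static final class DecryptedVote {
        final long initiativeId;
        final String details;

        public DecryptedVote(long initiativeId, String details) {
            this.initiativeId = initiativeId;
            this.details = details;
        }
    }

    /**
     * Maps "lastName;firstNames;dateOfBirth" to the ids of every initiative
     * that person supported. Municipality and voting time are ignored, so a
     * person who moved between votes is still matched.
     */
    public static Map<String, Set<Long>> initiativesByPerson(List<DecryptedVote> votes) {
        Map<String, Set<Long>> result = new TreeMap<>();
        for (DecryptedVote vote : votes) {
            String[] fields = vote.details.split(Pattern.quote(DELIM), -1);
            if (fields.length < 3) {
                continue; // skip rows that do not have the expected layout
            }
            String person = fields[0] + DELIM + fields[1] + DELIM + fields[2];
            result.computeIfAbsent(person, key -> new TreeSet<>()).add(vote.initiativeId);
        }
        return result;
    }
}
```

Two people who share a full name and a date of birth would be merged by this key, but that is rare enough that the result is effectively a per-person list of supported initiatives. The per-initiative hash in support_id never enters the picture.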
Say what now?
To be fair, the website also states the following:
“The names of the signatories of the statements of support are not published at any stage on the Kansalaisaloite.fi service.”
This is technically correct, since the names have not been published on Kansalaisaloite.fi. Instead, they were published in Helsingin Sanomat.
“The organizers of a citizens’ initiative may only disclose information regarding the signatories to the Digital and Population Data Services Agency. If the Digital and Population Data Services Agency confirms that a citizens’ initiative has collected at least 50,000 statements of support, the Agency may disclose information from the statements; in other words, the information in its possession becomes public. If, according to the decision, there is an insufficient number of statements of support, the information will not become public.”
This still does not take away the fact that contrary to what is claimed, it is easy to find out all initiatives supported by each individual if you have access to the database.
It is also worth noting that the code on GitHub has not been updated since 2019. It is possible that development has continued behind closed doors and that the obvious shortcomings have been fixed in the live code. If anyone knows where I can find the live codebase, let me know.