Finland

The empirical proving ground.

Why Finland?

Finland combines difficult replay dynamics with unusually open source material and a legally non-authoritative consolidation surface. Finlex consolidated texts are informational, not legally binding. The "real" law is the original statute plus all amendment acts published in Säädöskokoelma.

This makes replay-vs-oracle comparison scientifically meaningful: when LawVM and Finlex disagree, the disagreement can be independently adjudicated against primary sources. A divergence might be a LawVM bug, a source gap, or a genuine error in the official consolidation.

Corpus

The corpus contains 690 Finnish statutes with at least one amendment, spanning the 1920s to the 2020s. All are replayed from raw amendment acts in Säädöskokoelma XML. They were curated from 3,591 amended statutes: the 690 are those for which base XML, an oracle consolidation, and all amendment texts are available for full replay.
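The curation rule can be read as a simple availability filter. The sketch below is illustrative only: `StatuteRecord`, its fields, and the function names are hypothetical and are not taken from LawVM's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatuteRecord:
    """Hypothetical availability record for one amended statute."""
    statute_id: str            # statute number, e.g. "2004/699"
    has_base_xml: bool         # original statute available as XML
    has_oracle: bool           # Finlex consolidation available for comparison
    amendments_total: int      # amendment acts known to affect the statute
    amendments_available: int  # of those, how many have source text


def is_replayable(rec: StatuteRecord) -> bool:
    """A statute qualifies only if every input needed for full replay exists."""
    return (
        rec.has_base_xml
        and rec.has_oracle
        and rec.amendments_total >= 1
        and rec.amendments_available == rec.amendments_total
    )


def curate(records: list[StatuteRecord]) -> list[StatuteRecord]:
    """Keep the subset of amended statutes that can be replayed end to end."""
    return [r for r in records if is_replayable(r)]
```

Under this reading, a statute with even one missing amendment text is excluded, which is consistent with the corpus being much smaller than the full set of amended statutes.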

Decade         Statutes   Avg amendments   Max amendments
1920s–1960s    10         15               67
1970s          20         13.7             61
1980s          25         14.4             64
1990s          107        20.6             238
2000s          173        13.3             166
2010s          237        5.3              52
2020s          118        2.5              18

As of 2026-04-16.

Benchmark

Text distance

0.65%
Mean Levenshtein distance from the Finlex consolidation, as a percentage. ~420 of 690 statutes produce identical text.

Structural error

4.25%
Mean section-level structural divergence. 367 statutes match perfectly in structure; 490 reach at least 95%.

<90% match

104
of 690 statutes below 90% — these are the investigation frontier

Benchmark run: 2026-04-16, mode: finlex_oracle. Two metrics: Levenshtein measures character-level text distance; structural error measures section-level divergence. See Artifacts for corpus definition and reproducibility.
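The two metrics can be sketched as follows. This is a minimal illustration, not the benchmark's implementation: the source does not state how the Levenshtein distance is normalized or exactly how section-level divergence is computed, so dividing by the longer text and comparing sections by key are both assumptions, as are all function names.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def text_distance(replayed: str, oracle: str) -> float:
    """Edit distance as a fraction of the longer text (assumed normalization)."""
    longest = max(len(replayed), len(oracle))
    return 0.0 if longest == 0 else levenshtein(replayed, oracle) / longest


def structural_error(replayed: dict[str, str], oracle: dict[str, str]) -> float:
    """Share of section keys missing on one side or differing in text.

    One plausible reading of "section-level divergence"; the real metric
    may weight or match sections differently.
    """
    keys = set(replayed) | set(oracle)
    if not keys:
        return 0.0
    mismatched = sum(1 for k in keys if replayed.get(k) != oracle.get(k))
    return mismatched / len(keys)
```

The two numbers answer different questions: a single changed word in a long statute barely moves the text distance, while a section present on only one side counts as a full mismatch in the structural measure.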

How to read this: The benchmark measures agreement with Finlex, in text and in structure. It does not measure legal correctness. High similarity ≠ correct. Low similarity ≠ wrong. Some divergences mean LawVM is right and Finlex is wrong. The real verification loop is manual residual review against primary sources.

The golden dataset

We systematically investigate every divergence between LawVM and Finlex. In 77+ cases so far, the official Finlex consolidation contains errors — missing content, stale amendments, editorial additions without legal basis, missing section headings.

Each finding is documented in a structured YAML entry with Finnish prose, root cause classification, affected sections, and source evidence. The full findings database is available in Finnish at Finlex-virheet.
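As a rough picture of what one entry carries, the sketch below models a finding as a Python dataclass. The field names, the `Finding` type, and the example values are hypothetical: the real entries are YAML, and their schema is not shown in this document. The example uses the 2014/1245 case described later on this page; its root-cause label and its Finnish description are assumed for illustration, not quoted from the database.

```python
from dataclasses import dataclass, field

# Root-cause labels named in this section's taxonomy; the remaining
# categories are not listed in this document and are omitted here.
LISTED_ROOT_CAUSES = {
    "finlex_missing_content",
    "source_pathology",
    "finlex_missing_otsikko",
    "oracle_stale_cutoff",
    "finlex_editorial_addition",
    "high_uncovered_body",
    "finlex_dual_element_residual",
    "corrigendum_misapplied",
}


@dataclass
class Finding:
    """Hypothetical in-memory form of one findings entry."""
    statute_id: str
    root_cause: str
    affected_sections: list[str]
    description_fi: str                                 # Finnish prose
    evidence: list[str] = field(default_factory=list)   # e.g. amendment numbers

    def has_listed_root_cause(self) -> bool:
        """True if the label is one of the categories named in this section."""
        return self.root_cause in LISTED_ROOT_CAUSES


example = Finding(
    statute_id="2014/1245",
    root_cause="finlex_missing_otsikko",  # classification assumed for illustration
    affected_sections=["1", "3", "6", "7", "14", "15"],
    description_fi="Pykälien 1, 3, 6, 7, 14 ja 15 otsikot puuttuvat.",  # illustrative wording
    evidence=["2018/396"],
)
```

Keeping the root cause as a constrained label rather than free text is what makes the per-category counts possible.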

Root cause taxonomy

Root cause                      Count   Meaning
finlex_missing_content          10      Finlex omits content published in Säädöskokoelma
source_pathology                9       Source XML itself is defective
finlex_missing_otsikko          ≥9      Finlex omits section headings
oracle_stale_cutoff             ≥6      Finlex has not applied recent amendments
finlex_editorial_addition       4       Finlex added content without amendment basis
high_uncovered_body             3       Structural amendment coverage gap
finlex_dual_element_residual    3       Finlex has dual current/pending versions
corrigendum_misapplied          1       Published correction not applied
+ 7 more categories

77 entries as of 2026-04-16. Target: full classification of all divergences.

Concrete cases

2004/699 — Wrong section heading for 16+ years

Laki rahoitus- ja vakuutusryhmän valvonnasta. Amendment 2008/886 changed §8's heading from “Valvontatehtävän siirtäminen toiselle valvontaviranomaiselle” to “... ulkomaan valvontaviranomaiselle” (“another” → “foreign” supervisory authority). Finlex still shows the 2004 heading. No subsequent amendment changed it back. A semantic error that changes legal meaning, persisting in the official database since 2008.

2014/716 — Missing COVID-19 emergency provisions

Valtioneuvoston asetus yritystoiminnan kehittämiseksi. Temporary COVID-19 amendment 2020/697 modified §§5, 11, 12 and added new §8b (state aid for enterprises in difficulty). Finlex's consolidated version contains none of these changes — it shows pre-COVID text even though the consolidation was prepared during the amendment's period of force. Anyone relying solely on Finlex during 2020–2021 was reading law that omitted applicable emergency provisions.

2014/1245 — Six section headings silently dropped

Valtioneuvoston asetus televisio- ja radiotoiminnasta. Amendment 2018/396 added headings to §§1, 3, 6, 7, 14, 15. Finlex's consolidated version is missing all of them. The amendment source XML explicitly contains the headings. Not a subtle text difference — six headings simply absent from the official version.

1992/728 — Future-effective amendment already applied

Laki kunnallisesta viranhaltijasta. Finlex's consolidation metadata gives the consolidation date as 2009-12-29, but §3 already reflects amendment 2009/1710, which does not enter into force until 2010-01-01. LawVM's legal point-in-time mode correctly carries the earlier wording for that date. The editorial consolidation runs ahead of strict legal effectivity, silently incorporating provisions that are not yet in force.
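The point-in-time rule at stake here can be sketched in a few lines. This is an assumed simplification, not LawVM's code: the `Amendment` type and `in_force_on` function are hypothetical, and real effectivity rules (temporary acts, per-section commencement) are more involved.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Amendment:
    """Hypothetical minimal amendment record."""
    amendment_id: str
    in_force_from: date


def in_force_on(amendments: list[Amendment], as_of: date) -> list[Amendment]:
    """Keep only amendments already in force on the query date, oldest first."""
    eligible = [a for a in amendments if a.in_force_from <= as_of]
    return sorted(eligible, key=lambda a: a.in_force_from)


amendments = [Amendment("2009/1710", date(2010, 1, 1))]

# On the stated consolidation date the amendment is not yet applicable.
print(in_force_on(amendments, date(2009, 12, 29)))  # prints []

# From its commencement date onward it is included in the replay.
print([a.amendment_id for a in in_force_on(amendments, date(2010, 1, 1))])
```

A consolidation dated 2009-12-29 that already shows the 2010 wording is, under this rule, showing text for a later date than its metadata claims.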

1997/1339 — Corrigendum applied editorially but not in machine-readable source

Kirjanpitoasetus. Amendment 2015/1752 had a published corrigendum correcting multiple preamble errors. Finlex applied the correction; the machine-readable enacted XML was never updated. LawVM's strict-from-source replay is transparent about what it does and does not have. A mixed case illustrating that corrigenda exist outside the machine-readable pipeline.

The forward goal

Current evidence already shows repeated classes of cases where replay outperforms the official consolidation. The ongoing goal is a fully classified divergence ledger for the Finnish statute corpus — every divergence investigated, every one typed.