Nothing “fishy” about IDs in Kansas voter files IMHO

But others may disagree

Dec 28, 2024

The Kansas voter file has about 2 million records. The voter ID numbers range from roughly 1 through about 6.5 million. Quick math suggests there are about 4.5 million “holes” (numbers not used) over this range. Some might ask: why aren’t ID numbers assigned strictly in numerical order?
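The quick math can be sketched in a few lines of Python. The figures are the rounded ones from the paragraph above, and `count_unused_ids` is a hypothetical helper written for illustration (it has nothing to do with ELVIS); it simply counts the numbers in an observed range that no record uses.

```python
def count_unused_ids(ids):
    """Count the integers between min(ids) and max(ids) that are not in ids."""
    unique = set(ids)
    span = max(unique) - min(unique) + 1
    return span - len(unique)

# Rounded figures from the text: IDs run from about 1 to about 6.5 million,
# and about 2 million of them are in use.
approx_range = 6_500_000
approx_records = 2_000_000
approx_holes = approx_range - approx_records          # about 4.5 million
approx_fraction_unused = approx_holes / approx_range  # a bit over two thirds

# Tiny made-up example: IDs 1..10 with only four in use leaves six holes.
toy_holes = count_unused_ids([1, 2, 7, 10])
```

Run against the real file, the same helper would give the exact count rather than the rounded estimate.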

I received questions about whether there is something “fishy” about the voter IDs used to identify voters in the Kansas voter files released by the Kansas Secretary of State.

My opinion is “no.” I do not believe anything is “fishy” about our voter IDs, but others may disagree. Sadly, there is no way to prove anything definitively without a review of the software, which is not possible.

Paquette’s Findings

Andrew Paquette, who identifies on LinkedIn as a “steganographic algorithm detection” researcher, made claims of “an algorithm hidden in New York’s voter rolls” in May 2023 after studying patterns in the ID number assignments. He wrote:

“Last April, I discovered an algorithm hidden in New York’s voter rolls. The algorithm linked county voter identification (CID) and State Board of Elections identification (SBOEID) numbers in such a way that it could be used as a third ID number. This could be used to clandestinely tag and track records of interest, such as phantom voters.”

Paquette published articles about finding “mysterious algorithms in various US voter roll databases” on his ResearchGate page, with separate preliminary reports for his findings in Arizona, California, Georgia, Ohio, Pennsylvania, Texas, and Wisconsin.

Some of the problems cited for other states were due to separate county and state IDs, which is not an issue in Kansas. Since the first statewide voter file mandated by the Help America Vote Act (HAVA) was created in 2006, Kansas has had only a single ID for each voter.

I disagree with Paquette, but how can that be proven?

Kansas Voter File 2024

Some in our state have asked, “What about Kansas voter IDs?” in light of Paquette’s claims.

I will admit a “scatterplot” of the Kansas Voter IDs by registration date is a bit unexpected with several curious patterns from the 2,027,930 voters on Oct. 22, 2024:

Scatterplot of Voter Registration Date by Kansas Voter ID from Oct. 22, 2024 voter file. The “rug” along the bottom shows many breaks in the assigned ID numbers.

There are many questions:

When is a voter registration date assigned or changed?
What explains the vertical “bands?” Gaps in vertical bands?
What explains the horizontal gaps between the vertical bands?
What explains the “holes” in “solid” bands when the plot is enlarged?
What explains the “triangle” at the upper right?

History provides some of the answers.

ELVIS in 2006

The scatterplot above shows a horizontal line when the HAVA-mandated Kansas Election Voter Information System (ELVIS) first appeared. The “triangle” at the upper right only started forming after ELVIS started.

Very old Kansas statewide voter files (e.g., from dates 2001-12-28, 2003-01-30, …, 2006-02-14) were collections of voter names from all 105 counties, but did not contain any voter IDs.

With ELVIS introduced to satisfy HAVA requirements in Dec. 2005, the first file I have with a statewide voter ID (first called “Key_Registrant”) is from May 17, 2006.

This 2006-05-17 file of 1,645,777 data records was sorted by Key_Registrant. The early part of this sorted file was also mostly in order by county name.

This is evidence that the 105 counties were originally assigned ranges of IDs by county. Each county assigned voter IDs within its range as voters were added to ELVIS. There were gaps between county ranges.
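One way to check the “ranges by county” idea is sketched below in Python. The record layout (pairs of voter ID and county name) and both helper names are assumptions made for illustration, and the sample records are made up, not real voter data. The first helper reports the lowest and highest ID seen per county; the second sorts by ID and measures how often neighboring records are also in county-name order.

```python
def county_id_ranges(records):
    """Return {county: (lowest_id, highest_id)} for (voter_id, county) pairs."""
    ranges = {}
    for voter_id, county in records:
        lo, hi = ranges.get(county, (voter_id, voter_id))
        ranges[county] = (min(lo, voter_id), max(hi, voter_id))
    return ranges

def fraction_in_county_order(records):
    """Sort by ID, then return the share of adjacent pairs whose county
    names are in alphabetical order (equal names count as in order)."""
    ordered = sorted(records)
    pairs = list(zip(ordered, ordered[1:]))
    if not pairs:
        return 1.0
    in_order = sum(1 for (_, a), (_, b) in pairs if a <= b)
    return in_order / len(pairs)

# Made-up records, NOT real voter data. ID 206 sits inside the "Anderson"
# block but belongs to "Allen", as a voter who moved might.
toy = [
    (101, "Allen"), (102, "Allen"),
    (205, "Anderson"), (206, "Allen"), (207, "Anderson"),
    (310, "Atchison"),
]
ranges = county_id_ranges(toy)
share = fraction_in_county_order(toy)
```

A share close to 1.0 on the 2006 file would be consistent with county-by-county ID ranges, although it would not prove how the ranges were assigned.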

Can we observe this in data from 2006?

Kansas 2006

Horizontal white bands, especially in odd-numbered years, are explained by low voter registration activity. Near elections, especially in even-numbered years, the horizontal bands become denser as many people registered to vote.

But what about counties?

Johnson County 2006

The Johnson County voter IDs from 2005 were abandoned when data were migrated to ELVIS and a new state ID was assigned.

In ELVIS near mid-2006, JoCo voters had IDs ranging from 2417 to 5030454, but most were in a narrow vertical band visible in the plot below.

The “rug” along the bottom shows one large band of ELVIS IDs (i.e., JoCo), and some smaller bands representing migrations to JoCo from other counties. More about that later.

ELVIS is very dynamic with the 105 counties making changes every day.

When voters move between counties, they retain their original ELVIS ID but are assigned a new voter registration date. A move within a county generally does not change the registration date (but sometimes does).

Most of the sparsely filled areas for voters in Johnson County above (outside the dense narrow vertical band) were likely caused by voters moving from some other county to Johnson County and retaining an ELVIS ID that reflects their original county assignment.
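A rough way to separate the dense band from the likely movers is sketched below in Python. The fixed bin width, the helper names, and most of the sample IDs are assumptions made for illustration; only 2417 and 5030454 come from the text above. A real county band may span several bins, so this is a crude first pass rather than a definitive method.

```python
from collections import Counter

def densest_bin(ids, bin_width):
    """Return (start, end) of the fixed-width ID bin holding the most voters."""
    counts = Counter(voter_id // bin_width for voter_id in ids)
    best_bin, _ = counts.most_common(1)[0]
    start = best_bin * bin_width
    return start, start + bin_width - 1

def outside_band(ids, band):
    """Return the sorted IDs that fall outside the inclusive (lo, hi) band."""
    lo, hi = band
    return sorted(i for i in ids if not lo <= i <= hi)

# Mostly made-up IDs for one county: five in a hypothetical dense band,
# plus the lowest and highest JoCo IDs mentioned in the text.
toy_ids = [3_000_010, 3_000_250, 3_000_900, 3_001_400, 3_099_999,
           2_417, 5_030_454]
band = densest_bin(toy_ids, bin_width=100_000)
likely_movers = outside_band(toy_ids, band)
```

IDs flagged this way are only candidates; confirming a move would need the voter's earlier records from another county.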

Sedgwick County 2006

The plot below shows a narrow band of ELVIS IDs for Sedgwick County voters in 2006.

Note that the Sedgwick County band of IDs (shown below) also appears in the Johnson County plot above, which shows that a number of Sedgwick County voters moved and re-registered in Johnson County.

Likewise, the Johnson County range of IDs from above appears in the Sedgwick County plot below, showing some migration from Johnson (JO) to Sedgwick (SG) County.

Shawnee County 2006

The plot below shows a narrow band of IDs for voters in the smaller Shawnee County in ELVIS.

The narrow band of mostly Shawnee ELVIS IDs below identifies migrations from Shawnee County to JO or SG above.

Plots like this for other counties are possible, but have not been created.

Enlargements: Kansas Voter File 2024

Post HAVA Registrations

The Kansas Voter File 2024 plot at the top shows two things happened once ELVIS was implemented:

Voters in a county were still assigned IDs in the original county band of IDs seen in 2006.
Voters were assigned new, higher ID numbers represented by the triangle at the upper right.

An enlargement of that triangle looks like the following and shows “holes” and areas with less-dense registrations.

The low-density bands under the 2006 line (near 5300000) are curious, and I don’t have an explanation for them. I don’t have an explanation for the other scattered dots under the “triangle” either.

Oct 2024 Registrations

An extreme enlargement of the right-most tip of the triangle shows registrations up to the date the file was extracted, Oct. 22, 2024.

Each dot here is a single voter with a specific ID and registration date.

In theory, voter registrations were not allowed after Oct. 15, but this plot shows some took place after the deadline.
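Picking out those late registrations is a simple date filter, sketched here in Python. The Oct. 15 deadline and the 2024 year come from the text; the record layout, the helper name, and the sample records are made up for illustration.

```python
from datetime import date

# Deadline from the text; the year is taken from the Oct. 22, 2024 file date.
DEADLINE = date(2024, 10, 15)

def registered_after(records, deadline):
    """Return the (voter_id, registration_date) pairs dated after the deadline."""
    return [(voter_id, reg_date) for voter_id, reg_date in records
            if reg_date > deadline]

# Made-up records, NOT real voter data.
toy = [
    (6_400_001, date(2024, 10, 14)),
    (6_400_002, date(2024, 10, 15)),  # on the deadline, so not late
    (6_400_003, date(2024, 10, 16)),
    (6_400_004, date(2024, 10, 21)),
]
late = registered_after(toy, DEADLINE)
late_ids = [voter_id for voter_id, _ in late]
```

The filter only says which records carry a later date. It cannot say why, since a registration date can also change for reasons such as a move between counties.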

I could identify these voters by ID and name, but it’s unlikely Kansas Open Records would reveal the reason for these late registrations. (KORA needs reform.)

Better transparency needed

Hashing system for database indexing?

I cannot explain all the patterns I see in the plots above, but IMHO they are consistent with a hashing algorithm for quick database indexing.

Hashing algorithms require plenty of “holes” to avoid “collisions” on lookups, so the sparsity of the numbers assigned does not surprise me.
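The general point about holes and collisions can be illustrated with a small simulation in Python. This is not how ELVIS assigns IDs, which is unknown; it is a generic sketch in which keys are dropped into slots uniformly at random, and the sizes are scaled-down stand-ins for the roughly 2 million records and 6.5 million ID values mentioned earlier.

```python
import random

def count_collisions(n_keys, n_slots, seed=0):
    """Drop n_keys into n_slots uniformly at random and count how many
    keys land in a slot that is already occupied."""
    rng = random.Random(seed)
    occupied = set()
    collisions = 0
    for _ in range(n_keys):
        slot = rng.randrange(n_slots)
        if slot in occupied:
            collisions += 1
        else:
            occupied.add(slot)
    return collisions

# Same number of keys, two table sizes: one with plenty of holes
# (about 31% full at most) and one with no spare room at all.
sparse_table = count_collisions(2_000, 6_500)
full_table = count_collisions(2_000, 2_000)
```

With the same number of keys, the table with spare room should see far fewer collisions than the table with none, which is the trade-off between space and lookup speed that hash-based indexing relies on.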

Possibly, the “holes” in the assigned ID numbers provide a security measure where any guessed voter ID number has a greater chance of being invalid than valid.

I see hashing for database lookups as the explanation with the smallest number of assumptions. To me this “Occam’s razor” explanation is the most likely.

Hashing system for cryptography?

But because hashing algorithms can also be used for cryptographic purposes, some like Andrew Paquette believe hidden encoding is possible with voter IDs.

I’d argue that it would be easier to “hide” such things in a database field that is never released to the public, since there is no transparency on exactly what fields ELVIS maintains. I know ELVIS has fields that were releasable several years ago that are not released now. Why try to hide something buried in a number when a non-disclosed field could contain that same information?

Who is right?

A review of the ELVIS code that creates and uses the voter IDs could answer the question, but sadly election officials across the country do not permit anyone to do that.

Election officials do not understand their “black box” systems, and the lack of transparency about how anything works can feed conspiracy theories about possible “algorithms.”

Scrutiny not possible?

Analogy to medical devices

For ten years I worked on software development for two medical devices, a process heavily regulated by the FDA. The FDA could show up at any time (and did once) to ask questions about what was on a computer screen or a report, or how any algorithm worked.

We had to produce documents of the requirements, the design, and the testing to prove our medical device was safe and effective.

Once I defended our medical device and showed it worked properly when the FDA wanted to see the proof, with one exception: I had to admit a certain decade-old formula I inherited was missing a minus sign! Fortunately, a wise surgeon recognized the discrepancy between the number and the plot on the screen, and a patient was spared a cardiac procedure. We fixed that problem and issued a new software release.

Why are voting systems, which are part of the national infrastructure, immune from scrutiny of any kind by anyone at any time?

State legislators and Congress need to demand better transparency in our election systems to ensure public confidence in our elections.

Everything should not be a “black box.”

Watchdog Lab
