Screen Scraping Kansas Unofficial Election Results
How to get unofficial 2022 general election result data for analysis
Updated 2022-12-14
The Kansas Secretary of State reports unofficial election results online in one format, then publishes the official results in a different place online, in different formats, after state certification by Dec. 1.
“These unofficial results on election night are incomplete and will not include provisional ballots. Provisional ballots will be counted during the county canvass.”
“Official results will be posted on our main website after the county canvass is complete.” [from 2020 web page]
A single, unified reporting system from election night to election certification to data archival would be much better. But that’s not reality today.
The final certified results will include precinct-by-precinct details, which are useful, but the county results and maps available with the unofficial results are not provided with the final results. Analysis by county is therefore often easier using the county data in the unofficial results, which were finalized on Monday.
Data files and analysis notebooks mentioned or used in the following are online in this GitHub repository.
Downloading county files
The online unofficial election results cannot be used directly in analysis by computer programs. Data from the main results page and the 105 separate county pages must be “scraped” and reformatted before they can be used in analysis.
The “wget” program may be familiar to those who use the “command line” on various operating systems, especially Linux. A relatively simple one-line command like the following can fetch all the web files with the unofficial results:
wget -e robots=off -N -t 50 -o wget.log -l 8 -r "https://ent.sos.ks.gov/kssos_ent.html"
This command transfers about 163 files in 5 folders to the local folder ent.sos.ks.gov (which was renamed to reflect the data timestamp). Grabbing all the files at once helps ensure consistency across all the files.
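One quick sanity check after the download is to count what actually arrived. The sketch below is in Python rather than the R used in the notebooks, and the function name summarize_mirror is made up for illustration. It walks a local folder and tallies files by extension so the totals can be compared with the roughly 163 files and 5 folders mentioned above. Whether the top-level folder counts as one of the five is not stated, so the folder count here covers subfolders only.

```python
from collections import Counter
from pathlib import Path


def summarize_mirror(root):
    """Return (file_count, subfolder_count, Counter of file extensions) under root."""
    root = Path(root)
    files = [p for p in root.rglob("*") if p.is_file()]
    folders = [p for p in root.rglob("*") if p.is_dir()]  # excludes root itself
    by_ext = Counter(p.suffix.lower() or "(none)" for p in files)
    return len(files), len(folders), by_ext


if __name__ == "__main__":
    # Folder name assumed from the wget command; adjust if it was renamed.
    mirror = Path("ent.sos.ks.gov")
    if mirror.is_dir():
        n_files, n_folders, by_ext = summarize_mirror(mirror)
        print(f"{n_files} files in {n_folders} subfolders")
        for ext, count in by_ext.most_common():
            print(f"  {ext}: {count}")
    else:
        print(f"{mirror} not found; run the wget command first")
```

Running the same check on a later download makes it easy to spot a partial or interrupted transfer.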
Screen Scraping
The details of extracting the data from the HTML pages are in notebook Screen-Scrape-County-Election-Results-Exploration-FINAL.html (on GitHub).
Screen scraping scripts often must be rewritten every election cycle because the web pages change. My scripts from 2010 or 2016 were of no value this time.
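The notebook has the actual extraction steps. As a rough illustration of what scraping an HTML results table involves, the sketch below uses Python's standard-library html.parser to pull the cell text out of each table row. This is not the method used in the notebook, and the sample HTML, candidate names, and vote counts are invented; the real county pages may be structured quite differently.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse runs of whitespace inside the cell.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_rows(html_text):
    parser = TableRowParser()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Invented sample; not taken from the Secretary of State pages.
SAMPLE = """
<table>
  <tr><th>Candidate</th><th>Votes</th></tr>
  <tr><td>Candidate A</td><td>1,234</td></tr>
  <tr><td>Candidate B</td><td>987</td></tr>
</table>
"""

rows = parse_rows(SAMPLE)
header, data = rows[0], rows[1:]
# Strip thousands separators so vote counts become integers.
records = [{"candidate": name, "votes": int(votes.replace(",", ""))}
           for name, votes in data]
print(records)
```

Because a parser like this depends on the exact tags a page uses, even a small redesign of the results site can break it, which is why scripts rarely survive from one election to the next.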
The results for all counties are stored in an Excel file for analysis: County-Results-ent.sos.ks.gov-2022-11-14-1818-FINAL.xlsx.
Here are some of the results for Allen County in that file:
This “County-Results” file can be easily filtered and analyzed using tools like dplyr, part of R’s “tidyverse.”
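For readers who do not use R, the same kind of filter-and-summarize step can be sketched in plain Python. The column names below (county, contest, choice, votes) are assumptions, not the actual columns in the Excel file, and the rows are invented placeholders rather than real results.

```python
from collections import defaultdict

# Invented rows; the real workbook's column names and values may differ.
rows = [
    {"county": "County 1", "contest": "Contest X", "choice": "Yes", "votes": 120},
    {"county": "County 1", "contest": "Contest X", "choice": "No", "votes": 80},
    {"county": "County 2", "contest": "Contest X", "choice": "Yes", "votes": 40},
    {"county": "County 2", "contest": "Contest X", "choice": "No", "votes": 60},
]


def filter_rows(rows, **criteria):
    """Keep rows whose fields equal every keyword given (similar to dplyr's filter)."""
    return [r for r in rows if all(r.get(k) == v for k, v in criteria.items())]


def total_by(rows, key, value="votes"):
    """Sum `value` within each distinct `key` (similar to group_by + summarise)."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[key]] += r[value]
    return dict(totals)


county_1 = filter_rows(rows, county="County 1")
statewide = total_by(filter_rows(rows, contest="Contest X"), "choice")
print(county_1)
print(statewide)
```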
GitHub Repository
Download all files mentioned in this article from the 2022-Kansas-Elections repository on GitHub.
The unscraped county HTML files are in folder ent.sos.ks.gov-2022-11-14-1818-FINAL.
The scraped results with data from all 105 counties are in file County-Results-ent.sos.ks.gov-2022-11-14-1818-FINAL.xlsx, which also can be downloaded here:
Judicial Retention Analysis
The folder (November-General/Unofficial-Results/Judicial-Retention) has a notebook (Kansas-Judicial-Retention.html) showing how to extract the judicial retention election data from the County-Results file described above. This data was used to create maps for each judicial contest.
Supreme Court Justice 4, Eric S Rosen, did not have a retention election in 2022.
Supreme Court Justice 7, Caleb Stegall, had a retention election in 2022 and won all 105 counties in the state.
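A statement such as "won all 105 counties" comes down to comparing the Yes and No totals county by county. The Python sketch below shows that comparison on invented rows; the "Yes"/"No" labels and the tuple layout are assumptions for illustration, and the notebook's own approach in R may differ.

```python
from collections import defaultdict

# Invented rows for one retention contest: (county, choice, votes).
rows = [
    ("County 1", "Yes", 120), ("County 1", "No", 80),
    ("County 2", "Yes", 40), ("County 2", "No", 60),
    ("County 3", "Yes", 55), ("County 3", "No", 55),
]


def counties_retained(rows):
    """Return the sorted counties where 'Yes' votes strictly exceed 'No' votes."""
    tally = defaultdict(lambda: {"Yes": 0, "No": 0})
    for county, choice, votes in rows:
        tally[county][choice] += votes
    return sorted(c for c, t in tally.items() if t["Yes"] > t["No"])


won = counties_retained(rows)
n_counties = len({county for county, _, _ in rows})
print(f"Retained in {len(won)} of {n_counties} counties: {won}")
```

A tie is treated here as not won, which is a choice made for the sketch rather than a rule taken from the source data.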
Pyle’s Voters
The folder (November-General/Unofficial-Results/Dennis-Pyle) has a notebook (Dennis-Pyle-2022-11-14-FINAL.html) showing how to extract the Pyle data from the County-Results file described above.
This data was used to create the map of Pyle’s voters and the density plot showing variation across counties.