IRS not serious about transparency of non-profits and political groups
Failure to release IRS 990 and IRS 527 files punctually hides the political activities of non-profits. Why have over 400,000 IRS 990 XML files from 2021-2022 been offline since July?
Update Feb. 15, 2023: The IRS 527 “master file” was finally updated on Sunday (Feb. 12) after 15 weekly updates were missed.
Update Dec. 19, 2022: A representative reached by phone at IRS media relations said they are “aware” of the problems and are “working on it.” There is no timeline for any solution.
Updated and corrected, Dec. 6 at 9 PM.
About a year ago the IRS “moved” a few million IRS 990 XML files from AWS servers to its own server. In the process the IRS “lost” about 1.4 million filings from 2010 to 2015.
Since July 2022 the IRS has removed from its server about 400,000 additional IRS 990 XML files covering 2021-2022 filings, and has not published the expected additions for either year.
In Sept. the IRS said the removal was for “maintenance.” In the weeks before the election the IRS simply stated the files were “unavailable.” Why would the IRS “hide” the most recent information about non-profits before an election? Since the election, the IRS web page makes no mention of the removed files.
These 990 XML files are searchable, but the 990 PDF files published by the IRS are not. Why would the IRS impede searching 990 files by now providing only PDF files?
Besides the missing IRS 990 files, the IRS skipped the last five weekly updates of IRS 527 files, including the last update prior to the November general election. Why is the IRS “hiding” information about contributions to and expenditures by 527 political organizations so close to an election?
IRS 990s
Many 501(c)(3) “educational” and 501(c)(4) “advocacy” non-profits engage in political activities that affect elections. These groups must file yearly IRS 990 tax reports to justify their non-profit status.
The IRS publishes these 990s for public scrutiny, but that scrutiny is impeded by delays. Slow processing of these filings by the IRS, especially during COVID, has extended publication delays to two years or more in some cases.
The impact of non-profit political spending disclosures, and public interest in them, wanes considerably with such long delays.
Missing IRS 990 XML Files
If you use the ProPublica IRS “Nonprofit Explorer” 990 search today, you’re likely to see this notice about recent filings, especially those from 2021 and 2022, which may not be available:
But over a million older XML files are also missing.
IRS 990 XML files on Amazon in Dec. 2021
The Wayback Machine shows the IRS started storing and updating IRS 990s on Amazon around Oct. 2018.
But “on December 16, 2021 the IRS announced that it would discontinue updates to the IRS 990 Filings dataset on AWS, starting December 31, 2021.” The IRS “moved” these files to its own servers.
Here are counts of IRS 990 XML files I downloaded from Amazon in March 2022, which had not been updated since Dec. 2021:
1.4 million XML 990 files went missing in “move” from AWS
After “moving” from AWS storage to the IRS-hosted site, nearly 1.4 million XML files from 2010 - 2015 went missing.
IRS 990 XML files on IRS site July 2022
On July 15, 2022 the IRS site had index files for both 2021 and 2022 and a total of seven ZIPs containing over 400,000 XML files for download.
In the first six months of 2022 over 15,000 IRS 990s from 2020 were added, as well as over 200,000 from 2021, and over 16,000 forms from 2022.
Over 400,000 additional XML files went missing
XML files from 2021 and 2022 have been mysteriously dropped from the site since July, and no new XML files have been added. [PDF files may be available, but they’re huge compared to XML files and more difficult to work with; see below.]
On Sept. 1 the following notice said links to the downloads had been removed for maintenance.
The Wayback Machine captured this notice from Sept. 4 through Nov. 9 on the same page with the links now “unavailable.”
Today there is no mention of XML files from 2021 or 2022.
PDF files instead of XMLs?
PDF files are great for reporting information, but are a burden to use in research because of their format and huge size compared to XML files.
In general, XML files can be searched more efficiently than PDFs and are much, much smaller. Searching XML files can reveal target information anywhere in the report. But PDFs published by the IRS (AFAIK) are scanned images and text cannot be searched without optical character recognition (OCR) pre-processing. [PDF internals can be text or scanned images. Searching text is easy. Searching images is difficult.]
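As a rough illustration of why XML is easy to search, here is a minimal Python sketch that looks for a word in the text of every element of a filing. The sample document and its tag names are hypothetical and far simpler than a real IRS 990 XML file, and `find_text` is just an illustrative helper, not an IRS or ProPublica tool.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified filing. Real IRS 990 XML files use a
# much larger schema with different tag names.
SAMPLE = """<Return>
  <Filer><Name>Example Civic Fund</Name></Filer>
  <Grants>
    <Grant><Recipient>Voter Outreach Project</Recipient><Amount>25000</Amount></Grant>
  </Grants>
</Return>"""


def find_text(xml_string, needle):
    """Return (tag, text) for every element whose text contains needle.

    The match is case-insensitive and covers every element in the
    document, wherever it is nested.
    """
    root = ET.fromstring(xml_string)
    needle = needle.lower()
    hits = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        if text and needle in text.lower():
            hits.append((elem.tag, text))
    return hits


print(find_text(SAMPLE, "voter"))
```

With real filings the tag names would most likely carry an XML namespace prefix, so the reported tags would be longer, but the same loop should apply. Nothing comparable is possible with a scanned PDF until OCR has turned the page images back into text.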
The 475,354 XML files from 2020 take about 15.7 GB of disk space.
The PDF files from 2021 (I don’t have the count yet) are in 259 ZIP files that take 436 GB of disk space, about 28 times as much space as the XML files from 2020, and that’s before decompression. Downloading the 2021 IRS 990s as PDFs took 27.5 hours! Searching these PDFs is not readily possible without a huge OCR preprocessing task.
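The sizes quoted above can be compared with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it uses the figures from the text and assumes decimal units (1 GB = 10^9 bytes), which the text does not specify.

```python
# Figures quoted in the text
XML_2020_GB = 15.7        # disk space of the 2020 XML files
XML_2020_FILES = 475_354  # number of 2020 XML files
PDF_2021_GB = 436         # disk space of the 2021 PDF ZIP files
PDF_2021_ZIPS = 259       # number of 2021 PDF ZIP files
DOWNLOAD_HOURS = 27.5     # time to download the 2021 PDF ZIP files

# Assumption: decimal units, 1 GB = 1e9 bytes
size_ratio = PDF_2021_GB / XML_2020_GB
avg_xml_kb = XML_2020_GB * 1e9 / XML_2020_FILES / 1e3
avg_zip_gb = PDF_2021_GB / PDF_2021_ZIPS
download_mb_per_s = PDF_2021_GB * 1e9 / (DOWNLOAD_HOURS * 3600) / 1e6

print(f"PDF ZIPs vs XML files: {size_ratio:.1f}x the disk space")
print(f"Average XML filing:    {avg_xml_kb:.0f} KB")
print(f"Average PDF ZIP:       {avg_zip_gb:.2f} GB")
print(f"Average download rate: {download_mb_per_s:.1f} MB/s")
```

Under these assumptions the compressed PDFs take roughly 28 times the space of the 2020 XML files, and an average XML filing is only about 33 KB.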
Missing IRS 527 updates
Every Sunday morning at 1 AM the IRS supposedly updates a large Political Organization Filing and Disclosure (POFD) file, which has information about all IRS 527 organizations and the IRS 8871 and 8872 forms they file.
The POFD file has an archaic structure with 9 different kinds of records in a single file, but because it is a text file, all fields can be searched, just as with the XML files. Once information of interest from an IRS 8871 or 8872 is identified in the POFD file, the online IRS 8871/8872 search page can be used to view a report.
A POFD file with over 12 million records can still be downloaded but it has not been updated since Oct. 29 — the last five weekly updates are missing.
The penultimate record in the file, the “footer” record, shows the time of update and number of records: “F|20221029|0326|12134704|” (2022-10-29 03:26 with 12,134,704 records). File “signatures” (md5sums) also prove the file has not been updated.
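Both checks can be sketched in Python. The footer layout used here (record type, date, time, record count, separated by pipes) is inferred from the single footer line quoted above, and the function names are only illustrative.

```python
import hashlib
from datetime import datetime


def parse_footer(line):
    """Split a footer record into (update timestamp, record count).

    Assumes the pipe-delimited layout F|YYYYMMDD|HHMM|count| seen in
    the footer line quoted in the text.
    """
    fields = line.strip().split("|")
    if fields[0] != "F":
        raise ValueError("not a footer record")
    stamp = datetime.strptime(fields[1] + fields[2], "%Y%m%d%H%M")
    return stamp, int(fields[3])


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


stamp, count = parse_footer("F|20221029|0326|12134704|")
print(stamp.isoformat(), count)
```

Saving the digest from `md5_of_file` after each weekly download and comparing it with the previous week's value is one simple way to confirm whether the file has changed at all.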
In 2015, when 527 file updates were missing for nearly two months, I learned there was no way to contact the IRS about these failures to update. Multiple attempts failed.
There’s also no way (AFAIK) to report to the IRS the problem with about 12,000 records that cannot be parsed correctly in the POFD file. Many records unexpectedly wrap onto multiple lines. I’ve developed heuristics to “unwrap” about half of these wrapped records, leaving about 6,000 for now that cannot be processed. Technical problems like this also “hide” information from public scrutiny.
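To make the wrapping problem concrete, here is one simple unwrapping heuristic sketched in Python. It is not the set of heuristics described above, only an illustration: it assumes every genuine record starts with a known record-type code followed by a pipe, and treats any other line as the continuation of the previous record. The nine codes listed and the sample lines are assumptions and should be checked against the IRS file layout documentation.

```python
# Assumed record-type codes (nine kinds of records); verify against the
# IRS layout documentation before relying on them.
RECORD_TYPES = ("H", "1", "D", "R", "E", "2", "A", "B", "F")


def unwrap_records(lines, record_types=RECORD_TYPES):
    """Join continuation lines onto the record they were wrapped from.

    A line is treated as the start of a new record only if the text
    before its first pipe is a known record-type code.
    """
    records = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        code = line.split("|", 1)[0]
        if code in record_types or not records:
            records.append(line)
        else:
            # Continuation of a wrapped record: rejoin with a space.
            records[-1] += " " + line
    return records


# Hypothetical sample lines; the field contents are invented.
sample = [
    "H|20221029|0326|",
    "A|123|Donor Name|123 Main",
    "Street Apt 4|Springfield|",
    "F|20221029|0326|3|",
]
for record in unwrap_records(sample):
    print(record)
```

A heuristic this simple fails whenever a wrapped fragment happens to begin with something that looks like a record-type code, which is one plausible reason a share of the wrapped records cannot be recovered automatically.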
The IRS Tax Exempt Organization Search Tool becomes more and more worthless when its contents are not updated in a timely way.
The IRS Tax Exempt Organization Search Bulk Data Downloads become more and more worthless when files are months or years late to appear.
Related
Millions in “Dark Money” Bankrolled 2022 State Secretary of State Campaigns, Parker Thayer, Capital Research Center, Dec. 5, 2022.
5 Ways Secret Money Makes Its Way into our Elections, Campaign Legal Center, Oct. 11, 2022.
The IRS is not enforcing the law on political nonprofit disclosure violations, Matt Corley and Adam Rappaport, CREW, April 28, 2022.
Some Suggestions for Improving the Form 990, Rob Stilson, Capital Research Center, March 28, 2022.
Details are missing in several recent articles because of IRS delays in releasing documents about non-profits and political organizations . . .