DOJ DOCUMENT REMOVAL AUDIT: ~64,000 FILES REMOVED FROM JUSTICE.GOV AFTER EFTA PUBLICATION
Correction (Feb 22, 2026): The original version of this report stated 78,234 removed documents based on an HTTP scan conducted February 18-20, 2026. Two corrections have been applied:
DOJ DOCUMENT REMOVAL AUDIT: ~64,000 FILES REMOVED FROM JUSTICE.GOV AFTER EFTA PUBLICATION
A Forensic Integrity Audit of the Department of Justice's Epstein Files Transparency Act Production
Correction (Feb 22, 2026): The original version of this report stated 78,234 removed documents based on an HTTP scan conducted February 18-20, 2026. Two corrections have been applied:
>
Correction 1 — Authenticated Rescan (67,784): A subsequent rescan using authenticated Playwright browser sessions (passing both the Akamai bot challenge and the DOJ's age-verification gate) confirmed 67,784 documents returning HTTP 404, down from 78,234. The ~10,450 false positives were caused by the DOJ's age-verification system returning HTTP 404 to unauthenticated requests — a behavior that caused our original scanner to misclassify live documents as removed.
>
Correction 2 — Statistical Sampling (~64,259): Because the rescan infrastructure also used distributed IP addresses (to stay within Akamai's per-IP rate limits), we conducted a further statistical verification. A random sample of 500 documents was drawn from the 67,784 (seed=42) and each was checked individually via an authenticated Playwright Firefox session with canary validation every 50 requests. Results: 474 confirmed removed (94.8%), 26 confirmed live (5.2%). The false positives clustered entirely in the subgroup of 404s that lacked a Last-Modified response header in the original scan (0/362 false positives in the with-Last-Modified group; 26/138 = 18.8% in the without-Last-Modified group). Applying the observed false positive rate to the full population yields an estimated ~64,259 genuinely removed documents (95% CI: 62,940-65,578). All figures in this report have been updated to reflect this estimate. The ten documents cited by EFTA number were independently re-verified and remain 404.
EXECUTIVE SUMMARY
Between February 18 and 20, 2026, we conducted a comprehensive HTTP integrity scan of every document in the Department of Justice's Epstein Files Transparency Act (EFTA) production hosted at justice.gov/epstein. Of the 1,380,941 documents in the production corpus, we successfully scanned 1,365,677 (98.89%). An initial count of 78,234 HTTP 404 responses was refined through authenticated browser rescanning (67,784 confirmed 404s) and statistical sampling (estimated ~64,259 genuinely removed documents, 95% CI: 62,940-65,578). These files were part of the original DOJ publication but have been silently removed from the public server without any Federal Register notice, without Congressional notification, and without published legal justification — all of which are required by the statute.
Browser-authenticated verification and statistical sampling confirm these are real removals, not scanner artifacts. The false positive rate is 5.2% (26/500 sampled), concentrated entirely among documents whose 404 responses lacked a Last-Modified header — a fingerprint of the DOJ's age-verification gate, not a genuine removal.
The removed documents are not random. They are concentrated overwhelmingly in Dataset 9 — the dataset containing the FBI's investigative files, prosecutorial correspondence, Bureau of Prisons records, and grand jury materials. Below, we examine ten removed documents — organized across five categories — that contain no victim personally identifying information and for which no lawful basis for removal exists under the statute. These include: the Bureau of Prisons' detention logs and death incident report, the BOP's psychological reconstruction of Epstein's death, the FBI's 476-page death investigation file, Kenneth Starr's 36-page lobbying fax to US Attorney Acosta, the complete 51-page investigation timeline, the federal grand jury presentation outline, bank statements for an Epstein-linked shell entity, and 2,153 pages of phone records subpoenaed by the grand jury.
We also document a significant new finding: since our scan, the DOJ has been quietly restoring some previously removed documents — including the Non-Prosecution Agreement itself, FBI financial records, and BOP medical files. This partial restoration confirms the documents were removed, but the pattern of what remains 404 is revealing: documents about how Epstein died and how the prosecution was influenced remain removed, while other categories have been restored.
Additionally, numerous other removed documents contain extensive, unredacted victim PII — full names, dates of birth, home addresses, phone numbers, and family members of victims who were minors at the time of the crimes. The DOJ failed to redact this information before publication, then removed the documents wholesale rather than fixing the redactions. We are not identifying those documents here — see "A Note on Victim Privacy" below.
We have conducted our own targeted PII redaction of victim-identifying information across approximately 1,400 pages in our research databases at epstein-data.com, removing 2,099 instances of victim names, phone numbers, SSNs, dates of birth, and addresses. Every document cited below links to both the DOJ URL (which returns 404) and our OCR text database (where the full document can be read with victim PII redacted).
Additionally, 23,989 documents that remain accessible show file sizes smaller than their originals — suggesting post-publication re-processing or content removal that has not been disclosed.
PART I: THE EPSTEIN FILES TRANSPARENCY ACT
What the Law Requires
The Epstein Files Transparency Act (Public Law 119-38, H.R. 4405) became law on November 19, 2025. In plain terms, it orders the Attorney General to make public — within 30 days — all unclassified records the Department of Justice holds relating to Jeffrey Epstein. The word "all" is not rhetorical. The statute specifies nine categories of material that must be released:
The statute requires these records be made available in "searchable and downloadable format."
What the Law Prohibits
Section 2(b) creates an absolute prohibition: No record shall be withheld, delayed, or redacted based on embarrassment, reputational harm, or political sensitivity — including to any government official, public figure, or foreign dignitary. This is not a balancing test. It is a categorical bar.
The Five Permitted Carveouts — And Their Limits
Section 2(c) defines exactly five categories of material the Attorney General may withhold or redact. These are the only lawful bases for removing any document from the production:
(A) Victim Personally Identifiable Information — Records may be withheld or redacted where they contain personally identifiable information of victims, or victims' personal and medical files, where disclosure "would constitute clearly unwarranted invasion of personal privacy." This protects victim names, addresses, phone numbers, medical records, and identifying details. It does not protect the names of perpetrators, government officials, law enforcement, attorneys, or other non-victim participants. It does not authorize removing an entire document when targeted redaction of victim identifiers would suffice. (B) Child Sexual Abuse Material (CSAM) — Material depicting or containing CSAM as defined under 18 U.S.C. § 2256 may be withheld. Section 2256 defines CSAM to include visual depictions — photographs, films, videos, and digital images. This does not extend to investigative reports, legal filings, or text documents that reference or describe CSAM without depicting it. (C) Active Federal Investigation or Ongoing Prosecution — Material may be withheld where disclosure "would jeopardize an active federal investigation or ongoing prosecution." The statute adds two critical qualifiers: such withholding must be "narrowly tailored" (removing only the specific material that would cause jeopardy, not entire documents or categories) and "temporary" (implying an obligation to release once the investigation concludes). This does not apply to investigations that have been resolved, declined, or closed. (D) Images of Death, Physical Abuse, or Injury — Material may be withheld where it depicts or contains "images of death, physical abuse, or injury of any person." This covers visual depictions — not text descriptions, investigative reports, or administrative records that discuss these events without depicting them visually. (E) National Defense or Foreign Policy — Material specifically authorized under criteria established by Executive order to be kept secret in the interest of national defense or foreign policy may be withheld. Any post-July 1, 2025 classification decision must be published in the Federal Register.The Procedural Requirement That Makes Removal Unlawful
Even where one of these five carveouts applies, the statute imposes a procedural requirement that the DOJ has not followed. Section 2(c)(2) mandates: "All redactions must be accompanied by written justification published in the Federal Register and submitted to Congress." The Attorney General must declassify to the maximum extent possible and, for any material that cannot be declassified, must release an unclassified summary.
As of this writing, no Federal Register notice has been published justifying the removal of ~64,000 documents. No unclassified summaries have been provided. On February 14, 2026, the Attorney General submitted a report to Congress listing categories of records released and withheld — the report required by Section 3 of the EFTA. That report does not constitute the Federal Register publication required separately by Section 2(c)(2). The removals are, on their face, procedurally non-compliant with the statute regardless of whether any substantive carveout might apply to individual documents.
The DOJ's Stated Explanation
On February 5, 2026, the U.S. Attorney's Office for the Southern District of New York filed a letter to the court (Case 1:19-cr-00490-RMB, Document 105) explaining its approach to post-publication removals. The letter, signed by Attorney General Pamela Bondi, Deputy Attorney General Todd Blanche, and U.S. Attorney Jay Clayton, states that:
- Victims and victim counsel can contact
[email protected]to flag insufficiently redacted documents
- Documents flagged through that process "are removed for further review by Department personnel"
- Approximately 7,000 documents were flagged based on "identifiers not previously provided to the Department or based on apparent instances of first names only, nicknames, misspellings or alternate spellings of names, or initials appearing in documents"
- The Department acknowledged "human errors, technical errors, and instances in which the effectiveness of certain quality control measures appears to have been hampered"
- Approximately 9,500 documents subject to the Maxwell Protective Orders were also removed
This explanation accounts for a subset of the removals. It does not account for the removal of Bureau of Prisons detention logs containing no victim information, financial records of Epstein-linked shell entities, or the BOP's own psychological reconstruction of Epstein's death — none of which contain victim-identifying information that would be flagged through the [email protected] process.
PART II: AUDIT METHODOLOGY
Corpus Definition
The DOJ's EFTA production consists of 1,380,941 documents organized across 12 datasets. Each document is assigned an EFTA number (e.g., EFTA00074206) and hosted as a PDF at a predictable URL:
``
https://www.justice.gov/epstein/files/DataSet%20{N}/EFTA{NUMBER}.pdf
`
Our corpus manifest was built from the full_text_corpus.db database, which contains the complete text extraction of all documents in the production. The manifest includes every EFTA number and its dataset assignment.
Scanning Method
HTTP HEAD requests were sent to every document URL in the production. Scans were conducted February 18-20, 2026. Each request recorded the HTTP status code, Content-Length, Content-Type, and response latency. The DOJ serves these files through an Akamai CDN that applies aggressive rate-limiting — particularly on Dataset 9, which contains over 531,000 documents and required multiple scanning passes to achieve coverage.
Deduplication and Verification
All scan results were merged using a priority system: HTTP 200 (document present) takes precedence over HTTP 404 (removed), which takes precedence over error codes. This ensures that if a document was accessible in any scan pass, it is counted as present.
All 78,234 documents initially flagged as HTTP 404 were submitted to a second verification pass. False positives (documents that returned 404 initially but 200 on reverification) were reclassified as present, reducing the count to 67,784.
Manual Playwright Verification
Because the DOJ's Akamai CDN serves a JavaScript challenge page before delivering content, standard HTTP tools (curl, wget) can return misleading results. The DOJ's age-verification gate returns HTTP 404 (not 302 or 200) to unauthenticated requests, which caused our initial scanner to misclassify approximately 10,450 live documents as removed. We confirmed removals by loading documents in browser sessions using Playwright (a browser automation tool) with both the Akamai bot-check cookie and the DOJ's justiceGovAgeVerified age-gate cookie. Removed documents display the DOJ's standard 404 error page with a Last-Modified date of September 2, 2025. Present documents load as PDFs with Content-Type: application/pdf.
Statistical Sampling Verification
To quantify the residual false positive rate after the authenticated rescan, we drew a simple random sample of 500 documents from the 67,784 confirmed-404 population (random seed 42, uniform sampling). Each URL was checked sequentially (one request at a time, 0.3-second delay) via fetch() inside an authenticated Playwright Firefox session. Canary validation was performed every 50 requests: a known-live document (EFTA00000001, Dataset 1) was verified to return HTTP 200 with Content-Type: application/pdf, and a known-nonexistent document (EFTA99999999) was verified to return HTTP 404. All 10 canary checks passed.
| Verdict | Count | Rate |
| --------- | ------- | ------ |
| Confirmed removed (HTTP 404) | 474 | 94.8% |
| Confirmed live (HTTP 200, PDF) | 26 | 5.2% |
header presence:
| Subgroup | Sampled | False Positives | FP Rate | Population Est. |
| ---------- | --------- | ----------------- | --------- | ----------------- |
With Last-Modified header | 362 | 0 | 0.0% | ~47,430 |
Without Last-Modified header | 138 | 26 | 18.8% | ~20,354 |
The false positives cluster entirely in the without-Last-Modified subgroup. These are documents where the age gate's 404 response is indistinguishable from a genuine removal based on headers alone. The with-Last-Modified group (documents whose 404 response carried a Last-Modified: Tue, 02 Sep 2025 header matching the DOJ's standard error page) showed a 0% false positive rate across 362 samples.
The sampling script, raw results CSV (500 rows), and source data are published on GitHub for independent replication.
Limitations
- 15,264 documents (1.1%) were unreachable — all in Dataset 9 — due to Akamai CDN rate-limiting before scanning completed. These are recorded as UNSCANNED, not as removals.
- 2,614 documents returned HTTP 401 (CDN authentication challenge), also not counted as confirmed removals.
- 235 EFTA numbers fall in gaps between dataset ranges and were not included in the manifest.
- Our HEAD requests do not verify document content — only whether the URL returns a valid response. A document that has been replaced with a different file at the same URL would not be detected.
PART III: AUDIT RESULTS
Summary
| Metric | Count | Percentage |
| -------- | ------- | ------------ |
| Total corpus | 1,380,941 | 100% |
| Documents scanned | 1,365,677 | 98.89% |
| Unscanned (Akamai blocked) | 15,264 | 1.11% |
| HTTP 200 (present) | 1,284,829 | 93.04% |
| HTTP 404 (confirmed) | 67,784 | 4.91% |
| Estimated genuinely removed | ~64,259 | ~4.65% |
| HTTP 401 (auth failure) | 2,614 | 0.19% |
| Zero-byte files (200 but empty) | 26 | <0.01% |
The initial scan flagged 78,234 documents as HTTP 404. An authenticated rescan reduced this to 67,784. Statistical sampling (see Methodology above) estimates ~3,525 residual false positives, yielding ~64,259 genuinely removed documents (95% CI: 62,940-65,578).
Per-Dataset Breakdown
| Dataset | Manifest | Scanned | Present | Removed (confirmed 404) | Removal Rate |
| --------- | ---------- | --------- | --------- | --------- | -------------- |
| 1 | 3,158 | 3,158 | 3,147 | 11 | 0.35% |
| 2 | 574 | 574 | 564 | 10 | 1.74% |
| 3 | 67 | 67 | 62 | 5 | 7.46% |
| 4 | 152 | 152 | 151 | 1 | 0.66% |
| 5 | 120 | 120 | 118 | 2 | 1.67% |
| 6 | 13 | 13 | 12 | 1 | 7.69% |
| 7 | 17 | 17 | 17 | 0 | 0.00% |
| 8 | 10,595 | 10,595 | 10,579 | 16 | 0.15% |
| 9 | 531,284 | 516,020 | 450,242 | 65,778 | 12.74% |
| 10 | 503,154 | 503,154 | 502,076 | 1,078 | 0.21% |
| 11 | 331,655 | 331,655 | 330,777 | 878 | 0.26% |
| 12 | 152 | 152 | 148 | 4 | 2.63% |
| TOTAL | 1,380,941 | 1,365,677 | 1,297,893 | 67,784 | 4.96% |
Dataset 9 accounts for 97.0% of all confirmed 404s (65,778 of 67,784). After applying the 5.2% false positive rate from statistical sampling, we estimate approximately 62,360 genuinely removed documents in Dataset 9 alone. This is the dataset that contains the bulk of the FBI investigative files, prosecutorial correspondence, Bureau of Prisons records, and grand jury materials — the categories of documents most directly responsive to the EFTA's disclosure mandate.
Size Mismatch Analysis
Separately, we compared the Content-Length reported by the DOJ server for documents that are still accessible against the original file sizes recorded in our database (extracted from the original production). Of the documents returning HTTP 200:
- 23,989 documents have different file sizes than their originals — 20,276 (84.5%) are smaller, 3,713 (15.5%) are larger
- The largest reduction lost 99% of its original file size; the largest increase grew over 500%
- Unlike the removals (concentrated in Dataset 9), size changes are spread across Datasets 9, 10, and 11 — suggesting a different post-publication re-processing pattern that has not been disclosed
Open Data
The complete scan results are published as open data on GitHub for independent verification and analysis:
- CONFIRMED_REMOVED.csv — All 67,784 confirmed-404 documents with EFTA number, justice.gov URL, dataset, page count, scan metadata, and first-page text preview. Statistical sampling estimates ~5.2% are residual false positives from the age gate; 26 known false positives from the sample are flagged in the sample_verified
column. See Statistical Sampling Verification above.
- SIZE_MISMATCHES.csv — All 23,989 documents with file size changes (20,276 reductions, 3,713 increases), including original size, current size, byte difference, and percentage change.
The full OCR text of every document in the EFTA corpus is searchable at epstein-data.com, where victim-identifying information has been redacted (2,099 instances across approximately 1,400 pages).
PART IV: REMOVED DOCUMENTS — CASE-BY-CASE ANALYSIS
The following analysis examines ten removed documents organized across five categories. Each was selected because it contains no victim personally identifying information and can be cited with full EFTA numbers without risk of directing readers to leaked victim PII. For each document, we provide: a description of its contents, an assessment of its relevance to the EFTA's disclosure mandate, and a carveout-by-carveout analysis of whether any lawful basis for removal exists.
All ten documents were read in full from our text corpus. Local PDF copies are preserved. 404 status was manually verified on February 21, 2026.
Our research database at epstein-data.com contains the full OCR text of every document cited below, with victim-identifying information redacted. Every EFTA link below points to the DOJ's server (which returns 404); the OCR link points to our copy.
A. Death and Detention Records
Five documents directly responsive to Section 2(a)(9) of the EFTA — "documentation of Epstein's detention or death" — remain removed from justice.gov. Together they constitute 2,895 pages of the government's own records of what happened in Epstein's housing units, how the BOP responded to his death, and what the FBI found when it investigated.
Document 1: BOP TRUINTEL Log — Composite (General Population + SHU + Inventory)
EFTA: EFTA00053963 · OCR text Dataset: 9 | Pages: 1,000 | Status: HTTP 404This is a composite Federal Bureau of Prisons document containing TRUINTEL (intelligence tracking) logs for both General Population and the Special Housing Unit (9 South) at Metropolitan Correctional Center New York, plus MCC inventory reports. The logs cover July 1 through August 19, 2019 — the entire period of Epstein's incarceration and death. They record, shift by shift: officer rounds, inmate counts, cell searches, feeding schedules, watch calls, and key/radio checks.
This is the official minute-by-minute operational record of the housing units where Epstein was held. It is the document that would show whether institutional protocols were followed during the period leading to his death.
EFTA Relevance: Directly responsive to Section 2(a)(9) — "documentation of Epstein's detention or death (incident reports, witness interviews, medical examiner files, autopsy reports, written records detailing circumstances and cause of death)." Carveout Analysis:- (A) Victim PII: The log contains names and register numbers of other inmates housed in the unit. These individuals are not "victims" under the statute — they are incarcerated persons. Even if treated as protected, their names could be redacted without removing a 1,000-page document.
- (B) CSAM: No.
- (C) Active investigation: The guard investigation (Tova Noel and Michael Thomas) was resolved through deferred prosecution agreements. No active investigation exists.
- (D) Images of death/abuse: The document is a text-based administrative log. It contains no images.
- (E) Classified: No.
Document 2: BOP TRUINTEL Log — 9 South Special Housing Unit
EFTA: EFTA00120887 · OCR text Dataset: 9 | Pages: 395 | Status: HTTP 404This is the companion TRUINTEL log for the Special Housing Unit (SHU), 9 South, at MCC New York — the exact unit where Epstein was housed after his July 7, 2019 transfer from general population, and where he died on August 10, 2019. The log covers July 1 through August 19, 2019, recording every officer round, watch call, count, cell search, and feeding for the SHU.
This is the single most critical operational record for evaluating guard conduct on the night of Epstein's death. It would show whether the 30-minute round schedule was actually maintained or whether there were gaps on August 9-10, 2019.
EFTA Relevance: Directly responsive to Section 2(a)(9) — detention and death documentation. Carveout Analysis:- (A) Victim PII: Same as Document 1 — other SHU inmates' names could be redacted.
- (B) CSAM: No.
- (C) Active investigation: Same as Document 1 — guard investigation resolved.
- (D) Images of death/abuse: Text-based administrative log. No images.
- (E) Classified: No.
Document 3: Psychological Reconstruction of Inmate Death
EFTA: EFTA00041963 · OCR text Dataset: 9 | Pages: 1,000 | Status: HTTP 404This is the BOP's official Psychological Reconstruction of Jeffrey Epstein's death, prepared by the National Suicide Prevention Coordinator. The report is the BOP's own internal finding that its systems failed. On its opening page, it acknowledges three critical limitations:
The report documents that of 71 staff members who were emailed the notification that Epstein needed a cellmate after being removed from Psychological Observation, only 27 opened the email — meaning 62% of relevant staff never read the message. It details Epstein's institutional history, the July 23 incident involving cellmate Nicholas Tartaglione and a noose, the August 9 court document unsealing that reportedly distressed Epstein, and the systemic failures in monitoring.
EFTA Relevance: Directly responsive to Section 2(a)(9) — this is the death investigation reconstruction. Carveout Analysis:- (A) Victim PII: One victim name appears in the antecedent circumstances section. It could be redacted in a single line.
- (B) CSAM: No.
- (C) Active investigation: The report references "pending investigations" as of its drafting, but the guard investigation has since been resolved and no other active investigation has been disclosed.
- (D) Images of death/abuse: The document describes the death but contains no images of it.
- (E) Classified: No.
Document 4: BOP Form 583 — Report of Incident (Death of Jeffrey Epstein)
EFTA: EFTA00120010 · OCR text Dataset: 9 | Pages: 24 | Status: HTTP 404This is BOP Form 583, the official "Report of Incident" filed by MCC staff on August 10, 2019, at 6:33 AM. It is the contemporaneous, structured incident report for inmate death — the form the Bureau of Prisons uses to document what happened, who was involved, and what actions were taken.
The form records the incident as occurring in the SHU, 9 South, classifying it as "Inmate Death" with method "Hanging/Asphyxiation." It names inmate Epstein, Jeffrey, Register No. 76318-054, with injury category "Fatal Injury." It lists the staff who responded, including officers Tova Noel and Michael Thomas.
The description of incident states: "On August 10, 2019, at approximately 6:33 a.m., while serving the breakfast meal inmate Epstein, Jeffrey, Reg. No. 76318-054 was found unresponsive in his cell. Staff called for assistance and began life saving measures. He was escorted to Health Services at approximately 6:39 a.m., and EMS arrived at 6:43 a.m. He was transported to the local hospital at approximately 7:10 a.m. Inmate Epstein was pronounced deceased at 7:36 am."
Attached staff memoranda describe "circumficial bruising around the neck" and document the chain of custody from cell to hospital to morgue, including precise timestamps for every transfer. One lieutenant's memo instructs staff "not to speak to anyone and or the media in regards to the situation."
Two additional copies of this Form 583 (EFTA00109863 and EFTA00134560) are also confirmed removed.
EFTA Relevance: Directly responsive to Section 2(a)(9) — this is an incident report detailing circumstances of death. Carveout Analysis:- (A) Victim PII: No victim information. Only BOP staff names and Epstein's own register number.
- (B) CSAM: No.
- (C) Active investigation: Guard investigation resolved.
- (D) Images of death/abuse: Text-based incident report and staff memoranda. No images.
- (E) Classified: Marked "UNCLASSIFIED/LIMITED OFFICIAL USE ONLY/LAW ENFORCEMENT SENSITIVE" — an administrative handling caveat, not a classification under Executive order.
Document 5: FBI 302 Serial 64 — Death Investigation at MCC
EFTA: EFTA00132208 · OCR text Dataset: 9 | Pages: 476 | Status: HTTP 404This is a 476-page FBI investigative file under case number 90A-NY-3151227, titled "UNSUB(S); JEFFREY EPSTEIN — VICTIM; DEATH INVESTIGATION." It contains FD-302 interview reports from the FBI's investigation into Epstein's death at MCC New York, along with evidence collection logs from the FBI's Computer Analysis Response Team (CART).
The opening interview (dated August 28, 2019) documents agents interviewing MCC inmate Leonardo Fernandez in Cell 218 on L-Tier of the SHU. Fernandez reports that Epstein "would be in legal from approximately 9am to 9pm and he was housed in cell 220 in L Tier" and that he "had no information regarding the death of EPSTEIN."
Subsequent pages include CART evidence collection forms documenting the seizure of a 250 GB Western Digital hard drive (Model WD2500JD, S/N WMAEP1322402) containing "generated reports and data extractions" related to CART Request #218509. The file also includes OIG (Office of Inspector General) forms signed by MCC employees during "Voluntary Interviews" — documenting the FBI's efforts to reconstruct what happened.
EFTA Relevance: Directly responsive to Section 2(a)(9) — this is the FBI's death investigation file, and Section 2(a)(1) — investigation records. Carveout Analysis:- (A) Victim PII: The file interviews MCC inmates and staff about the events surrounding Epstein's death. No trafficking victim information.
- (B) CSAM: No. The CART evidence relates to MCC operations, not CSAM.
- (C) Active investigation: No ongoing death investigation has been disclosed. The guard case was resolved.
- (D) Images of death/abuse: Text-based FD-302 interview reports and evidence logs. No images.
- (E) Classified: No.
B. Prosecution and Defense Records
Two documents that reveal the internal dynamics of the Epstein prosecution — how the case was influenced and how the defense team lobbied to shape its outcome — remain removed from justice.gov.
Document 6: Kenneth Starr / Kirkland & Ellis Fax to US Attorney Acosta
EFTA: EFTA00176111 · OCR text Dataset: 9 | Pages: 36 | Status: HTTP 404This is a 36-page fax transmitted on December 11, 2007, from Kirkland & Ellis LLP to United States Attorney R. Alexander Acosta, Southern District of Florida. It consists of two letters: a cover letter from Kenneth W. Starr (the former Independent Counsel and Solicitor General, then at Kirkland & Ellis) and a substantive 33-page brief authored by Jay P. Lefkowitz, also of Kirkland & Ellis. The fax is part of a larger 99-page transmission.
The document is a detailed defense lobbying brief filed during negotiations over the Non-Prosecution Agreement. Starr introduces the submission by stating: "In the combined 250 years experience of Jeffrey's defense team, we have together and individually concluded that this case is not only extraordinary and unprecedented, it is deeply and uniquely troubling."
The first letter (Lefkowitz) is organized into three sections. Section I attacks the evidentiary foundation of the case, arguing that Palm Beach Police Detective Recarey made "crucial misstatements" in the Police Report and Probable Cause Affidavit compared to tape-recorded witness statements. Section II catalogs alleged misconduct by AUSA Marie Villafana, including:
- Filing the PBPD Probable Cause Affidavit with the court knowing it contained errors
- Issuing overly broad document requests that Deputy Chief Laurie later narrowed
- Initiating an 18 U.S.C. 1591 (sex trafficking) investigation without required DOJ notification
- Personally calling "Mr. Epstein's largest and most valued business client without any basis to inform him of the investigation"
- Attempting to nominate as the attorney representative "a very good personal friend of Ms. Villafana's boyfriend, a fact she assiduously kept hidden from counsel"
- When asked whether DOJ coordination policies were followed, giving "no response other than stating, 'it is none of your concern'"
The second letter (Lefkowitz) argues that the inclusion of 18 U.S.C. Section 2255 — the civil remedy statute for trafficking victims — as a component of a criminal plea agreement was "unprecedented," stating: "Neither Section 2255, nor any other civil remedy statute, has been used as a pre-requisite to criminal plea agreement."
EFTA Relevance: Directly responsive to Section 2(a)(7) — "all internal DOJ communications concerning decisions to charge, not charge, investigate, or decline to investigate Epstein or associates" — and Section 2(a)(6) — "all immunity deals, non-prosecution agreements, plea bargains." Carveout Analysis:- (A) Victim PII: The letter discusses victims in the context of attacking their credibility. Our PII redaction process has removed victim names from the OCR text in our database.
- (B) CSAM: No.
- (C) Active investigation: No.
- (D) Images of death/abuse: No.
- (E) Classified: No.
Document 7: Complete Epstein Investigation Timeline
EFTA: EFTA00224943 · OCR text Dataset: 9 | Pages: 51 | Status: HTTP 404This is a 51-page chronological timeline of the entire Epstein investigation, marked "Privileged Confidential" and "Contains 6(e) Material." It was prepared as an internal prosecution reference document — a comprehensive log of every significant event in Operation Leap Year from May 2006 through the Non-Prosecution Agreement.
The timeline documents, entry by entry: every grand jury subpoena issued (to Colonial Bank, Washington Mutual, Capital One, Chase, Hyperion Air, JEGE Inc., and dozens of other entities), every grand jury testimony session (Special Agent E. Nesbitt Kuyrkendall testified at least seven times), every meeting between prosecutors and defense counsel, every internal email chain between AUSA Villafana and her supervisors, and every piece of correspondence between the US Attorney's Office and Epstein's defense team.
Key entries include:
- 5/23/2006: "File Opening Documents for Operation Leap Year"
- 7/24/2006: Letter from Palm Beach Police Chief Reiter referring the case to the FBI after dissatisfaction with the State Attorney's handling
- 7/26/2007: "Meeting where Criminal Chief Menchel announces that USA Acosta will offer a two-year state plea"
- 5/23/2007: Villafana drafts email stating she "disagrees with promising to have a meeting with Lefcourt or any of Epstein's other attorneys. Believes the prior meeting disclosed prosecution strategy and holding another meeting will disclose even more. If meeting is held, Villafana will ask to have the case reassigned."
- 8/7/2007: "Email chain between Marie Villafana, Cyndee Campos, and Alex Acosta regarding meeting to discuss Epstein matter"
This document is, in essence, the prosecution's own internal index to how the case was built and then dismantled. It identifies every subpoena, every witness, every internal disagreement, and every meeting with defense counsel that led to the NPA.
EFTA Relevance: Directly responsive to Section 2(a)(7) — internal DOJ communications about charging decisions — and Section 2(a)(1) — investigation records. Carveout Analysis:- (A) Victim PII: Victims are referenced by number ("Individual #4," "Individual #28"), not by name. The numbering system is the prosecution's own anonymization.
- (B) CSAM: No.
- (C) Active investigation: The 6(e) marking reflects grand jury secrecy. As with Document 9 below, Rule 6(e) is not one of the EFTA's five carveouts — the statute overrides it.
- (D) Images of death/abuse: No.
- (E) Classified: No.
C. Grand Jury Materials
Document 8: Grand Jury Presentation Outline — Operation Leap Year
EFTA: EFTA00192670 · OCR text Dataset: 9 | Pages: 77 | Status: HTTP 404This is AUSA Marie Villafana's actual script for presenting the Epstein case to the federal grand jury. Titled "GRAND JURY PRESENTATION — OPERATION LEAP YEAR," it is the step-by-step outline the prosecutor used to walk the grand jury through the evidence.
The document is organized into sections: an introduction summarizing the new indictment and current charges; a document return section logging subpoena responses; substantive testimony from Special Agent Kuyrkendall; and individual sections for each victim (referred to by Jane Doe pseudonyms). It references evidence types including flight manifests, telephone billing records, and "message pads" from Epstein's residence that recorded calls from specific individuals.
The presentation outline describes charges under 18 U.S.C. 1591 (sex trafficking of minors) and references co-defendants Sarah Kellen ("SK"), Adriana Ross, and Nadia Marcinkova. It documents the structure of the trafficking operation: how victims were recruited, how appointments were scheduled, and how payment was made.
This is the document that shows how close the federal government came to securing a multi-count indictment for sex trafficking of minors — the indictment that was never filed because of the Non-Prosecution Agreement.
EFTA Relevance: Directly responsive to Section 2(a)(1) — "all investigations, prosecutions, or custodial matters involving Epstein" — and Section 2(a)(7) — internal DOJ communications about charging decisions. Carveout Analysis:- (A) Victim PII: Victims are referred to by Jane Doe pseudonyms — the prosecution's own anonymization. While a small number of initials are visible in the OCR text, these are the prosecution's own working notations, not published victim identifiers.
- (B) CSAM: No.
- (C) Active investigation: No.
- (D) Images of death/abuse: No.
- (E) Classified: No.
D. Financial Records
Document 9: Southern Trust Company USVI Bank Statements
EFTA: EFTA01670187 · OCR text Dataset: 10 | Pages: 455 | Status: HTTP 404These are FirstBank Virgin Islands commercial checking account statements for Southern Trust Company Inc., 6100 Red Hook Quarters Suite B-3, St. Thomas, VI. The account shows a $1,000 opening deposit in December 2012, a $100,000 incoming wire transfer in March 2013, and subsequent disbursements through numbered checks.
The Bates stamps show these were produced by SDNY as confidential grand jury materials (marked "CONFIDENTIAL SDNY_GM_00017243").
EFTA Relevance: Directly responsive to Section 2(a)(5) — entities tied to Epstein's financial networks. Carveout Analysis:- (A) Victim PII: No.
- (B) CSAM: No.
- (C) Active investigation: The "CONFIDENTIAL" marking reflects Federal Rule of Criminal Procedure 6(e) grand jury secrecy. However, the EFTA specifically requires release of grand jury-related materials in Section 2(a)(1), and the underlying bank records are not grand jury testimony — they are business records subpoenaed by the grand jury.
- (D) Images of death/abuse: No.
- (E) Classified: No.
E. FBI Investigation Records
Document 10: Telephone Records — Local Usage Report for Epstein's Manhattan Line
EFTA: EFTA01242527 · OCR text Dataset: 9 | Pages: 2,153 | Status: HTTP 404This is a 2,153-page Local Usage Report for telephone number 212-750-1176 — the landline at Jeffrey Epstein's residence at 9 East 71st Street, Manhattan. The billing records document every outgoing call from the line: date, time, number called, duration, and charges. The records span from at least January 2003 onward.
These telephone records are pure billing data — no names, no content, no victim information of any kind. They are the evidentiary foundation for the call pattern analysis used in the federal investigation: which numbers Epstein called, how frequently, and when. The grand jury presentation outline (Document 8 above) repeatedly references "telephone records" as a primary evidence category, instructing the jury to examine "telephone calls between Jane Doe #X and Kellen" and "calls with Marcinkova."
At 2,153 pages, this is the single largest removed document we have identified by page count.
EFTA Relevance: Directly responsive to Section 2(a)(1) — investigation records — and Section 2(a)(3) — records relating to travel and communications. Carveout Analysis:- (A) Victim PII: No names, no addresses, no identifying information. Only phone numbers and billing data.
- (B) CSAM: No.
- (C) Active investigation: No.
- (D) Images of death/abuse: No.
- (E) Classified: No.
F. The Partial Restoration Pattern
Since our scan was conducted (February 18-20, 2026), we have discovered that the DOJ has been quietly restoring some previously removed documents to justice.gov. This is a significant development with implications for both the factual record and the legal analysis.
On February 21, 2026, we loaded approximately 45 documents from our flagged removal list in standard browser sessions. The results reveal two distinct waves of restoration:
Wave 1 (January 30, 2026): A broad restoration of documents across multiple categories, with Last-Modified timestamps of January 30, 2026. Restored documents include:
- BOP medical records and psychological assessments
- BOP suicide watch logs and handwritten observation notes
- MCC daily assignment rosters (who was on duty and when)
- SDNY search warrants (for Epstein's Manhattan residence and Maxwell's New Hampshire property)
- FBI financial investigation records (Gulfstream invoices, Deutsche Bank due diligence)
- FBI agent text messages about the case
- OCDETF Target Profile intelligence documents
- The federal Plea Agreement
- Defense correspondence from Kirkland & Ellis
Wave 2 (February 19, 2026 — during our scan): A second, more targeted restoration occurred during or after our scanning period:
- The Non-Prosecution Agreement itself
- AUSA Villafana's 651-page prosecution email archive
- The FBI "Epstein in town" movement tracker
- FinCEN BSARs (Suspicious Activity Reports)
- Deutsche Bank KYC files
- Chase bank records
What remains removed: Despite these two waves of restoration, the documents examined in this report's case studies — and others like them — remain 404:
- All three BOP TRUINTEL detention logs (Documents 1-2)
- The BOP Psychological Reconstruction of Epstein's death (Document 3)
- The BOP Form 583 Death Incident Report, in triplicate (Document 4)
- The FBI's 476-page death investigation file (Document 5)
- The Starr/Lefkowitz defense lobbying fax (Document 6)
- The complete 51-page investigation timeline (Document 7)
- The grand jury presentation outline (Document 8)
- Southern Trust Company bank statements (Document 9)
- 2,153 pages of phone records (Document 10)
The pattern is revealing. Documents that were likely removed due to victim PII concerns — medical records, search warrants with victim addresses, prosecution emails naming victims — have been restored, presumably after re-review and possible re-redaction. But documents about how Epstein died in custody (the TRUINTEL logs, psychological reconstruction, death incident report, FBI death investigation), how the prosecution was influenced (the Starr lobbying fax, the investigation timeline), and the evidentiary foundations of the abandoned case (the grand jury outline, the phone records) remain removed.
The partial restoration also confirms, unambiguously, that these documents were removed. If they had never been accessible — if these were scanner errors or CDN artifacts — there would be nothing to "restore." The January 30 and February 19
Last-Modified dates show that the DOJ uploaded new copies to URLs that previously returned 404.
No Federal Register notice accompanied either wave of restoration. No Congressional notification was made. The restoration, like the removal, was silent.
Pattern Analysis
The ten documents examined in detail above total 5,667 pages, all confirmed removed (HTTP 404) as of February 21, 2026. None have any plausible legal basis for removal under any of the EFTA's five carveout categories:
# Document Pages Content EFTA Section Carveout?
--- ---------- ------- --------- -------------- -----------
1 BOP TRUINTEL Composite 1,000 Detention logs, GP + SHU + inventory 2(a)(9) None
2 BOP TRUINTEL 9 South 395 SHU logs — Epstein's exact unit 2(a)(9) None
3 BOP Psychological Reconstruction 1,000 Death investigation — "no video, no interviews" 2(a)(9) None
4 BOP Form 583 24 Death incident report — "Hanging/Asphyxiation" 2(a)(9) None
5 FBI 302 Serial 64 476 Death investigation — inmate interviews, CART evidence 2(a)(9), 2(a)(1) None
6 Starr/Lefkowitz Fax to Acosta 36 Defense lobbying against victim notification 2(a)(7), 2(a)(6) None
7 Investigation Timeline 51 Complete chronology of Operation Leap Year 2(a)(7), 2(a)(1) None
8 GJ Presentation Outline 77 Prosecution script for grand jury 2(a)(1), 2(a)(7) None
9 Southern Trust Company Statements 455 USVI shell entity financials 2(a)(5) None
10 Phone Records 2,153 Epstein's Manhattan line — billing data 2(a)(1), 2(a)(3) None
Five of the ten are directly responsive to Section 2(a)(9) — documentation of Epstein's detention and death. Two concern the internal dynamics of the prosecution and the defense's efforts to influence it. One is the evidentiary script for a grand jury that was building toward a sex trafficking indictment that was never filed. One shows financial flows through an Epstein shell entity. One contains the raw telephone data the investigation relied on.
Together with the partial restoration analysis — which shows the DOJ selectively restoring some categories of documents while leaving others removed — the evidence suggests a pattern in which institutional accountability records are disproportionately likely to remain unavailable.
The Dataset 9 Concentration
The overwhelming concentration of confirmed 404s in Dataset 9 (65,778 of 67,784, or 97.0%) is significant because Dataset 9 contains the FBI's investigative files, prosecutorial correspondence, Bureau of Prisons records, and grand jury materials. These are the categories of documents most directly responsive to the EFTA's nine disclosure categories. Datasets containing less sensitive material (depositions, court filings, public records) show removal rates below 1%.
PART V: THE BROADER PICTURE
What Else We Found — And Why We Are Not Showing You
The four documents above were selected for full citation because they contain no victim PII. But our review of additional removed documents — including internal prosecutorial emails, grand jury testimony, NPA breach correspondence, law enforcement affidavits, and court filings — revealed a much larger problem.
Multiple removed documents contain extensive, unredacted victim personally identifying information that the DOJ failed to adequately redact before publication. In the most severe case, we found a complete FBI victim contact sheet embedded within an email chain: approximately 30 victims' full names, dates of birth, home addresses, phone numbers, email addresses, and family members — for victims who were between 13 and 17 years old at the time of the crimes. This document was not only published with this information exposed — it was subsequently restored to the DOJ's server after being briefly removed, with the victim PII still intact.
Other documents we reviewed contained victim dates of birth leaked through incomplete redaction of police affidavits, full residential addresses given in grand jury testimony, victim school names combined with ages and dates that create identification vectors, and victim MySpace profiles quoted in deposition transcripts.
We are not identifying these documents by EFTA number, and we are not linking to them. We will not create a roadmap to pages where the DOJ's redaction failed.
This matters for the legal analysis because it reveals the DOJ's approach in both directions: documents that should be publicly accessible (BOP logs, institutional reports, shell company financials) were removed, while documents containing the most sensitive victim PII were published — and in some cases restored — without adequate redaction. The failure is not one of over-caution or under-caution. It is a failure of process.
A Note on Victim Privacy
We conducted targeted PII redaction of victim-identifying information in our own text corpus at epstein-data.com. Over three rounds of redaction, we removed 2,099 instances of victim PII — names, phone numbers, Social Security numbers, dates of birth, and residential addresses — across approximately 1,400 pages in roughly 400 documents. The process required building a private watchlist of 173 identifiers for 35 confirmed victims, running automated detection with manual verification, and maintaining separate blacklists and whitelists for surnames that could be either victim identifiers or non-victim proper nouns.
The difficulty was extreme. A single law enforcement affidavit can run over 1,000 pages of OCR-processed text, with victim names and identifying details scattered unpredictably throughout — sometimes in narrative, sometimes in structured police forms where the OCR is so garbled that automated detection produces hundreds of false positives for every real match. School names that could narrow victim identification appear hundreds of times across the broader corpus. Dates of birth appear in formats that overlap with case numbers, booking dates, and field labels.
We redacted the specific victim-identifying information we were able to confirm in our databases, but we cannot guarantee completeness. The complete list of victim PII we identified has been cataloged in a private research file that is never published, never committed to any repository, and never deployed to any server.
We say this not to excuse the DOJ, but to underscore the scale of the problem they created — and to demonstrate that targeted PII redaction is achievable. We redacted 2,099 instances across 1,400 pages using two researchers and automated tools. The DOJ, with its institutional resources, chose instead to remove entire documents wholesale. This is the opposite of the "narrowly tailored" standard the statute requires.
PART VI: LEGAL IMPLICATIONS
1. Procedural Non-Compliance
The removal of approximately 64,000 documents without published Federal Register justification violates Section 2(c)(2) of the EFTA on its face. The statute does not contain an exception for post-publication removal. Every withholding, every redaction requires written justification published in the Federal Register and submitted to Congress. The Attorney General's February 14, 2026 report to Congress — required by Section 3 — lists categories of withheld material but is not a Federal Register publication.
2. Wholesale Removal vs. Narrowly Tailored Redaction
Even where a carveout genuinely applies — such as victim PII in a law enforcement affidavit — the statute authorizes redaction of segregable portions, not removal of entire documents. We know from firsthand experience that targeted PII redaction in these documents is painstaking: it requires identifying specific lines on specific pages out of thousands of OCR-processed pages. But it is work the DOJ is obligated to do under the statute — and removing documents wholesale is not a substitute.
3. The Embarrassment Prohibition
Section 2(b) prohibits withholding based on "embarrassment, reputational harm, or political sensitivity." The BOP TRUINTEL logs and psychological reconstruction expose systemic failures in Epstein's detention — failures the BOP itself acknowledged. The EFTA was enacted precisely because Congress determined these documents should be public despite their embarrassing content.
4. Searchable and Downloadable Format
Section 2(a) requires records be made available in "searchable and downloadable format." Removing approximately 64,000 PDFs from the server makes them neither searchable nor downloadable. This is a straightforward violation of the accessibility mandate.
PART VII: METHODOLOGY NOTES AND DATA FILES
Generated Data Files
The complete scan data — including the full list of all 67,784 confirmed-404 EFTA numbers — is preserved in our research archive. We are not publishing the complete document-level data at this time because our review identified extensive unredacted victim PII in multiple removed documents, and publishing the full list would direct public attention to those pages. The underlying data has been shared with investigative journalists who have the editorial infrastructure to handle victim-identifying material responsibly.
Reproducibility
The scanning methodology is fully documented and the complete scan results are preserved for independent verification. Any researcher who independently identifies a removed EFTA number can verify its status using the method below.
How to Check Yourself (No Coding Required)
To verify a removal:
Open your browser's Developer Tools (F12), go to the Network tab
Navigate to any EFTA PDF on justice.gov (click through the age gate)
Click the PDF request in the Network tab
Look at the Response Headers — find Last-Modified and Content-Length
If Last-Modified shows September 2, 2025, you are looking at the DOJ's 404 error page, not the document — the file has been removed
If Last-Modified` shows a later date (January or February 2026), the file was re-uploaded after the original publication
To check if a still-accessible document was modified:
CONCLUSION
The Epstein Files Transparency Act was enacted with bipartisan support because Congress determined that the Department of Justice's records relating to Jeffrey Epstein must be made available to the public. The statute defines five narrow carveouts, requires written Federal Register justification for any withholding, and categorically prohibits withholding based on embarrassment or political sensitivity.
The DOJ published 1,380,941 documents. Then approximately 64,000 of them disappeared — without Federal Register notice, without Congressional notification, without published justification. The DOJ has since quietly restored some of these documents, in at least two waves (January 30 and February 19, 2026), but critical categories remain removed.
Among the ten documents examined in detail above — totaling 5,667 pages, all confirmed removed as of February 21, 2026:
- Bureau of Prisons detention logs from the night Epstein died (1,395 pages across two TRUINTEL logs)
- The BOP's psychological reconstruction of his death, admitting "no video" and "no interviews" (1,000 pages)
- The BOP's Form 583 death incident report, filed the morning of August 10, 2019, removed in triplicate (24 pages each)
- The FBI's 476-page death investigation file, including inmate interviews and evidence collection logs
- Kenneth Starr's 36-page lobbying fax to US Attorney Acosta, arguing against victim notification
- The complete 51-page investigation timeline documenting every subpoena, meeting, and internal email in Operation Leap Year
- The grand jury presentation outline — the prosecution's actual script for the sex trafficking case that was never filed (77 pages)
- Southern Trust Company bank statements for an Epstein USVI shell entity (455 pages)
- 2,153 pages of telephone records from Epstein's Manhattan line — the raw billing data the investigation relied on
None of these documents contain victim PII. None have any plausible legal basis for removal under the statute's five carveouts. Meanwhile, documents containing the most sensitive victim information imaginable — full names, dates of birth, home addresses, phone numbers, and family members of minor victims — were published with inadequate redaction, and in at least one case restored to the server with the victim PII still exposed.
The partial restoration pattern is itself an admission. Documents that were likely removed for legitimate PII concerns have been restored — presumably after re-review and re-redaction. But institutional accountability records about Epstein's death and the prosecution's conduct remain removed. The DOJ's removal process is not protecting victims. It is removing the records that show how the system worked — and failed.
For the ten documents examined in full above, no lawful basis for removal exists under the statute. For all ~64,000 genuine removals, the procedural requirement of Federal Register notice was not followed. The complete list of removed documents has been shared with investigative journalists.
The bulk HTTP audit was conducted using Claude Code running Opus 4.6, which can make mistakes. All EFTA numbers, dataset assignments, and HTTP status codes were verified programmatically. All ten documents cited above were manually verified to return 404 on justice.gov as of February 21, 2026. Document contents were read from the extracted text corpus and cross-referenced against local PDF copies. All ten documents were reviewed page by page and found to contain no victim personally identifying information before their EFTA numbers were included in this report. The raw data is published on GitHub for independent verification.