Show HN: Shadow Report – Why your "black box" redactions aren't hiding anything
3 points by cd_mkdir 4 days ago | 2 comments
Hello HN,

In high-conflict litigation, "black box" redactions are often a disaster waiting to happen. I realized that many people (and even law firms) use consumer-grade tools that leave "Ghost Layers": the original text layer or metadata still sitting underneath the digital ink.

I built the Shadow Report as a free forensic tool to prove this. You can upload a redacted page, and it scans for:

Ghost Text Layers: Checking if the searchable PDF layer still exists beneath the redaction blocks (using pypdf layer inspection).

Metadata Leaks: Extracting Author/Producer info that reveals who actually drafted the document.

Image Fingerprints: Scraping EXIF data that can geo-locate or time-stamp "anonymous" evidence.
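For anyone who wants to try the three checks above locally, here is a minimal sketch. It is not the Shadow Report's actual code. The function names (`pages_with_text`, `leaked_metadata`, `scan_pdf`, `exif_tags`) are made up for illustration, and it assumes pypdf and Pillow are installed. Note that it only flags pages where any text is still extractable; it does not check whether that text sits under a redaction box.

```python
from typing import Iterable, Protocol


class PageLike(Protocol):
    """Anything with pypdf's page.extract_text() shape."""
    def extract_text(self) -> str: ...


def pages_with_text(pages: Iterable[PageLike]) -> list[int]:
    """Return 0-based indexes of pages that still yield extractable text."""
    leaking = []
    for index, page in enumerate(pages):
        text = page.extract_text() or ""
        if text.strip():
            leaking.append(index)
    return leaking


def leaked_metadata(info) -> dict[str, str]:
    """Pick identifying fields out of a PDF info mapping (may be None)."""
    if not info:
        return {}
    keys = ("/Author", "/Creator", "/Producer")
    return {key: str(info[key]) for key in keys if info.get(key)}


def scan_pdf(path: str) -> dict:
    """Run both PDF checks on a file (hypothetical helper, needs pypdf)."""
    # Imported lazily so the helpers above work without pypdf installed.
    from pypdf import PdfReader

    reader = PdfReader(path)
    return {
        "pages_with_text": pages_with_text(reader.pages),
        "metadata": leaked_metadata(reader.metadata),
    }


def exif_tags(path: str) -> dict:
    """Return top-level EXIF tags of an image, keyed by name (needs Pillow)."""
    from PIL import ExifTags, Image

    with Image.open(path) as img:
        exif = img.getexif()
        # GPS coordinates live in a separate sub-IFD; this only lists the
        # top-level tags (camera make/model, timestamps, software, ...).
        return {ExifTags.TAGS.get(tag, str(tag)): value
                for tag, value in exif.items()}
```

A redacted page that shows up in `pages_with_text` is worth a closer look, since a properly flattened redaction normally leaves nothing for `extract_text()` to return in the blacked-out areas.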

Backstory: This is a component of a larger project called Exit Protocol. I started it after a friend was quoted $50k for a forensic accountant to trace "separate property" in a divorce. The math they use—the Lowest Intermediate Balance Rule (LIBR)—is deterministic, but accountants do it manually in Excel. I automated the LIBR math to handle 10k+ transactions via Celery/Postgres.
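To show why the LIBR math is deterministic, here is a rough sketch of the core rule. It is not Exit Protocol's implementation. The `Txn` type and `trace_separate` function are hypothetical, and it assumes the simplest reading of the rule: the traceable separate-property amount can never exceed the lowest account balance reached after the separate funds went in, and later community deposits do not replenish it. Real cases add jurisdiction-specific presumptions that this ignores.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    amount: Decimal          # positive = deposit, negative = withdrawal
    separate: bool = False   # True only for deposits of separate property


def trace_separate(transactions) -> Decimal:
    """Return the separate-property amount still traceable under LIBR."""
    balance = Decimal("0")
    separate = Decimal("0")
    for txn in transactions:
        balance += txn.amount
        if txn.amount > 0 and txn.separate:
            separate += txn.amount
        # LIBR cap: the claim can never exceed the running balance,
        # and an overdrawn account wipes it out entirely.
        separate = max(Decimal("0"), min(separate, balance))
    return separate


ledger = [
    Txn(Decimal("10000"), separate=True),  # inheritance deposited
    Txn(Decimal("5000")),                  # community paycheck
    Txn(Decimal("-12000")),                # balance drops to 3000
    Txn(Decimal("7000")),                  # community deposit, no replenishment
]
print(trace_separate(ledger))  # 3000
```

The account ends at 10,000, but only 3,000 is traceable as separate property, because the balance dipped to 3,000 in between.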

Stack:

Django 5.0 (Monolith) / Postgres

pypdf & Pillow for the forensic scanning

Celery for async processing of massive bank discoveries

Air-gapped "BYOK" model for law firms (Docker)

I'd love feedback on:

Are there other "Ghost Layer" detection methods I should implement (e.g., color-space delta analysis)?

For those in LawTech: How do you handle "PDFs from hell" (scanned, rotated, handwritten notes)? I'm currently using a custom OC-3 implementation.

Try the Redaction Check: https://exitprotocols.com/redaction-check/

Main Site: https://exitprotocols.com/





This lines up with something I’ve also seen a lot — most failures aren’t clever reconstruction attacks, they’re just leftover text layers or metadata that never got removed.

I took a simpler approach and built a small browser-only audit tool that just answers one question: is this PDF still leaking extractable content at all?

It doesn’t try to unredact or guess text, just flags whether text layers, hidden characters, or metadata are still present so you know whether the redaction actually worked.

https://audit.reactpdf.app

Curious if you’ve run into cases where PDFs look clean at the layer/metadata level but still leak via other mechanisms.


If you’ve ever had to prove which dollars in a drained bank account belong to you vs. a spouse, you’ve run into the Lowest Intermediate Balance Rule (LIBR). It’s a decades-old legal precedent (See v. See, 1966) that is a nightmare to calculate manually.

I’m a dev who got frustrated seeing forensic accountants charge $500/hr to do this in spreadsheets. So I built Exit Protocol to automate the forensic tracing and "impeachment" of financial lies.



