Redact discovery files, contracts, and FOIA releases with a review workflow your team can defend
PDFDancer detects sensitive data across finished PDFs, applies confidence scores, and produces audit trails for review. True content removal, not overlays. SDKs for Python, Java, and Node.js.
The Problem
Manual redaction does not scale for legal review
A single matter can produce tens of thousands of pages. Each file needs to be checked for PII, privileged content, and confidential information before release. Manual review is slow, expensive, and prone to missed items.
The Limitations
- Manual review does not scale across discovery volumes.
- Overlay tools hide text but leave the original content recoverable.
- Review teams still need a record of what was found and removed.
- GUI-only tools do not fit a document pipeline.
What PDFDancer Changes
- Automated detection across the full document set.
- Confidence scores for review routing.
- True content removal instead of black boxes.
- Audit trails that capture what changed.
- SDKs for Python, Java, and Node.js.
Review Workflow
Detect. Review. Redact.
Detect
Scan discovery files, contracts, and filings for names, dates, SSNs, addresses, account numbers, and other identifiers. Each finding includes a confidence score.
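The detection step can be pictured with a minimal pattern-based sketch in Python. This is not the PDFDancer SDK API: the `Finding` class, the `PATTERNS` table, and the confidence values are hypothetical, chosen only to show how each match can carry a score. The input is plain text standing in for content already extracted from a PDF.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    kind: str          # identifier type, e.g. "ssn"
    text: str          # the matched text
    start: int         # character offset where the match begins
    end: int           # character offset where the match ends
    confidence: float  # illustrative score between 0 and 1


# Hypothetical patterns and scores, for illustration only.
PATTERNS = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 0.95),
    "date": (re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"), 0.80),
    "account_number": (re.compile(r"\b\d{10,12}\b"), 0.60),
}


def detect(text: str) -> list[Finding]:
    """Return every pattern match in the text, ordered by position."""
    findings = []
    for kind, (pattern, confidence) in PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append(
                Finding(kind, match.group(), match.start(), match.end(), confidence)
            )
    return sorted(findings, key=lambda f: f.start)


if __name__ == "__main__":
    sample = "SSN 123-45-6789 filed 3/14/2021, account 1234567890."
    for finding in detect(sample):
        print(finding)
```

A fixed score per pattern is the simplest possible choice. A production detector would likely weigh context as well, which is why a bare digit run is given a lower score here than a fully formatted SSN.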
Review
High-confidence findings can move straight to redaction. Lower-confidence matches stay visible for attorney review, with the review trail preserved.
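Confidence-based routing can be sketched in a few lines of Python. The `Finding` class, the `route` function, and the 0.90 threshold are assumptions made for illustration; they are not part of the PDFDancer SDK, and the right threshold depends on the matter and the review policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    kind: str
    text: str
    confidence: float


# Hypothetical cutoff: findings at or above it skip manual review.
AUTO_REDACT_THRESHOLD = 0.90


def route(findings, threshold=AUTO_REDACT_THRESHOLD):
    """Split findings into (auto_redact, needs_review) by confidence."""
    auto_redact, needs_review = [], []
    for finding in findings:
        if finding.confidence >= threshold:
            auto_redact.append(finding)
        else:
            needs_review.append(finding)
    return auto_redact, needs_review


if __name__ == "__main__":
    findings = [
        Finding("ssn", "123-45-6789", 0.95),
        Finding("name", "J. Doe", 0.70),
    ]
    auto_redact, needs_review = route(findings)
    print("auto:", auto_redact)
    print("review:", needs_review)
```

Nothing is dropped in this sketch: every finding lands in exactly one of the two queues, so the lower-confidence matches stay available to the reviewing attorney.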
Redact & Audit
Approved content is removed from the PDF, not hidden under overlays. Every action is logged for the audit trail.
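One way to picture removal with logging is the Python sketch below. It works on a plain string as a stand-in for extracted PDF content and does not use the PDFDancer SDK; `redact_with_audit`, `to_jsonl`, and the entry fields are hypothetical names. The sketch assumes the approved spans do not overlap.

```python
import hashlib
import json
from datetime import datetime, timezone


def redact_with_audit(text, spans, reviewer):
    """Remove approved (start, end, kind) spans and log each removal.

    Returns (redacted_text, audit_entries). Spans must not overlap.
    """
    entries = []
    redacted = text
    # Process spans from the end of the text so earlier offsets stay valid.
    for start, end, kind in sorted(spans, reverse=True):
        removed = redacted[start:end]
        redacted = redacted[:start] + "[REDACTED]" + redacted[end:]
        entries.append({
            "kind": kind,
            "start": start,
            "end": end,
            # Store a digest, not the removed text, so the log does not leak it.
            "removed_sha256": hashlib.sha256(removed.encode("utf-8")).hexdigest(),
            "reviewer": reviewer,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
    return redacted, entries


def to_jsonl(entries):
    """Serialize audit entries as one JSON object per line."""
    return "\n".join(json.dumps(entry, sort_keys=True) for entry in entries)


if __name__ == "__main__":
    text = "Name: Jane Doe, SSN 123-45-6789"
    spans = [(6, 14, "name"), (20, 31, "ssn")]
    redacted, entries = redact_with_audit(text, spans, reviewer="reviewer-1")
    print(redacted)
    print(to_jsonl(entries))
```

The original characters are replaced in the string rather than covered, which mirrors the removal-versus-overlay distinction. For low-entropy values such as SSNs, a plain SHA-256 digest can be brute-forced, so a keyed hash would be a safer choice in a real audit log.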
Common Identifiers in Legal Documents
Use Cases
Discovery, contracts, and FOIA releases
Discovery and Litigation Support
Discovery sets are large, repetitive, and high risk. PDFDancer scans the set for names, dates, addresses, account numbers, and privileged content, then sends edge cases to attorney review.
Contracts and Due Diligence
When contracts need to move outside the company, redact party names, financial terms, and other sensitive provisions without rebuilding the PDF.
FOIA and Public Records
Public records releases need defensible removal and a review trail. PDFDancer produces the output and the record of what was removed.
Document Pipeline
Bring legal redaction into your Python, Node.js, or Java stack
Redaction SDK
See the redaction engine, benchmark data, and deployment options.
Open page →
Python SDK
Run legal redaction in Python services, workers, and document pipelines.
Open page →
Node.js SDK
Run legal redaction from Node.js and TypeScript backends.
Open page →
Java SDK
Batch-process discovery files and contracts from Java services.
Open page →
Redaction Tasks
Redact PDFs by line, paragraph, pattern, or batch job
Redact PDFs by line, paragraph, or pattern
Select legal content by line, paragraph, or pattern and remove it permanently.
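The three selection modes can be illustrated on plain text with the Python standard library. This is a sketch only, not the PDFDancer SDK: the function names are hypothetical, and a paragraph is assumed to be a block of text separated by blank lines.

```python
import re


def select_lines(text, line_numbers):
    """Return the requested lines, using 1-indexed line numbers."""
    lines = text.splitlines()
    return [lines[n - 1] for n in line_numbers if 1 <= n <= len(lines)]


def select_paragraphs(text, containing):
    """Return blank-line-separated paragraphs that contain the phrase."""
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [p for p in paragraphs if containing.lower() in p.lower()]


def select_pattern(text, pattern):
    """Return every substring that matches the regular expression."""
    return [match.group() for match in re.finditer(pattern, text)]


if __name__ == "__main__":
    text = (
        "Party A: Acme Corp\n"
        "Fee: $12,000\n"
        "\n"
        "This clause is privileged.\n"
        "Do not disclose."
    )
    print(select_lines(text, [2]))
    print(select_paragraphs(text, "privileged"))
    print(select_pattern(text, r"\$\d[\d,]*"))
```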
Open guide →
Batch redact multiple fields across all pages
Redact repeated fields across long document sets without manual page-by-page work.
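A batch pass over several fields can be sketched as a loop over pages and patterns. The example below is not the PDFDancer SDK: `FIELD_PATTERNS` and `batch_redact` are hypothetical, and each page is represented as a plain string of already extracted text.

```python
import re

# Hypothetical field patterns, for illustration only.
FIELD_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def batch_redact(pages, patterns=FIELD_PATTERNS, mask="[REDACTED]"):
    """Apply every field pattern to every page.

    Returns (redacted_pages, counts), where counts maps each field name
    to the number of replacements made across all pages.
    """
    redacted_pages = []
    counts = {name: 0 for name in patterns}
    for page_text in pages:
        for name, pattern in patterns.items():
            page_text, replaced = pattern.subn(mask, page_text)
            counts[name] += replaced
        redacted_pages.append(page_text)
    return redacted_pages, counts


if __name__ == "__main__":
    pages = [
        "Contact jane@example.com, SSN 123-45-6789.",
        "No identifiers here.",
        "SSN 987-65-4321 on file.",
    ]
    redacted_pages, counts = batch_redact(pages)
    for page in redacted_pages:
        print(page)
    print(counts)
```

The per-field counts give a quick sanity check after a batch run: a field that is expected on every page but shows a count of zero is a signal to revisit the pattern before release.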
Open guide →
Get Started
Send Us a Legal Document. We'll Show You the Output.
Send a representative discovery file, contract, or FOIA release, and we will run it through the redaction workflow so you can see the output on your own documents.