use case
Legal document processing
Batch OCR, redact, and archive case files.
Legal teams live in PDF. Batch-OCR a 1,000-file subpoena response, extract structured text, and deliver a searchable bundle.
Operations used
- /ocr: Make scanned evidence searchable
- /extract-text: Pull text for eDiscovery indexing
- /watermark: Bates-number stamps
- /metadata: Tag custodian, date range, matter ID
- /protect: Lock down privileged material
Example workflow
- 01 Ingest raw scans from custodian folders: S3, SharePoint, or local disk.
- 02 SnapPDF ocr (searchablePdf=true): Embed text layer in place.
- 03 SnapPDF extract-text: Push into Elastic for review.
- 04 SnapPDF watermark: Bates number "BATES 000001".
- 05 SnapPDF protect: AES-256, owner-password only.
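
The Bates numbering in step 04 can be planned before any file is processed, so that the stamped label and the archive key always refer to the same number. The following is a minimal sketch under stated assumptions: `padBates` and `planBates` are hypothetical helpers, not part of SnapPDF, and the `matter/<id>/<number>.pdf` key layout and six-digit width are assumed for illustration.

```javascript
// Hypothetical helpers (not part of SnapPDF) for planning Bates numbers.

// Zero-pad a positive integer to a fixed width, e.g. 7 -> "000007".
function padBates(n, width = 6) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError(`Bates number must be a positive integer, got ${n}`);
  }
  return String(n).padStart(width, '0');
}

// Pair every input file with its stamp label and storage key up front.
// The key layout is an assumption made for this sketch.
function planBates(files, matterId, start = 1) {
  return files.map((file, i) => {
    const number = padBates(start + i);
    return {
      file,
      label: `BATES ${number}`,
      key: `matter/${matterId}/${number}.pdf`,
    };
  });
}

const examplePlan = planBates(['a.pdf', 'b.pdf'], 'M-42');
console.log(examplePlan[1].label); // prints "BATES 000002"
```

Computing the plan first also makes a rerun reproducible: as long as the input order is stable, each file receives the same number again.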
Code
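// Assumes `snap`, `s3`, `custodianFiles`, and `id` (the matter ID) are set up by the caller.
let bates = 1; // next Bates number to assign
for (const pdf of custodianFiles) {
  // Format the number once so the stamp and the storage key always match.
  const batesNo = String(bates++).padStart(6, '0');
  // OCR the scan and embed a searchable text layer.
  const { pdf: searchable } = await snap.pdf.ocr({
    file: pdf, searchablePdf: true, languages: ['eng'],
  });
  // Stamp the Bates number in the bottom-right corner.
  const stamped = await snap.pdf.watermark({
    file: searchable, kind: 'text',
    text: `BATES ${batesNo}`,
    position: 'bottom-right', opacity: 0.6,
  });
  await s3.putObject({ Body: stamped.pdf, Key: `matter/${id}/${batesNo}.pdf` });
}
Best for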
- Law firms
- In-house legal
- eDiscovery vendors