FREE · NO SIGNUP · 20+ TOOLS

OCR a scanned PDF. Searchable, selectable.

Drop a scanned PDF. We rasterise each page, recognise the text with Tesseract (100+ languages including Arabic, CJK, Cyrillic), and write a text layer back onto the PDF so it becomes searchable and selectable. Free tier: 5 pages per day.

QUESTIONS
How accurate is the OCR?

Typical accuracy is 95–99% on clean 300 DPI scans in supported scripts. Low-contrast scans, tight line spacing, or unusual fonts can pull accuracy down to around 80%. For larger jobs, use the API and pass explicit language hints for best results.
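
The figures above do not say how accuracy is counted. A minimal sketch, assuming it means character-level accuracy (one minus the edit distance between a known-correct transcript and the OCR output, divided by the transcript length), lets you check a sample page of your own. The function names here are illustrative, not part of the service.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum inserts, deletes and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute, or keep on a match
            ))
        prev = curr
    return prev[-1]


def char_accuracy(reference: str, recognised: str) -> float:
    """Character accuracy in [0, 1], measured against a reference transcript."""
    if not reference:
        return 1.0 if not recognised else 0.0
    errors = edit_distance(reference, recognised)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    truth = "The quick brown fox"
    ocr = "The qu1ck brown f0x"  # two misread characters
    print(f"{char_accuracy(truth, ocr):.1%}")
```

Two wrong characters in a 19-character line already put that line below 90%, which is why small scan defects show up quickly in the numbers.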

Which languages are supported?

100+ via Tesseract, including English, Spanish, French, German, Arabic, Chinese (Simplified and Traditional), Japanese, Korean, Russian, Greek, Hebrew, and Hindi. Pass a `languages` list to combine languages (e.g. `eng,spa` for a document that mixes English and Spanish).
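
For reference, the Tesseract command line joins multiple languages with `+` (for example `-l eng+spa`), while the list here is comma-separated. A small sketch of that mapping follows, assuming the service hands the codes to Tesseract unchanged; `to_tesseract_lang` is a hypothetical helper, and the pattern check is only a loose sanity test, not a list of installed languages.

```python
import re

# Loose shape check for codes such as "eng", "spa", "chi_sim", "chi_tra".
_CODE = re.compile(r"^[a-z]{3}(_[a-z]+)*$")


def to_tesseract_lang(languages: str) -> str:
    """Turn a comma-separated list like 'eng,spa' into Tesseract's 'eng+spa'."""
    codes = []
    for raw in languages.split(","):
        code = raw.strip()
        if not code:
            continue  # tolerate stray commas and whitespace
        if not _CODE.match(code):
            raise ValueError(f"unexpected language code: {code!r}")
        if code not in codes:
            codes.append(code)  # drop duplicates, keep first-seen order
    if not codes:
        raise ValueError("no language codes given")
    return "+".join(codes)


if __name__ == "__main__":
    print(to_tesseract_lang("eng,spa"))
```

List only the languages that actually appear in the document. Extra languages give the recogniser more candidates to confuse and tend to slow it down.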

Does the original layout survive?

Yes — we write the recognised text as an invisible layer aligned to each word’s bounding box. The visible scan is untouched, so copy-paste and search both work while the document still looks identical.
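
To make the alignment concrete, here is a rough sketch of how one word's pixel bounding box could be mapped into PDF coordinates and emitted as invisible text. It relies on two general PDF facts: default user space is 72 units per inch with the origin at the bottom-left, and text rendering mode 3 (`3 Tr`) draws nothing. The function names, the `/F1` font resource, and the choice of box height as font size are assumptions for illustration, not a description of the service's internals.

```python
def pixel_box_to_pdf(box, page_height_px, dpi=300):
    """Map (left, top, width, height) in pixels, top-left origin,
    to (x, y, width, height) in PDF points, bottom-left origin."""
    left, top, width, height = box
    scale = 72.0 / dpi  # PDF points per pixel
    x = left * scale
    y = (page_height_px - top - height) * scale  # flip the vertical axis
    return x, y, width * scale, height * scale


def escape_pdf_string(text):
    """Escape the characters that are special inside a PDF literal string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def invisible_word_ops(word, box, page_height_px, dpi=300, font="/F1"):
    """Content-stream operators that place `word` invisibly over its box.

    Simplified: font size is approximated by the box height, horizontal
    stretching to the exact box width is omitted, and only text that the
    font's simple encoding can represent is handled.
    """
    x, y, _w, h = pixel_box_to_pdf(box, page_height_px, dpi)
    return (f"BT 3 Tr {font} {h:.2f} Tf {x:.2f} {y:.2f} Td "
            f"({escape_pdf_string(word)}) Tj ET")


if __name__ == "__main__":
    # A word 1 inch from the left and 2 inches from the top of a US Letter scan.
    print(invisible_word_ops("Hello", (300, 600, 150, 50), page_height_px=3300))
```

Because the text is placed in the same coordinates as the pixels it was read from, a selection drawn over the scan picks up the matching words.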

Is OCR slow?

Browser-side: ~1.5–3 seconds per page at 300 DPI. For 100+ page jobs use the /api/v1/ocr endpoint — it streams pages in parallel on a Vercel Function and completes in a fraction of the time.
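
As a back-of-the-envelope check, the per-page range above can be turned into a total time estimate. This sketch assumes pages are independent and that parallel workers each run at the same per-page speed; the worker count of 10 is a made-up number for illustration, and upload, queueing and PDF assembly time are ignored.

```python
import math


def estimate_seconds(pages, seconds_per_page, workers=1):
    """Rough wall-clock estimate: pages are processed in batches of `workers`."""
    if pages < 0 or workers < 1:
        raise ValueError("pages must be >= 0 and workers >= 1")
    batches = math.ceil(pages / workers)
    return batches * seconds_per_page


if __name__ == "__main__":
    pages = 100
    for rate in (1.5, 3.0):
        one_at_a_time = estimate_seconds(pages, rate)
        in_parallel = estimate_seconds(pages, rate, workers=10)  # hypothetical worker count
        print(f"{rate} s/page: {one_at_a_time:.0f} s sequential, "
              f"{in_parallel:.0f} s with 10 workers")
```

At 1.5 to 3 seconds per page, a 100-page job takes roughly 2.5 to 5 minutes one page at a time, which is the gap that parallel processing closes.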

Can I OCR an image (JPG/PNG) directly?

Yes — the same endpoint accepts images. We wrap the image in a single-page PDF and run OCR on it. Output is a searchable PDF.