OCR PDF — extract text
Extract searchable text from scanned PDFs. Runs locally with Tesseract.
Accepted file type: PDF (application/pdf)
Maximum file size: 50 MB
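The two limits above could be enforced before any OCR work starts. The following is a minimal sketch, not the tool's actual code: the helper name `checkUpload`, the result type, and the reading of "MB" as 1024 × 1024 bytes are all assumptions made for illustration.

```typescript
// Limits taken from the page: PDF only, at most 50 MB.
// Treating 1 MB as 1024 * 1024 bytes is an assumption.
const ACCEPTED_MIME_TYPE = "application/pdf";
const MAX_SIZE_BYTES = 50 * 1024 * 1024;

type UploadCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical helper: validates a file's MIME type and size.
function checkUpload(mimeType: string, sizeBytes: number): UploadCheck {
  if (mimeType !== ACCEPTED_MIME_TYPE) {
    return { ok: false, reason: `Unsupported file type: ${mimeType}` };
  }
  if (sizeBytes > MAX_SIZE_BYTES) {
    return { ok: false, reason: "File is larger than 50 MB" };
  }
  return { ok: true };
}
```

In a browser, the two arguments would typically come from the `type` and `size` properties of the selected `File` object.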
About this tool
All processing happens in your browser. Your PDF is never uploaded to our servers. No account is required, and there are no usage limits.
Frequently asked questions
Is this OCR tool free?
Yes. It is completely free, with no usage limits and no account required.
Will my PDF be uploaded to a server?
No. Tesseract runs in your browser via WebAssembly — your PDF never leaves your device.
Does the first use require an internet connection?
Yes. The first time you select a language, we download about 12 MB of trained data for it from the Tesseract CDN. After that, OCR for that language works fully offline.
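The download-once behaviour described above can be illustrated with a small cache. This is only a sketch under stated assumptions: `createLanguageCache` and the `Loader` type are hypothetical names, the loader is injected rather than tied to any real CDN URL, and the cache lives in memory only. Working offline across browser sessions would need persistent browser storage, which this sketch does not cover.

```typescript
// Hypothetical loader signature: fetches trained data for one language code.
type Loader = (lang: string) => Promise<Uint8Array>;

// Returns a getter that calls `load` at most once per language
// and reuses the same promise on every later request.
function createLanguageCache(load: Loader): (lang: string) => Promise<Uint8Array> {
  const cache = new Map<string, Promise<Uint8Array>>();

  return function get(lang: string): Promise<Uint8Array> {
    let pending = cache.get(lang);
    if (pending === undefined) {
      pending = load(lang);
      cache.set(lang, pending);
      // If the download fails, forget the entry so a later call can retry.
      pending.catch(() => {
        cache.delete(lang);
      });
    }
    return pending;
  };
}
```

Storing the promise rather than the resolved data means two requests made in quick succession share one download instead of starting two.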
Which languages are supported?
English, Spanish, French, German, Italian, Portuguese, Dutch, Korean and Japanese. Mix two with a "+" (e.g. eng+spa) for multilingual documents.
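As a rough illustration of the "+" syntax, the sketch below builds a language string from one or two selections. The three-letter codes are the standard Tesseract codes for the nine languages listed; the helper name `buildLangString` and the strict one-or-two limit are assumptions based on the wording of this answer, not confirmed behaviour of the tool.

```typescript
// Tesseract language codes for the nine languages listed above.
const SUPPORTED_LANGS = [
  "eng", "spa", "fra", "deu", "ita", "por", "nld", "kor", "jpn",
] as const;

function isSupported(code: string): boolean {
  return (SUPPORTED_LANGS as readonly string[]).indexOf(code) !== -1;
}

// Hypothetical helper: joins one or two language codes with "+".
function buildLangString(codes: string[]): string {
  // Drop repeated codes while keeping the original order.
  const unique = codes.filter((code, i) => codes.indexOf(code) === i);
  if (unique.length === 0 || unique.length > 2) {
    throw new Error("Select one or two languages");
  }
  for (const code of unique) {
    if (!isSupported(code)) {
      throw new Error(`Unsupported language: ${code}`);
    }
  }
  return unique.join("+");
}
```

For example, `buildLangString(["eng", "spa"])` produces the `eng+spa` string mentioned above.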
What format is the output?
A plain .txt file containing the extracted text, page by page. PDF/A export with a searchable text layer is coming in a future update.
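One way per-page text could be assembled into a single plain-text file is sketched below. The helper name `formatPages` and the "--- Page N ---" separator are assumptions chosen for illustration; the tool's actual page layout in the .txt file is not specified here.

```typescript
// Hypothetical helper: joins per-page OCR text into one plain-text document.
// The "--- Page N ---" header is an assumed layout, not a confirmed format.
function formatPages(pages: string[]): string {
  const sections = pages.map(
    (text, i) => `--- Page ${i + 1} ---\n${text.trim()}`
  );
  // Blank line between pages, single newline at the end of the file.
  return sections.join("\n\n") + "\n";
}
```

In a browser, the resulting string could be wrapped in a `Blob` with type `text/plain` to offer it as a download.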