How accurate is the OCR?

On a clean, straight, well-lit scan of printed text, expect roughly 95-99% character accuracy. Accuracy falls with blur, low resolution, skew, low contrast, decorative fonts, and text over background images. A sharp 300 DPI scan almost always beats a phone photo taken at an angle.
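
To make the percentages concrete, here is a minimal sketch of how character accuracy is commonly measured: edit distance between the true text and the OCR output, divided by the length of the true text. The function names are hypothetical and this is not necessarily how the figures above were obtained.

```typescript
// Levenshtein edit distance: the minimum number of single-character
// insertions, deletions, and substitutions that turn `a` into `b`.
function editDistance(a: string, b: string): number {
  const s = Array.from(a);
  const t = Array.from(b);
  let prev: number[] = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[t.length];
}

// Character accuracy = 1 - (edit distance / length of the reference text).
function characterAccuracy(reference: string, ocrOutput: string): number {
  const refLength = Array.from(reference).length;
  if (refLength === 0) return ocrOutput.length === 0 ? 1 : 0;
  return Math.max(0, 1 - editDistance(reference, ocrOutput) / refLength);
}

// Expected number of wrong characters on a page at a given accuracy.
function expectedErrors(charsOnPage: number, accuracy: number): number {
  return Math.round(charsOnPage * (1 - accuracy));
}
```

In practical terms, a 2,000-character page at 99% accuracy still contains about 20 wrong characters, and at 95% about 100.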

Does it read handwriting?

Not reliably. Tesseract was built for printed text. Very neat block printing sometimes works partially, but cursive and ordinary handwriting come out mostly garbled — plan on typing handwritten material manually.

Which languages are supported?

Ten: English, Spanish, French, German, Chinese (Simplified), Japanese, Korean, Portuguese, Italian, and Dutch. The model for the selected language downloads on first use and is cached. Picking the wrong language sharply reduces accuracy.
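
For readers who know Tesseract, the ten languages correspond to the standard Tesseract language codes below. This is a sketch only: the codes are Tesseract's own, but whether the tool passes exactly these values internally is an assumption, and `languageCode` is a hypothetical helper.

```typescript
// Standard Tesseract language codes for the ten listed languages.
// Assumption: the tool maps its selector labels to these codes.
const LANGUAGE_CODES: Record<string, string> = {
  "English": "eng",
  "Spanish": "spa",
  "French": "fra",
  "German": "deu",
  "Chinese (Simplified)": "chi_sim",
  "Japanese": "jpn",
  "Korean": "kor",
  "Portuguese": "por",
  "Italian": "ita",
  "Dutch": "nld",
};

// Look up the code for a selector label; reject anything not in the list.
function languageCode(name: string): string {
  const code = LANGUAGE_CODES[name];
  if (code === undefined) {
    throw new Error(`Unsupported language: ${name}`);
  }
  return code;
}
```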

How long does a big PDF take?

Recognition is CPU-heavy and runs page by page — budget a few seconds per page, so a 50-page scan can take a few minutes. Keep the tab open and in the foreground.
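
As a rough worked example of that budget, the sketch below multiplies page count by an assumed per-page time. The 3 and 5 second figures are illustrative stand-ins for "a few seconds", not measured values, and both function names are hypothetical.

```typescript
// Total recognition time when pages are processed one after another.
function estimateSeconds(pages: number, secondsPerPage: number): number {
  return pages * secondsPerPage;
}

// Format a number of seconds as "M min S s".
function formatDuration(totalSeconds: number): string {
  const total = Math.round(totalSeconds);
  const minutes = Math.floor(total / 60);
  const seconds = total % 60;
  return `${minutes} min ${seconds} s`;
}

// Assumed range for a 50-page scan: 3 to 5 seconds per page.
const fastEstimate = formatDuration(estimateSeconds(50, 3)); // "2 min 30 s"
const slowEstimate = formatDuration(estimateSeconds(50, 5)); // "4 min 10 s"
```

Actual times vary with the device, the page resolution, and how dense the text is.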

Is the layout preserved?

No. The output is plain text. Line breaks roughly follow the original, but columns, tables, fonts, and positioning are not reconstructed. For an editable document, use the OCR to Word tool; for tables, use Table OCR.

Is my document uploaded anywhere?

No. The file is processed in your browser's memory and never leaves your device. Apart from loading the page itself, the only network request is the one-time model download for each language you select.

OCR — Extract Text from Images

Use optical character recognition to extract text from images and scanned PDFs.

About OCR (Optical Character Recognition)

OCR turns a picture of text into actual text — the kind you can select, copy, search, and paste somewhere else. A scanned page, a photo of a printed letter, or a screenshot of a slide contains no text data, just pixels. Convertora's OCR tool reads those pixels with Tesseract.js, a JavaScript port of the open-source Tesseract engine, and returns the recognized characters as plain text.

Everything runs inside your browser. The first run downloads the recognition model for your chosen language (a few megabytes); after that it's cached and recognition happens entirely on your machine. No image, PDF, or extracted text is ever sent to a server — which matters when the document is a contract, a medical record, or anything else you wouldn't paste into a random website.

One distinction before you start: a digitally produced PDF (exported from Word, generated by an invoicing system) already contains real text, and the PDF to Text tool extracts it faster and with perfect accuracy. OCR is for scans, photographed pages, and faxes — where text exists only as an image.

How to use it

1. Upload a JPG, PNG, or WebP image, or a scanned PDF, up to 50 MB, one file at a time.
2. Pick the recognition language. The selector offers English, Spanish, French, German, Chinese (Simplified), Japanese, Korean, Portuguese, Italian, and Dutch. Choose the language the document is written in.
3. Click Extract Text. A PDF is rendered page by page to high-resolution images and recognized one page at a time, with the progress bar tracking the whole job.
4. The recognized text appears in an output box with a character count; multi-page PDFs are separated by page-break markers.
5. Click Copy to put the result on your clipboard, or select just the part you need.
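
The output step above can be pictured with a short sketch that joins per-page text with page-break markers and counts characters. The marker format `--- Page N ---`, the `OcrResult` shape, and the choice to include markers in the count are all assumptions for illustration; the tool's actual markers may look different.

```typescript
interface OcrResult {
  text: string;
  characterCount: number;
}

// Combine recognized text from each page into one output string.
// A single image or one-page PDF gets no markers; multi-page input
// gets a marker line before each page (hypothetical format).
function joinPages(pageTexts: string[]): OcrResult {
  const cleaned = pageTexts.map((t) => t.trim());
  const text =
    cleaned.length <= 1
      ? cleaned.join("")
      : cleaned.map((t, i) => `--- Page ${i + 1} ---\n${t}`).join("\n\n");
  // Count Unicode code points so multi-unit characters count once each.
  return { text, characterCount: Array.from(text).length };
}
```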

Common use cases

Pulling a quote out of a screenshot — a slide, a tweet image, a webinar frame.
Digitizing an old paper document so you can search and reuse it instead of retyping.
Extracting the text of a scanned contract or letter to paste into an email or a new draft.
Getting the text out of a photo of a poster, menu, sign, or notice board.

Frequently asked questions

Tips

Scan at 300 DPI or higher. For phone photos: fill the frame, hold the camera parallel to the page, avoid shadows.
Crop the image to just the text region — page edges, fingers, and desk background give the engine junk to misread.
Dark text on a light background recognizes best; boost the contrast of faded or pencil-written sources in an image editor first.
Proofread numbers, names, and email addresses in the output — exactly the strings where one misread character matters most.
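
If you post-process the output yourself, the proofreading tip can be partly automated. The sketch below lists tokens that contain a digit or an @ sign so you can check them against the original first. `tokensToProofread` is a hypothetical helper, and it does not attempt to detect names, which still need a manual read.

```typescript
// Return the tokens most worth double-checking: anything containing
// a digit (amounts, dates, IDs) or an "@" (email addresses).
function tokensToProofread(text: string): string[] {
  return text
    .split(/\s+/)
    // Strip leading and trailing punctuation such as commas and brackets.
    .map((token) => token.replace(/^[^\w@]+|[^\w@]+$/g, ""))
    .filter((token) => token.length > 0)
    .filter((token) => /\d/.test(token) || token.includes("@"));
}

const flagged = tokensToProofread(
  "Invoice 1O24 due 2024-03-01, contact jane@example.com today"
);
// flagged is ["1O24", "2024-03-01", "jane@example.com"]
```

Here the first flagged token shows a typical misread: the letter O where the digit 0 belongs.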

100% private — runs in your browser

Convertora processes everything on your device using JavaScript and WebAssembly. Files never leave your browser, are never uploaded to a server, and are never seen by us or anyone else. The moment you close the tab, the data is gone — there is no temporary cloud copy, no log entry, no retained backup.

Because the work happens locally, processing speed depends on your device, but there are no rate limits and no daily caps; the only size limit is the 50 MB per-file cap. No signup, no account, no payment. The tool works the same in incognito mode or on a corporate network, and once the page and the model for your chosen language have loaded, it keeps working even with the network disconnected.