OCR — Extract Text from Images

Use optical character recognition to extract text from images and scanned PDFs.

About OCR (Optical Character Recognition)

OCR turns a picture of text into actual text — the kind you can select, copy, search, and paste somewhere else. A scanned page, a photo of a printed letter, or a screenshot of a slide contains no text data, just pixels. Convertora's OCR tool reads those pixels with Tesseract.js, a JavaScript port of the open-source Tesseract engine, and returns the recognized characters as plain text.

Everything runs inside your browser. The first run downloads the recognition model for your chosen language (a few megabytes); after that it's cached and recognition happens entirely on your machine. No image, PDF, or extracted text is ever sent to a server — which matters when the document is a contract, a medical record, or anything else you wouldn't paste into a random website.

One distinction before you start: a digitally produced PDF (exported from Word, generated by an invoicing system) already contains real text, and the PDF to Text tool extracts it faster and with perfect accuracy. OCR is for scans, photographed pages, and faxes — where text exists only as an image.
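
The distinction above can be checked mechanically. The following is a minimal sketch, not how Convertora decides anything: it assumes you have already pulled the embedded text layer out of each PDF page by some other means, and the function name `looksScanned` and the 20-character threshold are hypothetical choices made for illustration.

```typescript
// Hypothetical heuristic: a page whose text layer holds almost no
// non-whitespace characters is probably a scanned image, not real text.
function looksScanned(pageTexts: string[], minCharsPerPage = 20): boolean {
  // No pages with extractable text at all: treat the file as a scan.
  if (pageTexts.length === 0) return true;

  // Count pages whose text layer is empty or nearly empty.
  const sparsePages = pageTexts.filter(
    (text) => text.replace(/\s/g, "").length < minCharsPerPage
  ).length;

  // Call the document a scan when more than half of its pages are sparse.
  return sparsePages / pageTexts.length > 0.5;
}
```

If this returns `false`, the file most likely already contains real text and direct extraction is the better route; if it returns `true`, OCR is the tool to reach for.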

How to use it

  1. Upload a JPG, PNG, or WebP image, or a scanned PDF, up to 50 MB, one file at a time.
  2. Pick the recognition language. The selector offers English, Spanish, French, German, Chinese (Simplified), Japanese, Korean, Portuguese, Italian, and Dutch; choose the language the document is written in.
  3. Click Extract Text. A PDF is rendered page by page to high-resolution images and recognized one page at a time, with the progress bar tracking the whole job.
  4. The recognized text appears in an output box with a character count; multi-page PDFs are separated by page-break markers.
  5. Click Copy to put the result on your clipboard, or select just the part you need.
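
For readers curious how steps 2 to 4 fit together, here is a rough sketch of the bookkeeping involved. It is an illustration under stated assumptions, not Convertora's code. The three-letter codes are the standard Tesseract language-data names for the ten listed languages. The helper names (`overallProgress`, `joinPages`, `characterCount`), the `--- Page N ---` marker format, and counting characters by Unicode code point are all hypothetical.

```typescript
// Tesseract language-data codes for the ten languages the selector lists.
const TESSERACT_LANG: Record<string, string> = {
  "English": "eng",
  "Spanish": "spa",
  "French": "fra",
  "German": "deu",
  "Chinese (Simplified)": "chi_sim",
  "Japanese": "jpn",
  "Korean": "kor",
  "Portuguese": "por",
  "Italian": "ita",
  "Dutch": "nld",
};

// Overall job progress in [0, 1] when pages are recognized one at a time.
// pageIndex is zero-based; pageProgress is the current page's own progress.
function overallProgress(
  pageIndex: number,
  pageCount: number,
  pageProgress: number
): number {
  if (pageCount <= 0) return 0;
  const clamped = Math.min(1, Math.max(0, pageProgress));
  return (pageIndex + clamped) / pageCount;
}

// Join per-page text. A single page is returned as-is (trimmed); multiple
// pages each get a marker line. The marker format is an assumption.
function joinPages(pages: string[]): string {
  if (pages.length <= 1) return pages.map((text) => text.trim()).join("");
  return pages
    .map((text, i) => `--- Page ${i + 1} ---\n${text.trim()}`)
    .join("\n\n");
}

// Count characters by Unicode code point, so each CJK character or emoji
// counts once (an assumption about what "character count" means).
function characterCount(text: string): number {
  return Array.from(text).length;
}
```

For example, halfway through the second page of a four-page PDF, `overallProgress(1, 4, 0.5)` gives `0.375`, which is what a single progress bar covering the whole job would show.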

Common use cases

  • Pulling a quote out of a screenshot — a slide, a tweet image, a webinar frame.
  • Digitizing an old paper document so you can search and reuse it instead of retyping.
  • Extracting the text of a scanned contract or letter to paste into an email or a new draft.
  • Getting the text out of a photo of a poster, menu, sign, or notice board.

Tips

  • Scan at 300 DPI or higher. For phone photos: fill the frame, hold the camera parallel to the page, avoid shadows.
  • Crop the image to just the text region — page edges, fingers, and desk background give the engine junk to misread.
  • Dark text on a light background recognizes best; boost the contrast of faded or pencil-written sources in an image editor first.
  • Proofread numbers, names, and email addresses in the output — exactly the strings where one misread character matters most.
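
The contrast tip can also be applied in code rather than in an image editor. The sketch below is one simple way to do it, a grayscale conversion followed by a linear contrast stretch, and is not something the tool is stated to do itself. It assumes raw RGBA pixel data in the layout a canvas `ImageData.data` array uses; the function name `stretchContrast` is hypothetical.

```typescript
// Convert RGBA pixels to grayscale and stretch the darkest value to 0 and
// the brightest to 255, so faded text gains contrast against the page.
function stretchContrast(rgba: Uint8ClampedArray): Uint8ClampedArray {
  const pixelCount = Math.floor(rgba.length / 4);
  const gray = new Float64Array(pixelCount);
  let min = 255;
  let max = 0;

  // First pass: compute luma per pixel (Rec. 601 weights) and its range.
  for (let i = 0; i < pixelCount; i++) {
    const y =
      0.299 * rgba[4 * i] + 0.587 * rgba[4 * i + 1] + 0.114 * rgba[4 * i + 2];
    gray[i] = y;
    if (y < min) min = y;
    if (y > max) max = y;
  }

  // Second pass: rescale linearly; a flat image is left at its gray level.
  const range = max - min;
  const out = new Uint8ClampedArray(rgba.length);
  for (let i = 0; i < pixelCount; i++) {
    const v = range > 0 ? ((gray[i] - min) / range) * 255 : gray[i];
    out[4 * i] = v;
    out[4 * i + 1] = v;
    out[4 * i + 2] = v;
    out[4 * i + 3] = rgba[4 * i + 3]; // keep alpha unchanged
  }
  return out;
}
```

A linear stretch is the gentlest option: it only helps when the source does not already span the full range from black to white, and it will not rescue uneven lighting or shadows, which is why the capture tips above still matter.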

100% private — runs in your browser

Convertora processes everything on your device using JavaScript and WebAssembly. Files never leave your browser, are never uploaded to a server, and are never seen by us or anyone else. The moment you close the tab, the data is gone — there is no temporary cloud copy, no log entry, no retained backup.

Because the work happens locally, processing speed depends on your device, but there are no rate limits and no daily caps, and the only size constraints are the 50 MB per-file limit and what your browser can hold in memory. No signup, no account, no payment. The tool works the same in incognito mode or on a corporate network, and once the page and the recognition model for your language have loaded, it keeps working even with the network disconnected.