iText is an OCR tool that recognizes text in any image.
You can use iText to extract text from PDFs, paper documents, book pages, and any other images. Getting an image into iText is easy: use the built-in tool to capture any part of the screen, drag an image onto iText’s icon in the menu bar, or select an image file. Recognition is very accurate: it is powered by Google’s online OCR service and supports 50+ languages.
1 Easily Select Image
iText supports several ways to select an image, and all of them are quick.
1.1 Capture Screen
iText has a built-in screen capture tool. Just press the shortcut ⇧⌘1 and capture any area of the screen, and iText extracts the text in it.
Tip: The recognized text is copied to the system clipboard, so you can paste it directly.
1.2 Drag the Image to Menubar Icon
For example, when you see an image on Twitter and want to extract the text or numbers in it, just drag the image onto iText’s menubar icon and the recognized text appears.
1.3 Choose Image File
Of course, you can also select a picture file to recognize. However, dragging the file as described above is usually the better option.
1.4 Continuously Recognize
For example, if you take screenshots of several areas of a PDF, iText recognizes the text in each one in turn and automatically concatenates the results.
2 Accurately Recognize Text
Have you had this experience? You extract the text from a picture and find errors in the recognized text, and fixing them by hand takes longer than typing the text yourself would have.
Accuracy of recognition is obviously very important, and that’s why I work hard on it.
2.1 Powered by Google
First, I ruled out offline recognition libraries, because an offline library is static and cannot improve on its own. Then I compared the online OCR services from Microsoft, Google, and others.
In the end I chose Google’s service because it is powerful and recognizes 50+ languages.
- For ordinary natural language, such as a page of a book or a press release, recognition is remarkably accurate, at times reaching 100%.
- For complex typesetting, especially with special characters (e.g., program source code), the result isn’t as good, and you may need to correct it manually after recognition.
- For example, given just a vertical line, the machine cannot tell a lowercase l from an uppercase I (by the way, can you tell them apart?). To resolve such cases it has to understand the context, and understanding non-natural language such as program source code is still too hard for machines.
Give it a try and see how accurate the recognition is.
2.2 Optimize the Recognition Results
OCR services recognize the text in an image accurately, but they are weaker at higher-level structure such as paragraph detection.
So iText includes its own algorithm to optimize the result, e.g.:
- Automatically identify paragraphs.
- Remove extra spaces between English words and punctuation characters.
- Capitalize the first letter for English.
If you find that the optimization does not work well on an image, please send the image to me and I will tune the algorithm for it. Thanks in advance.
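As a rough illustration of the last two clean-up steps listed above, here is a minimal Python sketch. It is not iText’s actual algorithm: the function name `tidy_ocr_text` and the exact rules are assumptions made for this example.

```python
import re


def tidy_ocr_text(text: str) -> str:
    """Hypothetical clean-up pass; not iText's real implementation."""
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop any space that sits directly before a punctuation mark.
    text = re.sub(r" +([,.;:!?])", r"\1", text)
    text = text.strip()
    # Capitalize the first character and leave the rest untouched.
    return text[:1].upper() + text[1:]


print(tidy_ocr_text("hello ,  world ."))
```

Paragraph detection is left out of the sketch because it depends on layout information (line positions and gaps) that a plain string does not carry.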
2.3 Preview the Original Image for Proofing
Because current OCR technology cannot always recognize text with 100% accuracy, you sometimes need to check the result against the original image. In iText, you can:
- Drag the result window next to the image.
- Show the image on the left side of the result window.
Either way, correcting the result becomes much easier.
2.4 Auto Hide Recognition Result
Because iText’s recognition results are very accurate and are already copied to the clipboard, there is often no need to edit or copy the text after recognition. In that case you can turn on the “Auto Hide” option, and the recognition result window will be hidden automatically after 3 seconds.
On the other hand, if you occasionally need to edit a result, just move the mouse over the result window and auto hide is skipped for that result. In addition, the window is never hidden automatically while the “Pin” option is turned on.
3 Automatically Translate
After recognizing the text in an image, iText can automatically translate it into 100+ languages, powered by Google.
Download
You can recognize text from images 20 times each month for free, or subscribe to iText Pro for unlimited recognition.
If you find iText helpful, please rate it on the Mac App Store and leave a short review.
If you have any problems using iText or any suggestions for improvement, please feel free to contact me.
I’m looking forward to hearing from you.
What is Capture2Text?
Capture2Text enables users to quickly OCR a portion of the screen using a keyboard shortcut. The resulting text will be saved to the clipboard by default.
Capture2Text is free and licensed under the terms of the GNU General Public License.
Download
The latest version can be found on the Capture2Text download page hosted by SourceForge.
System Requirements
Supported operating systems:
- Windows 7
- Windows 8/8.1
- Windows 10
Note: Windows XP support has been dropped as of Capture2Text v4.0.
How to Launch Capture2Text (no installation required)
- Unzip the contents of the zip file.
- Double-click on Capture2Text.exe. You should see the Capture2Text icon on the bottom-right of your screen (though it might be hidden in which case you will have to click on the 'Show hidden icons' arrow).
Installing Additional OCR Languages
By default Capture2Text comes packaged with the following languages: English, French, German, Japanese, Korean, Russian, and Spanish.
Follow these steps if you would like to install additional OCR languages:
- Download the appropriate OCR language dictionary.
- Open the '.zip' file you just downloaded with 7-Zip or similar decompression software.
- Drag all files contained within the zip file to the tessdata folder:
- Restart Capture2Text.
Afrikaans (afr) | Greek (ell) | Odiya (ori) |
Albanian (sqi) | Gujarati (guj) | Panjabi (pan) |
Amharic (amh) | Haitian (hat) | Persian (fas) |
Ancient Greek (grc) | Hebrew (heb) | Polish (pol) |
Arabic (ara) | Hindi (hin) | Portuguese (por) |
Assamese (asm) | Hungarian (hun) | Pushto (pus) |
Azerbaijani (aze) | Icelandic (isl) | Romanian (ron) |
Basque (eus) | Indic (inc) | Russian (rus) |
Belarusian (bel) | Indonesian (ind) | Sanskrit (san) |
Bengali (ben) | Inuktitut (iku) | Serbian (srp) |
Bosnian (bos) | Irish (gle) | Sinhala (sin) |
Bulgarian (bul) | Italian (ita) | Slovak (slk) |
Burmese (mya) | Japanese (jpn) | Slovenian (slv) |
Catalan (cat) | Javanese (jav) | Spanish (spa) |
Cebuano (ceb) | Kannada (kan) | Swahili (swa) |
Central Khmer (khm) | Kazakh (kaz) | Swedish (swe) |
Cherokee (chr) | Kirghiz (kir) | Syriac (syr) |
Chinese - Simplified (chi_sim) | Korean (kor) | Tagalog (tgl) |
Chinese - Traditional (chi_tra) | Kurukh (kru) | Tajik (tgk) |
Croatian (hrv) | Lao (lao) | Tamil (tam) |
Czech (ces) | Latin (lat) | Telugu (tel) |
Danish (dan) | Latvian (lav) | Thai (tha) |
Dutch (nld) | Lithuanian (lit) | Tibetan (bod) |
Dzongkha (dzo) | Macedonian (mkd) | Tigrinya (tir) |
English (eng) | Malay (msa) | Turkish (tur) |
Esperanto (epo) | Malayalam (mal) | Uighur (uig) |
Estonian (est) | Maltese (mlt) | Ukrainian (ukr) |
Finnish (fin) | Marathi (mar) | Urdu (urd) |
Frankish (frk) | Math/Equations (equ) | Uzbek (uzb) |
French (fra) | Middle English (1100-1500) (enm) | Vietnamese (vie) |
Galician (glg) | Middle French (1400-1600) (frm) | Welsh (cym) |
Georgian (kat) | Nepali (nep) | Yiddish (yid) |
German (deu) | Norwegian (nor) |
How to Perform a Standard OCR Capture
Follow these steps to perform a standard OCR capture using the capture box:
- Position your mouse pointer at the top-left corner of the text that you want to OCR.
- Press the OCR hotkey (Windows Key + Q) to begin an OCR capture.
- Move your mouse to resize the blue capture box over the text that you want to OCR. You may hold down the right mouse button and drag to move the entire capture box.
- Press the OCR hotkey again (or left-click or press ENTER) to complete the OCR capture. The OCR'd text will be placed in the clipboard and a popup showing the captured text will appear (the popup may be disabled in the settings).
As with all OCR captures, you must manually select the language that you would like to OCR from the settings.
To change the OCR language, right-click the Capture2Text tray icon, select the OCR Language option and then select the desired language.
To quickly switch between 3 languages, use the OCR language quick access keys: Windows Key + 1, Windows Key + 2, and Windows Key + 3. The quick access languages may be specified in the settings.
When Chinese or Japanese is selected, you should specify the text direction (vertical/horizontal/auto) using the text direction hotkey: Windows Key + O. If auto is selected, horizontal will be used when the capture width is more than twice the height, otherwise vertical will be used. The text direction also affects how furigana is stripped from Japanese text.
(For Japanese) Capture2Text will attempt to automatically strip out furigana.
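The auto rule for text direction described above can be sketched in a few lines of Python. The function name `resolve_text_direction` and the lowercase setting strings are assumptions for illustration; only the "more than twice the height" rule comes from the description above.

```python
def resolve_text_direction(setting: str, width: int, height: int) -> str:
    """Return 'horizontal' or 'vertical' for a capture box of the given size."""
    if setting in ("horizontal", "vertical"):
        return setting
    if setting != "auto":
        raise ValueError(f"unknown text direction: {setting!r}")
    # Auto: horizontal only when the box is more than twice as wide as it is tall.
    return "horizontal" if width > 2 * height else "vertical"


print(resolve_text_direction("auto", 300, 100))
print(resolve_text_direction("auto", 200, 100))
```

Note that a box exactly twice as wide as it is tall is treated as vertical, since the rule requires the width to be more than twice the height.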
How to Perform a Text Line OCR Capture
Capture2Text can automatically capture the line of text that is closest to the mouse pointer.
Follow these steps to perform a Text Line OCR Capture:
- Position your mouse pointer on or near the line of text to capture.
- Press the Text Line OCR Capture hotkey (Windows Key + E).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
How to Perform a Forward Text Line OCR Capture
Capture2Text can automatically capture the line of text starting at the character that is closest to the mouse pointer and working forward.
Follow these steps to perform a Forward Text Line OCR Capture:
- Position your mouse pointer on or near the character to start at.
- Press the Forward Text Line OCR Capture hotkey (Windows Key + W).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
How to Perform a Bubble OCR Capture
Capture2Text can automatically capture text contained within a comic book speech/thought bubble as long as the bubble is completely enclosed.
Follow these steps to perform a Bubble OCR Capture:
- Position your mouse pointer in the empty part of the bubble (not on the text).
- Press the bubble OCR Capture hotkey (Windows Key + S).
- Capture2Text will outline the captured text and save the OCR result to the clipboard.
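How Capture2Text detects the bubble is not described here. One plausible way to see why the bubble must be completely enclosed is a flood fill started at the mouse pointer: the fill stays inside a closed outline but leaks out through any gap. The Python sketch below illustrates that idea only; the grid encoding ('#' for outline, '.' for empty space, 'T' for text) and the function name `bubble_bounds` are assumptions.

```python
from collections import deque


def bubble_bounds(grid, start):
    """Flood-fill '.' cells from `start`.

    Returns (top, left, bottom, right) of the filled region, or None if the
    start cell is not empty or the fill reaches the edge of the grid.
    """
    rows, cols = len(grid), len(grid[0])
    r0, c0 = start
    if grid[r0][c0] != ".":
        return None  # the pointer must sit on empty space, not on text
    seen = {start}
    queue = deque([start])
    top = bottom = r0
    left = right = c0
    while queue:
        r, c = queue.popleft()
        if r in (0, rows - 1) or c in (0, cols - 1):
            return None  # the fill leaked out: the bubble is not enclosed
        top, bottom = min(top, r), max(bottom, r)
        left, right = min(left, c), max(right, c)
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (nr, nc) not in seen and grid[nr][nc] == ".":
                seen.add((nr, nc))
                queue.append((nr, nc))
    return top, left, bottom, right


closed_bubble = [
    "#######",
    "#.....#",
    "#.TTT.#",
    "#.....#",
    "#######",
]
open_bubble = ["###.###"] + closed_bubble[1:]

print(bubble_bounds(closed_bubble, (1, 1)))
print(bubble_bounds(open_bubble, (1, 1)))
```

In the closed case the returned box surrounds the text cells, which is the region that would then be passed to OCR; in the open case the fill escapes through the gap in the top row and no box is returned.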
How to Specify the Active OCR Language
To specify the active OCR language, right-click the tray icon, click on OCR Language, and select an OCR language from the list.
Translation
To enable the translation feature, start by opening the settings dialog (right-click tray icon and select 'Settings..'), and clicking on the Translate tab.
Check the 'Append translation to clipboard' checkbox to append the translated text to the clipboard using the provided separator.
Check the 'Show translation in popup window' checkbox to display the translated text alongside the OCR text in the popup window.
Each installed OCR language may be translated to a different language.
Note 1: Some OCR languages do not have translation support. Unsupported languages will not be displayed.
Note 2: The translation feature requires Internet access.
Settings
Right-click the Capture2Text tray icon in the bottom-right of your screen and then select the 'Settings..' option to bring up the Settings dialog. You may hover over many of the option labels to display a helpful tooltip explaining the option.
The Hotkeys tab allows you to specify which key and modifiers to use for each hotkey. To disable a hotkey, select '<Unmapped>' from the drop-down list.
Current OCR language: Specify the active OCR language to use. You may also specify the active OCR language in the tray icon menu.
Quick-Access Languages: The languages to use for each of the quick-access language hotkeys.
Whitelist: Inform the OCR engine that the captured text will only contain the provided characters.
Blacklist: Inform the OCR engine that the captured text will never contain the provided characters.
Text Orientation: The orientation of the text that will be captured. This option is only used when Chinese or Japanese is set as the active OCR language. If Auto is selected, horizontal will be used when the capture width is more than twice the height, otherwise vertical will be used. The text direction also affects how furigana is stripped from Japanese text. You may also specify the text orientation in the tray icon menu or with the Text Orientation hotkey.
Tesseract Config File: An advanced feature that allows you to specify a Tesseract config file.
Trim Capture: During OCR preprocessing, trim captured image to foreground pixels and add a thin border. OCR accuracy will be more consistent and may even be improved.
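To make the Trim Capture step concrete, here is a minimal Python sketch on a tiny image stored as a list of rows. It is not Capture2Text's implementation: the function name `trim_to_foreground`, the 1-pixel border, and the rule that any non-background value counts as foreground are all assumptions.

```python
def trim_to_foreground(image, border=1, background=0):
    """Crop a 2D list of pixels to its foreground bounding box, then pad it.

    Any value other than `background` counts as foreground. The image is
    returned unchanged if it contains no foreground pixels.
    """
    rows = [r for r, row in enumerate(image)
            if any(p != background for p in row)]
    if not rows:
        return image
    cols = [c for c in range(len(image[0]))
            if any(row[c] != background for row in image)]
    top, bottom, left, right = rows[0], rows[-1], cols[0], cols[-1]
    cropped = [row[left:right + 1] for row in image[top:bottom + 1]]
    # Surround the cropped region with `border` background pixels on each side.
    width = (right - left + 1) + 2 * border
    side = [background] * border
    padding = [[background] * width for _ in range(border)]
    body = [side + list(row) + side for row in cropped]
    return padding + body + [list(row) for row in padding]


capture = [
    [0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in trim_to_foreground(capture):
    print(row)
```

Whatever the size of the original capture box, the OCR engine then sees the text with the same margin around it, which is one reason the results can become more consistent.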
Deskew Capture: During OCR preprocessing, attempt to compensate for slanted text found in an OCR capture.
Contains options for configuring the automatic captures. Hover over the option labels for more information.
Allows you to specify the colors of the OCR Capture Box. The transparency can be changed by adjusting the 'Alpha channel' value in the color selection dialog.
Allows you to specify the preview position, color, and font. You may disable the preview by unchecking the 'Show Preview Box' checkbox.
Save to clipboard: Save the captured OCR text to the clipboard.
Show popup window: Show the captured OCR text in a popup window.
Keep line breaks: Check this option if you don't want carriage returns and line feeds to be stripped from the captured text.
Logging: Allows you to save all captures to the specified file in the specified format. The following tokens may be used in the format: ${capture}, ${translation}, ${timestamp}, ${linebreak}, ${tab}. The default format is: '${capture}${linebreak}'.
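The token expansion for the logging format can be pictured with the following Python sketch. The token names come from the description above; the function name `render_log_entry`, the timestamp layout, and the use of "\n" for ${linebreak} are assumptions, and Capture2Text's actual output may differ.

```python
import re
from datetime import datetime

# The five tokens documented for the logging format.
TOKEN = re.compile(r"\$\{(capture|translation|timestamp|linebreak|tab)\}")


def render_log_entry(fmt, capture, translation="", timestamp=None):
    """Expand ${...} tokens in `fmt`; unknown tokens are left as-is."""
    if timestamp is None:
        # Assumed layout; the real timestamp format is not documented here.
        timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")
    values = {
        "capture": capture,
        "translation": translation,
        "timestamp": timestamp,
        "linebreak": "\n",
        "tab": "\t",
    }
    # A single pass over the format, so token-like text inside the captured
    # text itself is never expanded.
    return TOKEN.sub(lambda match: values[match.group(1)], fmt)


print(repr(render_log_entry("${capture}${linebreak}", "hello")))
```

With a format such as '${timestamp}${tab}${capture}${tab}${translation}${linebreak}', each capture becomes one tab-separated line, which is convenient to open later in a spreadsheet.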
Call Executable: An advanced feature that allows you to call an executable after OCR is complete. The following tokens may be used: ${capture}, ${translation}, ${timestamp}.
Allows you to perform text replacements. Supports regular expressions. The text on the left will be replaced with the text on the right. Different replacements may be specified for each OCR language.
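A per-language replacement table of this kind can be sketched in Python as follows. The rules shown are made-up examples, the names `REPLACEMENTS` and `apply_replacements` are hypothetical, and Capture2Text's own regular-expression dialect may differ from Python's `re` module.

```python
import re

# Hypothetical rules: for each OCR language, (pattern, replacement) pairs
# that are applied in order.
REPLACEMENTS = {
    "English": [
        (r"\bl\b", "I"),   # a stand-alone lowercase l is usually a misread I
        (r"\s{2,}", " "),  # collapse repeated whitespace
    ],
    "Japanese": [
        (r"\s+", ""),      # drop whitespace between characters
    ],
}


def apply_replacements(text, language, rules=REPLACEMENTS):
    """Apply the replacement rules for `language`; other languages pass through."""
    for pattern, replacement in rules.get(language, []):
        text = re.sub(pattern, replacement, text)
    return text


print(apply_replacements("l  think so", "English"))
```

Keeping the rules separate per language matters because a fix that helps one language (such as removing all spaces) would damage another.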
See the translation section.
This page allows you to enable the text-to-speech feature, set the volume, and select the options (voice, rate, pitch) to use for each OCR language.
Enable Text-to-speech: Enable text-to-speech when text is captured.
When this option is checked and the voice is not set to '<Disabled>', the 'Say' button will appear in the popup dialog.
Volume: Master volume of the text-to-speech feature. Applies to all languages.
OCR Language: Specify speech options for the selected OCR language.
- Rate: Rate of text-to-speech voice.
- Pitch: Pitch of text-to-speech voice.
- Voice: Voice to use for the text-to-speech feature. Set to '<Disabled>' to disable the text-to-speech feature for just the selected OCR language.
Preview: Preview the current rate, pitch, and voice.
Command Line Options
Troubleshooting & FAQ
- I'm getting a message about a missing DLL file when I double-click Capture2Text.exe. Solution: Install the Visual Studio 2015 redistributable.
- Capture2Text doesn't work at all. What can I do? Possible solutions:
  - Make sure that you have unzipped Capture2Text. Search Google if you do not know how to unzip a file.
  - Make sure that your anti-virus software is not blocking Capture2Text. Refer to the documentation that was bundled with your anti-virus software.
  - Make sure that you have downloaded the latest version from SourceForge.
  - Restart your computer.
  - Ask one of your grandchildren to help you :)
- I found a bug! Great! Create a ticket and describe the bug.
- I want to make a suggestion. Great! Create a ticket and describe your suggestion.
- Capture2Text is outputting garbage characters. Solution: Specify the correct OCR language.
- The language that I'm interested in doesn't appear in the OCR language menu. Read Installing Additional OCR Languages.
- I don't see the Capture2Text tray icon. Click the 'Show hidden icons' button (it looks like a triangle or a ^ character).
- I've clicked on the Capture2Text tray icon but it doesn't do anything. Right-click it instead.
- Capture2Text isn't working on my Mac. Capture2Text is Windows-only software. If you have a technical background, feel free to port it (but don't ask me to help).
- Where is the uninstaller? There isn't one. Capture2Text doesn't have an installer either. To remove Capture2Text from your computer, simply delete the Capture2Text directory.
- Where is the settings .ini file located? Type '%appdata%\Capture2Text' into Windows Explorer. You may delete it to restore default settings.
- How do I make Capture2Text portable? Call Capture2Text.exe using the --portable option. You may want to create a shortcut for this. Setting this option will make Capture2Text store the .ini settings file in the same directory as Capture2Text.exe (as opposed to '%appdata%\Capture2Text', which is the normal location).
- Where is the source code located? The source code is located on SourceForge.
Related Tools for Japanese Language Learners
- JGlossator (Windows): Automatically look up Japanese words that you have OCR'd with Capture2Text. Supports de-inflected expressions, readings, audio pronunciation, example sentences, pitch accent, word frequency, kanji information, and grammar analysis. Supports both EDICT and EPWING dictionaries.
- OCR Manga Reader (Android): Free and open source manga reader Android app that allows you to quickly OCR and look up Japanese words in real time. There are no ads and no mysterious network permissions. Supports both EDICT and EPWING dictionaries.