SOLIDWORKS: Inspection - Creating Custom OCR Dictionary
SOLIDWORKS Inspection is a tool for speeding up the inspection process. It comes in two versions: a standalone application and an add-in that runs inside SOLIDWORKS. The standalone version can open drawings in PDF or TIFF format. With SOLIDWORKS Inspection, you can balloon characteristics such as notes and dimensions, and the program attempts to read their text using Optical Character Recognition (OCR).
The OCR functionality works well for the most part. Unfortunately, you may run into cases where the text is not identified correctly, for example when the PDF file uses a non-standard font. If this is a small inspection project, you may elect to type in the correct text for each ballooned item manually. The other option is to create a new “OCR custom dictionary” containing the characters of this non-standard font.
For this simple example, I’ll attempt to use the OCR to recognize the highlighted note and dimension. This drawing contains a non-standard font that is going to be an issue for the Optical Character Recognition.
The dimension is recognized quite well. The note, however, is only partially recognized. In some cases, the easiest way to correct the issue is by typing in the actual text value. For this example, I will create a custom dictionary and retry the OCR.
To start this process, click “Auto Extract” in the OCR editor.
You will need to highlight individual letters and numbers on your drawing and adjust each value to match the actual character. The dictionary saves this association between the image of the character and its actual value. It is a good idea to repeat this procedure with multiple instances of the same character, so the dictionary has more than one sample to match against.
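To make the idea of an image-to-value association more concrete, here is a minimal Python sketch. It is not how SOLIDWORKS Inspection stores or matches its dictionary, which this article does not cover. The `GlyphDictionary` class, the tiny 3x5 bitmaps, and the pixel-difference matching are all assumptions chosen for illustration only.

```python
from collections import defaultdict


def hamming(a, b):
    """Count the pixels that differ between two equal-sized bitmaps."""
    return sum(pa != pb for row_a, row_b in zip(a, b) for pa, pb in zip(row_a, row_b))


class GlyphDictionary:
    """Hypothetical stand-in for a custom OCR dictionary."""

    def __init__(self):
        # value -> list of sample bitmaps; several samples per character are allowed
        self.samples = defaultdict(list)

    def add(self, bitmap, value):
        """Store the association between a character image and its actual value."""
        self.samples[value].append(bitmap)

    def recognize(self, bitmap):
        """Return the value of the closest stored sample, or None if the dictionary is empty."""
        best_value, best_dist = None, None
        for value, bitmaps in self.samples.items():
            for sample in bitmaps:
                dist = hamming(bitmap, sample)
                if best_dist is None or dist < best_dist:
                    best_value, best_dist = value, dist
        return best_value


# Toy 3x5 glyphs ('#' = ink, '.' = blank); real character images are far larger.
S_SAMPLE = (".##", "#..", ".#.", "..#", "##.")
FIVE_SAMPLE = ("###", "#..", "##.", "..#", "##.")
NOISY_S = (".##", "#..", ".#.", "..#", "###")  # an "S" with one stray pixel

dictionary = GlyphDictionary()
dictionary.add(S_SAMPLE, "S")
dictionary.add(FIVE_SAMPLE, "5")

print(dictionary.recognize(NOISY_S))  # prints: S
```

In this toy setup the "S" and "5" samples differ by only two pixels, which hints at why similar-looking characters get confused and why adding several samples of the same character can help.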
To show this process, I am going to take advantage of a PDF file that contains all of the required characters. The Auto Extract recognizes individual characters and attempts to match the correct letters to them. You need to go through each character and input the correct value.
When this process is completed, you need to save this custom dictionary. In the SOLIDWORKS Inspection options screen you can specify the new custom dictionary.
After this custom dictionary is selected, OCR recognizes this same note and dimension with much better results.
You will still need to double-check the notes to verify their accuracy. In the “NO SHARP EDGES” note, the final letter “S” was identified as the number “5”. In this case, I would simply edit the value to correct the issue.
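If you export the recognized notes and want help spotting this kind of letter/digit mix-up, a small script can flag candidates for manual review. The Python sketch below is an assumption-laden illustration, not a SOLIDWORKS Inspection feature: the `CONFUSABLE` table and the `flag_suspect_tokens` function are hypothetical names, and the rule (a word containing both letters and a look-alike digit) is a simple heuristic.

```python
# Hypothetical table of digits that OCR commonly confuses with letters.
CONFUSABLE = {"5": "S", "0": "O", "1": "I", "8": "B"}


def flag_suspect_tokens(text):
    """Return (token, suggestion) pairs for words mixing letters with look-alike digits."""
    suspects = []
    for token in text.split():
        has_letter = any(c.isalpha() for c in token)
        has_confusable = any(c in CONFUSABLE for c in token)
        if has_letter and has_confusable:
            # Swap each look-alike digit for its letter to propose a correction.
            suggestion = "".join(CONFUSABLE.get(c, c) for c in token)
            suspects.append((token, suggestion))
    return suspects


print(flag_suspect_tokens("NO SHARP EDGE5"))  # prints: [('EDGE5', 'EDGES')]
```

The suggestions are only prompts for review. Legitimate callouts that mix letters and digits, such as a thread size like "M8", would also be flagged, so the final decision stays with the person checking the drawing.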
Sr. Application Engineer
Computer Aided Technology