LEADTOOLS Support
Document
Document SDK Questions
languagecharacterplusfilter applied only to certain zones
This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Wednesday, July 18, 2007 11:32:21 PM(UTC)
Groups: Registered
Posts: 23
I need to apply a different LanguageCharacterPlusFilter to different zones on a page. For example, a person's name needs
ABCDEFGHIJKLMNOPQRSTUVWXYZ-,.
and phone numbers need
0123456789-()
I have others, but I don't see any zone-level property for setting the LanguageCharacterPlusFilter. It looks like the LanguageCharacterPlusFilter is global. Is that true? If there is a way to apply a custom filter to a zone, can you post some sample code in C#?
Thanks,
Dave
#2
Posted: Friday, July 20, 2007 10:21:33 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
The LanguageCharacterPlusFilter and similar filters are global. Each zone uses RasterDocumentZoneData.CharacterFilter to filter the text inside the zone, and one of the values of that enumeration is Plus, which uses the LanguageCharacterPlusFilter property.
You can use RasterDocumentCharacterFilters.Numbers to get a combination of digits and whatever is in your LanguageCharacterPlusFilter property.
Therefore, the best you can do is two custom types of filters on the same image. Otherwise you are limited to the other, non-custom values of the RasterDocumentCharacterFilters enumeration.
#3
Posted: Friday, July 20, 2007 2:17:25 PM(UTC)
Groups: Registered
Posts: 23
So to clarify:
I could use the LanguageCharacterPlusFilter property to add "-/:," and then use RasterDocumentCharacterFilters.Numbers on the zone to handle 01/25/1975 12:34?
Does Alpha cover upper case, lower case, and the Plus characters? So if I had a name like Doe, Mary-Jo Jr, would that work the same way?
It would be nice to bring this down to a zone-by-zone basis, since each zone holds a different type of data. Maybe even allow regular expressions. Any chance of this feature in a future version?
Dave
#4
Posted: Monday, July 23, 2007 6:47:54 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
Yes and Yes
I could submit a feature request, but your request at this point is a little bit vague. Could you explain in more detail what you are looking for?
#5
Posted
: Monday, July 23, 2007 11:28:51 AM(UTC)
Groups: Registered
Posts: 23
Greg,
The request would be to have character filters defined individually for each zone. So if I have 10 different zones on a page, each zone can have its own character filter. For example: zone1.characterFilter = "C0123456790"; zone2.characterFilter = "0123456789/:"; etc.
It would also be nice to take it a step further and be able to define a regular expression, e.g. (xxx)xxx-xxxx for a phone number, so that the ( doesn't OCR as / or I or something.
Thanks, I'm sure this would help out those of us processing forms quite a bit.
Dave
#6
Posted: Monday, January 5, 2009 6:02:36 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
I have submitted a feature request regarding this functionality. For your reference, the feature request number is 6027IDT. If chosen for implementation, most feature requests are added at major releases (16, 17 etc.).
If this is an urgent need, please contact our custom development department by visiting https://www.leadtools.com/devservices/ for a quote.
Edited by moderator Friday, August 9, 2019 10:01:37 AM(UTC) | Reason: Not specified
#7
Posted: Friday, February 13, 2009 7:47:55 AM(UTC)
Groups: Registered
Posts: 15
I second the motion for Regular Expressions!! I was going to post a message requesting them. They are absolutely invaluable in validating or invalidating strings based on known patterns of characters.
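To illustrate the kind of pattern check being asked for, here is a minimal sketch that validates already-recognized zone text with the standard .NET Regex class. It does not use any LEADTOOLS API and does not change how the OCR engine recognizes characters; it only accepts or rejects the text afterwards. The class name ZonePatternCheck and the phone pattern itself are assumptions made for the example.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: checks recognized zone text against an expected pattern.
public static class ZonePatternCheck
{
    // Assumed pattern for a phone number such as (123)456-7890,
    // with an optional single whitespace character after the closing parenthesis.
    private static readonly Regex PhonePattern =
        new Regex(@"^\(\d{3}\)\s?\d{3}-\d{4}$", RegexOptions.Compiled);

    // Returns true when the trimmed zone text matches the phone pattern.
    public static bool LooksLikePhone(string zoneText)
    {
        if (zoneText == null) return false;
        return PhonePattern.IsMatch(zoneText.Trim());
    }
}
```

With this sketch, ZonePatternCheck.LooksLikePhone("(123)456-7890") returns true, while a misread such as "/123)456-7890" returns false and can be flagged for review.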
#8
Posted: Monday, February 16, 2009 6:32:29 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
The developers investigating feature request 6027IDT have let me know that this cannot be added. However, you should be able to work around it by creating an application that does something like this:
1. Add zones
2. Recognize page
3. Get all recognized characters.
4. Apply your custom filter: check each recognized character and replace any character that does not match the required set with a space or NULL.
5. Set filtered recognized characters.
6. Save recognition results.
For an example of how to get and set recognized characters, take a look at the .NET documentation for the IOcrPage.GetRecognizedCharacters function.
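Step 4 above can be sketched in plain C# as follows. This is only an illustration under stated assumptions: RecognizedChar is a hypothetical stand-in for the engine's recognized-character type (the real type carries more information, such as position flags), and ZoneCharacterFilter is a hypothetical helper name. Getting the characters from the page and setting them back (steps 3 and 5) are not shown.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one recognized character and the zone it belongs to.
public struct RecognizedChar
{
    public int ZoneIndex;
    public char Code;

    public RecognizedChar(int zoneIndex, char code)
    {
        ZoneIndex = zoneIndex;
        Code = code;
    }
}

// Hypothetical helper that applies a per-zone allowed-character list.
public static class ZoneCharacterFilter
{
    // Replaces every character that is not allowed for its zone with a space.
    // Spaces are always kept, and zones without an entry are left untouched.
    public static List<RecognizedChar> Apply(
        IEnumerable<RecognizedChar> chars,
        IDictionary<int, string> allowedByZone)
    {
        var result = new List<RecognizedChar>();
        foreach (RecognizedChar c in chars)
        {
            string allowed;
            bool keep = !allowedByZone.TryGetValue(c.ZoneIndex, out allowed)
                        || c.Code == ' '
                        || allowed.IndexOf(c.Code) >= 0;
            result.Add(keep ? c : new RecognizedChar(c.ZoneIndex, ' '));
        }
        return result;
    }
}
```

For example, with zone 0 mapped to "0123456789-()", a recognized sequence ( 1 2 I ) in that zone comes back as ( 1 2 space ), because I is not in the allowed set.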
I plan on making a small demo showing how to do this, but it will likely be a few more weeks. I'll update this forum post when I have created the demo application.
As for a regular expression for something like a phone number or an SSN, you should be able to use just the Digit filter and then add the dashes, spaces, parentheses, etc. as needed. For example, if an image contains (123) 456-7890 and your zone uses the Digit character filter, it should return 1234567890, which you can then parse into the format you'd like.
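The reformatting step could look something like the sketch below. It assumes the zone text has already come back as a plain string of digits; PhoneFormatter is a hypothetical helper name, and the ten-digit (xxx) xxx-xxxx layout is just the example format from this thread.

```csharp
using System;

// Hypothetical helper: rebuilds a formatted phone number from digit-only OCR output.
public static class PhoneFormatter
{
    // Formats exactly ten digits as (123) 456-7890; returns null for any other input.
    public static string Format(string digits)
    {
        if (digits == null || digits.Length != 10) return null;
        foreach (char d in digits)
        {
            if (d < '0' || d > '9') return null;
        }
        return string.Format("({0}) {1}-{2}",
            digits.Substring(0, 3),
            digits.Substring(3, 3),
            digits.Substring(6, 4));
    }
}
```

PhoneFormatter.Format("1234567890") returns "(123) 456-7890". Returning null for anything that is not exactly ten digits makes it easy to flag zones where a character was dropped or misread.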
#9
Posted: Monday, June 29, 2009 10:24:00 AM(UTC)
Groups: Registered, Tech Support, Administrators
Posts: 764
Attached is a small C# 2005 example using LEADTOOLS 16.5 that shows how to use your own custom filter by editing the recognized characters.
Keep in mind that this is a very simplistic demo intended as a proof of concept. You will need to bulletproof it a bit more, especially with regard to setting the Position property with the correct end of zone, end of word, etc.