This topic and its replies were posted before the current version of LEADTOOLS was released and may no longer be applicable.
#1
Posted: Monday, February 5, 2007 4:47:34 PM(UTC)
Groups: Registered
Posts: 5
1. When I export to a formatted text file using RasterDocumentFormatType.RecAsciiFormatted, the output sometimes loses all formatting, such as newlines and spaces, and comes out as one very long line.
2. With all the other text file formats, two lines are sometimes joined together because the newline character is missing.
3. The Auto Orient function is not as good as the one in Microsoft Office Document Imaging.
I hope V16 will fix these problems.
#2
Posted: Tuesday, February 6, 2007 5:33:25 AM(UTC)
Groups: Guests
Posts: 3,022
Was thanked: 2 time(s) in 2 post(s)
Do all of these issues appear with a specific image or with all images? Can you send me a sample image and explain which of our demos you used (if any) and the exact steps you followed?
Also, please tell me which parts of the result were incorrect.
#3
Posted: Tuesday, February 6, 2007 1:08:41 PM(UTC)
Groups: Registered
Posts: 5
Thanks for your reply. It seems that only some images have these problems.
I have attached a simple image (.tif). I simply drew a zone on the page, recognized it, and exported it to a formatted text file. The output lost all its formatting.
Please rename the attachment from 2_Page14.txt to 2_Page14.tif.
#4
Posted: Tuesday, February 6, 2007 1:22:40 PM(UTC)
Groups: Registered
Posts: 5
Please refer to the attachment for the result.
#5
Posted: Wednesday, February 7, 2007 4:02:24 AM(UTC)
Groups: Guests
Posts: 3,022
Was thanked: 2 time(s) in 2 post(s)
Unless you save in PDF or some other format that keeps rich text formatting, the table information will be lost.
As for line separators, the saved TXT file uses Line Feed (LF) characters to separate lines. Some text editors do not recognize a bare LF as a new line and expect Carriage Return (CR) characters or CR-LF pairs instead. So simply convert these LF characters to CR-LF pairs, as in the attached file.
One way to do that is to open the text file in Notepad, select all the text, cut it, paste it into MS Word, and save it as text.
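The same LF to CR-LF conversion can also be scripted. The following is only a minimal sketch in Python, not part of LEADTOOLS: the function names `lf_to_crlf` and `convert_file` are made up for illustration, and it assumes the exported file uses a single-byte or UTF-8 encoding (it would not be correct for UTF-16 text).

```python
def lf_to_crlf(data: bytes) -> bytes:
    """Return data with every line ending written as a CR-LF pair."""
    # Collapse any CR-LF pairs that are already present to a bare LF first,
    # so the second step does not turn them into CR-CR-LF.
    normalized = data.replace(b"\r\n", b"\n")
    # Now every line break is a bare LF; expand each one to CR-LF.
    return normalized.replace(b"\n", b"\r\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a CR-LF version to dst_path.

    Assumption: the file is single-byte or UTF-8 encoded text.
    """
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(lf_to_crlf(data))
```

Working on raw bytes avoids any automatic newline translation by the runtime, so the result is the same on every platform. Calling `convert_file` with the exported text file as the source and a new file name as the destination should produce a copy that opens with proper line breaks in editors that expect CR-LF.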
#6
Posted: Wednesday, February 7, 2007 12:23:39 PM(UTC)
Groups: Registered
Posts: 5
Thanks a lot.
By the way, I have another problem: if the source file is a PDF instead of a TIF, the recognition result is very poor.
#7
Posted
: Thursday, February 8, 2007 5:52:47 AM(UTC)
Groups: Guests
Posts: 3,022
Was thanked: 2 time(s) in 2 post(s)
The default DPI used by our toolkit is low, so your PDF documents are opened at a fairly low resolution. However, you can adjust the PDF loading options programmatically before loading the PDF.
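To see why the load resolution matters for recognition, here is a rough Python sketch of the arithmetic. It does not use the LEADTOOLS API, the helper names are hypothetical, and the DPI values are illustrative only, since the thread does not state the toolkit's actual default. It relies only on the fact that PDF page sizes are measured in points, with 72 points per inch.

```python
POINTS_PER_INCH = 72  # PDF page and font sizes are expressed in points


def rasterized_size(width_pt: float, height_pt: float, dpi: int) -> tuple:
    """Pixel dimensions of a PDF page rendered at the given DPI."""
    return (
        round(width_pt / POINTS_PER_INCH * dpi),
        round(height_pt / POINTS_PER_INCH * dpi),
    )


def glyph_height_px(font_size_pt: float, dpi: int) -> float:
    """Approximate pixel height of text set at font_size_pt."""
    return font_size_pt / POINTS_PER_INCH * dpi


# US Letter is 612 x 792 points. The DPI values are examples, not defaults.
for dpi in (150, 300):
    size = rasterized_size(612, 792, dpi)
    print(dpi, size, round(glyph_height_px(10, dpi), 1))
```

Doubling the resolution doubles the pixel height of each character, which gives the OCR engine far more detail to work with. This is why raising the PDF load resolution before recognition tends to help.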