search in pdf

aras
Registered: Oct 24 2011
Posts: 1

Hi all,
 
My client requires PDF search (for non-English content) in the search module of his e-learning website. When I extract the contents of a PDF for indexing, some of the characters are dropped during extraction: when I view the indexed contents in Luke, there are empty spaces where those characters should be. I am seeing this problem with languages such as Tamil and Hindi.
 
The client is adamant that he wants PDF search.
 
What is the solution for this? Please point me in the right direction or give me some guidelines.
 
Thanks and Regards,
aras

UVSAR
Expert
Registered: Oct 29 2008
Posts: 1357
You need to explain your workflow. Are you exporting the PDF files to text or RTF using Acrobat, or trying to read the PDF files using some third-party tool? We don't provide any support on these forums for third-party software.

Non-Western text in a PDF relies on intact Unicode mapping, so unless you view the exported text in the same working space and have fonts installed that support all the glyphs, characters will be missing or corrupted. There can also be cases where the PDF has incomplete Unicode maps, so text that is visible on the page cannot be exported. You can test for this by searching for the words within Acrobat itself.
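
If the Acrobat search test is inconclusive, one rough way to see whether the exported text itself is damaged is to count suspicious characters in it. The Python sketch below is only an illustration: it assumes you already have the exported text as a string (however it was produced), and the function name `audit_extracted_text` and the report keys are made up for this example. It counts Unicode replacement characters and private-use code points, which often show up when a Unicode map is missing or incomplete, and counts how many characters actually fall in the Tamil and Devanagari (Hindi) blocks.

```python
import unicodedata

# Unicode block ranges for the scripts mentioned in this thread.
SCRIPT_RANGES = {
    "devanagari": (0x0900, 0x097F),  # used for Hindi
    "tamil": (0x0B80, 0x0BFF),
}


def audit_extracted_text(text):
    """Count characters that hint at a broken or missing Unicode mapping.

    Hypothetical helper: the name and the report keys are illustrative only.
    """
    report = {"total": len(text), "replacement": 0, "private_use": 0}
    for name in SCRIPT_RANGES:
        report[name] = 0

    for ch in text:
        if ch == "\ufffd":
            # U+FFFD marks a character that could not be decoded or mapped.
            report["replacement"] += 1
        elif unicodedata.category(ch) == "Co":
            # Private-use code points carry no standard meaning, so an
            # indexer cannot match them against a typed query.
            report["private_use"] += 1

        code_point = ord(ch)
        for name, (low, high) in SCRIPT_RANGES.items():
            if low <= code_point <= high:
                report[name] += 1
    return report


if __name__ == "__main__":
    # Five Tamil characters, one replacement character, one private-use
    # character, and five Devanagari characters, written as escapes.
    sample = ("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd \ufffd\ue000 "
              "\u0939\u093f\u0902\u0926\u0940")
    print(audit_extracted_text(sample))
```

A high count of replacement or private-use characters, or a Tamil/Devanagari count far below what the page visibly contains, would point to the PDF's Unicode maps rather than to the indexing step. This check cannot recover the missing text; it only helps locate where the loss happens.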