Any recent updates on how Google indexes PDF files?

redcrew
Registered: Nov 7 2006
Posts: 83
Answered

I've read the [url=http://www.acrobatusers.com/articles/2006/02/pdf_for_google/index.php]Make your PDFs work well with Google[/url] article, which was written in February 2006.

Is there an update on how Google now (May 2008) indexes PDF files? Is Google indexing PDF version 1.6 files?

My Product Information:
Acrobat Pro 8.1.2, Windows
redcrew
Registered: Nov 7 2006
Posts: 83
bump
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Google continues to use the open source "XPDF" library to convert PDF->text (or HTML) and then indexes the result like any other content it processes. At this point, there are no updates or changes to that process.
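
For anyone who wants to see roughly what a search engine would get out of a file, here is a minimal sketch of that conversion step using the `pdftotext` command-line tool that ships with Xpdf. It assumes `pdftotext` is installed and on your PATH; the function names are made up for illustration, and how Google actually invokes Xpdf is not something this thread documents.

```python
import shutil
import subprocess
from typing import List, Optional


def build_pdftotext_command(pdf_path: str) -> List[str]:
    """Build a pdftotext command line (hypothetical helper name)."""
    # "-enc UTF-8" selects the output encoding; a text-file argument of "-"
    # sends the extracted text to stdout instead of writing a file.
    return ["pdftotext", "-enc", "UTF-8", pdf_path, "-"]


def extract_text(pdf_path: str) -> Optional[str]:
    """Return the text pdftotext extracts, or None if the tool is missing."""
    if shutil.which("pdftotext") is None:
        return None  # Assumption: Xpdf's tools may not be installed locally.
    result = subprocess.run(
        build_pdftotext_command(pdf_path),
        capture_output=True,
        check=True,  # Raise if pdftotext reports a failure.
    )
    return result.stdout.decode("utf-8", errors="replace")
```

If `extract_text("yourfile.pdf")` comes back empty or nearly empty, the file probably has little real text for an indexer to work with (a scanned, image-only PDF is the usual cause).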

Lori Kassuba is an AUC Expert and Community Manager for AcrobatUsers.com.

redcrew
Registered: Nov 7 2006
Posts: 83
Lori,

Thanks. Does that mean Google is indexing 1.6 files?
redcrew
Registered: Nov 7 2006
Posts: 83
Sorry for the *bump*, but it's still not clear to me whether Google is indexing 1.6 files. I'm not sure what the comment that Google is using the 'open source "XPDF" library to convert PDF->text (or HTML)' means. I'm still very much in the learning stages with some of the terms used to describe PDF files.
daka630
Expert
Registered: Mar 1 2007
Posts: 1420
redcrew,
Xpdf 3.02pl2, available since 2007.11.07, supports PDF versions 1.6 and 1.7.
[url=http://www.foolabs.com/xpdf/]Homepage[/url]
This is speculation on my part, but I'd say that Google is using the most current Xpdf release. If so, then yes, extraction of text from PDF version 1.6 files would be supported.
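
If you are not sure which PDF version your own files declare, a rough sketch like the following reads it from the file header, which normally looks like `%PDF-1.6`. The function names are hypothetical, and this only checks the header: since PDF 1.4 a file can also carry a `/Version` entry in its document catalog that overrides the header, which this sketch does not look at.

```python
import re
from typing import Optional

# The header normally sits at the very start of the file, but some files have
# a few stray bytes before it, so search the first 1024 bytes.
_HEADER = re.compile(rb"%PDF-(\d)\.(\d)")


def pdf_header_version(data: bytes) -> Optional[str]:
    """Return the version declared in the header, e.g. "1.6", or None."""
    match = _HEADER.search(data[:1024])
    if match is None:
        return None
    major = match.group(1).decode("ascii")
    minor = match.group(2).decode("ascii")
    return f"{major}.{minor}"


def pdf_file_version(path: str) -> Optional[str]:
    """Read just the start of a file on disk and report its header version."""
    with open(path, "rb") as handle:
        return pdf_header_version(handle.read(1024))
```

Calling `pdf_file_version("yourfile.pdf")` on a file saved as Acrobat 7 compatible should report 1.6, and Acrobat 8 compatible should report 1.7, assuming the file was not saved with a different compatibility setting.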

Be well...

redcrew
Registered: Nov 7 2006
Posts: 83
Thanks - can you point to the article/post that discusses Google's indexing of PDF files? I keep searching for information, but haven't found any specific post from Google or experts. Perhaps I missed it in my search.
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Did you have the opportunity to check out the on-demand version of [url=http://adobechats.adobe.acrobat.com/p38642918/]PDF for the Web & SEO[/url]? The presentation details how to optimize your files for searching on the web (including metadata, scanned files, etc.).

Lori Kassuba is an AUC Expert and Community Manager for AcrobatUsers.com.