Combining 2 PDFs into SINGLE page PDF

lwallner
Registered: Oct 12 2009
Posts: 4

I need to combine elements of two different PDFs into one, single page PDF.

One PDF is a template of sorts with headers to be edited; the other PDF is a non-OCR scan that includes text and images.

I've found no way to copy and paste onto a single page b/t two PDFs. I've tried pasting into an inserted page and then dragging it up, or cutting and pasting within the document (with the intention of deleting the unneeded second page), with no luck.

This seems like it should be such a no brainer. Open 1st PDF, Open 2nd PDF, make selection, Ctrl C, go to 1st PDF, place cursor where desired, Ctrl V. But it doesn't work at all in an intuitive way like any other document program where you can cut and paste with ease b/t documents and even b/t programs.

So instead, I've copied the part of the image PDF I want into the original Word doc of the template, then created a PDF from that. However, when I try to tell Acrobat to OCR the entire document, it tells me it can’t b/c the page contains renderable text (the header information, I’m assuming). If I run the OCR on a PDF containing just the image, it does it no problem.

What is the best way to bring these two documents together in a way that results in a single-page, completely searchable PDF?

My Product Information:
Acrobat Pro 9.1.3, Windows
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
Acrobat may not be the best place to combine content. However, you may be able to do this using a two step process. First, use the Snapshot tool (Tools > Select & Zoom > Snapshot Tool) to copy the area you want to the clipboard. Then select Create PDF From Clipboard from under the File menu. In the new PDF use the TouchUp Object tool to select and copy the image. Now you can paste this image anywhere in the original PDF.

Lori Kassuba is an AUC Expert and Community Manager for AcrobatUsers.com.

thomp
Expert
Registered: Feb 15 2006
Posts: 4411
There is another way to do this if the content on the two documents lines up. Use the "Document > Watermark > Add..." menu item. This item can be used to overlay pages from two PDF files.

The TouchUp Object tool that Lori mentioned can also be used for the entire process, copying from one PDF and pasting into another. However, what she says is true: Acrobat is not a content creation tool. It is a document finishing tool. The TouchUp tools are intended for minor changes before a document is distributed. All changes should be done in the original document and then converted to PDF. Overuse of the TouchUp tools can corrupt the PDF, so be careful.

As for OCR, you should be able to select and OCR just a portion of the PDF page.
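
For anyone who would rather script the overlay than click through the Watermark dialog, here is a minimal sketch of the same idea outside Acrobat. It assumes the third-party Python library pypdf is installed; the function names and file names are made up for illustration, and this is not how Acrobat itself does it. It only stamps page 1 of one file onto page 1 of the other, and it only makes sense when the two pages are the same size and the content already lines up.

```python
def same_page_size(size_a, size_b, tol=1.0):
    """Return True if two (width, height) sizes in points match within tol."""
    return (abs(size_a[0] - size_b[0]) <= tol
            and abs(size_a[1] - size_b[1]) <= tol)


def overlay_first_pages(template_pdf, clipping_pdf, out_pdf):
    """Draw page 1 of clipping_pdf over page 1 of template_pdf (hypothetical helper)."""
    # Assumption: pypdf is available. It is imported here, not at the top,
    # so the size check above can be used without it.
    from pypdf import PdfReader, PdfWriter

    base = PdfReader(str(template_pdf)).pages[0]
    clip = PdfReader(str(clipping_pdf)).pages[0]

    base_size = (float(base.mediabox.width), float(base.mediabox.height))
    clip_size = (float(clip.mediabox.width), float(clip.mediabox.height))
    if not same_page_size(base_size, clip_size):
        raise ValueError("page sizes differ, so the content will not line up")

    # merge_page draws the other page's content on top of this page.
    base.merge_page(clip)

    writer = PdfWriter()
    writer.add_page(base)
    with open(out_pdf, "wb") as fh:
        writer.write(fh)


# Example call with placeholder file names:
# overlay_first_pages("template.pdf", "clipping.pdf", "combined.pdf")
```

Keep in mind that a full-page scan is an opaque image, so stamped on top it will hide whatever sits beneath it. In that case it may work better to swap the arguments so the template is drawn over the scan.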

Thom Parker
The source for PDF Scripting Info
http://www.pdfScripting.com

The Acrobat JavaScript Reference, Use it Early and Often
http://www.adobe.com/devnet/acrobat/javascript.php

The most important JavaScript Development tool in Acrobat
The Console Window (video tutorial): http://www.pdfscripting.com/public/34.cfm#JSIntro
The Console Window (article): http://www.acrobatusers.com/tutorials/2006/javascript_console


lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
thomp wrote:
There is another way to do this if the content on the two documents lines up. Use the "Document > Watermark > Add..." menu item. This item can be used to overlay pages from two PDF files.
Great point Thom -- I forgot about adding the page using a Watermark.


daka630
Expert
Registered: Mar 1 2007
Posts: 1420
thomp wrote:
As for OCR, you should be able to select and OCR just a portion of the PDF page.
Thom, thank you for "select"...
It got me playing & I've come across the OCR entry in the context menu associated with the Select text and images tool.
I've always just gone to the command menu to OCR individual pages or PDFs.

However, it appears that the Recognize Text Using OCR entry is only present if the entire page content is a scanned image.
When the page has a mix of renderable text and an image of text the OCR entry is not present in the context menu.

In the past, whenever I attempted to OCR a PDF page containing any renderable text, Acrobat's OCR engine would not process the page.
In a multi-page PDF it would process the pages before or after, but not the pages with any renderable text.

Interestingly, while playing with this, I am able to have OCR performed on a PDF page with an image of text and renderable text in the body of the page and in the header/footer regions.
If I have renderable text "behind" the image then OCR is a no-go.
This was with Acrobat 8 and Acrobat 9.


During play, I did not change font color to white to "hide" the text behind the image of text. I have come across this.
fwiw - Examine Document and preview of the Hidden Text feature will show the presence of such.

For a PDF page with only an image of text, a click on the page with the Select text and images tool highlights a block of the image.
Using Recognize Text Using OCR from the context menu brings up the Recognize Text dialog which only permits OCR of all pages, current page, or from page _ to _ ; so, at a minimum, a full page is OCRd rather than the highlighted portion of the image.


Again, thanks for sparking a "look-see" opportunity. 8^)

Be well...


lwallner
Registered: Oct 12 2009
Posts: 4
Thanks for your input.
It may help if I explain exactly what I’m doing and why I want to use Acrobat to do this much (it’s really not that much) editing.

We get press clippings sent to us electronically from a media reading agency. They are images and text cut out of a magazine, pasted on a letter sheet of paper with some information about the origin and scanned as a non-ocr PDF.

Because we add to and edit the basic information on the page, we print out the PDF, put the information into a Word template, print out the template page, manually cut out the scanned image from the PDF, and paste it onto the template paper. THEN, in order to share the clippings across our organization, we rescan the images to PDF, making them OCR so that the resulting multipage document is completely searchable (headers and the text in the magazine cutouts).

Basically we are going PDF > hard copy > back to PDF, which is nonsensical. We get about a hundred clippings per month, so it is a lot of wasted paper and manual labor. The header template has 5 short fields that need to be updated for each clip.

The watermark approach doesn’t look like it will make sense b/c it requires an extra step: I would have to crop each original PDF down to just the magazine cutout and save it as another PDF so I can browse to that file to insert as a watermark. With Lori’s snapshot step I have to create a new PDF from the snapshot, but I don’t have to save it and can just close it once I’ve copied it into the template.

Lori’s method works great all the way up to OCR. Even if I try selecting just the copied text and image, it tells me it can’t run OCR b/c the page contains renderable text. Maybe I’m not using the right tool to select that part? Can you walk me through? (Danka’s explanation is a little beyond me. I’m a new user who only bought the program for this process, and I need step 1 > menu A > selection Y directions.)

The only workaround I’ve found is to get everything into the PDF using Lori’s method, save it as a JPEG, and reopen it in Acrobat; then it will OCR the entire thing for me to save as the final PDF. It’s a lot of steps and I’d like to cut them down if I could. I’m trying to make this LESS work and time consuming than it is currently.
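
The save-as-JPEG step works because it flattens the whole page into an image, so there is no renderable text left to block OCR. As a rough illustration of the same idea outside Acrobat, here is a minimal sketch that assumes the open-source command-line tool OCRmyPDF is installed. This is a different tool from Acrobat, and the helper names and file names are made up. Its `--force-ocr` option rasterizes every page before running OCR, which is roughly what the JPEG round trip does by hand.

```python
import shutil
import subprocess


def build_ocr_command(src_pdf, dst_pdf, force=True):
    """Build the argument list for an ocrmypdf run (hypothetical helper)."""
    cmd = ["ocrmypdf"]
    if force:
        # --force-ocr rasterizes each page first, so pages that already
        # contain renderable text (such as typed headers) are not rejected.
        cmd.append("--force-ocr")
    cmd += [str(src_pdf), str(dst_pdf)]
    return cmd


def run_ocr(src_pdf, dst_pdf):
    """Run OCR on src_pdf and write a searchable copy to dst_pdf."""
    # Assumption: the ocrmypdf executable is on PATH.
    if shutil.which("ocrmypdf") is None:
        raise RuntimeError("ocrmypdf is not installed or not on PATH")
    subprocess.run(build_ocr_command(src_pdf, dst_pdf), check=True)


# Example call with placeholder file names:
# run_ocr("combined.pdf", "combined_searchable.pdf")
```

As with the JPEG workaround, the header text ends up as part of the image and is recognized by OCR rather than kept as the original typed text.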
lkassuba
ExpertTeam
Registered: Jun 28 2007
Posts: 3636
lwallner wrote:
Because we add to and edit the basic information on the page, we print out the PDF, put the information into a Word template, print out the template page, manually cut out the scanned image from the PDF, and paste it onto the template paper. THEN, in order to share the clippings across our organization, we rescan the images to PDF, making them OCR so that the resulting multipage document is completely searchable (headers and the text in the magazine cutouts).
Have you tried exporting the images (perhaps as JPEG) from the original PDF and then importing them into Word? Can they be OCR'd in Word? Then perhaps convert to PDF using the PDFMaker macro in Word.


lwallner
Registered: Oct 12 2009
Posts: 4
There isn't an OCR function in Word that I can find. You have to use MS Office Document Imaging to run the OCR, and when you try to cut/paste from there it brings only the OCR text, not the image...
So right now the best possible workflow I've created is this:
Convert the current Word template to PDF and use the PDF template as my starting point for every clip going forward, so I won't need to use Word any more.
Open the PDF template, open the clipping PDF.
Use the TouchUp Text tool to edit the PDF template headers with info from the clipping PDF.
Use the Snapshot tool to copy the image portion of the clipping PDF.
Create a new PDF from the clipboard.
Ctrl A, Ctrl C from the new PDF.
Ctrl V into the template PDF.
Save the template PDF as a JPEG.
Open the JPEG in Acrobat.
Perform OCR on the entire document.
Save back to a PDF as final.

If anyone can improve on that, I'd love to know how!
Lisa
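
For a hundred clips a month, the snapshot-and-paste part of the steps above is mostly a placement problem: take a rectangle from the scan and fit it under the header on the template page. Here is a minimal sketch of just that arithmetic in plain Python. The function name, the body-area rectangle, and the sample numbers are assumptions for illustration, apart from the US Letter page size of 612 x 792 points. It never scales the clipping up, since enlarging a scan would only blur it.

```python
def place_region(region, box):
    """Fit a source rectangle into a target box.

    Both rectangles are (x0, y0, x1, y1) in PDF points, origin at lower left.
    Returns (scale, tx, ty) such that a source point (x, y) lands at
    (scale * x + tx, scale * y + ty): centered horizontally in the box and
    aligned to the top of the box.
    """
    rx0, ry0, rx1, ry1 = region
    bx0, by0, bx1, by1 = box
    region_w, region_h = rx1 - rx0, ry1 - ry0
    box_w, box_h = bx1 - bx0, by1 - by0
    if region_w <= 0 or region_h <= 0 or box_w <= 0 or box_h <= 0:
        raise ValueError("rectangles must have positive width and height")

    # Shrink to fit if needed, but never enlarge the scan.
    scale = min(box_w / region_w, box_h / region_h, 1.0)

    target_x = bx0 + (box_w - region_w * scale) / 2  # centered left to right
    target_y = by1 - region_h * scale                # flush with the box top
    return scale, target_x - scale * rx0, target_y - scale * ry0


# Hypothetical layout: Letter page (612 x 792 pt) with a half-inch margin
# and the header taking up the space above y = 650.
BODY_BOX = (36, 36, 576, 650)

# Hypothetical clipping rectangle selected on the scanned page.
scale, tx, ty = place_region((100, 200, 500, 500), BODY_BOX)
print(scale, tx, ty)
```

The three numbers are the kind of scale-and-offset values a PDF library's placement transform expects, so the same calculation could feed a scripted version of the copy and paste steps.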