PDF documents can be difficult to index for full-text search because of how the format stores text. Without additional processing, Full-Text Search attempts to extract the text layer from the document and uses it as the rendition for full-text indexing. However, the text layer cannot always be extracted successfully; a PDF that consists only of scanned page images, for example, has no text layer to extract. For this reason, PDF documents can optionally be processed as image documents.
PDF documents processed as images undergo an additional OCR process that creates a text rendition of the document for full-text indexing. This allows most PDFs to be indexed successfully, but indexing is typically slower because of the extra OCR step.
PDF documents processed as images are displayed in a paginated format, like other image files. PDF documents not processed as images are displayed as HTML output when viewed from a search results list.
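The difference between the two modes can be summarized in a short sketch. This is an illustrative Python model only, not Full-Text Search code: the names `index_pdf`, `PdfIndexResult`, `process_as_images`, `extract_text_layer`, and `run_ocr` are hypothetical, and the extractor and OCR engine are passed in as stand-in functions.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PdfIndexResult:
    text: str         # text rendition handed to the full-text indexer
    text_source: str  # "ocr" or "text_layer"
    display: str      # "paginated" or "html"


def index_pdf(
    pdf_bytes: bytes,
    process_as_images: bool,
    extract_text_layer: Callable[[bytes], str],  # hypothetical extractor
    run_ocr: Callable[[bytes], str],             # hypothetical OCR engine
) -> PdfIndexResult:
    """Model of the two PDF indexing modes described above (assumed names)."""
    if process_as_images:
        # Image mode: OCR builds the text rendition; shown page by page.
        return PdfIndexResult(run_ocr(pdf_bytes), "ocr", "paginated")
    # Default mode: rely on the embedded text layer; shown as HTML output.
    return PdfIndexResult(extract_text_layer(pdf_bytes), "text_layer", "html")


# Stand-in callables simulating a scanned PDF that has no text layer.
scanned = b"%PDF-scanned"
no_text = lambda data: ""
fake_ocr = lambda data: "recognized text"

default_result = index_pdf(scanned, False, no_text, fake_ocr)
image_result = index_pdf(scanned, True, no_text, fake_ocr)
```

In this model, the scanned PDF yields an empty rendition in the default mode and a usable rendition in image mode, which is the situation the image-processing option is meant to address.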
To process PDF documents as images in Full-Text Search catalogs: