Metadata extraction limits allow AbstractMappingMetadataExtracter to be configured for:
- control of the maximum time allowed for an extraction
- control of the maximum size (MB) of any single document that the extractor will handle
- control of the maximum number of documents being processed concurrently at any point in time
By default, each of these properties is set to the maximum (MAX) value specified in the Java code, which effectively means no limit. The limits are configured per extractor and per mimetype.
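To illustrate how the three limits could interact, the following is a minimal sketch only. The class LimitedExtractor, its constructor parameters, and its behaviour of returning null when a limit is hit are all hypothetical and are not the Alfresco implementation; the sketch just mirrors the three properties described above (timeout, maximum document size, maximum concurrent extractions).

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical illustration of extraction limits; not Alfresco code. */
class LimitedExtractor {
    private final long timeoutMs;            // maximum time allowed for one extraction
    private final double maxDocumentSizeMB;  // maximum size of a single document
    private final Semaphore slots;           // one permit per allowed concurrent extraction

    // Daemon threads so that an abandoned extraction cannot keep the JVM alive.
    private final ExecutorService pool = Executors.newCachedThreadPool(r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    });

    LimitedExtractor(long timeoutMs, double maxDocumentSizeMB, int maxConcurrent) {
        this.timeoutMs = timeoutMs;
        this.maxDocumentSizeMB = maxDocumentSizeMB;
        this.slots = new Semaphore(maxConcurrent);
    }

    /** Returns the extraction result, or null when a limit prevents or aborts it. */
    <T> T extract(long documentSizeBytes, Callable<T> extraction) throws Exception {
        double sizeMB = documentSizeBytes / (1024.0 * 1024.0);
        if (sizeMB > maxDocumentSizeMB) {
            return null; // document exceeds the size limit
        }
        if (!slots.tryAcquire()) {
            return null; // concurrency limit already reached
        }
        try {
            Future<T> future = pool.submit(extraction);
            try {
                return future.get(timeoutMs, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the extraction that ran too long
                return null;
            }
        } finally {
            slots.release();
        }
    }
}
```

In this sketch, an instance created as new LimitedExtractor(20000, 10.0, 5) would skip documents larger than 10 MB, allow at most five extractions at once, and abandon any extraction that runs longer than 20 seconds.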
The limits configured for Content Services are:
- Timeout, configured for all extractors and all mimetypes:
  content.metadataExtracter.default.timeoutMs=20000
- Maximum size of a document to process, configured for PdfBoxMetadataExtracter (PDF files):
  content.metadataExtracter.pdf.maxDocumentSizeMB=10
- Maximum number of concurrent extractions, configured for PdfBoxMetadataExtracter (PDF files):
  content.metadataExtracter.pdf.maxConcurrentExtractionsCount=5