Configure a Custom Local Transform Pipeline - Alfresco Content Services - 23.4 - 23.4 - Ready - Alfresco - external

Alfresco Content Services

Platform
Alfresco
Product
Alfresco Content Services
Release
23.4
License

Local Transforms may be combined together in a pipeline to form a new transform, where the output from one becomes the input to the next and so on. A pipeline definition (JSON) defines the sequence of transforms and intermediate Media Types. Like any other transformer, it specifies a list of supported source and target Media Types. If you don’t supply any, all possible combinations are assumed to be available. The definition may reuse the transformOptions of transformers in the pipeline, but typically will define its own subset of these.

The following example begins with the helloWorld Transformer described in Creating a T-Engine, which takes a text file containing a name and produces an HTML file with a Hello <name> message in the body. This is then transformed back into a text file. This example contains just one pipeline transformer, but many may be defined in the same file.

{
  "transformers": [
    {
      "transformerName": "helloWorldText",
      "transformerPipeline" : [
        {"transformerName": "helloWorld", "targetMediaType": "text/html"},
        {"transformerName": "html"}
      ],
      "supportedSourceAndTargetList": [
        {"sourceMediaType": "text/plain", "priority": 45,  "targetMediaType": "text/plain" }
      ],
      "transformOptions": [
        "helloWorldOptions"
      ]
    }
  ]
}
  • transformerName - Try to create a unique name for the transform.
  • transformerPipeline - A list of transformers in the pipeline. The targetMediaType specifies the intermediate Media Types between transformers. There is no final targetMediaType as this comes from the supportedSourceAndTargetList.
  • supportedSourceAndTargetList - The supported source and target Media Types, which refer to the Media Types this pipeline transformer can transform from and to, additionally you can set the priority and the maxSourceSizeBytes see Supported Source and Target List. If blank, this indicates that all possible combinations are supported. This is the cartesian product of all source types to the first intermediate type and all target types from the last intermediate type. Any combinations supported by the first transformer are excluded. They will also have the priority from the first transform.
  • transformOptions - A list of references to options required by the pipeline transformer.

Custom Pipeline definitions need to be placed in a directory of the Repository. The default location (below) may be changed by resetting the following Alfresco global property.

local.transform.pipeline.config.dir=shared/classes/alfresco/extension/transform/pipelines

On startup this location is checked every minute, but then switches to once an hour if successful. After a problem, it tries every minute again. These are the same properties use to decide when to read T-Engine configurations, because pipelines combine transformers in the T-Engines.

local.transform.service.cronExpression=4 30 0/1 * * ?
local.transform.service.initialAndOnError.cronExpression=0 * * * * ?

If you are using Docker Compose in development, you will need to copy your pipeline definition into your running Repository container. One way is to use the following command, and it will be picked up the next time the location is read, which is dependent on the cron values.

docker cp custom_pipelines.json <alfresco container>:/usr/local/tomcat/shared/classes/alfresco/extension/transform/pipelines/

In a Kubernetes environment, ConfigMaps can be used to add pipeline definitions. You will need to create a ConfigMap from the JSON file and mount the ConfigMap through a volume to the Repository pods.

kubectl create configmap custom-pipeline-config --from-file=name_of_a_file.json

The necessary volumes are already provided out of the box and the files in ConfigMap custom-pipeline-config will be mounted to /usr/local/tomcat/shared/classes/alfresco/extension/transform/pipelines/. Again, the files will be picked up the next time the location is read, or when the repository pods are restarted.

CAUTION: From Kubernetes documentation: Caution: If there are some files in the mountPath location, they will be deleted.