In this post I will walk you step by step through deploying the Document Processing Extension (DPE) in Docker on Ubuntu.
I used Ubuntu 22.04.3 LTS, but you can use any other Linux distribution that supports Docker.
First, install the required components on your Ubuntu machine if they are not already installed.
- OpenSSL
- sudo apt install openssl
- Python3
- sudo apt install python3
- sudo apt install python3-distutils
- Docker with swarm enabled
- Add Docker’s official GPG key
- sudo apt-get update
- sudo apt-get install ca-certificates curl gnupg
- sudo install -m 0755 -d /etc/apt/keyrings
- curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
- sudo chmod a+r /etc/apt/keyrings/docker.gpg
- Add the repository to Apt sources
- echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- sudo apt-get update
- Now you can install Docker
- sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
- Create a swarm
- docker swarm init
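Before moving on, it can help to confirm that the prerequisites are really in place. The following is only a minimal sketch: the helper names missing_tools and swarm_state are my own and not part of any IBM or Docker tooling, and it only checks that the commands are on the PATH and what swarm state the local Docker engine reports.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical helper: print the swarm state reported by the local Docker engine,
# or "unknown" if docker is missing or the daemon cannot be reached.
swarm_state() {
    state=$(docker info --format '{{.Swarm.LocalNodeState}}' 2>/dev/null) || state=unknown
    printf '%s\n' "${state:-unknown}"
}

missing=$(missing_tools openssl python3 docker)
if [ -n "$missing" ]; then
    # Left unquoted on purpose so the names are printed on one line.
    echo "Missing prerequisites:" $missing
else
    echo "All prerequisites found; swarm state: $(swarm_state)"
fi
```

After "docker swarm init" the reported state should be "active". If it says "inactive", the swarm has not been created yet.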
Now we can continue with the DPE deployment. You need to download the latest deployment scripts from https://github.com/IBM/cloud-pak/blob/master/repo/case/ibm-dp-extension/1.0.1/ibm-dp-extension-1.0.1.tgz
Extract the archive, then either copy dpedeploy-23.0.1-IF003.tar.xz from …/ibm-dp-extension/inventory/adpOperator/files/deploy to a different location and extract it there, or extract it directly in the deploy folder.
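If you do not want to click through the folder structure, the extraction can be sketched roughly like this. It assumes that the downloaded .tgz sits in the current directory and unpacks into a folder named ibm-dp-extension (I did not verify the top-level folder name for every release). The helper find_dpedeploy is a name I made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the path of the first dpedeploy archive below the given directory.
find_dpedeploy() {
    find "$1" -type f -name 'dpedeploy-*.tar.xz' | head -n 1
}

# Only act if the download is present in the current directory.
if [ -f ibm-dp-extension-1.0.1.tgz ]; then
    tar xzf ibm-dp-extension-1.0.1.tgz
    # Assumption: the archive unpacks into ./ibm-dp-extension
    archive=$(find_dpedeploy ibm-dp-extension)
    if [ -n "$archive" ]; then
        # Extract next to the archive, that is, in the deploy folder.
        tar xf "$archive" -C "$(dirname "$archive")"
        echo "Extracted $archive"
    fi
fi
```

Extracting the inner archive needs xz support in tar, which a default Ubuntu installation normally provides.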
Now run dpedeploy to pull the DPE image.
Accept the license and enter the required credentials. Use “cp” as the username and, as the password, your secret key, which you need to create via https://myibm.ibm.com/products-services/containerlibrary
Enter a custom Docker swarm stack name or leave it as the default, ibm_dpe. You can use the built-in PostgreSQL database or connect to a remote PostgreSQL or DB2 database. For demo or testing purposes you can use the built-in database, but for production I recommend using a remote database system.
If you plan to use OCR Engine 2, please refer to the system requirements for DPE at https://www.ibm.com/docs/en/datacap/9.1.9?topic=extension-installing-document-processing
Check the stack status with “docker stack services ibm_dpe”
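The services can take a while to come up. As a rough sketch, you can filter the output of the status command so that it only lists services whose running replica count does not yet match the desired count. The function name not_ready is my own, and the stack name ibm_dpe assumes you kept the default.

```shell
#!/bin/sh
# Hypothetical helper: read "NAME CURRENT/DESIRED" lines on stdin and print the
# names of services whose current replica count differs from the desired one.
not_ready() {
    awk '{ split($2, r, "/"); if (r[1] != r[2]) print $1 }'
}

# Only query Docker if it is installed; an empty result means every service is up.
if command -v docker >/dev/null 2>&1; then
    docker stack services ibm_dpe --format '{{.Name}} {{.Replicas}}' 2>/dev/null | not_ready
fi
```

Run it again after a minute or two until it prints nothing.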
When all services are running, you can access the DPE UI via the URL that was provided by the deployment, in my case https://dpe
If you plan to use the built-in PostgreSQL database, you will also need to perform the following steps.
- docker cp $(docker ps -f name=ibm_dpe_spbackend --quiet):/data-org/db_sample_data/pg_imports.tar.xz ./
Now you have copied the sample data from the spbackend service container to your local file system.
Copy pg_imports.tar.xz into the ibm_assets/managed_db_scripts/PG folder and extract it there.
- cp pg_imports.tar.xz ibm_assets/managed_db_scripts/PG/
- tar xvf ibm_assets/managed_db_scripts/PG/pg_imports.tar.xz -C ibm_assets/managed_db_scripts/PG/
Verify that the “imports” folder was created.
Now you can start to create your DPE projects.
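The extract and verify steps can also be combined into one small function. This is only a sketch: extract_and_check is a hypothetical name, and the paths assume you run it from the directory that contains both pg_imports.tar.xz and the ibm_assets folder.

```shell
#!/bin/sh
# Hypothetical helper: extract archive $1 into directory $2 and succeed only if
# the folder $3 exists below $2 afterwards.
extract_and_check() {
    mkdir -p "$2" &&
    tar xf "$1" -C "$2" &&
    [ -d "$2/$3" ]
}

# Only act if the sample data archive is present in the current directory.
if [ -f pg_imports.tar.xz ]; then
    if extract_and_check pg_imports.tar.xz ibm_assets/managed_db_scripts/PG imports; then
        echo "imports folder is in place"
    else
        echo "imports folder is missing, please check the archive" >&2
    fi
fi
```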