Introduction
When migrating Java batch applications to cloud platforms such as OpenShift, there are different approaches to building and containerizing traditional applications. Recall that JSR-352-based Java batch applications can be developed and run in either a Java SE or a Java EE (now Jakarta EE) environment. So if your existing Java batch applications are Java EE web or enterprise applications deployed to application servers like WildFly, then you would build the new cloud batch applications based on OpenShift WildFly image streams and run them on the WildFly runtime on OpenShift.
If you've chosen to develop and run your existing Java batch applications as lightweight standalone Java SE applications, it's also easy to migrate to OpenShift using openjdk image streams and runtime. This is what we will be exploring in this blog post, to help JBeret users better understand the concepts and steps it takes to modernize batch applications. OpenShift provides a Java S2I (source-to-image) builder process that handles everything from building the application source code and injecting the application into the base image, to publishing to the OpenShift image registry and readying the application for execution. A JBeret sample batch application, jberet-simple, will be used to illustrate each step.
Set up, Build and Run Sample Batch Application the Traditional Way
First, let's see how to build and run the sample application the traditional way locally, and familiarize ourselves with the application structure and batch job. jberet-simple is a simple standalone Java SE batch processing application that contains a single batch job, defined in simple.xml. This batch job contains a single chunk-type step that reads a list of numbers in chunks and prints them to the console. The two batch artifacts used in this application are:
- arrayItemReader: implemented in jberet-support, reads a list of objects configured in job xml
- mockItemWriter: implemented in jberet-support, writes the output to the console or other destinations
For the complete batch job definition, see the JSL file simple.xml.
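To make the idea of a chunk-type step more concrete, here is a minimal, self-contained sketch of chunk-oriented processing in plain Java. It is not how JBeret or jberet-support implements arrayItemReader or mockItemWriter; the class name ChunkSketch, the method toChunks, and the chunk size of 3 are assumptions chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Hypothetical illustration of chunk processing; not part of JBeret. */
public class ChunkSketch {

    /**
     * Pulls items from the reader one at a time and groups them into
     * chunks of at most itemCount items, similar in spirit to what a
     * chunk-type step does before handing each chunk to a writer.
     */
    public static <T> List<List<T>> toChunks(Iterator<T> reader, int itemCount) {
        if (itemCount <= 0) {
            throw new IllegalArgumentException("itemCount must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        List<T> current = new ArrayList<>();
        while (reader.hasNext()) {
            current.add(reader.next());
            if (current.size() == itemCount) {
                chunks.add(current);          // chunk is full, hand it off
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) {
            chunks.add(current);              // last, possibly partial chunk
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Assumed sample data; the real job reads its items from the job xml.
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7);
        for (List<Integer> chunk : toChunks(numbers.iterator(), 3)) {
            // Stand-in for a writer that prints each chunk to the console.
            System.out.println("writing chunk: " + chunk);
        }
    }
}
```

In a real JSR-352 job the batch runtime drives this read-then-write loop for you, and the chunk size is configured in the job xml rather than in code.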
To clone the sample application from GitHub:
git clone https://github.com/jberet/jberet-simple.git
To build the sample application with Maven, including running the integration test:
mvn clean install
To run the integration test that starts the batch job:
mvn integration-test
To run the application's main class with the Maven exec plugin, execute any of the following mvn commands:
# run with the default configuration in pom.xml
mvn exec:java

# run with job xml
mvn exec:java -Dexec.arguments="simple.xml"

# run with job xml and job parameters
mvn exec:java -Dexec.arguments="simple.xml jobParam1=x jobParam2=y jobParam3=z"
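The arguments above follow a simple pattern: the job xml name first, then any number of key=value job parameters. The sketch below shows one way a main class could split such arguments. It is an illustration only, not the actual jberet-simple main class; the class name JobArgs and its two methods are hypothetical.

```java
import java.util.Properties;

/** Hypothetical argument parser; not the actual jberet-simple main class. */
public class JobArgs {

    /** Returns the job xml name, assumed to be the first argument. */
    public static String jobXmlName(String[] args) {
        if (args.length == 0) {
            throw new IllegalArgumentException("missing job xml name");
        }
        return args[0];
    }

    /** Collects the remaining key=value arguments as job parameters. */
    public static Properties jobParameters(String[] args) {
        Properties params = new Properties();
        for (int i = 1; i < args.length; i++) {
            int eq = args[i].indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException(
                        "expected key=value but got: " + args[i]);
            }
            params.setProperty(args[i].substring(0, eq),
                               args[i].substring(eq + 1));
        }
        return params;
    }

    public static void main(String[] args) {
        // Same values as in the mvn exec:java example above.
        String[] sample = {"simple.xml", "jobParam1=x", "jobParam2=y", "jobParam3=z"};
        System.out.println("job xml: " + jobXmlName(sample));
        System.out.println("job parameters: " + jobParameters(sample));
    }
}
```

In a JSR-352 application, values like these would typically end up in a call to JobOperator.start(jobXMLName, jobParameters), which starts the job execution.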
To build the application as an executable uber jar (fat jar) and run it directly with the java -jar command:
mvn clean install -Popenshift
java -jar target/jberet-simple.jar simple.xml jobParam1=x jobParam2=y
Note that in the above command, a Maven profile named openshift is used. This profile tells Maven to build the uber jar so that it includes everything needed to run the application. When the openshift profile is present, the OpenShift S2I builder process picks it up instead of the default profile. Of course, this profile can also be invoked manually, as we just did above.
Build Application Images and Deploy to OpenShift
Next, let's delve into how to run the jberet-simple application on OpenShift. Since this is a standalone Java SE application, OpenShift will need to enlist a Java SE runtime, and here we choose to use openjdk18. All the operations we will be performing can be done via either the OpenShift command line tool (oc) or the OpenShift Web Console. For the sake of brevity, we will use oc commands. For an introduction to the various features in OpenShift, you may want to check out the OpenShift interactive tutorials.
We assume you already have an OpenShift account. To log in:
oc login https://xxx.openshift.com --token=xxx
To create a new project, if there are no existing projects:
oc new-project <project-name>
We will use the openjdk18-openshift image stream. Check if it is available in the current project:
oc get is
If openjdk18-openshift is not present, import it:
oc import-image my-redhat-openjdk-18/openjdk18-openshift --from=registry.access.redhat.com/redhat-openjdk-18/openjdk18-openshift --confirm
To create a new application (with the default name):
oc new-app openjdk18-openshift~https://github.com/jberet/jberet-simple.git
Or, to create a new application with a custom name, if the default name doesn't fit:
oc new-app openjdk18-openshift~https://github.com/jberet/jberet-simple.git --name=hello-batch
The above new-app command takes a while to complete. To check its status:
oc status
To list pods, and get logs for the pod associated with the application (replace jberet-simple-1-kpvqn with your pod name):
oc get pods
oc logs jberet-simple-1-kpvqn
From the above log output, you can see that the application has been successfully built and deployed to OpenShift Online, and that the batch job has been executed.
Launch a Job Execution from OpenShift Command Line
By now we've successfully built the application, deployed it to OpenShift, and started the batch job execution. You may want to run it again later as needed, and this can easily be done from the OpenShift command line with the oc client tool and the Kubernetes job API.
First, create a yaml file to describe how OpenShift should run the batch application. For example, I created the following file, simple.yaml, to launch the batch application (replace the container image value with the appropriate one for your OpenShift environment):
apiVersion: batch/v1
kind: Job
metadata:
  name: simple
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: simple
    spec:
      containers:
      - name: jberet-simple
        image: docker-registry.default.svc:5000/pr/jberet-simple
        command: ["java", "-jar", "/deployments/jberet-simple.jar", "simple.xml", "jobParam1=x", "jobParam2=y"]
      restartPolicy: OnFailure
Then, run the following command to tell OpenShift to launch the job execution:
$ oc create -f simple.yaml
job.batch "simple" created
To list Kubernetes jobs:
$ oc get jobs
NAME      DESIRED   SUCCESSFUL   AGE
simple    1         1            12m
To list pods, including the one responsible for running the above simple batch application:
$ oc get pods
NAME                    READY     STATUS             RESTARTS   AGE
jberet-simple-5-build   0/1       Completed          0          11h
jberet-simple-6-build   0/1       Completed          0          8h
jberet-simple-6-wwjm7   0/1       CrashLoopBackOff   105        8h
postgresql-5-sbfm5      1/1       Running            0          1d
simple-mpq8h            0/1       Completed          0          8h
To view logs from the above simple batch job execution, passing the appropriate pod name:
$ oc logs simple-mpq8h
To delete the job created in the above step:
$ oc delete job simple
job.batch "simple" deleted
Schedule Repeating Job Executions with Kubernetes Cron Jobs from OpenShift Command Line
You may be wondering if it's possible to schedule periodic batch job executions from the OpenShift command line. The answer is yes: this is supported by the Kubernetes cron job API, and works much like launching a one-time job execution as demonstrated above.
First, create a yaml file to define the Kubernetes cron job spec. In the following example, simple-cron.yaml, the cron expression `*/1 * * * *` specifies running the batch job every minute.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: simple-cron
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: simple-cron
            image: docker-registry.default.svc:5000/pr/jberet-simple
            command: ["java", "-jar", "/deployments/jberet-simple.jar", "simple.xml", "jobParam1=x", "jobParam2=y"]
          restartPolicy: OnFailure
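The first field of the schedule is the minute field, and `*/1` there means "every minute, in steps of one". As a rough sketch of how such a step expression is interpreted, the following plain Java class checks whether a given minute matches a minute field. It is a simplified illustration, not the parser Kubernetes uses: the class name CronMinuteField is hypothetical, and it only understands `*`, `*/n`, and a single number.

```java
/** Hypothetical, simplified matcher for the minute field of a cron expression. */
public class CronMinuteField {

    /**
     * Returns true if the given minute (0-59) matches the field.
     * Supports only "*", "*\/n" style steps, and a single number;
     * lists and ranges are deliberately left out of this sketch.
     */
    public static boolean matches(String field, int minute) {
        if (minute < 0 || minute > 59) {
            throw new IllegalArgumentException("minute must be in 0-59");
        }
        if (field.equals("*")) {
            return true;                       // every minute
        }
        if (field.startsWith("*/")) {
            int step = Integer.parseInt(field.substring(2));
            if (step <= 0) {
                throw new IllegalArgumentException("step must be positive");
            }
            return minute % step == 0;         // every step-th minute from 0
        }
        return Integer.parseInt(field) == minute;  // one fixed minute
    }

    public static void main(String[] args) {
        // "*/1" matches every minute, so the job is scheduled once a minute.
        System.out.println("*/1 at minute 37: " + matches("*/1", 37));
        // "*/15" would match only minutes 0, 15, 30 and 45.
        System.out.println("*/15 at minute 20: " + matches("*/15", 20));
    }
}
```

Changing the schedule to something like `*/15 * * * *` would therefore run the batch job every quarter of an hour instead of every minute.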
Then, run the following command to tell OpenShift to schedule the job executions:
$ oc create -f simple-cron.yaml
cronjob.batch "simple-cron" created
To list all cron jobs:
$ oc get cronjobs
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                         7s
To get the status of a specific cron job:
$ oc get cronjob simple-cron
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                         24s
To get the continuous status of a specific cron job with the --watch option:
$ oc get cronjob simple-cron --watch
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                         33s
simple-cron   */1 * * * *   False     1         7s              46s
simple-cron   */1 * * * *   False     0         37s             1m
To get all pods, including the pods responsible for running scheduled job executions:
$ oc get pods
NAME                           READY     STATUS              RESTARTS   AGE
postgresql-5-sbfm5             1/1       Running             0          27d
simple-cron-1536609780-fmrhf   0/1       ContainerCreating   0          1s
simple-mpq8h                   0/1       Completed           0          26d
To view logs of one of the scheduled job executions, passing the appropriate pod name:
$ oc logs simple-cron-1536609780-fmrhf
To delete the cron job created above:
$ oc delete cronjob simple-cron
cronjob.batch "simple-cron" deleted
Summary
In this blog post, we used a sample Java batch application to demonstrate how to run it locally, build and deploy it as a containerized application to OpenShift, launch a batch job execution from the OpenShift command line, and schedule cron jobs for periodic batch job executions. This post only touches on some of the basics of running batch jobs on the OpenShift platform, and there are many options for concurrency, scalability and restartability that are worth exploring further. I hope you find it useful in your batch application development. Feedback and comments are always welcome to help us improve project JBeret.