Introduction

 

When migrating Java batch applications to cloud platforms such as OpenShift, there are different approaches to building and containerizing traditional applications.  Recall that JSR-352-based Java batch applications can be developed and run in either a Java SE or a Java EE (now Jakarta EE) environment.  So if your existing Java batch applications are Java EE web or enterprise applications deployed to application servers such as WildFly, you would build the new cloud batch applications from OpenShift WildFly image streams and run them on the WildFly runtime on OpenShift.

 

If you've chosen to develop and run your existing Java batch applications as lightweight standalone Java SE applications, it's also easy to migrate to OpenShift using OpenJDK image streams and runtime.  This is what we will explore in this blog post to help JBeret users better understand the concepts and steps it takes to modernize batch applications.  OpenShift provides a Java S2I (source-to-image) builder process that handles everything from building the application source code, to injecting the application into the base image, to publishing to the OpenShift image registry and readying the application for execution.  A JBeret sample batch application, jberet-simple, will be used to illustrate each step.

 

Set Up, Build, and Run the Sample Batch Application the Traditional Way

 

First, let's see how to build and run the sample application the traditional way locally, and familiarize ourselves with the application structure and batch job.  jberet-simple is a simple standalone Java SE batch processing application and contains a single batch job as defined in simple.xml.  This batch job contains a single chunk-type step that reads a list of numbers by chunks and prints them to the console.  The two batch artifacts used in this application are an item reader that supplies the numbers and an item writer that prints each chunk to the console.

 

 

For the complete batch job definition, see the JSL file simple.xml.
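To illustrate what a chunk-type step does, here is a small self-contained Java sketch of the read-a-chunk-then-write loop.  This is a simplified stand-in, not the actual javax.batch API, and the class and method names are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the chunk loop a chunk-type step performs:
// read items one at a time, and hand them to the writer in chunks.
public class ChunkDemo {

    // Plays the role of the item reader: returns the next number,
    // or null when the data source is exhausted.
    static Integer readItem(int[] data, int cursor) {
        return cursor < data.length ? data[cursor] : null;
    }

    // Plays the role of the item writer: prints each chunk to the console.
    static void writeItems(List<Integer> chunk, List<List<Integer>> sink) {
        System.out.println("Writing chunk: " + chunk);
        sink.add(new ArrayList<>(chunk));
    }

    // Drives the chunk loop: collect items until the chunk size is
    // reached, then pass the chunk to the writer.
    static List<List<Integer>> runJob(int[] data, int chunkSize) {
        List<List<Integer>> written = new ArrayList<>();
        List<Integer> chunk = new ArrayList<>();
        int cursor = 0;
        Integer item;
        while ((item = readItem(data, cursor++)) != null) {
            chunk.add(item);
            if (chunk.size() == chunkSize) {
                writeItems(chunk, written);
                chunk = new ArrayList<>();
            }
        }
        if (!chunk.isEmpty()) {
            writeItems(chunk, written);  // final partial chunk
        }
        return written;
    }

    public static void main(String[] args) {
        runJob(new int[]{0, 1, 2, 3, 4, 5, 6}, 3);
    }
}
```

In the real application, the batch runtime drives this loop itself, with checkpointing at chunk boundaries; the artifacts only implement the read and write callbacks.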

 

To git-clone the sample application from GitHub:

git clone https://github.com/jberet/jberet-simple.git

 

To build the sample application with Maven, including running the integration test:

mvn clean install

 

To run the integration test that starts the batch job:

mvn integration-test

 

To run the application's main class with the Maven exec plugin, execute any of the following mvn commands:

 

# run with the default configuration in pom.xml
mvn exec:java

# run with job xml
mvn exec:java -Dexec.arguments="simple.xml"

# run with job xml and job parameters
mvn exec:java -Dexec.arguments="simple.xml jobParam1=x jobParam2=y jobParam3=z"

 

To build the application as an executable uber jar (fat jar) and run it directly with the java -jar command:

 

mvn clean install -Popenshift
java -jar target/jberet-simple.jar simple.xml jobParam1=x jobParam2=y

 

Note that in the above command, a Maven profile named openshift is used.  This profile tells Maven to build the uber jar to include everything needed to run the application.  When the openshift profile is present, it is picked up by the OpenShift S2I builder process instead of the default profile.  Of course, this profile can also be invoked manually, as we just did above.
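An uber-jar profile of this kind is often wired up with the Maven shade plugin.  A minimal sketch is shown below; the plugin choice and the main class name are assumptions here, so see the project's actual pom.xml for the authoritative configuration:

```xml
<profile>
  <id>openshift</id>
  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-shade-plugin</artifactId>
        <executions>
          <execution>
            <phase>package</phase>
            <goals>
              <goal>shade</goal>
            </goals>
            <configuration>
              <finalName>jberet-simple</finalName>
              <transformers>
                <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                  <!-- placeholder: use the application's actual main class -->
                  <mainClass>com.example.Main</mainClass>
                </transformer>
              </transformers>
            </configuration>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</profile>
```

The finalName element is what makes the jar land at target/jberet-simple.jar, matching the java -jar command above.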

 

Build Application Images and Deploy to OpenShift

 

Next, let's delve into how to run the jberet-simple application on OpenShift.  Since this is a standalone Java SE application, OpenShift needs to enlist a Java SE runtime, and here we choose the openjdk18 image.  All the operations we will be performing can be done via either the OpenShift command line tool (oc) or the OpenShift Web Console.  For the sake of brevity, we will use oc commands.  For an introduction to various features in OpenShift, you may want to check out the OpenShift interactive tutorials.

 

We assume you already have an OpenShift account.  To log in:

oc login https://xxx.openshift.com --token=xxx

 

To create a new project, if there are no existing projects:

oc new-project <project-name>

 

We will use the openjdk18-openshift image stream.  Check whether it is available in the current project:

oc get is

 

If openjdk18-openshift is not present, import it:

oc import-image my-redhat-openjdk-18/openjdk18-openshift --from=registry.access.redhat.com/redhat-openjdk-18/openjdk18-openshift --confirm

 

To create a new application (with the default name):

oc new-app openjdk18-openshift~https://github.com/jberet/jberet-simple.git

 

Or to create a new application with custom name, if the default name doesn't fit:

oc new-app openjdk18-openshift~https://github.com/jberet/jberet-simple.git --name=hello-batch

 

The above new-app command takes a while to complete.  To check its status:

oc status

 

To list pods, and get logs for the pod associated with the application (replace jberet-simple-1-kpvqn with your pod name):

oc get pods
oc logs jberet-simple-1-kpvqn

 

From the above log output, you can see that the application has been successfully built and deployed to OpenShift, and the batch job executed.

 

Launch a Job Execution from OpenShift Command Line

 

By now we've successfully built the application, deployed it to OpenShift, and started the batch job execution.  You may want to run it again later as needed, and this can easily be done from the OpenShift command line with the oc client tool and the Kubernetes job API.

 

First, create a YAML file to describe how OpenShift should run the batch application.  For example, I created the following file, simple.yaml, to launch the batch application (replace the container image value with the appropriate one in your OpenShift environment):

 

apiVersion: batch/v1
kind: Job
metadata:
  name: simple
spec:
  parallelism: 1
  completions: 1
  template:
    metadata:
      name: simple
    spec:
      containers:
      - name: jberet-simple
        image: docker-registry.default.svc:5000/pr/jberet-simple
        command: ["java", "-jar", "/deployments/jberet-simple.jar", "simple.xml", "jobParam1=x", "jobParam2=y"]
      restartPolicy: OnFailure

 

Then, run the following command to tell OpenShift to launch the job execution:

 

$ oc create -f simple.yaml
job.batch "simple" created

 

To list Kubernetes jobs:

 

$ oc get jobs
NAME      DESIRED   SUCCESSFUL   AGE
simple    1         1            12m

 

To list pods, including the one responsible for running the above simple batch application:

 

$ oc get pods
NAME                    READY     STATUS             RESTARTS   AGE
jberet-simple-5-build   0/1       Completed          0          11h
jberet-simple-6-build   0/1       Completed          0          8h
jberet-simple-6-wwjm7   0/1       CrashLoopBackOff   105        8h
postgresql-5-sbfm5      1/1       Running            0          1d
simple-mpq8h            0/1       Completed          0          8h

 

To view logs from the above simple batch job execution, passing the appropriate pod name:

 

$ oc logs simple-mpq8h

 

To delete the job created in above step:

 

$ oc delete job simple
job.batch "simple" deleted

 

 

Schedule Repeating Job Executions with Kubernetes Cron Jobs from OpenShift Command Line

 

You may be wondering whether it's possible to schedule periodic batch job executions from the OpenShift command line.  The answer is yes: this is supported with the Kubernetes cron job API, similar to launching a one-time job execution as demonstrated above.

 

First, create a YAML file to define the Kubernetes cron job spec.  In the following example, simple-cron.yaml, the cron expression `*/1 * * * *` specifies running the batch job every minute.

 

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: simple-cron
spec:
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: simple-cron
            image: docker-registry.default.svc:5000/pr/jberet-simple
            command: ["java", "-jar", "/deployments/jberet-simple.jar", "simple.xml", "jobParam1=x", "jobParam2=y"]
          restartPolicy: OnFailure

 

Then, run the following command to tell OpenShift to schedule the job executions:

 

$ oc create -f simple-cron.yaml
cronjob.batch "simple-cron" created

 

To list all cron jobs:

 

$ oc get cronjobs
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                   7s

 

To get status of a specific cron job:

 

$ oc get cronjob simple-cron
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                   24s

 

To continuously watch the status of a specific cron job, use the --watch option:

 

$ oc get cronjob simple-cron --watch
NAME          SCHEDULE      SUSPEND   ACTIVE    LAST SCHEDULE   AGE
simple-cron   */1 * * * *   False     0                   33s
simple-cron   */1 * * * *   False     1         7s        46s
simple-cron   */1 * * * *   False     0         37s       1m

 

To get all pods, including the pods responsible for running scheduled job executions:

 

$ oc get pods
NAME                           READY     STATUS              RESTARTS   AGE
postgresql-5-sbfm5             1/1       Running             0          27d
simple-cron-1536609780-fmrhf   0/1       ContainerCreating   0          1s
simple-mpq8h                   0/1       Completed           0          26d

 

To view logs of one of the scheduled job executions, passing the appropriate pod name:

 

$ oc logs simple-cron-1536609780-fmrhf

 

To delete the cron job created above:

 

$ oc delete cronjob simple-cron
cronjob.batch "simple-cron" deleted

 

Summary

 

In this blog post, we demonstrated with a sample Java batch application how to run it locally, build and deploy the containerized application to OpenShift, launch a batch job execution from the OpenShift command line, and schedule periodic batch job executions with cron jobs.  This post touches on just some of the basics of running batch jobs on the OpenShift platform; there are many options for concurrency, scalability, and restartability that are worth exploring further.  I hope you find it useful in your batch application development.  Feedback and comments are always welcome to help us improve project JBeret.