Installation and Configuration of Apache Oozie on OpenSUSE 12

Oozie is an open source workflow scheduler for Hadoop. It simplifies workflow and coordination between jobs. Using Oozie, we can define dependencies between jobs for a given set of input data and let the Oozie scheduler run them in the right order automatically.

Installation

I used Cloudera's Oozie Repository. To add the repository to OpenSUSE's zypper package manager, as root, enter the following:

[linux-mlkb:/ROOT]> zypper addrepo -f http://archive.cloudera.com/sles/11/x86_64/cdh/cloudera-cdh3.repo
[linux-mlkb:/ROOT]> zypper search oozie
          
Loading repository data...
Reading installed packages...

S | Name         | Summary                                               | Type
--+--------------+-------------------------------------------------------+-----------
  | oozie        | Oozie is a system that runs workflows of Hadoop jobs. | package
  | oozie        | Oozie is a system that runs workflows of Hadoop jobs. | srcpackage
  | oozie-client | Client for Oozie Workflow Engine                      | package

    

Now we can install the software:

         
[linux-mlkb:/ROOT]> zypper install oozie
   
          Loading repository data...
          Reading installed packages...
          Resolving package dependencies...
          
          The following NEW packages are going to be installed:
          bigtop-utils oozie oozie-client 
          
          3 new packages to install.
          Overall download size: 91.8 MiB. After the operation, additional 111.6 MiB will be used.
          Continue? [y/n/?] (y): y
          Retrieving package bigtop-utils-3.4.0+3-1.noarch                                           (1/3),   6.8 KiB ( 13.6 KiB unpacked)
          Retrieving: bigtop-utils-3.4.0+3-1.noarch.rpm .............................................[done]
          Retrieving package oozie-client-2.3.2+27.30-1.noarch                                       (2/3),  34.7 MiB ( 53.9 MiB unpacked)
          Retrieving: oozie-client-2.3.2+27.30-1.noarch.rpm .........................................[done (645.2 KiB/s)]
          Retrieving package oozie-2.3.2+27.30-1.noarch                                              (3/3),  57.1 MiB ( 57.7 MiB unpacked)
          Retrieving: oozie-2.3.2+27.30-1.noarch.rpm ................................................[done (339.6 KiB/s)]
          Installing: bigtop-utils-3.4.0+3-1 ........................................................[done]
          Installing: oozie-client-2.3.2+27.30-1 ....................................................[done]
          Installing: oozie-2.3.2+27.30-1 ...........................................................[done]
          Additional rpm output:
          insserv: Service network is missed in the runlevels 2 4 to use service oozie
          
          Note: This output shows SysV services only and does not include native
          systemd services. SysV configuration data might be overridden by native
          systemd configuration.
          
          oozie                     0:off  1:off  2:on   3:on   4:on   5:on   6:off   
      

Configuration

The Oozie utility scripts should now be located in the following folder:

/usr/lib/oozie
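
As a quick sanity check before continuing, the sketch below verifies that the two scripts used later in this walkthrough (ooziedb.sh and oozie-setup.sh) are present and executable under /usr/lib/oozie/bin. The function name check_oozie_scripts is made up for this example and is not part of Oozie.

```shell
#!/bin/sh
# check_oozie_scripts is a hypothetical helper, not part of Oozie.
# It reports any script used later in this guide that is missing
# or not executable under the given Oozie home directory.
check_oozie_scripts() {
    oozie_home="${1:-/usr/lib/oozie}"
    missing=0
    for script in ooziedb.sh oozie-setup.sh; do
        if [ ! -x "$oozie_home/bin/$script" ]; then
            echo "missing or not executable: $oozie_home/bin/$script" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_oozie_scripts            (checks /usr/lib/oozie)
#        check_oozie_scripts /some/dir  (checks another install prefix)
```

A non-zero return status suggests the package did not install where this guide expects it.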

Oozie requires a database. By default it uses Apache Derby, an embedded Java RDBMS. To use another database such as MySQL, install and configure that database, as root, before going further.

(install and configure a database)
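
For MySQL, that step might look roughly like the sketch below. It only prints the SQL so it can be reviewed before anything runs. The helper name oozie_mysql_sql, the database name oozie, the user oozie, and the password are all placeholders chosen for illustration, not values taken from this install.

```shell
#!/bin/sh
# oozie_mysql_sql is a hypothetical helper: it only prints SQL, it does not
# run it. Database name, user, and password are illustrative placeholders.
oozie_mysql_sql() {
    db="${1:-oozie}"
    user="${2:-oozie}"
    pass="${3:-changeme}"
    cat <<EOF
CREATE DATABASE IF NOT EXISTS ${db};
CREATE USER '${user}'@'localhost' IDENTIFIED BY '${pass}';
GRANT ALL PRIVILEGES ON ${db}.* TO '${user}'@'localhost';
FLUSH PRIVILEGES;
EOF
}

# Review the statements first, then pipe them into the mysql client as a
# MySQL administrator, for example:
#   oozie_mysql_sql oozie oozie 'S3cret' | mysql -u root -p
```

Printing the statements instead of executing them keeps the password out of the script itself and lets you adjust the host part of the account if Oozie connects from another machine.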
      

Oozie requires a JDBC driver. The following example adds one for MySQL. Enter the following as root:

[linux-mlkb:/ROOT]> service oozie stop
[linux-mlkb:/ROOT]> mkdir tmp
[linux-mlkb:/ROOT]> cd tmp
[linux-mlkb:/ROOT]> wget http://dev.mysql.com/get/Downloads/Connector-J/mysql-connector-java-5.1.31.tar.gz
[linux-mlkb:/ROOT]> tar -zxf mysql-connector-java-5.1.31.tar.gz	
[linux-mlkb:/ROOT]> cd mysql-connector-java-5.1.31
[linux-mlkb:/mysql-connector-java-5.1.31]> cp mysql-connector-java-5.1.31-bin.jar /var/lib/oozie/
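
Copying the driver is probably not enough on its own: Oozie also has to be told to use MySQL instead of Derby. The sketch below prints the JDBC properties that would go inside the <configuration> element of oozie-site.xml, assumed here to live in /etc/oozie/conf (the OOZIE_CONFIG directory reported by ooziedb.sh). The helper name oozie_jdbc_properties, the URL, user, and password are placeholders; the driver class shown is the one shipped with Connector/J 5.1.

```shell
#!/bin/sh
# oozie_jdbc_properties is a hypothetical helper that prints the JDBC
# properties for oozie-site.xml; it does not modify any file.
# URL, user, and password are illustrative placeholders. Values containing
# XML special characters such as & or < would need escaping by hand.
oozie_jdbc_properties() {
    url="${1:-jdbc:mysql://localhost:3306/oozie}"
    user="${2:-oozie}"
    pass="${3:-changeme}"
    cat <<EOF
<property>
    <name>oozie.service.JPAService.jdbc.driver</name>
    <value>com.mysql.jdbc.Driver</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.url</name>
    <value>${url}</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.username</name>
    <value>${user}</value>
</property>
<property>
    <name>oozie.service.JPAService.jdbc.password</name>
    <value>${pass}</value>
</property>
EOF
}

# Usage: print the fragment, then merge it into oozie-site.xml by hand.
#   oozie_jdbc_properties jdbc:mysql://localhost:3306/oozie oozie 'S3cret'
```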

As the oozie user, create the schema that Oozie needs by executing the command below:

[linux-mlkb:/]> sudo -u oozie /usr/lib/oozie/bin/ooziedb.sh create -run

Sample output:

setting OOZIE_CONFIG=/etc/oozie/conf
setting OOZIE_DATA=/var/lib/oozie
setting OOZIE_LOG=/var/log/oozie
setting OOZIE_CATALINA_HOME=/usr/lib/bigtop-tomcat
setting CATALINA_TMPDIR=/var/lib/oozie
setting CATALINA_PID=/var/run/oozie/oozie.pid
setting CATALINA_BASE=/usr/lib/oozie/oozie-server-0.20
setting CATALINA_OPTS=-Xmx1024m
setting OOZIE_HTTPS_PORT=11443
...
DONE
Oozie DB has been created for Oozie version '3.3.2-cdh4.7.0'
The SQL commands have been written to: /tmp/ooziedb-8250405588513665350.sql

Now we need to configure the web console. Download the ExtJS library and register it with Oozie using the following commands, as root:

[root@master tmp]# wget http://archive.cloudera.com/gplextras/misc/ext-2.2.zip
[root@master tmp]# /usr/lib/oozie/bin/oozie-setup.sh -extjs ext-2.2.zip 

Finally, start the Oozie server and check its status by running the following commands:

[root@master tmp]# service oozie start
[root@master tmp]# service oozie status

oozie.service - LSB: Oozie server daemon
          Loaded: loaded (/etc/init.d/oozie)
          Active: active (exited) since Wed, 08 Apr 2015 20:06:43 -0400; 5s ago
         Process: 8501 ExecStop=/etc/init.d/oozie stop (code=exited, status=0/SUCCESS)
         Process: 8600 ExecStart=/etc/init.d/oozie start (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/oozie.service

Apr 08 20:06:43 linux-mlkb oozie[8600]: Setting OOZIE_HTTP_HOSTNAME: linux-mlkb
Apr 08 20:06:43 linux-mlkb oozie[8600]: Setting OOZIE_HTTP_PORT:     11000
Apr 08 20:06:43 linux-mlkb oozie[8600]: Setting OOZIE_ADMIN_PORT:     11001
Apr 08 20:06:43 linux-mlkb oozie[8600]: Setting OOZIE_BASE_URL:      http://linux-mlkb:11000/oozie
Apr 08 20:06:43 linux-mlkb oozie[8600]: Using   CATALINA_BASE:       /var/lib/oozie/oozie-server
Apr 08 20:06:43 linux-mlkb oozie[8600]: Setting CATALINA_OUT:        /var/log/oozie/catalina.out
Apr 08 20:06:43 linux-mlkb oozie[8600]: Using   CATALINA_PID:        /var/run/oozie/oozie.pid
Apr 08 20:06:43 linux-mlkb oozie[8600]: Using   CATALINA_OPTS:       -Dderby.stream.error.file=/var/log/oozie/derby.log
Apr 08 20:06:43 linux-mlkb oozie[8600]: Adding to CATALINA_OPTS:     -Doozie.home.dir=/usr/lib/oozie -Doozie.config.dir=/etc/oozie -Doozie.log.dir=/var/log/oozie -Doozie.d...:11000/oozie
Apr 08 20:06:43 linux-mlkb oozie[8600]: Oozie start succeeded

[root@master tmp]# oozie admin -oozie http://localhost:11000/oozie -status
System mode: NORMAL
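
The server can take a few seconds to come up after the service reports success, so a script that depends on Oozie may want to poll rather than check once. The sketch below wraps the same oozie admin -status command in a retry loop. The helper name wait_for_oozie and the default attempt count and delay are assumptions made for this example.

```shell
#!/bin/sh
# wait_for_oozie is a hypothetical helper built on the `oozie admin -status`
# command. It polls until the server reports "System mode: NORMAL".
wait_for_oozie() {
    url="${1:-http://localhost:11000/oozie}"
    tries="${2:-30}"   # number of attempts (assumed default)
    delay="${3:-2}"    # seconds to sleep after a failed attempt (assumed default)
    i=0
    while [ "$i" -lt "$tries" ]; do
        if oozie admin -oozie "$url" -status 2>/dev/null \
            | grep -q 'System mode: NORMAL'; then
            echo "Oozie is up at $url"
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "Oozie did not report NORMAL after $tries attempts" >&2
    return 1
}

# Usage: wait_for_oozie http://localhost:11000/oozie 30 2
```

The function returns 0 as soon as the server answers in NORMAL mode and 1 if it never does, which makes it usable as a guard in a larger setup script.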

That's it; Oozie should now be running. Below is a sample of what the Oozie console should look like.