efg's Research Notes  
Embarrassingly Parallel Computations Using the Sun Grid Engine


Cluster Job "get info"

Purpose.  The purpose of this example is to compare/contrast differences among the cluster nodes, the cluster "head node", and a developer's Linux box.

Background.  I have tried to develop a common approach to all cluster jobs that involves potentially three slightly different Linux environments: the cluster nodes, the cluster "head node", and a developer's Linux box.  The differences among these environments can include:

  • memory
  • CPUs
  • platform (32-bit vs 64-bit)
  • mounted filesystems
  • PATH to executables that are available
  • environment variables

Understanding these differences is essential for writing software that can be developed and tested on a developer's Linux box and then run on the cluster.
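The first three items in the list above (memory, CPUs, platform) do not show up in printenv or df output, so they need their own commands. The following is a minimal sketch, assuming a Linux host with /proc/meminfo and a getconf that knows _NPROCESSORS_ONLN and LONG_BIT. The function name host_summary is hypothetical and is not part of the scripts in these notes.

```shell
#!/bin/bash
# Hypothetical helper: print one line per hardware/platform fact.
# Assumes Linux (/proc/meminfo) and a getconf that supports
# _NPROCESSORS_ONLN and LONG_BIT (true of glibc-based systems).
host_summary() {
    echo "host=$(uname -n)"
    echo "machine=$(uname -m)"                   # e.g. i686 or x86_64
    echo "word_size=$(getconf LONG_BIT)"         # 32 or 64
    echo "cpus=$(getconf _NPROCESSORS_ONLN)"     # CPUs currently online
    # MemTotal is reported in kB in the second field of its line.
    echo "mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)"
}

host_summary
```

Run on the developer's box, on the head node, and inside a submitted job, this gives three short outputs that are easy to compare side by side.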

My first few cluster jobs were not successful because I assumed the cluster nodes had certain resources they did not.  I backed into writing this script to understand the differences.

Overview

  1. On the developer's box create a directory for this project. A directory that is visible to both the cluster and the developer's box must be chosen.
  2. Create, test, and run the script, getinfo.bash, on a developer's box to avoid any impact on the cluster during development.  This script simply records various information about the Linux environment.  [This script has been modified and re-run over time to explore other differences that were discovered.] 
  3. Create submit.bash script on developer's box to submit jobs using the Sun Grid Engine "qsub" command to some of the Linux cluster nodes.  For now we'll only run our tests on two random nodes, and assume the rest of the nodes are configured in a similar way, which may or may not be the case.
  4. Log in to the cluster head node and change to the project directory. 
  5. Run the submit.bash script on the cluster head node.
  6. Compare the differences among the various environments.

Step-by-Step Instructions and Comments

  1. I don't find the default Linux prompt as useful as it might be.  I want to always know what login account I'm using, what machine I'm on, and what the current working directory is. Also, I'd like the time stamp recorded so that when I leave something running for a long time, I can see when it finished whether or not I witness it.  I added this line to my bash .profile file so all this information appears in my prompt:

    PS1="\n[\! \u \h \$(date '+%d%b%y %T') \$( pwd )]\n"

  2. Log in to the developer's box (genekc03 in this case).  Because of the PS1 environment variable above, the initial prompt shows this:

    [267 efg genekc03 22Oct07 17:23:24 /home/efg]

    I change the directory to cluster (cd cluster) and then create a new project directory (mkdir getinfo).  I change to the new directory (cd getinfo).  The prompt shows the current working directory:

    [291 efg genekc03 22Oct07 17:27:36 /home/efg/cluster/getinfo]

    Cluster jobs sometimes result in hundreds if not thousands of files.  Using a separate directory per cluster job helps keep all the files in one place.  When there are many output files, a separate directory (perhaps a subdirectory off this working directory) is a good idea to hold the standard output files.

  3. Create getinfo.bash.  I created this script to use the echo command to write certain information to standard output.   Here's what the script looks like:

    [292 efg genekc03 22Oct07 17:28:53 /home/efg/cluster/getinfo]
    cat -n getinfo.bash
    1 #! /bin/bash
    2 # efg, 18 Oct 2007
    3
    4 echo `date +"%Y/%m/%d %T"`
    5 echo "***** parms *****"
    6 echo $1
    7 echo $2 $3
    8 echo "***** uname *****"
    9 uname -sm
    10 echo "*****"
    11 echo "\$HOME="$HOME
    12 echo "\$HOSTNAME="$HOSTNAME
    13 echo "\$REMOTEHOST="$REMOTEHOST
    14 echo "\$JOB="$JOB
    15 echo "\$SGE_TASK_ID="$SGE_TASK_ID
    16 echo "\$SHELL="$SHELL
    17 echo "\$USER="$USER
    18 echo "\$JOB_ID="$JOB_ID
    19 echo "***** \$PATH *****"
    20 echo $PATH | sed "s/:/\n/g"
    21 echo "***** df *****"
    22 df
    23 echo "***** SGE *****"
    24 printenv | grep "^SGE"
    25 echo "***** other env *****"
    26 printenv | grep -v "^SGE"
    27 echo "*****"
    28 echo `date +"%Y/%m/%d %T"`

    Make the script executable:  chmod +x getinfo.bash 

    The script can be tested by entering its name at a command prompt: ./getinfo.bash

    Let's redirect the output of the script to a file and discuss its contents later:

    [158 efg genekc03 19Oct07 15:13:18 /home/efg/cluster/getinfo]
    ./getinfo.bash > stdout-genekc03.txt

    I use the .txt extension so the file can be easily viewed in Windows if desired.  Since I often use files interchangeably on Linux and Windows, I follow a convention that works with both platforms.

  4. Create submit.bash. Here's the script created on the developer's machine:

    [296 efg genekc03 22Oct07 17:31:58 /home/efg/cluster/getinfo]
    cat -n submit.bash
    1 #! /bin/bash
    2 # GetInfo on random cluster nodes.
    3 # efg, 18 Oct 2007
    4
    5 START="`date +'%Y/%m/%d %T'`"
    6
    7 # Output from Cluster Head Node
    8 getinfo.bash "Head Node" $START > stdout0.txt
    9
    10 # Output from two cluster nodes
    11 qsub -q all.q -cwd -N Info1 -o stdout1.txt -j yes getinfo.bash "ONE" $START
    12 qsub -q all.q -cwd -N Info2 -o stdout2.txt -j yes getinfo.bash "TWO" $START

    In line 5 the START variable records the time just before the jobs are submitted using qsub.  This will be used to determine how long the jobs are in the SGE queue.  [Normally this is not needed.]
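The time a job spent in the queue is the difference between the submission time stamp passed in as a parameter and the first time stamp getinfo.bash echoes when the job starts. The following is a minimal sketch, assuming GNU date (for the -d option). The function names to_epoch and queue_wait_seconds are hypothetical, and the second time stamp in the example call is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helpers: seconds between two "YYYY/MM/DD HH:MM:SS" stamps,
# the format produced by date +'%Y/%m/%d %T' in the scripts above.
# Assumes GNU date; -d is not available in every date implementation.
to_epoch() {
    # Rewrite slashes as dashes so the date is plain ISO 8601, then parse.
    # -u treats the stamp as UTC, which is fine for taking a difference.
    date -u -d "$(echo "$1" | tr '/' '-')" +%s
}

queue_wait_seconds() {
    local submitted started
    submitted=$(to_epoch "$1")
    started=$(to_epoch "$2")
    echo $(( started - submitted ))
}

# First stamp: submission time. Second stamp: illustrative job start time.
queue_wait_seconds "2007/10/23 11:48:16" "2007/10/23 11:48:46"   # prints 30
```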

    The only work performed on the cluster head node should be the execution of a script to submit jobs using qsub. Normally, "extra" work, such as that shown in line 8, should NOT be done on the cluster head node.  Since this is a trivial script that executes quickly, we break that rule and in line 8 record information about the cluster head node in the file stdout0.txt.

    Lines 11 and 12 are calls to qsub, the Sun Grid Engine command that submits the jobs to the cluster for scheduling and execution.  Here's a parameter-by-parameter description of what's in line 11:

Description of qsub parameters used in Line 11

Parameter                   Comments
-q all.q                    The -q parameter identifies which queue should be used to execute this job.  Often this parameter can be left out.  In our case, certain queues do not perform the same initialization as other queues, so this parameter is usually needed.
-cwd                        The current working directory (cwd) where the submit.bash script is executed will be the working directory for the jobs submitted to the cluster nodes. If not specified, a user's home directory is the default.
-N Info1                    The -N parameter identifies the name of the cluster job. The name can be useful for monitoring progress of a series of jobs, or possibly identifying jobs that should be cancelled. Generally some sort of mnemonic name that is specific to each job is better than giving all jobs the same name.
-o stdout1.txt              The -o parameter identifies the file that will contain the standard output.
-j yes                      The -j yes parameter merges standard error into standard output, so error messages are also written to stdout1.txt.  Alternatively, a separate -e parameter can be used to identify a standard error file.
getinfo.bash "ONE" $START   Jobs submitted to the cluster must be a script, such as the getinfo.bash script in this example.  In this case, the parameters passed to the getinfo.bash script on the cluster are a constant string and the job submission time stamp.

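Lines 11 and 12 of submit.bash differ only in the job name, the output file, and the label, so the same qsub call can be generated in a loop when more nodes need to be sampled. The following is a minimal sketch that only prints the commands (a dry run) and uses only the qsub options described above. The function name print_submit_commands and the "JOBn" labels are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical dry run: print one qsub command per job instead of running it.
# Uses only the qsub options described above (-q, -cwd, -N, -o, -j).
print_submit_commands() {
    local count="$1" start="$2" i
    i=1
    while [ "$i" -le "$count" ]; do
        # Each job gets its own name and its own standard output file.
        echo "qsub -q all.q -cwd -N Info$i -o stdout$i.txt -j yes" \
             "getinfo.bash \"JOB$i\" $start"
        i=$(( i + 1 ))
    done
}

START="$(date +'%Y/%m/%d %T')"
print_submit_commands 3 "$START"
```

Reviewing the printed commands first keeps mistakes off the cluster; once they look right, the echo can be replaced by the qsub call itself.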
  5. Log in to the cluster head node (cluster02 in this case) and change to the project directory. 

    [310 efg cluster02 23Oct07 11:47:36 /home/efg]
    cd cluster/getinfo

  6. Run the submit.bash script on the cluster head node.

    [312 efg cluster02 23Oct07 11:48:02 /home/efg/cluster/getinfo]
    ./submit.bash
    Your job 84244 ("Info1 ONE 2007/10/23 11:48:16") has been submitted
    Your job 84245 ("Info2 TWO 2007/10/23 11:48:16") has been submitted

    The qsub commands in submit.bash result in two jobs being submitted to the Sun Grid Engine for scheduling and execution. The SGE decides which nodes will be assigned for execution of the jobs.  Since the cluster is idle, they run immediately.  We check the job status using the qstat command for all jobs submitted by user efg, but in this case they have already completed.  

    qstat -u efg

At other times the cluster can be quite busy and the jobs would be stuck in the queue for some time before execution.  [We use a Ganglia monitoring system to check on how busy the cluster and its nodes are.]
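When the cluster is busy, it can be convenient to have a script wait until all of a user's jobs have left the queue before looking at the output files. The following is a minimal sketch, assuming qstat -u prints nothing once the user has no pending or running jobs. The function name wait_for_jobs and the QSTAT override variable are hypothetical; the override exists only so the loop can be tried without a cluster.

```shell
#!/bin/bash
# Hypothetical helper: block until the given user has no jobs left in SGE.
# QSTAT may be overridden with another command for dry runs.
QSTAT="${QSTAT:-qstat}"

wait_for_jobs() {
    local user="$1" interval="${2:-10}"
    # Assumption: qstat -u prints nothing when the user has no jobs.
    while [ -n "$("$QSTAT" -u "$user")" ]; do
        sleep "$interval"
    done
}

# Usage on the head node, polling every 10 seconds:
#   wait_for_jobs efg 10
```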

  7. Compare differences among the cluster nodes. We only submit jobs on the cluster head node, so we go back to our developer's box to look at the results. There are four "stdout" output files:

    [311 efg genekc03 23Oct07 11:49:28 /home/efg/cluster/getinfo]
    ls -AlF
    total 48
    -rwxrwxr-x 1 efg efg 572 Oct 22 17:27 getinfo.bash*
    -rw-rw-r-- 1 efg efg 10103 Oct 22 17:31 stdout-genekc03.txt
    -rw-rw-r-- 1 efg efg 10067 Oct 23 11:48 stdout0.txt
    -rw-r--r-- 1 efg efg 5974 Oct 23 11:48 stdout1.txt
    -rw-r--r-- 1 efg efg 5974 Oct 23 11:48 stdout2.txt
    -rwxrwxr-x 1 efg efg 369 Oct 22 17:27 submit.bash*

The files from the two cluster nodes are identical in size, but not in content.  The Linux diff command shows some of the differences:

    [312 efg genekc03 23Oct07 11:49:33 /home/efg/cluster/getinfo]
    diff stdout1.txt stdout2.txt > stdout1-vs-stdout2.txt

    Selected differences between Two Cluster Nodes

    Parameter                  Job 1                                        Job 2
    $HOSTNAME                  node0027                                     node0011
    $JOB_ID                    84244                                        84245
    $PATH (difference only)    /tmp/84244.1.all.q                           /tmp/84245.1.all.q
    $SGE_STDOUT_PATH           /home/efg/cluster/getinfo/stdout1.txt        /home/efg/cluster/getinfo/stdout2.txt
    $TMPDIR                    /tmp/84244.1.all.q                           /tmp/84245.1.all.q
    $JOB_SCRIPT                /tmp/sge/spool/node0027/job_scripts/84244    /tmp/sge/spool/node0011/job_scripts/84245
    $JOB_NAME                  Info1                                        Info2


We observe only negligible differences between the cluster nodes.  For now we'll assume other cluster nodes are consistent with these observations.
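A full diff reports every changed line, which is noisy when only one section matters. The following is a minimal sketch that compares just the $PATH sections of two getinfo.bash output files, relying on the "*****" section markers the script prints. The function names path_section and path_diff are hypothetical.

```shell
#!/bin/bash
# Print the lines between the "***** $PATH *****" marker and the next
# "*****" marker in a getinfo.bash output file, sorted for comparison.
path_section() {
    awk '/^\*\*\*\*\* \$PATH/ {on = 1; next}
         /^\*\*\*\*\*/        {on = 0}
         on' "$1" | sort
}

# Show directories found in only one of the two files: column 1 is only in
# the first file, column 2 (tab-indented) is only in the second file.
path_diff() {
    local a b
    a=$(mktemp) && b=$(mktemp) || return 1
    path_section "$1" > "$a"
    path_section "$2" > "$b"
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

# Usage: path_diff stdout1.txt stdout2.txt
```

The same call with stdout0.txt or stdout-genekc03.txt as one argument narrows the head-node and developer-box comparisons to the path alone.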

  8. Compare differences between cluster head node and regular cluster node.

Selected differences between Cluster Head Node and Regular Cluster Node

Parameter                 Cluster Node                Cluster Head Node
$HOSTNAME                 node0027                    cluster02
$SHELL                    /bin/bash                   /usr/local/bin/bash
$PATH                     29 directories in path      16 directories in path
file systems (from df)    5 common + 4 "private"      5 common + 3 shared with development machines
environment variables     many differences

$SGE variables, Cluster Node:
    SGE_TASK_STEPSIZE=undefined
    SGE_O_WORKDIR=/home/efg/cluster/getinfo
    SGE_O_HOME=/home/efg
    SGE_ARCH=lx24-x86
    SGE_CELL=default
    SGE_TASK_LAST=undefined
    SGE_TASK_ID=undefined
    SGE_BINARY_PATH=/opt/sge/bin/lx24-x86
    SGE_STDERR_PATH=/home/efg/cluster/getinfo/stdout1.txt
    SGE_STDOUT_PATH=/home/efg/cluster/getinfo/stdout1.txt
    SGE_ACCOUNT=sge
    SGE_ROOT=/opt/sge
    SGE_JOB_SPOOL_DIR=/tmp/sge/spool/node0027/active_jobs/84244.1
    SGE_CWD_PATH=/home/efg/cluster/getinfo
    SGE_O_LOGNAME=efg
    SGE_O_MAIL=/var/spool/mail/efg
    SGE_TASK_FIRST=undefined
    SGE_O_PATH= <deleted, too long>i686/bioinfo/wublast/current:.
    SGE_O_HOST=cluster02
    SGE_O_SHELL=/usr/local/bin/bash
    SGE_STDIN_PATH=/dev/null

$SGE variables, Cluster Head Node:
    SGE_CELL=default
    SGE_ROOT=/opt/sge

  9. Compare differences between regular cluster node and a Linux development box.

    There are even more differences between a cluster node and the development environment than between the cluster node and the cluster head node.  In particular, there are many more differences in the directories in the path, mounted file systems, and environment variables.  All of these differences must be factored into any development work so cluster job submissions will work.

    One way to verify if certain resources are available on a cluster node is to run the qrsh command on the cluster head node.  This will open a shell on a random node and you can then inspect node resources.


    [316 efg cluster02 23Oct07 13:47:54 /home/efg/cluster/getinfo]
    qrsh

    [3 efg node0013 23Oct07 14:19:39 /home/efg]

    Note the prompt, which shows that we're now working on cluster node0013 and are starting in the home directory.  AVOID doing anything computationally intensive on the cluster node this way.  Simply inspect various resources (e.g., df, $PATH, various environment variables), and enter the exit command to return to the cluster head node.
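The same inspection can be scripted so that it runs identically in a qrsh session, inside a submitted job, or on the developer's box. The following is a minimal sketch that reports which required executables are missing from $PATH. The function name check_commands and the example program names are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report each named program as found or MISSING,
# and return non-zero if any is missing from the current $PATH.
check_commands() {
    local missing=0 cmd
    for cmd in "$@"; do
        if command -v "$cmd" > /dev/null 2>&1; then
            echo "found    $cmd -> $(command -v "$cmd")"
        else
            echo "MISSING  $cmd"
            missing=1
        fi
    done
    return "$missing"
}

# Example: replace these names with the programs your job actually needs.
check_commands sh awk sed
```

Calling this at the top of a job script makes a missing program show up in the job's standard output file instead of as an unexplained failure later in the run.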

  10. SGE job queues.  Even with all this background checking, I still had some cluster job failures that I eventually traced to certain SGE job queues.  Apparently, SGE randomly picks among the existing queues at job submission time, and the initialization process can vary by queue.  For reasons I do not understand, I found I only had the "right" Linux environment ($PATH, environment variables, ...) when a job was submitted on a particular queue.  Lines 11-12 in the submit.bash script shown above both specify a particular queue to avoid this initialization problem:  qsub -q all.q ...

The SGE qconf command can be used to find out what queues are available.

[321 efg cluster02 23Oct07 14:26:09 /home/efg/cluster/getinfo]
qconf -sql
all.q
greylag.q
mec.q

You would need to get an explanation of the differences between queues for your installation from your local cluster administrator.
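To see for yourself how initialization differs by queue, getinfo.bash can be submitted once to each queue and the outputs compared as before. The following is a minimal sketch that reads queue names on standard input (for example from qconf -sql) and prints one qsub command per queue without running it. The function name print_queue_probes and the job and output file naming are assumptions, and your account may not be allowed to submit to every queue.

```shell
#!/bin/bash
# Hypothetical dry run: one getinfo.bash submission per queue name read
# from standard input. Commands are printed, not executed.
print_queue_probes() {
    local queue
    while read -r queue; do
        [ -n "$queue" ] || continue      # skip blank lines
        echo "qsub -q $queue -cwd -N Info-$queue -o stdout-$queue.txt -j yes getinfo.bash \"$queue\""
    done
}

# Queue names taken from the qconf -sql listing above.
printf 'all.q\ngreylag.q\nmec.q\n' | print_queue_probes
# On the head node the live list could be used instead:
#   qconf -sql | print_queue_probes
```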

Summary.  The Linux environment on the cluster nodes managed by a Sun Grid Engine can be different from other Linux resources.  Studying these differences is invaluable to successful SGE job submission.  Sometimes studying these differences is necessary to learn why cluster jobs are failing.


E a r l   F.   G l y n n
e f g @ s t o w e r s - i n s t i t u t e . o r g

Updated
 23 Oct 2007