ComputeMode: On-demand HPC
Cluster Manager
Version 2.0

`http://computemode.imag.fr/`

List of Figures
List of Tables
- 1. ComputeMode Manager Overview
I. Getting a ComputeMode server up and running
- 2. Installing a CM server...
  - 2.1 ... on a real machine
  - 2.2 ... in a virtual machine
    - 2.2.1 VMware Workstation or VMware Server
    - 2.2.2 VirtualBox
- 3. Starting the ComputeMode Virtual Appliance
II. Launching Computations
- 4. Starting Computing
III. Cluster Manager for users
- 5. As a Cluster User, How Do I...
  - 5.1 ... log in on the CM server?
  - 5.2 ... submit jobs to the batch manager?
- 6. As an owner, how do I...
  - 6.1 ... let my system be a part of the CM grid?
  - 6.2 ... get out of the CM grid?
IV. Cluster Manager for administrators
V. Appendices

Glossary

In the following definitions list, the letters '(CM)' indicate a meaning specific to the ComputeMode on-demand HPC cluster manager system.

'Always CM' schedule:: (CM) by default, this schedule always boots the nodes in computation mode
batch manager:: also known as Queue Manager, Job Manager, Task Manager
CM administrator:: (CM) ComputeMode administrator - related to the root Unix account on the CM server and to the admin account on the web interface
Note: the cmwebadmin administrator may differ from the cmserver administrator
CM boot mode:: (CM) a type of boot mode whose role is to help instantiate a CM image OS (for the client nodes)
CM:: (CM) ComputeMode: on-demand HPC cluster manager
CMGM:: (CM) see CM
client node:: (CM) the workstation which will be able to boot in Computation Mode (as opposed to the CM server)
computation mode:: (CM) the mode in which the client nodes should be to successfully handle the tasks submitted to the CM server
computing mode:: (CM) see computation mode
DHCP:: Dynamic Host Configuration Protocol
grid user:: (CM) a user for whom the cmserver administrator has created a Unix account and who is allowed to submit jobs through a batch manager
job manager:: see batch manager
NFS:: Sun's Network File System
NIS:: Network Information Service (a.k.a. yellow pages)
OS:: operating system
Owner:: (CM) see user
PXE:: Preboot Execution Environment
queue manager:: see batch manager
RAW PXE boot mode:: (CM) a kind of boot mode whose role is to chain other pxelinux static configuration files
Such a mode can be used, for instance, to make a machine boot a floppy disk image over the network (FreeDOS, Symantec's Ghost, ...)
remote wake-up:: see Wake-on-LAN
RWU:: see remote wake-up
standard mode:: (CM) the mode in which the client nodes would be if there were no CM server
Task Manager:: see Batch Manager
User Mode:: (CM) see Standard Mode
User:: (CM) the owner of a given node, that is, the person who is using a workstation on a regular basis
Wake-on-LAN:: a functionality which lets a user wake up a powered-off system through the network by sending a specifically crafted packet (used in the phrase: to send Wake-on-LAN packets)
WoL:: see Wake-on-LAN
boot image:: (CM) it is indeed a Linux boot image which is composed of a Linux kernel and a specific initrd tuned for CM - extra parameters may be used too.
boot mode:: (CM) indicates how to boot a machine
cmserver, server:: (CM) ComputeMode Manager server
cmwebadmin-data:: ComputeMode Manager web resources (images, style sheets, logos)
cmwebadmin:: (CM) ComputeMode Manager web administration interface
image, OS:: (CM) this is the file hierarchy located in /cm/<OS_NAME> (for instance /cm/debian/...) and which is aimed at being mounted on the clients - it may be hosted on a read-only server and requires specific tuning
label:: (CM) it is a logical tag attached to a node - its main role is to simplify the handling of large number of hosts by binding it to a symbolic name - labels may for instance be bound to the room number or the hardware brand
local boot mode:: (CM) a special reserved boot mode which basically tells the machine to boot on its local hard disk
node, host:: (CM) it is a client node, that is a machine which boots a CM OS - a node uses a schedule, and may hold an unspecified number of labels (from none to any number)
off boot mode:: (CM) this is a RAW PXE boot mode whose aim is to shut down the machine which has used it
processing node:: (CM) see Client Node
schedule:: (CM) it is a calendar for a week telling CM which OS to use based on the weekday and the time
special labels, system labels:: (CM) some label names have special meanings to CM - basically they are labels and can be applied and removed as any other label but they cannot be edited or renamed and they may have some special behavior
ssh:: secure shell - this may refer to either the protocol or its implementation (a popular implementation being OpenSSH)

1. ComputeMode Manager Overview

This section describes in more detail what CM can and cannot do.

1.1 Quick overview

A ComputeMode server can help you:

deploy easily and non intrusively a distributed network test-bed
extend an already existing computing system through the aggregation of unused computing resources (idle workstations for instance)
have a unified view of all your resources

1.2 General overview

ComputeMode relies on a master-slave architecture built by using a central server as a master. Though centralized, some services may be distributed to other servers.

The CM server maintains the availability of registered PCs on the local network. Each PC owner, in accordance with your company policy, may choose to let his/her PC be used at night, or during weekends or vacations: most workstations are used interactively 50 hours a week. A PC used by CM when it is idle is said to be a client (or processing or slave) node. The other mode is known as 'user mode'.

The two modes are:

User Mode, in which the machine works the way it usually does (most likely under Microsoft Windows). The owner of the machine should not even notice that his or her PC is managed as a ComputeMode processing node. In particular, the computational resources of the PC will not be used while in User Mode (unless virtualization support is enabled).
Computation Mode (from which the ComputeMode name comes), which is activated when the machine is in a time period where the PC is considered idle. If the ComputeMode Server detects a computational peak, the PC can be remotely switched to Computation Mode. The switch from User Mode is done through an automatic reboot of the machine followed by a remote boot. The remote boot is handled by the ComputeMode Server with the Preboot Execution Environment (PXE) protocol, which has been natively available in the BIOS of PCs since 1999. While in Computation Mode, the machine runs the Linux operating system and does not have access to any local drive.

The ComputeMode administrator can easily manage the computing PCs through a web-based interface available on the ComputeMode server.

A grid user (a standard Unix user) may submit computational jobs to the system by logging in on the server (through ssh for instance) and using a classical batch manager, or a computation scheduler (for large parametric computation campaigns for instance).

The batch manager will then take care of:

reserving adequate resources for the computations
scheduling the execution of the jobs

The Open Source (GPL) batch manager 'OAR' ships with CM but other products such as Platform's LSF, OpenPBS, TORQUE or Sun Grid Engine are known to work with ComputeMode.

Load balancing with an already installed job manager can also be accomplished: for instance, your dedicated cluster usually handles the workload, but for peak periods some extra CPU power would be useful.

When the ComputeMode server detects that a client node becomes unavailable, the latter returns to the previous User Mode so that the owner will not even notice that his/her PC has been used by ComputeMode.

Submitted jobs can take advantage of the NFS distributed file system made available from the ComputeMode Server. Each Cluster User has his/her own private home directory and can use it to store the data required by the computational jobs, as well as to retrieve the generated results.

If a PC owner comes back and needs his/her PC at once, he/she is still the boss: simply by pressing keys (Alt + Ctl + Del), the PC owner can abort any ongoing computational activity on his/her PC, and ComputeMode will restore User Mode in about one minute. The owner will not be bothered again, as his/her PC will be 'quarantined' and will no longer be used for further computations until it is 'un-quarantined'.

1.3 Is CM (on-demand HPC cluster manager) a grid-building software suite?

Yes, it is. The difference between 'cluster' and 'grid' is somewhat fuzzy, and the meanings of these words vary according to different CS schools. To help you understand what CM can do for you:

if you have some workstations (PC) available during nights, weekends or vacations, or some spare systems,
if you have CPU-hungry applications (MPI-like parallel codes, parametric codes, scientific simulations, large compilation projects) running under Unix
Note: scientific applications for Microsoft OS [DOS, Win 3.x, and current OS] can often be run under Linux
if you know what the role of a batch manager/scheduler (PBS, Platform's LSF, TORQUE, Sun's Grid Engine/N1, OAR) is,
CM will help you make all those entities meet easily.

If you happen to already own and use a dedicated cluster, ComputeMode will offer extra power without extra costs: think of the costs of new machines, a system administrator, AC, ...

If you want to initiate students to distributed infrastructures, as well as cluster tools and technologies, or if you simply want to experiment with a cluster prior to buying a dedicated system, CM will help you reach those goals easily.

If you are working in a scientific laboratory, you may want to use CM to offer your researchers or colleagues an extra infrastructure as a test bed for your computations.

1.4 ComputeMode is Open Source

ComputeMode is indeed a whole set of software available under Open Source licenses (most tools being available through a GNU GPL) - an acknowledgement list will be given in appendix sec:acknowledgement-oss.

It is currently based on Linux and ships a Debian GNU/Linux OS for the server. Shipping a complete distribution was chosen because a lot of configuration has to be done to let the system run smoothly. If you really want to use your own distribution, you may seek help from the CM mailing lists.

As such, it is available under Open Source or Open Source-friendly licenses:

cmwebadmin: the web interface administration is available under the GNU GPL 2 or later
all the other codes or patches are available for free (sources are available on ComputeMode web site) under the license of the project to which they apply
cmwebadmin-data: (basically, images, style sheets, logos) is available under a Creative Commons license (Attribution-ShareAlike 3.0), which basically lets you modify and redistribute it provided you do not steal Icatis's credits and you release your work under the same license.
cmwebadmin will work without it, but it will be much more pleasant cosmetically with it.

1.5 Current requirements

As CM is a distributed architecture it has some requirements on the server and the client nodes, as well as your network topology.

1.5.1 Server requirement

The server should be a fast machine dedicated to running ComputeMode. As a bare minimum, the server should have at least:

Pentium class processor - at least 1 GHz
256MB memory
100Mbps Ethernet
5GB hard disk (SCSI or IDE; a few RAID cards are also supported - SATA is currently not tested). Note: the HDD will be repartitioned and reformatted upon ComputeMode installation; hence the full contents of the hard disk will be erased.

You will have to alter these figures according to your needs as, of course, more client nodes require more power. To give a few figures: a Dual Xeon 2.4GHz with 2GB RAM and a gigabit network card handles around 120 client nodes. If your application does a lot of I/O, you will have to boost the network link and the storage device or use a NAS system.

1.5.2 Client requirement

Processing nodes will be booted remotely when required by the ComputeMode Server. This process involves running a disk-less Linux (through a network boot) on each of these PCs. Few requirements are made for these PCs, but of course the faster the machines, the higher the computing power will be:

Pentium-class processor
256MB of RAM: do note this is the bare minimum to get anything done, and this value will have to be increased to cope with your application's needs
100 Mbps Ethernet: Remote-Wake-Up (a.k.a. Wake-on-LAN) support is not mandatory but it would make a great addition
PXE compliance: this feature makes a PC able to network boot. PXE capability is supported on most PCs with an integrated LAN network card. It has been integrated in major vendors' corporate PCs since 1999 (and is required by PC98 and PC99 recommendations from Microsoft and by the Wired for Management Initiative from Intel).
PXE should be enabled on each client node through their BIOS setup. Network boot is usually enabled by default by PC manufacturers, but you should check that each PC is properly PXE-enabled. The PXE/Network boot must be configured as the first boot device in order to fully take advantage of ComputeMode boot mechanism. Some systems also activate PXE boot when the system receives a Wake-on-LAN message.
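The Wake-on-LAN packet mentioned in the requirements above has a simple, well-known layout: six 0xFF bytes followed by the target MAC address repeated sixteen times. The Python sketch below builds and broadcasts such a packet. It only illustrates the generic packet format and is not taken from the ComputeMode sources; the function names, the broadcast address and UDP port 9 are assumptions chosen for the example.

```python
import socket


def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet for a MAC such as '00:11:22:33:44:55'."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # Six 0xFF bytes, then the 6-byte MAC address repeated 16 times.
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(magic_packet(mac), (broadcast, port))


if __name__ == "__main__":
    packet = magic_packet("00:11:22:33:44:55")  # made-up MAC address
    print(len(packet), "bytes")
```

Because the packet is broadcast, the sender and the sleeping machine have to sit in the same broadcast range, which matches the network constraints described in the next section.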

1.5.3 Network architecture requirements

The current version of ComputeMode relies heavily on the use of PXE (also known as network boot protocol) to set up disk-less distributions. Thus, if your network already uses PXE for other purposes, or if your workstations are too old and do not support PXE boot, this may be a show-stopper for the current version.

CM also uses DHCP but it is configured so as to be transparent and not interact with your current set up.

Besides, the following constraints should be enforced:

All ComputeMode nodes should be in the same broadcast range. For instance, having all the machines connected to the same switch, or sharing the same subnet or VLAN, will be sufficient. This is because ComputeMode relies on the DHCP/PXE protocol, which requires the server to be able to receive Ethernet frames broadcast on the LAN by any of the Processing Nodes.
PXE should not be in use on your local network. Corporations sometimes use PXE to deploy new Operating System images to network PCs. If PXE is already in use on your network, the PXE from the ComputeMode Server will conflict with the corporate PXE. You should check with your network administrator whether PXE is in use on your network or not. If it is, advanced configurations adapted to your local needs may still be provided, using for instance an advanced PXE bootstrap chaining mechanism.

Tip: To check whether PXE is in use on your premises, boot a PC and let it proceed into PXE network boot (this should happen at the end of its BIOS bootstrap) - of course, the ComputeMode server has to be disconnected during that test. If the PC exits without finding any PXE server, then chances are that no PXE server is connected to your local network.

1.5.4 Access nodes requirements

Some hosts of your network will be used to access the ComputeMode system. For instance, you will need to access the ComputeMode Web interface located on the ComputeMode server. You will also need to access the system to submit computational jobs and retrieve their results. Such operations may be done using any PC from your network (including processing nodes).

The ComputeMode server embeds a minimal X display with a web browser, but as it is a server, it will most likely be more comfortable to access it from your office instead of the server room!

The requirements are rather pieces of advice:

a standard-compliant web browser supporting CSS2 and JavaScript: if JavaScript is not supported, every functionality is still available but in a less user-friendly manner.
Supported browsers are: Gecko-based ones (Firefox 1+), Opera (8+). This web application also works with Microsoft's IE5, IE5.5 and IE6 but due to the poor support of W3C standards by those browsers, minor display glitches may exist. IE7 (and following versions) are expected to offer better results since their support of W3C standards is said to be improved.
a 1024x768 display (at least) - using cmwebadmin on a smaller screen will not be very user-friendly
an ssh client: several are available at no charge and under Open Source licenses (see OpenSSH or PuTTY). Commercial ssh clients are also available. If you do not intend to either submit jobs or administer the CM server, you do not need this client software.

1.6 ComputeMode appliances

1.6.1 ComputeMode VMware, VirtualBox or KVM appliances

For testing purposes (as there are some performance limitations), an appliance may be downloaded:

32 and 64 bit-mode appliances:

http://computemode.imag.fr/files/appliances/

It is a fully functional ComputeMode server which lets you test the software prior to dedicating a real server to it.

I. Getting a ComputeMode server up and running

2. Installing a CM server...

2.1 ... on a real machine

First, make sure the network and system requirements are fulfilled.

In the following section, we will assume your CM server will have:

an IP address of 1.2.3.42
a netmask of 255.255.255.0
a gateway of 1.2.3.254
a DNS server of 1.2.3.1

This is the IP you will have to use to access the CM server from within your site.
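Before typing such parameters into the installer, it can be worth checking that they are consistent with each other, for example that the gateway and the DNS server fall inside the subnet defined by the server address and netmask. The short Python sketch below does this with the standard `ipaddress` module, using the sample values above. The helper name `same_subnet` is hypothetical and this check is not part of ComputeMode.

```python
import ipaddress


def same_subnet(server_ip: str, netmask: str, other_ip: str) -> bool:
    """Return True if other_ip belongs to the subnet of server_ip/netmask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{server_ip}/{netmask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


if __name__ == "__main__":
    server, mask = "1.2.3.42", "255.255.255.0"
    for role, address in (("gateway", "1.2.3.254"), ("DNS server", "1.2.3.1")):
        status = "inside" if same_subnet(server, mask, address) else "outside"
        print(f"{role} {address} is {status} the server subnet")
```

A DNS server outside the subnet is not necessarily an error (it can be reached through the gateway), whereas a gateway outside the subnet usually points to a typo.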

You can then proceed to the installation of ComputeMode using the online documentation:

http://computemode.imag.fr/mediawiki/index.php/ComputeMode_on_top_of_Debian_Squeeze

The installation screens should be rather straightforward. When requested by the installer, enter the ComputeMode Server network parameters. Once the installation is finished, go to your own machine (which will be used as an Access Node), open your Web browser and navigate to the ComputeMode Server address, prefixed with http://. For example, open the http://1.2.3.42/ Web page. This is the ComputeMode Server administration page.

2.2 ... in a virtual machine

In most virtualization software, I/O operations are privileged actions which require the intervention of the virtualization software, hence performance is often degraded compared to a real dedicated server. For more information about installing virtual appliances, see the following online documentation:
http://computemode.imag.fr/mediawiki/index.php/ComputeMode_server_appliances

2.2.1 VMware Workstation or VMware Server

If you already use VMware's virtualization products you may simply install the CM software inside a virtual machine. You have however to be sure that it will get access to the same network as your clients (whether these are real or virtual):

if the client nodes are real machines, it will be easier to use the bridge mode

Note:: you may have to alter the /dev/vmnet0 permissions to allow promiscuous mode (you will know when you need it as VMware will complain - for instance if you want to run a network sniffer to troubleshoot possible networking issues)

if you want your clients to be hosted by the same VMware system as your server, then use a team, NAT or host-only networking to avoid connecting to your physical network. Please consult the manual of the VMware product you want to use.

2.2.2 VirtualBox

VirtualBox is a general-purpose full virtualizer for x86 hardware, targeted at server, desktop and embedded use. For a thorough introduction to virtualization and VirtualBox, please refer to the online version of the VirtualBox User Manual's first chapter:

http://www.virtualbox.org/manual/ch01.html

3. Starting the ComputeMode Virtual Appliance

This chapter describes how to use the VMware appliance of the ComputeMode software.

3.1 VMware Player

The appliance version of CM is built to work out-of-the-box with VMware Player. You can get this tool at no charge from the VMware Player download page:

http://www.vmware.com/download/player/

3.2 Recommended usage

Since VMware adds some overhead on I/O operations (disk, network), and since CM is mainly doing I/O, it should mainly be used for evaluation purposes:

software evaluation: you do not need a specific setup
quick all-in-one cluster setup: just start it, you can get a small cluster running within minutes

3.3 Appliance Walk-through

3.3.1 Requirements

You need to know: your netmask, an available IP address, your DNS servers, and whether PXE is already in use at your site.

3.3.2 Starting the engine

The appliance has been designed to be simple to start and provides a common web page, as shown in the 'Welcome screen' figure below.

**Figure:** Welcome screen

The main part of CM is in the 'Admin' box, especially the 'Webadmin' link. You may now proceed to chapter 7 (ComputeMode web administration interface) if you want detailed explanations about administration, or simply proceed to the next chapter to resume your ComputeMode walk-through.

II. Launching Computations

4. Starting Computing

Your CM server is now running. Everything from now on applies whether your CM server is real or virtual.

4.1 Logging into cmwebadmin

Once you have logged in to your Unix account (or the guest account), you will be shown a login screen (see the 'cmwebadmin login screen' figure below). You may also reach cmwebadmin through a web browser located on your desktop machine.

**Figure:** cmwebadmin login screen

The default account and password are given in section cha:default-accounts-passwords. It is a safe practice to change this password now.

Once logged in, you are taken to the nodes management page (see the 'Nodes management page' figure below).

You may choose which labels to show by selecting them in the drop-down menu at the top. Upon selection, the page will be automatically reloaded (if you have JavaScript) to only show the nodes holding the selected label. Clicking on a label name in the right part of a node entry will also select this label to be shown.
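To make the relation between nodes, schedules and labels more concrete, here is a small Python model of it, together with the kind of label filtering described above. It is only a mental model built from the glossary definitions (a node uses one schedule and holds any number of labels). The class, function and sample names are hypothetical and do not reflect the cmwebadmin implementation.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    """A client node: one schedule, and any number of labels (possibly none)."""
    name: str
    mac: str
    schedule: str
    labels: Set[str] = field(default_factory=set)


def nodes_with_label(nodes: List[Node], label: str) -> List[Node]:
    """Return only the nodes holding the given label, keeping their order."""
    return [node for node in nodes if label in node.labels]


# Hypothetical inventory: names, MAC addresses and labels are made up.
inventory = [
    Node("pc-room12-a", "00:11:22:33:44:55", "Always CM", {"room12"}),
    Node("pc-room12-b", "00:11:22:33:44:56", "Always CM", {"room12", "dell"}),
    Node("U00-11-22-33-44-57", "00:11:22:33:44:57", "Always CM", {"Unconfigured"}),
]

if __name__ == "__main__":
    print([node.name for node in nodes_with_label(inventory, "room12")])
```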

The nodes are listed and some of their properties are displayed in columns:

Schedule column: it is the name of the schedule which applies to the node
WoL column: it stands for Wake-on-LAN and it indicates whether a Wake-on-LAN (a.k.a. Remote Wake Up) message should be sent to a node which has to boot a non-Local boot mode
Notes column: it is left for the user to add comments
Schedule boot mode column: this column indicates the boot mode the node should be in if it booted now ('now' being the time of the web page generation) - this column is not editable (read-only) and is automatically set by CM
Labels column: indicates the labels tied to a given node
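The 'Schedule boot mode' column can be pictured as a lookup in a weekly calendar: given the weekday and the time, the schedule tells which boot mode applies. The Python sketch below illustrates the idea under simple assumptions: a schedule is a list of hourly slots, the boot mode names "cm" and "local" are placeholders, and the sample schedule (nights and weekends) is made up. It is not the data model used by CM.

```python
from datetime import datetime

# Each slot is (weekday, start_hour, end_hour, boot_mode),
# with weekday 0 = Monday as returned by datetime.weekday().
NIGHTS_AND_WEEKENDS = (
    [(day, 0, 7, "cm") for day in range(5)]        # weekday early mornings
    + [(day, 20, 24, "cm") for day in range(5)]    # weekday evenings
    + [(day, 0, 24, "cm") for day in (5, 6)]       # whole weekend
)


def boot_mode_at(schedule, when: datetime, default: str = "local") -> str:
    """Return the boot mode the schedule selects at the given date and time."""
    for weekday, start_hour, end_hour, mode in schedule:
        if when.weekday() == weekday and start_hour <= when.hour < end_hour:
            return mode
    # Outside every slot, the node keeps booting from its local disk.
    return default


if __name__ == "__main__":
    print(boot_mode_at(NIGHTS_AND_WEEKENDS, datetime.now()))
```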

**Figure:** Nodes management page

Clicking on the 'Show/Hide advanced options' button will toggle the display of several buttons.

Note:: the buttons shown by your CM version may differ from the ones shown in the figure below, as the display is highly configurable and dependent on which modules are available and enabled.

**Figure:** Nodes management page - with advanced options shown

4.2 Getting some computation nodes registered

Client (or computation) nodes may be added automatically or by hand. The following sections explain the two options.

4.2.1 Automatic discovery

First, you have to check that network boot is enabled in the BIOS settings of the machines you want to use. A sample BIOS screen is shown in the figure below.

**Figure:** Enabling network boot in the BIOS

You can now check whether the node is listed in the nodes management screen. It should not be listed if it has never been booted.

**Figure:** Nodes management before client node first boot

Once you start the client machine, you should see a screen similar to the one in the 'Client node first boot: PXE request' figure below. During this step, your machine tries to find a PXE boot server within your LAN. This request is caught by cmserver, which then registers this node as an unconfigured node and gives it an automatic name of the form Uxx-xx-xx-xx-xx-xx (where xx-xx-xx-xx-xx-xx is your client node MAC address). If you reload the nodes management page (as shown in the 'Nodes management after client node first boot' figure below), you will see that a new node was added with the label 'Unconfigured'.

If you feel like it, you can verify that the MAC address shown on the PXE request screen is the same as the one listed in the nodes management page.

Note:: if several machines happen to boot at the same time, it may be hard to tell which one is which unless you already know the association between MAC addresses and machine names.
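When many machines are discovered at once, it can help to compute beforehand the automatic name each one should receive, so that you can match your own MAC address inventory against the nodes list. The Python sketch below derives a name of the form Uxx-xx-xx-xx-xx-xx from a MAC address. The function name is hypothetical, and the use of lower-case hexadecimal digits is an assumption: check the case against what your nodes management page actually displays.

```python
import re


def unconfigured_name(mac: str) -> str:
    """Derive a 'Uxx-xx-xx-xx-xx-xx' style name from a MAC address."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)  # drop ':' or '-' separators
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    # Lower case is an assumption, not something the manual specifies.
    return "U" + "-".join(pairs).lower()


if __name__ == "__main__":
    # Made-up inventory of MAC addresses and the machines they belong to.
    inventory = {"00:1A:2B:3C:4D:5E": "desk 3, room 12"}
    for mac, location in inventory.items():
        print(unconfigured_name(mac), "->", location)
```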

**Figure:** Client node first boot: PXE request

**Figure:** Nodes management after client node first boot: a new node is listed

4.2.2 Manual addition

If you know your machine's MAC address, you may simply add the corresponding entry into cmwebadmin: just click the 'Add node' button. You will have to fill in the form - see the next section, or chapter 7 (ComputeMode web administration interface), for further information about every field.

4.3 Editing the newly-added node properties

If you click on the 'Uxx-xx-xx-xx-xx-xx' name in the nodes management page, you will be taken to the node properties edition screen. You should choose the 'Always CM' schedule so as to observe something when you reboot the client node. When you are satisfied with the form, click the 'Update' button (see the 'Node properties edition' figure below).

**Figure:** Node properties edition

If everything went smoothly, you will be taken back to the nodes management screen with a message shown at the top of the page, as shown in the figure below.

**Figure:** Node management after successful node properties edition

Now, if you reboot the machine for which you have just modified the configuration, screens similar to the ones in the table below will be shown.

Table: CM client boot screens

Screen shot	Comments
	Boot underway
	Boot over

4.4 Getting some work done: cluster computing demonstration

4.4.1 Principle of the demonstration

A simple cluster demonstration is provided so that you may test your setup. This application is made to be simple and visual: basically a 3D scene is split in stripes, each stripe is a task submitted to the job manager. The job manager dispatches the tasks on every available computation node.

Meanwhile, the ComputeMode server will merge and update the display of the computed stripes till every task has been executed.
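The splitting step can be illustrated with a few lines of Python: the image height is cut into contiguous stripes of rows, and each stripe becomes one independent task. This is a sketch of the principle only. The function name, the image height and the even-split strategy are assumptions; the demonstration shipped with CM may split the scene differently.

```python
def split_into_stripes(height: int, n_stripes: int):
    """Split rows 0..height-1 into n_stripes contiguous (start, end) ranges.

    'end' is exclusive. When the height is not a multiple of n_stripes,
    the first stripes get one extra row each.
    """
    if not 1 <= n_stripes <= height:
        raise ValueError("need between 1 and 'height' stripes")
    base, extra = divmod(height, n_stripes)
    stripes = []
    start = 0
    for index in range(n_stripes):
        rows = base + (1 if index < extra else 0)
        stripes.append((start, start + rows))
        start += rows
    return stripes


if __name__ == "__main__":
    # Hypothetical 480-row image split into 4 tasks for the job manager.
    for first_row, end_row in split_into_stripes(480, 4):
        print(f"task: render rows {first_row} to {end_row - 1}")
```

Since the stripes do not depend on each other, the job manager is free to run them on any available node and in any order, and the server can merge results as they arrive.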

4.4.2 How to start it?

You can check how many nodes you have by selecting the OAR menu option and choosing Monika. A page similar to the one below will be shown.

**Figure:** OAR page listing the ComputeMode nodes

Then select the 'POV demonstration' item in 'CM demonstration', or follow the link: http://172.28.255.253/public_html/

**Figure:** POV scenes listing

A few sample scenes are given - choose one and click its name - the rendering process will now start. If you launch Monika again (in another window) you will see that several tasks have been queued (see the figure below).

**Figure:** OAR page listing the ComputeMode nodes executing jobs

After a few seconds, the first results will begin to appear:

**Figure:** POV scene rendering

The nodes scheduling may be seen through the Gantt drawings (as an option in the OAR sub-menu):

**Figure:** DrawOARGantt showing submitted jobs

Eventually, the full picture is displayed.

**Figure:** Complete 3D scene rendered

III. Cluster Manager for users

5. As a Cluster User, How Do I...

This section tries to list frequently asked questions. If you cannot find an answer, just ask and it will be added in later revisions of this manual.

5.1 ... log in on the CM server?

Your CM administrator has to create a Unix account on the server. Once this is done, you will be allowed to log in using an ssh client.

5.2 ... submit jobs to the batch manager?

The batch scheduler installed by default is OAR, which provides the de facto standard command line tools: *sub/*del/*stat.

Please consider reading the complete OAR manual at http://oar.imag.fr/ if you have advanced needs. For the sake of completeness and as a quick start, the most common usage patterns are listed below.

5.2.1 oarnodes (monitoring)

oarnodes provides information about the nodes registered in ComputeMode: basically it will let you know which nodes may join the computing cluster. It will also list their properties if they have already booted and registered in OAR previously (CPU, RAM, etc.)

Example:

oarnodes

oarnodes -l

5.2.2 oarstat (monitoring)

oarstat displays the current jobs status and its behavior may be altered with the following options:

-f: prints further information about each job
-j <job_id>: prints details about a specific <job_id>

More concise high-level interfaces are also available through Monika and DrawOARGantt. Both tools are available as web pages from the cmwebadmin portal (see the Welcome screen in section 3.3.2).

To summarize, Monika displays a snapshot of the current OAR status with regard to node occupation and lists jobs and their states.

As for DrawOARGantt, it displays a Gantt diagram of the past nodes reservations. When jobs have a walltime set, it will plot them the way they would execute if they lasted up to their walltimes.

5.2.3 oarsub (submit)

oarsub lets a task be submitted to the job manager.

If you want to have an interactive shell, type:

oarsub -I

You may specify the number of nodes you want by adding the -l option followed by 'nodes=number':

oarsub -I -l nodes=4

The list of nodes allocated to your job will be available through the $OARNODES environment variable.

You may also submit script files:

oarsub -l nodes=1 ~/myscript.sh

or in-line scripts:

oarsub -l nodes=1 '~/mybin param1 param2'

Note:: when submitting script files, you have to make sure the script is executable (chmod +x ~/myscript.sh)
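If you generate many submissions from a program (a parametric campaign for instance), it is convenient to build the oarsub command lines rather than type them. The Python sketch below assembles the argument list for the usages shown above, using only the -I and -l nodes=N options described in this section. The helper name `oarsub_argv` is hypothetical, and the sketch only prints the command line instead of running it.

```python
import shlex
from typing import List, Optional


def oarsub_argv(command: Optional[str] = None,
                nodes: Optional[int] = None,
                interactive: bool = False) -> List[str]:
    """Build an oarsub argument list from the options covered in this section."""
    if interactive == (command is not None):
        raise ValueError("give either a command or interactive=True, not both")
    argv = ["oarsub"]
    if interactive:
        argv.append("-I")
    if nodes is not None:
        argv += ["-l", f"nodes={nodes}"]
    if command is not None:
        # The whole command is passed to oarsub as a single argument.
        argv.append(command)
    return argv


if __name__ == "__main__":
    print(shlex.join(oarsub_argv(interactive=True, nodes=4)))
    print(shlex.join(oarsub_argv("~/mybin param1 param2", nodes=1)))
```

The resulting list can be handed to `subprocess.run` on the CM server once you are satisfied with the printed command lines.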

For further options, please read the OAR manual.

5.2.4 oardel (delete a job)

To remove a job from the OAR queue, you have to get its job ID (use oarstat for this purpose) and then use the following command:

oardel <job_ID>

If the job is currently running on a client node, it will be killed.

6. As an owner, how do I...

This chapter gives quick recipes to solve common issues for a workstation owner.

6.1 ... let my system be a part of the CM grid?

Everything depends on your administrator and your site policy:

you may have to power down your computer at night or when you leave
you may have a special screen saver installed by your IT staff
you may have a ComputeMode agent installed

By default, if you boot during a computation period and your machine was registered in the CM server by the administrator, you will see the CM boot process happen.

6.2 ... get out of the CM grid?

To achieve this, you simply have to hit 'Alt + Ctl + Del' when the system is in computation mode and wait a few seconds so that the server acknowledges it. Once done, you will never see CM again unless you tell your system administrator that it is OK to use your machine as a computation node again.

IV. Cluster Manager for administrators

7. ComputeMode web administration interface

ComputeMode Web Administration interface (cmwebadmin for short) is the central place where most of CM behavior can be tuned.

cmwebadmin is built as follows:

a menu on the left of the screen which will take you to management pages (for nodes, users, etc.)
on any management page, you have a list of items and buttons at the bottom of the page to affect the selected objects

7.1 The 'Nodes' menu item

The management page lists all the nodes registered in cmwebadmin.

**Figure:** Nodes management page - with advanced options shown

7.1.1 Editing or adding a node

To reach the node edition page, simply click its name or its MAC address in the node management page.

To add a new node, click the 'Add' button in the management page.

**Figure:** Adding a node

The fields in the form have the following uses:

MAC address: the machine Ethernet address
name: the hostname the machine will be known by
owner: currently, simply used for presentation (when in doubt, choose 'admin')
note: basically available for the user - a few special expressions may be used to alter a node behavior (for instance the ip=x.y.z.t option, see below)
Wake-on-LAN: indicates whether the node should be woken up when it is a computation time slot
node schedule menu: indicates which schedule to use
handled_by_job_manager: if this flag is set, the previous behavior is maintained, that is, every modification made in cmwebadmin is pushed to the job manager (OAR by default) - if it is not set, the modifications are not sent to OAR
IP: this field contains either a valid numerical IP (v4) or nothing - it is mainly used to loosely authenticate users who submit exceptions related to their machines - it interacts with the comment field (see below)

A few read-only fields are shown when you edit a node (not when you add/create it), namely OAR properties and the list of labels having been applied.

When you are satisfied with the node characteristics, click the 'Add' (or 'Update') button.

7.1.1.1 Special field: Comment & IP field

The IP is mainly used by the users page, which lets users tell when they do not use their computer. Some options can be given through the comment field to alter the behavior of CM (mainly in the user pages). CM proceeds as follows:

1. If the IP field holds a valid IP, use it => END; otherwise proceed to step 2.
2. (IP field is empty) ComputeMode parses the comment field looking for a string matching exactly (case and spaces matter!) ip=x.y.z.t, with x.y.z.t a valid IP address. If this string is found, use it => END; otherwise proceed to step 3.
3. (nothing in IP and nothing in comment) Try to find the MAC address associated with the connecting IP by probing the network. If a MAC is found, use it => END; otherwise proceed to step 4.
4. Has a MAC address been found?
Yes: user accepted
No: user rejected
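
The first two steps of the lookup above can be sketched as follows. This is only an illustration, not the cmwebadmin code: the function names are hypothetical, and the network probe of the third step is deliberately left out.

```python
import ipaddress
import re

# Case-sensitive "ip=x.y.z.t" token with no spaces around "=",
# as required by the comment-field convention described above.
IP_TOKEN = re.compile(r"\bip=(\d{1,3}(?:\.\d{1,3}){3})")


def valid_ipv4(text):
    """Return text if it is a valid IPv4 address, else None."""
    try:
        return str(ipaddress.IPv4Address(text))
    except ValueError:
        return None


def configured_ip(ip_field, comment):
    """Look in the IP field first, then for ip=x.y.z.t in the comment.

    Returns None when neither yields an address; CM would then fall
    back to probing the network (not modelled in this sketch).
    """
    from_field = valid_ipv4(ip_field)
    if from_field is not None:
        return from_field
    match = IP_TOKEN.search(comment)
    if match:
        return valid_ipv4(match.group(1))
    return None
```

For instance, `configured_ip("", "desk 12 ip=192.168.1.5")` yields the address from the comment, while `IP=192.168.1.5` or `ip = 192.168.1.5` are ignored because case and spaces matter.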

7.1.2 Change schedule

Select the nodes for which you want to alter the schedule in the management page, and click the 'Change schedule' button.

A summary page listing the chosen nodes and asking you for the new schedule is then shown, as in the figure below.

**Figure:** Changing a node schedule

Click the 'Change schedule' button to confirm.

7.1.3 Change WoL

The principle is similar to that of the 'Change schedule' button (see section 7.1.2): select your nodes and use the 'Change WoL' button instead.

7.1.4 Apply Label

The principle is similar to that of the 'Change schedule' button (see section 7.1.2): select your nodes and use the 'Apply Label' button instead.

Note:: the label to apply must be created beforehand by using the 'Nodes / Labels' menu (see section 7.2.4)

7.1.5 Remove Label

The principle is similar to that of the 'Change schedule' button (see section 7.1.2): select your nodes and use the 'Remove Label' button instead.

Note:: the nodes will not be deleted - the same thing applies for labels not tied to any node

7.1.6 Delete Nodes (advanced options)

Select the nodes to delete, click the 'Delete' button. A summary confirmation screen will be shown. Click 'Delete' to confirm.

7.1.7 Export Nodes (advanced options)

Select the nodes you want to export, then click the 'Export' button. A screen like the one shown below then appears.

**Figure:** Exporting nodes

Click the download button to download the CSV file corresponding to the nodes you selected. The CSV file (a plain ASCII text file) may then be imported into most spreadsheets.

7.1.8 Import Nodes (advanced options)

Click the 'Import' button - no node has to be selected. You will be taken to a screen similar to the one shown in the 'Importing nodes' figure below.

To fill in the form, choose a local CSV file (with a specific format) to upload to the server and, optionally, tick the check-boxes:

you may choose to overwrite existing entries (enabled by checking the top box)
if you enable overwrite, you may also opt to completely erase *all the nodes* through the 'Erase all nodes' option (enabled by checking both lower check-boxes)

**Figure:** Importing nodes

As with the exports described in the previous section, the first line of the CSV file must follow a specific format so that the rest of the data can be parsed. Export some nodes first to obtain a sample of a correct file.

7.1.9 Statistics (advanced options)

Select the nodes for which you want statistics, then click the 'Statistics' button. You will then be taken to a page similar to the one shown in the figure below.

**Figure:** Node statistics

You may change the date and time period for which you want statistics to be drawn.

Note:: if you select many nodes and a long time period over a slow network link, the page rendering may be slow or may even time out. You may have to alter the Apache server configuration to work around this issue.

7.1.10 Add Exception (advanced options)

Select the nodes to which you want to add exceptions, then click the 'Add exception' button. These are positive exceptions: whenever a node is in exception, it is put in an exception schedule. This schedule is configurable system-wide in config.php.

**Figure:** Adding an exception to a set of nodes

Standard screen
Click on an edit box to have a calendar shown.

7.2 The 'Nodes / Labels' sub-menu item

**Figure:** Nodes / Labels management page

The column named 'Nodes with this label' indicates the number of nodes which currently have this label applied.

There are two special labels which will be described below.

7.2.1 Special label 'Unconfigured'

This label is automatically added to newly discovered nodes.

When you edit the node properties and click 'Update' this label will be automatically removed.

7.2.2 Special label 'Quarantine'

This label is automatically added to the nodes which were in computation mode and whose owner hit Ctrl+Alt+Del to get their machine back.

To disable automatic 'Quarantine' (also called 'node inhibition' within CM), edit the CM configuration file config.php (which should be located in /var/www/cm/).

In this file, find the line reading:

$GLOBALS['ALLOW_INHIBIT'] = true;

Either replace it with:

$GLOBALS['ALLOW_INHIBIT'] = false;

or add this line to locale_config.php.

7.2.3 Label edition

You may rename a label by clicking its name in the labels management page (see the 'Nodes / Labels management page' figure). This is just a label rename and not a label creation: if nodes A, B and C have the label 'old_foo' and 'old_foo' is renamed to 'new_foo', then A, B and C will have the 'new_foo' label.

7.2.4 Add label

Simply click the 'Add' button in the management page (see the 'Nodes / Labels management page' figure). A form will be shown asking you to enter a label name (basically only alphanumerical characters) - click 'Add' to create the label.

7.2.5 Delete labels

Select the labels to erase in the management page (see the 'Nodes / Labels management page' figure), then click the 'Delete' button. A confirmation screen will be shown.

Note:: the nodes will not be erased, only the label deleted will be removed

7.2.6 Change schedule

This button lets a schedule be changed for the set of nodes having any of the selected labels.

Once you click the button, you will be taken to a page like the one in the screenshot below: select the new schedule, then click the 'Change schedule' button.

**Figure:** Nodes / Labels management page / Change schedule

7.3 The 'Nodes / Exceptions' sub-menu item

Exceptions can only be listed and deleted in this page (see the figure below). They are 'positive' exceptions in the sense that they indicate periods of availability of machines (mainly vacations).

The rationale is that employees who want to let their computers join the grid when they are off site can declare it from their local browser.

**Figure:** Nodes / Exceptions management page

7.3.1 Delete exceptions

Select the exceptions you want to remove then click this button.

7.4 The 'Accounts' menu item

This is where you go if you want to change the administrator information. Please note that the accounts shown here are only related to cmwebadmin and are not tied at all with the Unix accounts.

**Figure:** Accounts management page

7.4.1 Account edition

**Figure:** Accounts edition

The main fields in this form are the login and password fields.

Note:: if the password field is left empty, the previous password is kept.

7.4.2 Add/Delete account

Currently, there is no use adding extra CM accounts - the only limitation is there must be at least one admin account.

7.5 The 'Schedules' menu item

This page lists the existing schedules (see the figure below) as well as the number of nodes using each of them. Please note that some schedules are reserved (though you can edit them to fit your needs): 'default' is used as a template for new schedules, while 'Always' and 'Never' should correspond respectively to always-computing and always-local.

Their serial numbers may be referred to in the config.php file, so be extra careful when you delete schedules you have not created - there is currently no failsafe.

**Figure:** Schedules management page

7.5.1 Schedule edition

Editing a schedule consists in setting which boot mode to apply, based on the day of the week and the time of the day.

Tip:: You can select some parts by clicking then dragging the mouse to select a rectangular portion of the week - the rectangular selection is not shown - this requires Javascript.

**Figure:** Schedule edition

By default, the schedule granularity is 120 minutes. It can be set to any other value that evenly divides one day, that is 24 * 60 = 1440 minutes. This granularity is stored in config.php, in the line reading:

$GLOBALS['DEFAULT_TIMESLOTGRANULARITY'] = 120;

WARNING: If some timeslots are not compatible with the old and new granularities, the display may be somewhat messed up (for instance if you have a timeslot from 9.30 to 10.00 with a 30min granularity and you go to a 60min granularity). You have to fix your bootmodes selection then update and it will be fine.
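
The constraint on the granularity, and the incompatibility described in the warning, can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical and times are expressed in minutes since midnight.

```python
MINUTES_PER_DAY = 24 * 60  # 1440


def valid_granularity(minutes):
    """A granularity is usable only if it evenly divides one day."""
    return minutes > 0 and MINUTES_PER_DAY % minutes == 0


def slot_is_aligned(start, end, granularity):
    """True if a timeslot starts and ends on a granularity boundary."""
    return start % granularity == 0 and end % granularity == 0


# The example from the warning: a slot from 9.30 to 10.00.
half_past_nine, ten = 9 * 60 + 30, 10 * 60
fits_30 = slot_is_aligned(half_past_nine, ten, 30)   # fits a 30 min grid
fits_60 = slot_is_aligned(half_past_nine, ten, 60)   # does not fit a 60 min grid
```

Running such a check on existing timeslots before changing DEFAULT_TIMESLOTGRANULARITY shows which slots would have to be fixed afterwards.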

7.5.2 Add schedule

Clicking the 'Add' button in the schedules management page (see the 'Schedules management page' figure) will create a new schedule as a copy of the one named 'default'.

7.5.3 Copy schedule

Select the schedule you want to copy by ticking its check-box, then click the 'Copy' button. A new schedule named 'Copy of ...' will be created.

7.5.4 Delete schedule

Select the schedules to delete in the management page (see the 'Schedules management page' figure), then click the 'Delete' button. A confirmation page will ask you to confirm the deletion.

7.6 The 'Boot modes' menu item

This menu (see the figure below) lists the available boot modes as well as their types (see below).

**Figure:** boot modes management page

7.6.1 Local boot mode edition

This is a special boot mode (not configurable) which tells a machine to boot locally. It cannot be edited or added since there is only one way to do a local boot (think singleton design pattern).

**Figure:** Local boot mode

7.6.2 Raw PXE boot mode edition

This is a special boot mode you can use to boot, for instance, floppy disc images through PXE.

**Figure:** RAW PXE boot mode edition

On the server side, the binary files must be stored in the TFTPD root (by default, this is: /srv/tftp/PXEClient/)

Put the content below into a file named 'default':

label linux
  kernel memdisk
  append initrd=foo.bin

foo.bin goes into /srv/tftp/PXEClient/ and the configuration file 'default' goes into /srv/tftp/PXEClient/pxelinux.cfg.

7.6.3 ComputeMode boot modes edition

This mode is editable but some parameters may be slightly tricky - feel free to ask for help.

If you are altering an existing ComputeMode boot mode or creating a new one from scratch, be aware that a few options are mandatory, namely MASTER and NFSROOT.

**Figure:** ComputeMode boot mode edition

The table 'ComputeMode boot modes options' below lists the available options. Some are kept for compatibility reasons and should not be used.

**Table:** ComputeMode boot modes options

MASTER
Mandatory: YES
Sent at boot time (PXE): YES
Hostname of the ComputeMode server in the CM subnet.
Syntax: numerical IPv4 address or hostname
Default: 172.28.255.253
Note: you can use a literal name, but as there may be DNS resolution issues, it is better to stick to an IP address if possible

NFSROOT
Mandatory: YES
Sent at boot time (PXE): YES
ComputeMode root to use for the diskless boot.
Syntax: nfsserver:/cm/distribution
Default: 172.28.255.253:/cm/debian
Note: you can use a literal name, but as there may be DNS resolution issues, it is better to stick to an IP address if possible

AUTOSTART
Mandatory: no
Sent at boot time (PXE): no
Specify which commands to start automatically after the distribution has finished booting.
Syntax: user1:cmdfilename1+user2:cmdfilename2
meaning: as user1, launch cmdfilename1, then as user2, launch cmdfilename2. The paths of cmdfilename1 and cmdfilename2 must be absolute.
Default: root:/var/lib/oar/cm/oar_start.sh (starts OAR)

AUTOSTOP
Mandatory: no
Sent at boot time (PXE): no
Specify which commands to run automatically before the node starts shutting down.
Syntax: same syntax as AUTOSTART just above
Default: root:/var/lib/oar/cm/oar_stop.sh (stops OAR)

EXIT
Mandatory: no
Sent at boot time (PXE): no
Specify which exit method to use (another way is to set it in /var/diskless/exit).
Syntax: halt, reboot or no
Default: reboot
Note: only these 3 keywords are supported. 'no' means do not exit.

NFSHOME
Mandatory: no
Sent at boot time (PXE): no
Specify how to mount the /home directory (may be done using the MOUNTS parameter too).
Syntax: some.host:/some/dir
Default: cmserver:/home

DDNS
Mandatory: no
Sent at boot time (PXE): YES
Whether Dynamic DNS should be activated.
Syntax: yes or no
Default: yes

DEBUG
Mandatory: no
Sent at boot time (PXE): YES
Activate debug mode. The value corresponds to the linuxrc breakpoint to stop at.
Syntax: positive integer, 0 means stop at all breakpoints.
Default: not set

DHCPNOKILL
Mandatory: no
Sent at boot time (PXE): YES
Do not kill the initrd DHCP client (a workaround to ensure the initrd and the distribution do not get two different leases from two different DHCP servers, which would break everything). The drawback is that the initrd will not be unmounted and its memory will not be freed.
Syntax: yes or no
Default: not set
Note: not much tested, use with care

DHCPPORT
Mandatory: no
Sent at boot time (PXE): YES
DHCP port that dhclient must use.
Syntax: positive integer
Note: not much tested, use with care

DHCPREJECT
Mandatory: no
Sent at boot time (PXE): YES
Whether the DHCP configuration should reject offers from some DHCP servers.
Syntax: x.y.z.t+a.b.c.d
where x.y.z.t and a.b.c.d are IPv4 addresses

DHCPSERVER
Mandatory: no
Sent at boot time (PXE): YES
Force the use of a DHCP server. Please prefer the DHCPREJECT option.
Syntax: x.y.z.t where x.y.z.t is an IPv4 address.

IP
Mandatory: no
Sent at boot time (PXE): YES
Used to set up the network manually. The value may be append, which means: let PXElinux give the information to the initrd. Other possible values have the form ipaddr:tftpserver:gateway:netmask. You may use the NS parameter to specify the DNS configuration.
Note: not much tested, use with care

MODULES
Mandatory: no
Sent at boot time (PXE): no
Specify which modules should be loaded upon distribution startup (another way is to edit the /etc/modules file directly).
Syntax: module1+module2

MOUNTS
Mandatory: no
Sent at boot time (PXE): no
Specify what the distribution should mount during startup (another way is to modify the fstab of the distribution).
Syntax: host1:/exp/dir%/mnt/dir1%nfs%ro,hard,intr+host2:/exp/dir%/mnt/dir2%nfs%ro,hard,intr

NS
Mandatory: no
Sent at boot time (PXE): YES
Specify a DNS server and a domain name if the network configuration is not done with DHCP.
Syntax: dns1,dns2:domain1,domain2
Note: not much tested, use with care

NTP
Mandatory: no
Sent at boot time (PXE): no
Specify the NTP (time) server to use (another way is to set it in /var/diskless/ntp).
Syntax: IPv4 address or hostname
Default: not set

PULL
Mandatory: no
Sent at boot time (PXE): no
Specify the pull method that the node will use to notify and get information from the server (another way is to put it in the /var/diskless/pull file directly).
Note: not much tested, use with care

USERS
Mandatory: no
Sent at boot time (PXE): no
Specify users which should be created on the system (another way is to directly modify /etc/passwd).
Syntax: user1:uid1+user2:uid2
Note: not much tested, use with care
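
To make the AUTOSTART and MOUNTS syntaxes from the table more concrete, here is a small sketch that splits such values into their parts. It is not the parser used by the CM boot scripts: the function names are hypothetical, and it assumes that "+" and "%" never appear inside a path or an option.

```python
def parse_autostart(value):
    """Split 'user1:cmd1+user2:cmd2' into [(user, command), ...]."""
    entries = []
    for item in value.split("+"):
        user, sep, command = item.partition(":")
        # The table requires absolute command paths.
        if not sep or not user or not command.startswith("/"):
            raise ValueError(f"bad AUTOSTART entry: {item!r}")
        entries.append((user, command))
    return entries


def parse_mounts(value):
    """Split 'source%mountpoint%fstype%options+...' into 4-tuples."""
    mounts = []
    for item in value.split("+"):
        fields = item.split("%")
        if len(fields) != 4:
            raise ValueError(f"bad MOUNTS entry: {item!r}")
        source, mountpoint, fstype, options = fields
        mounts.append((source, mountpoint, fstype, options.split(",")))
    return mounts


autostart = parse_autostart("root:/var/lib/oar/cm/oar_start.sh")
mounts = parse_mounts(
    "host1:/exp/dir%/mnt/dir1%nfs%ro,hard,intr"
    "+host2:/exp/dir%/mnt/dir2%nfs%ro,hard,intr"
)
```

Each MOUNTS entry maps naturally onto the four leading fields of an fstab line (device, mount point, type, options), which is why the table mentions editing the fstab as an alternative.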

7.7 The 'Boot images' menu item

This menu describes which Linux boot images may be booted and with which parameters they should run (see the figure below).

**Figure:** boot images management page

7.7.1 Adding or editing a boot image

**Figure:** Boot image edition

You will be shown a screen similar to the one in the 'Boot image edition' figure above.

Some information on specific fields:

name and note: should be meaningful strings for humans
kernel: indicates the name of the kernel file in /var/www/bootdirectory/images/
initrd: must be the name of the initrd file in /var/www/bootdirectory/images/
command line: must be options given on the kernel command line

7.7.2 Delete boot images

Select the boot image to delete in the management page (see the 'boot images management page' figure), then click the 'Delete' button. You will be shown a summary and a confirmation button.

7.8 The 'OAR' menu item

Some statistics regarding the OAR batch system are shown on this page.

**Figure:** OAR statistics page

7.9 The 'About' menu item

This page holds nothing special, just a list of the components used; for the sake of completeness, here is the mandatory screenshot.

**Figure:** cmwebadmin about page

8. Client nodes OS execution

Several solutions are available to add nodes to your CM grid.

8.1 Native node by rebooting

The simplest way to add client nodes is probably to enable PXE and boot the node: the CM server will then detect the PXE request and add the new machine under the name UXX-XX-XX-XX-XX-XX, where XX-XX-XX-XX-XX-XX is replaced by the machine's MAC address. The newly detected machine will have the default label 'Unconfigured' applied.
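
As an illustration of this naming scheme, the sketch below derives such a name from a MAC address. The function name is hypothetical, and rendering the hexadecimal digits in upper case is an assumption; the actual case used by the CM server may differ.

```python
import re


def autodetected_name(mac):
    """Build a 'UXX-XX-XX-XX-XX-XX' style name from a MAC address.

    Accepts the usual separators (':' or '-') or none at all.
    Upper-casing the digits is an assumption made for this sketch.
    """
    digits = re.sub(r"[^0-9A-Fa-f]", "", mac)
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    pairs = [digits[i:i + 2] for i in range(0, 12, 2)]
    return "U" + "-".join(pairs).upper()


name = autodetected_name("00:1a:2b:3c:4d:5e")
```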

8.2 Virtualized nodes

Another means to test CM is to start a virtual machine (such as VMware Player) and enable PXE so that it may boot on the CM server.

**Figure:** CM virtual machine for VMware Player download page

CM provides a page to easily download and register such a configuration package for VMware Player. Some files may not be redistributed for licensing reasons - but you can obtain them yourself - so you will have to do a bit of work if you want to enable most of the options available here. If you are interested in doing this, please read the chapter on enabling extra options in the vclient download page.

9. As a CM administrator, how do I...

This chapter aims at giving quick recipes to solve common issues for an administrator.

Note:: Some sections may be empty and just redirect you to others - this is a feature aimed at simplifying a search through keywords.

9.1 ... change CM administrator's password?

Click on the 'Users' menu item.
Click on the account name you want to edit.
Enter a new password (it will not be shown).
Click the 'Update' button.

Note:: The new password is enabled at once.

9.2 ... reset cmwebadmin administrator's password?

This should never happen since, as everybody knows it, nobody has ever forgotten a password :-)

This procedure is slightly tedious, but it is a last-resort solution.

Log in as 'root' on your cmserver (ssh, console).
Become the postgres user:
su - postgres
Launch the SQL command line client:
psql -U cmu CMDB
Enter the CMDB password (see the chapter on default accounts and passwords)
Type:
UPDATE users SET password = '42A0ASfCJSzOg' where login='admin';
Type:
\q
Type twice (once to exit the postgres user shell, once to exit the root shell):
exit

Note:: The new password is 'icatis' (without the quotes).

9.3 ... register a node?

See next section.

9.4 ... add a node?

Click on the 'Nodes' menu item.
Click on the 'Add node' tab.
Fill the form.
Click the 'Add' button.

9.5 ... unregister a node?

See next section.

9.6 ... make a node disappear?

Click on the 'Nodes' menu item.
Click on the node(s) you want to delete.
Click the tab 'Delete node(s)'

Note 1:: if you deleted a node, it may appear again in the future if it boots through PXE and auto-detection is enabled.
Note 2:: if you do not want to see detected nodes, then choose to show only the nodes configured in the default screen.

9.7 ... change the default label used in the nodes management page?

Edit config.php and replace the line reading:

$GLOBALS['NODE_MANAGEMENT_DEFAULT_LABEL_ID'] = 0;

with:

$GLOBALS['NODE_MANAGEMENT_DEFAULT_LABEL_ID'] = <XXX>;

where <XXX> is among:

0: for all the nodes
-1: for all the configured nodes
1: for the nodes in 'Quarantine'
2: for the nodes 'Unconfigured'

Note 1:: using a label ID outside the range -1 .. 2 is currently not supported
Note 2:: for the record, you may find such an ID by looking at the hyperlinks generated by the label management pages (for instance http://.../cm/main.php?...&labelid=42) - note that you do this at your own risk and that it is not supported

9.8 ... cope with an owner who no longer wants to be part of ComputeMode?

See next section.

9.9 ... black list a machine?

See next section.

9.10 ... put a machine in 'Quarantine'?

Click on the 'Nodes' menu item.
Select the node you want to blacklist.
Click on the tab 'Apply label'
Choose the 'Quarantine' label.
Click the 'Apply' button.

CM will now act as if it were disabled for this host, that is:

if this node boots through PXE, CM will tell it to boot in local mode
if a CM agent asks what to do, CM will never give an order (reboot or other) and will simply acknowledge the request

9.11 ... remove machines from 'Quarantine' mode?

Click on the 'Nodes' menu item.
Select the node you want to get out of 'Quarantine' mode.
Click 'Remove label'
Select the label named 'Quarantine'
Click 'Remove label'

Note:: Though this label has a special behavior, it is still a label which can be removed like any other standard label.

9.12 ... remove machines from 'Unconfigured' mode?

Click on the 'Nodes' menu item.
Select the node you want to get out of 'Unconfigured' mode.
Click 'Remove label'
Select the label named 'Unconfigured'
Click 'Remove label'

Note:: Though this label has a special behavior, it is still a label which can be removed like any other standard label.

9.13 ... disable auto 'Quarantine'?

See next section.

9.14 ... disable node automatic inhibition?

To disable automatic 'Quarantine' also called 'node inhibition', edit the CM configuration file config.php (which should be located in /var/www/cm/).

In this file, find the line reading:

$GLOBALS['ALLOW_INHIBIT'] = true;

Replace it with:

$GLOBALS['ALLOW_INHIBIT'] = false;

9.15 ... enable/disable node automatic detection?

Disabling this feature requires your editing the cmwebadmin configuration file. In config.php, edit the line reading:

$GLOBALS['MODULE_NODE_AUTODETECTION'] = true;

and replace it with:

$GLOBALS['MODULE_NODE_AUTODETECTION'] = false;

9.16 ... enable/disable owner/user's pages?

Disabling this feature requires your editing the cmwebadmin configuration file. In config.php, edit the line reading:

$GLOBALS['USER_ACCESS'] = true;

and replace it with:

$GLOBALS['USER_ACCESS'] = false;

9.17 ... force reboots at specific times?

See next section.

9.18 ... use an agent?

Such a functionality requires a bit more work and the installation and use of an agent. The work of the agent is basically to ask the server what it should do next - that is, reboot the machine into computation mode or continue as if nothing had happened.

There are currently 2 agent versions: a Windows one and a Linux/Unix one.

9.18.1 Windows agent

**Figure:** Windows Agent download page

A customized agent may be downloaded directly from a page hosted on cmserver. The figure illustrates the customizations which could be done, namely:

at which interval the agent has to be woken up
should it be woken up only when the system has been idle for a certain delay
after how much time the mode switch should be forced

You can find and download the Windows agent on your ComputeMode web server: http://172.28.255.253/wrapper.php?key=winagent

9.18.2 Linux agent

The Linux agent page is currently not shipped with CM, but a Debian Squeeze version is provided in the ComputeMode APT repository.

To install the Linux agent on a Debian Squeeze node, for example, type these commands on your node:

cat <<EOF > /etc/apt/sources.list.d/computemode.list
deb http://computemode.imag.fr/files/debian/squeeze ./
EOF
apt-get update
apt-get install cm-unixagent

9.19 ... handle specific bank holidays?

CM does not support calendar exceptions such as bank holidays, but a script can be used by configuring the 'PARTICIPANT_HOLIDAYS_SCRIPT' variable in the /var/www/cm/config.php file. This variable should contain the absolute path of a script which can look up bank holidays (from a file, a database...).

9.20 ... login as root on the client nodes?

For security reasons, logging in as root is disabled on the client nodes on the console. Yet, this can be reactivated by editing the client image OS.

Logging in as root is allowed through ssh from the root account on the CM server.

9.21 ... add a grid user account?

See next section.

9.22 ... add an Unix account?

Log in as root
Execute:
adduser <the_login>
Fill the requested information
If you are using NIS - this is enabled in the default setup - you have to rebuild the NIS maps, which may be done by typing:
make -C /var/yp
If you want to let this user submit jobs through OAR, you have to add him/her to the oar group, type:
adduser <the_login> oar

9.23 ... use an extra NFS server?

This can be done by configuring the ComputeMode boot mode parameters. Please read the 'ComputeMode boot modes options' table in section 7.6.3, and especially the part about the MOUNTS option.

If your servers are not available through the same subnet (172.28.0.0/16) as your CM nodes, you will have to alter routing on the clients or use some NAT systems.
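
Whether an extra NFS server sits in the CM subnet can be checked before touching the routing. The sketch below uses Python's standard ipaddress module; the function name and the second sample address are only illustrative.

```python
import ipaddress

# Subnet used by the CM nodes, as stated above.
CM_SUBNET = ipaddress.ip_network("172.28.0.0/16")


def in_cm_subnet(server_ip):
    """True if the server is directly reachable from the CM subnet."""
    return ipaddress.ip_address(server_ip) in CM_SUBNET


cm_server_ok = in_cm_subnet("172.28.255.253")   # the default CM server address
external_ok = in_cm_subnet("192.168.1.10")      # hypothetical external NFS server
```

A server for which this check fails is the case where routing changes or NAT are needed.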

9.24 ... change the server public IP?

This is currently done the Debian way:

fire up a text editor and open, as root, /etc/network/interfaces
go to the eth0:0 stanza
replace the current line with 'address x.y.z.t' by the new address you want to use
if necessary, edit the 'netmask' and 'gateway' lines

For further information, you may want to read the interfaces(5) man page.

9.25 ... use an LDAP server to authenticate my users?

This can be done by configuring some variables in the config.php file.

LDAPSERVER: the name or IP address of your LDAP server
LDAPPORT: the LDAP port used to communicate with your server
LDAPDOMAIN: the LDAP domain of your server
IDENT_LDAPPASSWORD: the default LDAP password used to recognize LDAP users in the cmwebadmin account database. A user with this password stored in the cmwebadmin database has to be authenticated by the LDAP server and not by the local cmwebadmin users database.
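
The role of IDENT_LDAPPASSWORD can be pictured as a simple marker comparison. The sketch below only illustrates that decision; it is not the cmwebadmin implementation, the function name and the marker value are hypothetical, and how the password is actually stored or compared is not covered here.

```python
def authentication_backend(stored_password, ident_ldappassword):
    """Choose where a cmwebadmin user gets authenticated.

    A stored password equal to the IDENT_LDAPPASSWORD marker means the
    user is delegated to the LDAP server; any other value means the
    local cmwebadmin users database is used.
    """
    if stored_password == ident_ldappassword:
        return "ldap"
    return "local"


MARKER = "use-ldap"  # hypothetical value of IDENT_LDAPPASSWORD
backend_for_ldap_user = authentication_backend("use-ldap", MARKER)
backend_for_local_user = authentication_backend("42A0ASfCJSzOg", MARKER)
```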

10. ComputeMode architecture

This chapter gives a general overview of what is happening behind the scenes.

10.1 What happens during the PXE boot?

When the client boots, it broadcasts a DHCP request with a flag telling it wants some PXE.

The CM server sees this request and replies by giving a temporary IP address and telling the client to fetch an IPXE file. IPXE is a binary that allows ComputeMode to use the HTTP protocol (and so a TCP connection) to execute its commands.

10.2 What role does PXELinux play?

IPXE then tries to contact the HTTP server to request a configuration file whose name is based on the MAC address of the network card used. This file is created dynamically during the IPXE request, based on what the CM server currently knows (time, day, load, labels, etc.).

IPXE then receives this configuration file and acts according to its contents, which may basically be of one of the two following flavors:

local boot or,
chain / boot something else (Linux kernel, image, floppy image, other boot loader, pxelinux boot, etc.)

In the case it is a CM OS image, several options are passed by means of kernel command line options - this will be explained in the next section.

The IP address which was obtained during the PXE negotiation described above is now released.

10.3 What role does a CM image have to play?

Now that the Linux kernel is booting with its attached initrd, the following events take place:

try to find and load a valid network card driver: all the previous network communications happened through the UNDI network stack, which stands for Universal Network Device Interface and is basically a low-level, low-performance interface available for early network communication - it may not be used by modern OS
try to get a DHCP address from the CM server
set up the disk-less Linux distribution by using a mix of NFS and AUFS mounts and RAM disks
start the distribution processes and services

10.4 How does the `/cm/<distrib_name>` file hierarchy work?

Basically this folder contains an almost complete OS image which has been adapted for CM. 'Almost' because, to be complete, an image also has to include a kernel and an initrd (which both go into /var/www/bootdirectory/images/) and the adequate boot image and boot mode in cmwebadmin.

ComputeMode provides a centralized distribution system for client nodes. As a result, deploying (adding or upgrading) an application on the nodes is easily achieved by modifying the ComputeMode Server's network boot system repository.

Note:: Altering these folders should only be performed by experienced Linux administrators.

The file system for client nodes is, currently, shared by means of the NFS protocol and mounted with the AUFS filesystem to enable a user write mode. For simplicity most images are located in /cm/ - this can be changed if you update the NFS exports list as well as the boot image configuration.

ComputeMode currently uses a Debian-based distribution by default, so the directory is logically /cm/debian. In this folder, several subdirectories with specific names will be found. Let's review each one of these:

orig/: contains a kind of golden system image - it was built by copying all the files of a fresh Linux installation and has not been altered for network boot. This folder should be managed using the distribution's preferred method (dpkg/apt-get for Debian-based distributions, rpm/urpmi/yum for RPM-based ones, ...)
patch/: contains the data (configuration files, replacement and extra binaries) required to transform the standard base Linux system into a ComputeMode network boot system. As the name 'patch' suggests, a mechanism similar to the one used for patching source files is applied
rules: is a text configuration file describing how the 'patch' elements are to be mixed with the 'orig' elements at the startup of each client node. It describes the copy and link operations to be applied using a few keywords such as: copypatch, copyorig, linkpatch, linkorig and skip.
To know what each operation does, read the key below:
1. copy means copy to ramdisk
2. link means create a symlink
3. orig means use the file/folder in the orig/ subdirectory
4. patch means use the file/folder in the patch/ subdirectory
5. skip means do nothing/ignore the file/folder
utils/: contains CM-specific scripts required to perform the network boot specific operations. The buildroot.sh script parses the 'rules' configuration file - other commands will perform specific operations for client nodes (start, registration, stop)
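
The key above shows that each keyword of the 'rules' file is an action (copy or link) combined with a source (orig or patch), plus the standalone skip. The sketch below decodes a keyword accordingly. It covers the keywords only; the layout of the 'rules' file itself is not described here, and the function name is hypothetical.

```python
ACTIONS = {"copy": "copy to ramdisk", "link": "create a symlink"}
SOURCES = {"orig": "orig/", "patch": "patch/"}


def decode_keyword(keyword):
    """Return (operation, source subdirectory) for a rules keyword."""
    if keyword == "skip":
        return ("ignore the file/folder", None)
    for action, meaning in ACTIONS.items():
        if keyword.startswith(action):
            source = keyword[len(action):]
            if source in SOURCES:
                return (meaning, SOURCES[source])
    raise ValueError(f"unknown rules keyword: {keyword!r}")


decoded = {k: decode_keyword(k)
           for k in ("copypatch", "copyorig", "linkpatch", "linkorig", "skip")}
```

For example, linkorig decodes to a symlink pointing at the file or folder in orig/, while copypatch decodes to a ramdisk copy taken from patch/.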

As a result, deploying an application to a given boot mode is achieved by adding the necessary files to the 'orig' directory (possibly using the distribution's packaging facility), using a chroot command to have a consistent system view. Then, you have to check that the installed files are not overridden by the 'patch/' files. Such conflicts can be solved manually using the 'rules' file, but for most well-written applications (i.e. those not needing read/write access to the system installation when running), such manual work should not be necessary.

Note:: upgrading some system parts (especially libraries) while they are used by client nodes may cause problems (crashes)

Depending on the rules file and on how your directories are exported, client nodes will be able to use the newly installed software at their next boot.

10.5 Which processes run on the CM server?

Please see section 11.2.

11. Security walk-through

This chapter aims at summarizing the processes running, and the security implications and mitigations of running a CM within your site.

11.1 Security concerns

ComputeMode is not designed to be run in a hostile environment, in the sense that, if someone really wants to mess with the nodes within your network, almost any workstation can steal the identity of another (in more or less difficult ways).

The basic idea which is the one often seen within enterprises is that:

security is paranoid at the network borders : nothing unsafe should get in or get out from the intranet, which most likely implies using firewalls and authenticated web (and other protocols) proxies
within the LAN, hosts can safely communicate with one another
employees are not evil on purpose

On the contrary, if you have hostile people whose sole purpose in your company is to annoy others by sabotaging their work, please rethink your hiring process !

11.2 Services running on the CM server

To know which ports are open on your CM server, you can execute, as root:

netstat -lnp

Several services depend on portmap and hence do not have fixed port numbers: to find out which ports are used, you may type on the server:

rpcinfo -p localhost
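
To turn that listing into something scriptable, the following Python sketch parses the classic five-column output of 'rpcinfo -p' (program, vers, proto, port, service). The sample text is made up for illustration, the port numbers in it are not taken from a real CM server, and the function name is hypothetical.

```python
def parse_rpcinfo(output):
    """Return a sorted list of (service, proto, port) from 'rpcinfo -p' text."""
    entries = set()
    for line in output.splitlines():
        fields = line.split()
        # Data rows look like: program vers proto port [service]
        if len(fields) < 4 or not fields[0].isdigit():
            continue  # header or empty line
        program, _vers, proto, port = fields[:4]
        service = fields[4] if len(fields) > 4 else program
        entries.add((service, proto, int(port)))
    return sorted(entries)


# Made-up sample, only meant to show the expected layout.
SAMPLE = """\
   program vers proto   port  service
    100000    2   tcp    111  portmapper
    100000    2   udp    111  portmapper
    100024    1   udp  32768  status
    100024    1   tcp  32769  status
"""

if __name__ == "__main__":
    for service, proto, port in parse_rpcinfo(SAMPLE):
        print(f"{service:12s} {proto}/{port}")
```

On a live server, the text could be obtained with subprocess.run(["rpcinfo", "-p", "localhost"], capture_output=True, text=True).stdout.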

11.2.1 Services required by CM server

Several services and servers are running on a standard CM server - some may be added or removed but the following ones are currently needed for proper functioning.

Table: Services running (and required by) the CM server

Service	Daemon name	Ports & protocol	Comment
DNS	bind	TCP/53 + UDP/53	DNS
		1 dynamically allocated UDP port
		TCP/953 (localhost only)	control channel
ssh	sshd	TCP/22
web/http	httpd	TCP/80
web/https	httpd	TCP/443
web	httpd	TCP/943 (localhost only)	control channel
portmapper	portmap	TCP/111=sunrpc and UDP/111=sunrpc
NIS	rpc.yppasswd	(through portmap)
	rpc.ypxfrd	(through portmap)
	ypbind	(through portmap)
mail	exim	TCP/25 on localhost
syslog	syslogd	UDP/514
NFS server	(none - kernel)	TCP/2049 and UDP/2049
	rpc.mountd	through portmap
	rpc.statd	through portmap
NFS mounts	(none - kernel)	UDP system ports (< 1024)
PXE proxy	pxe	UDP/4011
TFTP server	in.tftpd	UDP/69	through xinetd
DHCP server	isc-dhcp-server	UDP/67 + ICMP
PostgreSQL	postmaster	TCP/5432 (may be bound to localhost)	can be disabled
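
A quick way to verify from another host that the fixed TCP ports of the table above are reachable is sketched below in Python. It only covers a few TCP entries: UDP services and those registered through portmap cannot be checked with a plain TCP connect. The helper names and the selection of ports are choices made for this example.

```python
import socket

# A few fixed TCP ports taken from the table above.
EXPECTED_TCP_PORTS = {"DNS": 53, "ssh": 22, "web/http": 80, "web/https": 443}


def tcp_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_server(host, ports=EXPECTED_TCP_PORTS, probe=tcp_port_open):
    """Return {service name: reachable?} for every port in 'ports'."""
    return {name: probe(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # 'cmserver' is a placeholder hostname; replace it with your own server.
    for name, is_open in check_server("cmserver").items():
        print(f"{name:10s} {'open' if is_open else 'closed'}")
```

The probe argument is only there so that the check can be exercised without touching the network.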

11.2.2 Services required by ComputeMode friendly software tools

ComputeMode ships with a few well-known tools.

Currently, there is only the OAR scheduler, which offers a free (GNU GPL licensed) task scheduling system. This software is written mostly in Perl and uses MySQL or PostgreSQL, ssh and sudo. The client nodes are contacted when needed through ssh, so no extra service runs on them. Please check the table below for further information.

Table: Services required by tools shipped with CM: OAR

Service name	Daemon seen	Protocol/Port
OAR	perl	TCP/6666
	postgresql	TCP/5432

11.3 Services running on client nodes

Client nodes may only be accessed from the CM server and they should all belong to the dedicated CM sub-network (by default: 172.28.0.0/16). All the services listed in the table below are only accessible from this private class B network.

Most have their access restricted through inetd and tcp_wrappers.
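
To check whether a given address belongs to the dedicated CM sub-network, a small Python sketch using the standard ipaddress module could look like this. The default network is the one quoted above; the function name is hypothetical.

```python
import ipaddress

# Default CM sub-network, as stated in the text above.
CM_SUBNET = ipaddress.ip_network("172.28.0.0/16")


def in_cm_subnet(address, subnet=CM_SUBNET):
    """Return True if 'address' (a dotted IPv4 string) lies inside 'subnet'."""
    return ipaddress.ip_address(address) in subnet


if __name__ == "__main__":
    for candidate in ("172.28.255.253", "172.29.0.1", "192.168.1.10"):
        print(candidate, in_cm_subnet(candidate))
```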

Table: Services running on a client node

Service name	Daemon name	Protocol/Port	Comment
sunrpc	portmap	UDP/111 + TCP/111	tcp_wrappers
NIS	ypbind	through portmap	tcp_wrappers
NFS/status	rpc.statd	through portmap	tcp_wrappers
ssh	sshd	TCP/22	tcp_wrappers
syslog	syslogd	UDP/514	no filtering
NFS mounts	(none - kernel)	UDP system ports (< 1024) usually around UDP/800	kernel filtering
DHCP client	dhclient	UDP/68	kernel filtering

11.4 Owner accesses

Owners may only access the CM server through a user page on the server. The scope of actions which can be done there is quite limited. They can:

declare that their machine is available by adding exceptions
see what happened to their machines over the last few days

Authentication is currently deliberately weak (IP-based) to simplify the task of users who want to let their workstation join the grid.

Note:: if the CM administrator wants to disable this mode, the config.php file has to be edited (see section 9.16, "... enable/disable owner/user's pages?").

V. Appendices

A. Default Authentication Credentials

Using default passwords is really close to being ''pure evil'' : do consider seriously changing them once your setup is running smoothly.

Do note however that the spelling may be slightly altered according to the keyboard flavor you are using. Basically if 'icatis' does not work, it may indeed be 'icqtis' that works.

Component	Login	Password	Comment
UNIX account	`root`	`icatis`	change this a.s.a.p. with `passwd`
UNIX account	`guest`	`guest`	disable it when you are done: `passwd -l guest`
cmwebadmin	`admin`	`icatis`	change it (see section 7.4.1, Account edition)
PostgreSQL / CMDB	`cmu`	`cmupassword`	change it in PG and in `config.php`
OAR	`oar`	`oar`	change it in MySQL, in `config.php` and in `oar.conf`

B. Enabling extra-options in Virtual Client download page

This chapter tells how to enable most options in the virtual client page. You will have to perform several operations on your server, some may not be quite obvious.

This appendix will be written in a next version of this documentation.

C. Acknowledgements

C.1 Open Source Software

ComputeMode relies on several Open Source components, namely:

the RDBMS used by cmwebadmin is PostgreSQL and is released under a BSD license
http://www.postgresql.org/
the web server used is Apache and is released under the Apache license 2.0
http://www.apache.org/
the interface is built above PHP5 + PEAR + SMARTY which are all released under the PHP license 3.0
the Simple Calendar Widget is by Anthony Garrett and is released under the LGPL 2.1
http://www.tarrget.info/calendar/scw.htm
pxelinux which is released under a GPL 2 license
http://syslinux.zytor.com/
ipxe which is released under a GPL 2 license
http://ipxe.org/
bind DNS server software (ISC) which is released under a BSD license
http://www.isc.org/sw/bind/
DHCP server software (ISC) which is released under a BSD-like license
http://www.isc.org/sw/dhcp/
OAR software (ID-IMAG) is released under a GPL license
http://oar.imag.fr/

If some software you developed is being used and we forgot to mention it in this document, please tell us and we will fix this page in later revisions of this document.

C.2 Third-party software

This manual mentions several third-party software products and trademarks. Here is the alphabetized list:

Debian is a registered trademark of Software in the Public Interest, Inc.
http://www.debian.org/
Ghost is a product by Symantec Corporation
http://www.symantec.com/
Linux is a registered trademark by Linus Torvalds
http://www.kernel.org/
VMware Player, VMware Workstation are products of VMware, Inc.
http://www.vmware.com/

An Open Source (GPL) batch manager named 'OAR' ships with CM. If you are more acquainted with Platform's LSF, OpenPBS, TORQUE or Sun Grid Engine, you may use them instead, provided you adapt the configuration.

D. ComputeMode Database Schema and Dictionary

This appendix describes the database schema used in ComputeMode 2.0.

D.1 Compatibility with previous schemas

The schema described in this appendix is compatible with the previous schema (used in ComputeMode 1.6), meaning the web administration interface 1.6 may be used with a database implementing this schema.

Fields were only added, or extended without adding constraints.

This schema is meant to be implemented in PostgreSQL 7.4 or later.

D.2 Conventions

Conventions are numbered for convenience.

1. Naming convention: whenever it is not followed, this will be duly noted.
2. Tables and fields names: they are case-insensitive.
3. Table names: they have the same name as the object they store, but in the plural form: to store 'foo' objects the table should be named 'foos'.
4. Foreign keys: if the table ''foos'' has a field named ''id'', then in the table ''bars'' the field name will begin with ''foos_id''. Most of the time, it will only be ''foos_id'' but it may become ''foos_id_thingie''.
5. Primary keys: every ''simple'' table has a primary key implemented by means of a sequential auto-incrementing value (PostgreSQL type 'bigserial'). Most of the time, if the table has a plural name, the primary key (named id for short) will be named: 'id' + singular name of the table. For instance, for 'nodes' the record id is named 'idnode'.
6. Join tables: many-to-many tables have a name based on the aggregation of both table names to join on: tables 'foos' and 'bars' will join in a 'foos_have_bars' (or 'bars_have_foos' - no specific order is imposed) table. For such tables, the key is composed of the aggregation of the two fields used to join. Both fields should respect the foreign key naming convention.
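
The naming conventions above are mechanical enough to be expressed in a few lines. The Python sketch below derives the expected names from a table name; it is only an illustration (the helper names are hypothetical, and the singular form is obtained naively by dropping a trailing 's').

```python
def singular(table):
    """Naive singular form: drop a trailing 's' (an assumption of this sketch)."""
    return table[:-1] if table.endswith("s") else table


def primary_key(table):
    """'id' + singular name of the table, e.g. 'nodes' -> 'idnode'."""
    return "id" + singular(table)


def foreign_key(table, suffix=""):
    """Field referring to 'table' from another table, with an optional suffix."""
    name = f"{table}_{primary_key(table)}"
    return f"{name}_{suffix}" if suffix else name


def join_table(left, right):
    """Name of the many-to-many table joining 'left' and 'right'."""
    return f"{left}_have_{right}"


if __name__ == "__main__":
    print(primary_key("nodes"))           # idnode
    print(foreign_key("nodes"))           # nodes_idnode
    print(join_table("nodes", "labels"))  # nodes_have_labels
```

Names are produced in lower case, which is harmless since table and field names are case-insensitive.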

The following abbreviations are used throughout this appendix :

PK: primary key constraint (the record is unique - DBMS enforcement)
FK: foreign key constraint (the record exists as a primary key in the referenced table - DBMS enforcement)
FK*: may be null - but if not, it should refer to an existing primary key (no DBMS enforcement)
NE: naming exception (when a field or table does not respect these conventions)

D.3 Tables list

Do note that in PostgreSQL ancillary tables are automatically added to the schema to handle sequential values (auto-incrementable values). The tables explicitly created are given below, sorted by alphabetic order:

bootimages: Contains information related to Linux boot images.
bootmodes: Contains the information related to the boot modes (Local, ComputeMode or RawPXE).
bootparams: Contains boot parameters which are understood by the ComputeMode scripts. This table is not editable from the web interface and only contains fixed values. Some help messages may be added or translated.
bootmodes_have_bootparams: Stores the many-to-many association between the tables bootmodes and bootparams for the ComputeMode bootmodes. (A bootmode has boot parameters, a boot parameter may apply to several bootmodes).
labels: Contains the labels names.
Note: records whose primary keys are 1 and 2 are reserved for ComputeMode internal use (they implement the 'Unconfigured' and 'Quarantine' labels).
logevents: Stores the modifications of the automatons states during the life of a node.
nodes: Contains all the information related to a node.
exceptions: Contains all the exceptions related to machines. An exception being when a user declares his/her machine is going to be available between 2 dates.
nodes_have_labels: Stores the many-to-many association between nodes and labels (A node may have several labels applied, a label may apply to several nodes).
nodetimeslots: Stores timeslots (= tuples composed of a week day, a start time and an end time) and their association with a bootmode.
nodeweeklytimeslots: This is where the schedules (that is, a set of nodetimeslots for a week) are stored.
users: This table contains the list of users of the web administration interface, that is mostly CM administrators. Personal information and encrypted passwords are also stored here.
version_schema [NE]: This table must contain only one record, which is used to check the DB schema version when the web administration interface is upgraded.

D.4 Database Schema (plot)

Do note the following elements about the figure below:

primary keys use a bold typeface
some fields may not be shown for clarity reasons - they are however listed in the next section
the schema has been drawn to minimize relations crossings

**Figure:** CM database schema and relations

D.5 Database Dictionary & Constraints

The database dictionary is described below and aims at giving hints at how fields are used and what they are supposed to contain.

In addition to the PK and FK constraints, some constraints are enforced by PostgreSQL. To avoid depending on too many DBMS specificities, others are only application-enforced.

D.5.1 Table bootImages

idBootImage (PK): id
biName: for boot image name - string identifying the BootImage (e.g. '2.4.24-20mdkenterprise')
CONSTRAINT:DB: not null
kernel: name of the kernel file used - path relative to the TFTPD directory
CONSTRAINT:DB: not null
initrd: name of the initrd image used - path relative to the TFTPD directory
cmdLine: kernel options (e.g. 'ro devfs=mount ramdisk=5000 acpi=ht') - the part related to ComputeMode boot parameters must not conflict with what is stored there
environment: link to an external description of the bootimage

D.5.2 Table bootModes

This table implements some kind of union record. Depending on the value of 'bmType', the application will use different fields.

idBootMode (PK): id
bmName: a unique string identifying the bootmode (e.g. 'redhatforsmp')
CONSTRAINT:DB: not null, length > 0, length < 30, UNIQUE
note: to hold some notes or comments
bmType: currently only 3 modes are supported, namely:
'Local': indicates to boot on the local disk: there should be only one
'ComputeMode': indicates a ComputeMode boot
'RawPXEConfig': indicates a plain service
CONSTRAINT:DB: not null, among the 3 strings above
rawConfigFilePath: Path to the raw PXE configuration file to use. This field is only used when bmType is 'RawPXEConfig'.
CONSTRAINT:DB: length < 200
bootImages_idBootImage: The id of the BootImage (in the BootImages table) to use if bmType is 'ComputeMode'. For the record, BootParams entries are also associated with every ComputeMode type BootMode, using the BootModes_have_BootParams association table.
CONSTRAINT: the DBMS only checks that the value is greater than or equal to 0.
isSmart (unused): left in the schema for compatibility reasons - should always be 0.

D.5.3 Table bootParams

idBootParam: id
bpName: string identifying the BootParam (e.g. 'NFSROOT')
CONSTRAINT:DB: not null
isPXE: boolean indicating whether this parameter should be passed via the kernel cmdline (via PXE)
if false, the parameter will be given later as an extra parameter (via HTTP). Only vital parameters, mandatory for booting, should have this value set to true.
CONSTRAINT:DB: not null
defaultValue: an optional sample default value which may be used to prefill dialog in the configuration interface
note: an optional help message about how the bootparam should be used or the format of the values to use

D.5.4 Table bootModes_have_bootParams

This table is an association table for the many-to-many relation between BootModes and BootParams. The fields here are:

bootModes_idBootModes (FK): bootmode to associate
CONSTRAINT:DB: see below
bootParams_idBootParam (FK): bootparam to associate
CONSTRAINT:DB: (bootModes_idBootModes, bootParams_idBootParam) = (PK)
value: value for the association

D.5.5 Table labels

idLabel: id
name: label name
CONSTRAINT: DB: at most 30 characters
CONSTRAINT: APP: only letters, no spaces, unique in a case-insensitive way...

D.5.5.1 IMPORTANT NOTE - VALUES TO USE

CM uses 2 values internally which must be present in the DB :

(idLabel, name) = (1,'Unconfigured')
(idLabel, name) = (2,'Quarantine')

Any other (strictly positive) value is fine to use.

D.5.6 Table logEvents

This table does not have a primary key. However, considering the time granularity of most systems, the switchtime field may often be regarded as a key (though, theoretically it is not).

nodes_idNode: node ID to which the logged event applies
switchTime: timestamp indicating when the event happened
newBootMode [NE]: if there is a boot mode change, then it's stored here
Note: there is no foreign key constraint as a node may be deleted and we don't want to update this table
newState: this is used to store a string relative to the state of the application for this node
CONSTRAINT:DB: length < 50
pxeBoot: indicates whether there was a PXE boot.

D.5.7 Table nodes

idNode (PK): id
users_idUsers (FK): which user owns the node
mac: clean MAC address (6 groups of 2 lower-case hexadecimal digits separated by dashes, e.g. '11-33-55-77-99-aa') - integrity is only partially checked by the DBMS - the application layer has to take care of storing only meaningful MAC addresses
CONSTRAINT:DB: not null, length < 17
hostname: machine name to use during dhcp requests, for dynamic DNS registration.
Note: a hostname should be unique, ignoring case.
CONSTRAINT:DB: length < 90
note: optional comment field
nodeWeeklyTimeSlots_idNodeWeeklyTimeSlot (FK): a NWTS ( = schedule) ID
state: used by the application to store some transient information
lastStateChange: timestamp of the last event that occurred to the node
bootModes_idBootMode_pxe (FK) [NE]: used by the application to store some transient information
Naming exception reason: previously there were several nodes.bootmodes_idbootmode_*, thus requiring an extra extension, the other ones disappeared, and this one was left as is.
lastTimeSeen: timestamp of the last time the node was seen by the server
useWol: use Wake on LAN for this node ? 0 means no, 1 means yes
handled_by_job_manager: should CM attempt to synchronize with the job manager ? 0 means no, 1 means yes
host_ip: optionally some numeric IP indicating which is the public IP address of the node
CONSTRAINT: APP: valid IP
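
Since the format of the mac field is mostly left to the application layer, here is a minimal Python sketch of such a check. It accepts the 'clean' form described above and, as a convenience that is an assumption of this example, also colon-separated or upper-case input, which it normalizes. The function name is hypothetical.

```python
import re

# 6 groups of 2 lower-case hexadecimal digits separated by dashes.
MAC_RE = re.compile(r"[0-9a-f]{2}(-[0-9a-f]{2}){5}")


def normalize_mac(raw):
    """Return the MAC address in the 'clean' form, or raise ValueError."""
    candidate = raw.strip().lower().replace(":", "-")
    if not MAC_RE.fullmatch(candidate):
        raise ValueError(f"not a valid MAC address: {raw!r}")
    return candidate


if __name__ == "__main__":
    print(normalize_mac("11:33:55:77:99:AA"))  # 11-33-55-77-99-aa
```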

D.5.8 Table exceptions

idException (PK): id
nodes_idNode (FK): node to which the exception applies
beginTime: timestamp indicating when the exception is to start - the time of the day should be 00:00:00
CONSTRAINT:DB: not null, beginTime <= endTime
endTime: timestamp indicating when the exception period is to end - the time of the day should be 23:59:59
CONSTRAINT:DB: not null, beginTime <= endTime
nodeWeeklyTimeslots_idNodeWeeklyTimeslot_in (FK) [NE]: the schedule the node had when it entered the exception
nodeWeeklyTimeslots_idNodeWeeklyTimeslot_out (FK) [NE]: the schedule the node will have when it exits the exception
note: optional comment field

D.5.9 Table nodes_have_labels

This table is an association table for the many-to-many relation between Nodes and Labels. The fields here are:

nodes_idNode (FK): node to associate
CONSTRAINT:DB: see below
labels_idLabel (FK): label to associate
CONSTRAINT:DB: (nodes_idNode, labels_idLabel) = PK

D.5.10 Table nodeTimeSlots

Two different NodeTimeSlots records related to the same NWTS should not overlap (application enforced).

idNodeTimeSlot (PK): id
NodeWeeklyTimeSlots_idNodeWeeklyTimeSlot (FK): the ID of the NWTS/Schedule to which the record belongs
dayNo: the number of the day (ranging [and including] from 0 = Sunday to 6 = Saturday)
CONSTRAINT:DB: not null, 0 <= dayNo <= 6
beginTime: time at which the period starts - should be the exact time (for instance, for 6:30 it would be 6:30:00)
CONSTRAINT:DB: not null, beginTime <= endTime
endTime: time at which the period ends - should be the exact time - 1 second (for instance, for 7:00 it would be 6:59:59)
CONSTRAINT:DB: not null, beginTime <= endTime
bootmode_idbootmode (FK): ID of the boot mode associated to the period
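
The conventions above (dayNo counted from Sunday, endTime stored as the exact end minus one second, no overlap within a schedule) can be sketched in Python as follows. This is an illustration of the rules, not the code used by the web interface; the function names and the tuple layout are hypothetical.

```python
from datetime import date, datetime, time, timedelta


def cm_day_no(d):
    """Map a date to dayNo (0 = Sunday ... 6 = Saturday)."""
    # date.weekday() counts from Monday = 0, so shift by one.
    return (d.weekday() + 1) % 7


def slot(day_no, begin, end_exclusive):
    """Build (dayNo, beginTime, endTime), storing endTime as end - 1 second."""
    # The date used here is arbitrary; only the time of day matters.
    end = datetime.combine(date(2000, 1, 1), end_exclusive) - timedelta(seconds=1)
    return (day_no, begin, end.time())


def overlaps(a, b):
    """True if two slots fall on the same day and share at least one second."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


if __name__ == "__main__":
    morning = slot(1, time(6, 30), time(7, 0))
    print(morning[2])  # 06:59:59
    print(overlaps(morning, slot(1, time(7, 0), time(8, 0))))  # False
```

Storing the end as "exact time minus one second" is what lets two consecutive periods be compared with simple inclusive bounds.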

D.5.11 Table nodeWeeklyTimeSlots

This table name may be shortened to 'schedule' or 'nwts'.

idNodeWeeklyTimeSlot (PK): id
nwtsName: name of the schedule
CONSTRAINT:DB: not null, length > 0, length < 42, unique
isTemplate (unused): a boolean indicating whether this is editable by users or if this is an admin table (a template is instantiated when a user edits his own NWTS)
CONSTRAINT:DB: not null
note: optional comments
isuserselectable: indicates whether a user can select this schedule or whether it is reserved for administrators
bootmodes_iddefault [NE]: default bootmode - always use 1 ('Local') - other values are undefined - left for compatibility reasons

D.5.12 Table users

idUser (PK): id
login: a nickname, optionally based on the user's real name
CONSTRAINT:DB: not null, length > 0, length < 30, UNIQUE
password: the password hashed with the crypt() function (used to authenticate the user)
CONSTRAINT:DB: not null, length > 0, length < 100
firstname: optional fields of information regarding the user
CONSTRAINT:DB: length < 50
lastname: optional fields of information regarding the user
CONSTRAINT:DB: length < 50
email: optional fields of information regarding the user
CONSTRAINT:DB: length < 90
phoneNumber: optional fields of information regarding the user
CONSTRAINT:DB: length < 40
adminrole: describes whether a user is an administrator or not: 1 for an administrator, 0 for a regular user
active_cm: describes whether a user is active or not
active_holidays: describes whether a user wants to use a holiday planning file to compute the ComputeMode schedule

D.5.13 Table version_schema

This table does not have a primary key as it only has one record.

ts: timestamp of the schema

D.5.13.1 IMPORTANT NOTE

Currently, the record in this table is only used by the update system.

D.6 Joins

D.6.1 Joins list

The tables have the ''simple'' join relations below - the list items are only numbered for convenience:

1. bootimages.idbootimage = bootmodes.bootimages_idbootimage
2. bootmodes.idbootmode = nodetimeslots.bootmodes_idbootmode
3. nodes.users_iduser = users.iduser
4. nodes.nodeweeklytimeslots_idnodeweeklytimeslot = nodeweeklytimeslots.idnodeweeklytimeslot
5. nodes.bootmodes_idbootmode_pxe = bootmodes.idbootmode
6. exceptions.nodeweeklytimeslots_idnodeweeklytimeslot_in = nodeweeklytimeslots.idnodeweeklytimeslot
Naming exception reason: several exceptions.nodeweeklytimeslots_idnodeweeklytimeslot used in a record
7. exceptions.nodeweeklytimeslots_idnodeweeklytimeslot_out = nodeweeklytimeslots.idnodeweeklytimeslot
Naming exception reason: several exceptions.nodeweeklytimeslots_idnodeweeklytimeslot used in a record
8. exceptions.nodes_idnode = nodes.idnode
9. logevents.nodes_idnode = nodes.idnode

D.6.2 Many-to-many joins tables

The tables storing many-to-many associations use the relations below - the list items are only numbered for convenience:

1. bootmodes_have_bootparams.bootmodes_idbootmode = bootmodes.idbootmode AND bootmodes_have_bootparams.bootparams_idbootparam = bootparams.idbootparam
2. nodes_have_labels.nodes_idnode = nodes.idnode AND nodes_have_labels.labels_idlabel = labels.idlabel
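
To make the many-to-many join concrete, the Python sketch below builds a reduced copy of nodes, labels and nodes_have_labels in an in-memory SQLite database and lists the labels of a node. SQLite is only a stand-in for PostgreSQL here, the column types are simplified, most fields are omitted, and the sample hostnames and the 'Physics' label are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE nodes  (idnode  INTEGER PRIMARY KEY, hostname TEXT);
CREATE TABLE labels (idlabel INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE nodes_have_labels (
    nodes_idnode   INTEGER REFERENCES nodes(idnode),
    labels_idlabel INTEGER REFERENCES labels(idlabel),
    PRIMARY KEY (nodes_idnode, labels_idlabel)
);
-- Labels 1 and 2 are the values reserved by CM; the rest is sample data.
INSERT INTO labels VALUES (1, 'Unconfigured');
INSERT INTO labels VALUES (2, 'Quarantine');
INSERT INTO labels VALUES (3, 'Physics');
INSERT INTO nodes VALUES (1, 'ws-a');
INSERT INTO nodes VALUES (2, 'ws-b');
INSERT INTO nodes_have_labels VALUES (1, 3);
INSERT INTO nodes_have_labels VALUES (2, 2);
INSERT INTO nodes_have_labels VALUES (2, 3);
""")


def labels_of(hostname):
    """Return the label names attached to a node, sorted alphabetically."""
    rows = conn.execute(
        """
        SELECT labels.name
        FROM nodes
        JOIN nodes_have_labels ON nodes_have_labels.nodes_idnode = nodes.idnode
        JOIN labels ON nodes_have_labels.labels_idlabel = labels.idlabel
        WHERE nodes.hostname = ?
        ORDER BY labels.name
        """,
        (hostname,),
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    print(labels_of("ws-b"))
```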

E. ComputeMode boot process

The figure below describes all the network communications occurring when a node boots diskless in ComputeMode.

**Figure:** Network communications during the boot process

F. Updating Debian ComputeMode Images

F.1 Initrd Images

If you wish to upgrade a network driver or alter the initrd image by yourself:

Extract the contents of an initrd image

mkdir /var/www/bootdirectory/images/temp
cd /var/www/bootdirectory/images/temp
gzip -dc ../debiaufs | cpio -id

Fetch the drivers to update. For most vendor systems they should be Broadcom (tg3) or Intel (e1000) (as of today):
http://www.broadcom.com/support/ethernet_nic/driver-sla.php?driver=570x-Linux
http://downloadcenter.intel.com/Detail_Desc.aspx?ProductID=838&DwnldID=9180
Compile those drivers for the kernel you want to use - you may have to read the packages' instructions. You will get a tg3.o and an e1000.o (for 2.4 - for 2.6 they will have the .ko extension)
Copy those drivers into:
/var/www/bootdirectory/images/temp/lib/modules/<kernel version>/kernel/drivers/net/tg3.o
/var/www/bootdirectory/images/temp/lib/modules/<kernel version>/kernel/drivers/net/e1000/e1000.o
Note: <kernel version> should be replaced by the version as returned by a 'uname -r' call
Update the driver-to-PCI mappings by calling:
depmod -a -b /var/www/bootdirectory/images/temp/
Build the new initrd file with:
cd /var/www/bootdirectory/images/temp/
find ./ | cpio -H newc -o | gzip > /var/www/bootdirectory/images/debiaufs.NEW
Optionally, back up the old debiaufs image, then rename the .NEW one or edit the bootimage accordingly.
Try to boot a new machine with PXE

F.2 Tips & Tricks

F.2.1 Configuring the boot message shown at the end of the diskless boot

Just edit the file located at : /cm/debian/patch/var/diskless/boot.msg

You can use ANSI colors and plain text.

F.2.2 Changing the alt-ctl-del behavior

As a default, in the computing distribution shipped (in /cm/debian/), hitting 'alt-ctl-del' when in ComputeMode will use the end behavior specified in the bootmode (halt or reboot for instance).

For instance, if it is set to 'halt', the node will be quarantined and will stop when the schedule ends or when the user hits 'alt-ctl-del'.

To change this behavior, edit the patched etc/inittab of the distribution: look for the line reading:

ca::ctrlaltdel:/diskless/utils/diskless_exit.sh inhibit

and replace it by something such as:

ca::ctrlaltdel:/diskless/utils/diskless_exit.sh inhibit reboot

On the server, the files are located at:

/cm/debian/patch/etc/inittab

and

/cm/debian/utils/diskless_exit.sh

G. Localization (L10N, I18N)

cmwebadmin has supported both French and English language since version 1.2 through a rudimentary translation system.

In this appendix, every path specified is relative to the path where cmwebadmin is installed (by default, /var/www/cm/)

Since version 1.6, the existing system has been overhauled and now supports more standardized .po files (also known as locales) with the following constraints:

the .po files must be encoded in UTF8
the .po files must start with the UTF8 signature (the 3 bytes 0xEF 0xBB 0xBF)
the .po files must be named : <language code on 2 characters>_<country code on 2 characters>.utf8.po
Examples: en_US.utf8.po corresponds to English from the US - likewise fr_FR.utf8.po corresponds to French from France
they must be placed in ./locale/
the first line will be skipped (to simplify parsing with the UTF8 signature)
comments must start with a '#' character and their lines are skipped
empty lines are skipped
for each msgid entry, a msgstr entry must follow
multiline msgstr are not supported
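
The constraints above describe a deliberately small subset of the .po format. The following Python sketch parses that subset; it is not the parser used by cmwebadmin (which is part of the PHP interface), it does not handle escape sequences inside strings, and the function name is hypothetical.

```python
import re

# One entry per line: msgid "..." or msgstr "..." (no multiline strings).
ENTRY_RE = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')


def parse_po(text):
    """Parse the restricted .po subset described above into a dict."""
    catalog = {}
    pending_id = None
    # The first line (carrying the UTF8 signature) is always skipped.
    for line in text.splitlines()[1:]:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # empty lines and comments are ignored
        match = ENTRY_RE.match(stripped)
        if not match:
            raise ValueError(f"unsupported line: {line!r}")
        kind, value = match.groups()
        if kind == "msgid":
            if pending_id is not None:
                raise ValueError(f"msgid {pending_id!r} has no msgstr")
            pending_id = value
        else:
            if pending_id is None:
                raise ValueError("msgstr without a preceding msgid")
            catalog[pending_id] = value
            pending_id = None
    if pending_id is not None:
        raise ValueError(f"msgid {pending_id!r} has no msgstr")
    return catalog


# Invented sample content, for illustration only.
SAMPLE = '\ufeff# first line, skipped\n# a comment\n\nmsgid "Nodes"\nmsgstr "Noeuds"\n'

if __name__ == "__main__":
    print(parse_po(SAMPLE))
```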

Those files are parsed the first time they are used and the parsed copy is stored in templates_c. Each time they have to be used, the file modification time of the parsed copy and the original copy are compared. If the .po file is newer than the parsed copy, the latter is updated.
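
The refresh rule described above boils down to a comparison of modification times. A minimal Python sketch of that test is given below; the function name is hypothetical, and treating a missing parsed copy as stale corresponds to the "parsed the first time they are used" case.

```python
import os


def needs_reparse(po_path, parsed_path):
    """True when the parsed copy is missing or older than the .po file."""
    if not os.path.exists(parsed_path):
        return True
    return os.path.getmtime(po_path) > os.path.getmtime(parsed_path)
```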

For information, the parsed files are stored as serialized PHP objects whose names look like:

./templates_c/l10n.<language code on 2 characters>_<country code on 2 characters>.ser

To add a new translation, just add a file in the ./locale/ directory. If you have some translations missing, the en_US (defined in config.php) flavor will be used, then the keyword (msgid) itself if nothing else is found.
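
The fallback order described above (requested locale, then en_US, then the msgid itself) can be sketched in Python as follows. The catalogs are plain dictionaries invented for the example, the function name is hypothetical, and treating an empty translation as missing is an assumption of this sketch.

```python
def translate(msgid, locale, catalogs, default_locale="en_US"):
    """Look msgid up in 'locale', then in the default locale, else return it."""
    for name in (locale, default_locale):
        text = catalogs.get(name, {}).get(msgid)
        if text:  # assumption: an empty string counts as "not translated"
            return text
    return msgid


# Invented sample catalogs.
CATALOGS = {
    "en_US": {"nodes_title": "Nodes", "labels_title": "Labels"},
    "fr_FR": {"nodes_title": "Noeuds"},
}

if __name__ == "__main__":
    print(translate("nodes_title", "fr_FR", CATALOGS))   # Noeuds
    print(translate("labels_title", "fr_FR", CATALOGS))  # Labels
    print(translate("unknown_key", "fr_FR", CATALOGS))   # unknown_key
```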

About this document ...

This document was generated using the LaTeX2HTML translator Version 2008 (1.71)

The command line arguments were:
latex2html -dir ./build -split 0 -show_section_numbers -html_version 3.2 -no_navigation manual.tex

The translation was initiated by genevois on 2012-02-16


Option name	Mandatory	Sent at boot time (PXE)	Notes & examples

MASTER	YES	YES	Hostname of the ComputeMode server in the CM subnet. Syntax: numerical IPv4 or hostname Default: 172.28.255.253 Note: you can use a literal name, but as there may be DNS resolution issues, prefer an IP if possible
NFSROOT	YES	YES	ComputeMode root to use for the diskless boot Syntax: nfsserver:/cm/distribution Default: 172.28.255.253:/cm/debian Note: you can use a literal name, but as there may be DNS resolution issues, prefer an IP if possible
AUTOSTART	no	no	Specify which commands to start automatically after the distribution has finished booting. Syntax: user1:cmdfilename1+user2:cmdfilename2 meaning: as user1, launch cmdfilename1, then as user2, launch cmdfilename2. Paths of cmdfilename[12] must be absolute. Default: root:/var/lib/oar/cm/oar_start.sh (starts OAR)
AUTOSTOP	no	no	Specify which commands to run automatically before the node starts shutting down Syntax: same syntax as AUTOSTART just above Default: root:/var/lib/oar/cm/oar_stop.sh (stops OAR)
EXIT	no	no	Specify which exit method to use (another way is to set it in /var/diskless/exit) Syntax: halt or reboot or no Default: reboot Note: Only these 3 keywords are supported. 'no' means do not exit.
NFSHOME	no	no	Specify how to mount the /home directory (may be done using the MOUNTS parameter too) Syntax: some.host:/some/dir Default: cmserver:/home
DDNS	no	YES	Should Dynamic DNS be activated. Syntax: yes or no Default: yes
DEBUG	no	YES	Activate debug mode. Value corresponds to the linuxrc breakpoint to stop at. Syntax: positive integer, 0 means stop at all breakpoints. Default: not set
DHCPNOKILL	no	YES	Do not kill the initrd DHCP client (workaround to ensure the initrd and the distribution do not get two different leases from two different DHCP servers, which would break everything). The drawback is that the initrd will not be unmounted and its memory will not be freed. Syntax: yes or no Default: not set Note: not much tested, use with care
DHCPPORT	no	YES	DHCP port that dhclient must use. Syntax: positive integer Note: not much tested, use with care
DHCPREJECT	no	YES	Should the dhcp configuration reject offers from some dhcp servers. Syntax: x.y.z.t+a.b.c.d where x.y.z.t and a.b.c.d are IPv4 addresses
DHCPSERVER	no	YES	Force the use of a DHCP server. Please prefer the DHCPREJECT option. Syntax: x.y.z.t where x.y.z.t is an IPv4 address.
IP	no	YES	Used to setup network manually. Value may be append which means let PXElinux give the information to the initrd. Other possible values are: ipaddr:tftpserver:gateway:netmask You may use the NS parameter to specify the DNS configuration. Note: not much tested, use with care
MODULES	no	no	Specify which modules should be loaded upon distribution startup (another way is to edit the /etc/modules file directly). Syntax: module1+module2
MOUNTS	no	no	Specify what the distribution should mount during startup (another way is to modify the fstab of the distribution) Syntax: host1:/exp/dir%/mnt/dir1%nfs%ro,hard,intr+host2:/exp/dir%/mnt/dir2%nfs%ro,hard,intr
NS	no	YES	Specify a DNS server and a domain name if the network configuration is not done with DHCP. Syntax: dns1,dns2:domain1,domain2 Note: not much tested, use with care
NTP	no	no	Specify the NTP (time) server to use (another way is to set it in /var/diskless/ntp) Syntax: IPv4 address, hostname Default: not set
PULL	no	no	Specify the pull method that the node will use to notify and get information from the server (another way is to put it in the /var/diskless/pull file directly) Note: not much tested, use with care
USERS	no	no	Specify users which should be created on the system (another way is to directly modify /etc/passwd) Syntax: user1:uid1+user2:uid2 Note: not much tested, use with care
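
Several options in the table above share the same '+'-separated syntax. As an illustration, the Python sketch below splits AUTOSTART/AUTOSTOP and MOUNTS values into their components. It mirrors the documented syntax only; it is not the code run by the CM boot scripts, and the function names and dictionary keys are hypothetical.

```python
def parse_autostart(value):
    """'user1:cmd1+user2:cmd2' -> [(user1, cmd1), (user2, cmd2)]."""
    commands = []
    for item in value.split("+"):
        user, sep, path = item.partition(":")
        # Command paths must be absolute, as stated in the table.
        if not sep or not user or not path.startswith("/"):
            raise ValueError(f"bad AUTOSTART/AUTOSTOP item: {item!r}")
        commands.append((user, path))
    return commands


def parse_mounts(value):
    """'host:/exp%/mnt%nfs%ro,hard,intr+...' -> list of dicts."""
    mounts = []
    for item in value.split("+"):
        fields = item.split("%")
        if len(fields) != 4:
            raise ValueError(f"bad MOUNTS item: {item!r}")
        source, target, fstype, options = fields
        mounts.append({
            "source": source,
            "target": target,
            "fstype": fstype,
            "options": options.split(","),
        })
    return mounts


if __name__ == "__main__":
    print(parse_autostart("root:/var/lib/oar/cm/oar_start.sh"))
    print(parse_mounts("host1:/exp/dir%/mnt/dir1%nfs%ro,hard,intr"))
```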
