Clustor generates jobs from the parameter combinations specified in a plan script. This study uses four types of Clustor parameters, which specify:

  1. Selection method
    In addition to the five selection methods mentioned previously, this case study examined six other methods whose results are not presented, mainly because they failed to converge to models with reasonably good predictive performance even after 50 variables had been included in the model. In total, there are therefore 11 selection methods.

  2. Data file
    The data for this case study are randomly sampled into 10 train-test data sets. In addition, one train-test data set is formed using data from the consecutive years 1950-1987 as the training data and 1987-1994 as the test data. The latter data set was used to build the benchmark forecasting models SHIFOR and SHIFOR94, against which the predictive performance of the model built in this case study is compared. In total, there are therefore 11 data sets, each stored in a different data file.

  3. Regression coefficient calculation
    In these experiments, the coefficients of a set of variables can be calculated using one of two methods: Jacobian Transformation or Gaussian Elimination. The cost of a model, especially one whose variables are highly correlated, can differ depending on which method is used to find the coefficients. This difference can in turn cause the search algorithm to follow different paths in its pursuit of an optimum model.

  4. MML model parameter estimation
    This parameter distinguishes how a model's variable coefficients and standard deviation are calculated under MML from how they are calculated under the other methods. The distinction is needed simply because these quantities are derived differently from the initial formula in the MML method. Hence, this Clustor parameter has 2 values.
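
To make the size of the parameter space concrete, the following Python sketch enumerates the combinations of the four parameters described above. It assumes that the plan script takes the full cross product of all parameter values, which the text does not state explicitly; if some combinations are excluded, the count is an upper bound. All value names (method_1, dataset_1.dat, and so on) are hypothetical placeholders, not the names used in the actual plan file.

```python
from itertools import product

# Hypothetical placeholder values; the real names are defined in the plan file.
selection_methods = [f"method_{i}" for i in range(1, 12)]   # 11 selection methods
data_files = [f"dataset_{i}.dat" for i in range(1, 12)]     # 10 random splits + 1 consecutive-year split
coefficient_methods = ["jacobian", "gaussian"]              # 2 coefficient calculation methods
mml_estimation = [0, 1]                                     # 2 values: MML-style estimation off/on


def enumerate_jobs():
    """Return one parameter dictionary per combination (full cross product assumed)."""
    return [
        {"method": m, "data": d, "coef": c, "mml": e}
        for m, d, c, e in product(
            selection_methods, data_files, coefficient_methods, mml_estimation
        )
    ]


jobs = enumerate_jobs()
print(len(jobs))  # 11 * 11 * 2 * 2 = 484
```

Under this assumption, a single plan file would expand into 484 jobs, which is why the concurrency limits in the option files matter.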

Three Clustor script files must be created before the simulations are submitted to Clustor:
  1. Plan file (example: complete.pln). This file contains all the parameters and the command lines that Clustor parameterizes to generate jobs at run time.
  2. Clustor option file (example: clustor.options). This file sets, among other things, the limit on the number of jobs a user can run concurrently on a node, which prevents any single user from dominating Clustor at any given time.
  3. Root option file (example: root.options). This file sets the number of concurrent node activations.