Model Run Cycle

Model run cycle overview

Model run (execution of the model) consists of the following steps:

  • initializing model process(es) with model run options
  • connecting to the database and creating a "model run" with run_id and run_name
  • finding the set of input parameters and preparing it for the run
  • reading model input parameters
  • simulating sub-values
  • writing output sub-values to output tables in the database
  • aggregating sub-values using Output Expressions

Results of a model run are stored in the database under a unique integer "run_id" and include all model parameters, options and output result tables. You can always find the full set of model input and output by run id.

OpenM++ models can be run on Windows and Linux platforms, on a single desktop computer, on multiple computers over a network, in an HPC cluster or in a cloud environment (Google Cloud, Microsoft Azure, Amazon,...). Because the openM++ runtime library hides all that complexity from the model, we can safely assume the model is a single executable on the local machine. Please check Model Run: How to Run the Model for more details.

Sub-values: sub-samples, members, replicas

The terms "simulation member", "replica" and "sub-sample" are often used interchangeably in micro-simulation conversations, depending on context. To avoid terminology discussions openM++ uses "sub-value" as an equivalent of all of the above; some older pages of this wiki may still say "sub-sample" instead.

Model output tables: sub-values, accumulators and expressions

There are two kinds of model output tables:

  • accumulators table: output sub-values (similar to Modgen sub-samples)
  • expressions table: model output value calculated as accumulators aggregated across sub-values (e.g. mean or CV or SE)

All output accumulator tables always contain the same number of sub-values. For example, the model run:

model.exe -OpenM.Subvalues 16

will create 16 sub-values for each accumulator in each output accumulator table.

Model parameters: sub-values (optional)

OpenM++ parameters can also contain sub-values. Parameter sub-values are not required; it is the user's choice to run the model and supply sub-values for some parameters.

For example, if the user wants to describe the statistical uncertainty of parameter SalaryByYearByProvince then a csv file with 16 sub-values can be supplied to run the model:

model.exe -OpenM.Subvalues 16 -Subvalue.SalaryByYearByProvince csv -OpenM.ParamDir C:\MyCsv\

Note: To simplify the diagram below, sub-values are omitted from the picture. In the real database there are multiple sub-values for parameters and accumulators; each sub-value is identified by the sub_id column.

How model finds input parameters: Parameters search order

[Figure: OpenM++ Model run: Input and Output]

The model searches for input parameter values in the following order:

  • use parameter value specified as command line argument
  • use parameter value specified inside of ini-file [Parameter] section
  • use parameter value from profile_option table
  • read parameter.csv file from "OpenM.ParamDir" directory
  • use parameter value from the set of input parameters in database: workset or default workset
  • use same value as in previous model run: use base run of workset
  • some parameters, e.g. number of sub-values, may have default values

In any case all input parameters are copied under the new run id before simulation starts. Copying the parameters is required to guarantee that the database always has a full copy of the input parameters for each particular model run.
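The search order above can be read as a first-match lookup over an ordered list of sources. The Python sketch below is only an illustration of that idea: the function name find_parameter, the source labels and the dictionary contents are hypothetical and are not part of openM++.

```python
# Illustrative sketch only, not the openM++ implementation.
# Each source is a (label, values) pair, listed from highest to lowest priority.


def find_parameter(name, sources):
    """Return (label, value) from the first source which contains the parameter."""
    for label, values in sources:
        if name in values:
            return label, values[name]
    raise KeyError(f"parameter not found in any source: {name}")


# hypothetical content of each source, in the same order as the search order
sources = [
    ("command line", {"RandomSeed": 123}),
    ("ini-file", {"RandomSeed": 456, "SomeInt": 1234}),
    ("profile_option", {}),
    ("csv file", {"Sex": [True, False]}),
    ("workset", {"Sex": [False, True]}),
    ("base run", {"Provinces": ["ON", "QC"]}),
]

for name in ("RandomSeed", "SomeInt", "Sex", "Provinces"):
    print(name, find_parameter(name, sources))
```

In this sketch RandomSeed is taken from the command line even though the ini-file also defines it, and Provinces falls through to the base run because no earlier source supplies it.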

Model run options

There are many options which control the model run, e.g. number of sub-values, number of threads, etc. An OpenM++ model gets run options in the following order:

  • as command line arguments
  • from model run options ini-file
  • from database run_option and profile_option tables
  • use default values

Each option has a unique key associated with it, e.g. "Parameter.RandomSeed" is the model input parameter "RandomSeed", which is most likely the starting seed of the random number generator. You can use this key to specify the model parameter on the command line, in an ini-file or in the database. For example:

modelOne.exe -Parameter.RandomSeed 123 -ini my.ini

would run the modelOne model with random seed = 123 and other options from the my.ini file.
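When the model is launched from a script, the "-key value" pairs can be assembled from a dictionary of option keys. The following Python sketch assumes a hypothetical helper make_model_args, which is not part of openM++; it only builds and prints the argument list.

```python
# Illustrative sketch: build an argument list from option key / value pairs.
# make_model_args is a hypothetical helper, not part of openM++.


def make_model_args(exe, options):
    """Return argument list: executable followed by "-key value" for each option."""
    args = [exe]
    for key, value in options.items():
        args += ["-" + key, str(value)]
    return args


args = make_model_args("modelOne.exe", {"Parameter.RandomSeed": 123, "ini": "my.ini"})
print(" ".join(args))
```

The resulting list could be passed to subprocess.run(args, check=True) to start the model, assuming modelOne.exe is available in the current directory.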

Please see OpenM++ Model Run Options to find out more.

Workset: Set of model input parameters in database

The database can contain multiple versions of model input parameter values. The user can edit (change values of) input parameter(s) and save them as a "working set of model input parameters" (a.k.a. "workset").

  • each set of parameters has unique "set id" and unique "set name"
  • each model must have at least one full set of input parameters populated with default values (default set)
  • default set always has the minimal set id for a particular model
  • default set usually has the same name as the model

Base Run: Parameters from previous model run

Most model parameters do not change between simulations, and a workset usually contains only a small subset of model input. In that case the workset can be defined as "based on previous model run" (a.k.a. "base run"), and all parameters which do not exist in the workset will be selected from existing model run results by base_run_id.

How model finds input parameters: Default

If the user runs the model without any arguments:

modelOne.exe

then input parameters are selected from the default set, which is the first input data set of that model.

How model finds input parameters: Input set name or Id

To run the model with input data other than the default, the user can specify a set id or a workset name:

modelOne.exe -OpenM.SetId 20
modelOne.exe -OpenM.SetName MyParametersSet

assuming workset with set_id = 20 and name MyParametersSet exists in model database.

How model finds input parameters: Value as command line argument

It is also possible to specify the value of any scalar parameter as a command line argument, e.g.:

model.exe -Parameter.RandomSeed 123

There is an example of this technique on the Run model from R: simple loop over model parameter page, where we use the NewCaseBased model to study the effect of the Mortality Hazard input parameter on the Duration of Life output:

for (mortalityValue from 0.014 to 0.109 by step 0.005)
{
  # run the model
  NewCaseBased.exe -Parameter.MortalityHazard mortalityValue
}
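The loop above is schematic. As a rough illustration of the same idea, the Python sketch below generates one command line per MortalityHazard value; the function name mortality_commands is hypothetical, and the commands are only printed, not executed.

```python
# Sketch: generate one command line per MortalityHazard value, 0.014 to 0.109 by 0.005.
# Integer steps (in thousandths) are used to avoid accumulating floating point error.


def mortality_commands(start_milli=14, stop_milli=109, step_milli=5):
    """Return list of argument lists, one for each mortality hazard value."""
    commands = []
    for m in range(start_milli, stop_milli + 1, step_milli):
        value = m / 1000
        commands.append(
            ["NewCaseBased.exe", "-Parameter.MortalityHazard", f"{value:.3f}"]
        )
    return commands


commands = mortality_commands()
for cmd in commands:
    # replace print with subprocess.run(cmd, check=True) to actually run the model
    print(" ".join(cmd))
```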

How model finds input parameters: Sub-values [0, N-1] as command line argument

If we want to run the model with multiple sub-values (a.k.a. sub-samples) and want sub-values of the "Some" parameter to be created as [0, N-1] then:

model.exe -OpenM.Subvalues 16 -Subvalue.Some iota

As a result the sub-values of parameter Some would be: [0, ..., 15]

How model finds input parameters: Value inside of ini-file

Any scalar parameter can also be defined in the model ini-file, e.g.:

model.exe -ini my.ini
; inside of my.ini file:
;
[Parameter]
Z_Parameter = B        ; string parameter
SomeInt     = 1234     ; integer parameter
OrLogical   = true     ; boolean parameter
Anumber     = 9.876e5  ; float parameter
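To check what such a [Parameter] section contains before running the model, it can be read with a general-purpose ini parser. The sketch below uses Python's standard configparser purely as an illustration; openM++ has its own ini-file reader, which may differ in details such as comment and quoting rules.

```python
import configparser

# same [Parameter] section as in the my.ini listing
ini_text = """
[Parameter]
Z_Parameter = B        ; string parameter
SomeInt     = 1234     ; integer parameter
OrLogical   = true     ; boolean parameter
Anumber     = 9.876e5  ; float parameter
"""

# treat "; ..." after a value as a comment and keep parameter names case as written
parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
parser.optionxform = str
parser.read_string(ini_text)

params = parser["Parameter"]
print(params["Z_Parameter"])           # string value
print(params.getint("SomeInt"))        # integer value
print(params.getboolean("OrLogical"))  # boolean value
print(params.getfloat("Anumber"))      # float value
```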

How model finds input parameters: Value in model profile

Another way to supply value of scalar parameter(s) is through profile_option database table. For example:

model.exe -OpenM.SetId 20 -OpenM.OptionsProfile MyProfile
SELECT * FROM profile_lst;

profile_name
------------
MyProfile

SELECT * FROM profile_option;

profile_name  option_key             option_value
------------- ---------------------- ------------
MyProfile     Parameter.RandomSeed   4095

How model finds input parameters: Csv file

It is also possible to supply some (or even all) model parameters as csv-file(s). For example:

model.exe -OpenM.ParamDir C:\my_csv

If the directory C:\my_csv\ exists and contains a parameterName.csv file, the model will use parameter values from that file. The parameter directory can be specified as a command-line argument or as an ini-file entry (using the profile_option table to specify OpenM.ParamDir is not recommended).

On the picture above the model is run as:

model.exe -ini my.ini -OpenM.SetId 20

and my.ini file contains:

[OpenM]
ParamDir = C:\my_csv\

As a result model.exe will read values of the "Sex" parameter from C:\my_csv\Sex.csv:

sub_id,dim0,param_value
0,     F,   true
0,     M,   false

It is also possible to use enum ids in csv files instead of codes, for example C:\my_csv\Sex.csv can be:

sub_id,dim0,param_value
0,     0,   true
0,     1,   false

To use such csv files you need to run the model with the -OpenM.IdCsv true argument:

model.exe -OpenM.SetId 20 -OpenM.IdCsv true

The format of parameter.csv is based on RFC 4180 with some simplifications:

  • space-only lines are silently ignored
  • end of line can be CRLF or LF
  • values are trimmed unless they are " double quoted "
  • multi-line string values are not supported

If the parameter is boolean then the following values are expected (not case sensitive):

  • "true" or "t" or "1"
  • "false" or "f" or "0"

Important: Header line must include all dimension names, in ascending order, without spaces, e.g.: sub_id,dim0,dim1,dim2,dim3,param_value.

The parameter.csv file must contain all values, e.g. if the parameter has 123456 values then the csv must have all 123456 lines + header. The sort order of lines is not important.
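The rules above can be illustrated with a small reader written against Python's standard csv module. This is a sketch under stated assumptions, not the openM++ csv reader: the function names parse_bool and read_parameter_csv are hypothetical, and for simplicity the sketch trims every value, including quoted ones.

```python
import csv
import io

TRUE_VALUES = {"true", "t", "1"}
FALSE_VALUES = {"false", "f", "0"}


def parse_bool(text):
    """Convert a csv cell into a boolean, not case sensitive."""
    value = text.strip().lower()
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"invalid boolean value: {text!r}")


def read_parameter_csv(text):
    """Return (header, rows) where each row is a dict of trimmed cell values."""
    header = None
    rows = []
    for cells in csv.reader(io.StringIO(text)):
        cells = [c.strip() for c in cells]  # this sketch trims every value
        if len(cells) <= 1 and not any(cells):
            continue  # skip empty and space-only lines
        if header is None:
            header = cells  # first non-empty line is the header
        else:
            rows.append(dict(zip(header, cells)))
    return header, rows


sex_csv = """sub_id,dim0,param_value
0,     F,   true
0,     M,   false
"""

header, rows = read_parameter_csv(sex_csv)
sex = {row["dim0"]: parse_bool(row["param_value"]) for row in rows}
print(header)
print(sex)
```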

Csv file with multiple sub-values

If the user wants to supply up to 32 sub-values of the "Sex" parameter then the csv file looks like:

sub_id,dim0,param_value
0,     F,   true
0,     M,   false
1,     F,   true
1,     M,   true
.................
31,    F,   false
31,    M,   true

Important: Presence of multiple sub-values in the csv file (or in the database) does not mean the model will use all parameter sub-values. Only explicitly specified parameter(s) receive sub-values.

For example, if the user runs the model 3 times:

model.exe -OpenM.Subvalues 16
model.exe -OpenM.Subvalues 16 -OpenM.ParamDir C:\my_csv
model.exe -OpenM.Subvalues 16 -OpenM.ParamDir C:\my_csv -Subvalue.Sex csv
  1. "Sex" parameter is expected to be in the database and no sub-values are used
  2. "Sex" parameter value is sub-value 0 from C:\my_csv\Sex.csv
  3. "Sex" parameter uses sub-values [0, 15] from C:\my_csv\Sex.csv

Important: The number of sub-values in the csv must be at least the number the user requested. In the example above Sex.csv contains 32 sub-values and the user runs the model with the first 16 of them.
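Before a long model run it can be useful to verify that a csv file holds enough sub-values. The Python sketch below counts distinct sub_id values; the function names are hypothetical, the csv content is made up for illustration, and the check is an assumption about what "enough" means rather than the exact validation openM++ performs.

```python
import csv
import io


def count_sub_values(csv_text):
    """Count distinct sub_id values in parameter csv text."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return len({int(row["sub_id"]) for row in reader})


def has_enough_sub_values(csv_text, required):
    """Return True if the csv contains at least the required number of sub-values."""
    return count_sub_values(csv_text) >= required


# hypothetical Sex.csv content with 32 sub-values, parameter values are made up
lines = ["sub_id,dim0,param_value"]
for sub_id in range(32):
    lines.append(f"{sub_id},     F,   true")
    lines.append(f"{sub_id},     M,   false")
sex_csv = "\n".join(lines) + "\n"

print(count_sub_values(sex_csv))           # number of distinct sub_id values
print(has_enough_sub_values(sex_csv, 16))  # enough for -OpenM.Subvalues 16
print(has_enough_sub_values(sex_csv, 64))  # not enough for 64 sub-values
```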

How model finds input parameters: Value from previous model run (base run)

Most model parameters do not change between simulations, and a working set usually contains only a small subset of model input. In that case the workset can be defined as "based on existing run", and all parameters which do not exist in that workset will be selected from previous model run results by run_id.

On the picture above the command line to run the model is:

model.exe -ini my.ini -OpenM.SetId 20

and the input set with id 20 is defined as "based on run" with id = 11:

SELECT set_id, set_name, base_run_id FROM workset_lst WHERE set_id = 20;

set_id set_name            base_run_id
------ ------------------- -----------
    20 set_based_on_run_11     11

Because the workset with id = 20 does not include the "Provinces" input parameter, those values are selected from the existing model run by run_id = 11. As a result the model will use the following values for "Provinces":

SELECT dim0, param_value FROM Provinces WHERE run_id = 11;

dim0 param_value
---- -----------
   0          ON
   1          QC

Note: the sql above is deliberately simplified; actual database table names, column names and queries are a bit more complex.
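The two-step lookup (find base_run_id of the workset, then select parameter values by that run id) can be tried out on an in-memory SQLite database. The Python sketch below uses the same simplified tables as the listings above, so it is an illustration only and does not reflect the actual openM++ database schema; the function name provinces_from_base_run is hypothetical.

```python
import sqlite3

# Simplified tables and rows copied from the listings; not the real openM++ schema.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE workset_lst (set_id INT PRIMARY KEY, set_name TEXT, base_run_id INT);
    CREATE TABLE Provinces (run_id INT, dim0 INT, param_value TEXT);
    INSERT INTO workset_lst VALUES (20, 'set_based_on_run_11', 11);
    INSERT INTO Provinces VALUES (11, 0, 'ON');
    INSERT INTO Provinces VALUES (11, 1, 'QC');
    """
)


def provinces_from_base_run(con, set_id):
    """Select Provinces values from the base run of the given workset."""
    (base_run_id,) = con.execute(
        "SELECT base_run_id FROM workset_lst WHERE set_id = ?", (set_id,)
    ).fetchone()
    rows = con.execute(
        "SELECT dim0, param_value FROM Provinces WHERE run_id = ? ORDER BY dim0",
        (base_run_id,),
    )
    return rows.fetchall()


print(provinces_from_base_run(con, 20))
```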

How model finds input parameters: Sub-values from database

If we want to run the model with multiple sub-values (a.k.a. sub-samples) and want "RatioByProvince" parameter sub-values to be selected from the database:

model.exe -OpenM.Subvalues 16 -Subvalue.RatioByProvince db

The model will select "RatioByProvince" parameter sub-values from the default workset or from the base run, if there is no RatioByProvince parameter in the workset. The database must contain at least 16 sub-values for "RatioByProvince".

For example:

SELECT sub_id, dim0, param_value FROM RatioByProvince WHERE run_id = 11;

sub_id dim0 param_value
------ ---- -----------
     0    0        1.00
     0    1        1.01
     1    0        1.02
     1    1        1.03
     2    0        1.04
     2    1        1.05
     ..................
    31    0        1.31
    31    1        1.32

In that case the first 16 sub-values, with sub_id between 0 and 15, will be selected.
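The selection of the first N sub-values can be sketched with an in-memory SQLite database. As with the sql above, the table is simplified and the parameter values are generated only for illustration; the function name select_sub_values is hypothetical and this is not the query openM++ actually runs.

```python
import sqlite3

# Simplified table, not the real openM++ schema; values are made up for illustration.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE RatioByProvince (run_id INT, sub_id INT, dim0 INT, param_value REAL)"
)
rows = [
    (11, sub_id, dim0, 1.0 + 0.01 * (2 * sub_id + dim0))
    for sub_id in range(32)  # 32 sub-values stored for run_id = 11
    for dim0 in range(2)     # two provinces in each sub-value
]
con.executemany("INSERT INTO RatioByProvince VALUES (?, ?, ?, ?)", rows)


def select_sub_values(con, run_id, n_sub):
    """Select the first n_sub sub-values, fail if the database has fewer than n_sub."""
    count = con.execute(
        "SELECT COUNT(DISTINCT sub_id) FROM RatioByProvince WHERE run_id = ?",
        (run_id,),
    ).fetchone()[0]
    if count < n_sub:
        raise ValueError(f"only {count} sub-values in database, {n_sub} required")
    cur = con.execute(
        "SELECT sub_id, dim0, param_value FROM RatioByProvince"
        " WHERE run_id = ? AND sub_id BETWEEN 0 AND ?"
        " ORDER BY sub_id, dim0",
        (run_id, n_sub - 1),
    )
    return cur.fetchall()


selected = select_sub_values(con, 11, 16)
print(len(selected))  # 16 sub-values with 2 rows each
```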
