@crkrenn
Last active June 16, 2020 16:21
Title

Place 1

Hello, this is some text to fill in this, here, is a link to the second place.

Place 2

Place one has the fun times of linking here, but I can also link back here.

Place's 3: other example

Place one has the fun times of linking here,

Work in progress:

See: WIP. Thanks!

  1. moving sample scripts to subdirectory
  2. support pgen and non-pgen sampling
  3. rename pgen_sample to scisample

Multidimensional Sampling

This repo contains a pgen script for Maestro that implements several multidimensional sampling methods. The methods are driven by a simple YAML specification placed in the env/variables section of any Maestro specification. This YAML specification is a nested dictionary named SAMPLE_DICTIONARY, and this repo contains several examples.

The pgen script is executed as follows:

maestro run [-y] --pgen ./pgen_sample.py {sample_list.yaml}

where [-y] is an optional auto-run option, and {sample_list.yaml} is a Maestro specification containing a SAMPLE_DICTIONARY.

pgen_sample.py currently supports four sampling modes:

  1. list: see sample_list.yaml
  2. column_list: see sample_column_list.yaml
  3. cross_product: see sample_cross_product.yaml
  4. best_candidate: see sample_best_candidate.yaml

Note: the best_candidate sampling mode currently requires that Maestro be installed from scratch in a virtual environment (see Maestro install documentation) and that the pandas, scipy, and scikit-learn (sklearn) packages be installed in that environment as well. Please see the "Installing the necessary python packages" section below.

The List Mode

The list mode requires two items to be defined in SAMPLE_DICTIONARY:

  1. sample_type must equal list
  2. constants or parameters must contain at least one variable and one value.

The following sample dictionary

SAMPLE_DICTIONARY:
    sample_type: list
    constants:
        X3: 20
    parameters:
        X1: [ 5, 10 ]
        X2: [ 5, 10 ]

is equivalent to the following Maestro global.parameters block:

global.parameters:
    X1:
        values: [5, 10]
        label: X1.%%
    X2:
        values: [5, 10]
        label: X2.%%
    X3:
        values: [20, 20]
        label: X3.%%
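
The expansion above can be sketched in a few lines of Python. This is a minimal illustration, not the code in pgen_sample.py: the function names expand_list and to_global_parameters are hypothetical, and the sketch assumes that all parameter lists have the same length and that each constant is repeated once per sample.

```python
def expand_list(sample_dictionary):
    """Expand a list-mode SAMPLE_DICTIONARY into {name: [values, ...]}.

    Hypothetical helper for illustration; not part of pgen_sample.py.
    """
    parameters = sample_dictionary.get("parameters", {})
    constants = sample_dictionary.get("constants", {})

    # Assumption: every parameter list must have the same length.
    lengths = {len(values) for values in parameters.values()}
    if len(lengths) > 1:
        raise ValueError("all parameter lists must have the same length")

    # With constants only, produce a single sample.
    num_samples = lengths.pop() if lengths else 1

    expanded = {name: list(values) for name, values in parameters.items()}
    for name, value in constants.items():
        # A constant takes the same value in every sample.
        expanded[name] = [value] * num_samples
    return expanded


def to_global_parameters(expanded):
    """Shape the expanded values like a Maestro global.parameters block."""
    return {
        name: {"values": values, "label": f"{name}.%%"}
        for name, values in expanded.items()
    }


sample_dictionary = {
    "sample_type": "list",
    "constants": {"X3": 20},
    "parameters": {"X1": [5, 10], "X2": [5, 10]},
}
block = to_global_parameters(expand_list(sample_dictionary))
```

Here block["X3"]["values"] holds one copy of the constant per sample, matching the X3 entry shown above.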

The Column List Mode

The column_list mode requires two items to be defined in SAMPLE_DICTIONARY:

  1. sample_type must equal column_list
  2. constants or parameters must contain at least one variable and one value.

The following sample dictionary

SAMPLE_DICTIONARY:
    sample_type: column_list
    constants:
        X3: 20
    parameters: |
        X1  X2
        5   5
        10  10

is also equivalent to the following Maestro global.parameters block:

global.parameters:
    X1:
        values: [5, 10]
        label: X1.%%
    X2:
        values: [5, 10]
        label: X2.%%
    X3:
        values: [20, 20]
        label: X3.%%
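
One way to read the whitespace-separated parameters block is sketched below. The names expand_column_list and parse_token are hypothetical, and the conversion of tokens to int or float is an assumption made for illustration; the actual script may treat values differently.

```python
def parse_token(token):
    """Convert a token to int or float when possible, else keep the string."""
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def expand_column_list(sample_dictionary):
    """Expand a column_list-mode SAMPLE_DICTIONARY into {name: [values, ...]}.

    Hypothetical helper for illustration; not part of pgen_sample.py.
    """
    text = sample_dictionary.get("parameters", "")
    constants = sample_dictionary.get("constants", {})

    # The first non-empty line names the columns; each later line is a sample.
    rows = [line.split() for line in text.splitlines() if line.strip()]
    header, data = (rows[0], rows[1:]) if rows else ([], [])

    expanded = {name: [] for name in header}
    for row in data:
        if len(row) != len(header):
            raise ValueError(f"expected {len(header)} columns, got {len(row)}")
        for name, token in zip(header, row):
            expanded[name].append(parse_token(token))

    # With constants only, produce a single sample.
    num_samples = len(data) if header else 1
    for name, value in constants.items():
        expanded[name] = [value] * num_samples
    return expanded


sample_dictionary = {
    "sample_type": "column_list",
    "constants": {"X3": 20},
    "parameters": "X1  X2\n5   5\n10  10\n",
}
expanded = expand_column_list(sample_dictionary)
```

Each row of the text block becomes one sample, so the two data rows above yield two values per variable.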

The Cross Product Mode

The cross_product mode requires two items to be defined in SAMPLE_DICTIONARY:

  1. sample_type must equal cross_product
  2. constants or parameters must contain at least one variable and one value.

The following sample dictionary

SAMPLE_DICTIONARY:
    sample_type: cross_product
    constants:
        X4: 20
    parameters:
        X1: [ 5, 10 ]
        X2: [ 5, 10 ]
        X3: [ 5, 10 ]

is equivalent to the following Maestro global.parameters block:

global.parameters:
    X1:
        values: [5, 5, 5, 5, 10, 10, 10, 10]
        label: X1.%%
    X2:
        values: [5, 5, 10, 10, 5, 5, 10, 10]
        label: X2.%%
    X3:
        values: [5, 10, 5, 10, 5, 10, 5, 10]
        label: X3.%%
    X4:
        values: [20, 20, 20, 20, 20, 20, 20, 20]
        label: X4.%%
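
The ordering of the values above (the first parameter varies slowest, the last varies fastest) is the ordering that itertools.product generates. The sketch below reproduces it; the function name expand_cross_product is hypothetical and this is not necessarily how pgen_sample.py builds the product.

```python
import itertools


def expand_cross_product(sample_dictionary):
    """Expand a cross_product-mode SAMPLE_DICTIONARY into {name: [values, ...]}.

    Hypothetical helper for illustration; not part of pgen_sample.py.
    """
    parameters = sample_dictionary.get("parameters", {})
    constants = sample_dictionary.get("constants", {})

    names = list(parameters)
    # Every combination of one value per parameter; the last parameter
    # varies fastest.
    combinations = list(itertools.product(*(parameters[n] for n in names)))

    expanded = {
        name: [combo[i] for combo in combinations]
        for i, name in enumerate(names)
    }
    for name, value in constants.items():
        # A constant takes the same value in every combination.
        expanded[name] = [value] * len(combinations)
    return expanded


sample_dictionary = {
    "sample_type": "cross_product",
    "constants": {"X4": 20},
    "parameters": {"X1": [5, 10], "X2": [5, 10], "X3": [5, 10]},
}
expanded = expand_cross_product(sample_dictionary)
```

Three parameters with two values each give 2 x 2 x 2 = 8 samples, which is why every values list above has eight entries.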

The Best Candidate Mode

The best_candidate mode requires three items to be defined in SAMPLE_DICTIONARY:

  1. sample_type must equal best_candidate
  2. num_samples must contain an integer
  3. parameters must contain at least one variable and one range.

The following sample dictionary

SAMPLE_DICTIONARY:
    sample_type: best_candidate
    num_samples: 4
    # previous_samples: samples.csv # optional
    constants:
        X3: 20
    parameters:
        X1:
            min: 10
            max: 50
        X2:
            min: 10
            max: 50

will produce different results each time it is run. Below is an example of an equivalent Maestro global.parameters block:

global.parameters:
    X1:
        values: [48.70164719044195, 10.286343604507039, 22.19704244879045, 28.491627750335073]
        label: X1.%%
    X2:
        values: [29.705997207402138, 36.7811077888954, 11.310907646035941, 48.554124837450594]
        label: X2.%%
    X3:
        values: [20, 20, 20, 20]
        label: X3.%%
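
For intuition, the sketch below implements the general best-candidate idea (often attributed to Mitchell) using only the standard library: for each new sample, draw several random candidates and keep the one farthest from the samples chosen so far. It is an assumption that pgen_sample.py follows this outline; the real script relies on pandas, scipy, and sklearn, and the names best_candidate_samples, num_candidates, and seed are hypothetical.

```python
import math
import random


def best_candidate_samples(ranges, num_samples, num_candidates=10, seed=None):
    """Return {name: [values, ...]} with points spread across the given ranges.

    Hypothetical helper for illustration; not part of pgen_sample.py.
    Requires Python 3.8+ for math.dist.
    """
    rng = random.Random(seed)
    names = list(ranges)

    def random_point():
        return tuple(
            rng.uniform(ranges[n]["min"], ranges[n]["max"]) for n in names
        )

    def normalized(point):
        # Scale each coordinate to [0, 1] so wide ranges do not dominate.
        return tuple(
            (x - ranges[n]["min"])
            / ((ranges[n]["max"] - ranges[n]["min"]) or 1.0)
            for n, x in zip(names, point)
        )

    samples = [random_point()] if num_samples > 0 else []
    while len(samples) < num_samples:
        candidates = [random_point() for _ in range(num_candidates)]
        # Keep the candidate whose nearest existing sample is farthest away.
        best = max(
            candidates,
            key=lambda c: min(
                math.dist(normalized(c), normalized(s)) for s in samples
            ),
        )
        samples.append(best)

    return {name: [p[i] for p in samples] for i, name in enumerate(names)}


ranges = {"X1": {"min": 10, "max": 50}, "X2": {"min": 10, "max": 50}}
result = best_candidate_samples(ranges, num_samples=4, seed=0)
```

Without a fixed seed the output changes on every call, which mirrors the run-to-run variation described above.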

Installing the necessary python packages

The best_candidate sampling mode currently requires that Maestro be installed from scratch in a virtual environment (see Maestro install documentation) and that the pandas, scipy, and scikit-learn (sklearn) packages be installed in that environment as well. Below are the steps to take:

# set install locations:
MAESTRO_INSTALL_DIRECTORY=${HOME}/maestro_install
MAESTRO_SHEETMUSIC_DIRECTORY=${HOME}/maestro_sheetmusic_install

# Deactivate any current virtual environment (ignore the error if none is active).
deactivate 2>/dev/null || true

# Clone a fresh copy of the maestro source.
# Note: this cannot be in the maestro_sheetmusic directory.
mkdir -p ${MAESTRO_INSTALL_DIRECTORY}
cd ${MAESTRO_INSTALL_DIRECTORY}
date > maestro_install_start.txt
git clone https://github.com/LLNL/maestrowf.git 

# Set up and activate a new virtual environment.
cd maestrowf
python3 -m venv venv
source venv/bin/activate

# Install maestro.
pip install -r requirements.txt
pip install -e .

# Install pandas, scipy, and scikit-learn (imported as sklearn)
pip install pandas
pip install scipy
pip install scikit-learn

# Timestamp #2
cd ${MAESTRO_INSTALL_DIRECTORY}
date > maestro_install_end.txt

# Clone maestro sheetmusic
mkdir -p ${MAESTRO_SHEETMUSIC_DIRECTORY}
cd ${MAESTRO_SHEETMUSIC_DIRECTORY}
git clone https://github.com/LLNL/maestro_sheetmusic.git 

# Run smart_sampler_demo
cd ${MAESTRO_SHEETMUSIC_DIRECTORY}/maestro_sheetmusic/multidimensional_sampling
maestro run -y --pgen ./pgen_sample.py sample_best_candidate.yaml

# Timestamp #3
cd ${MAESTRO_INSTALL_DIRECTORY}
date > maestro_sheetmusic_run_end.txt
echo done
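
After the install steps, it can help to confirm that the packages needed by best_candidate are importable from the active virtual environment. The check below is a small sketch using only the standard library; the function name missing_packages is hypothetical. Note that scikit-learn is imported under the name sklearn.

```python
import importlib.util


def missing_packages(import_names):
    """Return the top-level import names that cannot be found."""
    return [
        name
        for name in import_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages(["pandas", "scipy", "sklearn"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All best_candidate dependencies are importable.")
```

Run it with the python from the virtual environment created above so that the check looks at the right site-packages directory.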