cdlhub/notes-on-flamingo.md

## notes-on-flamingo.md

      
Notes on Flamingo

This document is based on discussion with Mark Pinkerton on Sept. 20th, 2017.
See the notes on ktools (notes-on-ktools.md).

  
## notes-on-ktools.md

      
Notes

Core Components


eve is the event distributing utility. It outputs subsets of the events as a stream for the next module (getmodel).
Input:
- the number of partitions p to create out of the full list of events (arg. 2). If you have 100 events and p=4, then you will have 4 partitions of 25 events each.
- the 1-based index n of the partition to process (arg. 1). With n=2 in the previous example, you will process events 26 to 50.
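
The partition arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 100 events / p=4 example: the function name `partition_bounds` is hypothetical, and the contiguous split and the handling of a remainder are assumptions, since these notes do not say how eve treats event counts that do not divide evenly.

```python
from typing import Tuple


def partition_bounds(num_events: int, num_partitions: int, partition_index: int) -> Tuple[int, int]:
    """Return the 1-based, inclusive (first, last) event positions of one partition."""
    if not 1 <= partition_index <= num_partitions:
        raise ValueError("partition_index must be between 1 and num_partitions")
    base, extra = divmod(num_events, num_partitions)
    # Assumption: the first `extra` partitions take one additional event each.
    first = (partition_index - 1) * base + min(partition_index - 1, extra) + 1
    size = base + (1 if partition_index <= extra else 0)
    return first, first + size - 1


print(partition_bounds(100, 4, 2))  # (26, 50)
```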
getmodel generates a stream of effective damageability cdfs for the input stream of events. It generates cdfs from the model files footprint.bin and vulnerability.bin, and the user's exposures file which is called items.bin. getmodel streams into gulcalc or can be output to a binary file.
gulcalc performs the ground up loss sampling calculations (using Monte-Carlo) and numerical integration. The output is a stream of sampled ground up losses. This can be output to a binary file or streamed into fmcalc or summarycalc.
fmcalc performs the insured loss calculations on the ground up loss samples, mean, and total insured value. The output is a stream of insured loss samples. The result can be output to a binary file or streamed into summarycalc.
summarycalc performs a summing of sampled losses according to the user's reporting requirements. For example this might involve summing coverage losses to regional level, or policy losses to portfolio level. The output is sampled loss by event_id and summary_id, which represents a meaningful group of losses to the user.
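
To make the summarycalc step concrete, here is a minimal sketch of summing sampled losses by event_id and summary_id. It works on in-memory tuples rather than the real binary streams; the record layout, the `summarise` function and the `xref` mapping are assumptions made for illustration, not the ktools formats.

```python
from collections import defaultdict


def summarise(losses, summary_xref):
    """Sum sampled losses per (event_id, summary_id, sample index)."""
    totals = defaultdict(float)
    for event_id, coverage_id, sample_idx, loss in losses:
        # The cross reference decides which summary group a coverage belongs to.
        summary_id = summary_xref[coverage_id]
        totals[(event_id, summary_id, sample_idx)] += loss
    return dict(totals)


# Hypothetical records: (event_id, coverage_id, sample index, loss).
losses = [
    (1, 10, 1, 100.0),
    (1, 11, 1, 50.0),
    (1, 12, 1, 25.0),
    (2, 10, 1, 10.0),
]
# Hypothetical cross reference: coverages 10 and 11 roll up to summary 1, coverage 12 to summary 2.
xref = {10: 1, 11: 1, 12: 2}
print(summarise(losses, xref))
# {(1, 1, 1): 150.0, (1, 2, 1): 25.0, (2, 1, 1): 10.0}
```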

Output Components

Output components perform results analysis such as an event loss table or loss exceedance curve on the sampled output from summarycalc. The output is a results table in csv format.

eltcalc generates an event loss table from the sampled losses from summarycalc. It contains sample mean and standard deviation, and total exposed value for each event at the given summary level.
leccalc generates loss exceedance curves from the sampled losses from summarycalc. There are 8 variants of curves with small differences in the output format, but the common output fields are summary_id, return period, and loss exceedance threshold. This output is only available for models which provide an occurrence file.
pltcalc generates a period loss table from the sampled losses from summarycalc. It contains sample mean and standard deviation, and total exposed value for each event and each period (for example a year) at the given summary level. It also contains a date field, corresponding to the date of the event occurrence. This output is only available for models which provide an occurrence file.
aalcalc and aalsummary generate the average annual loss and standard deviation of loss from the sampled losses from summarycalc, for each summary_id. The output also contains total exposed value for each summary level, which is the maximum of the total exposed value across all simulated periods. This output is only available for models which provide an occurrence file.
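
The period-based outputs above depend on the occurrence file because losses have to be grouped by period before any annual metric can be computed. The sketch below shows that idea with one loss per event instead of samples. It is a simplification under stated assumptions: `annual_losses` and `exceedance_curve` are hypothetical helpers, the curve is an empirical ranking of period totals, and none of the 8 leccalc variants is reproduced.

```python
import statistics


def annual_losses(event_losses, occurrences, num_periods):
    """Total loss per period; periods without events have zero loss."""
    totals = [0.0] * num_periods
    for event_id, period in occurrences:
        totals[period - 1] += event_losses.get(event_id, 0.0)
    return totals


def exceedance_curve(period_losses):
    """(return period, loss) pairs ranked from the largest to the smallest loss."""
    n = len(period_losses)
    ranked = sorted(period_losses, reverse=True)
    return [(n / rank, loss) for rank, loss in enumerate(ranked, start=1)]


event_losses = {1: 100.0, 2: 40.0, 3: 60.0}   # hypothetical loss per event_id
occurrences = [(1, 1), (2, 1), (3, 3)]        # hypothetical (event_id, period number) pairs
totals = annual_losses(event_losses, occurrences, num_periods=4)
print(totals)                        # [140.0, 0.0, 60.0, 0.0]
print(statistics.mean(totals))       # 50.0, the average annual loss
print(exceedance_curve(totals)[:2])  # [(4.0, 140.0), (2.0, 60.0)]
```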

Files

The model static data for the core workflow are the event footprint, vulnerability, damage bin dictionary and random number file.
The user / analysis input data for the core workflow are the events, items, coverages, fm programme, fm policytc, fm profile, fm xref, fm summary xref and gul summary xref files.

static

damage_bin_dict.bin -- a reference table which defines how the effective damageability cdfs are discretized on a relative damage scale (normally between 0 and 1). Input of getmodel and gulcalc.
footprint.bin -- Event footprints. Input of getmodel.
footprint.idx -- Index file containing the starting positions of each event block. Input of getmodel.
random.bin -- Random number file. It contains a list of random numbers used for ground up loss sampling. Input of gulcalc. Optional.
vulnerability.bin -- Contains the conditional distributions of damage for each intensity bin and for each vulnerability_id. Input of getmodel.
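
To illustrate how a discretized cdf and the damage bin dictionary could be used together for ground up loss sampling, here is a minimal inverse-transform sketch. The bins, the cdf values and the linear interpolation inside a bin are all assumptions chosen for the example; these notes do not describe how gulcalc interpolates.

```python
import bisect

# Hypothetical damage bins: (bin_from, bin_to) on a relative damage scale of 0 to 1.
DAMAGE_BINS = [(0.0, 0.0), (0.0, 0.5), (0.5, 1.0)]
# Hypothetical cdf: probability that damage is at or below each bin's upper edge.
CDF = [0.25, 0.75, 1.0]


def sample_damage(u, bins=DAMAGE_BINS, cdf=CDF):
    """Map a uniform random number u in [0, 1) to a relative damage value."""
    # First bin whose cumulative probability reaches u.
    i = min(bisect.bisect_left(cdf, u), len(bins) - 1)
    prob_below = cdf[i - 1] if i > 0 else 0.0
    bin_from, bin_to = bins[i]
    width = cdf[i] - prob_below
    # Assumption: interpolate linearly between the bin edges.
    fraction = (u - prob_below) / width if width > 0 else 0.0
    return bin_from + fraction * (bin_to - bin_from)


tiv = 1000.0  # hypothetical total insured value of one coverage
print(sample_damage(0.5) * tiv)    # 250.0
print(sample_damage(0.875) * tiv)  # 750.0
```

In a real run the value of `u` would come from a random number generator or from random.bin rather than being fixed.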


input

coverages.bin -- List of coverage IDs with their TIV. Input of gulcalc and fmcalc.
events.bin -- List of event IDs only. Input of eve.
items.bin -- User's exposures file. Input of getmodel, gulcalc and output components.
gulsummaryxref.bin -- Cross reference file which determines how coverage losses from gulcalc output are summed together at various summary levels by summarycalc. Input of summarycalc.
fm_programme.bin -- Contains the level hierarchy and defines aggregations of losses required to perform a loss calculation. Input of fmcalc.
fm_profile.bin -- Contains the list of calculation rules with profile values (policytc_ids) that appear in the fm_policytc.bin file. Input of fmcalc.
fm_policytc.bin -- Contains the cross reference between the aggregations of losses defined in the fm programme file at a particular level and the calculation rule that should be applied as defined in the fm_profile.bin file. Input of fmcalc.
fmsummaryxref.bin -- Cross reference file which determines how losses from fmcalc output are summed together at various summary levels by summarycalc. Input of summarycalc.
fm_xref.bin -- Contains cross reference data specifying the output_id in the fmcalc as a combination of agg_id and layer_id. Input of fmcalc.
occurrence.bin -- Assigns occurrences of the event_ids to numbered periods. Required for any output which involves the calculation of loss metrics over a period of time. Input of certain output components.
returnperiods.bin -- List of return periods that the user requires to be included in loss exceedance curve (leccalc) results. Input of leccalc.
periods.bin -- List of all the periods in the model. Optional; used for weighting the periods in the calculation. Input of leccalc.
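
Since events.bin is described above as a list of event IDs only, a reader for it can be very small. The sketch below assumes that each ID is stored as a 4-byte little-endian signed integer with no header; that layout is an assumption to check against the ktools file specifications. Both helper functions are hypothetical.

```python
import os
import struct
import tempfile


def write_event_ids(path, event_ids):
    """Write event IDs as consecutive 4-byte little-endian integers (assumed layout)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(event_ids)}i", *event_ids))


def read_event_ids(path):
    """Read back event IDs written in the assumed layout."""
    with open(path, "rb") as f:
        data = f.read()
    count = len(data) // 4
    return list(struct.unpack(f"<{count}i", data[: count * 4]))


# Round trip through a temporary file so the example does not touch real model data.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "events.bin")
    write_event_ids(path, [1, 2, 3])
    round_trip = read_event_ids(path)
print(round_trip)  # [1, 2, 3]
```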


Glossary


cdfs: discrete cumulative distribution functions.

Questions

Model


Binary files contain mostly IDs. So, where is the real data?
What is the minimal set of static/input files to make ktools run?
Let's say I have my own model (events, coverages, vulnerabilities...) that I want to convert to ktools format. What would be the process to follow?
Is there a way to easily calibrate damageability cdfs (see damage bin dictionary specs)?
How do I specify event duration? Is it meaningful in a ktools model?
Does a coverage (with TIV) represent the level of analysis (e.g. country, CRESTA...), or is it an area peril? What do they represent?
Can an area peril contain smaller area perils (e.g. country, CRESTA, municipal boundaries...)? What granularity is supported? Can we go down to lat/long?
What is an exposure item? What can an item represent?
Can the model be multi-hazard (e.g. earthquake and flood) in the same set of files?
Is the damage bin dictionary file a finite set of all cdfs for a given model?
I need more information about the relationships between data files, i.e. the primary and secondary keys for each of them (at least events, items, damage bin dictionary, vulnerability, footprint, and coverages). In other words, what is the OASIS database (data files) schema?

Dev


Can I run ktools on my local machine inside a Docker container under Windows, macOS and Linux?
What does the number of intensity bins mean for the footprint? Is it defined by the user? Could it be set automatically?
What does the number of damage bins mean for vulnerabilities? Is it defined by the user? Could it be set automatically?
Which file contains hazard intensity bin values (not IDs)?