speth/00-input-files.md

## 00-input-files.md

      
### Issues with CTI Input Files


Lost opportunity for interoperability with other software, which would be easier with a standard format
The run-time Python dependency is a source of never-ending difficulty for users
New input file features must be implemented in both CTI and XML. This extra work leaves the CTI interface incomplete and requires users to fall back on the XML interface in certain cases.

### Issues with XML input files


The format itself is needlessly verbose. Consider the definition of a single 'falloff' reaction in XML:
<reaction reversible="yes" type="falloff" id="0095">
  <equation>OH + CH3 (+ M) [=] CH3OH (+ M)</equation>
  <rateCoeff>
    <Arrhenius>
       <A>2.790000E+15</A>
       <b>-1.4299999999999999</b>
       <E units="cal/mol">1330.000000</E>
    </Arrhenius>
    <Arrhenius name="k0">
       <A>4.000000E+30</A>
       <b>-5.9199999999999999</b>
       <E units="cal/mol">3140.000000</E>
    </Arrhenius>
    <efficiencies default="1.0">C2H6:3  CH4:2  CO:1.5  CO2:2  H2:2  H2O:6 </efficiencies>
    <falloff type="Troe">0.412 195 5900 6394 </falloff>
  </rateCoeff>
  <reactants>CH3:1 OH:1.0</reactants>
  <products>CH3OH:1.0</products>
</reaction>
compared to the equivalent CTI (Python):
falloff_reaction("OH + CH3 (+ M) <=> CH3OH (+ M)",
         kf=[2.79000E+18, -1.43, 1330],
         kf0=[4.00000E+36, -5.92, 3140],
         falloff=Troe(A=0.412, T3=195, T1=5900, T2=6394),
         efficiencies="C2H6:3 CH4:2 CO:1.5 CO2:2 H2:2 H2O:6")
Even after removing the extra whitespace, the XML is still twice as long.


It requires extra processing to extract useful information from:
(a) the mappings of species names to quantities contained in the <efficiencies>, <reactants> and <products> tags
(b) array data such as that in the <falloff> tag.
All of the alternatives (Python, JSON, YAML) have intrinsic support for mapping and array data types.
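As a rough illustration of that extra processing, the Python sketch below parses the `<efficiencies>` and `<falloff>` strings by hand and compares the result with the same data loaded from JSON. The helper name `parse_composition` and the JSON key names are made up for this example and are not part of Cantera.

```python
import json

def parse_composition(text):
    """Split a 'name:value name:value' string, as found in <efficiencies>."""
    result = {}
    for token in text.split():
        # rpartition splits at the last colon of each token
        name, _, value = token.rpartition(":")
        result[name] = float(value)
    return result

# Text content of the <efficiencies> and <falloff> tags from the XML above
xml_efficiencies = "C2H6:3  CH4:2  CO:1.5  CO2:2  H2:2  H2O:6 "
xml_falloff = "0.412 195 5900 6394 "

efficiencies = parse_composition(xml_efficiencies)
falloff = [float(x) for x in xml_falloff.split()]

# The same data in JSON: the parser hands back a dict and a list directly
doc = json.loads(
    '{"efficiencies": {"C2H6": 3, "CH4": 2, "CO": 1.5, "CO2": 2, "H2": 2, "H2O": 6},'
    ' "falloff": [0.412, 195, 5900, 6394]}'
)

print(efficiencies == doc["efficiencies"], falloff == doc["falloff"])
```

With XML, a helper like this has to exist (and be kept consistent) in every program that reads the file; with JSON or YAML it disappears.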


It contains redundant information, which leads to confusion and errors. The reaction stoichiometry is encoded both in the <equation> tag and in the <reactants> and <products> tags.
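One consequence of the redundancy is that a careful reader has to cross-check the two encodings. The sketch below does this for the falloff reaction shown earlier. It is a simplified illustration: the helper names are hypothetical, and the equation parser only understands the `[=]` separator, ` + ` between terms, and the `(+ M)` third-body notation.

```python
def parse_side(side):
    """Count species on one side of an equation, ignoring the third body M."""
    counts = {}
    for term in side.replace("(+ M)", "").split(" + "):
        parts = term.split()
        if not parts or parts == ["M"]:
            continue
        # "2 O" carries an explicit coefficient; "CH3" implies 1
        coeff = float(parts[0]) if len(parts) == 2 else 1.0
        counts[parts[-1]] = counts.get(parts[-1], 0.0) + coeff
    return counts

def parse_composition(text):
    """Parse 'CH3:1 OH:1.0' style strings from <reactants> / <products>."""
    return {name: float(value)
            for name, value in (tok.split(":") for tok in text.split())}

equation = "OH + CH3 (+ M) [=] CH3OH (+ M)"
lhs, rhs = equation.split("[=]")

consistent = (parse_side(lhs) == parse_composition("CH3:1 OH:1.0")
              and parse_side(rhs) == parse_composition("CH3OH:1.0"))
print("stoichiometry consistent:", consistent)
```

A format that stores the stoichiometry only once needs no such check.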


The method for encoding arrays is inconsistent. In some places, we have a space-delimited string, e.g. the <falloff> tag here. In others (e.g. the floatArray associated with species thermo data), we have comma-delimited lists. Which of these formats is allowed in any given context is a mystery.
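A reader that wants to be safe ends up accepting both delimiters everywhere. A minimal sketch of such a tolerant parser follows. The function name is hypothetical, and this is not how Cantera's XML reader is actually written.

```python
import re

def parse_float_array(text):
    """Accept comma-delimited, whitespace-delimited, or mixed number lists."""
    tokens = re.split(r"[,\s]+", text.strip())
    # A trailing comma leaves an empty token behind; skip it
    return [float(tok) for tok in tokens if tok]

print(parse_float_array("0.412 195 5900 6394 "))
print(parse_float_array(" -0.05,  0.0, 0.0, 0.0, 0.0 "))
```

Being this permissive hides the inconsistency rather than fixing it, which is another argument for a format with a native array type.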


Cantera misses one of the key benefits of using a standard format such as XML: There are existing XML parsing libraries that work just fine, and there's no reason for Cantera to have its own XML parser.


Extracting data from the XML tree requires writing a lot of code. For example, here's a snippet of XML code from the definition of an HMWSoln object:
<thetaAnion anion1="Cl-" anion2="OH-">
  <Theta> -0.05,  0.0, 0.0, 0.0, 0.0 </Theta>
</thetaAnion>
The function to read and validate the data from this node is 80 lines long (see https://github.com/Cantera/cantera/blob/master/src/thermo/HMWSoln_input.cpp#L235).
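For comparison, here is a sketch of reading the same information from a mapping-based format. The JSON layout (`theta_anion`, `species`, `theta`) is purely hypothetical, and the function checks only the basics. It does not reproduce everything the C++ function validates.

```python
import json

# Hypothetical JSON counterpart of the <thetaAnion> node; key names are assumptions
text = """
{"theta_anion": [
    {"species": ["Cl-", "OH-"], "theta": [-0.05, 0.0, 0.0, 0.0, 0.0]}
]}
"""

def read_theta_anion(entry, known_species):
    """Validate one entry and return ((anion1, anion2), coefficients)."""
    anion1, anion2 = entry["species"]
    for name in (anion1, anion2):
        if name not in known_species:
            raise ValueError(f"unknown species '{name}'")
    theta = [float(x) for x in entry["theta"]]
    if not theta:
        raise ValueError("theta must contain at least one coefficient")
    return (anion1, anion2), theta

known = {"Na+", "Cl-", "OH-"}
theta_by_pair = dict(read_theta_anion(e, known)
                     for e in json.loads(text)["theta_anion"])
print(theta_by_pair[("Cl-", "OH-")])
```

Most of the remaining code is validation. The string splitting and type conversion are handled by the parser.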


### Implementation Considerations


Need to decide between JSON, YAML, and other alternatives
Want to separate input file parsing from actual application logic (compare the tight coupling of ThermoPhase::initThermoXML to the setupFooReaction functions which are called by newReaction(XML_Node&)).
Should be able to create objects without any explicit input file

Already possible for ideal gases through Reaction and Species objects


Should be able to serialize objects created in this way and generate new input files
Old input files can be supported by writing translators

Translator from CTI is just a modified version of ctml_writer.py


Successful implementation is made difficult by the large number of classes missing test coverage (Cantera/cantera#267)
Also need to replace XML as the input/output file format for the 1D solver
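The separation of parsing from application logic, together with the "create without a file" and "serialize back out" goals, could look roughly like the sketch below. All names here (`SpeciesData`, `species_from_mapping`, `species_to_mapping`) are invented for illustration and do not correspond to existing Cantera classes.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class SpeciesData:
    """Plain description of a species; knows nothing about file formats."""
    name: str
    atoms: dict
    thermo: list = field(default_factory=list)

def species_from_mapping(node):
    """Parsing layer: turn a generic mapping (from JSON, YAML, ...) into an object."""
    return SpeciesData(
        name=str(node["name"]),
        atoms={element: float(n) for element, n in node["atoms"].items()},
        thermo=list(node.get("thermo", [])),
    )

def species_to_mapping(species):
    """Serialization layer: the inverse of species_from_mapping."""
    return asdict(species)

# 1. Object created from an input file fragment
from_file = species_from_mapping(json.loads('{"name": "CH4", "atoms": {"C": 1, "H": 4}}'))

# 2. The same object created directly in code, with no input file involved
in_code = SpeciesData(name="CH4", atoms={"C": 1.0, "H": 4.0})

# 3. Serialize the in-code object and read it back
text = json.dumps(species_to_mapping(in_code), sort_keys=True)
round_trip = species_from_mapping(json.loads(text))

print(from_file == in_code, round_trip == in_code)
```

Because only the two small functions touch the file representation, a translator for an old format just needs to produce the same generic mapping.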

### Concerns with YAML/JSON


With YAML, significance of whitespace may confuse some users
With both YAML and JSON, order of keys in mappings is not specified, so serialization can result in keys ending up in any order
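One possible mitigation for the key-order concern is to serialize in a canonical order. The sketch below uses Python's standard `json` module as an example. Other libraries and languages may behave differently, so this shows the idea rather than a guarantee for whichever parser Cantera ends up using.

```python
import json

# Two mappings with identical content but different insertion order
reaction = {"type": "three_body", "equation": "2 O + M <=> O2 + M",
            "rate": [1.2e17, -1, 0]}
shuffled = {"rate": [1.2e17, -1, 0], "type": "three_body",
            "equation": "2 O + M <=> O2 + M"}

# Default output follows insertion order, so the two texts differ
plain_a = json.dumps(reaction)
plain_b = json.dumps(shuffled)

# Sorting the keys gives one canonical text for equal content
canonical_a = json.dumps(reaction, sort_keys=True)
canonical_b = json.dumps(shuffled, sort_keys=True)

print(plain_a == plain_b, canonical_a == canonical_b)
```

Sorted keys make generated files diff-friendly, at the cost of an ordering (alphabetical) that may not be the most readable one for humans.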


## 01-sample.yaml
---
units: {length: cm, time: s, quantity: mol, act_energy: cal/mol}

phases:
- name: gri30
  elements: [O, H, C, N, Ar]
  species: [H2, H, O, O2, OH, H2O, HO2, H2O2]
  thermo: IdealGas
  reactions: all
  kinetics: gaskinetics
  initial_state: {temperature: 300.0, pressure: [1.0, atm],
                  mole_fractions: {CH4: 0.2, H2O: 0.8}}

species:
- name: CH4
  atoms: {C: 1, H: 4}
  thermo:
  - type: NASA
    Tmin: 200.0
    Tmax: 1000.0
    data: [5.149876130E+00, -1.367097880E-02, 4.918005990E-05, -4.847430260E-08,
           1.666939560E-11, -1.024664760E+04, -4.641303760E+00]

  - type: NASA
    Tmin: 1000.0
    Tmax: 3500.0
    data: [7.485149500E-02, 1.339094670E-02, -5.732858090E-06, 1.222925350E-09,
           -1.018152300E-13, -9.468344590E+03, 1.843731800E+01]

  transport: {type: gas, geometry: nonlinear, diameter: 3.75,
              well_depth: 141.40, polar: 2.60, rot_relax: 13.00}

- name: H2O
  atoms: {H: 2, O: 1}
  thermo:
  - type: NASA
    Tmin: 200.0
    Tmax: 1000.0
    data: [4.198640560E+00, -2.036434100E-03, 6.520402110E-06, -5.487970620E-09,
           1.771978170E-12, -3.029372670E+04, -8.490322080E-01]

  - type: NASA
    Tmin: 1000.0
    Tmax: 3500.0
    data: [3.033992490E+00, 2.176918040E-03, -1.640725180E-07, -9.704198700E-11,
           1.682009920E-14, -3.000429710E+04, 4.966770100E+00]

  transport: {type: gas, geometry: nonlinear, diameter: 2.60,
              well_depth: 572.40, dipole: 1.84, rot_relax: 4.00}

reactions:
- type: three_body
  equation: "2 O + M <=> O2 + M"
  rate: [1.2000e17, -1, 0]
  efficiencies: {AR: 0.83, C2H6: 3, CH4: 2, CO: 1.75, CO2: 3.6, H2: 2.4, H2O: 15.4}

- type: troe
  equation: "H + CH2 (+ M) <=> CH3 (+ M)"
  kf: [6.00000E+14, 0, 0]
  kf0: [1.04000E+26, -2.76, 1600]
  falloff: [0.562, 91, 5836, 8552]
  efficiencies: {AR: 0.7, C2H6: 3, CH4: 2, CO: 1.5, CO2: 2, H2: 2, H2O: 6}
...

## 02-sample.json
{
  "units": {
    "length": "cm",
    "time": "s",
    "quantity": "mol",
    "act_energy": "cal/mol"
  },
  "phases": [
    {
      "name": "gri30",
      "elements": ["O", "H", "C", "N", "Ar"],
      "species": ["H2", "H", "O", "O2", "OH", "H2O", "HO2", "H2O2"],
      "thermo": "IdealGas",
      "reactions": "all",
      "kinetics": "gaskinetics",
      "initial_state": {"temperature": 300, "pressure": [1, "atm"],
                  "mole_fractions": {"CH4": 0.2, "H2O": 0.8}}
    }
  ],
  "species": [
    {
      "name": "CH4",
      "atoms": {"C": 1, "H": 4},
      "thermo": [
        {
          "type": "NASA",
          "Tmin": 200,
          "Tmax": 1000,
          "data": [5.149876130E+00, -1.367097880E-02, 4.918005990E-05, -4.847430260E-08,
                   1.666939560E-11, -1.024664760E+04, -4.641303760E+00]
        },
        {
          "type": "NASA",
          "Tmin": 1000,
          "Tmax": 3500,
          "data": [7.485149500E-02, 1.339094670E-02, -5.732858090E-06, 1.222925350E-09,
                   -1.018152300E-13, -9.468344590E+03, 1.843731800E+01]
        }
      ],
      "transport": {"type": "gas", "geometry": "nonlinear", "diameter": 3.75,
                "well_depth": 141.4, "polar": 2.6, "rot_relax": 13}
    },
    {
      "name": "H2O",
      "atoms": {"H": 2, "O": 1},
      "thermo": [
        {
          "type": "NASA",
          "Tmin": 200,
          "Tmax": 1000,
          "data": [4.198640560E+00, -2.036434100E-03, 6.520402110E-06, -5.487970620E-09,
                   1.771978170E-12, -3.029372670E+04, -8.490322080E-01]
        },
        {
          "type": "NASA",
          "Tmin": 1000,
          "Tmax": 3500,
          "data": [3.033992490E+00, 2.176918040E-03, -1.640725180E-07, -9.704198700E-11,
                   1.682009920E-14, -3.000429710E+04, 4.966770100E+00]
        }
      ],
      "transport": {"type": "gas", "geometry": "nonlinear", "diameter": 2.6,
                "well_depth": 572.4, "dipole": 1.84, "rot_relax": 4}
    }
  ],
  "reactions": [
    {
      "type": "three_body",
      "equation": "2 O + M <=> O2 + M",
      "rate": [1.2000e17, -1, 0],
      "efficiencies": {"AR": 0.83, "C2H6": 3, "CH4": 2, "CO": 1.75,
                  "CO2": 3.6, "H2": 2.4, "H2O": 15.4}
    },
    {
      "type": "troe",
      "equation": "H + CH2 (+ M) <=> CH3 (+ M)",
      "kf": [6.00000E+14, 0, 0],
      "kf0": [1.04000E+26, -2.76, 1600],
      "falloff": [0.562, 91, 5836, 8552],
      "efficiencies": {"AR": 0.7, "C2H6": 3, "CH4": 2, "CO": 1.5, "CO2": 2, "H2": 2, "H2O": 6}
    }
  ]
}

## 03-parseyaml.cpp
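#include <cmath>   // fabs
#include <map>
#include <memory>  // unique_ptr, shared_ptr
#include <string>

#include "cantera/thermo.h"
#include "yaml-cpp/yaml.h" // tested with yaml-cpp 0.5.3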

using namespace Cantera;
using namespace std;

SpeciesThermoInterpType* newNasaPoly2(const YAML::Node& yaml)
{
    int ilow = (yaml[1]["Tmin"].as<double>() > yaml[0]["Tmin"].as<double>()) ? 0 : 1;
    int ihigh = 1 - ilow;
    double tlow = yaml[ilow]["Tmin"].as<double>();
    double thigh = yaml[ihigh]["Tmax"].as<double>();
    double tmid = yaml[ilow]["Tmax"].as<double>();
    if (fabs(tmid - yaml[ihigh]["Tmin"].as<double>()) > 0.01) {
        throw CanteraError("newNasaPoly2", "non-continuous temperature ranges"
            " {} != {}", tmid, yaml[ihigh]["Tmin"].as<double>());
    }
    vector_fp coeffs(1, tmid);
    coeffs.reserve(15);
    // yaml-cpp iterators yield values, so bind them to const references
    for (const auto& coeff : yaml[ihigh]["data"]) {
        coeffs.push_back(coeff.as<double>());
    }
    for (const auto& coeff : yaml[ilow]["data"]) {
        coeffs.push_back(coeff.as<double>());
    }
    double pref = OneAtm;
    return newSpeciesThermoInterpType("nasa", tlow, thigh, pref, coeffs.data());
}

void parseSpecies(Species& S, const YAML::Node& yaml) {
    S.name = yaml["name"].as<string>();
    S.composition = yaml["atoms"].as<map<string, double>>();
    S.thermo.reset(newNasaPoly2(yaml["thermo"]));
}

void yaml_demo()
{
    YAML::Node data = YAML::LoadFile("sample.yml");
    const YAML::Node& phase_data = data["phases"][0]; // Take the first phase node
    unique_ptr<ThermoPhase> gas(newThermoPhase(phase_data["thermo"].as<string>()));

    for (const auto& elem : phase_data["elements"]) {
        gas->addElement(elem.as<string>());
    }

    for (const auto& spnode : data["species"]) {
        shared_ptr<Species> S(new Species());
        parseSpecies(*S, spnode);
        gas->addSpecies(S);
    }

    const YAML::Node& state = phase_data["initial_state"];
    const YAML::Node& pNode = state["pressure"];
    double p;
    if (pNode.IsScalar()) {
        p = pNode.as<double>();
    } else {
        p = pNode[0].as<double>() * toSI(pNode[1].as<string>());
    }
    gas->setState_TPX(state["temperature"].as<double>(), p,
                      state["mole_fractions"].as<map<string, double>>());
    writelog("{}\n", gas->report());
}

int main()
{
    try {
        yaml_demo();
    } catch (exception& err) {
        writelog("{}\n", err.what());
    }
}

## 04-parsejson.cpp
#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>

#include "jsoncpp/json/json.h"

struct NasaThermo
{
    double Tmin;
    double Tmax;
    std::vector<double> coeffs;
};

struct Species
{
    std::string name;
    std::map<std::string, double> composition; // filled from asDouble() below
    std::vector<NasaThermo> thermo;
};

void operator >>(const Json::Value& node, NasaThermo& t)
{
    t.Tmin = node["Tmin"].asDouble();
    t.Tmax = node["Tmax"].asDouble();

    const Json::Value& dataNode = node["data"];
    t.coeffs.resize(dataNode.size());
    for (Json::ArrayIndex i=0; i<dataNode.size(); i++) {
        t.coeffs[i] = dataNode[i].asDouble();
    }
}

void operator >>(const Json::Value& node, Species& s)
{
    s.name = node["name"].asString();

    const Json::Value& compNode = node["atoms"];
    // compNode is const, so iterate with the const iterator type
    for (Json::ValueConstIterator it=compNode.begin(); it!=compNode.end(); ++it) {
        s.composition[it.key().asString()] = (*it).asDouble();
    }
    }

    const Json::Value& thermoNode = node["thermo"];
    s.thermo.resize(thermoNode.size());
    for (Json::ArrayIndex i=0; i<thermoNode.size(); i++) {
        thermoNode[i] >> s.thermo[i];
    }
}


int main(int argc, char** argv)
{
    std::ifstream fin("sample.json");
    Json::Value doc;
    fin >> doc;

    std::vector<Species> species;
    const Json::Value& spec = doc["species"];
    for (Json::ArrayIndex i=0; i!=spec.size(); i++) {
        Species s;
        spec[i] >> s;
        species.push_back(s);
    }

    return 0;
}