matthewfeickert/README.md

Summary of fixed POI problem for WSMaker to pyhf translation

It is desirable to be able to take a workspace created with WSMaker and use pyhf's translation utilities to write out a HistFactory JSON workspace file in the format pyhf uses. This is possible; however, the resulting workspace has the POI (parameter of interest) fixed, which makes statistical inference impossible.
Example

JC has shared with us one of the WSMaker workspaces that he has generated:
$ tree -L 1 output/SemileptonicVBS_mc16ade_v3312_1LepOnly.1lepResBooRNNNoFid_m5_p0_SemileptonicVBS_13TeV_Systs_FS0_0_1lepResBooRNNNoFid_m5_p0/
output/SemileptonicVBS_mc16ade_v3312_1LepOnly.1lepResBooRNNNoFid_m5_p0_SemileptonicVBS_13TeV_Systs_FS0_0_1lepResBooRNNNoFid_m5_p0/
├── normfiles
├── workspaces
└── xml

3 directories, 0 files
$ find . -iname "driver.xml"
./output/SemileptonicVBS_mc16ade_v3312_1LepOnly.1lepResBooRNNNoFid_m5_p0_SemileptonicVBS_13TeV_Systs_FS0_0_1lepResBooRNNNoFid_m5_p0/xml/5/driver.xml
If we then set up a Python virtual environment and install pyhf with its xmlio extra
$ python -m venv example
$ . example/bin/activate
(example) $ python -m pip --quiet install --upgrade pip setuptools wheel
(example) $ python -m pip install 'pyhf[xmlio]'
we can then use its command line utilities to generate a HistFactory JSON workspace file using the driver.xml file
(example) $ pyhf xml2json $(find . -iname "driver.xml") > workspace.json
which we can see is valid
(example) $ pyhf inspect workspace.json
                                                                              Summary
                                                                        ------------------
                                                                           channels  3
                                                                            samples  6
                                                                         parameters  15
                                                                          modifiers  15

                                                                           channels  nbins
                                                                         ----------  -----
          Region_distDNN_DSRVBSHP_BMin0_J0_incJet1_L1_T0_incFat1_Y6051_incTag1_Fat1   15
          Region_distDNN_DSRVBSLP_BMin0_J0_incJet1_L1_T0_incFat1_Y6051_incTag1_Fat1   15
                    Region_distDNN_DSRVBSTight_BMin0_T0_Y6051_incTag1_J2_L1_incJet1   15

                                                                            samples
                                                                         ----------
                                                                            Diboson
                                                                            EW6lvqq
                                                                                  W
                                                                                  Z
                                                                               stop
                                                                              ttbar

                                                                         parameters  constraint              modifiers
                                                                         ----------  ----------              ----------
                                                               ATLAS_LUMI_2015_2018  constrained_by_normal   normsys
                                                             ATLAS_norm_W1LepMerged  unconstrained           normfactor
                                                           ATLAS_norm_W1LepResolved  unconstrained           normfactor
                                                                ATLAS_norm_lumiProj  unconstrained           normfactor
                                                         ATLAS_norm_ttbar1LepMerged  unconstrained           normfactor
                                                       ATLAS_norm_ttbar1LepResolved  unconstrained           normfactor
                                                                           NormStop  constrained_by_normal   normsys
                                                                   NormVV1LepMerged  constrained_by_normal   normsys
                                                                 NormVV1LepResolved  constrained_by_normal   normsys
                                                                    NormZ1LepMerged  constrained_by_normal   normsys
                                                                  NormZ1LepResolved  constrained_by_normal   normsys
                                                                 mu_SemileptonicVBS  unconstrained           normfactor
staterror_Region_distDNN_DSRVBSHP_BMin0_J0_incJet1_L1_T0_incFat1_Y6051_incTag1_Fat1  constrained_by_normal   staterror
staterror_Region_distDNN_DSRVBSLP_BMin0_J0_incJet1_L1_T0_incFat1_Y6051_incTag1_Fat1  constrained_by_normal   staterror
          staterror_Region_distDNN_DSRVBSTight_BMin0_T0_Y6051_incTag1_J2_L1_incJet1  constrained_by_normal   staterror

                                                                        measurement           poi            parameters
                                                                         ----------        ----------        ----------
                                                                             (*) VH    mu_SemileptonicVBS    lumi,ATLAS_norm_ttbar1LepMerged,ATLAS_norm_W1LepMerged,mu_SemileptonicVBS,ATLAS_norm_ttbar1LepResolved,ATLAS_norm_W1LepResolved,ATLAS_norm_lumiProj


However, any attempt at statistical inference fails immediately because the POI is fixed in the workspace:
$ pyhf cls workspace.json
Traceback (most recent call last):
  File "/home/feickert/.pyenv/versions/example/bin/pyhf", line 8, in <module>
    sys.exit(cli())
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/pyhf/cli/infer.py", line 223, in cls
    result = hypotest(
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/pyhf/infer/__init__.py", line 147, in hypotest
    _check_hypotest_prerequisites(pdf, data, init_pars, par_bounds, fixed_params)
  File "/home/feickert/.pyenv/versions/example/lib/python3.9/site-packages/pyhf/infer/__init__.py", line 15, in _check_hypotest_prerequisites
    raise exceptions.InvalidModel(
pyhf.exceptions.InvalidModel: POI at index [22] is set as fixed, which makes inference impossible. Please unfix the POI to continue.
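The condition behind this error can also be checked directly in the JSON before running inference. The following is a minimal sketch using only the Python standard library, assuming the usual workspace layout in which each measurement carries a `config` with a `poi` name and a `parameters` list. The helper name `measurements_with_fixed_poi`, the trimmed `EXAMPLE_WORKSPACE`, and its second parameter entry are hypothetical and only there for illustration.

```python
import json

# Trimmed, illustrative stand-in for workspace.json: only the "measurements"
# block matters for this check. The POI entry mirrors the one in this
# workspace; "some_other_parameter" is a hypothetical placeholder.
EXAMPLE_WORKSPACE = json.loads(
    """
{
  "measurements": [
    {
      "name": "VH",
      "config": {
        "poi": "mu_SemileptonicVBS",
        "parameters": [
          {
            "name": "mu_SemileptonicVBS",
            "bounds": [[-10.0, 10.0]],
            "fixed": true,
            "inits": [1.0]
          },
          {
            "name": "some_other_parameter",
            "bounds": [[0.0, 10.0]],
            "fixed": false,
            "inits": [1.0]
          }
        ]
      }
    }
  ]
}
"""
)


def measurements_with_fixed_poi(workspace):
    """Return the names of measurements whose POI is marked "fixed": true."""
    flagged = []
    for measurement in workspace.get("measurements", []):
        config = measurement.get("config", {})
        poi = config.get("poi")
        for parameter in config.get("parameters", []):
            # A missing "fixed" key is treated as not fixed.
            if parameter.get("name") == poi and parameter.get("fixed", False):
                flagged.append(measurement["name"])
    return flagged


print(measurements_with_fixed_poi(EXAMPLE_WORKSPACE))
```

For the real file, the dictionary would come from `json.load` on `workspace.json` instead of the inline example.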
Manually inspecting the workspace, we see that the POI is indeed fixed:
                    {
                        "bounds": [
                            [
                                -10.0,
                                10.0
                            ]
                        ],
                        "fixed": true,
                        "inits": [
                            1.0
                        ],
                        "name": "mu_SemileptonicVBS"
                    },

If we manually change this to "fixed": false, then statistical inference is possible (:+1:).
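The same edit can be written as a short script, which at least makes it reproducible. This is a stopgap sketch only, using the Python standard library and assuming the measurement layout shown above; the function names `unfix_poi` and `patch_file` are hypothetical, and the script does nothing about how the workspace was generated in the first place.

```python
import copy
import json


def unfix_poi(workspace):
    """Return a copy of the workspace with each measurement's POI unfixed."""
    patched = copy.deepcopy(workspace)  # leave the input untouched
    for measurement in patched.get("measurements", []):
        config = measurement.get("config", {})
        poi = config.get("poi")
        for parameter in config.get("parameters", []):
            # Only the POI entry is changed; other fixed parameters stay fixed.
            if parameter.get("name") == poi:
                parameter["fixed"] = False
    return patched


def patch_file(source_path, target_path):
    """Read a workspace JSON file and write a copy with the POI unfixed."""
    with open(source_path) as source_file:
        workspace = json.load(source_file)
    with open(target_path, "w") as target_file:
        json.dump(unfix_poi(workspace), target_file, indent=4)
```

For example, `patch_file("workspace.json", "workspace_unfixed.json")` would write a patched copy (the output filename is an arbitrary choice) that `pyhf cls` could then be pointed at.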
Problem

However, manual editing of workspaces is highly undesirable and should be viewed as a huge red flag for any sort of translation. Workspaces should be properly defined at the source and translatable between formats (ideally bidirectionally, as pyhf has done with HistFitter).
What JC needs to figure out is how to tell WSMaker, at workspace generation time, that the POI should be left unfixed.