@matthewfeickert
Last active March 22, 2021 16:10
Bug report for Python 2 vs. Python 3 inconsistency

I'm seeing a bug in pyAMI v5.1.2 on CVMFS: the return structure of the provenance node dicts differs between Python 2 and Python 3. The docs state that for the pyAMI.atlas.api.get_dataset_prov API the return should be

a map of python dictionnaries. The key "node" gives a list of dataset with the distance to the given dataset and the key "edge" gives the list of successive pairs of input an output datasets.

This is true for Python 2.7, but for Python 3 the return structure is inconsistent: some entries in the provenance node list have the keys

odict_keys(['logicalDatasetName', 'dataType', 'distance', 'events'])

while others in the list have

odict_keys(['source', 'destination'])

which is what the edge dict is supposed to return.
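
Until the return structure is consistent, one way to cope on the consumer side is to separate the entries by their keys. The following is a minimal sketch and not part of pyAMI: `split_provenance_nodes` is a hypothetical helper, and it assumes that misplaced edge entries can be recognized purely by carrying the `source` and `destination` keys shown above. It works on plain dicts, so the sample data (with made-up, shortened dataset names) stands in for the real `provenance["node"]` list.

```python
from collections import OrderedDict

# Key sets taken from the output shown in this report
NODE_KEYS = {"logicalDatasetName", "dataType", "distance", "events"}
EDGE_KEYS = {"source", "destination"}


def split_provenance_nodes(entries):
    """Separate node-shaped entries from edge-shaped ones (hypothetical helper)."""
    nodes, stray_edges, unknown = [], [], []
    for entry in entries:
        keys = set(entry.keys())
        if keys == NODE_KEYS:
            nodes.append(entry)
        elif keys == EDGE_KEYS:
            stray_edges.append(entry)
        else:
            # Anything else is kept aside rather than silently dropped
            unknown.append(entry)
    return nodes, stray_edges, unknown


# Stand-in data; the dataset names are placeholders, not real AMI entries
sample = [
    OrderedDict(
        [
            ("logicalDatasetName", "example.recon.AOD"),
            ("dataType", "AOD"),
            ("distance", "0"),
            ("events", "0"),
        ]
    ),
    OrderedDict(
        [("source", "example.simul.HITS"), ("destination", "example.recon.AOD")]
    ),
]

nodes, stray_edges, unknown = split_provenance_nodes(sample)
print(
    "{} node(s), {} stray edge(s), {} unrecognized".format(
        len(nodes), len(stray_edges), len(unknown)
    )
)
```

Matching on the exact key set is deliberately strict: if a future pyAMI release adds a key to the node dicts, those entries would land in `unknown` and be noticed instead of being misclassified.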

Python 2

[feickert@login ~]$ hostname
login.usatlas.org
[feickert@login ~]$ python --version --version
Python 2.7.5
[feickert@login ~]$ voms-proxy-init -voms atlas
[feickert@login ~]$ lsetup pyAMI
[feickert@login ~]$ python -c "import pyAMI; print(pyAMI)"
<module 'pyAMI' from '/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/pyAmi/pyAMI-5.1.2/lib/pyAMI/__init__.py'>
[feickert@login ~]$ python bug_report.py
In Python 2 pyAMI's provenance nodes always contain: set([u'dataType', u'distance', u'logicalDatasetName', u'events'])

OrderedDict([(u'logicalDatasetName', u'mc16_13TeV.301254.Pythia8EvtGen_A14NNPDF23LO_Wprime_WZqqqq_m400.recon.AOD.e3749_s3126_r9364'), (u'dataType', u'AOD'), (u'distance', u'0'), (u'events', u'0')])

Python 3

[feickert@login ~]$ hostname
login.usatlas.org
[feickert@login ~]$ lsetup 'views LCG_98python3 x86_64-centos7-gcc8-opt'
************************************************************************
Requested:  views ...
 Setting up views LCG_98python3:x86_64-centos7-gcc8-opt ...
>>>>>>>>>>>>>>>>>>>>>>>>> Information for user <<<<<<<<<<<<<<<<<<<<<<<<<
************************************************************************
[feickert@login ~]$ python --version --version
Python 3.7.6 (default, Aug 12 2020, 09:46:40)
[GCC 8.3.0]
[feickert@login ~]$ voms-proxy-init -voms atlas
[feickert@login ~]$ lsetup pyAMI
[feickert@login ~]$ python -c "import pyAMI; print(pyAMI)"
<module 'pyAMI' from '/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase/x86_64/pyAmi/pyAMI-5.1.2/lib/pyAMI/__init__.py'>
[feickert@login ~]$ python bug_report.py
In Python 3 pyAMI's provenance nodes can contain: {'source', 'events', 'logicalDatasetName', 'distance', 'dataType', 'destination'}

OrderedDict([('logicalDatasetName', 'mc16_13TeV.301254.Pythia8EvtGen_A14NNPDF23LO_Wprime_WZqqqq_m400.recon.AOD.e3749_s3126_r9364'), ('dataType', 'AOD'), ('distance', '0'), ('events', '0')])
OrderedDict([('source', 'mc16_13TeV.301254.Pythia8EvtGen_A14NNPDF23LO_Wprime_WZqqqq_m400.simul.HITS.e3749_s3126'), ('destination', 'mc16_13TeV.301254.Pythia8EvtGen_A14NNPDF23LO_Wprime_WZqqqq_m400.recon.AOD.e3749_s3126_r9364')])

bug_report.py

import pyAMI.client
import pyAMI.atlas.api as atlas_api
import sys


def main():
    # Written for pyAMI v5.1.2 API
    atlas_api.init()
    client = pyAMI.client.Client("atlas")

    dataset = "mc16_13TeV.301254.Pythia8EvtGen_A14NNPDF23LO_Wprime_WZqqqq_m400.recon.AOD.e3749_s3126_r9364"
    # https://ami.in2p3.fr/pyAMI/pyAMI5_atlas_api.html#pyAMI.atlas.api.get_dataset_prov
    # The key "node" gives a list of dataset with the distance to the given dataset
    # and the key "edge" gives the list of successive pairs of input an output datasets.
    provenance = atlas_api.get_dataset_prov(client, dataset)

    # For Python 2 provenance has the following structure for all datasets:
    # provenance node keys: ['logicalDatasetName', 'dataType', 'distance', 'events']
    # provenance edge keys: ['source', 'destination']
    # For Python 3 the structure is inconsistent, with provenance nodes having a mix of keys
    unique_keys = set([key for node in provenance["node"] for key in node.keys()])
    if sys.version_info[0] < 3:
        print(
            "In Python 2 pyAMI's provenance nodes always contain: {}\n".format(
                unique_keys
            )
        )
    else:
        print(
            "In Python 3 pyAMI's provenance nodes can contain: {}\n".format(unique_keys)
        )

    # Print the first entry that has the documented node structure
    for node in provenance["node"]:
        if "logicalDatasetName" in node.keys():
            print(node)
            break
    # Print the first entry in the node list that has the edge structure instead
    for node in provenance["node"]:
        if "source" in node.keys():
            print(node)
            break


if __name__ == "__main__":
    main()
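
The script above reports only the union of keys across the node list. To see how many entries have each shape, the list could be grouped by key signature. This is a minimal sketch under the same assumptions as the report: `key_signatures` is a hypothetical helper, not a pyAMI function, and the sample dicts with placeholder dataset names stand in for `provenance["node"]`.

```python
from collections import Counter


def key_signatures(entries):
    """Count how many entries share each distinct, sorted tuple of keys."""
    return Counter(tuple(sorted(entry.keys())) for entry in entries)


# Stand-in data; the dataset names are placeholders, not real AMI entries
sample = [
    {
        "logicalDatasetName": "example.recon.AOD",
        "dataType": "AOD",
        "distance": "0",
        "events": "0",
    },
    {"source": "example.simul.HITS", "destination": "example.recon.AOD"},
    {"source": "example.evgen.EVNT", "destination": "example.simul.HITS"},
]

signatures = key_signatures(sample)
for keys, count in signatures.most_common():
    print("{} entries with keys {}".format(count, list(keys)))
```

With a consistent return structure, the node list should produce exactly one signature; more than one indicates the mix described in this report.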