I upgraded my Qualcomm 888 testing pipeline to a newer version of the SNPE SDK, moving from snpe-1.61.0 to snpe-1.64.0, and then re-ran all 8000 of the models we have been testing. The results are mixed. On the one hand, snpe-1.64.0 seems to have fixed some quantization bugs: many more models now get a meaningful non-zero F1 score, and every score is the same as or better than before. On the other hand, mean inference times have gotten slower, and the slowdown is most evident in the smaller, faster models, where the perf hit is pretty bad. So my question is: is this a known issue, and is it likely to improve in future versions of the SDK?
Here are the results on the smaller, faster models:
Here are the results on medium-sized models:
Here are the results on larger models: