Created
August 8, 2021 13:27
Unfortunately, rdFMCS.FindMCS finds the maximum common substructure, not the differences between two molecules, so this code can only highlight what lies outside the maximum common substructure. Since the ether oxygen does not match the carbonyl carbon, the maximum common substructure can only include a single propyl chain: the mismatching ether/carbonyl breaks the molecule in half, so the other propyl chain cannot be part of the same connected substructure.
If you want to extend this code to identify differences such as the one you describe, you would have to iteratively split the molecules along the bonds that connect the matched and unmatched subgroups, then repeat the matching step on the unmatched subgroups until rdFMCS.FindMCS cannot find any more matches.
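To make the limitation concrete, here is a minimal sketch. It assumes the two molecules are dipropyl ether and heptan-4-one, which is my guess at the kind of pair being discussed, not something taken from the original question. The helper name unmatched_atoms is hypothetical; target_atm1/target_atm2 follow the naming used by view_differences.

```python
from rdkit import Chem
from rdkit.Chem import rdFMCS

# Assumed example pair: an ether and a ketone, each flanked by two propyl chains.
mol1 = Chem.MolFromSmiles("CCCOCCC")      # dipropyl ether
mol2 = Chem.MolFromSmiles("CCCC(=O)CCC")  # heptan-4-one

# FindMCS returns the largest connected substructure shared by both molecules.
mcs = rdFMCS.FindMCS([mol1, mol2])
pattern = Chem.MolFromSmarts(mcs.smartsString)


def unmatched_atoms(mol, pattern):
    """Return indices of atoms that are not covered by the MCS match."""
    matched = set(mol.GetSubstructMatch(pattern))
    return [atom.GetIdx() for atom in mol.GetAtoms() if atom.GetIdx() not in matched]


target_atm1 = unmatched_atoms(mol1, pattern)
target_atm2 = unmatched_atoms(mol2, pattern)
print(mcs.smartsString, target_atm1, target_atm2)
```

With this pair, only one propyl chain ends up in the match, so the second propyl chain is reported as "different" alongside the ether/carbonyl atoms even though it is chemically identical in both molecules.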
The connecting bonds can be found with a function like:
Where atomIds is the list of atom IDs that view_differences highlighted as mismatching (i.e. target_atm1/target_atm2).
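A minimal sketch of such a function, assuming a connecting bond is any bond with exactly one end in atomIds. The function name get_connecting_bonds and the dipropyl ether example are my own illustrations, not the original code.

```python
from rdkit import Chem


def get_connecting_bonds(mol, atomIds):
    """Return indices of bonds that join an atom in atomIds to an atom outside it."""
    unmatched = set(atomIds)
    connecting = []
    for bond in mol.GetBonds():
        begin_in = bond.GetBeginAtomIdx() in unmatched
        end_in = bond.GetEndAtomIdx() in unmatched
        # Exactly one end in the unmatched set means the bond crosses the boundary.
        if begin_in != end_in:
            connecting.append(bond.GetIdx())
    return connecting


# Assumed example: dipropyl ether, with the oxygen (atom 3) and the second
# propyl chain (atoms 4-6) flagged as mismatching.
mol = Chem.MolFromSmiles("CCCOCCC")
boundary_bonds = get_connecting_bonds(mol, [3, 4, 5, 6])
```

Here the only boundary bond is the one between the matched propyl carbon (atom 2) and the ether oxygen (atom 3).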
You could then use something like this for each iteration:
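One possible shape for that step, sketched under some assumptions: the connecting bonds are cut with Chem.FragmentOnBonds, which keeps the original atom indices and appends dummy atoms at the cut points, and only fragments containing unmatched atoms are kept. The function name split_unaccounted and the hard-coded atom IDs are hypothetical; in practice the IDs would come from view_differences.

```python
from rdkit import Chem


def split_unaccounted(mol, atomIds):
    """Cut bonds between matched and unmatched atoms and return the unmatched pieces.

    Returns a list of (fragment_mol, original_atom_ids) pairs so that atoms in a
    fragment can be traced back to the original molecule.
    """
    unmatched = set(atomIds)
    cut_bonds = [
        b.GetIdx()
        for b in mol.GetBonds()
        if (b.GetBeginAtomIdx() in unmatched) != (b.GetEndAtomIdx() in unmatched)
    ]
    # With nothing to cut, work on the molecule as it is.
    fragmented = Chem.FragmentOnBonds(mol, cut_bonds, addDummies=True) if cut_bonds else mol
    n_original = mol.GetNumAtoms()
    atom_groups = Chem.GetMolFrags(fragmented)             # atom indices per fragment
    frag_mols = Chem.GetMolFrags(fragmented, asMols=True)  # same fragments as molecules
    pieces = []
    for idxs, frag in zip(atom_groups, frag_mols):
        # Dummy atoms are appended after the original atoms, so drop them here.
        original = [i for i in idxs if i < n_original]
        if unmatched.intersection(original):
            pieces.append((frag, original))
    return pieces


# Assumed example molecules and mismatching atom IDs.
mol1 = Chem.MolFromSmiles("CCCOCCC")
mol2 = Chem.MolFromSmiles("CCCC(=O)CCC")
mol1_unnacounted, mol1_ids = split_unaccounted(mol1, [3, 4, 5, 6])[0]
mol2_unnacounted, mol2_ids = split_unaccounted(mol2, [3, 4, 5, 6, 7])[0]
```

Keeping the original atom IDs next to each fragment is what makes the final highlighting possible. Note that the dummy atoms left at the cut points take part in any later matching unless you strip them or adjust the comparison.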
You can then compare mol1_unnacounted to mol2_unnacounted in the same way as you did mol1 to mol2! Rinse and repeat until you're left with your carbonyl and ether and (if you've tracked atom IDs properly) highlight them in your original molecule.