@NatureGeorge
Created September 23, 2022 08:41
Convert an mmCIF file to a pandas.DataFrame.
import gemmi
import pandas as pd
cif_doc_block = gemmi.cif.read('1ug6.cif.gz')[0]
keys = (
    '_atom_site.id',
    '_atom_site.label_atom_id',
    '_atom_site.label_comp_id',
    '_atom_site.label_asym_id',
    '_atom_site.auth_asym_id',
    '_atom_site.label_entity_id',
    '_atom_site.label_seq_id',
    '_atom_site.auth_seq_id',
    '_atom_site.pdbx_PDB_ins_code',
    '_atom_site.Cartn_x',
    '_atom_site.Cartn_y',
    '_atom_site.Cartn_z',
    '_atom_site.B_iso_or_equiv',
    '_atom_site.pdbx_PDB_model_num',
)
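# Build the frame from the raw (string) values of each _atom_site column.
pdb_frame = pd.DataFrame({key: cif_doc_block.find_values(key) for key in keys})
# Coordinates and B-factors (keys[9:-1]) become 32-bit floats.
for key in keys[9:-1]:
    pdb_frame[key] = pdb_frame[key].astype('float32')
# Atom serial number and model number become integers.
for key in (keys[0], keys[-1]):
    pdb_frame[key] = pdb_frame[key].astype('int')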
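
The snippet above leaves columns such as `_atom_site.label_seq_id` as strings, most likely because mmCIF marks missing values with `?` (unknown) or `.` (not applicable), which a plain `astype('int')` would reject. A minimal sketch of one way to convert such columns anyway, assuming that turning every non-numeric token into a missing value is acceptable. The helper name `convert_cif_numeric` and the small `raw` frame are hypothetical and only stand in for the string columns built above; this is not part of the original gist.

```python
import pandas as pd

# Hypothetical stand-in for the raw string columns read from an mmCIF file.
raw = pd.DataFrame({
    '_atom_site.id': ['1', '2', '3'],
    '_atom_site.label_seq_id': ['1', '2', '.'],   # '.' = not applicable
    '_atom_site.pdbx_PDB_ins_code': ['?', '?', '?'],  # '?' = unknown
    '_atom_site.Cartn_x': ['10.500', '11.250', '-3.000'],
})


def convert_cif_numeric(frame, float_keys=(), int_keys=()):
    """Return a copy of `frame` with the given columns made numeric.

    Tokens that cannot be parsed as numbers (including '?' and '.')
    become missing values instead of raising an error.
    """
    out = frame.copy()
    for key in float_keys:
        # errors='coerce' maps unparsable strings to NaN.
        out[key] = pd.to_numeric(out[key], errors='coerce').astype('float32')
    for key in int_keys:
        # 'Int64' is the nullable integer dtype, so gaps stay as <NA>.
        out[key] = pd.to_numeric(out[key], errors='coerce').astype('Int64')
    return out


converted = convert_cif_numeric(
    raw,
    float_keys=['_atom_site.Cartn_x'],
    int_keys=['_atom_site.id', '_atom_site.label_seq_id'],
)
print(converted.dtypes)
```

Note that `errors='coerce'` hides every unparsable value, not only the two mmCIF placeholders, so it is worth checking the count of missing entries after conversion.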