@olp-cs
Created July 27, 2024 16:06
[GEOParse] Non-breaking change compatible with pandas 1.3.0 and 2.2.2

The notebooks in this Gist compare the following operations and demonstrate that they produce equivalent output:

For pandas 1.3.0:

input_data.groupby(group_by_column).mean()[[expression_column]]

For pandas 1.3.0 and 2.2.2:

input_data.groupby(group_by_column).mean(numeric_only=True)[[expression_column]]

input_data.groupby(group_by_column)[[expression_column]].mean()
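
As a minimal sketch of the comparison, the snippet below runs the two forms listed for both pandas versions on a small made-up frame and checks that they agree. The toy values are hypothetical and are not taken from the notebooks; only the column names `GB_ACC`, `VALUE`, and `Gene_Sym` mirror the table shown in this Gist.

```python
import pandas as pd

# Hypothetical toy data (not from the notebooks); the string column stands in
# for the non-numeric annotation columns of the real table.
input_data = pd.DataFrame(
    {
        "GB_ACC": ["U02079", "U02079", "NM_008154"],
        "VALUE": [-1.5, -0.5, 0.25],
        "Gene_Sym": ["Nfatc2", "Nfatc2", "Gpr3"],
    }
)
group_by_column = "GB_ACC"
expression_column = "VALUE"

# Form 1: average all numeric columns per group, then keep the expression column.
via_numeric_only = input_data.groupby(group_by_column).mean(numeric_only=True)[
    [expression_column]
]

# Form 2: select the expression column first, then average per group.
via_selection = input_data.groupby(group_by_column)[[expression_column]].mean()

# Raises AssertionError if the two results differ.
pd.testing.assert_frame_equal(via_numeric_only, via_selection)
print(via_selection)
```

Selecting the column before aggregating avoids computing means for columns that are discarded afterwards, which may matter on wide tables.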
ID_REF VALUE LogRatioError PValueLogRatio gProcessedSignal rProcessedSignal ID GB_ACC Gene_Desc Gene_Sym SPOT_ID SEQUENCE
0 1 -1.627476 0.1360 6.410000e-33 9130.0 215.00 1 U02079 nuclear factor of activated T-cells, cytoplasmic 2 Nfatc2 NaN ACCTGGATGACGCAGCCACTTCAGAAAGCTGGGTTGGGACAGAAAGGTATATAGAGAGAAAATTTTGGAA
1 2 0.141225 1.3400 1.000000e+00 41.4 57.20 2 NM_008154 G-protein coupled receptor 3 Gpr3 NaN CTGTACAATGCTCTCACTTACTACTCAGAGACAACGGTAACTCGGACTTATGTGATGCTGGCCTTGGTGT
2 3 0.182768 0.0519 4.330000e-04 5130.0 7810.00 3 AK015719 tropomodulin 2 Tmod2 NaN CACCAGGCTCAGTGCCTAGTATCGGCTTCACCTAGTGTGGTTACTCAGGGCACGCAGAGCTACAGAACAC
3 4 -0.393227 0.0608 1.020000e-10 4650.0 1880.00 4 AK003367 mitochondrial ribosomal protein L15 Mrpl15 NaN CAAGAAGTCTAGAAATTCTGTGCAAGCCTATTCCATTCTTTCTGCGGGGACAACCAATTCCGAAAAGAAT
4 5 -0.986599 0.1050 6.320000e-21 2910.0 301.00 5 BC003333 RIKEN cDNA 0610033I05 gene 0610033I05Rik NaN AGAACTGGGTGGCAGATATCCTAGAGTTTTGACCAACGTTCACAGCACACATATTGATCTTATAGGACCT
5 6 0.023881 0.1020 8.150000e-01 708.0 748.00 6 NM_008462 killer cell lectin-like receptor, subfamily A, member 2 Klra2 NaN TGAATTGAAGTTCCTTAAATCCCAACTTCAAAGAAACACATACTGGATTTCACTGACACATCATAAAAGC
6 7 -1.484182 0.1250 1.420000e-32 10200.0 336.00 7 NM_008029 FMS-like tyrosine kinase 4 Flt4 NaN GAGGTGCTGTGGGATGACCGCCGGGGCATGCGGGTGCCCACTCAACTGTTGCGCGATGCCCTGTACCTGC
7 8 -1.826136 0.4150 1.100000e-05 719.0 10.70 8 NM_054088 adiponutrin Adpn NaN GTCTGAGTTCCATTCCAAAGACGAAGTCGTGGATGCCCTGGTGTGTTCCTGCTTCATTCCCCTCTTCTCT
8 9 -1.034478 1.7800 1.000000e+00 96.2 8.89 9 NM_009750 nerve growth factor receptor (TNFRSF16) associated protein 1 Ngfrap1 NaN TACAGCTGAGAAATTGTCTACGCATCCTTATGGGGGAGCTGTCTAACCACCACGATCACCATGATGAATT
9 10 0.240589 0.3090 4.360000e-01 161.0 280.00 10 AB045323 DNA segment, Chr 8, ERATO Doi 594, expressed D8Ertd594e NaN GATTCAGACTCGGGAGGAGCATCCCAACCTCTCCTTGAGGATAAAGGCCTGAGCGATTGCCCTGGGGAGC
10 11 0.320937 0.3590 3.710000e-01 125.0 261.00 11 AK005789 dynein, cytoplasmic, light chain 2B Dncl2b NaN TGCAGAAGGCATTCCAATCCGAACAACCCTGGACAACTCCACAACGGTTCAGTATGCGGGTCTTCTCCAC
11 12 0.358304 2.0600 1.000000e+00 20.4 46.60 12 NM_010517 insulin-like growth factor binding protein 4 Igfbp4 NaN GGAGAAGCTGGCGCGCTGCCGCCCCCCCGTGGGTTGCGAGGAGTTGGTGCGGGAGCCAGGCTGCGGTTGT
12 13 -0.012207 0.3640 9.730000e-01 184.0 179.00 13 AK010722 RIKEN cDNA 2410075D05 gene 2410075D05Rik NaN GGAGCATCTGGAGTTCCGCTTACCGGAAATAAAGTCTTTACTATCGGTGATTGGAGGGCAGTTCACTAAC
13 14 -1.548040 0.1300 7.210000e-33 10200.0 290.00 14 AK003755 DNA segment, Chr 4, ERATO Doi 421, expressed D4Ertd421e NaN AGCAAAGAGATCTCCCTCAGTGTGCCCATAGGTGGCGGTGCGAGCTTGCGGTTATTGGCCAGTGACTTGC
14 15 0.007342 0.2980 9.800000e-01 221.0 225.00 15 BC003241 cleavage stimulation factor, 3' pre-RNA, subunit 3 Cstf3 NaN AAATTAGAAGAAAATCCATATGACCTTGATGCTTGGAGCATTCTCATTCGAGAGGCACAGAATCAACCTA
15 16 -0.226702 0.9440 8.100000e-01 89.0 52.80 16 AK004937 RIKEN cDNA 1300007O09 gene 1300007O09Rik NaN CAGACACAAACCCTAGGTTGTATTGTAGACCGGAGTTTAAGCAGGCACTACCTGTCTGTCTTTTCTTCAT
16 17 -0.148402 0.8010 8.530000e-01 96.5 68.60 17 AK004524 unnamed protein product; hypothetical SOCS domain NaN NaN CGGAGCCCTGCGCGCCCAGAGCCCCCTCCCACCCGCTTCCACCAAGTGCATGGAGCCAACATCCGCATGG
17 18 -0.612220 0.1280 1.690000e-06 1120.0 273.00 18 NM_025999 RIKEN cDNA 2610110L04 gene 2610110L04Rik NaN TGCATTGATAAATGGAGTGATCGACACAGGAACTGCCCCATTTGTCGCCTACAGATGACTGGAGCAAATG
18 19 0.079690 0.0878 3.640000e-01 821.0 987.00 19 NaN NaN NaN -- CONTROL NaN
19 20 -0.084895 0.9380 9.280000e-01 76.8 63.20 20 NM_023120 guanine nucleotide binding protein (G protein), beta polypeptide 1-like Gnb1l NaN ACCGCCTGGTCCCAGATTTGTCCTCCGAGGCACACAGTCGGCTGTGAACACGCTCCATTTCTGCCCACCA
GB_ACC VALUE
AB045323 0.240589
AK003367 -0.393227
AK003755 -1.54804
AK004524 -0.148402
AK004937 -0.226702
AK005789 0.320937
AK010722 -0.012207
AK015719 0.182768
BC003241 0.007342
BC003333 -0.986599
NM_008029 -1.484182
NM_008154 0.141225
NM_008462 0.023881
NM_009750 -1.034478
NM_010517 0.358304
NM_023120 -0.084895
NM_025999 -0.61222
NM_054088 -1.826136
U02079 -1.627476