Skip to content

Instantly share code, notes, and snippets.

@linnil1
Created February 29, 2024 14:43
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save linnil1/5151f9a7587edf93f77ae42480af3ffe to your computer and use it in GitHub Desktop.
Save linnil1/5151f9a7587edf93f77ae42480af3ffe to your computer and use it in GitHub Desktop.
Read Excel that contains merge cell
import pandas as pd
from openpyxl import load_workbook
def readExcel(filename):
excel = load_workbook(filename)
sheetname = excel.sheetnames[0]
df = pd.read_excel(filename, sheet_name=sheetname, header=None)
# print("origin:", df, sep="\n")
sheet = excel.get_sheet_by_name(sheetname)
for merged_cell in sheet.merged_cells:
c1, r1, c2, r2 = map(lambda i: i - 1, merged_cell.bounds)
# print("Fill", (r1, c1), "to", (r2, c2), "with", (r1, c1), "=", df.iloc[r1, c1])
df.iloc[r1 : r2 + 1, c1 : c2 + 1] = df.iloc[r1, c1]
# print("Result:", df, sep="\n")
return df
df = readExcel("sample_merged_sheet.xlsx")
# set first line as header
df.columns = df.iloc[0]
df = df[1:]
print(df)
## Excel:
```
h1 h2 h3 h4 h5 h6 h7 h8
--------------------------------------------------------------------------
| 1.0 | 2.0 | 3.0 | 4.0 | NaN | 1.0 | 2.0 | 3.0 |
--------------------------------------------------------------------------
| 2.0 | NaN | NaN | 4.0 | 4.0 | 5.0 | 7.0 |
-------------------------------------------------------
| 3.0 | 3.0 | 3.0 | NaN | 5.0 | |
--------------------------------------------------------------------------
| 4.0 | NaN | 3.0 | 0.0 | 5.0 | NaN |
---------------------------------------------- ---------
| 5.0 | 5.0 | 8.0 | 8.0 | 8.0 | | NaN |
---------- -------------------------- ----------
| 6.0 | | 8.0 | 16.0 | 24.0 | | NaN |
------------------- ---------------------------------------------
| 7.0 | 7.0 | | NaN | 24.0 | NaN | 0.0 | 0.0 |
---------- ------------------------------------------------------
| 8.0 | | 0.0 | NaN | 24.0 | 24.0 | 24.0 | 24.0 |
--------------------------------------------------------------------------
| 9.0 | 9.0 | 9.0 | 9.0 | 33.0 | 57.0 | 81.0 |105.0 |
## Use `pd.read_excel()`
```
h1 h2 h3 h4 h5 h6 h7 h8
0 1 2.0 3.0 4.0 NaN 1.0 2.0 3.0
1 2 NaN NaN 4.0 4.0 5.0 7.0 NaN
2 3 3.0 3.0 NaN NaN 5.0 NaN NaN
3 4 NaN 3.0 NaN 0.0 5.0 NaN NaN
4 5 5.0 8.0 8.0 8.0 NaN NaN NaN
5 6 NaN 8.0 16.0 24.0 NaN NaN NaN
6 7 7.0 NaN NaN 24.0 NaN 0.0 0.0
7 8 NaN 0.0 NaN 24.0 24.0 24.0 24.0
8 9 9.0 9.0 9.0 33.0 57.0 81.0 105.0
```
## Result (`readExcel`)
Fill (4, 2) to (4, 3) with (4, 2) = 3
Fill (6, 2) to (7, 2) with (6, 2) = 8
Fill (3, 2) to (3, 3) with (3, 2) = 3
Fill (7, 1) to (8, 1) with (7, 1) = 7
Fill (4, 5) to (6, 6) with (4, 5) = 5
Fill (5, 1) to (6, 1) with (5, 1) = 5
Fill (2, 6) to (3, 7) with (2, 6) = 7
h1 h2 h3 h4 h5 h6 h7 h8
1 1 2 3 4 NaN 1 2 3
2 2 NaN NaN 4 4 5 7 7
3 3 3 3 3 NaN 5 7 7
4 4 NaN 3 3 0 5 5 NaN
5 5 5 8 8 8 5 5 NaN
6 6 5 8 16 24 5 5 NaN
7 7 7 8 NaN 24 NaN 0 0
8 8 7 0 NaN 24 24 24 24
9 9 9 9 9 33 57 81 105
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment