Created
February 29, 2024 14:43
-
-
Save linnil1/5151f9a7587edf93f77ae42480af3ffe to your computer and use it in GitHub Desktop.
Read Excel that contains merge cell
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import pandas as pd | |
from openpyxl import load_workbook | |
def readExcel(filename): | |
excel = load_workbook(filename) | |
sheetname = excel.sheetnames[0] | |
df = pd.read_excel(filename, sheet_name=sheetname, header=None) | |
# print("origin:", df, sep="\n") | |
sheet = excel.get_sheet_by_name(sheetname) | |
for merged_cell in sheet.merged_cells: | |
c1, r1, c2, r2 = map(lambda i: i - 1, merged_cell.bounds) | |
# print("Fill", (r1, c1), "to", (r2, c2), "with", (r1, c1), "=", df.iloc[r1, c1]) | |
df.iloc[r1 : r2 + 1, c1 : c2 + 1] = df.iloc[r1, c1] | |
# print("Result:", df, sep="\n") | |
return df | |
df = readExcel("sample_merged_sheet.xlsx") | |
# set first line as header | |
df.columns = df.iloc[0] | |
df = df[1:] | |
print(df) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
## Excel: | |
``` | |
h1 h2 h3 h4 h5 h6 h7 h8 | |
-------------------------------------------------------------------------- | |
| 1.0 | 2.0 | 3.0 | 4.0 | NaN | 1.0 | 2.0 | 3.0 | | |
-------------------------------------------------------------------------- | |
| 2.0 | NaN | NaN | 4.0 | 4.0 | 5.0 | 7.0 | | |
------------------------------------------------------- | |
| 3.0 | 3.0 | 3.0 | NaN | 5.0 | | | |
-------------------------------------------------------------------------- | |
| 4.0 | NaN | 3.0 | 0.0 | 5.0 | NaN | | |
---------------------------------------------- --------- | |
| 5.0 | 5.0 | 8.0 | 8.0 | 8.0 | | NaN | | |
---------- -------------------------- ---------- | |
| 6.0 | | 8.0 | 16.0 | 24.0 | | NaN | | |
------------------- --------------------------------------------- | |
| 7.0 | 7.0 | | NaN | 24.0 | NaN | 0.0 | 0.0 | | |
---------- ------------------------------------------------------ | |
| 8.0 | | 0.0 | NaN | 24.0 | 24.0 | 24.0 | 24.0 | | |
-------------------------------------------------------------------------- | |
| 9.0 | 9.0 | 9.0 | 9.0 | 33.0 | 57.0 | 81.0 |105.0 | | |
## Use `pd.read_excel()` | |
``` | |
h1 h2 h3 h4 h5 h6 h7 h8 | |
0 1 2.0 3.0 4.0 NaN 1.0 2.0 3.0 | |
1 2 NaN NaN 4.0 4.0 5.0 7.0 NaN | |
2 3 3.0 3.0 NaN NaN 5.0 NaN NaN | |
3 4 NaN 3.0 NaN 0.0 5.0 NaN NaN | |
4 5 5.0 8.0 8.0 8.0 NaN NaN NaN | |
5 6 NaN 8.0 16.0 24.0 NaN NaN NaN | |
6 7 7.0 NaN NaN 24.0 NaN 0.0 0.0 | |
7 8 NaN 0.0 NaN 24.0 24.0 24.0 24.0 | |
8 9 9.0 9.0 9.0 33.0 57.0 81.0 105.0 | |
``` | |
## Result (`readExcel`) | |
Fill (4, 2) to (4, 3) with (4, 2) = 3 | |
Fill (6, 2) to (7, 2) with (6, 2) = 8 | |
Fill (3, 2) to (3, 3) with (3, 2) = 3 | |
Fill (7, 1) to (8, 1) with (7, 1) = 7 | |
Fill (4, 5) to (6, 6) with (4, 5) = 5 | |
Fill (5, 1) to (6, 1) with (5, 1) = 5 | |
Fill (2, 6) to (3, 7) with (2, 6) = 7 | |
h1 h2 h3 h4 h5 h6 h7 h8 | |
1 1 2 3 4 NaN 1 2 3 | |
2 2 NaN NaN 4 4 5 7 7 | |
3 3 3 3 3 NaN 5 7 7 | |
4 4 NaN 3 3 0 5 5 NaN | |
5 5 5 8 8 8 5 5 NaN | |
6 6 5 8 16 24 5 5 NaN | |
7 7 7 8 NaN 24 NaN 0 0 | |
8 8 7 0 NaN 24 24 24 24 | |
9 9 9 9 9 33 57 81 105 | |
``` |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment