@jendelel
Last active December 9, 2023 16:52
Load INBREAST ROIs
from skimage.draw import polygon
import numpy as np
import plistlib


def load_inbreast_mask(mask_path, imshape=(4084, 3328)):
    """
    This function loads an OsiriX xml region as a binary numpy array for the
    INBREAST dataset
    @mask_path : Path to the xml file
    @imshape : The shape of the image as an array e.g. [4084, 3328]
    return: numpy array where positions in the roi are assigned a value of 1.
    """

    def load_point(point_string):
        # Point strings look like "(x, y)"; return them as (row, column).
        x, y = tuple([float(num) for num in point_string.strip('()').split(',')])
        return y, x

    mask = np.zeros(imshape)
    with open(mask_path, 'rb') as mask_file:
        plist_dict = plistlib.load(mask_file, fmt=plistlib.FMT_XML)['Images'][0]
        numRois = plist_dict['NumberOfROIs']
        rois = plist_dict['ROIs']
        assert len(rois) == numRois
        for roi in rois:
            numPoints = roi['NumberOfPoints']
            points = roi['Point_px']
            assert numPoints == len(points)
            points = [load_point(point) for point in points]
            if len(points) <= 2:
                # Too few points for a polygon: mark the individual pixels.
                for point in points:
                    mask[int(point[0]), int(point[1])] = 1
            else:
                # After load_point, x holds the rows and y holds the columns.
                x, y = zip(*points)
                x, y = np.array(x), np.array(y)
                poly_x, poly_y = polygon(x, y, shape=imshape)
                mask[poly_x, poly_y] = 1
    return mask
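
The coordinate swap in load_point is easy to miss. The point strings are read as (x, y), that is (column, row), while NumPy indexing and skimage.draw.polygon take the row coordinates first. A minimal standalone sketch of that step follows; the sample string is made up for illustration and is not taken from the dataset.

```python
def load_point(point_string):
    """Parse a string such as "(x, y)" and return it as (row, column)."""
    x, y = (float(num) for num in point_string.strip('()').split(','))
    return y, x


# Illustrative input only, not a real INbreast annotation.
row, col = load_point("(10.5, 20.0)")
print(row, col)  # 20.0 10.5
```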
@Feyn-Man

Hi, you can find a simplified version of this code (without the load_point function) on my fork 😄

@tatsunidas

Hi, thank you for your great work.
I have recently been using the INbreast dataset too.
This is just a simple idea, but I think this code could be extended to select specific ROI categories (e.g., Mass/Calcification/Cluster).
When I used your code as-is, all ROIs were loaded into the label array.
So I suggest the following changes:

@roi_class_name is the ROI's name attribute: None selects all ROIs; otherwise use "Calcification" (which also includes "Cluster") or "Mass".

'''
def load_inbreast_roimask(roi_class_name=None, mask_path='', imshape=(4084, 3328)):

    def load_point(point_string):
        x, y = tuple([float(num) for num in point_string.strip('()').split(',')])
        return y, x

    mask = np.zeros(imshape)
    with open(mask_path, 'rb') as mask_file:
        plist_dict = plistlib.load(mask_file, fmt=plistlib.FMT_XML)['Images'][0]
        numRois = plist_dict['NumberOfROIs']
        rois = plist_dict['ROIs']
        assert len(rois) == numRois
        for roi in rois:
            # to check the dict in roi:
            # for k, v in roi.items():
            #     print(k, v)

            # here -start-
            if roi_class_name is not None:
                if roi_class_name == "Calcification":
                    if roi_class_name != roi['Name'] and roi["Name"] != "Cluster":
                        continue
                else:
                    if roi_class_name != roi['Name']:
                        continue
            # -end-
            numPoints = roi['NumberOfPoints']
            points = roi['Point_px']
            assert numPoints == len(points)
            points = [load_point(point) for point in points]
            if len(points) <= 2:
                for point in points:
                    mask[int(point[0]), int(point[1])] = 1
            else:
                x, y = zip(*points)
                x, y = np.array(x), np.array(y)
                poly_x, poly_y = polygon(x, y, shape=imshape)
                mask[poly_x, poly_y] = 1
    return mask.astype(np.uint8)

'''

Do you think these changes look good? (I am not very confident about them...)
I look forward to your reply.

@Feyn-Man

Feyn-Man commented Mar 9, 2021

Hi @tatsunidas, what's the point of checking roi_class_name == "Calcification" when roi_class_name is not None? If you only want to skip the categories that differ from the one specified in the roi_class_name parameter, you can simply use the condition if roi_class_name != roi['Name'].
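
To make the difference between the two filters concrete, here is a minimal sketch of both as standalone predicates. The helper names roi_matches and roi_matches_simple are hypothetical and appear in neither version of the code. The first mirrors the nested conditions in the suggested change, where "Cluster" ROIs are kept when "Calcification" is requested. The second is the plain equality check.

```python
def roi_matches(roi_name, roi_class_name=None):
    """Mirror of the nested check: 'Cluster' is grouped under 'Calcification'."""
    if roi_class_name is None:
        return True  # no filter requested: keep every ROI
    if roi_class_name == "Calcification":
        return roi_name in ("Calcification", "Cluster")
    return roi_name == roi_class_name


def roi_matches_simple(roi_name, roi_class_name=None):
    """Plain equality check: no grouping of 'Cluster' under 'Calcification'."""
    return roi_class_name is None or roi_name == roi_class_name


print(roi_matches("Cluster", "Calcification"))         # True
print(roi_matches_simple("Cluster", "Calcification"))  # False
```

The two predicates agree for every combination except a "Cluster" ROI with roi_class_name set to "Calcification". Which behaviour is wanted depends on whether clusters should count as calcifications.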

@jasminjahanpuspo

jasminjahanpuspo commented Apr 3, 2021

Hello, I am using this code. It is fairly easy to understand, but somehow I get all zeros when I run it on Colab. Here is a screenshot of my code. Any help would be appreciated.
[Screenshot: Capture5]
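
The screenshot is not reproduced here, so the cause can only be guessed at. Two things are worth ruling out. First, printing a 4084 x 3328 NumPy array shows only its corners, which are usually zeros even when the mask is not empty, so checking mask.sum() is more reliable. Second, a mask holding 0 and 1 looks black when viewed as an 8-bit image unless it is scaled (for example, multiplied by 255). It can also help to confirm that the XML file contains ROIs at all. The sketch below does that with the standard library only. The helper name summarize_rois and the sample data are hypothetical, and the keys are the ones the gist's code reads.

```python
import plistlib


def summarize_rois(plist_bytes):
    """Return (name, number_of_points) for each ROI of the first image entry."""
    image = plistlib.loads(plist_bytes)["Images"][0]
    return [(roi.get("Name", ""), len(roi["Point_px"])) for roi in image["ROIs"]]


# Synthetic stand-in for an annotation file, built only for this example.
example = plistlib.dumps({
    "Images": [{
        "NumberOfROIs": 1,
        "ROIs": [{
            "Name": "Mass",
            "NumberOfPoints": 3,
            "Point_px": ["(10.0, 20.0)", "(30.0, 20.0)", "(20.0, 40.0)"],
        }],
    }]
})
print(summarize_rois(example))  # [('Mass', 3)]
```

For a real file, the bytes can be read with open(mask_path, 'rb').read() before being passed to the helper.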

@satwiksunnam19

Hello @jasminjahanpuspo @Feyn-Man @jendelel, have you worked out how to convert the .xml files into PNG format? The code shown here does not work for all of the files and raises an error.

Please reply and share your insights ASAP.
Thanks & Regards,
Satwik Sunnam

@jasminjahanpuspo

Hello @satwiksunnam19, try this code; I hope you find it useful: https://github.com/wentaozhu/inbreast

@satwiksunnam19

satwiksunnam19 commented Jan 30, 2023

Hey @jasminjahanpuspo, I have no experience with MATLAB. It would be great if you could give me step-by-step instructions for running the code.

Thanks & Regards,
Satwik Sunnam.

@jasminjahanpuspo

jasminjahanpuspo commented Jan 30, 2023 via email

@satwiksunnam19

Hello @jasminjahanpuspo, I'm trying to replicate the work of this project: https://github.com/Holliemin9090/Mammographic-mass-CAD-via-pseudo-color-mammogram-and-Mask-R-CNN

I'm using the INBREAST dataset.

For dataset preparation, I've converted both the images and the ROIs:

  1. Completed the conversion of the DICOM images to PNG.
  2. Completed the conversion of the XML files to PNG/JPG images.

My question is:

  1. I have more DICOM images than ROI images. How should I handle this mismatch?

Do you have any suggestions, both on this and on implementing the GitHub project mentioned above?

Thanks & Regards,
S S
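
One possible way to handle the mismatch is to pair each image with its annotation file by case ID and treat unpaired images as having no ROI. This is only a sketch under assumptions that should be checked against the actual files: that each DICOM file name starts with the case ID followed by an underscore, that each XML file is named after that ID alone, and that an image without an XML file has no annotated findings. The helper name pair_images_with_rois and the file names below are hypothetical.

```python
def pair_images_with_rois(dicom_names, xml_names):
    """Map each DICOM file name to its XML file name, or None if there is none."""
    # Assumed naming: "<case_id>.xml" for the annotations.
    xml_by_case = {name.rsplit(".", 1)[0]: name for name in xml_names}
    pairs = {}
    for dicom_name in dicom_names:
        # Assumed naming: the case ID is the part before the first underscore.
        case_id = dicom_name.split("_", 1)[0]
        pairs[dicom_name] = xml_by_case.get(case_id)
    return pairs


# Made-up file names that follow the assumed pattern.
dicoms = ["20586908_abc_MG_R_CC_ANON.dcm", "20586934_abc_MG_L_CC_ANON.dcm"]
xmls = ["20586908.xml"]
pairs = pair_images_with_rois(dicoms, xmls)
for dicom_name, xml_name in pairs.items():
    print(dicom_name, "->", xml_name)
```

Images mapped to None could then either be given an all-zero mask or be left out of training, depending on what the downstream model expects.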

@jasminjahanpuspo

jasminjahanpuspo commented Jan 31, 2023 via email
