@ehofesmann
Last active April 25, 2023 20:34
from fiftyone import ViewField as F


def preprocess_and_merge_with_fiftyone(coco_dataset, oi_dataset, coco_food, oi_food):
    # COCO class names are lowercase, while Open Images class names are
    # capitalized, so normalize them before merging. Here we capitalize the
    # COCO names (e.g. "hot dog" -> "Hot dog").
    coco_food_map = {f: f.capitalize() for f in coco_food}
    coco_view = coco_dataset.map_labels("ground_truth", coco_food_map)

    # Downloading the datasets with specified classes guarantees that each
    # sample contains at least one food item of interest, but detections of
    # other classes are still present in those samples. Since we only care
    # about food, filter out the non-food labels as well.
    # The COCO labels were capitalized above, so match the capitalized names.
    coco_view = coco_view.filter_labels(
        "ground_truth", F("label").is_in(list(coco_food_map.values()))
    )
    oi_view = oi_dataset.filter_labels("ground_truth", F("label").is_in(oi_food))

    # Write the mapped and filtered labels back to the underlying datasets.
    coco_view.save()
    oi_view.save()

    # Now that the COCO and Open Images datasets are ready, merge them into
    # one dataset. clone() picks a default name; pass one, e.g.
    # clone("food"), to name the merged dataset "food".
    dataset = oi_dataset.clone()
    dataset.merge_samples(coco_dataset)

    # Remove the "validation" tag that was added when the validation splits
    # of COCO and Open Images were downloaded.
    dataset.untag_samples(["validation"])
    return dataset
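
The label normalization step can be checked in plain Python, without FiftyOne. The sketch below assumes a few illustrative class names (they are not the lists passed to the function) and shows two things: `str.capitalize()` turns "hot dog" into "Hot dog" rather than "Hot Dog", and once labels have been mapped, a filter has to compare against the capitalized names to keep anything.

```python
# Illustrative class lists (assumptions, not the lists used with the function).
coco_food = ["banana", "hot dog", "pizza"]
oi_food = ["Banana", "Hot dog", "Cheese"]

# Same mapping as in the function: lowercase COCO name -> capitalized name.
coco_food_map = {f: f.capitalize() for f in coco_food}

# str.capitalize() uppercases only the first character of the whole string,
# so "hot dog" becomes "Hot dog" (not "Hot Dog").
normalized = list(coco_food_map.values())

# Labels of a hypothetical sample as they would look after the mapping step;
# names missing from the map (e.g. "person") are left unchanged.
raw_labels = ["pizza", "person", "hot dog"]
labels_after_mapping = [coco_food_map.get(label, label) for label in raw_labels]

# Filtering against the original lowercase names matches nothing...
kept_lowercase = [l for l in labels_after_mapping if l in coco_food]
# ...while filtering against the normalized names keeps the food labels.
kept_normalized = [l for l in labels_after_mapping if l in normalized]

# Classes shared by both lists once the names agree.
shared = sorted(set(normalized) & set(oi_food))

print(labels_after_mapping)
print(kept_lowercase, kept_normalized)
print(shared)
```

Because the mapping is built from `coco_food` itself, `coco_food_map.values()` is always the right set of names to filter on after the mapping has been applied.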