nousr/example_script.sh

## readme.md

      
### Install stuff


1. Create a virtual environment with `python3 -m venv .env`, then activate it with `source .env/bin/activate`.
2. Install PyTorch.
3. Install clip-retrieval: `pip install clip-retrieval`.
4. Install s3fs: `pip install s3fs`.
5. (Optional) Install wandb with `pip install wandb` and log in with `wandb login`.

### Create a script that points to your data and output folder


1. Create a folder of image and txt pairs that share the same filename (except for the extension).
   - Example: `img0.png`, `img0.txt`
2. Fill out the `--input_dataset` and `--output_folder` fields of the script below.
3. Change the wandb project name (`--wandb_project`), and set `--enable_wandb` according to whether you use wandb.
4. Adjust the CLIP model preference (`--clip_model`).
5. Adjust the Slurm job comment (`--slurm_job_comment`) to use your team's account.
6. Set your Slurm cache path (`--slurm_cache_path`); it can be anything you'd like.
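
Before submitting the job, it can help to confirm that the dataset folder and the script are actually ready. The following is a minimal sketch, not part of clip-retrieval: the helper names `find_unpaired_images` and `find_unfilled_placeholders` are hypothetical, and the list of image extensions is an assumption that you may need to adjust for your data.

```bash
#!/bin/bash
# Pre-flight checks: illustrative helpers, not part of clip-retrieval.

# Print every image under "$1" that has no same-named .txt file next to it.
# Returns 0 when every image is paired, 1 otherwise.
# Assumption: images end in .png, .jpg or .jpeg; extend the list as needed.
find_unpaired_images() {
  dataset_dir="$1"
  unpaired=$(
    find "$dataset_dir" -type f \
      \( -iname '*.png' -o -iname '*.jpg' -o -iname '*.jpeg' \) |
      while IFS= read -r img; do
        # Strip the extension and look for the matching caption file.
        [ -f "${img%.*}.txt" ] || printf 'missing caption: %s\n' "$img"
      done
  )
  if [ -n "$unpaired" ]; then
    printf '%s\n' "$unpaired"
    return 1
  fi
  return 0
}

# Print every line of the script "$1" that still holds a "<...>" template
# placeholder as an option value. Returns 0 when none remain, 1 otherwise.
find_unfilled_placeholders() {
  script_path="$1"
  if grep -n '="<' "$script_path"; then
    return 1
  fi
  return 0
}
```

One way to use it, assuming the helpers are saved in a file you source first: run `find_unpaired_images /path/to/images && find_unfilled_placeholders example_script.sh`, and launch the script only when both return successfully.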

More notes and advanced usage: https://github.com/rom1504/clip-retrieval

  
## example_script.sh
#!/bin/bash
clip-retrieval inference \
--input_dataset="<parent folder containing images>" \
--output_folder="<output s3 bucket or local folder>" \
--input_format="files" \
--enable_metadata=False \
--write_batch_size=500 \
--num_prepro_workers=2 \
--batch_size=64 \
--enable_wandb=True \
--wandb_project="<project name>" \
--clip_model="open_clip:ViT-H-14" \
--use_jit=False \
--distribution_strategy="slurm" \
--slurm_job_name="shot-deck-embed" \
--slurm_partition="g40423" \
--slurm_nodes=1 \
--slurm_job_comment="<your account>" \
--slurm_job_timeout=350000 \
--cache_path=None \
--clip_cache_path=None \
--slurm_cache_path="<your cache path>" \
--slurm_verbose_wait=False