@simonw

simonw/raw.md Secret

Created October 23, 2024 17:57

Prompt Gemini Script with Raw Output Option

human (Oct 23, 2024, 10:52 AM)

#!/bin/bash

# Check if GOOGLE_API_KEY is set
if [ -z "$GOOGLE_API_KEY" ]; then
    echo "Error: GOOGLE_API_KEY environment variable is not set" >&2
    exit 1
fi

# Default model
model="8b"
prompt=""
image_file=""

# Parse arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        -m)
            model="$2"
            shift 2
            ;;
        *)
            if [ -z "$prompt" ]; then
                prompt="$1"
            elif [ -z "$image_file" ]; then
                image_file="$1"
            fi
            shift
            ;;
    esac
done

# Validate prompt
if [ -z "$prompt" ]; then
    echo "Error: No prompt provided" >&2
    echo "Usage: prompt-gemini \"prompt\" [image_file] [-m model]" >&2
    exit 1
fi

# Map model names to full model strings
case $model in
    "8b"|"flash-8b")
        model_string="gemini-1.5-flash-8b-latest"
        ;;
    "flash")
        model_string="gemini-1.5-flash-latest"
        ;;
    "pro")
        model_string="gemini-1.5-pro-latest"
        ;;
    *)
        model_string="gemini-1.5-$model"
        ;;
esac

# Create temporary file
temp_file=$(mktemp)
trap 'rm -f "$temp_file"' EXIT

# Determine mime type if image file is provided
if [ -n "$image_file" ]; then
    if [ ! -f "$image_file" ]; then
        echo "Error: File '$image_file' not found" >&2
        exit 1
    fi

    # Get file extension and convert to lowercase
    ext=$(echo "${image_file##*.}" | tr '[:upper:]' '[:lower:]')
    
    case $ext in
        png)
            mime_type="image/png"
            ;;
        jpg|jpeg)
            mime_type="image/jpeg"
            ;;
        gif)
            mime_type="image/gif"
            ;;
        pdf)
            mime_type="application/pdf"
            ;;
        *)
            echo "Error: Unsupported file type .$ext" >&2
            exit 1
            ;;
    esac

    # Create JSON with image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        },
        {
          "inlineData": {
            "data": "$(base64 -i "$image_file")",
            "mimeType": "$mime_type"
          }
        }
      ]
    }
  ]
}
EOF
else
    # Create JSON without image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        }
      ]
    }
  ]
}
EOF
fi

# Make API request
curl -s "https://generativelanguage.googleapis.com/v1beta/models/$model_string:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d @"$temp_file" | jq

Modify this script to add an extra -r option which, if present, causes the final line to pipe through jq like this:

... | jq '.candidates[0].content.parts[0].text' -r

assistant (Oct 23, 2024, 10:52 AM)

I'll help modify the script to add the raw output option.

Modified Prompt Gemini Script with Raw Output Option

#!/bin/bash

# Check if GOOGLE_API_KEY is set
if [ -z "$GOOGLE_API_KEY" ]; then
    echo "Error: GOOGLE_API_KEY environment variable is not set" >&2
    exit 1
fi

# Default model and options
model="8b"
prompt=""
image_file=""
raw_output=false

# Parse arguments
while [[ $# -gt 0 ]]; do
    case $1 in
        -m)
            model="$2"
            shift 2
            ;;
        -r)
            raw_output=true
            shift
            ;;
        *)
            if [ -z "$prompt" ]; then
                prompt="$1"
            elif [ -z "$image_file" ]; then
                image_file="$1"
            fi
            shift
            ;;
    esac
done

# Validate prompt
if [ -z "$prompt" ]; then
    echo "Error: No prompt provided" >&2
    echo "Usage: prompt-gemini \"prompt\" [image_file] [-m model] [-r]" >&2
    exit 1
fi

# Map model names to full model strings
case $model in
    "8b"|"flash-8b")
        model_string="gemini-1.5-flash-8b-latest"
        ;;
    "flash")
        model_string="gemini-1.5-flash-latest"
        ;;
    "pro")
        model_string="gemini-1.5-pro-latest"
        ;;
    *)
        model_string="gemini-1.5-$model"
        ;;
esac

# Create temporary file
temp_file=$(mktemp)
trap 'rm -f "$temp_file"' EXIT

# Determine mime type if image file is provided
if [ -n "$image_file" ]; then
    if [ ! -f "$image_file" ]; then
        echo "Error: File '$image_file' not found" >&2
        exit 1
    fi

    # Get file extension and convert to lowercase
    ext=$(echo "${image_file##*.}" | tr '[:upper:]' '[:lower:]')
    
    case $ext in
        png)
            mime_type="image/png"
            ;;
        jpg|jpeg)
            mime_type="image/jpeg"
            ;;
        gif)
            mime_type="image/gif"
            ;;
        pdf)
            mime_type="application/pdf"
            ;;
        *)
            echo "Error: Unsupported file type .$ext" >&2
            exit 1
            ;;
    esac

    # Create JSON with image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        },
        {
          "inlineData": {
            "data": "$(base64 -i "$image_file")",
            "mimeType": "$mime_type"
          }
        }
      ]
    }
  ]
}
EOF
else
    # Create JSON without image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        }
      ]
    }
  ]
}
EOF
fi

# Make API request with conditional jq processing
if [ "$raw_output" = true ]; then
    curl -s "https://generativelanguage.googleapis.com/v1beta/models/$model_string:generateContent?key=$GOOGLE_API_KEY" \
        -H 'Content-Type: application/json' \
        -X POST \
        -d @"$temp_file" | jq '.candidates[0].content.parts[0].text' -r
else
    curl -s "https://generativelanguage.googleapis.com/v1beta/models/$model_string:generateContent?key=$GOOGLE_API_KEY" \
        -H 'Content-Type: application/json' \
        -X POST \
        -d @"$temp_file" | jq
fi

The key changes I made to the script are:

  1. Added a new raw_output variable initialized to false
  2. Added -r to the argument parsing section
  3. Updated the usage message to include the new -r option
  4. Modified the final API request section to conditionally use different jq processing based on the raw_output flag
  5. Kept all other functionality intact

Now you can use the script with the -r flag to get raw text output, like this:

./prompt-gemini "Tell me a joke" -r

Or without the -r flag to get the full JSON output as before:

./prompt-gemini "Tell me a joke"
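Because -r is handled inside the same loop as the positional arguments, it should be recognised wherever it appears on the command line. Here is a minimal sketch of that behaviour; parse_args is a hypothetical wrapper added only so the parsing can be exercised without calling the API, and photo.jpg is a made-up file name:

```bash
#!/bin/bash
# Sketch only: parse_args is a hypothetical wrapper around the script's
# argument loop. It sets the same variables the script uses.
parse_args() {
    model="8b"
    prompt=""
    image_file=""
    raw_output=false
    while [[ $# -gt 0 ]]; do
        case $1 in
            -m)
                model="$2"
                shift 2
                ;;
            -r)
                raw_output=true
                shift
                ;;
            *)
                # First bare argument is the prompt, second is the file
                if [ -z "$prompt" ]; then
                    prompt="$1"
                elif [ -z "$image_file" ]; then
                    image_file="$1"
                fi
                shift
                ;;
        esac
    done
}

# -r after the prompt
parse_args "Tell me a joke" -r
echo "prompt=$prompt raw_output=$raw_output"

# -r before everything else, combined with -m and a file name
parse_args -r -m pro "Describe this" photo.jpg
echo "model=$model image_file=$image_file raw_output=$raw_output"
```

The two echo lines should report raw_output=true in both cases, which is why the flag can be placed either before or after the prompt.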

human (Oct 23, 2024, 10:53 AM)

refactor that so it sets a jq variable of '.' if no -r was set and .candidates.... if the option was set

assistant (Oct 23, 2024, 10:53 AM)

Modified Prompt Gemini Script with JQ Variable

#!/bin/bash

# Check if GOOGLE_API_KEY is set
if [ -z "$GOOGLE_API_KEY" ]; then
    echo "Error: GOOGLE_API_KEY environment variable is not set" >&2
    exit 1
fi

# Default model and options
model="8b"
prompt=""
image_file=""
jq_filter="."

# Parse arguments
while [[ $# -gt 0 ]]; do
    case $1 in
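        -m)
            # Guard against -m with no value; "shift 2" would otherwise fail and loop forever
            if [ $# -lt 2 ]; then
                echo "Error: -m requires a model name" >&2
                exit 1
            fi
            model="$2"
            shift 2
            ;;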
        -r)
            jq_filter=".candidates[0].content.parts[0].text"
            shift
            ;;
        *)
            if [ -z "$prompt" ]; then
                prompt="$1"
            elif [ -z "$image_file" ]; then
                image_file="$1"
            fi
            shift
            ;;
    esac
done

# Validate prompt
if [ -z "$prompt" ]; then
    echo "Error: No prompt provided" >&2
    echo "Usage: prompt-gemini \"prompt\" [image_file] [-m model] [-r]" >&2
    exit 1
fi

# Map model names to full model strings
case $model in
    "8b"|"flash-8b")
        model_string="gemini-1.5-flash-8b-latest"
        ;;
    "flash")
        model_string="gemini-1.5-flash-latest"
        ;;
    "pro")
        model_string="gemini-1.5-pro-latest"
        ;;
    *)
        model_string="gemini-1.5-$model"
        ;;
esac

# Create temporary file
temp_file=$(mktemp)
trap 'rm -f "$temp_file"' EXIT

# Determine mime type if image file is provided
if [ -n "$image_file" ]; then
    if [ ! -f "$image_file" ]; then
        echo "Error: File '$image_file' not found" >&2
        exit 1
    fi

    # Get file extension and convert to lowercase
    ext=$(echo "${image_file##*.}" | tr '[:upper:]' '[:lower:]')
    
    case $ext in
        png)
            mime_type="image/png"
            ;;
        jpg|jpeg)
            mime_type="image/jpeg"
            ;;
        gif)
            mime_type="image/gif"
            ;;
        pdf)
            mime_type="application/pdf"
            ;;
        *)
            echo "Error: Unsupported file type .$ext" >&2
            exit 1
            ;;
    esac

    # Create JSON with image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        },
        {
          "inlineData": {
            "data": "$(base64 -i "$image_file")",
            "mimeType": "$mime_type"
          }
        }
      ]
    }
  ]
}
EOF
else
    # Create JSON without image data
    cat <<EOF > "$temp_file"
{
  "contents": [
    {
      "role": "user",
      "parts": [
        {
          "text": "$prompt"
        }
      ]
    }
  ]
}
EOF
fi

# Make API request with jq filter
curl -s "https://generativelanguage.googleapis.com/v1beta/models/$model_string:generateContent?key=$GOOGLE_API_KEY" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d @"$temp_file" | jq "$jq_filter" -r

The key changes I made are:

  1. Replaced the raw_output boolean with a jq_filter variable that's initialized to "."
  2. Modified the -r option handling to set the jq_filter to ".candidates[0].content.parts[0].text"
  3. Simplified the final API request to a single curl command that pipes through jq with the jq_filter variable
  4. Always pass -r to jq: it only changes how string results are printed, so the full JSON object produced by the "." filter is unaffected

The usage remains the same, but the code is now more maintainable with less duplication. The -r option still produces raw text output while omitting it shows the full JSON response.
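To check the claim about -r, here is a small sketch that runs both filters against a hand-written sample. The sample JSON is an assumption that only mimics the shape the filter expects; it is not real API output. It requires jq to be installed:

```bash
#!/bin/bash
# Hypothetical, hand-written response; not captured from the real API.
sample='{"candidates":[{"content":{"parts":[{"text":"Hello"}]}}]}'

# With the "." filter the result is an object, so -r changes nothing.
with_r=$(printf '%s' "$sample" | jq -r '.')
without_r=$(printf '%s' "$sample" | jq '.')
[ "$with_r" = "$without_r" ] && echo "object output identical"

# With the text filter the result is a string, so -r drops the JSON quoting.
jq_filter='.candidates[0].content.parts[0].text'
printf '%s' "$sample" | jq "$jq_filter"      # prints "Hello" (with quotes)
printf '%s' "$sample" | jq -r "$jq_filter"   # prints Hello
```

One caveat worth keeping in mind: if the response has no candidates (for example, an error response), the text filter evaluates to null and jq -r prints the word null rather than failing.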
