@dkypuros
Created June 16, 2025 15:16
MaaS Documentation

The documentation provided by Parasol’s MaaS was not very clear or explicit, especially for a first-time user. Here’s a breakdown of what the documentation offered versus what I had to discover through experimentation:

What the Documentation Said

Model List:
The docs listed available models (like Granite-3.1-8B-Instruct) and their endpoints.

API Endpoints:
It showed endpoint URLs for each model, e.g.:
https://granite-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443

Paths:
For each model, it listed the available HTTP paths, such as:
  /v1/chat/completions
  /v1/completions
  /v1/embeddings
  /v1/models (to list available models)

No explicit sample cURL or Python code:
There was no clear “here’s how to call the chat endpoint” example for your specific model or API key.

Model Name Ambiguity

The model names in the docs (like Granite-3.1-8B-Instruct) did not match the actual model ID required by the API (granite-3-8b-instruct). The docs did not clarify the exact string to use in the model parameter.


What I Discovered

Correct Model Name:
The API required the model name to be exactly granite-3-8b-instruct (all lowercase, hyphenated, and without the .1 sub-version), not Granite-3.1-8B-Instruct as indicated in the documentation.

How I Found the Model Name:
I had to call /v1/models to list the actual model IDs accepted by the API.
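
For reference, here is a minimal sketch of that discovery step. It reuses the Bearer-token auth from the Quick Start below and assumes the server returns the usual OpenAI-style listing (a "data" array of objects with an "id" field); treat it as a starting point rather than an exact recipe.

    import os
    import requests
    from dotenv import load_dotenv

    load_dotenv()

    API_KEY = os.getenv("GRANITE_API_KEY")
    # Base URL without the /v1/chat/completions suffix (assumed; adjust to your deployment).
    BASE_URL = "https://granite-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443"

    # GET /v1/models returns the model IDs the API actually accepts.
    resp = requests.get(f"{BASE_URL}/v1/models",
                        headers={"Authorization": f"Bearer {API_KEY}"})
    resp.raise_for_status()

    # OpenAI-compatible servers typically list models under "data".
    for model in resp.json().get("data", []):
        print(model.get("id"))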

Request Format:
The API is OpenAI-compatible, so the request format is the same as OpenAI’s chat/completions endpoint.

Working Example:
Through trial and error, I arrived at a working cURL and Python example using the correct model name and endpoint.


Quick Start Guide: Calling Granite with API Keys Securely

Here’s how I got a working Python script that securely handles API keys using a .env file:

  1. Create a .env file (in your project directory):

    GRANITE_API_KEY=your_api_key_here
    GRANITE_API_URL=https://granite-3-8b-instruct-maas-apicast-production.apps.prod.rhoai.rh-aiservices-bu.com:443/v1/chat/completions
  2. Install dependencies (in your virtual environment):

    pip install requests python-dotenv
  3. Create a Python script (e.g., inference.py):

    import os
    import requests
    from dotenv import load_dotenv

    # Load GRANITE_API_KEY and GRANITE_API_URL from the .env file.
    load_dotenv()

    API_KEY = os.getenv("GRANITE_API_KEY")
    API_URL = os.getenv("GRANITE_API_URL")

    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}"
    }

    # OpenAI-compatible chat/completions payload; the model ID must be the
    # exact string returned by /v1/models, not the name shown in the docs.
    data = {
        "model": "granite-3-8b-instruct",
        "messages": [
            {"role": "user", "content": "Hello, world!"}
        ]
    }

    response = requests.post(API_URL, headers=headers, json=data)
    print("Status Code:", response.status_code)
    print("Response:", response.text)
  4. Run your script:

    python inference.py

This approach keeps your API key out of your code and lets you easily swap endpoints or keys as needed.
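
Because the API is OpenAI-compatible, the generated text should sit in the standard choices structure. If you want just the reply rather than the raw JSON, a small addition to inference.py (assuming that standard response shape) looks like this:

    # Append after the requests.post call in inference.py.
    response.raise_for_status()
    result = response.json()
    # OpenAI-style responses put the generated text at choices[0].message.content;
    # adjust if your deployment returns a different shape.
    reply = result["choices"][0]["message"]["content"]
    print("Assistant reply:", reply)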


Summary Table

Step                   Documentation Provided   What Actually Worked / Discovered
Endpoint URL           Yes                      Yes
Model Name             Ambiguous                Had to call /v1/models to get exact ID
Example Request        No                       Had to construct and test
Auth/API Key usage     Implied                  Standard Bearer token worked
OpenAI Compatibility   Implied                  Confirmed by request/response format

Conclusion

The documentation gave me the pieces, but not a clear, working example. The most critical gap was the mismatch between the documented model name and the actual model parameter required by the API. I had to experiment and inspect the /v1/models response to discover the correct value.
