shubhambajaj bajajcodes

This note is about the Caller Assistant project, my first voice AI project. In it, I built a voice AI assistant using Twilio for telephony, OpenAI as the LLM, Deepgram as the speech-to-text provider, and Amazon Polly (via Twilio) as the text-to-speech provider. The goal was to navigate an IVR and reach a human agent to verify information on a customer’s profile. The customers have information such as health records and insurance details, and they are associated with companies that want to verify those records with healthcare agencies. We built a voice AI assistant/agent that navigates the IVR, connects with a human agent, and checks whether the information available to us matches the healthcare agency’s records. If it does not, the assistant asks for the best method to update it and confirms everything. At the end of the call, it sends an end-of-call report to the customer, provider, or customer manager here to update the customer

Impromptu Google Meet Meeting - September 24


0:03 - shubham bajaj (shmbajaj32@gmail.com) Okay, this project was named Caller Assistant Prompt, and it had three steps to achieving success. The first one was to navigate an IVR. Basically, this means we had a decision tree where we listed the different key presses or content to be spoken to navigate an IVR. Once that is done, we used to identify whether the call is connected with a human or not. If it's connected with a human, then we move to step 2, where we actually verify the information about our customer's customer on file with healthcare agencies. And then finally, at the end, we send the structured output or end-of-call report back to the customer. So now, when I say healthcare agency here, what I mean is our customer has a few clients, and they want to verify whether their clients' customers' current data matches the records of these healthcare agencies or healthcare insurance companies or not. If it's out of date or it's not moved to in progress for insura

Vapi Proxy Server Guide

Overview

A proxy server keeps assistant configs and API keys on your backend. The frontend sends custom data, and the backend maps it to Vapi calls.

Flow: Frontend -> Your Proxy -> Vapi API -> Response -> Frontend
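
As a rough illustration of the "backend maps to Vapi calls" step, the sketch below translates a frontend payload into an upstream request description. Everything named here is an assumption for illustration: the `ASSISTANTS` lookup table, the payload fields (`assistant`, `metadata`), the request body shape, and the `upstreamUrl` parameter, which should be taken from the Vapi API reference. Only the Bearer-token header style comes from the cURL example elsewhere on this page.

```javascript
// Hypothetical server-side lookup: public keys the frontend may send,
// mapped to private assistant IDs that never leave the backend.
const ASSISTANTS = new Map([
  ['support', 'assistant-id-for-support'], // placeholder value
  ['sales', 'assistant-id-for-sales'],     // placeholder value
]);

// Translate the frontend's custom payload into an upstream request.
// The body shape and `upstreamUrl` are assumptions, not the documented Vapi API.
function buildUpstreamRequest(payload, { apiKey, upstreamUrl }) {
  const assistantId = ASSISTANTS.get(payload.assistant);
  if (!assistantId) {
    throw new Error(`Unknown assistant key: ${payload.assistant}`);
  }
  return {
    url: upstreamUrl,
    init: {
      method: 'POST',
      headers: {
        // The API key is attached here, on the backend only.
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({ assistantId, metadata: payload.metadata ?? {} }),
    },
  };
}
```

A proxy route handler would then call `fetch(url, init)` with the returned values and relay the response to the frontend, so neither the API key nor the real assistant ID is ever exposed to the browser.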

Frontend Setup

Teach Yourself: SIP Credential Architecture and Authentication

Understanding SIP Communication Fundamentals

SIP (Session Initiation Protocol) communication requires mutual trust between parties. When you create SIP credentials, you're establishing a bidirectional security relationship.

The Core Principle: Mutual Whitelisting + Authentication

Every SIP connection involves two fundamental security layers:

Teach Yourself: How to Create SIP Credentials

Understanding the Hostname Error

SIP credential creation fails when you use a hostname (domain name) for a gateway that receives inbound calls.

What Changed

The SIP infrastructure (Jambonz) was upgraded and now enforces validation rules. Previously, hostname configurations were accepted but never worked for inbound calls. Now the system blocks invalid configurations immediately.
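
To make the rule concrete, here is a minimal client-side pre-flight check that rejects a hostname on an inbound gateway before the credential is submitted. This is a sketch under stated assumptions, not Jambonz's actual validation logic: the `checkGateway` function, its `{ address, inbound }` input shape, and the error wording are all hypothetical, and only IPv4 addresses are recognized.

```javascript
// Rough IPv4 check: four dot-separated decimal numbers, each in 0-255.
// IPv6 is deliberately not handled in this sketch.
function isIPv4(address) {
  const parts = address.split('.');
  return (
    parts.length === 4 &&
    parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255)
  );
}

// Hypothetical pre-flight check mirroring the rule described above:
// a gateway that receives inbound calls must be an IP address, not a hostname.
function checkGateway({ address, inbound }) {
  if (inbound && !isIPv4(address)) {
    return {
      ok: false,
      reason: `Inbound gateway "${address}" is a hostname; use an IP address instead.`,
    };
  }
  return { ok: true };
}
```

With this check, `checkGateway({ address: 'sip.example.com', inbound: true })` is rejected, while the same hostname on an outbound-only gateway passes.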

You are a Pronunciation Transformation Assistant for text-to-speech (TTS) models. Your workflow always follows these steps:

  1. Context Intake:

    • Ask the user to provide the specific pronunciation transformation task and include a few representative examples.
    • The context can be any domain (e.g., numbers, dates, abbreviations, measurements, technical terms).
  2. Problem Clarification:

    • Restate the problem in your own words to confirm understanding.
    • Ask clarifying questions if the context is ambiguous or incomplete.
export default {
  async fetch(request) {
    const url = new URL(request.url);
    // Handle CORS preflight
    if (request.method === 'OPTIONS') {
      return new Response(null, {
        status: 204,
        headers: {
          'Access-Control-Allow-Origin': '*',

const CONFIG = {
  MODEL_NAME: 'static-model',
  DELAYS: {
    WORD: 20,
    SENTENCE: 50,
    TOOL_CALL: 50,
  },
  RESPONSES: {
    NORMAL: [
      'Hello! This is a static response.',

Customizing Chunk Plan with cURL

Modifying Minimum Characters

curl -X PATCH "https://api.vapi.ai/voice/voices/{voiceId}" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "chunkPlan": {

Trieve Knowledge Base Integration with Vapi

Query Flow Overview

  1. Query Formation:

    • When a user sends a message, Vapi extracts the most recent user queries (up to the last 3).
    • These messages are joined with periods to form a single query string for Trieve.
  2. Knowledge Base Retrieval:

    • Vapi calls the internal function which:
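
The query formation step (step 1 above) can be sketched roughly as follows. This is an illustration, not Vapi's internal code: the function name `buildKnowledgeBaseQuery`, the `{ role, content }` message shape, and the exact separator (a period followed by a space) are assumptions.

```javascript
// Build a single query string from the most recent user messages.
// Assumes each message looks like { role: 'user' | 'assistant', content: string }.
function buildKnowledgeBaseQuery(messages, maxQueries = 3) {
  return messages
    .filter((m) => m.role === 'user')     // keep only user turns
    .slice(-maxQueries)                   // take up to the last 3
    .map((m) => m.content.trim())
    .filter((text) => text.length > 0)    // drop empty turns
    .join('. ');                          // join with periods (separator is an assumption)
}
```

Joining several recent turns gives the retrieval step more context than the latest message alone, which helps when the final user message is a short follow-up.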