Agent Initialization Delay Fix

The document outlines a server implementation using Deepgram's SDK to handle restaurant order processing via Twilio. It includes functions for fetching menus, looking up customers, and managing order placements while ensuring a friendly interaction with the customer. The server utilizes WebSocket for real-time communication and handles various events related to audio and function calls.

Uploaded by

ptmdash
Agent initialization delay fix

import { createClient, AgentEvents } from '@deepgram/sdk';
import { WebSocketServer, WebSocket } from 'ws';
import fs from 'fs';
import https from 'https';
import 'dotenv/config';
import axios from 'axios';
import querystring from 'querystring';
import parsePhoneNumber from 'libphonenumber-js'

const deepgram = createClient(process.env.DEEPGRAM_API_KEY);
let API_BASE_URL = "http://127.0.0.1:3000"

function getNationalNumber(phone: unknown): string | undefined {
  if (typeof phone === 'string') {
    const phno = parsePhoneNumber(phone, 'US');
    return phno ? phno.countryCallingCode + phno.nationalNumber : phone;
  }
  return undefined;
}

let restaurantNo: string | undefined = '';
let customerPhone: string | undefined = '';
let storedMenu: any = [];
let storedCustomer: any = null;

const server = https.createServer({
  cert: fs.readFileSync('./cert.pem'),
  key: fs.readFileSync('./key.pem'),
}, (req, res) => {
  if (req.url === '/') {
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Hello');
  } else if (req.url === '/api/twilio' && req.method === 'POST') {
    let body = '';
    req.on('data', chunk => {
      body += chunk.toString(); // collect data
    });
    req.on('end', () => {
      console.log("Hello - Twilio POST received");
      const parsed = querystring.parse(body);
      restaurantNo = getNationalNumber(parsed.To);
      customerPhone = getNationalNumber(parsed.From);
      console.log(`Restaurant: ${restaurantNo}, Customer: ${customerPhone}`);
      if (restaurantNo) {
        fetchMenu(restaurantNo).then(menu =>
          storedMenu = menu
        ).catch(error => {
          console.error('Error fetching menu:', error);
        });
      }
      if (customerPhone) {
        findCustomer(customerPhone, restaurantNo).then(customerResult =>
          storedCustomer = customerResult
        ).catch(error => {
          storedCustomer = null;
          console.error('Background customer fetch error:', error.data);
        });
      }
      // <Say voice="alice">Connecting your call.</Say>
      const twilioResponse = `<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Connect>
    <Stream url="wss://q6f1bmhp-443.inc1.devtunnels.ms"/>
  </Connect>
</Response>`;
      console.log('Twilio response:', twilioResponse);
      res.writeHead(200, { 'Content-Type': 'application/xml' });
      res.end(twilioResponse);
    });
  } else {
    res.writeHead(404);
    res.end();
  }
});

const wss = new WebSocketServer({ server });
let browserWs: WebSocket | null = null;
let streamSid: string | null = null;

async function fetchMenu(callId: string) {
  try {
    let url = `${API_BASE_URL}/api/menu-items?no=${restaurantNo}`
    console.log(`Fetching menu for call ${callId}:`, url);
    const response: any = await axios.get(url);
    const data = response.data;
    return data;
  } catch (error) {
    console.error('Menu API Error:', error);
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error',
      data: []
    };
  }
}

async function findCustomer(phone: string | undefined, restaurantNo: string | undefined) {
  try {
    let url = `${API_BASE_URL}/api/customers?phone=${phone}&restaurantNo=${restaurantNo}`;
    console.log(`Looking up customer:`, url);
    const response: any = await axios.get(url);
    const data = response.data;
    return data || null;
  } catch (error: any) {
    console.error('Customer lookup Error:', error);
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Unknown error',
      data: null
    };
  }
}

const humanLikeFunctions = [
  {
    "name": "end_conversation",
    "description": "End conversation naturally when customer indicates they're done (goodbye, thanks, that's all, etc.)",
    "parameters": {
      "type": "object",
      "properties": {
        "reason": {
          "type": "string",
          "description": "Why the conversation is ending"
        }
      },
      "required": ["reason"]
    }
  },
  {
    "name": "placeOrder",
    "description": "Submit the complete order to the kitchen - make sure everything is perfect!",
    "parameters": {
      "type": "object",
      "properties": {
        "customerName": {
          "type": "string",
          "description": "Customer's name (only needed for new customers)"
        },
        "customerPhone": {
          "type": "string",
          "description": "Phone number (usually already have this)"
        },
        "customerEmail": {
          "type": "string",
          "description": "Email if they want receipts/updates"
        },
        "orderType": {
          "type": "string",
          "enum": ["dine-in", "takeaway", "delivery"],
          "description": "How they want their food - eating here, taking out, or delivery"
        },
        "tableNumber": {
          "type": "string",
          "description": "Table number for dine-in orders"
        },
        "deliveryAddress": {
          "type": "object",
          "description": "Where to deliver (only for delivery orders)",
          "properties": {
            "street": { "type": "string" },
            "city": { "type": "string" },
            "zipCode": { "type": "string" },
            "specialInstructions": {
              "type": "string",
              "description": "Delivery notes like 'ring doorbell', 'leave at door'"
            }
          },
          "required": ["street", "city", "zipCode"]
        },
        "orderItems": {
          "type": "array",
          "description": "Everything they want to eat - must include exact details from menu",
          "items": {
            "type": "object",
            "properties": {
              "id": {
                "type": "integer",
                "description": "Menu item ID number (from fetchMenu)"
              },
              "menuItemName": {
                "type": "string",
                "description": "Exact name from menu (from fetchMenu)"
              },
              "quantity": {
                "type": "integer",
                "minimum": 1,
                "description": "How many they want"
              },
              "price": {
                "type": "number",
                "description": "Price per item (from fetchMenu)"
              },
              "size": {
                "type": "string",
                "enum": ["small", "medium", "large", "extra-large"],
                "description": "Size if applicable"
              },
              // "customizations": {
              //   "type": "array",
              //   "items": {"type": "string"},
              //   "description": "Special requests like 'no onions', 'extra cheese'"
              // },
              "notes": {
                "type": "string",
                "description": "Any special instructions for this item"
              }
            },
            "required": ["id", "menuItemName", "quantity", "price"]
          }
        },
        "specialRequests": {
          "type": "string",
          "description": "Any special notes for the whole order"
        },
        "isConfirmed": {
          "type": "boolean",
          "description": "Customer has confirmed they want to place this order"
        }
      },
      "required": ["customerName","orderItems", "isConfirmed"]
    }
  }
];

function connectToAgent(prompt: string) {
  try {
    // Create an agent connection
    const agent = deepgram.agent();
    agent.on('Welcome', (data) => {
      console.log('Server welcome message:', data);
      // Uncomment the following lines if you want to handle Twilio's media events
      agent.configure({
        audio: {
          input: {
            encoding: 'mulaw',
            sample_rate: 8000
          },
          output: {
            encoding: 'mulaw',
            sample_rate: 8000,
            container: 'none'
          }
          // input: {
          //   encoding: 'linear16',
          //   sample_rate: 24000
          // },
          // output: {
          //   encoding: 'linear16',
          //   sample_rate: 24000,
          //   container: 'none'
          // }
        },
        agent: {
          listen: {
            provider: {
              type: 'deepgram',
              model: 'nova-3'
            }
          },
          think: {
            provider: {
              type: 'open_ai',
              model: 'gpt-4o-mini'
            },
            // prompt: humanLikeRestaurantPrompt,
            prompt,
            functions: humanLikeFunctions
          },
          speak: {
            provider: {
              type: 'deepgram',
              model: 'aura-2-helena-en'
            }
          },
          greeting: "Hello! Welcome to the Restaurant. How can I help you today?"
        }
      });
    });
    agent.on(AgentEvents.Audio, (audio: Buffer) => {
      if (browserWs?.readyState === WebSocket.OPEN) {
        try {
          browserWs.send(JSON.stringify({
            event: 'media',
            streamSid: streamSid,
            media: { payload: audio.toString('base64') }
          }));
        } catch (error) {
          console.error('Audio error:', error instanceof Error ? error.message : 'Unknown error');
        }
      }
    });
    agent.on(AgentEvents.FunctionCallRequest, async (functionCall) => {
      const functionName = functionCall.functions[0].name;
      console.log('Function call received:', functionCall.functions[0].name);
      const functionId = functionCall.functions[0].id;
      try {
        let responseContent = {};
        switch (functionName) {
          case 'placeOrder':
            // Your existing placeOrder logic here, but ensure consistent response format
            const orderData = JSON.parse(functionCall.functions[0].arguments);
            let finalCustomerData = {
              customerName: orderData.customerName,
              customerPhone: orderData.customerPhone || customerPhone,
              customerEmail: orderData.customerEmail
            };
            if (storedCustomer && storedCustomer.success) {
              finalCustomerData = {
                customerName: storedCustomer.customer.name || orderData.customerName,
                customerPhone: storedCustomer.customer.phone || customerPhone,
                customerEmail: storedCustomer.customer.email || orderData.customerEmail
              };
            }
            const validatedOrderData = {
              customerName: finalCustomerData.customerName,
              customerPhone: finalCustomerData.customerPhone,
              customerEmail: finalCustomerData.customerEmail || null,
              orderItems: orderData.orderItems || [],
              specialRequests: orderData.specialRequests || null,
              isConfirmed: orderData.isConfirmed || false,
              restaurantNo: restaurantNo,
              orderType: orderData.orderType || 'takeaway',
              tableNumber: orderData.tableNumber || null,
              deliveryAddress: orderData.deliveryAddress || null
            };
            console.log(validatedOrderData,"Asd");
            const orderResponse = await fetch(`${API_BASE_URL}/api/create-order`, {
              method: 'POST',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify(validatedOrderData)
            });
            const orderResult = await orderResponse.json();
            if (orderResponse.ok && orderResult.success) {
              responseContent = {
                success: true,
                message: `Order placed! Order #${orderResult.data?.id}. Total: $${orderResult.data?.totalAmount}`,
                orderId: orderResult.data?.id,
                totalAmount: orderResult.data?.totalAmount
              };
            } else {
              throw new Error(orderResult.message || 'Failed to place order');
            }
            break;
          case 'end_conversation':
            const endData = JSON.parse(functionCall.functions[0].arguments);
            responseContent = {
              success: true,
              message: "Thank you for your order! Have a great day!",
              reason: endData.reason
            };
            // Send response first, then handle cleanup
            const endResponse = {
              type: "FunctionCallResponse",
              id: functionId,
              name: functionName,
              content: JSON.stringify(responseContent)
            };
            agent.send(JSON.stringify(endResponse));
            // Delay cleanup to allow response to be processed
            setTimeout(() => {
              agent.disconnect();
              if (browserWs) browserWs.close();
            }, 2000);
            return; // Exit early for end_conversation
        }
        // Standardized response format for all functions
        const standardResponse = {
          type: "FunctionCallResponse",
          id: functionId,
          name: functionName,
          content: JSON.stringify(responseContent)
        };
        console.log(`Sending ${functionName} response:`, JSON.stringify(standardResponse).substring(0, 200) + '...');
        // Use consistent send method
        agent.send(JSON.stringify(standardResponse));
      } catch (error) {
        console.error(`Error in ${functionName} handler:`, error);
        const errorResponse = {
          type: "FunctionCallResponse",
          id: functionId,
          name: functionName,
          content: JSON.stringify({
            success: false,
            error: error instanceof Error ? error.message : "An unexpected error occurred"
          })
        };
        agent.send(JSON.stringify(errorResponse));
      }
    });
    return agent;
  } catch (error) {
    console.error('Error connecting to Deepgram:', error);
    process.exit(1);
  }
}

wss.on('connection', async (ws, request) => {
  browserWs = ws;
  const updatedPrompt = `
Customer: ${storedCustomer.success? JSON.stringify(storedCustomer): 'No customer found'}
Menu: ${JSON.stringify(storedMenu.data)}, Total Menu Items: ${storedMenu.data.length}

You are a friendly restaurant assistant. Keep responses under 50 words.

CRITICAL:
- Check if customer exists: ${storedCustomer.success}, if false ALWAYS ask for name. if true, use customer data.
- NEVER list multiple menu items at once
- Use the provided customer and menu data to personalize responses
- After placing an order, end the conversation and call end_conversation()
- Extra Details: Restro No. ${restaurantNo}, Customer Phone: ${customerPhone}

WORKFLOW:
1. Greet: "Hi! What can I get you?"
2. Take order: Ask for items, size, customizations
3. GATHER INFO NATURALLY:
   - "Are you dining with us, taking it to go, or should we deliver?"
   - For dine-in: "Perfect! Which table are you at?" (remember this!)
   - For delivery: "Got it! What's the address?"
   - For takeaway: "Sounds good!"
4. Complete: "So that's [items] for [type]". After this IMMEDIATELY Use placeOrder() then end_conversation()

Be warm, efficient, and helpful. Ask one question at a time.
`;
  const agent = await connectToAgent(updatedPrompt);
  ws.on('message', (message: Buffer) => {
    try {
      const data = JSON.parse(message.toString());
      if (data.event === 'start') {
        streamSid = data.start.streamSid;
        return;
      }
      // Send media payload to Deepgram
      if (data.event === 'media' && data.media?.payload) {
        agent?.send(Buffer.from(data.media.payload, 'base64'));
      } else {
        agent?.send(message);
      }
    } catch (error) {
      console.error('Message error:', error instanceof Error ? error.message : 'Unknown error');
    }
  });
  ws.on('close', async () => {
    if (agent) {
      await agent.disconnect();
    }
    browserWs = null;
  });
  ws.on('error', (error) => {
    console.error('WebSocket error:', error);
  });
});

wss.on('listening', () => {
  console.log('WebSocket server is listening on port 443');
});

server.listen(443, () => {
  console.log('Listening on wss://localhost:443');
});

// Handle graceful shutdown
process.on('SIGINT', async () => {
  console.log('\nReceived SIGINT. Closing server gracefully...');
  // Close all WebSocket connections
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.close(1000, 'Server shutting down');
    }
  });
  // Close WebSocket server
  wss.close(() => {
    console.log('WebSocket server closed');
  });
  // Close HTTPS server
  server.close(() => {
    console.log('HTTPS server closed');
    process.exit(0);
  });
});

process.on('SIGTERM', async () => {
  console.log('\nReceived SIGTERM. Closing server gracefully...');
  // Close all WebSocket connections
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.close(1000, 'Server shutting down');
    }
  });
  // Close WebSocket server
  wss.close(() => {
    console.log('WebSocket server closed');
  });
  // Close HTTPS server
  server.close(() => {
    console.log('HTTPS server closed');
    process.exit(0);
  });
});

// Handle uncaught exceptions
process.on('uncaughtException', (error) => {
  console.error('Uncaught Exception:', error);
  // Close all WebSocket connections
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.close(1011, 'Server error');
    }
  });
  // Close WebSocket server
  wss.close(() => {
    console.log('WebSocket server closed due to error');
  });
  // Close HTTPS server
  server.close(() => {
    console.log('HTTPS server closed due to error');
    process.exit(1);
  });
});

// Handle unhandled promise rejections
process.on('unhandledRejection', (reason, promise) => {
  console.error('Unhandled Rejection at:', promise, 'reason:', reason);
  // Close all WebSocket connections
  wss.clients.forEach((client) => {
    if (client.readyState === WebSocket.OPEN) {
      client.close(1011, 'Server error');
    }
  });
  // Close WebSocket server
  wss.close(() => {
    console.log('WebSocket server closed due to error');
  });
  // Close HTTPS server
  server.close(() => {
    console.log('HTTPS server closed due to error');
    process.exit(1);
  });
});

Because of the agent initialization, the call takes a long time to connect. How can I handle that?

The delay is happening because your agent is being initialized only after the WebSocket connects —
meaning Twilio is already streaming audio, but your Deepgram agent is still starting up.
We can solve this in two ways, or combine them:

1️⃣ Preload & Pool Agents Before Calls

Instead of creating the agent inside wss.on('connection'), create (or “warm up”) an agent in advance and keep it ready.
Approach:

Maintain an agent pool so a ready-to-use agent is always available.

On call end, recycle or recreate the agent for the next call.

This avoids the cold-start delay of deepgram.agent().

Example:

ts

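let readyAgent: any = null;

async function preloadAgent(prompt: string) {
  // connectToAgent() opens the connection now; the agent is configured
  // (prompt included) once its 'Welcome' event fires.
  readyAgent = await connectToAgent(prompt);
  console.log("Agent preloaded and ready.");
}

// Preload at server startup
preloadAgent("Default warm-up prompt");

wss.on('connection', (ws, request) => {
  browserWs = ws;

  const updatedPrompt = /* build your prompt */;

  // Use the preloaded agent immediately.
  // Caveat: connectToAgent() fixes the prompt when the agent is created, so this
  // agent carries the prompt it was preloaded with, not updatedPrompt.
  const agent = readyAgent;

  // Start preloading next one for future calls
  preloadAgent(updatedPrompt);

  ws.on('message', (message: Buffer) => {
    try {
      const data = JSON.parse(message.toString());
      if (data.event === 'start') {
        streamSid = data.start.streamSid;
        return;
      }
      if (data.event === 'media' && data.media?.payload) {
        agent?.send(Buffer.from(data.media.payload, 'base64'));
      } else {
        agent?.send(message);
      }
    } catch (error) {
      // A malformed frame should not take the whole server down
      console.error('Message error:', error instanceof Error ? error.message : 'Unknown error');
    }
  });
});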

✅ Pros: Immediate connection for calls
⚠ Cons: Slightly higher resource usage because an agent is always running
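
The single readyAgent variable from the example can be generalized into a small pool. The following is a minimal sketch, not part of the Deepgram SDK: WarmPool and its create callback are hypothetical names, and create stands in for whatever builds an agent (for example a wrapper around connectToAgent). It assumes a pooled item does not depend on call-specific data, which is not the case for connectToAgent as written, since it takes the prompt up front.

```typescript
// Generic warm pool: keeps `size` pre-created items ready and refills after each take.
// `create` is a placeholder for an agent factory (an assumption for this sketch).
class WarmPool<T> {
  private ready: T[] = [];
  private readonly create: () => T;
  private readonly size: number;

  constructor(create: () => T, size: number = 1) {
    this.create = create;
    this.size = size;
    this.refill();
  }

  // Top the pool back up to its target size.
  private refill(): void {
    while (this.ready.length < this.size) {
      this.ready.push(this.create());
    }
  }

  // Hand out a warm item if one exists, otherwise create one on demand.
  take(): T {
    const item = this.ready.shift() ?? this.create();
    this.refill();
    return item;
  }

  // Number of items currently waiting in the pool.
  get available(): number {
    return this.ready.length;
  }
}

// Usage with a stand-in factory; a real factory would open an agent connection.
let created = 0;
const pool = new WarmPool(() => ({ id: ++created }), 2);
const first = pool.take();
console.log(first.id, pool.available);
```

Taking an item immediately triggers a refill, so the cost of creating the next agent is paid between calls rather than while a caller is waiting.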
2️⃣ Return Twilio XML Immediately & Load Agent in Parallel
Your Twilio handler already starts fetchMenu() and findCustomer() without awaiting them, so the XML response is not held up by those API calls. Keep that ordering explicit, and use the time the agent needs to start up:

Send Twilio <Response> immediately.

Do menu & customer lookup in the background while Deepgram is initializing.

Example:

ts

else if (req.url === '/api/twilio' && req.method === 'POST') {
  let body = '';
  req.on('data', chunk => { body += chunk.toString(); });
  req.on('end', () => {
    const parsed = querystring.parse(body);
    restaurantNo = getNationalNumber(parsed.To);
    customerPhone = getNationalNumber(parsed.From);

    // Send Twilio XML right away
    const twilioResponse = `<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say voice="alice">Connecting your call.</Say>
  <Connect>
    <Stream url="wss://your-server"/>
  </Connect>
</Response>`;
    res.writeHead(200, { 'Content-Type': 'application/xml' });
    res.end(twilioResponse);

    // Now do API lookups in background
    if (restaurantNo) fetchMenu(restaurantNo).then(menu => storedMenu = menu);
    if (customerPhone) findCustomer(customerPhone, restaurantNo).then(c => storedCustomer = c);
  });
}

✅ Pros: Twilio connects instantly, user hears “Connecting your call”


⚠ Cons: The menu/customer data might not have arrived yet when the WebSocket connects and the prompt is built, so the prompt has to handle missing data gracefully
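
One way to handle missing data is to build the data part of the prompt defensively. This is a minimal sketch under assumptions: buildPromptContext, CustomerLookup and MenuLookup are hypothetical names, the shapes are inferred from how the server code reads storedCustomer and storedMenu, and the fallback wording is only an illustration. Note that the original template reads storedCustomer.success directly, which throws while storedCustomer is still null.

```typescript
// Shapes inferred from how the server code reads storedCustomer and storedMenu (assumptions).
interface CustomerLookup {
  success: boolean;
  customer?: { name?: string; phone?: string; email?: string };
}
interface MenuLookup {
  data?: unknown[];
}

// Build the data section of the prompt without throwing when a lookup has not finished.
function buildPromptContext(customer: CustomerLookup | null, menu: MenuLookup | null): string {
  const customerKnown = customer !== null && customer.success === true;
  const data = menu ? menu.data : undefined;
  const items: unknown[] = Array.isArray(data) ? data : [];

  const lines = [
    `Customer: ${customerKnown ? JSON.stringify(customer) : 'No customer found'}`,
    `Menu: ${JSON.stringify(items)}, Total Menu Items: ${items.length}`,
    customerKnown
      ? 'Customer exists: use the customer data.'
      : 'Customer not found: ALWAYS ask for name.',
  ];
  if (items.length === 0) {
    // Hypothetical wording for the case where the menu has not arrived yet.
    lines.push('Menu data is unavailable right now: do not guess items or prices.');
  }
  return lines.join('\n');
}

// Neither lookup has finished yet: the prompt is still well-formed.
console.log(buildPromptContext(null, null));
```

The result can be placed at the top of the existing prompt in place of the inline Customer and Menu lines.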

3️⃣ Hybrid: Preload + Lazy Data Fetch


The fastest approach is to preload the Deepgram agent at startup and fetch menu/customer in
parallel after the call starts.

This makes call connection instant.

Data arrives in a few seconds and is used mid-conversation without telling the customer
“loading…”.
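
For the hybrid approach, the lookups can be kept as promises and awaited with an upper bound when the call connects, so a slow API never delays the conversation. The sketch below is illustrative only: withTimeout, menuPromise, customerPromise and onCallConnected are hypothetical names, and the two promises are stand-ins for the fetchMenu() and findCustomer() calls started in the Twilio handler.

```typescript
// Resolve with the promise's value, or with `fallback` if it rejects or takes longer than `ms`.
function withTimeout<T>(promise: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    promise
      .then((value) => resolve(value), () => resolve(fallback))
      .then(() => clearTimeout(timer));
  });
}

// Hypothetical stand-ins for the lookups started in the Twilio handler.
const menuPromise: Promise<string[]> = new Promise((resolve) =>
  setTimeout(() => resolve(['Margherita']), 20)
);
const customerPromise: Promise<{ name: string } | null> = new Promise(() => {
  // never settles: simulates a lookup that hangs
});

async function onCallConnected(): Promise<string> {
  // Wait for both lookups at once, but never longer than 200 ms for either of them.
  const [menu, customer] = await Promise.all([
    withTimeout(menuPromise, 200, [] as string[]),
    withTimeout(customerPromise, 200, null),
  ]);
  return `Menu items: ${menu.length}, customer: ${customer ? customer.name : 'unknown'}`;
}

onCallConnected().then((summary) => console.log(summary));
```

The 200 ms bound is an arbitrary value for the sketch; the right limit depends on how long a caller can reasonably wait before the greeting.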

If you want, I can rewrite your connectToAgent and wss.on('connection') to support preloaded warm
agents so the delay is near zero. That would make Twilio calls start almost instantly.

Printed using ChatGPT to PDF, powered by PDFCrowd HTML to PDF API. 7/7
