Avatar Chatbot: Eric AI for Network Marketing Pro

Chatbots
AI

Summary

Purpose

The client set out to create a conversational AI avatar capable of answering questions using private, internal knowledge, while remaining transparent, controllable, and easy to integrate into an existing product ecosystem. The goal was to deliver a production-quality Proof of Concept (PoC) demonstrating both technical feasibility and readiness for real-world use.

Methodology

We followed an iterative, delivery-first approach:

  • Rapid local prototyping to validate core assumptions

  • Early cloud deployment to uncover integration and latency issues

  • Multiple feedback and refinement cycles

  • Close collaboration with the client’s frontend and engineering teams

"Postdata Team helped us with an urgent project and ensured everything was delivered on time. We had a strict one-month deadline before our product release, and they managed the pressure perfectly.
They built an interactive chatbot that communicates with our clients and answers questions based on our internal database. The chatbot was carefully trained to stay polite, helpful, and strictly within our domain, avoiding any irrelevant or off-topic responses. It was delivered exactly on schedule and is already improving our customer experience.
Postdata Team also demonstrated strong expertise in visual GenAI technologies (such as D-iD and HeyGen), which was valuable in building an engaging and interactive solution.
Overall, we are very satisfied with their professionalism, technical skills, and ability to deliver under tight deadlines. Highly recommended." — Network Marketing Pro

Project overview

Scope

Build an end-to-end backend system that would:

  • Retrieve answers from a private knowledge base using Retrieval-Augmented Generation (RAG)

  • Support multi-turn conversational context

  • Generate real-time talking avatar video responses

  • Deploy in a scalable, cloud-native environment

  • Integrate cleanly with the client’s frontend and internal systems

  • Document architectural decisions and future extensibility options

  • Ensure multilingual support


We delivered a modular, cloud-deployed backend composed of two connected pipelines:

  1. Conversational RAG Service
    A FastAPI-based service that retrieves relevant content from a vector database, generates grounded answers using an LLM, and maintains short-term conversational memory.

  2. Video Avatar Generation Service
    A real-time avatar streaming integration that converts generated answers into lip-synced video using a third-party avatar provider.

The backend was deployed as a containerized service on Google Cloud Run, using automatic scaling to handle variable request load without manual management. This ensured the PoC was not only functional but also built on production-ready architecture.
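The two connected pipelines can be sketched as a single request flow. In this sketch the provider calls (Pinecone retrieval, Gemini generation, HeyGen streaming) are replaced with hypothetical stubs, so the function names and return shapes are illustrative rather than the actual client APIs:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the real providers (Pinecone, Gemini, HeyGen);
# in the PoC these are actual SDK calls.
def retrieve_chunks(query: str, top_k: int = 3) -> list[str]:
    # Vector-database similarity search would go here.
    return [f"knowledge-base passage relevant to: {query}"]

def generate_answer(question: str, context: list[str], history: list[str]) -> str:
    # LLM call, constrained by prompt to the retrieved context only.
    return f"Grounded answer to '{question}' based on {len(context)} passage(s)."

def start_avatar_stream(answer: str) -> str:
    # Avatar provider call that returns a streaming-session handle.
    return f"stream-session-{len(answer)}"

@dataclass
class Conversation:
    """Short-term conversational memory plus the two pipelines."""
    history: list[str] = field(default_factory=list)

    def ask(self, question: str) -> dict:
        context = retrieve_chunks(question)
        answer = generate_answer(question, context, self.history)
        self.history.append(f"Q: {question} / A: {answer}")
        return {"answer": answer, "stream": start_avatar_stream(answer)}
```

In the deployed system, a FastAPI route simply wraps `Conversation.ask` behind a POST endpoint, keeping the orchestration logic independent of the web layer.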

Results

Core Deliverables:

  • End-to-end conversational RAG backend: we delivered a fully functional backend capable of retrieving answers from a private knowledge base, maintaining short-term conversational context, and generating grounded, auditable responses using an LLM.

  • Real-time AI avatar video streaming: the system converts generated text answers into lip-synced video responses in real time, enabling a natural, engaging conversational experience for end users.

  • Cloud-native deployment and integration: the solution was deployed as a containerized service on Google Cloud Run with autoscaling enabled and integrated into the client’s system through well-defined HTTP APIs.

  • Data ingestion and indexing pipeline: a deterministic pipeline was implemented to clean, chunk, embed, and index client documents, ensuring reliable retrieval quality and repeatable updates.

  • Comprehensive documentation: we delivered detailed API references, deployment instructions, and operational documentation to support ongoing development and handover.


Additional Value Delivered:

  • Performance and cost optimization: we optimized LLM calls to reduce latency and control inference costs; we created internal endpoints for monitoring usage, controlling costs, and observing system performance.

  • Safety and quality of responses: prompt guardrails were introduced to define avatar behavior and ensure a smooth user experience.

  • Extended capabilities and integration support: we added an LLM-based translation layer and worked closely with the client’s frontend team to ensure correct real-time streaming and end-to-end integration.
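The translation layer can be sketched as a thin wrapper around the RAG pipeline: translate the question into English, answer against the knowledge base, translate the answer back. Both `detect_language` and `llm_translate` are hypothetical stubs here; in the real system they are LLM calls:

```python
def detect_language(text: str) -> str:
    # Naive stub; the real system asks the LLM to classify the language.
    spanish_markers = ("qué", "cómo", "hola", "gracias")
    return "es" if any(m in text.lower() for m in spanish_markers) else "en"

def llm_translate(text: str, target: str) -> str:
    # Stub that only tags the target language; a real call prompts the LLM.
    return f"[{target}] {text}"

def answer_in_user_language(question: str, rag_answer) -> str:
    # Translate in, answer in English against the knowledge base, translate out.
    lang = detect_language(question)
    q_en = llm_translate(question, "en") if lang != "en" else question
    answer_en = rag_answer(q_en)
    return llm_translate(answer_en, lang) if lang != "en" else answer_en
```

Keeping retrieval and generation in a single pivot language avoids maintaining a separate index per language.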

Insights & Conclusions

The project quickly delivered an AI avatar grounded in private knowledge without sacrificing architectural quality. In addition to meeting all technical objectives, the client successfully showcased the solution at a large public event with a high number of attendees, where the system ran live. Early cloud deployment, autoscaling infrastructure, and continuous feedback loops were critical to achieving this outcome within a short time.

Next steps

Following the successful PoC, the next phase will focus on introducing persistent conversation history, expanded multilingual support, automated re-indexing pipelines, and deeper RAG quality improvements such as enhanced multi-turn grounding and retrieval over chat history. Additional exploration may include evaluating alternative avatar providers and embedding models, selectively assessing the Vertex AI stack for potential performance or cost benefits, and extending interaction modalities with voice-based input.

Project duration:

4 weeks

Team

3

1 Lead Data Scientist, 1 Data Scientist, 1 Backend Developer

Technologies

Python, FastAPI, Gemini, Pinecone, HeyGen, Google Cloud Run

Tech challenge

Knowledge retrieval quality: we tuned chunking strategy, retrieval parameters, and prompt constraints limiting the LLM to retrieved context only.
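Constraining the LLM to retrieved context typically comes down to a prompt template along these lines; the wording below is illustrative, not the production prompt:

```python
RAG_PROMPT = """You are a helpful assistant for this company.
Answer using ONLY the context below. If the context does not contain
the answer, say you don't know; never invent information.

Context:
{context}

Question: {question}
Answer:"""

def build_prompt(question: str, chunks: list[str]) -> str:
    # Join retrieved chunks with a visible separator so the model
    # can tell passage boundaries apart.
    return RAG_PROMPT.format(context="\n---\n".join(chunks), question=question)
```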

End-to-end latency: optimized request orchestration and reduced token usage after early cloud deployment.

Multi-turn context management: we introduced conversation history summarization to maintain relevant context across turns while improving accuracy and token efficiency.
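The summarization approach can be sketched as: keep the most recent turns verbatim and compress everything older into a single summary turn. `llm_summarize` is a hypothetical stub for the LLM call used in the real system:

```python
def llm_summarize(text: str) -> str:
    # Stub; the real system sends older turns to the LLM for compression.
    return text[:60].rstrip() + "..."

def compress_history(history: list[str], max_turns: int = 4) -> list[str]:
    # Recent turns stay verbatim; older turns collapse into one summary,
    # bounding token usage regardless of conversation length.
    if len(history) <= max_turns:
        return list(history)
    older, recent = history[:-max_turns], history[-max_turns:]
    summary = llm_summarize(" ".join(older))
    return [f"Summary of earlier conversation: {summary}"] + recent
```

This bounds the prompt size: no matter how long the conversation runs, the model sees at most `max_turns` verbatim turns plus one summary line.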

Cost control & observability: added usage and monitoring endpoints, optimized prompt length, and assisted the client team with request-limit management.

Avatar safety & brand alignment: iterated on prompt guardrails multiple times based on stakeholder feedback and real-world testing.

Frontend integration: provided clear API contracts, example payloads, and hands-on support for the client’s frontend team.

Solution

The project delivered a cloud-native, scalable AI avatar system that combines retrieval-augmented intelligence with real-time video generation. It established a robust foundation for future expansion into a production-grade platform.

Let's talk about your case

Email: andrii.rohovyi@postdata.ai
