Topic: Running LLM-based chatbots on edge and low-compute environments
Abstract: Deploying conversational AI systems on resource-constrained edge devices requires sophisticated optimization techniques to preserve response quality while staying within tight compute and memory budgets. This paper presents a comprehensive study of the optimization and deployment of an end-to-end conversational AI pipeline, comprising Whisper automatic speech recognition (ASR), a 3-billion-parameter Llama-3 large language model (LLM), and a lightweight text-to-speech (TTS) module, running entirely on a Raspberry Pi.
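The cascaded structure of the pipeline described above (speech in, speech out) can be sketched as below. All component functions are hypothetical stubs standing in for the Whisper ASR, Llama-3 LLM, and TTS stages; this is an illustration of the cascade, not the paper's implementation, which would invoke optimized on-device runtimes for each stage.

```python
def transcribe(audio: bytes) -> str:
    """Stub ASR stage: a real system would run a Whisper model here."""
    return "what is the weather today"

def generate_reply(prompt: str) -> str:
    """Stub LLM stage: a real system would run the 3B Llama-3 model here."""
    return f"You asked: {prompt}"

def synthesize(text: str) -> bytes:
    """Stub TTS stage: a real system would render the reply to audio."""
    return text.encode("utf-8")

def voice_pipeline(audio: bytes) -> bytes:
    """End-to-end cascade: audio in, ASR -> LLM -> TTS, audio out."""
    transcript = transcribe(audio)
    reply = generate_reply(transcript)
    return synthesize(reply)
```

Each stage hands plain text to the next, so on a single-board computer the three models can be swapped or quantized independently without changing the pipeline contract.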