
Request

The client is a startup developing a wearable device for pets, equipped with speakers and sensors to enable pet-human conversation. DataRoot Labs was asked to build an intelligent, real-time communication system embedded in the wearable, featuring cloud-based ML for dialogue, sound, and activity detection, personalized through caregiver voice responses, and backed by a robust event-driven alerts pipeline.
The DRL team built a robust, real-time system for voice interaction, emotion recognition, and sensor-based activity understanding. For general conversations, a streaming response-generation system was paired with a custom-designed virtual character that reflects user-selected personas. DRL fine-tuned a voice-cloning system, running extensive experiments to optimize audio quality. Keyword detection was enhanced with a model capable of learning new trigger words from just a few audio examples, avoiding full retraining. Emotion detection combined audio-based classification with text analysis, capturing a wide range of user expressions. All models were optimized for GPU inference and deployed through a scalable serving infrastructure supporting concurrent execution and dynamic batching.
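The few-shot keyword approach described above is typically implemented by comparing embeddings rather than retraining the classifier: a handful of recordings of a new trigger word are averaged into a prototype vector, and detection becomes a similarity check against that prototype. The sketch below illustrates the idea only; the `embed` function is a stand-in random projection, not the actual encoder used in the project, and all names and thresholds are illustrative assumptions.

```python
import numpy as np

def embed(audio: np.ndarray, seed: int = 0) -> np.ndarray:
    # Placeholder encoder: a fixed random projection standing in for a
    # pretrained keyword-embedding network (hypothetical, for illustration).
    rng = np.random.default_rng(seed)
    proj = rng.standard_normal((audio.shape[0], 16))
    vec = audio @ proj
    return vec / (np.linalg.norm(vec) + 1e-9)

def enroll(examples: list) -> np.ndarray:
    # Average the embeddings of a few example recordings into a single
    # class prototype -- no gradient updates, so no full retraining.
    protos = np.stack([embed(x) for x in examples])
    proto = protos.mean(axis=0)
    return proto / (np.linalg.norm(proto) + 1e-9)

def detect(audio: np.ndarray, prototype: np.ndarray,
           threshold: float = 0.7) -> bool:
    # Cosine similarity against the enrolled prototype decides whether
    # the incoming audio window contains the trigger word.
    return float(embed(audio) @ prototype) >= threshold
```

In practice the encoder would be a network trained so that utterances of the same word cluster together; enrolling a new trigger word then takes only a few forward passes, which matches the "learn from a few audio examples" behavior described above.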