Version: 1.0.0-beta.5

Architecture

Please read the glossary first; it defines the terms used in this section.

High-Level Architecture Schema

Figure: Leon's High-Level Architecture Schema

The following scenario walks through the steps of the schema above. Please note that most interactions happen over WebSockets.

1. The client (web app, etc.) makes an HTTP request to GET some information about Leon.
2. The HTTP API responds to the client with this information.
3. The user talks into their microphone.
4. a. If the hotword server is launched, Leon listens (offline) for the user calling him by saying Leon.
   b. If Leon understands that the user is calling him, Leon emits a message to the main server via a WebSocket. Leon is now listening (offline) to the user.
   c. The user says Hello! to Leon, and the client transforms the audio input into an audio blob.
5. The ASR transforms the audio blob into a wave file.
6. The STT parser transforms the wave file into a string (Hello).
7. a. The user receives the string, and the string is forwarded to the NLU.
   b. Or the user types Hello! with their keyboard (skipping steps 1. to 7.a.). The Hello! string is forwarded to the NLU.
8. The NLU classifies the string and picks the resulting classification.
9. If the collaborative logger is enabled, the classification is sent to the collaborative logger.
10. The brain creates a child process and executes the chosen module.
11. If the synchronizer is enabled and the module has this option, it synchronizes content.
12. The brain creates an answer and forwards it to the TTS synthesizer.
13. The TTS synthesizer transforms the text answer (which is also sent to the user as text) into an audio buffer, which is played by the client.
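
The last part of the scenario (classification, module execution in a child process, and answer synthesis) can be illustrated with a small Python sketch. This is not Leon's actual code: the expression table, the inline module source, and the function names `classify`, `execute_module`, `synthesize`, and `handle_query` are all hypothetical stand-ins chosen for illustration. The real NLU, brain, and TTS synthesizer are far more elaborate.

```python
import json
import subprocess
import sys
from typing import Optional

# Hypothetical expression table: normalized query -> (package, module).
EXPRESSIONS = {
    "hello": ("leon", "greeting"),
    "bye": ("leon", "bye"),
}

# Hypothetical module, run in a separate interpreter. It reads a JSON
# request from its first argument and prints a JSON answer on stdout.
MODULE_SOURCE = """
import json, sys
request = json.loads(sys.argv[1])
speeches = {"greeting": "Hello there!", "bye": "See you soon!"}
print(json.dumps({"speech": speeches[request["module"]]}))
"""


def classify(query: str) -> Optional[dict]:
    """Stand-in for the NLU: exact match on the normalized string."""
    key = query.lower().strip(" !?.")
    if key not in EXPRESSIONS:
        return None
    package, module = EXPRESSIONS[key]
    return {"package": package, "module": module}


def execute_module(classification: dict) -> str:
    """Stand-in for the brain: run the chosen module in a child process."""
    result = subprocess.run(
        [sys.executable, "-c", MODULE_SOURCE, json.dumps(classification)],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["speech"]


def synthesize(text: str) -> bytes:
    """Stand-in for the TTS synthesizer: fake an audio buffer from the text."""
    return text.encode("utf-8")


def handle_query(query: str) -> Optional[dict]:
    """Classify, execute, then return the answer as both text and audio."""
    classification = classify(query)
    if classification is None:
        return None  # No module matched the query.
    speech = execute_module(classification)
    return {"text": speech, "audio": synthesize(speech)}


if __name__ == "__main__":
    print(handle_query("Hello!"))
```

Running the module in a child process, as in the scenario, keeps a crashing or slow module from taking down the process that orchestrates it. Here the two sides exchange JSON over the command line and stdout, which is only one possible assumption about how the brain and a module might communicate.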