Basic Concepts 📚
Welcome to the core concepts of the EmbodiedAgents framework. This section introduces the three fundamental building blocks you’ll work with:
🧩 Components: Basic building blocks of EmbodiedAgents.
🔌 Clients: Execution backends that instantiate and call inference on ML models served on local or remote platforms.
🧠 Models / Vector DBs: Configurations for ML models (such as LLMs, MLLMs, TTS, and vision models) or vector databases used by components.
Each of these building blocks can be composed, configured, and executed flexibly, enabling powerful embodied agent-based applications across modalities.
Components 🧩
Components are the modular building blocks that define the behavior of an agent. A component can represent any functional behavior, for example the ability to understand and process text. Each component defines:
Inputs and outputs (which are ROS topics)
Functional logic, including runtime configuration parameters
Health status and fallback mechanisms
Components can be combined arbitrarily to create more complex systems such as multi-modal agents with perception-action loops.
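To make the three parts of a component concrete, here is a minimal, self-contained sketch. It does not use the EmbodiedAgents API: `Topic`, `TextComponent`, `Health`, and the `max_chars` parameter are hypothetical stand-ins invented for illustration, and a plain function call takes the place of ROS message passing.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, Dict, List


class Health(Enum):
    """Hypothetical health states for a component."""
    HEALTHY = "healthy"
    FAILED = "failed"


@dataclass
class Topic:
    """Hypothetical stand-in for a ROS topic: a name and a message type."""
    name: str
    msg_type: str


@dataclass
class TextComponent:
    """Hypothetical component: inputs, outputs, logic, config, health, fallback."""
    inputs: List[Topic]
    outputs: List[Topic]
    process: Callable[[str], str]    # functional logic
    fallback: Callable[[str], str]   # used when process() raises
    config: Dict[str, int] = field(default_factory=dict)
    health: Health = Health.HEALTHY

    def step(self, text: str) -> str:
        try:
            result = self.process(text)
            self.health = Health.HEALTHY
        except Exception:
            # Record the failure and switch to the fallback behavior.
            self.health = Health.FAILED
            result = self.fallback(text)
        # Runtime configuration parameter (an assumption for this sketch).
        return result[: self.config.get("max_chars", 80)]


def upper_or_fail(text: str) -> str:
    if not text:
        raise ValueError("empty input")
    return text.upper()


component = TextComponent(
    inputs=[Topic("text_in", "String")],
    outputs=[Topic("text_out", "String")],
    process=upper_or_fail,
    fallback=lambda text: "<no input>",
    config={"max_chars": 20},
)
print(component.step("hello robot"), component.health)
print(component.step(""), component.health)
```

The second call fails inside `process`, so the component marks itself as failed and returns the fallback result instead of raising. In the real framework the inputs and outputs would be ROS topics rather than function arguments.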
📘 Learn more: Components
Clients 🔌
Clients are execution backends that instantiate and call inference on ML models. Each client:
Communicates with a remote model serving platform.
Uses one of several communication protocols, such as HTTP, WebSockets, or RESP.
Clients abstract away the details of model serving, so your components stay agnostic to the platform on which a machine learning model is served.
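The abstraction can be sketched as a shared interface with interchangeable backends. The names below (`ModelClient`, `FakeHTTPClient`, `FakeRESPClient`, `describe_scene`) are hypothetical and are not part of EmbodiedAgents; the fake clients only format a string instead of opening a connection, so the example runs without any server.

```python
from abc import ABC, abstractmethod


class ModelClient(ABC):
    """Hypothetical client interface: one call, whatever the serving platform."""

    @abstractmethod
    def inference(self, prompt: str) -> str:
        """Send the prompt to the served model and return its reply."""


class FakeHTTPClient(ModelClient):
    """Pretends to send the prompt to an HTTP model server."""

    def __init__(self, host: str, port: int) -> None:
        self.url = f"http://{host}:{port}/inference"

    def inference(self, prompt: str) -> str:
        return f"[http {self.url}] {prompt}"


class FakeRESPClient(ModelClient):
    """Pretends to send the prompt to a server that speaks RESP."""

    def __init__(self, host: str, port: int) -> None:
        self.address = f"{host}:{port}"

    def inference(self, prompt: str) -> str:
        return f"[resp {self.address}] {prompt}"


def describe_scene(client: ModelClient) -> str:
    # The caller depends only on the ModelClient interface,
    # so the backend can be swapped without touching this code.
    return client.inference("What do you see?")


print(describe_scene(FakeHTTPClient("localhost", 8000)))
print(describe_scene(FakeRESPClient("localhost", 6379)))
```

Because `describe_scene` never inspects the concrete client type, changing the serving platform is a one-line change at the call site.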
📘 Learn more: Clients
Models / Vector DBs 🧠
Components often rely on underlying machine learning models or vector databases. These are defined as specifications, such as:
LLMs, MLLMs, Whisper, TTS, vision models
Vector DBs like ChromaDB for semantic retrieval
These models and DBs are standardized using attrs-based classes, so you can easily plug them into compatible clients regardless of platform or backend.
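As a rough illustration of what a declarative specification buys you, the sketch below defines two spec classes and turns them into a backend-neutral payload. It uses the standard library's `dataclasses` as a stand-in for attrs so that it runs without extra dependencies. `LLMSpec`, `VectorDBSpec`, `init_payload`, the checkpoint string, and the temperature range are all assumptions made for this example, not the framework's actual classes or defaults.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LLMSpec:
    """Hypothetical model specification (stand-in for an attrs-based class)."""
    name: str
    checkpoint: str
    temperature: float = 0.7

    def __post_init__(self) -> None:
        # Validation lives in the spec, so every client receives checked values.
        # The 0 to 2 range is an assumption for this sketch.
        if not 0.0 <= self.temperature <= 2.0:
            raise ValueError("temperature must be between 0 and 2")


@dataclass(frozen=True)
class VectorDBSpec:
    """Hypothetical vector database specification."""
    name: str
    collection: str = "default"


def init_payload(spec) -> dict:
    """Build what a hypothetical client might send to a serving platform."""
    return {"kind": type(spec).__name__, **asdict(spec)}


llm = LLMSpec(name="chat", checkpoint="example-llm:latest")
db = VectorDBSpec(name="memory")
print(init_payload(llm))
print(init_payload(db))
```

Since a spec is plain, validated data, any compatible client can serialize it in whatever form its platform expects, which is what keeps the specification independent of the backend.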
📘 Learn more: Models & Vector DBs