agents.components.mllm

Module Contents

Classes

MLLM

This component uses multimodal large language models (e.g. Llava) to process text and image data.

API

class agents.components.mllm.MLLM(*, inputs: list[Union[agents.ros.Topic, agents.ros.FixedInput]], outputs: list[agents.ros.Topic], model_client: agents.clients.model_base.ModelClient, config: Optional[agents.config.MLLMConfig] = None, trigger: Union[agents.ros.Topic, list[agents.ros.Topic], float] = 1, callback_group=None, component_name: str = 'mllm_component', **kwargs)

Bases: agents.components.llm.LLM

This component uses multimodal large language models (e.g. Llava) to process text and image data.

Parameters:
  • inputs (list[Topic | FixedInput]) – The input topics or fixed inputs for the MLLM component. This should be a list of Topic objects or FixedInput instances, limited to String and Image types.

  • outputs (list[Topic]) – The output topics for the MLLM component. This should be a list of Topic objects. String type is handled automatically.

  • model_client (ModelClient) – The model client for the MLLM component. This should be an instance of ModelClient.

  • config (MLLMConfig) – Optional configuration for the MLLM component. This should be an instance of MLLMConfig. If not provided, defaults to MLLMConfig().

  • trigger (Union[Topic, list[Topic], float]) – The trigger value or topic for the MLLM component. This can be a single Topic object, a list of Topic objects, or a float value for a timed component. Defaults to 1.

  • callback_group (str) – An optional callback group for the MLLM component. If provided, this should be a string. Otherwise, it defaults to None.

  • component_name (str) – The name of the MLLM component. This should be a string and defaults to “mllm_component”.
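
The constraints on inputs and trigger described above can be illustrated with a small standalone sketch. It does not use the library: the Topic dataclass below is a stand-in for agents.ros.Topic so the code runs without ROS, and check_inputs and describe_trigger are hypothetical helpers written for illustration, not functions the package is known to provide. FixedInput is left out for brevity.

```python
from dataclasses import dataclass
from typing import Union


# Stand-in for agents.ros.Topic (assumption: only name and msg_type matter here).
@dataclass(frozen=True)
class Topic:
    name: str
    msg_type: str


# The parameter description limits inputs to String and Image types.
ALLOWED_INPUT_TYPES = {"String", "Image"}


def check_inputs(inputs: list[Topic]) -> None:
    """Hypothetical check: raise TypeError for any non-String, non-Image input."""
    for topic in inputs:
        if topic.msg_type not in ALLOWED_INPUT_TYPES:
            raise TypeError(f"{topic.name}: unsupported msg_type {topic.msg_type!r}")


def describe_trigger(trigger: Union[Topic, list[Topic], float]) -> str:
    """Hypothetical helper: name which of the three documented forms was passed."""
    if isinstance(trigger, Topic):
        return f"topic-triggered by {trigger.name}"
    if isinstance(trigger, list):
        return "topic-triggered by " + ", ".join(t.name for t in trigger)
    # Any remaining numeric value is treated as the timed form.
    return f"timed, trigger value {float(trigger)}"


text0 = Topic(name="text0", msg_type="String")
image0 = Topic(name="image0", msg_type="Image")

check_inputs([text0, image0])              # passes silently
print(describe_trigger(text0))             # single Topic
print(describe_trigger([text0, image0]))   # list of Topics
print(describe_trigger(1))                 # the documented default
```

The sketch only mirrors the wording of the parameter list; how the real component validates inputs or schedules a timed trigger is not stated in this reference.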

Example usage:

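from agents.clients.model_base import ModelClient
from agents.components.mllm import MLLM
from agents.config import MLLMConfig
from agents.ros import Topic
from agents.models import TransformersMLLM  # import path assumed; not stated in this reference

# Input topics (String and Image) and the output topic for the model's reply
text0 = Topic(name="text0", msg_type="String")
image0 = Topic(name="image0", msg_type="Image")
text1 = Topic(name="text1", msg_type="String")

config = MLLMConfig()
model = TransformersMLLM(name='idefics')
model_client = ModelClient(model=model)
mllm_component = MLLM(inputs=[text0, image0],
                      outputs=[text1],
                      model_client=model_client,
                      config=config,
                      component_name='mllm_component')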