agents.components.texttospeech

Module Contents

Classes

TextToSpeech

This component takes in text input and outputs an audio representation of the text using TTS models (e.g. SpeechT5). The generated audio can be played using any audio playback device available on the agent.

API

class agents.components.texttospeech.TextToSpeech(*, inputs: List[agents.ros.Topic], outputs: Optional[List[agents.ros.Topic]] = None, model_client: agents.clients.model_base.ModelClient, config: Optional[agents.config.TextToSpeechConfig] = None, trigger: Union[agents.ros.Topic, List[agents.ros.Topic]], callback_group=None, component_name: str = 'texttospeech_component', **kwargs)

Bases: agents.components.model_component.ModelComponent

This component takes in text input and outputs an audio representation of the text using TTS models (e.g. SpeechT5). The generated audio can be played using any audio playback device available on the agent.

Parameters:
  • inputs (list[Topic]) – The input topics for the TTS. This should be a list of Topic objects, limited to String type.

  • outputs (list[Topic]) – Optional output topics for the TTS. This should be a list of Topic objects; the Audio type is handled automatically.

  • model_client (ModelClient) – The model client for the TTS. This should be an instance of ModelClient.

  • config (Optional[TextToSpeechConfig]) – The configuration for the TTS. This should be an instance of TextToSpeechConfig. If not provided, it defaults to TextToSpeechConfig().

  • trigger (Union[Topic, list[Topic]]) – The trigger topic or topics for the TTS. This can be a single Topic object or a list of Topic objects.

  • callback_group (str) – An optional callback group for the TTS. If provided, this should be a string. Otherwise, it defaults to None.

  • component_name (str) – The name of the TTS component. This should be a string and defaults to “texttospeech_component”.

Example usage:

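from agents.clients.model_base import ModelClient
from agents.components.texttospeech import TextToSpeech
from agents.config import TextToSpeechConfig
from agents.models import SpeechT5  # assumed import path for the model wrapper
from agents.ros import Topic

# Text arrives on a String topic; synthesized speech is published as Audio.
text_topic = Topic(name="text", msg_type="String")
audio_topic = Topic(name="audio", msg_type="Audio")
config = TextToSpeechConfig(play_on_device=True)
model_client = ModelClient(model=SpeechT5(name="speecht5"))
tts_component = TextToSpeech(
    inputs=[text_topic],
    outputs=[audio_topic],
    model_client=model_client,
    config=config,
    trigger=text_topic,  # required keyword argument per the signature above
    component_name='tts_component'
)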
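
The parameter constraints listed above (inputs limited to String topics, trigger given as either one Topic or a list of them) can be illustrated with a small standalone sketch. It does not import the library: the Topic dataclass here is a hypothetical stand-in for agents.ros.Topic, and normalize_trigger and check_tts_topics are illustrative helper names, not part of the package. The checks the real component performs may differ.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Topic:
    """Hypothetical stand-in for agents.ros.Topic (not the real class)."""
    name: str
    msg_type: str


def normalize_trigger(trigger: Union[Topic, List[Topic]]) -> List[Topic]:
    """Return the trigger as a list, whether one Topic or several were given."""
    return list(trigger) if isinstance(trigger, list) else [trigger]


def check_tts_topics(
    inputs: List[Topic], trigger: Union[Topic, List[Topic]]
) -> List[Topic]:
    """Check the documented constraints and return the normalized triggers."""
    # The parameter list says TTS inputs are limited to the String type.
    bad = [t.name for t in inputs if t.msg_type != "String"]
    if bad:
        raise TypeError(f"TTS inputs must be String topics, got: {bad}")
    triggers = normalize_trigger(trigger)
    if not triggers:
        raise ValueError("at least one trigger topic is required")
    return triggers


text_topic = Topic(name="text", msg_type="String")
triggers = check_tts_topics(inputs=[text_topic], trigger=text_topic)
print([t.name for t in triggers])
```

Accepting both a single Topic and a list keeps the common single-trigger case terse while still allowing several topics to start synthesis.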