Use Tool Calling in Go-to-X

In the previous example we created a Go-to-X component using basic text manipulation on LLM output. However, for models that have been specifically trained for tool calling, structured outputs are usually more reliable when obtained through tool calling. Tool calling is also useful for generating responses that require the LLM to use tools as an intermediate step before providing a final answer. In this example we use tool calling for the former purpose, getting a better structured output from the LLM, by reimplementing the Go-to-X component.

Register a tool (function) to be called by the LLM

To utilize tool calling we will change our strategy: instead of pre-processing the LLM's text output before publishing, we will ask the LLM to provide structured input to a function (tool). The output of this function will then be published to the output topic. Let's see what this looks like in the following code snippets.

First we will modify the component level prompt for our LLM.

# set a component prompt
goto.set_component_prompt(
    template="""What are the position coordinates in the given metadata?"""
)

Next we will replace our pre-processing function with a much simpler function that takes in a list and returns a numpy array. The LLM will be expected to call this function with the appropriate arguments. This strategy generally works better than getting free-form text from the LLM and trying to parse it with an arbitrary function. To register the function as a tool, we also need to describe it in a format that the LLM can interpret. This format is specified by the Ollama client.

Caution

Tool calling is currently available only when components utilize the OllamaClient.

See also

To see a list of models that work for tool calling using the OllamaClient, check here

# tool to be called by the LLM: converts a list of coordinates into a numpy array
# (the result is published to a topic of msg_type PoseStamped)
def get_coordinates(position: list[float]) -> np.ndarray:
    """Get position coordinates"""
    return np.array(position, dtype=float)


function_description = {
    "type": "function",
    "function": {
        "name": "get_coordinates",
        "description": "Get position coordinates",
        "parameters": {
            "type": "object",
            "properties": {
                "position": {
                    "type": "list[float]",
                    "description": "The position coordinates in x, y and z",
                }
            },
            # "required" belongs inside "parameters" in the Ollama tool format
            "required": ["position"],
        },
    },
}

# register the function as a tool that the LLM can call
goto.register_tool(
    tool=get_coordinates,
    tool_description=function_description,
    send_tool_response_to_model=False,
)
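
To make the expected exchange concrete, the sketch below feeds the arguments of a tool call to get_coordinates. The tool_call dictionary is hand-written for illustration and only mimics the general shape of an Ollama tool call (a function name plus a dictionary of arguments). The tools registry and the call_tool helper are hypothetical and are not part of the component API.

```python
import numpy as np


def get_coordinates(position: list[float]) -> np.ndarray:
    """Get position coordinates"""
    return np.array(position, dtype=float)


# Hand-written stand-in for a tool call emitted by the model (illustrative values)
tool_call = {
    "function": {
        "name": "get_coordinates",
        "arguments": {"position": [1.5, -2.0, 0.0]},
    }
}

# Hypothetical registry mapping tool names to callables
tools = {"get_coordinates": get_coordinates}


def call_tool(call: dict) -> np.ndarray:
    """Look up the requested tool and call it with the model-provided arguments."""
    func = tools[call["function"]["name"]]
    return func(**call["function"]["arguments"])


coordinates = call_tool(tool_call)
print(coordinates)
```

Because the model only has to fill in a named argument with a list of numbers, no text parsing is needed before the value is converted to an array.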

In the register_tool call above, the flag send_tool_response_to_model has been set to False. This means that the function output will be published directly, since our usage of the tool in this example is limited to forcing the model to provide a structured output. If this flag were set to True, the output of the tool (function) would be sent back to the model to produce the final output, which would then be published. This latter usage applies when a tool such as a calculator, browser or code interpreter is provided to the model for generating better answers.
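
The difference between the two settings can be sketched in plain Python. This is only an illustration of the control flow described here, not the component's actual implementation: handle_tool_call, fake_model, publish and the published list are all hypothetical stand-ins.

```python
import numpy as np


def get_coordinates(position: list[float]) -> np.ndarray:
    """Get position coordinates"""
    return np.array(position, dtype=float)


published = []  # hypothetical stand-in for the output topic


def publish(value) -> None:
    """Record a value as if it had been published."""
    published.append(value)


def fake_model(tool_result: np.ndarray) -> str:
    """Stand-in for a second model call that turns a tool result into a final answer."""
    return f"The goal is at {tool_result.tolist()}"


def handle_tool_call(arguments: dict, send_tool_response_to_model: bool) -> None:
    """Run the tool, then either publish its output or route it back through the model."""
    result = get_coordinates(**arguments)
    if send_tool_response_to_model:
        # Tool output goes back to the model; the model's answer is published
        publish(fake_model(result))
    else:
        # Tool output is published directly
        publish(result)


handle_tool_call({"position": [1.0, 2.0, 0.0]}, send_tool_response_to_model=False)
handle_tool_call({"position": [1.0, 2.0, 0.0]}, send_tool_response_to_model=True)
```

With the flag set to False the published value is the numpy array itself, which is what a PoseStamped goal needs. With the flag set to True the published value is whatever the model produces after seeing the tool result.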

Launching the Components

As before, we will launch our Go-to-X component.

from agents.ros import Launcher

# Launch the component
launcher = Launcher()
launcher.add_pkg(components=[goto])
launcher.bringup()

The complete code for this example is given below:

Go-to-X Component
import numpy as np
from agents.components import LLM
from agents.models import Llama3_1
from agents.vectordbs import ChromaDB
from agents.config import LLMConfig
from agents.clients.roboml import HTTPDBClient
from agents.clients.ollama import OllamaClient
from agents.ros import Launcher, Topic

# Start a Llama3.1 based llm component using ollama client
llama = Llama3_1(name="llama")
llama_client = OllamaClient(llama)

# Initialize a vector DB that will store our routes
chroma = ChromaDB(name="MainDB")
chroma_client = HTTPDBClient(db=chroma)

# Define LLM input and output topics including goal_point topic of type PoseStamped
goto_in = Topic(name="goto_in", msg_type="String")
goal_point = Topic(name="goal_point", msg_type="PoseStamped")

config = LLMConfig(
    enable_rag=True,
    collection_name="map",
    distance_func="l2",
    n_results=1,
    add_metadata=True,
)

# initialize the component
goto = LLM(
    inputs=[goto_in],
    outputs=[goal_point],
    model_client=llama_client,
    db_client=chroma_client,  # check the previous example where we setup this database client
    trigger=goto_in,
    config=config,
    component_name="go_to_x",
)

# set a component prompt
goto.set_component_prompt(
    template="""What are the position coordinates in the given metadata?"""
)


# tool to be called by the LLM: converts a list of coordinates into a numpy array
# (the result is published to a topic of msg_type PoseStamped)
def get_coordinates(position: list[float]) -> np.ndarray:
    """Get position coordinates"""
    return np.array(position, dtype=float)


function_description = {
    "type": "function",
    "function": {
        "name": "get_coordinates",
        "description": "Get position coordinates",
        "parameters": {
            "type": "object",
            "properties": {
                "position": {
                    "type": "list[float]",
                    "description": "The position coordinates in x, y and z",
                }
            },
            # "required" belongs inside "parameters" in the Ollama tool format
            "required": ["position"],
        },
    },
}

# register the function as a tool that the LLM can call
goto.register_tool(
    tool=get_coordinates,
    tool_description=function_description,
    send_tool_response_to_model=False,
)

# Launch the component
launcher = Launcher()
launcher.add_pkg(components=[goto])
launcher.bringup()