@atharva151101 atharva151101 commented Oct 27, 2025

Purpose

Adds tool-calling support to the sem_map operator. This PR adds a new class, LMWithTools, which must be configured in the settings before sem_map can use tools.
The implementation uses CrewAI as the base framework for agentic operators.
Once configured, users can pass any CrewAI-compatible tools to sem_map. They can also define custom tools using pydantic (as shown in the test example below).

Test Plan

Added some basic unit tests. Also ran a custom script manually to verify the end-to-end workflow and confirm that tool calling works with the new sem_map operator flow. (I had planned to add this as a pytest test, but pytest has issues when the function under test uses asyncio internally.)
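
The PR does not say what the pytest issue was. One common cause, assumed here, is that a synchronous function calling asyncio.run() fails when an event loop is already running in the same thread. A minimal sketch of a workaround under that assumption follows; `_fake_agent_call`, `run_coroutine_sync`, and `evaluate` are hypothetical names and are not part of lotus or CrewAI.

```python
import asyncio
import threading


async def _fake_agent_call(expression: str) -> int:
    # Hypothetical stand-in for an async LLM/tool round trip.
    await asyncio.sleep(0)
    left, right = expression.split("+")
    return int(left) + int(right)


def run_coroutine_sync(coro):
    """Run a coroutine to completion from synchronous code."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running in this thread, so asyncio.run is safe.
        return asyncio.run(coro)

    # A loop is already running here: use a fresh loop in a helper thread.
    result = {}

    def _worker():
        try:
            result["value"] = asyncio.run(coro)
        except BaseException as exc:  # re-raised in the calling thread
            result["error"] = exc

    thread = threading.Thread(target=_worker)
    thread.start()
    thread.join()
    if "error" in result:
        raise result["error"]
    return result["value"]


def evaluate(expression: str) -> int:
    # Synchronous entry point that uses asyncio internally.
    return run_coroutine_sync(_fake_agent_call(expression))
```

With a wrapper like this, a plain synchronous pytest test can call `evaluate` directly, whether or not a plugin has already started an event loop. Note that the helper thread blocks the caller until the coroutine finishes.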

Test Results

Script used for testing

from typing import Type

import pandas as pd
from crewai.tools import BaseTool
from pydantic import BaseModel, Field

import lotus
from lotus.models import LMWithTools
from lotus.sem_ops.sem_map import sem_map_with_tools


class FileReadArgs(BaseModel):
    filename: str = Field(..., description="The name of the file to read.")
    
class FileReadTool(BaseTool):
    name: str = "File Read"
    description: str = "A tool to read files from the local filesystem."
    args_schema: Type[BaseModel] = FileReadArgs


    def __init__(self):
        super().__init__()

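    def _run(self, filename: str) -> str:
        print(f"Reading file: {filename}")
        # Canned contents, so the test needs no real files on disk.
        if filename == "text1.txt":
            return "13 + 16"
        elif filename == "text2.txt":
            return "29 + 51"
        # Return an explicit message instead of None for unknown files.
        return f"File not found: {filename}"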

class AdditionArgs(BaseModel):
    num1: str = Field(..., description="The first number.")
    num2: str = Field(..., description="The second number.")

class AdditionTool(BaseTool):
    name: str = "Addition Tool"
    description: str = "A tool to add two numbers."
    args_schema: Type[BaseModel] = AdditionArgs


    def __init__(self):
        super().__init__()

    def _run(self, num1: str, num2: str) -> str:
        # AdditionArgs declares both fields as str, so convert before adding.
        print(f"Adding numbers: {num1} + {num2}")
        return str(int(num1) + int(num2))

lmwithtools = LMWithTools()

lotus.settings.configure(lm_with_tools=lmwithtools)
data = {
"File names": [
    "text1.txt",
    "text2.txt",
]
}

df = pd.DataFrame(data)
user_instruction = "Evaluate the mathematical expression in the file {File names} to a single integer. Give only the integer as output."
df = df.sem_map(user_instruction, tools=[FileReadTool(), AdditionTool()])
print(df)

Output:

(lotus) atharvachougule@Atharvas-MacBook-Pro-3 lotus % python3 test.py
Mapping:   0%|                                     0/2 Rows processed [00:00<?, ?it/s]2025-10-27 15:17:23,165 - INFO - OpenAI API usage: {'prompt_tokens': 402, 'completion_tokens': 47, 'total_tokens': 449}
Reading file: text1.txt
2025-10-27 15:17:30,578 - INFO - OpenAI API usage: {'prompt_tokens': 428, 'completion_tokens': 58, 'total_tokens': 486}
Adding numbers: 13 + 16
2025-10-27 15:17:32,052 - INFO - OpenAI API usage: {'prompt_tokens': 402, 'completion_tokens': 512, 'total_tokens': 914}
Reading file: text2.txt
2025-10-27 15:17:36,987 - INFO - OpenAI API usage: {'prompt_tokens': 470, 'completion_tokens': 15, 'total_tokens': 485}
Mapping:  50%|██████████████               1/2 Rows processed [00:15<00:15, 15.69s/it]2025-10-27 15:17:39,308 - INFO - OpenAI API usage: {'prompt_tokens': 428, 'completion_tokens': 64, 'total_tokens': 492}
Adding numbers: 29 + 51
2025-10-27 15:17:45,752 - INFO - OpenAI API usage: {'prompt_tokens': 476, 'completion_tokens': 14, 'total_tokens': 490}
Mapping: 100%|████████████████████████████ 2/2 Rows processed [00:24<00:00, 12.23s/it]
  File names _map
0  text1.txt   29
1  text2.txt   80
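
The expected values in the _map column can be checked without calling an LLM. The sketch below uses plain-Python stand-ins for the two tools' `_run` logic. The names `read_file_stub`, `add_stub`, and `evaluate_file` are hypothetical, and nothing here exercises lotus, CrewAI, or the model's tool selection.

```python
def read_file_stub(filename: str) -> str:
    # Same canned contents as FileReadTool._run in the script above.
    contents = {"text1.txt": "13 + 16", "text2.txt": "29 + 51"}
    return contents[filename]


def add_stub(num1: str, num2: str) -> str:
    # Same arithmetic as AdditionTool._run: parse, add, return a string.
    return str(int(num1) + int(num2))


def evaluate_file(filename: str) -> str:
    # Chain the two steps the agent is expected to perform: read, then add.
    left, right = read_file_stub(filename).split("+")
    return add_stub(left, right)


results = [evaluate_file(name) for name in ["text1.txt", "text2.txt"]]
print(results)  # ['29', '80']
```

This only confirms the deterministic part of the workflow. Whether the model picks the right tool in the right order still needs the end-to-end run shown above.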

(Optional) Documentation Update

TODO: (Add documentation changes to sem_map, add new doc for how to use tools with sem ops)

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Refactoring (no functional changes)

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, updating docstrings
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

BEFORE SUBMITTING, PLEASE READ https://github.com/lotus-data/lotus/blob/main/CONTRIBUTING.md
anything written below this line will be removed by GitHub Actions

@atharva151101 atharva151101 force-pushed the feature/agentic-sem-map-v1 branch from b15f463 to bb6a4d8 Compare November 14, 2025 18:34
@atharva151101 atharva151101 force-pushed the feature/agentic-sem-map-v1 branch from bfa6981 to 94f92b7 Compare December 11, 2025 10:21