arxiv:2410.06703
Segev Shlomov
segevshlomov
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago
From Benchmarks to Business Impact: Deploying IBM Generalist Agent in Enterprise Production
upvoted a paper 9 months ago
ST-WebAgentBench: A Benchmark for Evaluating Safety and Trustworthiness in Web Agents
liked a dataset 9 months ago
dolev31/st-webagentbench