How to run an evaluation asynchronously
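We can run evaluations asynchronously via the SDK using aevaluate(), which accepts all of the same arguments as evaluate() but expects the application function to be asynchronous. You can learn more about how to use the evaluate() function here.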
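Because aevaluate() expects an async application function, an existing synchronous function would need a thin async wrapper before it can be used as the target. The following is a minimal sketch of one way to do that, assuming Python 3.9+ for asyncio.to_thread. The names sync_app and async_app are hypothetical and not part of the SDK.

```python
import asyncio


def sync_app(inputs: dict) -> str:
    """Hypothetical blocking application function."""
    return inputs["idea"].upper()


async def async_app(inputs: dict) -> str:
    """Async wrapper: runs the blocking function in a worker thread
    so the event loop stays free while it executes."""
    return await asyncio.to_thread(sync_app, inputs)


if __name__ == "__main__":
    print(asyncio.run(async_app({"idea": "hyperloop"})))
```

A wrapper like async_app could then be passed as the target. If the application is synchronous throughout, using evaluate() directly is likely the simpler choice.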
Python only
This guide is only relevant when using the Python SDK. In JS/TS the evaluate() function is already async. You can see how to use it here.
Use aevaluate()
- Python
Requires langsmith>=0.3.13
from langsmith import Client, traceable, wrappers
from openai import AsyncOpenAI
# Optionally wrap the OpenAI client to trace all model calls.
oai_client = wrappers.wrap_openai(AsyncOpenAI())
# Optionally add the 'traceable' decorator to trace the inputs/outputs of this function.
@traceable
async def researcher_app(inputs: dict) -> str:
    instructions = """You are an excellent researcher. Given a high-level research idea, \
list 5 concrete questions that should be investigated to determine if the idea is worth pursuing."""
    response = await oai_client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": inputs["idea"]},
        ],
    )
    return response.choices[0].message.content
# Evaluator functions can be sync or async
def concise(inputs: dict, outputs: dict) -> bool:
    return len(outputs["output"]) < 3 * len(inputs["idea"])
ls_client = Client()
ideas = [
    "universal basic income",
    "nuclear fusion",
    "hyperloop",
    "nuclear powered rockets",
]
dataset = ls_client.create_dataset("research ideas")
ls_client.create_examples(
    dataset_name=dataset.name,
    examples=[{"inputs": {"idea": i}} for i in ideas],
)
# Can equivalently use the 'aevaluate' function directly:
# from langsmith import aevaluate
# await aevaluate(...)
results = await ls_client.aevaluate(
    researcher_app,
    data=dataset,
    evaluators=[concise],
    max_concurrency=2,  # Optional, add concurrency.
    experiment_prefix="gpt-4o-mini-baseline",  # Optional, random by default.
)
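The call above uses a top-level await, which works in a notebook or an async REPL but not in a plain Python script. In a script, the call would need to live inside a coroutine started with asyncio.run(). The sketch below shows that pattern together with a rough illustration of what a concurrency cap such as max_concurrency=2 means. It does not call LangSmith: fake_app, run_all, and the semaphore-based limiter are hypothetical stand-ins, and the SDK's own scheduling may be implemented differently.

```python
import asyncio


async def fake_app(inputs: dict) -> str:
    """Hypothetical stand-in for an async target such as researcher_app."""
    await asyncio.sleep(0.01)  # Simulate waiting on a model call.
    return f"questions about {inputs['idea']}"


async def run_all(examples: list[dict], max_concurrency: int) -> tuple[list[str], int]:
    """Run fake_app over all examples with at most max_concurrency in flight.

    Returns the outputs (in input order) and the peak number of concurrent calls.
    """
    semaphore = asyncio.Semaphore(max_concurrency)
    in_flight = 0
    peak = 0

    async def run_one(example: dict) -> str:
        nonlocal in_flight, peak
        async with semaphore:  # Blocks once max_concurrency calls are running.
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_app(example)
            finally:
                in_flight -= 1

    outputs = await asyncio.gather(*(run_one(e) for e in examples))
    return list(outputs), peak


async def main() -> tuple[list[str], int]:
    ideas = ["universal basic income", "nuclear fusion", "hyperloop"]
    return await run_all([{"idea": i} for i in ideas], max_concurrency=2)


if __name__ == "__main__":
    # Script entry point: asyncio.run() replaces the top-level await.
    outputs, peak = asyncio.run(main())
    print(outputs, peak)
```

In a real script, the body of main() would hold the `await ls_client.aevaluate(...)` call instead of the stand-in.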