test

langsmith.testing._internal.test(func: Callable) → Callable

langsmith.testing._internal.test(*, id: UUID | None = None, output_keys: Sequence[str] | None = None, client: Client | None = None, test_suite_name: str | None = None) → Callable[[Callable], Callable]

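Trace a pytest test case in LangSmith.

This decorator traces a pytest test to LangSmith. It ensures that the necessary example data is created and associated with the test function. The decorated function is executed as a test case, and the results are recorded and reported by LangSmith.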

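Parameters:

id (UUID | None) – A unique identifier for the test case. If not provided, an ID is generated from the test function's module and name.
output_keys (Sequence[str] | None) – A list of keys to treat as the output keys for the test case. These keys are extracted from the test function's inputs and stored as the expected outputs.
client (Client | None) – The LangSmith client instance used to communicate with the LangSmith service. If not provided, a default client is used.
test_suite_name (str | None) – The name of the test suite the test case belongs to. If not provided, the test suite name is determined from the environment or the package name.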

Returns:

The decorated test function.

Return type:

Callable

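Environment

LANGSMITH_TEST_CACHE: If set, API calls are cached to disk to save time and cost during testing. Committing the cache files to your repository is recommended for faster CI/CD runs. Requires the 'langsmith[vcr]' package to be installed.
LANGSMITH_TEST_TRACKING: Set this variable to the path of a directory to enable caching of test results. This is useful for re-running tests without re-executing the code. Requires the 'langsmith[vcr]' package.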
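
One way to set the cache variable is from Python before the tests run, for instance at the top of a conftest.py. The following is a minimal sketch under that assumption; the helper name enable_test_cache and the directory tests/cassettes are hypothetical and not part of LangSmith.

```python
import os
from typing import MutableMapping, Optional


def enable_test_cache(
    cache_dir: str, env: Optional[MutableMapping[str, str]] = None
) -> str:
    """Point LANGSMITH_TEST_CACHE at cache_dir unless it is already set.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    target = os.environ if env is None else env
    # setdefault keeps a value that was exported in the shell or CI config.
    return target.setdefault("LANGSMITH_TEST_CACHE", cache_dir)


# Demonstrated on a plain dict so the sketch has no side effects.
demo_env: dict = {}
enable_test_cache("tests/cassettes", demo_env)
```

Using setdefault rather than plain assignment lets a value exported by the shell or the CI system take precedence.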

Examples

For basic usage, simply decorate a test function with @pytest.mark.langsmith. Under the hood, this calls the test decorator:

import pytest


# Equivalently can decorate with `test` directly:
# from langsmith import test
# @test
@pytest.mark.langsmith
def test_addition():
    assert 3 + 4 == 7

Any code that is traced (such as code traced with @traceable or the wrap_* functions) is traced within the test case for improved visibility and debugging.

import pytest
from langsmith import traceable


@traceable
def generate_numbers():
    return 3, 4


@pytest.mark.langsmith
def test_nested():
    # Traced code will be included in the test case
    a, b = generate_numbers()
    assert a + b == 7

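LLM calls are expensive! Cache requests by setting LANGSMITH_TEST_CACHE=path/to/cache. Check these files in to speed up CI/CD pipelines, so that your results only change when your prompt or the requested model changes.

Note that this requires langsmith to be installed with the vcr extra:

pip install -U "langsmith[vcr]"

Caching is faster if you install libyaml. See https://vcrpy.readthedocs.io/en/latest/installation.html#speed for more details.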

# os.environ["LANGSMITH_TEST_CACHE"] = "tests/cassettes"
import openai
import pytest
from langsmith import wrappers

oai_client = wrappers.wrap_openai(openai.Client())


@pytest.mark.langsmith
def test_openai_says_hello():
    # Traced code will be included in the test case
    response = oai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello!"},
        ],
    )
    assert "hello" in response.choices[0].message.content.lower()

LLMs are stochastic, so naive assertions are flaky. You can use langsmith's expect to score your results and make approximate assertions on them.

import pytest
from langsmith import expect

# `oai_client` is the wrapped OpenAI client defined in the previous example.


@pytest.mark.langsmith
def test_output_semantically_close():
    response = oai_client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Say hello!"},
        ],
    )
    # The embedding_distance call logs the embedding distance to LangSmith
    expect.embedding_distance(
        prediction=response.choices[0].message.content,
        reference="Hello!",
        # The following optional assertion logs a
        # pass/fail score to LangSmith
        # and raises an AssertionError if the assertion fails.
    ).to_be_less_than(1.0)
    # Compute damerau_levenshtein distance
    expect.edit_distance(
        prediction=response.choices[0].message.content,
        reference="Hello!",
        # And then log a pass/fail score to LangSmith
    ).to_be_less_than(1.0)
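
To make the idea of an approximate assertion concrete, the sketch below computes a normalized edit distance in pure Python. It is an illustration only: the comment in the example above mentions Damerau-Levenshtein distance, whereas this sketch swaps in plain Levenshtein distance (no transpositions) to stay short, and dividing by the longer string's length is an assumption, not necessarily how langsmith normalizes its score.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + substitution_cost,  # substitute or match
                )
            )
        previous = current
    return previous[-1]


def normalized_distance(prediction: str, reference: str) -> float:
    """Scale the distance to [0, 1] by the longer string (assumed normalization)."""
    longest = max(len(prediction), len(reference))
    if longest == 0:
        return 0.0
    return levenshtein(prediction, reference) / longest


score = normalized_distance("Hello there!", "Hello!")
assert score < 0.6  # an approximate assertion instead of exact equality
```

In this sketch the score always lies between 0 and 1, so a threshold well below 1.0 is what makes the assertion discriminating.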

The @test decorator works natively with pytest fixtures. The fixture values populate the "inputs" of the corresponding example in LangSmith.

import pytest


@pytest.fixture
def some_input():
    return "Some input"


@pytest.mark.langsmith
def test_with_fixture(some_input: str):
    assert "input" in some_input

You can still use pytest.mark.parametrize() as usual to run multiple test cases with the same test function.

import pytest


@pytest.mark.langsmith(output_keys=["expected"])
@pytest.mark.parametrize(
    "a, b, expected",
    [
        (1, 2, 3),
        (3, 4, 7),
    ],
)
def test_addition_with_multiple_inputs(a: int, b: int, expected: int):
    assert a + b == expected

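By default, each test case is assigned a consistent, unique identifier based on the function name and module. You can also provide a custom identifier using the id argument: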

import pytest
import uuid

example_id = uuid.uuid4()


@pytest.mark.langsmith(id=str(example_id))
def test_multiplication():
    assert 3 * 4 == 12
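
Note that uuid.uuid4() produces a new random value each time the module is imported, so the custom ID above differs between runs. If a custom ID that stays the same across runs is wanted, a name-based UUID is one option. The following is a minimal sketch under that assumption; stable_test_id and the choice of namespace are hypothetical, and this is not necessarily how LangSmith derives its default IDs.

```python
import uuid


def stable_test_id(module: str, name: str) -> uuid.UUID:
    """Derive a deterministic UUID from a module path and a test name."""
    # NAMESPACE_DNS is an arbitrary namespace picked for this sketch.
    return uuid.uuid5(uuid.NAMESPACE_DNS, f"{module}.{name}")


first = stable_test_id("tests.test_math", "test_multiplication")
second = stable_test_id("tests.test_math", "test_multiplication")
assert first == second  # same inputs always give the same ID
```

The resulting value could then be passed as the id argument in place of the random one.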

By default, all test inputs are saved as "inputs" to a dataset. You can specify the output_keys argument to persist those keys in the dataset's "outputs" field instead.

import pytest


@pytest.fixture
def expected_output():
    return "input"


@pytest.mark.langsmith(output_keys=["expected_output"])
def test_with_expected_output(some_input: str, expected_output: str):
    assert expected_output in some_input
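
The effect of output_keys can be pictured as partitioning the test's arguments into two dictionaries. The sketch below is a plain-Python illustration of that idea; split_inputs_outputs is a hypothetical helper, not a LangSmith function.

```python
from typing import Any, Dict, Optional, Sequence, Tuple


def split_inputs_outputs(
    arguments: Dict[str, Any], output_keys: Optional[Sequence[str]] = None
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate arguments named in output_keys from the remaining inputs."""
    keys = set(output_keys or [])
    inputs = {k: v for k, v in arguments.items() if k not in keys}
    outputs = {k: v for k, v in arguments.items() if k in keys}
    return inputs, outputs


# Mirrors the parametrized case (a=3, b=4, expected=7) shown earlier.
inputs, outputs = split_inputs_outputs(
    {"a": 3, "b": 4, "expected": 7}, output_keys=["expected"]
)
```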

To run these tests, use the pytest CLI, or call the test functions directly:

test_output_semantically_close()
test_addition()
test_nested()
test_with_fixture("Some input")
test_with_expected_output("Some input", "Some")
test_multiplication()
test_openai_says_hello()
test_addition_with_multiple_inputs(1, 2, 3)
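
For the CLI route, the sketch below builds a python -m pytest command line that could be handed to subprocess.run. The file path tests/test_examples.py and the helper build_pytest_command are hypothetical; -k is pytest's standard option for selecting tests by name.

```python
import sys
from typing import List, Optional


def build_pytest_command(test_path: str, keyword: Optional[str] = None) -> List[str]:
    """Build a `python -m pytest` invocation for one test file."""
    command = [sys.executable, "-m", "pytest", test_path]
    if keyword:
        # -k restricts the run to tests whose names match the expression.
        command += ["-k", keyword]
    return command


command = build_pytest_command("tests/test_examples.py", keyword="test_addition")
# To actually run it:
# import subprocess; subprocess.run(command, check=False)
```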