feat: add html_extract_compare_v2 #208

e06084 · 2025-10-15T02:54:33Z

Overview

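LLMHtmlExtractCompareV2 is an enhanced evaluator for comparing the output quality of two HTML content extraction tools. Compared with V1, it uses a more efficient evaluation strategy that substantially reduces token consumption.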

Basic usage

import os
from dingo.io import Data
from dingo.model.llm.llm_html_extract_compare_v2 import LLMHtmlExtractCompareV2

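# Initialize the evaluator and point it at an OpenAI-compatible endpoint
evaluator = LLMHtmlExtractCompareV2()
evaluator.dynamic_config.model = 'gpt-4'
evaluator.dynamic_config.key = os.getenv("OPENAI_KEY")
evaluator.dynamic_config.api_url = 'https://api.openai.com/v1'

# Prepare the data: `prompt` carries tool A's output, `content` carries tool B's
data = Data(
    data_id="test_001",
    prompt="Content extracted by tool A...",
    content="Text content extracted by tool B",
    raw_data={
        "language": "zh"  # language code of the source page
    }
)

# Run the evaluation
result = evaluator.eval(data)

# Inspect the result
print(f"Verdict: {result.type}")
print(f"Reasoning: {result.reason[0]}")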
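When the evaluator is run over many samples, the per-sample verdicts usually need to be summarized. The sketch below shows one way to tally them. It is not part of this PR: `StubResult`, `tally_verdicts`, `fake_evaluate`, and the label strings are hypothetical stand-ins, and the only assumption carried over from the usage above is that each result exposes a `type` field and a `reason` list.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Iterable, List


@dataclass
class StubResult:
    """Hypothetical stand-in exposing the two fields read in the usage above."""
    type: str
    reason: List[str] = field(default_factory=list)


def tally_verdicts(samples: Iterable[dict],
                   evaluate: Callable[[dict], StubResult]) -> Counter:
    """Run `evaluate` on every sample and count how often each verdict occurs."""
    counts: Counter = Counter()
    for sample in samples:
        result = evaluate(sample)
        counts[result.type] += 1
    return counts


def fake_evaluate(sample: dict) -> StubResult:
    """Toy evaluator: prefers the longer extraction. The labels are made up."""
    if len(sample["prompt"]) > len(sample["content"]):
        label = "TOOL_ONE_BETTER"
    else:
        label = "TOOL_TWO_BETTER"
    return StubResult(type=label, reason=["longer extraction preferred"])


samples = [
    {"prompt": "abc", "content": "a"},
    {"prompt": "a", "content": "abc"},
    {"prompt": "abcd", "content": "ab"},
]
counts = tally_verdicts(samples, fake_evaluate)
print(dict(counts))
```

In real use, `evaluate` would presumably wrap `evaluator.eval` and the samples would be `Data` objects; the label values actually returned in `result.type` should be checked against the implementation before relying on them.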

dingo/model/response/response_class.py

examples/compare/html_extract_compare_v2_example_dataset.py

feat: add html_extract_compare_v2

4141753

shijinpjlab reviewed Oct 15, 2025


dingo/model/response/response_class.py (outdated, resolved)

examples/compare/html_extract_compare_v2_example_dataset.py (resolved)

add ut

4eb8bcb

e06084 force-pushed the dev branch from a0d569e to 4eb8bcb on October 15, 2025 03:57

e06084 added 2 commits October 15, 2025 12:01

fix

96b3325

x

f095d02

shijinpjlab merged commit 30a8fc1 into MigoXLab:dev Oct 15, 2025
2 checks passed
