Replacing deprecated langchain APIs, plus questions about ragas warnings and output

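Background:
This is the code from the L1 course module "4 - Advanced RAG Techniques and Practice". It uses langchain, Chroma, and ragas to build a RAG pipeline and evaluate it. Because I have upgraded to langchain 1.0.0, many of the APIs have been moved or merged. I have replaced them following the official documentation, so the langchain API problems are basically solved. The ragas run, however, prints warnings (I have not read the official ragas documentation yet). My questions are: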
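1. A warning says LangchainEmbeddingsWrapper will be deprecated in a future version, but I have not yet found which package replaces it.
2. A progress bar appears during evaluation. Why is its total 24? I would like to know the specific reason.
3. Near the end of the run I get "Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries." I have spent the past few days watching the videos and reading the langchain, llamaindex, and Chroma documentation, but I have not read the ragas documentation, so I do not know what causes this warning. I hope the instructor can point us in the right direction so we can get started quickly.
4. The "questions" and "ground_truths" lists in the code contain several hand-written entries to keep the questions diverse. I recall it was mentioned earlier that this diversity can be produced with langchain or llamaindex, and I would like to do it with one of those two frameworks, but I do not know where to start. I would appreciate the instructor's help.
5. I do not fully understand the ragas output. I know it relies on the embedding model to compare cosine similarity, but I recall that the elements of the RAG triad are compared against each other. How does that show up here?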
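The code is below. (It differs slightly from the earlier courseware code, especially in the ragas part, which needs ragas_embeddings to be specified explicitly.)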

import os
from langchain_community.document_loaders.pdf import PyPDFLoader
from langchain_community.embeddings import DashScopeEmbeddings
from langchain_community.llms import Tongyi
from langchain_chroma import Chroma
# from langchain.text_splitter import RecursiveCharacterTextSplitter # old import no longer works; replaced per the docs
from langchain_classic.text_splitter import RecursiveCharacterTextSplitter
# from langchain.retrievers import ParentDocumentRetriever # old import no longer works; replaced per the docs
from langchain_classic.retrievers import ParentDocumentRetriever
# from langchain.storage import InMemoryStore # old import no longer works; replaced per the docs
from langchain_core.stores import InMemoryStore
# from langchain.prompts import ChatPromptTemplate # old import no longer works; replaced per the docs
from langchain_core.prompts import ChatPromptTemplate
# from langchain.schema.runnable import RunnableMap # old import no longer works; replaced per the docs
from langchain_core.runnables import RunnableMap
# from langchain.schema.output_parser import StrOutputParser # old import no longer works; replaced per the docs
from langchain_core.output_parsers import StrOutputParser
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import (
    faithfulness,
    answer_relevancy,
    context_recall,
    context_precision,
)


# Custom embeddings class
class BatchingDashScopeEmbeddings(DashScopeEmbeddings):
    """Embeddings wrapper that keeps each request within the DashScope API limit of 10 texts."""

    def embed_documents(self, texts):
        all_embeddings = []
        batch_size = 10  # DashScope API limit

        for i in range(0, len(texts), batch_size):
            batch = texts[i:i + batch_size]
            try:
                embeddings = super().embed_documents(batch)
                all_embeddings.extend(embeddings)
            except Exception as e:
                raise RuntimeError(f"Error embedding batch {i}-{i + len(batch)}: {str(e)}") from e

        return all_embeddings


# Load and parse the PDF document
docs = PyPDFLoader("./浦发上海浦东发展银行西安分行个金客户经理考核办法.pdf").load()
print(f"Loaded {len(docs)} document pages")

# Initialize the LLM (the key is set by hand here rather than through an environment variable)
DASHSCOPE_API_KEY = "replace-with-your-own-api-key"
llm = Tongyi(
    model_name="qwen-max",
    dashscope_api_key=DASHSCOPE_API_KEY
)

# Error returned when a request carries more than 10 texts:
# message: <400> InternalError.Algo.InvalidParameter: Value error, batch size is invalid, it should not be larger than 10.: input.contents
# Create the embedding model
embeddings = BatchingDashScopeEmbeddings(
    model="text-embedding-v4",
    dashscope_api_key=DASHSCOPE_API_KEY
)

# Create the parent document splitter
parent_splitter = RecursiveCharacterTextSplitter(chunk_size=512)

# Create the child document splitter
child_splitter = RecursiveCharacterTextSplitter(chunk_size=256)

# Create the vector store
vectorstore = Chroma(
    collection_name="split_parents",
    embedding_function=embeddings
)
# Create the in-memory docstore
store = InMemoryStore()
# Create the parent document retriever
retriever = ParentDocumentRetriever(
    vectorstore=vectorstore,
    docstore=store,
    child_splitter=child_splitter,
    parent_splitter=parent_splitter,
    search_kwargs={"k": 2}
)

# Add the documents
retriever.add_documents(docs)

# Number of parent documents produced by the split
print(f"Stored {len(list(store.yield_keys()))} parent documents")

# Create the prompt template (RAG prompt)
template = """You are an assistant for question-answering tasks.
Use the following pieces of retrieved context to answer the question.
If you don't know the answer, just say that you don't know.
Use two sentences maximum and keep the answer concise.
Question: {question}
Context: {context}
Answer:
"""


# Fix: make sure context is a string, not a list of Document objects
def format_docs(docs):
    return "\n\n".join(doc.page_content for doc in docs)


# Build the prompt from the template
prompt = ChatPromptTemplate.from_template(template)

# Build the chain with LCEL (LangChain Expression Language)
chain = RunnableMap({
    # get_relevant_documents() is the old API; replaced with invoke() per the docs
    # "context": lambda x: format_docs(retriever.get_relevant_documents(x["question"])),
    "context": lambda x: format_docs(retriever.invoke(x["question"])),
    "question": lambda x: x["question"]
}) | prompt | llm | StrOutputParser()

query = "客户经理被投诉了，投诉一次扣多少分？"
response = chain.invoke({"question": query})
print("\nQuestion:", query)
print("Answer:", response)

# Keep the questions diverse so they cover different scenarios
questions = [
    "客户经理被投诉了，投诉一次扣多少分？",
    "客户经理每年评聘申报时间是怎样的？",
    "客户经理在工作中有不廉洁自律情况的，发现一次扣多少分？",
    "客户经理不服从支行工作安排，每次扣多少分？",
    "客户经理需要什么学历和工作经验才能入职？",
    "个金客户经理职位设置有哪些？"
]

ground_truths = [
    "每投诉一次扣2分",
    "每年一月份为客户经理评聘的申报时间",
    "在工作中有不廉洁自律情况的每发现一次扣50分",
    "不服从支行工作安排，每次扣2分",
    "须具备大专以上学历，至少二年以上银行工作经验",
    "个金客户经理职位设置为：客户经理助理、客户经理、高级客户经理、资深客户经理"
]

answers = []
contexts = []

# Inference
for query in questions:
    answers.append(chain.invoke({"question": query}))
    # get_relevant_documents() is the old API; replaced with invoke() per the docs
    # contexts.append([doc.page_content for doc in retriever.get_relevant_documents(query)])
    contexts.append([doc.page_content for doc in retriever.invoke(query)])

# To dict
data = {
    "user_input": questions,
    "response": answers,
    "retrieved_contexts": contexts,
    "reference": ground_truths
}

# Convert dict to dataset
dataset = Dataset.from_dict(data)
print(dataset)

from ragas.llms import LangchainLLMWrapper
from ragas.embeddings import LangchainEmbeddingsWrapper

# Wrap the LLM and embeddings for ragas
ragas_llm = LangchainLLMWrapper(llm)
ragas_embeddings = LangchainEmbeddingsWrapper(embeddings)
# Run the evaluation
result = evaluate(
    dataset=dataset,
    metrics=[
        context_precision,  # context precision
        context_recall,  # context recall
        faithfulness,  # faithfulness
        answer_relevancy,  # answer relevancy
    ],
    llm=ragas_llm,  # use the wrapped Tongyi model
    embeddings=ragas_embeddings  # use the wrapped DashScope embeddings
)

df = result.to_pandas()
print(df)

PyCharm terminal output and warnings:

D:\Miniconda3\envs\llm\python.exe C:\Users\Administrator\Desktop\me\聚客AI\第3章_RAG高级技术\ragas-demo\demo_1.py 
Loaded 9 document pages
Stored 11 parent documents

Question: 客户经理被投诉了，投诉一次扣多少分？
Answer: 客户经理被投诉一次扣2分。  
依据是百度文库中提到的“有客户投诉的，每投诉一次扣2分”。
Dataset({
    features: ['user_input', 'response', 'retrieved_contexts', 'reference'],
    num_rows: 6
})
C:\Users\Administrator\Desktop\me\聚客AI\第3章_RAG高级技术\ragas-demo\demo_1.py:169: DeprecationWarning: LangchainLLMWrapper is deprecated and will be removed in a future version. Use the modern LLM providers instead: from ragas.llms.base import llm_factory; llm = llm_factory('gpt-4o-mini') or from ragas.llms.base import instructor_llm_factory; llm = instructor_llm_factory('openai', client=openai_client)
  ragas_llm = LangchainLLMWrapper(llm)
C:\Users\Administrator\Desktop\me\聚客AI\第3章_RAG高级技术\ragas-demo\demo_1.py:170: DeprecationWarning: LangchainEmbeddingsWrapper is deprecated and will be removed in a future version. Use the modern embedding providers instead: embedding_factory('openai', model='text-embedding-3-small', client=openai_client) or from ragas.embeddings import OpenAIEmbeddings, GoogleEmbeddings, HuggingFaceEmbeddings
  ragas_embeddings = LangchainEmbeddingsWrapper(embeddings)
Evaluating:  88%|████████▊ | 21/24 [01:05<00:08,  2.95s/it]Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt n_l_i_statement_prompt failed to parse output: The output parser failed to parse the output including retries.
Exception raised in Job[18]: RagasOutputParserException(The output parser failed to parse the output including retries.)
Evaluating:  92%|█████████▏| 22/24 [01:24<00:11,  5.52s/it]Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt fix_output_format failed to parse output: The output parser failed to parse the output including retries.
Prompt n_l_i_statement_prompt failed to parse output: The output parser failed to parse the output including retries.
Exception raised in Job[6]: RagasOutputParserException(The output parser failed to parse the output including retries.)
Evaluating: 100%|██████████| 24/24 [01:38<00:00,  4.11s/it]
                    user_input  ... answer_relevancy
0           客户经理被投诉了，投诉一次扣多少分？  ...         0.993391
1            客户经理每年评聘申报时间是怎样的？  ...         0.888060
2  客户经理在工作中有不廉洁自律情况的，发现一次扣多少分？  ...         0.824145
3        客户经理不服从支行工作安排，每次扣多少分？  ...         0.885896
4         客户经理需要什么学历和工作经验才能入职？  ...         0.955197
5               个金客户经理职位设置有哪些？  ...         0.661334

[6 rows x 8 columns]

Process finished with exit code 0
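
To see which rows the two failed jobs affected, the score table can be scanned for missing values. The sketch below is standard-library Python and assumes, without having checked the ragas source, that a failed job leaves NaN in its metric column. The helper name find_missing_scores and the sample rows are made up for illustration. With the real result, the rows could come from result.to_pandas().to_dict("records").

```python
import math


def find_missing_scores(rows: list[dict], metric_names: list[str]) -> list[tuple[int, str]]:
    """Return (row index, metric name) for every score that is None or NaN."""
    missing = []
    for i, row in enumerate(rows):
        for name in metric_names:
            value = row.get(name)
            if value is None or (isinstance(value, float) and math.isnan(value)):
                missing.append((i, name))
    return missing


# Made-up rows for illustration; these are not scores from the run above.
rows = [
    {"user_input": "q0", "faithfulness": 1.0, "answer_relevancy": 0.99},
    {"user_input": "q1", "faithfulness": float("nan"), "answer_relevancy": 0.89},
]
print(find_missing_scores(rows, ["faithfulness", "answer_relevancy"]))
# [(1, 'faithfulness')]
```

Any (row, metric) pair reported this way is a score that was never computed, so it should be excluded from averages rather than read as a low score.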

Oct 23, 16:04 | 69 views

Answers | 2 total

Moonlike Smile

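If I run the ragas evaluation the way the courseware does, I also get a lot of red output. It is neither an error nor a warning, and a quick search did not turn up the cause.
The code: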

result = evaluate(
    dataset=dataset,  # test dataset
    # evaluation metrics; four are defined here
    metrics=[
        context_precision,  # context precision
        context_recall,  # context recall
        faithfulness,  # faithfulness
        answer_relevancy,  # answer relevancy
    ],
    llm=llm,  # the Tongyi model
    embeddings=embeddings  # the DashScope embeddings
)

Running this code produces the following red output in the PyCharm terminal:

D:\Miniconda3\envs\llm\python.exe C:\Users\Administrator\Desktop\me\聚客AI\第3章_RAG高级技术\ragas-demo\demo_1.py 
Loaded 9 document pages
Stored 11 parent documents

Question: 客户经理被投诉了，投诉一次扣多少分？
Answer: 客户经理每被投诉一次扣2分。  
依据是百度文库中提到的“有客户投诉的，每投诉一次扣2分”。
Dataset({
    features: ['user_input', 'response', 'retrieved_contexts', 'reference'],
    num_rows: 6
})
Evaluating:  92%|█████████▏| 22/24 [00:58<00:05,  2.66s/it]Exception raised in Job[10]: OutputParserException(Failed to parse StringIO from completion {"statements": [{"statement": "A client manager who exhibits unclean or undisciplined behavior at work will have 50 points deducted upon each discovery.", "reason": "The context states that \"\u5728\u5de5\u4f5c\u4e2d\u6709\u4e0d\u5ec9\u6d01\u81ea\u5f8b\u60c5\u51b5\u7684\u6bcf\u53d1\u73b0\u4e00\u6b21\u626350\u5206\" which translates to \"if there is an instance of unclean or undisciplined (not clean and self-disciplined) behavior at work, 50 points will be deducted for each discovery.\" This directly supports the statement.", "verdict": 1}, {"statement": "The basis for the deduction is that 50 points are subtracted for each instance of unclean or undisciplined behavior observed in the work of a client manager.", "reason": "This statement rephrases the rule stated in point 7 of the context: \"\u5728\u5de5\u4f5c\u4e2d\u6709\u4e0d\u5ec9\u6d01\u81ea\u5f8b\u60c5\u51b5\u7684\u6bcf\u53d1\u73b0\u4e00\u6b21\u626350\u5206\", which clearly establishes the basis for deducting 50 points per instance. The statement is directly supported by the text.", "verdict": 1}]}. Got: 1 validation error for StringIO
text
  Field required [type=missing, input_value={'statements': [{'stateme... text.', 'verdict': 1}]}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.11/v/missing
For troubleshooting, visit: https://python.langchain.com/docs/troubleshooting/errors/OUTPUT_PARSING_FAILURE )
Evaluating: 100%|██████████| 24/24 [01:52<00:00,  4.70s/it]
                    user_input  ... answer_relevancy
0           客户经理被投诉了，投诉一次扣多少分？  ...         0.993413
1            客户经理每年评聘申报时间是怎样的？  ...         0.933930
2  客户经理在工作中有不廉洁自律情况的，发现一次扣多少分？  ...         0.815736
3        客户经理不服从支行工作安排，每次扣多少分？  ...         0.885896
4         客户经理需要什么学历和工作经验才能入职？  ...         0.815053
5               个金客户经理职位设置有哪些？  ...         0.639936

[6 rows x 8 columns]

Process finished with exit code 0
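
One reading of the validation error above: the parser expected a JSON object with a text field, while the completion was an object whose only top-level key is statements. The sketch below reproduces that check with the standard library only. It is not ragas or pydantic code, the helper name missing_fields is hypothetical, and the completion string is shortened from the log.

```python
import json


def missing_fields(completion: str, required: set[str]) -> set[str]:
    """Return the required top-level keys absent from a JSON completion."""
    payload = json.loads(completion)
    if not isinstance(payload, dict):
        return set(required)
    return required - set(payload)


# Shortened from the log: the real completion carries two full statements.
completion = '{"statements": [{"statement": "...", "reason": "...", "verdict": 1}]}'

print(missing_fields(completion, {"text"}))        # {'text'}
print(missing_fields(completion, {"statements"}))  # set()
```

If this reading is right, the model produced well-formed JSON in a shape other than the one the parser asked for at that step, which would explain why the run still finishes with exit code 0.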

Oct 24, 15:04

聚客AI-挽风

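I have not used ragas for a long time either. As I recall, there are six rows of data and four metrics being evaluated, so the total is 6 × 4 = 24. To find out how the ragas output is computed, just read the source code. For diverse question-answer pairs, you can generate them with an existing LLM and then review them manually. Personally, I would not spend too much effort on version issues: frameworks are iterating quickly right now and often have small bugs, so do not get hung up on the warnings.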
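
The 6 × 4 = 24 arithmetic can be sketched as follows. This is a minimal illustration, not ragas code: it assumes one job is scheduled per (row, metric) pair, row by row, in the order the metrics were passed to evaluate. The helper names total_jobs and locate_job are hypothetical. Under that assumption, Job[6] and Job[18] from the question's log and Job[10] from the first reply all land on the third metric, faithfulness, which would be consistent with the n_l_i_statement_prompt failures and with Job[10] quoting the third question.

```python
# Metric order as passed to evaluate() in the question.
METRICS = ["context_precision", "context_recall", "faithfulness", "answer_relevancy"]
NUM_ROWS = 6  # len(questions)


def total_jobs(num_rows: int, num_metrics: int) -> int:
    """One evaluation job per (row, metric) pair."""
    return num_rows * num_metrics


def locate_job(job_index: int, metrics: list[str]) -> tuple[int, str]:
    """Map a job index to (row index, metric name).

    Assumption: jobs are numbered row by row, metrics in the order given.
    """
    row, metric_pos = divmod(job_index, len(metrics))
    return row, metrics[metric_pos]


print(total_jobs(NUM_ROWS, len(METRICS)))  # 24
for job in (6, 10, 18):
    print(job, locate_job(job, METRICS))
# 6 (1, 'faithfulness')
# 10 (2, 'faithfulness')
# 18 (4, 'faithfulness')
```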

Oct 24, 15:39
