做竞价的网站还用做seo,wordpress修改文章链接,响应式网站 html,软件工程师的发展前景LangChain系列文章
LangChain 实现给动物取名字#xff0c;LangChain 2模块化prompt template并用streamlit生成网站 实现给动物取名字LangChain 3使用Agent访问Wikipedia和llm-math计算狗的平均年龄LangChain 4用向量数据库Faiss存储#xff0c;读取YouTube的视频文本搜索I…LangChain系列文章
LangChain 实现给动物取名字LangChain 2模块化prompt template并用streamlit生成网站 实现给动物取名字LangChain 3使用Agent访问Wikipedia和llm-math计算狗的平均年龄LangChain 4用向量数据库Faiss存储读取YouTube的视频文本搜索Indexes for information retrieveLangChain 5易速鲜花内部问答系统LangChain 6根据图片生成推广文案HuggingFace中的image-caption模型LangChain 7 文本模型TextLangChain和聊天模型ChatLangChainLangChain 8 模型Model I/O输入提示、调用模型、解析输出LangChain 9 模型Model I/O 聊天提示词ChatPromptTemplate, 少量样本提示词FewShotPromptLangChain 10思维链Chain of Thought一步一步的思考 think step by stepLangChain 11实现思维树Implementing the Tree of Thoughts in LangChain’s ChainLangChain 12调用模型HuggingFace中的Llama2和Google Flan t5LangChain 13输出解析Output Parsers 自动修复解析器LangChain 14 SequencialChain链接不同的组件LangChain 15根据问题自动路由Router Chain确定用户的意图LangChain 16 通过Memory记住历史对话的内容LangChain 17 LangSmith调试、测试、评估和监视基于任何LLM框架构建的链和智能代理 1. 评估Agent
除了记录运行LangSmith还允许您测试和评估LLM应用程序。
在本节中您将利用LangSmith创建基准数据集并在代理上运行AI辅助评估器。您将按照以下几个步骤进行
创建数据集初始化一个新的代理来进行基准测试配置评估器来对代理的输出进行评分在数据集上运行代理并评估结果
1. 1. 创建一个LangSmith数据集
在下面我们使用LangSmith客户端从上面的输入问题和标签列表创建一个数据集。您将在以后使用这些数据来衡量新代理的性能。数据集是一组示例只是您可以用作应用程序测试用例的输入-输出对。
有关数据集的更多信息包括如何从CSV文件或其他文件创建它们或者如何在平台上创建它们请参阅LangSmith文档。
outputs [LangChain is an open-source framework for building applications using large language models. It is also the name of the company building LangSmith.,LangSmith is a unified platform for debugging, testing, and monitoring language model applications and agents powered by LangChain,July 18, 2023,The langsmith cookbook is a github repository containing detailed examples of how to use LangSmith to debug, evaluate, and monitor large language model-powered applications.,September 5, 2023,
]dataset_name fagent-qa-{unique_id}dataset client.create_dataset(dataset_name,descriptionAn example dataset of questions over the LangSmith documentation.,
)for query, answer in zip(inputs, outputs):client.create_example(inputs{input: query}, outputs{output: answer}, dataset_iddataset.id)smith.langchain
1.2. 初始化一个新的代理以进行基准测试
LangSmith允许您评估任何LLM、Chains、Agents甚至是自定义函数。会话代理是有状态的它们有记忆为了确保这种状态不会在数据集运行之间共享我们将传入一个chain_factory也称为构造函数函数来为每次调用进行初始化。
在这种情况下我们将测试一个使用OpenAI的函数调用端点的代理。
from langchain import hub
from langchain.agents import AgentExecutor, AgentType, initialize_agent, load_tools
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.chat_models import ChatOpenAI
from langchain.tools.render import format_tool_to_openai_function# Since chains can be stateful (e.g. they can have memory), we provide
# a way to initialize a new chain for each row in the dataset. This is done
# by passing in a factory function that returns a new chain for each row.
def agent_factory(prompt):llm_with_tools llm.bind(functions[format_tool_to_openai_function(t) for t in tools])runnable_agent ({input: lambda x: x[input],agent_scratchpad: lambda x: format_to_openai_function_messages(x[intermediate_steps]),}| prompt| llm_with_tools| OpenAIFunctionsAgentOutputParser())return AgentExecutor(agentrunnable_agent, toolstools, handle_parsing_errorsTrue)1.3. 配置评估
在UI中手动比较链的结果是有效的但可能会耗费时间。使用自动化指标和AI辅助反馈来评估您的组件性能可能会有所帮助。
接下来我们将创建一些预先实现的运行评估器执行以下操作
将结果与基本真实标签进行比较。使用嵌入距离测量语义不相似性使用自定义标准以无参考方式评估代理响应的“方面”
有关如何选择适当的评估器以及如何创建自己的自定义评估器的更多讨论请参阅LangSmith文档。
from langchain.evaluation import EvaluatorType
from langchain.smith import RunEvalConfigevaluation_config RunEvalConfig(# Evaluators can either be an evaluator type (e.g., qa, criteria, embedding_distance, etc.) or a configuration for that evaluatorevaluators[# Measures whether a QA response is Correct, based on a reference answer# You can also select via the raw string qaEvaluatorType.QA,# Measure the embedding distance between the output and the reference answer# Equivalent to: EvalConfig.EmbeddingDistance(embeddingsOpenAIEmbeddings())EvaluatorType.EMBEDDING_DISTANCE,# Grade whether the output satisfies the stated criteria.# You can select a default one such as helpfulness or provide your own.RunEvalConfig.LabeledCriteria(helpfulness),# The LabeledScoreString evaluator outputs a score on a scale from 1-10.# You can use default criteria or write our own rubricRunEvalConfig.LabeledScoreString({accuracy:
Score 1: The answer is completely unrelated to the reference.
Score 3: The answer has minor relevance but does not align with the reference.
Score 5: The answer has moderate relevance but contains inaccuracies.
Score 7: The answer aligns with the reference but has minor errors or omissions.
Score 10: The answer is completely accurate and aligns perfectly with the reference.},normalize_by10,),],# You can add custom StringEvaluator or RunEvaluator objects here as well, which will automatically be# applied to each prediction. Check out the docs for examples.custom_evaluators[],
)1.4. 运行代理和评估者
使用run_on_dataset或异步arun_on_dataset函数来评估你的模型。这将
从指定的数据集中获取示例行。在每个示例上运行你的代理或任何自定义函数。将评估器应用于生成的运行轨迹和相应的参考示例以生成自动反馈。
结果将在LangSmith应用程序中可见。
from langchain import hub# We will test this version of the prompt
prompt hub.pull(wfh/langsmith-agent-prompt:798e7324)import functoolsfrom langchain.smith import (arun_on_dataset,run_on_dataset,
)chain_results run_on_dataset(dataset_namedataset_name,llm_or_chain_factoryfunctools.partial(agent_factory, promptprompt),evaluationevaluation_config,verboseTrue,clientclient,project_namefrunnable-agent-test-5d466cbc-{unique_id},tags[testing-notebook,prompt:5d466cbc,], # Optional, adds a tag to the resulting chain runs
)# Sometimes, the agent will error due to parsing issues, incompatible tool inputs, etc.
# These are logged as warnings here and captured as errors in the tracing UI.View the evaluation results for project runnable-agent-test-5d466cbc-bf2162aa at:https://smith.langchain.com/o/ebbaf2eb-769b-4505-aca2-d11de10372a4/projects/p/0c3d22fa-f8b0-4608-b086-2187c18361a5[ ] 0/5Chain failed for example 54b4fce8-4492-409d-94af-708f51698b39 with inputs {input: Who trained Llama-v2?}Error Type: TypeError, Message: DuckDuckGoSearchResults._run() got an unexpected keyword argument arg1[-------------------------------------------------] 5/5Eval quantiles:0.25 0.5 0.75 mean modeembedding_cosine_distance 0.086614 0.118841 0.183672 0.151444 0.050158correctness 0.000000 0.500000 1.000000 0.500000 0.000000score_string:accuracy 0.775000 1.000000 1.000000 0.775000 1.000000helpfulness 0.750000 1.000000 1.000000 0.750000 1.0000001.5 请查看测试结果
您可以通过点击上面输出中的URL或导航到LangSmith“agent-qa-{unique_id}”数据集中的“测试和数据集”页面来查看下面的测试结果跟踪UI。 2. 整合代码运行
Agents/chat_agents_search_evaluate.py这段代码使用 Langchain 和 LangSmith 库构建了一个复杂的问答系统利用大型语言模型和其他工具如 DuckDuckGo 搜索来回答问题并进行评估和测试。以下是对代码的详细解释和注释
# 导入与 OpenAI 语言模型进行交互的模块。
from langchain.llms import OpenAI # 导入创建和管理提示模板的模块。
from langchain.prompts import PromptTemplate # 导入构建基于大型语言模型的处理链的模块。
from langchain.chains import LLMChain # 导入从 .env 文件加载环境变量的库。
from dotenv import load_dotenv # 导入创建和管理 OpenAI 聊天模型实例的类。
from langchain.chat_models import ChatOpenAI# 加载 .env 文件中的环境变量。
load_dotenv() # 设置环境变量包括唯一项目 ID 和 Langchain API 设置。
import os
from uuid import uuid4
unique_id uuid4().hex[0:8]
os.environ[LANGCHAIN_PROJECT] fTracing Walkthrough - {unique_id}
os.environ[LANGCHAIN_ENDPOINT] https://api.smith.langchain.com
os.environ[LANGCHAIN_API_KEY] ls__xxxx # 替换为你的 API 密钥# 初始化 LangSmith 客户端。
from langsmith import Client
client Client()# 导入 Langchain 的其他必要模块和工具。
from langchain import hub
from langchain.agents import AgentExecutor
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.tools import DuckDuckGoSearchResults
from langchain.tools.render import format_tool_to_openai_function# 创建 ChatOpenAI 实例。
llm ChatOpenAI(modelgpt-3.5-turbo-16k, temperature0)# 定义工具列表。
tools [DuckDuckGoSearchResults(nameduck_duck_go)]# 定义输入问题列表。
inputs [What is LangChain?,Whats LangSmith?,When was Llama-v2 released?,What is the langsmith cookbook?,When did langchain first announce the hub?,
]# 创建数据集。
outputs [LangChain is an open-source framework for building applications using large language models. It is also the name of the company building LangSmith.,LangSmith is a unified platform for debugging, testing, and monitoring language model applications and agents powered by LangChain,July 18, 2023,The langsmith cookbook is a github repository containing detailed examples of how to use LangSmith to debug, evaluate, and monitor large language model-powered applications.,September 5, 2023,
]
dataset_name fagent-qa-{unique_id}
dataset client.create_dataset(dataset_name,descriptionAn example dataset of questions over the LangSmith documentation.,
)# 为每个问题创建数据集示例。
for query, answer in zip(inputs, outputs):client.create_example(inputs{input: query}, outputs{output: answer}, dataset_iddataset.id)# 导入并使用 Langchain 和 LangSmith 的评估模块。
from langchain.evaluation import EvaluatorType
from langchain.smith import RunEvalConfig
from langchain import hub
from langchain.agents import AgentExecutor, AgentType, initialize_agent, load_tools
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.chat_models import ChatOpenAI
from langchain.tools.render import format_tool_to_openai_function
from langchain.smith import arun_on_dataset, run_on_dataset# Since chains can be stateful (e.g. they can have memory), we provide
# a way to initialize a new chain for each row in the dataset. This is done
# by passing in a factory function that returns a new chain for each row.
def agent_factory(prompt):llm_with_tools llm.bind(functions[format_tool_to_openai_function(t) for t in tools])runnable_agent ({input: lambda x: x[input],agent_scratchpad: lambda x: format_to_openai_function_messages(x[intermediate_steps]),}| prompt| llm_with_tools| OpenAIFunctionsAgentOutputParser())return AgentExecutor(agentrunnable_agent, toolstools, handle_parsing_errorsTrue)
# 设置评估配置。
from langchain.evaluation import EvaluatorType
from langchain.smith import RunEvalConfigevaluation_config RunEvalConfig(# Evaluators can either be an evaluator type (e.g., qa, criteria, embedding_distance, etc.) or a configuration for that evaluatorevaluators[# Measures whether a QA response is Correct, based on a reference answer# You can also select via the raw string qaEvaluatorType.QA,# Measure the embedding distance between the output and the reference answer# Equivalent to: EvalConfig.EmbeddingDistance(embeddingsOpenAIEmbeddings())EvaluatorType.EMBEDDING_DISTANCE,# Grade whether the output satisfies the stated criteria.# You can select a default one such as helpfulness or provide your own.RunEvalConfig.LabeledCriteria(helpfulness),# The LabeledScoreString evaluator outputs a score on a scale from 1-10.# You can use default criteria or write our own rubricRunEvalConfig.LabeledScoreString({accuracy:
Score 1: The answer is completely unrelated to the reference.
Score 3: The answer has minor relevance but does not align with the reference.
Score 5: The answer has moderate relevance but contains inaccuracies.
Score 7: The answer aligns with the reference but has minor errors or omissions.
Score 10: The answer is completely accurate and aligns perfectly with the reference.},normalize_by10,),],# You can add custom StringEvaluator or RunEvaluator objects here as well, which will automatically be# applied to each prediction. Check out the docs for examples.custom_evaluators[],
)from langchain import hub
# 从 Langchain Hub 拉取最新版本的提示。
prompt hub.pull(wfh/langsmith-agent-prompt:798e7324)
print(prompt)import functools
# 定义代理工厂函数。
from langchain import hub
from langchain.agents import AgentExecutor, AgentType, initialize_agent, load_tools
from langchain.agents.format_scratchpad import format_to_openai_function_messages
from langchain.agents.output_parsers import OpenAIFunctionsAgentOutputParser
from langchain.chat_models import ChatOpenAI
from langchain.tools.render import format_tool_to_openai_functionfrom langchain.smith import (arun_on_dataset,run_on_dataset,
)chain_results run_on_dataset(dataset_namedataset_name,llm_or_chain_factoryfunctools.partial(agent_factory, promptprompt),evaluationevaluation_config,verboseTrue,clientclient,project_namefrunnable-agent-test-5d466cbc-{unique_id},tags[testing-notebook,prompt:5d466cbc,], # Optional, adds a tag to the resulting chain runs
)# 打印链运行结果。
print(chain_results)输出结果
看来访问OpenAI受限制很大需要突破一下。。
$ python Agents/chat_agents_search_evaluate.py
input_variables[agent_scratchpad, input] input_types{agent_scratchpad: typing.List[typing.Union[langchain.schema.messages.AIMessage, langchain.schema.messages.HumanMessage, langchain.schema.messages.ChatMessage, langchain.schema.messages.SystemMessage, langchain.schema.messages.FunctionMessage, langchain.schema.messages.ToolMessage]]} messages[SystemMessagePromptTemplate(promptPromptTemplate(input_variables[], templateYou are an expert senior software engineer. You are responsible for answering questions about LangChain. Use functions to consult the documentation before answering.)), HumanMessagePromptTemplate(promptPromptTemplate(input_variables[input], template{input})), MessagesPlaceholder(variable_nameagent_scratchpad)]
View the evaluation results for project runnable-agent-test-5d466cbc-3c42290b at:
https://smith.langchain.com/o/1441af63-d5a4-549b-893f-4f8d06c24390/projects/p/e685bf56-6fe9-4af3-af5c-3f7b6d674dcd?evaltrueView all tests for Dataset agent-qa-3c42290b at:
https://smith.langchain.com/o/1441af63-d5a4-549b-893f-4f8d06c24390/datasets/b4423fd6-5d73-4029-a9f2-a4d4afbd27dd[--------- ] 1/5Chain failed for example b2bbbc7e-41f7-409f-a566-4afc01f9f1a5 with inputs {input: When did langchain first announce the hub?}
Error Type: RateLimitError, Message: Rate limit reached for gpt-3.5-turbo-16k in organization org-jkd8QtrppR9UcAr9C841gy2b on requests per min (RPM): Limit 3, Used 3, Requested 1. Please try again in 20s. Visit https://platform.openai.com/account/rate-limits to learn more. You can increase your rate limit by adding a payment method to your account at https://platform.openai.com/account/billing.[------------------- ] 2/5Chain failed for example 6bb422a3-27f4-4d59-a87d-1287e0820597 with inputs {input: What is the langsmith cookbook?}
Error Type: RateLimitError, Message: Rate limit reached for gpt-3.5-turbo-16k in organization [--------------------------------------- ] 4/5Chain failed for example a4bc6b3e-b757-425e-91e5-76f46ee27ade with inputs {input: When was Llama-v2 released?}
Error Type: RateLimitError, Message: Rate limit reached for gpt-3.5-turbo-16k in organization [-------------------------------------------------] 5/5Chain failed for example 41dd48e4-6193-4e1d-afe2-d0e5f678f280 with inputs {input: Whats LangSmith?}
Error Type: RateLimitError, Message: Rate limit reached for gpt-3.5-turbo-16k in organization Eval quantiles:0.25 0.5 0.75 mean mode
execution_time 15.619272 15.619272 15.619272 15.619272 15.619272
correctness NaN NaN NaN NaN NaN
score_string:accuracy NaN NaN NaN NaN NaN
helpfulness NaN NaN NaN NaN NaN
embedding_cosine_distance 0.092627 0.092627 0.092627 0.092627 0.092627{project_name: runnable-agent-test-5d466cbc-3c42290b, results: {b2bbbc7e-41f7-409f-a566-4afc01f9f1a5: {output: {Error: RateLimitError}, input: {input: When did langchain first announce the hub?}, feedback: [], execution_time: 15.619272, reference: {output: September 5, 2023}}, 6bb422a3-27f4-4d59-a87d-1287e0820597: {output: {Error: RateLimitError}, input: {input: What is the langsmith cookbook?}, feedback: [], execution_time: 15.619272, reference: {output: The langsmith cookbook is a github repository containing detailed examples of how to use LangSmith to debug, evaluate, and monitor large language model-powered applications.}}, a4bc6b3e-b757-425e-91e5-76f46ee27ade: {output: {Error: RateLimitError}, input: {input: When was Llama-v2 released?}, feedback: [], execution_time: 15.619272, reference: {output: July 18, 2023}}, 41dd48e4-6193-4e1d-afe2-d0e5f678f280: {output: {Error: RateLimitError}, input: {input: Whats LangSmith?}, feedback: [], execution_time: 15.619272, reference: {output: LangSmith is a unified platform for debugging, testing, and monitoring language model applications and agents powered by LangChain}}, e2912900-dc5c-4b2b-bae7-2867ef761edd: {output: {input: What is LangChain?, output: LangChain is a blockchain-based platform that aims to bridge the language barrier by providing translation and interpretation services. It utilizes smart contracts and a decentralized network of translators to facilitate secure and efficient language translation. LangChain aims to revolutionize the language industry by providing a transparent and reliable platform for language services.}, input: {input: What is LangChain?}, feedback: [EvaluationResult(keycorrectness, scoreNone, valueNone, commentError evaluating run 6b9e0ca1-3bd4-4d15-8816-3c34ca4b4f04: The model gpt-4 does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4., correctionNone, evaluator_info{}, source_run_idNone, target_run_idNone), EvaluationResult(keyscore_string:accuracy, scoreNone, valueNone, commentError evaluating run 6b9e0ca1-3bd4-4d15-8816-3c34ca4b4f04: The model gpt-4 does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4., correctionNone, evaluator_info{}, source_run_idNone, target_run_idNone), EvaluationResult(keyhelpfulness, scoreNone, valueNone, commentError evaluating run 6b9e0ca1-3bd4-4d15-8816-3c34ca4b4f04: The model gpt-4 does not exist or you do not have access to it. Learn more: https://help.openai.com/en/articles/7102672-how-can-i-access-gpt-4., correctionNone, evaluator_info{}, source_run_idNone, target_run_idNone), EvaluationResult(keyembedding_cosine_distance, score0.09262746580850112, valueNone, commentNone, correctionNone, evaluator_info{__run: RunInfo(run_idUUID(2e07f133-983b-452d-a5be-e2323ac3bd42))}, source_run_idNone, target_run_idNone)], execution_time: 15.619272, reference: {output: LangChain is an open-source framework for building applications using large language models. It is also the name of the company building LangSmith.}}}}
代码 https://github.com/zgpeace/pets-name-langchain/tree/develop
参考
https://python.langchain.com/docs/langsmith/walkthroughhttps://docs.smith.langchain.com/