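This article is based on Dify v1.4.0. In Dify's structured output feature, when the model itself does not support JSON Schema natively, the model has to be guided toward structured output through the prompt.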

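I. _handle_prompt_based_schema

Source location: dify\api\core\workflow\nodes\llm\node.py

This method handles structured output for language models that lack native JSON Schema support. It modifies the prompt messages by embedding the JSON Schema in the system prompt, which guides the model to return data in the specified structure. As a result, even a model without native JSON Schema support can be steered toward structured output through prompt engineering.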

def _handle_prompt_based_schema(self, prompt_messages: Sequence[PromptMessage]) -> list[PromptMessage]:
    """
    Handle structured output for models without native JSON schema support.
    This function modifies the prompt messages to include schema-based output requirements.

    Args:
        prompt_messages: Original sequence of prompt messages

    Returns:
        list[PromptMessage]: Updated prompt messages with structured output requirements
    """
    # Convert schema to string format
    schema_str = json.dumps(self._fetch_structured_output_schema(), ensure_ascii=False)

    # Find existing system prompt with schema placeholder
    system_prompt = next(
        (prompt for prompt in prompt_messages if isinstance(prompt, SystemPromptMessage)),
        None,
    )
    structured_output_prompt = STRUCTURED_OUTPUT_PROMPT.replace("{{schema}}", schema_str)
    # Prepare system prompt content
    system_prompt_content = (
        structured_output_prompt + "\n\n" + system_prompt.content
        if system_prompt and isinstance(system_prompt.content, str)
        else structured_output_prompt
    )
    system_prompt = SystemPromptMessage(content=system_prompt_content)

    # Extract content from the last user message

    filtered_prompts = [prompt for prompt in prompt_messages if not isinstance(prompt, SystemPromptMessage)]
    updated_prompt = [system_prompt] + filtered_prompts

    return updated_prompt

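1. Fetch and convert the schema

Call _fetch_structured_output_schema() to get the structured output definition
Use json.dumps() to convert the schema to a string, with ensure_ascii=False so that non-ASCII characters are preserved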
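
To make the effect of ensure_ascii=False concrete, here is a minimal sketch using only the standard library. The schema below is a made-up example with Chinese property names and is not taken from Dify.

```python
import json

# Hypothetical schema containing non-ASCII text (illustration only).
schema = {
    "type": "object",
    "properties": {"姓名": {"type": "string", "description": "用户姓名"}},
}

# Default behaviour: non-ASCII characters become \uXXXX escape sequences.
escaped = json.dumps(schema)

# With ensure_ascii=False the characters are written as-is, which keeps
# the schema readable when it is pasted into a prompt.
readable = json.dumps(schema, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode back to the same dict, so the choice only affects how the schema text looks inside the prompt.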

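2. Handle the system prompt

Look for the first system prompt message among the existing prompts
Use STRUCTURED_OUTPUT_PROMPT as the template and replace the {{schema}} placeholder
If an existing system prompt is found and its content is a string, append that content after the schema prompt
Otherwise, create the new system prompt from the schema prompt alone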
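
The merge logic in this step can be sketched in isolation as follows. TEMPLATE is a shortened stand-in for STRUCTURED_OUTPUT_PROMPT, and build_system_content is a hypothetical helper name used only for illustration; Dify does this inline inside the method.

```python
from typing import Optional

# Shortened stand-in for STRUCTURED_OUTPUT_PROMPT (assumption, not the real text).
TEMPLATE = "You must output in JSON format.\nHere is the JSON schema:\n{{schema}}\n"


def build_system_content(schema_str: str, existing: Optional[str]) -> str:
    """Return the new system prompt text (hypothetical helper)."""
    # Fill the placeholder with the serialized schema.
    structured = TEMPLATE.replace("{{schema}}", schema_str)
    # If the original system prompt had string content, keep it after
    # the schema instructions, separated by a blank line.
    if isinstance(existing, str):
        return structured + "\n\n" + existing
    # No usable system prompt: the schema instructions stand alone.
    return structured


schema_str = '{"type": "object"}'
merged = build_system_content(schema_str, "Answer politely.")
alone = build_system_content(schema_str, None)
print(merged)
```

Note the order: the schema instructions come first and the original system prompt follows them.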

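3. Assemble the final prompt list

Filter out every system prompt from the original prompts
Put the new system prompt at the front of the list
Append the remaining prompt types (user prompts, assistant prompts, and so on) in their original order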
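
A small sketch of the filter-and-prepend step. The message classes here are simplified stand-ins defined with dataclasses, not Dify's real PromptMessage types, and rebuild is a hypothetical function name.

```python
from dataclasses import dataclass


@dataclass
class PromptMessage:
    content: str


@dataclass
class SystemPromptMessage(PromptMessage):
    pass


@dataclass
class UserPromptMessage(PromptMessage):
    pass


@dataclass
class AssistantPromptMessage(PromptMessage):
    pass


def rebuild(prompt_messages, new_system):
    """Drop all system prompts, then put the new one first (hypothetical helper)."""
    others = [p for p in prompt_messages if not isinstance(p, SystemPromptMessage)]
    return [new_system] + others


messages = [
    UserPromptMessage("Hi"),
    SystemPromptMessage("Old system prompt"),
    AssistantPromptMessage("Hello"),
    UserPromptMessage("My name is John Doe and I am 30 years old."),
]
result = rebuild(messages, SystemPromptMessage("Schema instructions"))
print([type(m).__name__ for m in result])
```

The relative order of the non-system messages is preserved, and exactly one system prompt remains no matter how many were present in the input.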

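4. Notes

The code contains the comment "Extract content from the last user message", but nothing after it implements that behaviour
This approach achieves structured output through instructions rather than through a native model capability, so the result depends on the model following the prompt
The system prompt is always placed at the front of the list, so the model sees the output-format requirements first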

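II. _fetch_structured_output_schema

Source location: dify\api\core\workflow\nodes\llm\node.py

This code implements the _fetch_structured_output_schema method, which fetches and validates the structured output schema of the LLM node. It ensures that the returned schema is a valid JSON object (a Python dict) that can be used safely in later processing.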

def _fetch_structured_output_schema(self) -> dict[str, Any]:
    """
    Fetch the structured output schema from the node data.

    Returns:
        dict[str, Any]: The structured output schema
    """
    if not self.node_data.structured_output:
        raise LLMNodeError("Please provide a valid structured output schema")
    structured_output_schema = json.dumps(self.node_data.structured_output.get("schema", {}), ensure_ascii=False)
    if not structured_output_schema:
        raise LLMNodeError("Please provide a valid structured output schema")

    try:
        schema = json.loads(structured_output_schema)
        if not isinstance(schema, dict):
            raise LLMNodeError("structured_output_schema must be a JSON object")
        return schema
    except json.JSONDecodeError:
        raise LLMNodeError("structured_output_schema is not valid JSON format")

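1. Initial validation

First, check whether the node data contains a structured output configuration:

if not self.node_data.structured_output:
    raise LLMNodeError("Please provide a valid structured output schema")

If structured_output is empty, an error is raised immediately, asking the user to provide a valid structured output schema.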

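2. JSON serialization

structured_output_schema = json.dumps(self.node_data.structured_output.get("schema", {}), ensure_ascii=False)

The schema field is read from the structured_output dict in the node data (an empty dict is used if the field is missing) and then serialized to a JSON string. ensure_ascii=False keeps non-ASCII characters as they are instead of converting them to escape sequences.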

3. Validating the serialized result

if not structured_output_schema:
    raise LLMNodeError("Please provide a valid structured output schema")

Check whether the serialized JSON string is empty, and raise an error if it is.
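
One observation about this check, based on standard-library behaviour rather than anything stated in the Dify source: json.dumps returns a non-empty string for every value it can serialize, so the check reads as a defensive guard and an empty schema would not be caught by it. A quick sketch:

```python
import json

# Values that are "empty" or falsy in Python still serialize to
# non-empty JSON text, so `if not serialized` does not trigger for them.
samples = [{}, [], None, "", 0]
serialized = [json.dumps(value, ensure_ascii=False) for value in samples]

for value, text in zip(samples, serialized):
    print(repr(value), "->", repr(text), "| truthy:", bool(text))
```

In particular, a missing schema field falls back to {} and serializes to the two-character string "{}", which passes the emptiness check.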

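4. JSON parsing and type validation

try:
    schema = json.loads(structured_output_schema)
    if not isinstance(schema, dict):
        raise LLMNodeError("structured_output_schema must be a JSON object")
    return schema
except json.JSONDecodeError:
    raise LLMNodeError("structured_output_schema is not valid JSON format")

This step consists of:

Parsing the JSON string back into a Python object
Verifying that the parsed object is a dict
Returning the schema if parsing succeeds and the type is correct
Raising a custom error about invalid JSON format if a JSON parsing error occurs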
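
The four steps can be tried outside Dify with a standalone sketch. The LLMNodeError class below is a local stand-in for Dify's exception, and fetch_schema is a hypothetical free function that takes the structured_output dict directly instead of reading it from self.node_data.

```python
import json
from typing import Any, Optional


class LLMNodeError(ValueError):
    """Local stand-in for Dify's LLMNodeError (assumption)."""


def fetch_schema(structured_output: Optional[dict[str, Any]]) -> dict[str, Any]:
    # Step 1: the configuration must be present and non-empty.
    if not structured_output:
        raise LLMNodeError("Please provide a valid structured output schema")
    # Step 2: serialize the "schema" field, defaulting to an empty dict.
    serialized = json.dumps(structured_output.get("schema", {}), ensure_ascii=False)
    # Step 3: guard against an empty string.
    if not serialized:
        raise LLMNodeError("Please provide a valid structured output schema")
    # Step 4: parse back and require a JSON object (Python dict).
    try:
        schema = json.loads(serialized)
    except json.JSONDecodeError:
        raise LLMNodeError("structured_output_schema is not valid JSON format")
    if not isinstance(schema, dict):
        raise LLMNodeError("structured_output_schema must be a JSON object")
    return schema


ok = fetch_schema({"schema": {"type": "object", "properties": {}}})
print(ok)
```

In this sketch, the check that rejects bad input in practice is the isinstance test: a schema stored as a list or a string is refused, while a configuration without a schema key yields an empty dict.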

III. STRUCTURED_OUTPUT_PROMPT

Source location: dify\api\core\workflow\utils\structured_output\prompt.py

STRUCTURED_OUTPUT_PROMPT = """You’re a helpful AI assistant. You could answer questions and output in JSON format.
constraints:
    - You must output in JSON format.
    - Do not output boolean value, use string type instead.
    - Do not output integer or float value, use number type instead.
eg:
    Here is the JSON schema:
    {"additionalProperties": false, "properties": {"age": {"type": "number"}, "name": {"type": "string"}}, "required": ["name", "age"], "type": "object"}

    Here is the user's question:
    My name is John Doe and I am 30 years old.

    output:
    {"name": "John Doe", "age": 30}
Here is the JSON schema:
{{schema}}
"""  # noqa: E501

Reading the prompt from top to bottom:

It sets the role: a helpful AI assistant that answers questions and outputs JSON
It lists three constraints: the output must be JSON; use the string type instead of boolean values; use the number type instead of integer or float values
It gives one example made up of a JSON schema, a user question, and the expected output
It ends with the {{schema}} placeholder, which _handle_prompt_based_schema replaces with the serialized schema
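
The template is filled with str.replace rather than str.format. One plausible reason (an inference, not something the source states) is that the template itself contains literal JSON braces, which str.format would try to interpret as replacement fields. A minimal sketch with a shortened stand-in template:

```python
# Shortened stand-in for STRUCTURED_OUTPUT_PROMPT (assumption, not the real text).
TEMPLATE = (
    "eg:\n"
    '    {"name": "John Doe", "age": 30}\n'
    "Here is the JSON schema:\n"
    "{{schema}}\n"
)
schema_str = '{"type": "object"}'

# str.replace only touches the exact "{{schema}}" token.
filled = TEMPLATE.replace("{{schema}}", schema_str)

# str.format treats the JSON example as a replacement field and fails.
try:
    TEMPLATE.format(schema=schema_str)
    format_failed = False
except (KeyError, ValueError, IndexError):
    format_failed = True

print(filled)
print("str.format failed:", format_failed)
```

With plain replacement, the JSON example in the prompt stays untouched and only the placeholder is substituted.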



(Author: NLP工程化)
