RFC7230: Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing

                                                       PROPOSED STANDARD
                                                            Errata Exist
Internet Engineering Task Force (IETF)                  R. Fielding, Ed.
Request for Comments: 7230                                         Adobe
Obsoletes: 2145, 2616                                    J. Reschke, Ed.
Updates: 2817, 2818                                           greenbytes
Category: Standards Track                                      June 2014
ISSN: 2070-1721

Abstract

The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document provides an overview of HTTP architecture and its associated terminology, defines the "http" and "https" Uniform Resource Identifier (URI) schemes, defines the HTTP/1.1 message syntax and parsing requirements, and describes related security concerns for implementations.

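Translator's note: "implementations" refers to products that implement a standard or specification. For example, "HTTP implementations" are applications (clients, servers, proxies, and so on) that implement the HTTP specification.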

Status of This Memo

This is an Internet Standards Track document.

This document is a product of the Internet Engineering Task Force (IETF). It represents the consensus of the IETF community. It has received public review and has been approved for publication by the Internet Engineering Steering Group (IESG). Further information on Internet Standards is available in Section 2 of RFC 5741.

Information about the current status of this document, any errata, and how to provide feedback on it may be obtained at http://www.rfc-editor.org/info/rfc7230.

1. Introduction

The Hypertext Transfer Protocol (HTTP) is a stateless application-level request/response protocol that uses extensible semantics and self-descriptive message payloads for flexible interaction with network-based hypertext information systems. This document is the first in a series of documents that collectively form the HTTP/1.1 specification:

  • "Message Syntax and Routing" (this document)
  • "Semantics and Content" [RFC7231]
  • "Conditional Requests" [RFC7232]
  • "Range Requests" [RFC7233]
  • "Caching" [RFC7234]
  • "Authentication" [RFC7235]

This HTTP/1.1 specification obsoletes RFC 2616 and RFC 2145 (on HTTP versioning). This specification also updates the use of CONNECT to establish a tunnel, previously defined in RFC 2817, and defines the "https" URI scheme that was described informally in RFC 2818.

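Translator's note: unless stated otherwise, the term "方案" ("scheme") in this translation refers specifically to a URI scheme; see Section 2.7 for a detailed description of schemes.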

HTTP is a generic interface protocol for information systems. It is designed to hide the details of how a service is implemented by presenting a uniform interface to clients that is independent of the types of resources provided. Likewise, servers do not need to be aware of each client's purpose: an HTTP request can be considered in isolation rather than being associated with a specific type of client or a predetermined sequence of application steps. The result is a protocol that can be used effectively in many different contexts and for which implementations can evolve independently over time.

HTTP is also designed for use as an intermediation protocol for translating communication to and from non-HTTP information systems. HTTP proxies and gateways can provide access to alternative information services by translating their diverse protocols into a hypertext format that can be viewed and manipulated by clients in the same way as HTTP services.

One consequence of this flexibility is that the protocol cannot be defined in terms of what occurs behind the interface. Instead, we are limited to defining the syntax of communication, the intent of received communication, and the expected behavior of recipients. If the communication is considered in isolation, then successful actions ought to be reflected in corresponding changes to the observable interface provided by servers. However, since multiple clients might act in parallel and perhaps at cross-purposes, we cannot require that such changes be observable beyond the scope of a single response.

This document describes the architectural elements that are used or referred to in HTTP, defines the "http" and "https" URI schemes, describes overall network operation and connection management, and defines HTTP message framing and forwarding requirements. Our goal is to define all of the mechanisms necessary for HTTP message handling that are independent of message semantics, thereby defining the complete set of requirements for message parsers and message-forwarding intermediaries.

1.1. Requirements Notation

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

Conformance criteria and considerations regarding error handling are defined in Section 2.5.

1.2. Syntax Notation

This specification uses the Augmented Backus-Naur Form (ABNF) notation of [RFC5234] with a list extension, defined in Section 7, that allows for compact definition of comma-separated lists using a '#' operator (similar to how the '*' operator indicates repetition). Appendix B shows the collected grammar with all list operators expanded to standard ABNF notation.
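
As a rough illustration of what a '#' list describes on the wire, the sketch below splits a field value on commas and trims surrounding whitespace. The function name parse_list is hypothetical. Skipping empty elements and ignoring quoted strings that might themselves contain commas are simplifying assumptions of this sketch, not the full rules of Section 7.

```python
def parse_list(field_value):
    """Split a comma-separated field value into its elements."""
    elements = []
    for part in field_value.split(","):
        element = part.strip(" \t")  # trim optional SP / HTAB around the element
        if element:                   # assumption: empty elements are skipped
            elements.append(element)
    return elements


print(parse_list("en, mi"))
print(parse_list("gzip,, deflate ,"))
```

A production parser would also need to respect quoted strings and the per-field grammar of each element.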

The following core rules are included by reference, as defined in [RFC5234], Appendix B.1: ALPHA (letters), CR (carriage return), CRLF (CR LF), CTL (controls), DIGIT (decimal 0-9), DQUOTE (double quote), HEXDIG (hexadecimal 0-9/A-F/a-f), HTAB (horizontal tab), LF (line feed), OCTET (any 8-bit sequence of data), SP (space), and VCHAR (any visible [USASCII] character).

Translator's note: unless stated otherwise, "字节" ("byte") in this translation always means an octet, not a byte.

As a convention, ABNF rule names prefixed with "obs-" denote "obsolete" grammar rules that appear for historical reasons.

2. Architecture

HTTP was created for the World Wide Web (WWW) architecture and has evolved over time to support the scalability needs of a worldwide hypertext system. Much of that architecture is reflected in the terminology and syntax productions used to define HTTP.

2.1. Client/Server Messaging

HTTP is a stateless request/response protocol that operates by exchanging messages (Section 3) across a reliable transport- or session-layer "connection" (Section 6). An HTTP "client" is a program that establishes a connection to a server for the purpose of sending one or more HTTP requests. An HTTP "server" is a program that accepts connections in order to service HTTP requests by sending HTTP responses.

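Translator's note: "response" can be rendered in Chinese as "响应" or "应答"; this translation uses "响应" throughout (as a verb it is sometimes rendered "回应…的响应"). "message" can be rendered as "消息" or "报文"; this translation uses "消息" throughout.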

The terms "client" and "server" refer only to the roles that these programs perform for a particular connection. The same program might act as a client on some connections and a server on others. The term "user agent" refers to any of the various client programs that initiate a request, including (but not limited to) browsers, spiders (web-based robots), command-line tools, custom applications, and mobile apps. The term "origin server" refers to the program that can originate authoritative responses for a given target resource. The terms "sender" and "recipient" refer to any implementation that sends or receives a given message, respectively.

HTTP relies upon the Uniform Resource Identifier (URI) standard [RFC3986] to indicate the target resource (Section 5.1) and relationships between resources. Messages are passed in a format similar to that used by Internet mail [RFC5322] and the Multipurpose Internet Mail Extensions (MIME) [RFC2045] (see Appendix A of [RFC7231] for the differences between HTTP and MIME messages).

Most HTTP communication consists of a retrieval request (GET) for a representation of some resource identified by a URI. In the simplest case, this might be accomplished via a single bidirectional connection (=) between the user agent (UA) and the origin server (O).

     request   >
UA ======================================= O
                            <   response

A client sends an HTTP request to a server in the form of a request message, beginning with a request-line that includes a method, URI, and protocol version (Section 3.1.1), followed by header fields containing request modifiers, client information, and representation metadata (Section 3.2), an empty line to indicate the end of the header section, and finally a message body containing the payload body (if any, Section 3.3).

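Translator's note: "header fields" is commonly rendered in Chinese as "头字段", "首部字段", or "报头域"; this translation uses "头字段" throughout. "message body" is commonly rendered as "消息体", "消息主体", or "报文正文"; this translation uses "消息体" throughout.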

A server responds to a client's request by sending one or more HTTP response messages, each beginning with a status line that includes the protocol version, a success or error code, and textual reason phrase (Section 3.1.2), possibly followed by header fields containing server information, resource metadata, and representation metadata (Section 3.2), an empty line to indicate the end of the header section, and finally a message body containing the payload body (if any, Section 3.3).

A connection might be used for multiple request/response exchanges, as defined in Section 6.3.

The following example illustrates a typical message exchange for a GET request (Section 4.3.1 of [RFC7231]) on the URI "http://www.example.com/hello.txt":

Client request:

GET /hello.txt HTTP/1.1
User-Agent: curl/7.16.3 libcurl/7.16.3 OpenSSL/0.9.7l zlib/1.2.3
Host: www.example.com
Accept-Language: en, mi

Server response:

HTTP/1.1 200 OK
Date: Mon, 27 Jul 2009 12:28:53 GMT
Server: Apache
Last-Modified: Wed, 22 Jul 2009 19:15:56 GMT
ETag: "34aa387-d-1568eb00"
Accept-Ranges: bytes
Content-Length: 51
Vary: Accept-Encoding
Content-Type: text/plain

Hello World! My payload includes a trailing CRLF.
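
The message layout described above (start-line, header fields, empty line, message body) can be sketched as follows. This is a minimal illustration, not a conformant parser: the function name split_message is hypothetical, and the sketch assumes a complete message held in memory with CRLF line endings and no repeated header fields. Only three of the example's header lines are reproduced.

```python
def split_message(raw):
    """Split a raw HTTP/1.1 message into start-line, header fields, and body."""
    # The first empty line (CRLF CRLF) ends the header section.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("ascii").split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Field names are case-insensitive, so store them in lowercase.
        headers[name.lower()] = value.strip(" \t")
    return start_line, headers, body


raw = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Length: 51\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"Hello World! My payload includes a trailing CRLF.\r\n"
)
start_line, headers, body = split_message(raw)
version, status, reason = start_line.split(" ", 2)
print(version, status, reason)
print(len(body) == int(headers["content-length"]))
```

The body is 49 characters of text plus the trailing CRLF, which matches the Content-Length of 51 in the example.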

2.2. Implementation Diversity

When considering the design of HTTP, it is easy to fall into a trap of thinking that all user agents are general-purpose browsers and all origin servers are large public websites. That is not the case in practice. Common HTTP user agents include household appliances, stereos, scales, firmware update scripts, command-line programs, mobile apps, and communication devices in a multitude of shapes and sizes. Likewise, common HTTP origin servers include home automation units, configurable networking components, office machines, autonomous robots, news feeds, traffic cameras, ad selectors, and video-delivery platforms.

The term "user agent" does not imply that there is a human user directly interacting with the software agent at the time of a request. In many cases, a user agent is installed or configured to run in the background and save its results for later inspection (or save only a subset of those results that might be interesting or erroneous). Spiders, for example, are typically given a start URI and configured to follow certain behavior while crawling the Web as a hypertext graph.

The implementation diversity of HTTP means that not all user agents can make interactive suggestions to their user or provide adequate warning for security or privacy concerns. In the few cases where this specification requires reporting of errors to the user, it is acceptable for such reporting to only be observable in an error console or log file. Likewise, requirements that an automated action be confirmed by the user before proceeding might be met via advance configuration choices, run-time options, or simple avoidance of the unsafe action; confirmation does not imply any specific user interface or interruption of normal processing if the user has already made that choice.

2.3. Intermediaries

HTTP enables the use of intermediaries to satisfy requests through a chain of connections. There are three common forms of HTTP intermediary: proxy, gateway, and tunnel. In some cases, a single intermediary might act as an origin server, proxy, gateway, or tunnel, switching behavior based on the nature of each request.

     >             >             >             >
UA =========== A =========== B =========== C =========== O
           <             <             <             <

The figure above shows three intermediaries (A, B, and C) between the user agent and origin server. A request or response message that travels the whole chain will pass through four separate connections. Some HTTP communication options might apply only to the connection with the nearest, non-tunnel neighbor, only to the endpoints of the chain, or to all connections along the chain. Although the diagram is linear, each participant might be engaged in multiple, simultaneous communications. For example, B might be receiving requests from many clients other than A, and/or forwarding requests to servers other than C, at the same time that it is handling A's request. Likewise, later requests might be sent through a different path of connections, often based on dynamic configuration for load balancing.

Translator's note: for example, a later request from A might be forwarded by B to another server D rather than to C as shown in the figure above.

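Translator's note: think of the communication chain as a bus route A - B - C ... X - Y - Z. The two ends of the route (the starting terminal and the final terminal) are A and Z, and every stop in between can be regarded as an intermediary. The bus (the request message) sets out from stop A (the user agent), passes through B, C, and so on, and finally arrives at terminal Z (the origin server). The bus (the response message) then starts from Z, passes through Y, X, and so on, and finally returns to terminal A.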

The segments "A to B", "B to C", and so on are called "hop-by-hop", whereas "A to Z" and "Z to A" are called "end-to-end".

The terms "upstream" and "downstream" are used to describe directional requirements in relation to the message flow: all messages flow from upstream to downstream. The terms "inbound" and "outbound" are used to describe directional requirements in relation to the request route: "inbound" means toward the origin server and "outbound" means toward the user agent.
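
The two pairs of terms can be made concrete with a small sketch over the chain from the figure above. The list CHAIN and the three function names are hypothetical and exist only for illustration; the sketch assumes a single linear chain.

```python
CHAIN = ["UA", "A", "B", "C", "O"]  # user agent, three intermediaries, origin server


def inbound_neighbor(node):
    """Next participant toward the origin server, or None at the origin server."""
    i = CHAIN.index(node)
    return CHAIN[i + 1] if i + 1 < len(CHAIN) else None


def outbound_neighbor(node):
    """Next participant toward the user agent, or None at the user agent."""
    i = CHAIN.index(node)
    return CHAIN[i - 1] if i > 0 else None


def upstream_neighbor(node, kind):
    """Participant from which a message of the given kind arrives at node."""
    if kind == "request":    # requests flow toward the origin server
        return outbound_neighbor(node)
    if kind == "response":   # responses flow toward the user agent
        return inbound_neighbor(node)
    raise ValueError("kind must be 'request' or 'response'")


print(inbound_neighbor("B"), outbound_neighbor("B"))
print(upstream_neighbor("B", "request"), upstream_neighbor("B", "response"))
```

Inbound and outbound are fixed by the request route, while upstream and downstream swap depending on whether the message is a request or a response.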

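Translator's note on upstream and downstream: in the bus example, when the bus travels from A to Z, A is upstream of B, C, ... Z; B is downstream of A and upstream of C ... Z. When the bus travels back from Z to A, the relationship is reversed: Z is upstream of Y, X, ... A. Just remember that upstream and downstream are determined by the direction in which the bus travels (or in which water flows): it always goes from upstream to downstream.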

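Translator's note on inbound and outbound: think of a router as the hub that connects to the Internet. Data flowing into the Internet is "inbound", for example a file upload; data flowing out of the Internet is "outbound", for example a file download.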

A "proxy" is a message-forwarding agent that is selected by the client, usually via local configuration rules, to receive requests for some type(s) of absolute URI and attempt to satisfy those requests via translation through the HTTP interface. Some translations are minimal, such as for proxy requests for "http" URIs, whereas other requests might require translation to and from entirely different application-level protocols. Proxies are often used to group an organization's HTTP requests through a common intermediary for the sake of security, annotation services, or shared caching. Some proxies are designed to apply transformations to selected messages or payloads while they are being forwarded, as described in Section 5.7.2.

Translator's note: see Wikipedia's description of an absolute URI.

A "gateway" (a.k.a. "reverse proxy") is an intermediary that acts as an origin server for the outbound connection but translates received requests and forwards them inbound to another server or servers. Gateways are often used to encapsulate legacy or untrusted information services, to improve server performance through "accelerator" caching, and to enable partitioning or load balancing of HTTP services across multiple machines.

All HTTP requirements applicable to an origin server also apply to the outbound communication of a gateway. A gateway communicates with inbound servers using any protocol that it desires, including private extensions to HTTP that are outside the scope of this specification. However, an HTTP-to-HTTP gateway that wishes to interoperate with third-party HTTP servers ought to conform to user agent requirements on the gateway's inbound connection.

A "tunnel" acts as a blind relay between two connections without changing the messages. Once active, a tunnel is not considered a party to the HTTP communication, though the tunnel might have been initiated by an HTTP request. A tunnel ceases to exist when both ends of the relayed connection are closed. Tunnels are used to extend a virtual connection through an intermediary, such as when Transport Layer Security (TLS, [RFC5246]) is used to establish confidential communication through a shared firewall proxy.

Translator's note: a "blind relay" simply forwards bytes from one connection to the other, without giving the Connection header field any special handling.

The above categories for intermediary only consider those acting as participants in the HTTP communication. There are also intermediaries that can act on lower layers of the network protocol stack, filtering or redirecting HTTP traffic without the knowledge or permission of message senders. Network intermediaries are indistinguishable (at a protocol level) from a man-in-the-middle attack, often introducing security flaws or interoperability problems due to mistakenly violating HTTP semantics.

For example, an "interception proxy" [RFC3040] (also commonly known as a "transparent proxy" [RFC1919] or "captive portal") differs from an HTTP proxy because it is not selected by the client. Instead, an interception proxy filters or redirects outgoing TCP port 80 packets (and occasionally other common port traffic). Interception proxies are commonly found on public network access points, as a means of enforcing account subscription prior to allowing use of non-local Internet services, and within corporate firewalls to enforce network usage policies.

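Translator's note: a captive portal is a web page to which users are directed before they can use a wireless network. Users of a public-access network must visit and interact with this page before they are granted access.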

HTTP is defined as a stateless protocol, meaning that each request message can be understood in isolation. Many implementations depend on HTTP's stateless design in order to reuse proxied connections or dynamically load balance requests across multiple servers. Hence, a server MUST NOT assume that two requests on the same connection are from the same user agent unless the connection is secured and specific to that agent. Some non-standard HTTP extensions (e.g., [RFC4559]) have been known to violate this requirement, resulting in security and interoperability problems.

Translator's note: an origin server or intermediary can fully understand each request message without relying on the content of any earlier request message.

2.4. Caches

A "cache" is a local store of previous response messages and the subsystem that controls its message storage, retrieval, and deletion. A cache stores cacheable responses in order to reduce the response time and network bandwidth consumption on future, equivalent requests. Any client or server MAY employ a cache, though a cache cannot be used by a server while it is acting as a tunnel.

The effect of a cache is that the request/response chain is shortened if one of the participants along the chain has a cached response applicable to that request. The following illustrates the resulting chain if B has a cached copy of an earlier response from O (via C) for a request that has not been cached by UA or A.

     >             >
UA =========== A =========== B - - - - - - C - - - - - - O
           <             <
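
The shortened chain in the figure can be simulated with a toy model. The Node class, its handle method, and the placeholder response text are all hypothetical; the sketch also assumes every response is cacheable and never expires, which real caches cannot assume (see [RFC7234]).

```python
class Node:
    """One participant in the chain; the user agent is the caller of A."""

    def __init__(self, name, next_hop=None, caching=False):
        self.name = name
        self.next_hop = next_hop              # inbound neighbor, None for the origin
        self.cache = {} if caching else None

    def handle(self, uri, path):
        """Return a response for uri, recording each node visited in path."""
        path.append(self.name)
        if self.cache is not None and uri in self.cache:
            return self.cache[uri]            # answered locally: chain is shortened
        if self.next_hop is None:
            response = "representation of " + uri   # origin server answers
        else:
            response = self.next_hop.handle(uri, path)
        if self.cache is not None:
            self.cache[uri] = response        # assumption: always cacheable
        return response


origin = Node("O")
c = Node("C", origin)
b = Node("B", c, caching=True)
a = Node("A", b)

first, second = [], []
a.handle("/hello.txt", first)
a.handle("/hello.txt", second)
print(first)
print(second)
```

The first request travels the whole chain; the second stops at B, which answers from its cache without contacting C or O.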

A response is "cacheable" if a cache is allowed to store a copy of the response message for use in answering subsequent requests. Even when a response is cacheable, there might be additional constraints placed by the client or by the origin server on when that cached response can be used for a particular request. HTTP requirements for cache behavior and cacheable responses are defined in Section 2 of [RFC7234].

There is a wide variety of architectures and configurations of caches deployed across the World Wide Web and inside large organizations. These include national hierarchies of proxy caches to save transoceanic bandwidth, collaborative systems that broadcast or multicast cache entries, archives of pre-fetched cache entries for use in off-line or high-latency environments, and so on.

2.5. Conformance and Error Handling

This specification targets conformance criteria according to the role of a participant in HTTP communication. Hence, HTTP requirements are placed on senders, recipients, clients, servers, user agents, intermediaries, origin servers, proxies, gateways, or caches, depending on what behavior is being constrained by the requirement. Additional (social) requirements are placed on implementations, resource owners, and protocol element registrations when they apply beyond the scope of a single communication.

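Translator's note: this document refers in many places to the term "protocol element", which denotes one of the parts that make up a complete protocol. To make a protocol's composition easier to describe, each of its constituent parts is given a name, and each named part is a protocol element. For example, a URI is composed of the elements scheme, authority, path, query, fragment, and so on. See Section 6 of [RFC6365] for more detail.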

The verb "generate" is used instead of "send" where a requirement differentiates between creating a protocol element and merely forwarding a received element downstream.

An implementation is considered conformant if it complies with all of the requirements associated with the roles it partakes in HTTP.

Conformance includes both the syntax and semantics of protocol elements. A sender MUST NOT generate protocol elements that convey a meaning that is known by that sender to be false. A sender MUST NOT generate protocol elements that do not match the grammar defined by the corresponding ABNF rules. Within a given message, a sender MUST NOT generate protocol elements or syntax alternatives that are only allowed to be generated by participants in other roles (i.e., a role that the sender does not have for that message).

Translator's note: a sender must not knowingly go along with an error.

Translator's note: "grammar", "semantics", and "syntax" are used here in the sense familiar from compiler theory and linguistics.

When a received protocol element is parsed, the recipient MUST be able to parse any value of reasonable length that is applicable to the recipient's role and that matches the grammar defined by the corresponding ABNF rules. Note, however, that some received protocol elements might not be parsed. For example, an intermediary forwarding a message might parse a header-field into generic field-name and field-value components, but then forward the header field without further parsing inside the field-value.
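
The intermediary example above can be sketched as a split at the first colon, leaving the field-value opaque. The function names split_header_field and forward_header_field are hypothetical, and trimming whitespace around the value is a choice made for this sketch rather than a statement of the full rules in Section 3.2.

```python
def split_header_field(line):
    """Split a header field line into (field-name, field-value) at the first colon."""
    name, colon, value = line.partition(":")
    if not colon or not name:
        raise ValueError("not a header field: %r" % line)
    return name, value.strip(" \t")


def forward_header_field(line):
    """Re-emit a header field without parsing inside its field-value."""
    name, value = split_header_field(line)
    return name + ": " + value


print(split_header_field("Date: Mon, 27 Jul 2009 12:28:53 GMT"))
print(forward_header_field('ETag:   "34aa387-d-1568eb00"  '))
```

Because only the first colon is significant, the colons inside the Date value pass through untouched.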

Translator's note: for compatibility reasons, when the recipient implements HTTP/1.0 and the received message is HTTP/1.1, some header fields might be ignored.

HTTP does not have specific length limitations for many of its protocol elements because the lengths that might be appropriate will vary widely, depending on the deployment context and purpose of the implementation. Hence, interoperability between senders and recipients depends on shared expectations regarding what is a reasonable length for each protocol element. Furthermore, what is commonly understood to be a reasonable length for some protocol elements has changed over the course of the past two decades of HTTP use and is expected to continue changing in the future.

At a minimum, a recipient MUST be able to parse and process protocol element lengths that are at least as long as the values that it generates for those same protocol elements in other messages. For example, an origin server that publishes very long URI references to its own resources needs to be able to parse and process those same references when received as a request target.

A recipient MUST interpret a received protocol element according to the semantics defined for it by this specification, including extensions to this specification, unless the recipient has determined (through experience or configuration) that the sender incorrectly implements what is implied by those semantics. For example, an origin server might disregard the contents of a received Accept-Encoding header field if inspection of the User-Agent header field indicates a specific implementation version that is known to fail on receipt of certain content codings.
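
One way an origin server might act on the Accept-Encoding example above is sketched below. The table KNOWN_BROKEN, the product token "ExampleAgent/1.0", and the function usable_codings are all hypothetical; the sketch assumes header field names have already been lowercased and ignores quality values for brevity.

```python
# Hypothetical table: user agent product tokens known to fail on certain codings.
KNOWN_BROKEN = {"ExampleAgent/1.0": {"gzip"}}


def usable_codings(headers):
    """Return the offered content codings, minus those the sender is known to mishandle."""
    offered = [
        coding.strip(" \t")
        for coding in headers.get("accept-encoding", "").split(",")
        if coding.strip(" \t")
    ]
    user_agent = headers.get("user-agent", "")
    broken = set()
    for product, codings in KNOWN_BROKEN.items():
        if product in user_agent:
            broken |= codings     # disregard what this version claims to accept
    return [coding for coding in offered if coding not in broken]


print(usable_codings({
    "accept-encoding": "gzip, deflate",
    "user-agent": "ExampleAgent/1.0 (test build)",
}))
```

For any user agent not listed in the table, the Accept-Encoding field is taken at face value.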

Unless noted otherwise, a recipient MAY attempt to recover a usable protocol element from an invalid construct. HTTP does not define specific error handling mechanisms except when they have a direct impact on security, since different applications of the protocol require different error handling strategies. For example, a Web browser might wish to transparently recover from a response where the Location header field doesn't parse according to the ABNF, whereas a systems control client might consider any form of error recovery to be dangerous.

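Unless noted otherwise, a recipient MAY attempt to recover a usable protocol element from an invalid construct. Different applications of HTTP require different error handling strategies, so the protocol itself defines no specific error handling mechanisms except where an error has a direct impact on security. For example, a Web browser that receives a response whose Location header field does not parse according to the ABNF might wish to recover transparently, whereas a systems control client might consider any form of error recovery to be dangerous.

Translator's note: the example contrasts a "Web browser" with a so-called "systems control client".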

2.6. Protocol Versioning

HTTP uses a "<major>.<minor>" numbering scheme to indicate versions of the protocol. This specification defines version "1.1". The protocol version as a whole indicates the sender's conformance with the set of requirements laid out in that version's corresponding specification of HTTP.

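HTTP uses a "<major>.<minor>" numbering scheme to indicate the version of the protocol. This specification defines version "1.1". Taken as a whole, the protocol version indicates which version of the HTTP specification the sender conforms to.

Translator's note: this is one of the two kinds of "scheme" mentioned in this specification, the numbering scheme. The other is the URI scheme, that is, the familiar "http" and "https".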

The version of an HTTP message is indicated by an HTTP-version field in the first line of the message. HTTP-version is case-sensitive.

The version of an HTTP message is indicated by the HTTP-version field in the first line of the message. Note that HTTP-version is case-sensitive. Its ABNF rules are as follows.

HTTP-version  = HTTP-name "/" DIGIT "." DIGIT
HTTP-name     = %x48.54.54.50 ; "HTTP", case-sensitive 

The HTTP version number consists of two decimal digits separated by a "." (period or decimal point). The first digit ("major version") indicates the HTTP messaging syntax, whereas the second digit ("minor version") indicates the highest minor version within that major version to which the sender is conformant and able to understand for future communication. The minor version advertises the sender's communication capabilities even when the sender is only using a backwards-compatible subset of the protocol, thereby letting the recipient know that more advanced features can be used in response (by servers) or in future requests (by clients).

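The HTTP version number consists of two decimal digits separated by a period ("."). The first digit (the major version) indicates the HTTP messaging syntax. The second digit (the minor version) indicates the highest minor version, within that major version, that the sender conforms to and can understand in future communication. The minor version advertises the sender's communication capabilities even when the sender is using only a backwards-compatible subset of the protocol, which lets the recipient know that more advanced features can be used in the response (by servers) or in future requests (by clients).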
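As a rough illustration of the HTTP-version rule, the following Python sketch parses a version token into its two digits. The function name parse_http_version is hypothetical and is not taken from any specification or library.

```python
import re

# HTTP-version = HTTP-name "/" DIGIT "." DIGIT, with "HTTP" matched case-sensitively.
_HTTP_VERSION = re.compile(rb"HTTP/([0-9])\.([0-9])")


def parse_http_version(token: bytes) -> tuple:
    """Hypothetical helper: return (major, minor) or raise ValueError."""
    match = _HTTP_VERSION.fullmatch(token)
    if match is None:
        raise ValueError(f"invalid HTTP-version: {token!r}")
    return int(match.group(1)), int(match.group(2))
```

Because the match is case-sensitive and allows exactly one digit on each side of the period, tokens such as b"http/1.1" and b"HTTP/1.10" are rejected.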

When an HTTP/1.1 message is sent to an HTTP/1.0 recipient [RFC1945] or a recipient whose version is unknown, the HTTP/1.1 message is constructed such that it can be interpreted as a valid HTTP/1.0 message if all of the newer features are ignored. This specification places recipient-version requirements on some new features so that a conformant sender will only use compatible features until it has determined, through configuration or the receipt of a message, that the recipient supports HTTP/1.1.

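When an HTTP/1.1 message is sent to an HTTP/1.0 recipient [RFC1945], or to a recipient whose version is unknown, the message is constructed so that it can be interpreted as a valid HTTP/1.0 message if all of the features added in HTTP/1.1 are ignored. This specification places recipient-version requirements on some new features, so that a conformant sender uses only compatible features until it has determined, through configuration or the receipt of a message, that the recipient supports HTTP/1.1.

Translator's note: in other words, HTTP/1.1 is backwards-compatible.

Translator's note: how does a sender learn that the recipient supports HTTP/1.1? One approach is for the sender to use HTTP/1.1 regardless of what the recipient supports; another is to inspect a message received from the recipient to see whether it actually implements HTTP/1.1.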

The interpretation of a header field does not change between minor versions of the same major HTTP version, though the default behavior of a recipient in the absence of such a field can change. Unless specified otherwise, header fields defined in HTTP/1.1 are defined for all versions of HTTP/1.x. In particular, the Host and Connection header fields ought to be implemented by all HTTP/1.x implementations whether or not they advertise conformance with HTTP/1.1.

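Within the same major version, the interpretation of a header field does not change between minor versions, although the default behavior of a recipient in the absence of such a field can change. Unless specified otherwise, header fields defined in HTTP/1.1 are defined for all versions of HTTP/1.x. In particular, the Host and Connection header fields ought to be implemented by all HTTP/1.x implementations, whether or not they advertise conformance with HTTP/1.1.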

New header fields can be introduced without changing the protocol version if their defined semantics allow them to be safely ignored by recipients that do not recognize them. Header field extensibility is discussed in Section 3.2.1.

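New header fields can thus be introduced in the future without changing the protocol version, provided their defined semantics allow recipients that do not recognize them to safely ignore them. Header field extensibility is discussed in Section 3.2.1.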

Intermediaries that process HTTP messages (i.e., all intermediaries other than those acting as tunnels) MUST send their own HTTP-version in forwarded messages. In other words, they are not allowed to blindly forward the first line of an HTTP message without ensuring that the protocol version in that message matches a version to which that intermediary is conformant for both the receiving and sending of messages. Forwarding an HTTP message without rewriting the HTTP-version might result in communication errors when downstream recipients use the message sender's version to determine what features are safe to use for later communication with that sender.

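Intermediaries that process HTTP messages (that is, all intermediaries other than those acting as tunnels) MUST send their own HTTP-version in forwarded messages. Put differently, such an intermediary is not allowed to blindly forward the first line of an HTTP message without ensuring that the protocol version in that message matches a version to which the intermediary itself conforms, for both receiving and sending. Forwarding a message without rewriting the HTTP-version might cause communication errors when downstream recipients use the sender's version to decide which features are safe to use in later communication with that sender.

Translator's note: a tunnel acts as a blind relay and does not modify the message itself.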

A client SHOULD send a request version equal to the highest version to which the client is conformant and whose major version is no higher than the highest version supported by the server, if this is known. A client MUST NOT send a version to which it is not conformant.

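The request version a client sends SHOULD equal the highest version to which the client is conformant, and its major version should be no higher than the highest version supported by the server, if the client knows it. A client MUST NOT send a version to which it is not conformant.

Translator's note: a client must not claim more than it can deliver. For example, a client that supports at most HTTP/1.0 cannot put HTTP/1.1 in the HTTP-version field of the request-line.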

A client MAY send a lower request version if it is known that the server incorrectly implements the HTTP specification, but only after the client has attempted at least one normal request and determined from the response status code or header fields (e.g., Server) that the server improperly handles higher request versions.

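If a client knows that the server incorrectly implements the HTTP specification, it MAY send a lower request version, but only after it has attempted at least one normal (highest-version) request and determined from the status code or header fields of the response (e.g., Server) that the server improperly handles higher request versions.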

A server SHOULD send a response version equal to the highest version to which the server is conformant that has a major version less than or equal to the one received in the request. A server MUST NOT send a version to which it is not conformant. A server can send a 505 (HTTP Version Not Supported) response if it wishes, for any reason, to refuse service of the client's major protocol version.

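The response version a server sends SHOULD equal the highest version to which the server is conformant and whose major version is less than or equal to the one received in the request. A server MUST NOT send a version to which it is not conformant. A server that wishes, for any reason, to refuse service of the client's major protocol version can send a 505 (HTTP Version Not Supported) response.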

A server MAY send an HTTP/1.0 response to a request if it is known or suspected that the client incorrectly implements the HTTP specification and is incapable of correctly processing later version responses, such as when a client fails to parse the version number correctly or when an intermediary is known to blindly forward the HTTP-version even when it doesn't conform to the given minor version of the protocol. Such protocol downgrades SHOULD NOT be performed unless triggered by specific client attributes, such as when one or more of the request header fields (e.g., User-Agent) uniquely match the values sent by a client known to be in error.

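If a server knows or suspects that the client incorrectly implements the HTTP specification and cannot correctly process responses of a later version, it MAY send an HTTP/1.0 response. Examples include a client that fails to parse the version number correctly, or an intermediary that is known to blindly forward the HTTP-version even though it does not conform to the given minor version of the protocol. Such protocol downgrades SHOULD NOT be performed unless triggered by specific client attributes, such as when one or more of the request header fields (e.g., User-Agent) uniquely match the values sent by a client known to be in error.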

The intention of HTTP's versioning design is that the major number will only be incremented if an incompatible message syntax is introduced, and that the minor number will only be incremented when changes made to the protocol have the effect of adding to the message semantics or implying additional capabilities of the sender. However, the minor version was not incremented for the changes introduced between [RFC2068] and [RFC2616], and this revision has specifically avoided any such changes to the protocol.

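The intention of HTTP's versioning design is that the major number is incremented only when an incompatible message syntax is introduced, and that the minor number is incremented only when changes to the protocol add to the message semantics or imply additional capabilities of the sender. However, the minor version was not incremented for the changes made between [RFC2068] and [RFC2616] (it remained HTTP/1.1), and this revision has specifically avoided any such changes to the protocol.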

When an HTTP message is received with a major version number that the recipient implements, but a higher minor version number than what the recipient implements, the recipient SHOULD process the message as if it were in the highest minor version within that major version to which the recipient is conformant. A recipient can assume that a message with a higher minor version, when sent to a recipient that has not yet indicated support for that higher version, is sufficiently backwards-compatible to be safely processed by any implementation of the same major version.

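If a recipient receives an HTTP message whose major version it implements but whose minor version is higher than the one it implements, it SHOULD process the message as if it were in the highest minor version, within that major version, to which the recipient is conformant. A recipient can assume that a message with a higher minor version, when sent to a recipient that has not yet indicated support for that higher version, is sufficiently backwards-compatible to be safely processed by any implementation of the same major version.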
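The version rules in this section can be sketched as two small Python functions. Versions are modelled as (major, minor) tuples; the names choose_response_version and processing_version, and the idea of passing a list of conformant versions, are assumptions made purely for illustration.

```python
def choose_response_version(request_version, conformant_versions):
    """Hypothetical helper: highest conformant version whose major version
    is less than or equal to the major version of the request."""
    candidates = [v for v in conformant_versions if v[0] <= request_version[0]]
    # None means no usable version exists; a server could answer 505 instead.
    return max(candidates) if candidates else None


def processing_version(received, conformant_versions):
    """Hypothetical helper: version whose rules a recipient applies to a message."""
    same_major = [v for v in conformant_versions if v[0] == received[0]]
    if not same_major:
        return None  # the major version is not implemented at all
    # A higher minor version is treated as the highest one the recipient knows;
    # a lower or equal minor version is returned unchanged.
    return min(received, max(same_major))
```

Under these assumptions, a server conformant with HTTP/1.0 and HTTP/1.1 answers an HTTP/1.0 request with HTTP/1.1, and a hypothetical HTTP/1.2 message is processed by an HTTP/1.1 recipient as if it were HTTP/1.1.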

2.7. Uniform Resource Identifiers

Uniform Resource Identifiers (URIs) [RFC3986] are used throughout HTTP as the means for identifying resources (Section 2 of [RFC7231]). URI references are used to target requests, indicate redirects, and define relationships.

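As the means of identifying resources (Section 2 of [RFC7231]), Uniform Resource Identifiers (URIs) [RFC3986] are used throughout HTTP. URI references are used to target requests, indicate redirects, and define relationships.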

The definitions of "URI-reference", "absolute-URI", "relative-part", "scheme", "authority", "port", "host", "path-abempty", "segment", "query", and "fragment" are adopted from the URI generic syntax. An "absolute-path" rule is defined for protocol elements that can contain a non-empty path component. (This rule differs slightly from the path-abempty rule of RFC 3986, which allows for an empty path to be used in references, and path-absolute rule, which does not allow paths that begin with "//".) A "partial-URI" rule is defined for protocol elements that can contain a relative URI but not a fragment component.

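The definitions of "URI-reference", "absolute-URI", "relative-part", "scheme", "authority", "port", "host", "path-abempty", "segment", "query", and "fragment" are adopted from [RFC3986]. The "absolute-path" rule is defined for protocol elements that can contain a non-empty path component. (It differs slightly from two rules of RFC 3986: path-abempty, which allows an empty path in references, and path-absolute, which does not allow paths that begin with "//".) The "partial-URI" rule is defined for protocol elements that can contain a relative URI but not a fragment component.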

URI-reference = <URI-reference, see [RFC3986], Section 4.1>
absolute-URI  = <absolute-URI, see [RFC3986], Section 4.3>
relative-part = <relative-part, see [RFC3986], Section 4.2>
scheme        = <scheme, see [RFC3986], Section 3.1>
authority     = <authority, see [RFC3986], Section 3.2>
uri-host      = <host, see [RFC3986], Section 3.2.2>
port          = <port, see [RFC3986], Section 3.2.3>
path-abempty  = <path-abempty, see [RFC3986], Section 3.3>
segment       = <segment, see [RFC3986], Section 3.3>
query         = <query, see [RFC3986], Section 3.4>
fragment      = <fragment, see [RFC3986], Section 3.5>

absolute-path = 1*( "/" segment )
partial-URI   = relative-part [ "?" query ]

Translator's note: Section 3 of [RFC3986] gives a complete diagram of the parts of a URI, reproduced below:

  foo://example.com:8042/over/there?name=ferret#nose
  \_/   \______________/\_________/ \_________/ \__/
   |           |            |            |        |
scheme     authority       path        query   fragment
   |   _____________________|__
  / \ /                        \
  urn:example:animal:ferret:nose
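The same decomposition can be reproduced with Python's standard urllib.parse module. This is only an illustrative sketch; the helper name split_components is hypothetical.

```python
from urllib.parse import urlsplit


def split_components(uri: str) -> dict:
    """Hypothetical helper: split a URI into the five generic components."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


print(split_components("foo://example.com:8042/over/there?name=ferret#nose"))
print(split_components("urn:example:animal:ferret:nose"))
```

Note that urlsplit reports a missing component as an empty string, so this sketch cannot distinguish an absent query from an empty one.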

Each protocol element in HTTP that allows a URI reference will indicate in its ABNF production whether the element allows any form of reference (URI-reference), only a URI in absolute form (absolute-URI), only the path and optional query components, or some combination of the above. Unless otherwise indicated, URI references are parsed relative to the effective request URI (Section 5.5).

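Each protocol element in HTTP that allows a URI reference indicates in its ABNF production which forms of reference it allows:

  1. any form of reference (URI-reference)
  2. only a URI in absolute form (absolute-URI)
  3. only the path and optional query components
  4. some combination of the above

Unless otherwise indicated, URI references are parsed relative to the effective request URI (Section 5.5).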

2.7.1. http URI Scheme

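The "http" URI scheme is hereby defined for the purpose of minting identifiers according to their association with the hierarchical namespace governed by a potential HTTP origin server listening for TCP ([RFC0793]) connections on a given port.

The "http" URI scheme (the "http" scheme for short) exists specifically to mint a certain kind of identifier. The rule for minting such an identifier is based on its association with the hierarchical namespace managed by an origin server that listens for TCP ([RFC0793]) connections on a given port.

Translator's note: the term "namespace" is often associated with a feature of programming languages such as Java and C#, but it is really a broader concept: a set of names, formed from symbols according to some rules, in which each name refers to an object so that the object can be referenced through that name. Common examples include file systems, the namespaces or packages of programming languages, and the naming of resources in computer networks and distributed systems.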

http-URI = "http:" "//" authority path-abempty [ "?" query ] [ "#" fragment ]

The origin server for an "http" URI is identified by the authority component, which includes a host identifier and optional TCP port ([RFC3986], Section 3.2.2). The hierarchical path component and optional query component serve as an identifier for a potential target resource within that origin server's name space. The optional fragment component allows for indirect identification of a secondary resource, independent of the URI scheme, as defined in Section 3.5 of [RFC3986].

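As shown above, the origin server for an "http" URI is identified by the authority component, which includes a host identifier and an optional TCP port ([RFC3986], Section 3.2.2). The hierarchical path component and the optional query component together serve as an identifier for a potential target resource within that origin server's namespace. The optional fragment component allows indirect identification of a secondary resource, independent of the URI scheme ("http" or "https"), as defined in Section 3.5 of [RFC3986].

Translator's note: as explained in Section 3.2 of [RFC3986], "authority" here refers to a governing authority, written as a domain name or IP address plus an optional port. Loosely speaking, it works like the street address of a building: the address leads you to the building, while the "path" is the route from the front door to a particular room. Besides its senses of "institution", "power", and "the authorities", the word "authority" also has other interesting meanings in fields such as library management.

Translator's note: a "component" is one of the units that make up a complete URI.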

A sender MUST NOT generate an "http" URI with an empty host identifier. A recipient that processes such a URI reference MUST reject it as invalid.

A sender MUST NOT generate an "http" URI whose host is empty. A recipient MUST refuse to process such a URI, on the grounds that it is invalid.

If the host identifier is provided as an IP address, the origin server is the listener (if any) on the indicated TCP port at that IP address. If host is a registered name, the registered name is an indirect identifier for use with a name resolution service, such as DNS, to find an address for that origin server. If the port subcomponent is empty or not given, TCP port 80 (the reserved port for WWW services) is the default.

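If the host identifier is given as an IP address, the origin server is the listener (if any) on the indicated TCP port at that IP address. If the host is a registered name (roughly, a domain name), that name is an indirect identifier for use with a name resolution service, such as DNS, to find an address for the origin server. If the port subcomponent is empty or not given, TCP port 80 (the reserved port for WWW services) is the default.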
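A minimal sketch of these rules, assuming Python's standard urllib.parse for the initial split: the hypothetical helper origin_endpoint returns the host and port a client would connect to, applies the default port when the port is empty or absent, and rejects an empty host. The entry for "https" uses the default of 443 that this specification gives for that scheme.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin_endpoint(uri: str) -> tuple:
    """Hypothetical helper: (host, port) identified by an http(s) URI."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not an http or https URI: {uri!r}")
    hostport = parts.netloc.rpartition("@")[2]  # ignore any deprecated userinfo
    if hostport.startswith("["):
        # IP-literal such as [::1]; the port, if any, follows the closing bracket.
        host, _, rest = hostport.partition("]")
        host += "]"
        port_text = rest[1:] if rest.startswith(":") else ""
    else:
        host, _, port_text = hostport.partition(":")
    if not host:
        raise ValueError("an http URI with an empty host is invalid")
    if port_text and not (port_text.isascii() and port_text.isdigit()):
        raise ValueError(f"invalid port: {port_text!r}")
    # An empty or missing port falls back to the default of the scheme.
    port = int(port_text) if port_text else DEFAULT_PORTS[scheme]
    return host.lower(), port
```

Resolving a registered name to an address (for example through DNS) is a separate step that this sketch leaves out.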

Note that the presence of a URI with a given authority component does not imply that there is always an HTTP server listening for connections on that host and port. Anyone can mint a URI. What the authority component determines is who has the right to respond authoritatively to requests that target the identified resource. The delegated nature of registered names and IP addresses creates a federated namespace, based on control over the indicated host and port, whether or not an HTTP server is present. See Section 9.1 for security considerations related to establishing authority.

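Note that the existence of a URI with a given authority component does not imply that an HTTP server is always listening for connections on that host and port. Anyone can mint a URI. What the authority component determines is who has the right to respond authoritatively to requests that target the identified resource. The delegated nature of registered names and IP addresses creates a federated namespace, based on control over the indicated host and port, whether or not an HTTP server is present. See Section 9.1 for security considerations related to establishing authority.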

When an "http" URI is used within a context that calls for access to the indicated resource, a client MAY attempt access by resolving the host to an IP address, establishing a TCP connection to that address on the indicated port, and sending an HTTP request message (Section 3) containing the URI's identifying data (Section 5) to the server. If the server responds to that request with a non-interim HTTP response message, as described in Section 6 of [RFC7231], then that response is considered an authoritative answer to the client's request.

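When an "http" URI is used in a context that calls for access to the indicated resource, a client MAY attempt access by resolving the host to an IP address, establishing a TCP connection to that address on the indicated port, and sending the server an HTTP request message (Section 3) containing the URI's identifying data (Section 5). If the server responds to that request with a non-interim HTTP response message (see Section 6 of [RFC7231]), that response is considered an authoritative answer to the client's request.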

Although HTTP is independent of the transport protocol, the "http" scheme is specific to TCP-based services because the name delegation process depends on TCP for establishing authority. An HTTP service based on some other underlying connection protocol would presumably be identified using a different URI scheme, just as the "https" scheme (below) is used for resources that require an end-to-end secured connection. Other protocols might also be used to provide access to "http" identified resources — it is only the authoritative interface that is specific to TCP.

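Although HTTP is independent of the transport protocol, the "http" scheme is specific to TCP-based services, because the name delegation process depends on TCP for establishing authority (Section 9.1). An HTTP service based on some other underlying connection protocol would presumably be identified using a different URI scheme, just as the "https" scheme is used for resources that require an end-to-end secured connection. Other protocols might also be used to provide access to "http" identified resources; only the authoritative interface is specific to TCP.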

The URI generic syntax for authority also includes a deprecated userinfo subcomponent ([RFC3986], Section 3.2.1) for including user authentication information in the URI. Some implementations make use of the userinfo component for internal configuration of authentication information, such as within command invocation options, configuration files, or bookmark lists, even though such usage might expose a user identifier or password. A sender MUST NOT generate the userinfo subcomponent (and its "@" delimiter) when an "http" URI reference is generated within a message as a request target or header field value. Before making use of an "http" URI reference received from an untrusted source, a recipient SHOULD parse for userinfo and treat its presence as an error; it is likely being used to obscure the authority for the sake of phishing attacks.

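The URI generic syntax for authority also includes a deprecated userinfo subcomponent ([RFC3986], Section 3.2.1) for carrying user authentication information in the URI. Some implementations use the userinfo component for internal configuration of authentication information, such as within command invocation options, configuration files, or bookmark lists, even though such usage might expose a user identifier or password. When an "http" URI reference is generated within a message as a request target or header field value (for example, in the Location header field), a sender MUST NOT generate the userinfo subcomponent (or its "@" delimiter). Before making use of an "http" URI reference received from an untrusted source, a recipient SHOULD parse for userinfo and treat its presence as an error; it is likely being used to obscure the authority for the sake of phishing attacks.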
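The recipient-side check can be sketched as follows. The helper name reject_userinfo is hypothetical, and raising ValueError is just one possible way of "treating its presence as an error".

```python
from urllib.parse import urlsplit


def reject_userinfo(uri: str) -> str:
    """Hypothetical helper: return the URI unchanged, or raise ValueError
    if an http(s) URI carries a userinfo subcomponent in its authority."""
    parts = urlsplit(uri)
    if parts.scheme.lower() in ("http", "https") and "@" in parts.netloc:
        raise ValueError(f"userinfo is not allowed in {uri!r}")
    return uri
```

For instance, the check rejects http://www.example.com@evil.example/login, where the text before "@" is userinfo and the real host is evil.example, while a URI that merely contains "@" in its path is accepted.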

2.7.2. https URI Scheme

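The "https" URI scheme is hereby defined for the purpose of minting identifiers according to their association with the hierarchical namespace governed by a potential HTTP origin server listening to a given TCP port for TLS-secured connections ([RFC5246]).

The "https" URI scheme (the "https" scheme for short) likewise exists to mint a certain kind of identifier. The rule for minting such an identifier is based on its association with the hierarchical namespace managed by an origin server that listens on a given TCP port for connections secured with TLS ([RFC5246]).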

All of the requirements listed above for the "http" scheme are also requirements for the "https" scheme, except that TCP port 443 is the default if the port subcomponent is empty or not given, and the user agent MUST ensure that its connection to the origin server is secured through the use of strong encryption, end-to-end, prior to sending the first HTTP request.

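All of the requirements listed above for the "http" scheme also apply to the "https" scheme, with two exceptions: TCP port 443 (rather than 80) is the default if the port subcomponent is empty or not given, and the user agent MUST ensure that its connection to the origin server is secured end-to-end through the use of strong encryption before it sends the first HTTP request.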

https-URI = "https:" "//" authority path-abempty [ "?" query ] [ "#" fragment ]

Note that the "https" URI scheme depends on both TLS and TCP for establishing authority. Resources made available via the "https" scheme have no shared identity with the "http" scheme even if their resource identifiers indicate the same authority (the same host listening to the same TCP port). They are distinct namespaces and are considered to be distinct origin servers. However, an extension to HTTP that is defined to apply to entire host domains, such as the Cookie protocol [RFC6265], can allow information set by one service to impact communication with other services within a matching group of host domains.

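Note that the "https" URI scheme depends on both TLS and TCP for establishing authority (Section 9.1). Resources made available via the "https" scheme have no shared identity with the "http" scheme, even if their resource identifiers indicate the same authority (the same host listening on the same TCP port). They are distinct namespaces and are considered to be distinct origin servers. However, an extension to HTTP that is defined to apply to entire host domains, such as the Cookie protocol [RFC6265], can allow information set by one service to affect communication with other services within a matching group of host domains.

Translator's note: even when two URIs are identical in every component except the scheme, such as http://www.example.com/path and https://www.example.com/path, they do not necessarily identify the same resource, because they are two different URIs.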

The process for authoritative access to an "https" identified resource is defined in [RFC2818].

[RFC2818] defines the process for authoritative access to a resource that is identified with "https".

2.7.3. http and https URI Normalization and Comparison

Since the "http" and "https" schemes conform to the URI generic syntax, such URIs are normalized and compared according to the algorithm defined in Section 6 of [RFC3986], using the defaults described above for each scheme.

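Because the "http" and "https" schemes both conform to the URI generic syntax, such URIs can be normalized and compared with the algorithm defined in Section 6 of [RFC3986], applying the defaults described above for each scheme.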

If the port is equal to the default port for a scheme, the normal form is to omit the port subcomponent. When not being used in absolute form as the request target of an OPTIONS request, an empty path component is equivalent to an absolute path of "/", so the normal form is to provide a path of "/" instead. The scheme and host are case-insensitive and normally provided in lowercase; all other components are compared in a case-sensitive manner. Characters other than those in the "reserved" set are equivalent to their percent-encoded octets: the normal form is to not encode them (see Sections 2.1 and 2.2 of [RFC3986]).

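If the port of a URI equals the default port for its scheme (80 for "http", 443 for "https"), the normal form is to omit the port subcomponent. Except when used in absolute form as the request target of an OPTIONS request, an empty path component is equivalent to an absolute path of "/", so the normal form is to provide a path of "/" instead. The scheme and host are case-insensitive and normally written in lowercase; all other components are compared in a case-sensitive manner. Characters other than those in the "reserved" set are equivalent to their percent-encoded octets, and the normal form is to leave them unencoded (see Sections 2.1 and 2.2 of [RFC3986]).

Translator's note: percent-encoding, also called URL encoding, represents an octet as "%" followed by two hexadecimal digits (0-9, A-F).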

For example, the following three URIs are equivalent:

That is, each of the three URIs below identifies the same resource:

http://example.com:80/~smith/home.html
http://EXAMPLE.com/%7Esmith/home.html
http://EXAMPLE.com:/%7esmith/home.html
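The equivalence of these three URIs can be checked with a small normalization sketch. It is deliberately partial: it lowercases the scheme and host, drops an empty or default port, replaces an empty path with "/", decodes percent-encoded unreserved characters, and uppercases the remaining escapes, as described in Section 6.2.2 of [RFC3986]. It does not remove dot-segments, and it does not handle the OPTIONS exception, which concerns request targets rather than stored URIs. The function name normalize_http_uri is hypothetical.

```python
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}
# Octet values of the RFC 3986 "unreserved" characters.
UNRESERVED = set(b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~")
_PCT = re.compile(r"%([0-9A-Fa-f]{2})")


def _normalize_escapes(text: str) -> str:
    """Decode escapes of unreserved characters; uppercase the hex digits of the rest."""
    def replace(match):
        octet = int(match.group(1), 16)
        return chr(octet) if octet in UNRESERVED else "%" + match.group(1).upper()
    return _PCT.sub(replace, text)


def normalize_http_uri(uri: str) -> str:
    """Hypothetical helper: partial normal form of an http or https URI."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()
    if scheme not in DEFAULT_PORTS:
        raise ValueError(f"not an http or https URI: {uri!r}")
    userinfo, at, hostport = parts.netloc.rpartition("@")
    host, colon, port = hostport.rpartition(":")
    if not colon or "]" in port:
        # No port delimiter, or the last colon sits inside an IP-literal.
        host, port = hostport, ""
    if port.isdigit() and int(port) == DEFAULT_PORTS[scheme]:
        port = ""  # the default port is omitted in the normal form
    netloc = userinfo + at + _normalize_escapes(host.lower())
    if port:
        netloc += ":" + port
    path = _normalize_escapes(parts.path) or "/"  # an empty path becomes "/"
    return urlunsplit((scheme, netloc, path,
                       _normalize_escapes(parts.query),
                       _normalize_escapes(parts.fragment)))
```

Applied to the three URIs above, the sketch yields the same string, http://example.com/~smith/home.html, for each of them.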

3. Message Format

All HTTP/1.1 messages consist of a start-line followed by a sequence of octets in a format similar to the Internet Message Format [RFC5322]: zero or more header fields (collectively referred to as the "headers" or the "header section"), an empty line indicating the end of the header section, and an optional message body.

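Every HTTP/1.1 message consists of a start-line, followed by a header section, an empty line indicating the end of the header section, and an optional message body. The header section consists of zero or more header fields, whose format is similar to the Internet Message Format [RFC5322].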

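Translator's note: in this translation, "header" is rendered as 消息头 (another common rendering is 报头字段), and "message body" as 消息体 (another common rendering is 报文正文).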

HTTP-message   = start-line
                 *( header-field CRLF )
                 CRLF
                 [ message-body ]

The normal procedure for parsing an HTTP message is to read the start-line into a structure, read each header field into a hash table by field name until the empty line, and then use the parsed data to determine if a message body is expected. If a message body has been indicated, then it is read as a stream until an amount of octets equal to the message body length is read or the connection is closed.

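In general, an HTTP message is parsed by first reading the start-line into a structure, then reading each header field into a hash table keyed by field name until the empty line is reached, and finally using the parsed data to decide whether a message body is expected. If the header section indicates a message body, the body is read as a stream until the number of octets read equals the message body length or the connection is closed.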
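The procedure just described can be sketched in Python as follows. This is a simplified illustration under several assumptions: the whole message is already in memory as bytes, the body length comes only from Content-Length (Transfer-Encoding and reading until the connection closes are not handled), and obsolete line folding is not supported. The function name parse_message is hypothetical.

```python
def parse_message(raw: bytes):
    """Hypothetical helper: split a complete HTTP/1.1 message into
    (start_line, headers, body), working on octets rather than text."""
    head, separator, rest = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("header section is not terminated by an empty line")
    lines = head.split(b"\r\n")
    start_line, field_lines = lines[0], lines[1:]

    # Hash table keyed by lowercased field name; repeated fields keep every value.
    headers = {}
    for line in field_lines:
        name, colon, value = line.partition(b":")
        if not colon or not name or name != name.strip():
            raise ValueError(f"malformed header field: {line!r}")
        key = name.decode("ascii").lower()
        headers.setdefault(key, []).append(value.strip(b" \t"))

    # Assumption: only Content-Length is used to find the body length.
    length = int(headers.get("content-length", [b"0"])[0].decode("ascii"))
    if len(rest) < length:
        raise ValueError("message body is shorter than Content-Length")
    return start_line, headers, rest[:length]
```

A real parser would read from a socket incrementally and apply the full message body length rules of Section 3.3.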

A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII [USASCII]. Parsing an HTTP message as a stream of Unicode characters, without regard for the specific encoding, creates security vulnerabilities due to the varying ways that string processing libraries handle invalid multibyte character sequences that contain the octet LF (%x0A). String-based parsers can only be safely used within protocol elements after the element has been extracted from the message, such as within a header field-value after message parsing has delineated the individual fields.

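A recipient MUST parse an HTTP message as a sequence of octets in an encoding that is a superset of US-ASCII. Parsing a message as Unicode characters without regard for the specific encoding creates security vulnerabilities, because string processing libraries vary in how they handle invalid multibyte character sequences that contain the octet LF (%x0A). A string-based parser can be used safely only within a protocol element that has already been extracted from the message, for example within a single header field-value after message parsing has delineated the individual fields.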

An HTTP message can be parsed as a stream for incremental processing or forwarding downstream. However, recipients cannot rely on incremental delivery of partial messages, since some implementations will buffer or delay message forwarding for the sake of network efficiency, security checks, or payload transformations.

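An HTTP message can be parsed as a stream, either for incremental processing or for forwarding downstream. Recipients cannot, however, rely on incremental delivery of partial messages, because some implementations buffer or delay message forwarding for the sake of network efficiency, security checks, or payload transformations (Section 5.7.2).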

A sender MUST NOT send whitespace between the start-line and the first header field. A recipient that receives whitespace between the start-line and the first header field MUST either reject the message as invalid or consume each whitespace-preceded line without further processing of it (i.e., ignore the entire line, along with any subsequent lines preceded by whitespace, until a properly formed header field is received or the header section is terminated).

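In the messages a sender sends, whitespace MUST NOT appear between the start-line and the first header field. A recipient that finds whitespace there MUST either reject the message as invalid or consume each whitespace-preceded line without further processing (that is, ignore the entire line, along with any subsequent lines preceded by whitespace, until a properly formed header field is received or the header section is terminated).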

The presence of such whitespace in a request might be an attempt to trick a server into ignoring that field or processing the line after it as a new request, either of which might result in a security vulnerability if other implementations within the request chain interpret the same message differently. Likewise, the presence of such whitespace in a response might be ignored by some clients or cause others to cease parsing.

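Such whitespace in a request might be an attempt to trick a server into ignoring that field, or into processing the line after it as a new request. Either outcome might result in a security vulnerability if other implementations within the request chain interpret the same message differently. Likewise, such whitespace in a response might be ignored by some clients or cause others to cease parsing.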
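The two permitted reactions can be sketched as one small function over the lines that follow the start-line. The function name and the strict flag are hypothetical; which reaction an implementation chooses is a local policy decision.

```python
def drop_leading_whitespace_lines(header_lines, strict=True):
    """Hypothetical helper: handle whitespace-preceded lines that appear
    between the start-line and the first header field.

    header_lines holds the raw lines (bytes, without CRLF) after the start-line.
    """
    def starts_with_whitespace(line):
        return line[:1] in (b" ", b"\t")

    if not header_lines or not starts_with_whitespace(header_lines[0]):
        return list(header_lines)
    if strict:
        # First option: reject the whole message as invalid.
        raise ValueError("whitespace between start-line and first header field")
    # Second option: consume every leading whitespace-preceded line unprocessed.
    index = 0
    while index < len(header_lines) and starts_with_whitespace(header_lines[index]):
        index += 1
    return list(header_lines[index:])
```

Rejecting is the simpler choice where every hop in a request chain must interpret a message the same way.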

3.1. Start Line

An HTTP message can be either a request from client to server or a response from server to client. Syntactically, the two types of message differ only in the start-line, which is either a request-line (for requests) or a status-line (for responses), and in the algorithm for determining the length of the message body (Section 3.3).

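An HTTP message is either a request from client to server or a response from server to client. Syntactically, the two types of message differ in only two respects:

  1. the start-line, which is a request-line for requests and a status-line for responses
  2. the algorithm for determining the length of the message body (Section 3.3)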

In theory, a client could receive requests and a server could receive responses, distinguishing them by their different start-line formats, but, in practice, servers are implemented to only expect a request (a response is interpreted as an unknown or invalid request method) and clients are implemented to only expect a response.

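In theory, a client could also receive requests and a server could also receive responses, telling them apart by their different start-line formats. In practice, however, servers are implemented to expect only requests (a response is interpreted as an unknown or invalid request method), and clients are implemented to expect only responses.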

start-line     = request-line / status-line

3.1.1. Request Line

A request-line begins with a method token, followed by a single space (SP), the request-target, another single space (SP), the protocol version, and ends with CRLF.

The request-line begins with a method token (method), followed by a single space (SP), then the request-target, another single space (SP), the protocol version (HTTP-version), and finally CRLF.

request-line   = method SP request-target SP HTTP-version CRLF

The method token indicates the request method to be performed on the target resource. The request method is case-sensitive.

The method identifies which request method is to be applied to the target resource. Request methods are case-sensitive.

method         = token

The request methods defined by this specification can be found in Section 4 of [RFC7231], along with information regarding the HTTP method registry and considerations for defining new methods.

See Section 4 of [RFC7231] for the request methods defined by this specification, together with information about the HTTP method registry and considerations for defining new methods.

The request-target identifies the target resource upon which to apply the request, as defined in Section 5.3.

The request-target identifies the target resource to which the request applies; it is defined in Section 5.3.

Recipients typically parse the request-line into its component parts by splitting on whitespace (see Section 3.5), since no whitespace is allowed in the three components. Unfortunately, some user agents fail to properly encode or exclude whitespace found in hypertext references, resulting in those disallowed characters being sent in a request-target.

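When parsing a request-line, recipients typically split it on whitespace into its three components (the method, the request-target, and the protocol version), which is possible because none of the three may contain whitespace (see Section 3.5). Unfortunately, some user agents fail to properly encode or exclude whitespace found in hypertext references (that is, hyperlinks), so that those disallowed characters are sent in a request-target.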

Recipients of an invalid request-line SHOULD respond with either a 400 (Bad Request) error or a 301 (Moved Permanently) redirect with the request-target properly encoded. A recipient SHOULD NOT attempt to autocorrect and then process the request without a redirect, since the invalid request-line might be deliberately crafted to bypass security filters along the request chain.

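A recipient of an invalid request-line SHOULD respond with either a 400 (Bad Request) error or a 301 (Moved Permanently) redirect in which the request-target is properly encoded. A recipient SHOULD NOT attempt to autocorrect and then process the request without a redirect, since the invalid request-line might have been deliberately crafted to bypass security filters along the request chain.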
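A strict request-line parser along these lines might look like the following sketch. It assumes the line is given as bytes including its CRLF, applies the ABNF exactly (a single SP between components, with none of the lenient whitespace handling that Section 3.5 permits), and signals an invalid line with a hypothetical BadRequest exception that a server could map to a 400 response.

```python
import re

# token = 1*tchar, as used by the "method" rule.
_TOKEN = re.compile(rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")
_HTTP_VERSION = re.compile(rb"HTTP/[0-9]\.[0-9]")


class BadRequest(Exception):
    """Hypothetical exception; a server could translate it into 400 (Bad Request)."""
    status = 400


def parse_request_line(line: bytes):
    """Hypothetical helper: return (method, request_target, http_version)."""
    if not line.endswith(b"\r\n"):
        raise BadRequest("request-line must end with CRLF")
    parts = line[:-2].split(b" ")
    if len(parts) != 3:
        raise BadRequest("expected: method SP request-target SP HTTP-version")
    method, target, version = parts
    if not _TOKEN.fullmatch(method):
        raise BadRequest("method is not a valid token")
    # The request-target must be non-empty and free of whitespace and controls.
    if not target or any(b <= 0x20 or b == 0x7F for b in target):
        raise BadRequest("invalid request-target")
    if not _HTTP_VERSION.fullmatch(version):
        raise BadRequest("invalid HTTP-version")
    return method, target, version
```

A request-target containing an unencoded space produces four parts instead of three, so it is rejected rather than silently repaired.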

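Translator's note: unless stated otherwise, phrases such as "respond with a code (status) message", "respond with a code (status) status code", and "send a code (status) response" all mean "respond with (or send) a response message that carries the code (status) status code". For example, "the server sends a 200 (OK) response" means "the server sends a response message with the 200 (OK) status code".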

HTTP does not place a predefined limit on the length of a request-line, as described in Section 2.5. A server that receives a method longer than any that it implements SHOULD respond with a 501 (Not Implemented) status code. A server that receives a request-target longer than any URI it wishes to parse MUST respond with a 414 (URI Too Long) status code (see Section 6.5.12 of [RFC7231]).

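HTTP does not predefine a limit on the length of a request-line, for the reasons described in Section 2.5. A server that receives a method longer than any that it implements SHOULD respond with a 501 (Not Implemented) status code. A server that receives a request-target longer than any URI it wishes to parse MUST respond with a 414 (URI Too Long) status code (see Section 6.5.12 of [RFC7231]).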

Various ad hoc limitations on request-line length are found in practice. It is RECOMMENDED that all HTTP senders and recipients support, at a minimum, request-line lengths of 8000 octets.

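In practice, request-line length is subject to various ad hoc limitations. This specification RECOMMENDS that all HTTP senders and recipients support request-line lengths of at least 8000 octets.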
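The two length checks can be sketched as below. The method list and the limit of 8000 octets on the request-target are assumed local policy for this example; the specification only recommends a minimum supported request-line length and leaves actual limits to each implementation.

```python
# Assumed set of methods implemented by this hypothetical server.
IMPLEMENTED_METHODS = (b"GET", b"HEAD", b"POST", b"PUT", b"DELETE", b"OPTIONS")
# Assumed local policy, not a value mandated by the specification.
MAX_TARGET_LENGTH = 8000


def length_error_status(method: bytes, target: bytes):
    """Hypothetical helper: status code for an over-long component, else None."""
    if len(method) > max(len(m) for m in IMPLEMENTED_METHODS):
        return 501  # Not Implemented: longer than any implemented method
    if len(target) > MAX_TARGET_LENGTH:
        return 414  # URI Too Long
    return None
```

A method that is short enough but still unknown is a separate case that this length check does not cover.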

3.1.2. Status Line

The first line of a response message is the status-line, consisting of the protocol version, a space (SP), the status code, another space, a possibly empty textual phrase describing the status code, and ending with CRLF.

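The first line of a response message is called the status-line. It consists of the protocol version (HTTP-version), a space (SP), the status code (status-code), another space (SP), a possibly empty textual phrase describing the status code (reason-phrase), and finally CRLF.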

status-line = HTTP-version SP status-code SP reason-phrase CRLF

The status-code element is a 3-digit integer code describing the result of the server's attempt to understand and satisfy the client's corresponding request. The rest of the response message is to be interpreted in light of the semantics defined for that status code. See Section 6 of [RFC7231] for information about the semantics of status codes, including the classes of status code (indicated by the first digit), the status codes defined by this specification, considerations for the definition of new status codes, and the IANA registry.

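The status-code is a 3-digit integer that describes the result of the server's attempt to understand and satisfy the client's corresponding request. The recipient should interpret the rest of the response message (everything after the status-line) in light of the semantics defined for that status code. See Section 6 of [RFC7231] for the semantics of status codes, including the classes of status code (indicated by the first digit), the status codes defined by this specification, considerations for defining new status codes, and the IANA registry.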

status-code    = 3DIGIT

The reason-phrase element exists for the sole purpose of providing a textual description associated with the numeric status code, mostly out of deference to earlier Internet application protocols that were more frequently used with interactive text clients. A client SHOULD ignore the reason-phrase content.

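The reason-phrase exists solely to provide a textual description of the numeric status code, mostly out of deference to earlier Internet application protocols that were more frequently used with interactive text clients. A client SHOULD ignore the content of the reason-phrase.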

reason-phrase  = *( HTAB / SP / VCHAR / obs-text )
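The status-line grammar translates fairly directly into a regular expression over octets. The following Python sketch follows the ABNF strictly, so the space after the status code is required even when the reason-phrase is empty; the function name parse_status_line is hypothetical.

```python
import re

# status-line = HTTP-version SP status-code SP reason-phrase CRLF
# reason-phrase = *( HTAB / SP / VCHAR / obs-text )
_STATUS_LINE = re.compile(
    rb"(HTTP/[0-9]\.[0-9]) ([0-9]{3}) ([\t \x21-\x7E\x80-\xFF]*)\r\n"
)


def parse_status_line(line: bytes):
    """Hypothetical helper: return (http_version, status_code, reason_phrase)."""
    match = _STATUS_LINE.fullmatch(line)
    if match is None:
        raise ValueError(f"invalid status-line: {line!r}")
    version, code, reason = match.groups()
    return version, int(code), reason
```

The class of a status code is then its first digit, for example code // 100, and the reason-phrase can simply be discarded by a client.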

3.2. Header Fields

Each header field consists of a case-insensitive field name followed by a colon (":"), optional leading whitespace, the field value, and optional trailing whitespace.

Every header field is made up of a case-insensitive field name, a colon (":") that follows it, optional leading whitespace, a field value, and optional trailing whitespace.

header-field   = field-name ":" OWS field-value OWS

field-name     = token
field-value    = *( field-content / obs-fold )
field-content  = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-vchar    = VCHAR / obs-text

obs-fold       = CRLF 1*( SP / HTAB )
               ; obsolete line folding
               ; see Section 3.2.4

The field-name token labels the corresponding field-value as having the semantics defined by that header field. For example, the Date header field is defined in Section 7.1.1.2 of [RFC7231] as containing the origination timestamp for the message in which it appears.

3.2.1. Field Extensibility

Header fields are fully extensible: there is no limit on the introduction of new field names, each presumably defining new semantics, nor on the number of header fields used in a given message. Existing fields are defined in each part of this specification and in many other specifications outside this document set.

New header fields can be defined such that, when they are understood by a recipient, they might override or enhance the interpretation of previously defined header fields, define preconditions on request evaluation, or refine the meaning of responses.

Translator's note: "request evaluation" can be read here as the recipient interpreting the request message: extracting its information and converting it into a form the recipient can act on, so that the intent of the request is understood.

A proxy MUST forward unrecognized header fields unless the field-name is listed in the Connection header field (Section 6.1) or the proxy is specifically configured to block, or otherwise transform, such fields. Other recipients SHOULD ignore unrecognized header fields. These requirements allow HTTP's functionality to be enhanced without requiring prior update of deployed intermediaries.
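
The forwarding rule can be sketched as a filter over a list of `(name, value)` pairs. This is an illustration under stated assumptions: the function name `forwardable_fields` is hypothetical, the Connection field value is assumed to be a plain comma-separated list of field names, and removing the Connection field itself follows Section 6.1 rather than this paragraph.

```python
def forwardable_fields(fields):
    """Return the header fields a proxy would forward.

    `fields` is a list of (name, value) string pairs in received order.
    Unrecognized fields pass through untouched; only fields nominated by
    Connection (and Connection itself) are dropped.
    """
    nominated = {"connection"}
    for name, value in fields:
        if name.lower() == "connection":
            for option in value.split(","):
                option = option.strip(" \t").lower()
                if option:
                    nominated.add(option)
    return [(n, v) for n, v in fields if n.lower() not in nominated]
```

A proxy configured to block or transform particular fields would apply that policy as a separate step; the sketch covers only the default behavior.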

All defined header fields ought to be registered with IANA in the "Message Headers" registry, as described in Section 8.3 of [RFC7231].

3.2.2. Field Order

The order in which header fields with differing field names are received is not significant. However, it is good practice to send header fields that contain control data first, such as Host on requests and Date on responses, so that implementations can decide when not to handle a message as early as possible. A server MUST NOT apply a request to the target resource until the entire request header section is received, since later header fields might include conditionals, authentication credentials, or deliberately misleading duplicate header fields that would impact request processing.

A sender MUST NOT generate multiple header fields with the same field name in a message unless either the entire field value for that header field is defined as a comma-separated list [i.e., #(values)] or the header field is a well-known exception (as noted below).

A recipient MAY combine multiple header fields with the same field name into one "field-name: field-value" pair, without changing the semantics of the message, by appending each subsequent field value to the combined field value in order, separated by a comma. The order in which header fields with the same field name are received is therefore significant to the interpretation of the combined field value; a proxy MUST NOT change the order of these field values when forwarding a message.

Note: In practice, the "Set-Cookie" header field ([RFC6265]) often appears multiple times in a response message and does not use the list syntax, violating the above requirements on multiple header fields with the same name. Since it cannot be combined into a single field-value, recipients ought to handle "Set-Cookie" as a special case while processing header fields. (See Appendix A.2.3 of [Kri2001] for details.)
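
A minimal sketch of the combining rule, with Set-Cookie kept apart as the special case the note describes. The function name `combine_fields` and the return shape (a dict plus a list of cookies) are assumptions made for illustration.

```python
def combine_fields(fields):
    """Merge same-named header fields in received order.

    `fields` is a list of (name, value) string pairs. Returns
    (combined, set_cookies): a dict keyed by lowercased field name whose
    values are joined with ", ", and the Set-Cookie values left unmerged.
    """
    combined = {}
    set_cookies = []
    for name, value in fields:
        key = name.lower()
        if key == "set-cookie":
            # Set-Cookie does not use the list syntax, so never join it.
            set_cookies.append(value)
        elif key in combined:
            combined[key] += ", " + value
        else:
            combined[key] = value
    return combined, set_cookies
```

Because values are appended in the order received, the combined value keeps the ordering that the text says is significant.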

3.2.3. Whitespace

This specification uses three rules to denote the use of linear whitespace: OWS (optional whitespace), RWS (required whitespace), and BWS ("bad" whitespace).

Translator's note: "whitespace" is broader than "space". Whitespace characters include not only U+0020 SPACE but usually also U+0009 CHARACTER TABULATION (tab) and many others; see the Wikipedia article "Whitespace character".

The OWS rule is used where zero or more linear whitespace octets might appear. For protocol elements where optional whitespace is preferred to improve readability, a sender SHOULD generate the optional whitespace as a single SP; otherwise, a sender SHOULD NOT generate optional whitespace except as needed to white out invalid or unwanted protocol elements during in-place message filtering.

The RWS rule is used when at least one linear whitespace octet is required to separate field tokens. A sender SHOULD generate RWS as a single SP.

The BWS rule is used where the grammar allows optional whitespace only for historical reasons. A sender MUST NOT generate BWS in messages. A recipient MUST parse for such bad whitespace and remove it before interpreting the protocol element.

OWS            = *( SP / HTAB )
               ; optional whitespace
RWS            = 1*( SP / HTAB )
               ; required whitespace
BWS            = OWS
               ; "bad" whitespace

3.2.4. Field Parsing

Messages are parsed using a generic algorithm, independent of the individual header field names. The contents within a given field value are not parsed until a later stage of message interpretation (usually after the message's entire header section has been processed). Consequently, this specification does not use ABNF rules to define each "Field-Name: Field Value" pair, as was done in previous editions. Instead, this specification uses ABNF rules that are named according to each registered field name, wherein the rule defines the valid grammar for that field's corresponding field values (i.e., after the field-value has been extracted from the header section by a generic field parser).

No whitespace is allowed between the header field-name and colon. In the past, differences in the handling of such whitespace have led to security vulnerabilities in request routing and response handling. A server MUST reject any received request message that contains whitespace between a header field-name and colon with a response code of 400 (Bad Request). A proxy MUST remove any such whitespace from a response message before forwarding the message downstream.

A field value might be preceded and/or followed by optional whitespace (OWS); a single SP preceding the field-value is preferred for consistent readability by humans. The field value does not include any leading or trailing whitespace: OWS occurring before the first non-whitespace octet of the field value or after the last non-whitespace octet of the field value ought to be excluded by parsers when extracting the field value from a header field.
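
The two rules above (no whitespace before the colon, OWS excluded from the value) can be sketched as a generic field-line parser. The name `parse_header_line` is hypothetical, and the input is assumed to be a single field line with its CRLF already removed and any obs-fold already handled.

```python
import re

# field-name = token = 1*tchar
_TCHAR = rb"!#$%&'*+\-.^_`|~0-9A-Za-z"
_FIELD_NAME = re.compile(rb"[" + _TCHAR + rb"]+")


def parse_header_line(line: bytes):
    """Split one header field line into (lowercased name, value bytes)."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("missing colon")
    # Whitespace is not a tchar, so "Host : x" fails here; a server
    # would answer such a request with 400 (Bad Request).
    if _FIELD_NAME.fullmatch(name) is None:
        raise ValueError("invalid field-name")
    # Leading and trailing OWS are not part of the field value.
    return name.decode("ascii").lower(), value.strip(b" \t")
```

Lowercasing the name is a design choice that makes later case-insensitive lookups simple; the value is left as bytes because its internal syntax is only parsed at a later stage.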

Historically, HTTP header field values could be extended over multiple lines by preceding each extra line with at least one space or horizontal tab (obs-fold). This specification deprecates such line folding except within the message/http media type (Section 8.3.1). A sender MUST NOT generate a message that includes line folding (i.e., that has any field-value that contains a match to the obs-fold rule) unless the message is intended for packaging within the message/http media type.

A server that receives an obs-fold in a request message that is not within a message/http container MUST either reject the message by sending a 400 (Bad Request), preferably with a representation explaining that obsolete line folding is unacceptable, or replace each received obs-fold with one or more SP octets prior to interpreting the field value or forwarding the message downstream.

A proxy or gateway that receives an obs-fold in a response message that is not within a message/http container MUST either discard the message and replace it with a 502 (Bad Gateway) response, preferably with a representation explaining that unacceptable line folding was received, or replace each received obs-fold with one or more SP octets prior to interpreting the field value or forwarding the message downstream.

A user agent that receives an obs-fold in a response message that is not within a message/http container MUST replace each received obs-fold with one or more SP octets prior to interpreting the field value.
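
The "replace each obs-fold with one or more SP octets" option can be sketched with a single substitution. The function name `unfold` is an assumption, and the input is assumed to be the header field lines only, excluding the start-line and the empty line that ends the header section.

```python
import re

# obs-fold = CRLF 1*( SP / HTAB )
_OBS_FOLD = re.compile(rb"\r\n[ \t]+")


def unfold(header_lines: bytes) -> bytes:
    """Replace every obs-fold in the header field lines with one SP."""
    return _OBS_FOLD.sub(b" ", header_lines)
```

A server or intermediary that prefers the other option would instead reject the message with 400 (Bad Request) or 502 (Bad Gateway) when `_OBS_FOLD.search` finds a match.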

Historically, HTTP has allowed field content with text in the ISO-8859-1 charset [ISO-8859-1], supporting other charsets only through use of [RFC2047] encoding. In practice, most HTTP header field values use only a subset of the US-ASCII charset [USASCII]. Newly defined header fields SHOULD limit their field values to US-ASCII octets. A recipient SHOULD treat other octets in field content (obs-text) as opaque data.

Translator's note: in computing, "opaque" and "transparent" both carry a sense of hiding. Opaque means you can see that a thing exists but cannot see inside it or how it works; it also suggests encapsulation. With an API, for example, you know which calls exist and how to use them but not how they are implemented, so the implementation is opaque to you. Transparent means the thing's presence goes unnoticed: it is there, but you do not perceive it. The private fields and methods of a Java class are an example: callers are unaware of them, yet they exist.

In short, opaque is "you know what it does but not how", whereas transparent is "it is right in front of you, yet you cannot see it".

3.2.5. Field Limits

HTTP does not place a predefined limit on the length of each header field or on the length of the header section as a whole, as described in Section 2.5. Various ad hoc limitations on individual header field length are found in practice, often depending on the specific field semantics.

A server that receives a request header field, or set of fields, larger than it wishes to process MUST respond with an appropriate 4xx (Client Error) status code. Ignoring such header fields would increase the server's vulnerability to request smuggling attacks (Section 9.5).
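
A server-side size check might look like the following sketch. The limits, the function name `header_limit_status`, and the choice of 431 (Request Header Fields Too Large, defined in RFC 6585 rather than in this document) are all assumptions; the text only requires "an appropriate 4xx" status code.

```python
MAX_FIELD_BYTES = 8 * 1024      # hypothetical per-field limit
MAX_SECTION_BYTES = 64 * 1024   # hypothetical limit for the whole section


def header_limit_status(field_lines, max_field=MAX_FIELD_BYTES,
                        max_section=MAX_SECTION_BYTES):
    """Return None if the fields fit the limits, else a 4xx status code."""
    total = 0
    for line in field_lines:
        total += len(line)
        if len(line) > max_field or total > max_section:
            # Respond with an error instead of silently ignoring the field.
            return 431
    return None
```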

A client MAY discard or truncate received header fields that are larger than the client wishes to process if the field semantics are such that the dropped value(s) can be safely ignored without changing the message framing or response semantics.

3.2.6. Field Value Components

Most HTTP header field values are defined using common syntax components (token, quoted-string, and comment) separated by whitespace or specific delimiting characters. Delimiters are chosen from the set of US-ASCII visual characters not allowed in a token (DQUOTE and "(),/:;<=>?@[\]{}").

token          = 1*tchar

tchar          = "!" / "#" / "$" / "%" / "&" / "'" / "*"
               / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~" 
               / DIGIT / ALPHA
               ; any VCHAR, except delimiters

A string of text is parsed as a single value if it is quoted using double-quote marks.

quoted-string  = DQUOTE *( qdtext / quoted-pair ) DQUOTE
qdtext         = HTAB / SP / %x21 / %x23-5B / %x5D-7E / obs-text
obs-text       = %x80-FF

Comments can be included in some HTTP header fields by surrounding the comment text with parentheses. Comments are only allowed in fields containing "comment" as part of their field value definition.

comment        = "(" *( ctext / quoted-pair / comment ) ")"
ctext          = HTAB / SP / %x21-27 / %x2A-5B / %x5D-7E / obs-text

The backslash octet ("\") can be used as a single-octet quoting mechanism within quoted-string and comment constructs. Recipients that process the value of a quoted-string MUST handle a quoted-pair as if it were replaced by the octet following the backslash.

quoted-pair    = "\" ( HTAB / SP / VCHAR / obs-text )

A sender SHOULD NOT generate a quoted-pair in a quoted-string except where necessary to quote DQUOTE and backslash octets occurring within that string. A sender SHOULD NOT generate a quoted-pair in a comment except where necessary to quote parentheses ["(" and ")"] and backslash octets occurring within that comment.
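
The quoted-pair rules for both roles can be sketched as a pair of helpers. The names `unquote` and `quote` are hypothetical, and for brevity the sketch does not check that every character is a valid qdtext octet.

```python
def unquote(qs: str) -> str:
    """Recipient side: return the value carried by a quoted-string."""
    if len(qs) < 2 or qs[0] != '"' or qs[-1] != '"':
        raise ValueError("not a quoted-string")
    out = []
    i, end = 1, len(qs) - 1
    while i < end:
        ch = qs[i]
        if ch == "\\":
            # quoted-pair: take the octet after the backslash literally.
            i += 1
            if i >= end:
                raise ValueError("dangling backslash")
            ch = qs[i]
        elif ch == '"':
            raise ValueError("unescaped DQUOTE inside quoted-string")
        out.append(ch)
        i += 1
    return "".join(out)


def quote(text: str) -> str:
    """Sender side: escape only DQUOTE and backslash."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'
```

`quote` escapes the backslash first so that the backslashes it adds for DQUOTE are not escaped a second time.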

3.3. Message Body

The message body (if any) of an HTTP message is used to carry the payload body of that request or response. The message body is identical to the payload body unless a transfer coding has been applied, as described in Section 3.3.1.

message-body = *OCTET

The rules for when a message body is allowed in a message differ for requests and responses.

The presence of a message body in a request is signaled by a Content-Length or Transfer-Encoding header field. Request message framing is independent of method semantics, even if the method does not define any use for a message body.

The presence of a message body in a response depends on both the request method to which it is responding and the response status code (Section 3.1.2). Responses to the HEAD request method (Section 4.3.2 of [RFC7231]) never include a message body because the associated response header fields (e.g., Transfer-Encoding, Content-Length, etc.), if present, indicate only what their values would have been if the request method had been GET (Section 4.3.1 of [RFC7231]). 2xx (Successful) responses to a CONNECT request method (Section 4.3.6 of [RFC7231]) switch to tunnel mode instead of having a message body. All 1xx (Informational), 204 (No Content), and 304 (Not Modified) responses do not include a message body. All other responses do include a message body, although the body might be of zero length.
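
The response cases in the paragraph above reduce to a small predicate. The function name `response_may_have_body` is an assumption, and the request method is assumed to be given in its canonical uppercase form.

```python
def response_may_have_body(request_method: str, status: int) -> bool:
    """Return False for the responses that never carry a message body."""
    if request_method == "HEAD":
        return False
    if request_method == "CONNECT" and 200 <= status < 300:
        # The connection switches to tunnel mode instead.
        return False
    if 100 <= status < 200 or status in (204, 304):
        return False
    # All other responses have a body, possibly of zero length.
    return True
```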

3.3.1. Transfer-Encoding

The Transfer-Encoding header field lists the transfer coding names corresponding to the sequence of transfer codings that have been (or will be) applied to the payload body in order to form the message body. Transfer codings are defined in Section 4.

Transfer-Encoding = 1#transfer-coding

Transfer-Encoding is analogous to the Content-Transfer-Encoding field of MIME, which was designed to enable safe transport of binary data over a 7-bit transport service ([RFC2045], Section 6). However, safe transport has a different focus for an 8bit-clean transfer protocol. In HTTP's case, Transfer-Encoding is primarily intended to accurately delimit a dynamically generated payload and to distinguish payload encodings that are only applied for transport efficiency or security from those that are characteristics of the selected resource.

A recipient MUST be able to parse the chunked transfer coding (Section 4.1) because it plays a crucial role in framing messages when the payload body size is not known in advance. A sender MUST NOT apply chunked more than once to a message body (i.e., chunking an already chunked message is not allowed). If any transfer coding other than chunked is applied to a request payload body, the sender MUST apply chunked as the final transfer coding to ensure that the message is properly framed. If any transfer coding other than chunked is applied to a response payload body, the sender MUST either apply chunked as the final transfer coding or terminate the message by closing the connection.

For example,

Transfer-Encoding: gzip, chunked

indicates that the payload body has been compressed using the gzip coding and then chunked using the chunked coding while forming the message body.
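
The sender-side constraints on chunked for a request can be checked as sketched below. Both function names are hypothetical. Splitting on commas is a simplification that assumes no transfer-parameter contains a quoted comma, and lowercasing relies on transfer-coding names being case-insensitive (Section 4).

```python
def split_transfer_codings(field_value: str) -> list:
    """Split a Transfer-Encoding field value into lowercased coding names."""
    parts = (c.strip(" \t").lower() for c in field_value.split(","))
    return [c for c in parts if c]


def check_request_codings(codings: list) -> None:
    """Raise ValueError if a request's transfer codings break the rules."""
    if codings.count("chunked") > 1:
        raise ValueError("chunked applied more than once")
    if codings and codings[-1] != "chunked":
        raise ValueError("chunked must be the final coding in a request")
```

For a response, the second check would be relaxed: a sender may omit the final chunked coding if it ends the message by closing the connection.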

Unlike Content-Encoding (Section 3.1.2.1 of [RFC7231]), Transfer-Encoding is a property of the message, not of the representation, and any recipient along the request/response chain MAY decode the received transfer coding(s) or apply additional transfer coding(s) to the message body, assuming that corresponding changes are made to the Transfer-Encoding field-value. Additional information about the encoding parameters can be provided by other header fields not defined by this specification.

Content-Encoding 头字段(【RFC7231】章节 3.1.2.1)不同的是,Transfer-Encoding 头字段是消息message的一个属性,而不是表示形式representation的一个属性,并且在请求/响应链路中的所有接收端都 可以 依据所接收到的传输代码值(一个或多个)对消息体进行解码,或者向消息体应用额外的传输代码值,假设 Transfer-Encodingfield-value 有进行对应的改动的话。额外的编码参数相关的信息能够通过其他不在本规范中定义的头字段提供。

Transfer-Encoding MAY be sent in a response to a HEAD request or in a 304 (Not Modified) response (Section 4.1 of [RFC7232]) to a GET request, neither of which includes a message body, to indicate that the origin server would have applied a transfer coding to the message body if the request had been an unconditional GET. This indication is not required, however, because any recipient on the response chain (including the origin server) can remove transfer codings when they are not needed.

A server MUST NOT send a Transfer-Encoding header field in any response with a status code of 1xx (Informational) or 204 (No Content). A server MUST NOT send a Transfer-Encoding header field in any 2xx (Successful) response to a CONNECT request (Section 4.3.6 of [RFC7231]).

Transfer-Encoding was added in HTTP/1.1. It is generally assumed that implementations advertising only HTTP/1.0 support will not understand how to process a transfer-encoded payload. A client MUST NOT send a request containing Transfer-Encoding unless it knows the server will handle HTTP/1.1 (or later) requests; such knowledge might be in the form of specific user configuration or by remembering the version of a prior received response. A server MUST NOT send a response containing Transfer-Encoding unless the corresponding request indicates HTTP/1.1 (or later).

A server that receives a request message with a transfer coding it does not understand SHOULD respond with 501 (Not Implemented).

3.3.2. Content-Length

When a message does not have a Transfer-Encoding header field, a Content-Length header field can provide the anticipated size, as a decimal number of octets, for a potential payload body. For messages that do include a payload body, the Content-Length field-value provides the framing information necessary for determining where the body (and message) ends. For messages that do not include a payload body, the Content-Length indicates the size of the selected representation (Section 3 of [RFC7231]).

Content-Length = 1*DIGIT

An example is

Content-Length: 3495

A sender MUST NOT send a Content-Length header field in any message that contains a Transfer-Encoding header field.

A user agent SHOULD send a Content-Length in a request message when no Transfer-Encoding is sent and the request method defines a meaning for an enclosed payload body. For example, a Content-Length header field is normally sent in a POST request even when the value is 0 (indicating an empty payload body). A user agent SHOULD NOT send a Content-Length header field when the request message does not contain a payload body and the method semantics do not anticipate such a body.

A server MAY send a Content-Length header field in a response to a HEAD request (Section 4.3.2 of [RFC7231]); a server MUST NOT send Content-Length in such a response unless its field-value equals the decimal number of octets that would have been sent in the payload body of a response if the same request had used the GET method.

A server MAY send a Content-Length header field in a 304 (Not Modified) response to a conditional GET request (Section 4.1 of [RFC7232]); a server MUST NOT send Content-Length in such a response unless its field-value equals the decimal number of octets that would have been sent in the payload body of a 200 (OK) response to the same request.

A server MUST NOT send a Content-Length header field in any response with a status code of 1xx (Informational) or 204 (No Content). A server MUST NOT send a Content-Length header field in any 2xx (Successful) response to a CONNECT request (Section 4.3.6 of [RFC7231]).

Aside from the cases defined above, in the absence of Transfer-Encoding, an origin server SHOULD send a Content-Length header field when the payload body size is known prior to sending the complete header section. This will allow downstream recipients to measure transfer progress, know when a received message is complete, and potentially reuse the connection for additional requests.

Any Content-Length field value greater than or equal to zero is valid. Since there is no predefined limit to the length of a payload, a recipient MUST anticipate potentially large decimal numerals and prevent parsing errors due to integer conversion overflows (Section 9.3).

If a message is received that has multiple Content-Length header fields with field-values consisting of the same decimal value, or a single Content-Length header field with a field value containing a list of identical decimal values (e.g., "Content-Length: 42, 42"), indicating that duplicate Content-Length header fields have been generated or combined by an upstream message processor, then the recipient MUST either reject the message as invalid or replace the duplicated field-values with a single valid Content-Length field containing that decimal value prior to determining the message body length or forwarding the message.
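
One way to handle duplicated values is sketched below, taking the "replace with a single valid field" option when all values agree and rejecting the message otherwise. The function name `parse_content_length` is an assumption. Python integers do not overflow; an implementation in a language with fixed-width integers would also need an explicit range check.

```python
def parse_content_length(field_values):
    """Return the single length carried by one or more Content-Length fields.

    `field_values` is a list of raw field values, one per received
    Content-Length header field, e.g. ["42"] or ["42, 42"].
    """
    seen = set()
    for raw in field_values:
        for part in raw.split(","):
            part = part.strip(" \t")
            # Content-Length = 1*DIGIT: ASCII digits only, no sign.
            if not (part.isascii() and part.isdigit()):
                raise ValueError("invalid Content-Length value")
            seen.add(int(part))
    if len(seen) != 1:
        raise ValueError("missing or conflicting Content-Length values")
    return seen.pop()
```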

Note: HTTP's use of Content-Length for message framing differs significantly from the same field's use in MIME, where it is an optional field used only within the "message/external-body" media-type.

3.3.3. Message Body Length

The length of a message body is determined by one of the following (in order of precedence):

  1. Any response to a HEAD request and any response with a 1xx (Informational), 204 (No Content), or 304 (Not Modified) status code is always terminated by the first empty line after the header fields, regardless of the header fields present in the message, and thus cannot contain a message body.
  2. Any 2xx (Successful) response to a CONNECT request implies that the connection will become a tunnel immediately after the empty line that concludes the header fields. A client MUST ignore any Content-Length or Transfer-Encoding header fields received in such a message.
  3. If a Transfer-Encoding header field is present and the chunked transfer coding (Section 4.1) is the final encoding, the message body length is determined by reading and decoding the chunked data until the transfer coding indicates the data is complete.

    If a Transfer-Encoding header field is present in a response and the chunked transfer coding is not the final encoding, the message body length is determined by reading the connection until it is closed by the server. If a Transfer-Encoding header field is present in a request and the chunked transfer coding is not the final encoding, the message body length cannot be determined reliably; the server MUST respond with the 400 (Bad Request) status code and then close the connection.

    If a message is received with both a Transfer-Encoding and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request smuggling (Section 9.5) or response splitting (Section 9.4) and ought to be handled as an error. A sender MUST remove the received Content-Length field prior to forwarding such a message downstream.

  4. If a message is received without Transfer-Encoding and with either multiple Content-Length header fields having differing field-values or a single Content-Length header field having an invalid value, then the message framing is invalid and the recipient MUST treat it as an unrecoverable error. If this is a request message, the server MUST respond with a 400 (Bad Request) status code and then close the connection. If this is a response message received by a proxy, the proxy MUST close the connection to the server, discard the received response, and send a 502 (Bad Gateway) response to the client. If this is a response message received by a user agent, the user agent MUST close the connection to the server and discard the received response.
  5. If a valid Content-Length header field is present without Transfer-Encoding, its decimal value defines the expected message body length in octets. If the sender closes the connection or the recipient times out before the indicated number of octets are received, the recipient MUST consider the message to be incomplete and close the connection.
  6. If this is a request message and none of the above are true, then the message body length is zero (no message body is present).
  7. Otherwise, this is a response message without a declared message body length, so the message body length is determined by the number of octets received prior to the server closing the connection.
  1. 任何响应 HEAD 请求的响应消息,以及任何带有 1xx (Informational)、204 (No Content) 或 304 (Not Modified) 状态码的响应消息,总是终止于头字段之后的第一个空行,无论消息中出现了什么头字段,因此这种消息不能包含消息体。
  2. 任何响应 CONNECT 请求的带有 2xx (Successful) 状态码的响应消息,意味着在结束头字段的空行之后,这个连接将立即变成一条隧道。客户端 必须 忽略在这种消息里接收到的任何 Content-Length 或者 Transfer-Encoding 头字段。
  3. 如果出现了一个 Transfer-Encoding 头字段并且分块传输编码值(chunked transfer coding,章节 4.1)是其最后一个编码值,消息体的长度通过读取和解码分块数据直到传输编码指明数据已经完整来确定。

    如果在响应消息中出现一个 Transfer-Encoding 头字段并且分块传输编码值不是其最后一个编码值,消息体的长度通过读取连接直到连接被服务器关闭来确定。如果在请求消息中出现一个 Transfer-Encoding 头字段并且分块传输编码值不是其最后一个编码值,消息体的长度不能被可靠地确定,服务器 必须 响应一个带有 400 (Bad Request) 状态码的响应消息然后关闭该连接。

    如果接收到一个消息既带有一个 Transfer-Encoding 又带有一个 Content-Length 头字段,那么 Transfer-Encoding 将覆盖 Content-Length。这种消息可能意味着一个执行请求走私(request smuggling,章节 9.5)或者响应分割(response splitting,章节 9.4)的意图,应该被当作一个错误来处理。发送端 必须 在转发这种消息到下游之前移除所接收到的 Content-Length 字段。

  4. 如果一个消息没有 Transfer-Encoding 头字段,但带有多个字段值不同的 Content-Length 头字段,或者带有一个字段值不合法的 Content-Length 头字段,那么这个消息的分帧是不合法的,并且接收端 必须 将其当作一个不可恢复的错误来对待。如果这是一个请求消息,服务器 必须 响应一个带有 400 (Bad Request) 状态码的响应消息然后关闭连接。如果这是一个代理接收到的响应消息,那么这个代理 必须 关闭与服务器的连接,丢弃掉这个响应消息,并且发送一个 502 (Bad Gateway) 状态码的响应消息到客户端。如果这是一个用户代理接收到的响应消息,那么这个用户代理 必须 关闭与服务器的连接,并且丢弃掉这个响应消息。
  5. 如果消息在没有 Transfer-Encoding 的情况下带有一个合法的 Content-Length 头字段,那么它的十进制数值定义了消息体的预计长度(以字节为单位)。如果在接收到所指明数量的字节之前,发送端关闭了连接或者接收端超时,那么接收端 必须 认为该消息是不完整的,并且关闭该连接。
  6. 如果这是一个请求消息,并且不符合上述各种情况,那么该消息体的长度为 0(没有出现消息体)。
  7. 否则,这是一个没有声明消息体长度的响应消息,因此该消息体的长度是由服务器关闭连接之前,接收端接收到的字节数来决定的。
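
The seven rules above can be read as a decision procedure. The following Python sketch is one possible rendering, not a reference implementation. The names `body_framing` and `FramingError`, the string labels it returns, and the shape of the `headers` argument (lower-cased field names mapped to lists of received values) are all assumptions made for illustration. A Content-Length value that is not a plain run of ASCII digits is simply treated as invalid.

```python
from typing import Dict, List, Optional, Tuple


class FramingError(Exception):
    """Framing is invalid; a server would answer 400 and close the connection."""


def body_framing(
    is_request: bool,
    headers: Dict[str, List[str]],
    status: Optional[int] = None,
    request_method: Optional[str] = None,
) -> Tuple[str, Optional[int]]:
    """Return (kind, length); kind is one of
    'none', 'tunnel', 'chunked', 'length' or 'until-close'.

    For a response, request_method is the method of the request it answers.
    """
    if not is_request:
        if status is None:
            raise ValueError("a response needs a status code")
        # Rule 1: HEAD responses and 1xx/204/304 responses carry no body.
        if request_method == "HEAD" or 100 <= status < 200 or status in (204, 304):
            return ("none", 0)
        # Rule 2: a 2xx response to CONNECT turns the connection into a tunnel.
        if request_method == "CONNECT" and 200 <= status < 300:
            return ("tunnel", None)

    codings = [
        c.split(";", 1)[0].strip().lower()
        for value in headers.get("transfer-encoding", [])
        for c in value.split(",")
        if c.strip()
    ]
    if codings:
        # Rule 3: Transfer-Encoding overrides any Content-Length.
        if codings[-1] == "chunked":
            return ("chunked", None)
        if is_request:
            raise FramingError("chunked is not the final coding of a request")
        return ("until-close", None)

    lengths = {v.strip() for v in headers.get("content-length", [])}
    if lengths:
        # Rule 4: differing or invalid values are an unrecoverable error.
        if len(lengths) > 1:
            raise FramingError("differing Content-Length values")
        (value,) = lengths
        if not (value.isascii() and value.isdigit()):
            raise FramingError("invalid Content-Length value")
        # Rule 5: a valid Content-Length gives the expected length in octets.
        return ("length", int(value))

    # Rule 6 (request without framing) and rule 7 (close-delimited response).
    return ("none", 0) if is_request else ("until-close", None)
```

A caller that catches `FramingError` on the response side would still need to apply the 502 or discard behaviour from rule 4; the sketch only classifies the message.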

Since there is no way to distinguish a successfully completed, close-delimited message from a partially received message interrupted by network failure, a server SHOULD generate encoding or length-delimited messages whenever possible. The close-delimiting feature exists primarily for backwards compatibility with HTTP/1.0.

因为无法区分一个成功接收完整的、以关闭连接定界(close-delimited)的消息和一个被网络故障打断而只接收到一部分的消息,所以服务器 应当 尽可能生成经过编码或者以长度定界(length-delimited)的消息。以关闭连接定界这一功能存在的主要目的是为了向后兼容 HTTP/1.0。

A server MAY reject a request that contains a message body but not a Content-Length by responding with 411 (Length Required).

服务器 可以 通过发送一个带有 411 (Length Required) 状态码的响应来拒绝一个包含了一个消息体但却没有一个 Content-Length 头字段的请求。

Unless a transfer coding other than chunked has been applied, a client that sends a request containing a message body SHOULD use a valid Content-Length header field if the message body length is known in advance, rather than the chunked transfer coding, since some existing services respond to chunked with a 411 (Length Required) status code even though they understand the chunked transfer coding. This is typically because such services are implemented via a gateway that requires a content-length in advance of being called and the server is unable or unwilling to buffer the entire request before processing.

除非应用了一个除 chunked 以外的传输编码值,发送包含一个消息体的请求的客户端,如果预先知道消息体的长度,应该 使用一个合法的 Content-Length 头字段,而不是使用分块传输编码值,这是因为某些现存的服务会使用 411 (Length Required) 状态码的响应消息来回应这种分块的请求消息,即使这些服务理解分块传输编码值。这通常是因为这种服务是经由一个在被调用之前要求预先知道内容长度的网关来实现的,并且服务器不能或者不愿意在处理之前先缓冲好整个请求。

A user agent that sends a request containing a message body MUST send a valid Content-Length header field if it does not know the server will handle HTTP/1.1 (or later) requests; such knowledge can be in the form of specific user configuration or by remembering the version of a prior received response.

发送了一个包含消息体的请求消息的用户代理,如果它并不知道服务器能够处理 HTTP/1.1(或之后)的请求的时候,必须 发送一个合法的 Content-Length 头字段。其中,用户代理可以使用具体的配置,或者通过记住之前接收到的响应消息的版本号的方式知道服务器是否可以处理 HTTP/1.1(或之后)的请求。

If the final response to the last request on a connection has been completely received and there remains additional data to read, a user agent MAY discard the remaining data or attempt to determine if that data belongs as part of the prior response body, which might be the case if the prior message's Content-Length value is incorrect. A client MUST NOT process, cache, or forward such extra data as a separate response, since such behavior would be vulnerable to cache poisoning.

在一个连接中,如果一个用户代理已经完成了对应最后一个请求的最后一个完整响应消息的接收工作后,发现仍然剩余额外的数据需要读取,那么这个用户代理 可以 丢弃这些剩余数据,或者试图辨别这些剩余数据是否属于之前的响应主体,如果之前的消息的 Content-Length 的值是不正确的话可能会导致这种情况。客户端 禁止 将这种额外的数据作为一个单独的响应来进行处理、缓存或者转发,这是因为这种行为可能会存在缓存中毒(Cache Poisoning)的隐患。

3.4. 消息不完整的处理 / Handling Incomplete Messages

A server that receives an incomplete request message, usually due to a canceled request or a triggered timeout exception, MAY send an error response prior to closing the connection.

服务器接收到一个不完整的请求消息,通常是因为这是一个已取消的请求,或者一个触发超时的异常,可以 在关闭连接前发送一个错误响应。

A client that receives an incomplete response message, which can occur when a connection is closed prematurely or when decoding a supposedly chunked transfer coding fails, MUST record the message as incomplete. Cache requirements for incomplete responses are defined in Section 3 of [RFC7234].

当一个连接被过早地关闭或者当一个推想的分块传输代码(chunked transfer coding)解码失败时会导致客户端接收到一个不完整的响应消息,这时候客户端 必须 记录下这个消息是不完整的。缓存cache对于不完整消息的要求定义在【RFC7234】章节 3 里。

If a response terminates in the middle of the header section (before the empty line is received) and the status code might rely on header fields to convey the full meaning of the response, then the client cannot assume that meaning has been conveyed; the client might need to repeat the request in order to determine what action to take next.

假如一个响应终止在消息头部header section之间(在接收到空行之前),并且状态码可能需要依赖头字段才能传达响应消息的完整意义,那么客户端不能够假设这个“意义”已经传达了;客户端可能需要重复这一次请求,以便于决定下一步将如何行动。

A message body that uses the chunked transfer coding is incomplete if the zero-sized chunk that terminates the encoding has not been received. A message that uses a valid Content-Length is incomplete if the size of the message body received (in octets) is less than the value given by Content-Length. A response that has neither chunked transfer coding nor Content-Length is terminated by closure of the connection and, thus, is considered complete regardless of the number of message body octets received, provided that the header section was received intact.

一个使用分块传输代码的消息体,如果终止编码的 zero-sized 分块没有被接收到,那么这个消息体是不完整的。一个使用一个合法 Content-Length 的消息,如果所接收到的消息体的大小(按字节来算)小于给定 Content-Length 的大小,那么这个消息是不完整的。一个既没有使用分块传输代码,也没有提供 Content-Length 的响应消息,如果因为连接被关闭而终止,只要该消息所提供的消息头部已经接收完整,那么可以认为消息是完整的,无论消息体的内容被接收到多少。
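
The completeness rules in the paragraph above reduce to a small predicate. The sketch below is illustrative only: the function name `message_is_complete`, its keyword arguments, and the framing labels are hypothetical and chosen for this example.

```python
from typing import Optional


def message_is_complete(
    framing: str,
    *,
    received: int = 0,
    declared: Optional[int] = None,
    saw_last_chunk: bool = False,
    header_intact: bool = True,
) -> bool:
    """Decide whether a received message counts as complete.

    framing is 'chunked', 'length' or 'until-close' (labels are assumptions).
    """
    if not header_intact:
        # The header section itself was cut off.
        return False
    if framing == "chunked":
        # Complete only once the zero-sized terminating chunk has arrived.
        return saw_last_chunk
    if framing == "length":
        if declared is None:
            raise ValueError("'length' framing needs the declared Content-Length")
        # Incomplete if fewer octets arrived than Content-Length announced.
        return received >= declared
    if framing == "until-close":
        # Close-delimited responses are complete whatever the body size.
        return True
    raise ValueError(f"unknown framing: {framing!r}")
```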

3.5. 消息解析的健壮性 / Message Parsing Robustness

Older HTTP/1.0 user agent implementations might send an extra CRLF after a POST request as a workaround for some early server applications that failed to read message body content that was not terminated by a line-ending. An HTTP/1.1 user agent MUST NOT preface or follow a request with an extra CRLF. If terminating the request message body with a line-ending is desired, then the user agent MUST count the terminating CRLF octets as part of the message body length.

在 HTTP/1.0 时代,早期的服务器应用可能不能正确读取那些没有以换行符 line-ending 作为结尾的消息体内容。因此,早期的 HTTP/1.0 版本的用户代理可能会在 POST 请求消息中发送一个额外的回车换行 CRLF 来作为一个变通方案。到了 HTTP/1.1,用户代理 不能 在一个请求消息首或尾再添加额外的回车换行 CRLF 了。如果要求以换行符 line-ending 作为请求消息体的结束,那么用户代理 必须 对 CRLF 的字节数目进行计数,包含在消息体的长度之内。

In the interest of robustness, a server that is expecting to receive and parse a request-line SHOULD ignore at least one empty line (CRLF) received prior to the request-line.

出于健壮性的考虑,服务器在等待接收和解析(parse)一个请求行(request-line)的时候,应当 忽略在请求行之前接收到的至少一个空行(CRLF)。

Although the line terminator for the start-line and header fields is the sequence CRLF, a recipient MAY recognize a single LF as a line terminator and ignore any preceding CR.

虽然起始行(start-line)和消息头字段(header fields)的行终结符(line terminator)是 CRLF 字符序列,但接收端 可以 将单独一个 LF 识别为行终结符,并且忽略其之前的任何 CR。
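
As a minimal sketch of these two allowances (skipping empty lines before the request-line and accepting a bare LF), a server could split off the start-line roughly as follows. The function name `split_start_line` and its return shape are assumptions for this example, and a real parser would also bound how much input it is willing to skip.

```python
from typing import Optional, Tuple


def split_start_line(buf: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return (start_line, remaining_bytes), or None if no full line is buffered."""
    pos = 0
    # Ignore empty lines received before the request-line.
    while True:
        if buf.startswith(b"\r\n", pos):
            pos += 2
        elif buf.startswith(b"\n", pos):
            pos += 1
        else:
            break
    end = buf.find(b"\n", pos)
    if end == -1:
        return None  # the line terminator has not arrived yet
    # Treat LF as the terminator and ignore any CR directly before it.
    line = buf[pos:end].rstrip(b"\r")
    return line, buf[end + 1:]
```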

Although the request-line and status-line grammar rules require that each of the component elements be separated by a single SP octet, recipients MAY instead parse on whitespace-delimited word boundaries and, aside from the CRLF terminator, treat any form of whitespace as the SP separator while ignoring preceding or trailing whitespace; such whitespace includes one or more of the following octets: SP, HTAB, VT (%x0B), FF (%x0C), or bare CR. However, lenient parsing can result in security vulnerabilities if there are multiple recipients of the message and each has its own unique interpretation of robustness (see Section 9.5).

虽然请求行(request-line)和状态行(status-line)的语法规则要求每一个组件元素之间以单个 SP 字节分隔,但是接收端 可以 改为按照以空白定界的单词边界(whitespace-delimited word boundaries)来进行解析,并且,除了 CRLF 结束符以外,将任何形式的空白都当作 SP 分隔符来对待,同时忽略前置或后置的空白。这种空白包括一个或多个以下的字节:SP、HTAB、VT (%x0B)、FF (%x0C) 或者单独的 CR。但是,如果消息有多个接收端,而且每个接收端对健壮性都有自己独特的解释,那么这种宽松解析(lenient parsing)的方式会引起安全隐患(见章节 9.5)。

When a server listening only for HTTP request messages, or processing what appears from the start-line to be an HTTP request message, receives a sequence of octets that does not match the HTTP-message grammar aside from the robustness exceptions listed above, the server SHOULD respond with a 400 (Bad Request) response.

当一个服务器只监听 HTTP 请求消息,或者正在处理从起始行(start-line)看起来是 HTTP 请求消息的内容时,如果接收到一个除了上述健壮性例外情况以外与 HTTP-message 语法不匹配的字节序列,服务器 应该 响应一个 400 (Bad Request) 响应消息。

4. 传输编码值 / Transfer Codings

Transfer coding names are used to indicate an encoding transformation that has been, can be, or might need to be applied to a payload body in order to ensure "safe transport" through the network. This differs from a content coding in that the transfer coding is a property of the message rather than a property of the representation that is being transferred.

传输编码值transfer coding是用于表示已经、能够或者可能需要应用到一个有效载荷payload body中以确保网络传输安全的一种编码转换encoding transformation。与内容编码值content coding不同的是,传输编码值是一个消息的属性,而不是一个表示形式representation的属性。

transfer-coding    = "chunked" ; Section 4.1
                    / "compress" ; Section 4.2.1
                    / "deflate" ; Section 4.2.2
                    / "gzip" ; Section 4.2.3
                    / transfer-extension
transfer-extension = token *( OWS ";" OWS transfer-parameter )

Parameters are in the form of a name or name=value pair.

参数 transfer-parameter 是以一个名称 name 或者键值对 name=value 的形式存在。

transfer-parameter = token BWS "=" BWS ( token / quoted-string )

All transfer-coding names are case-insensitive and ought to be registered within the HTTP Transfer Coding registry, as defined in Section 8.4. They are used in the TE (Section 4.3) and Transfer-Encoding (Section 3.3.1) header fields.

所有 transfer-coding 的名称都是不区分大小写的,并且应该被登记在 "HTTP Transfer Coding" 登记表中,定义在章节 8.4 中。它们用在消息字段 TE章节 4.3)和 Transfer-Encoding章节 3.3.1)里。

4.1. 分块传输编码值 / Chunked Transfer Coding

The chunked transfer coding wraps the payload body in order to transfer it as a series of chunks, each with its own size indicator, followed by an OPTIONAL trailer containing header fields. Chunked enables content streams of unknown size to be transferred as a sequence of length-delimited buffers, which enables the sender to retain connection persistence and the recipient to know when it has received the entire message.

chunked 传输编码值将有效载荷(payload body)包裹起来,以便将其作为一系列分块来传输,每个分块都带有自己的大小指示,最后跟随一个 可选的、包含头字段的分块尾部(trailer)。chunked 允许以一系列长度定界(length-delimited)的缓冲(buffers)的方式来传输大小未知的内容流,这使得发送端能够保持连接的持久性,也使接收端得知在什么时候接收完整个消息。

chunked-body   = *chunk
                  last-chunk
                  trailer-part
                  CRLF

chunk          = chunk-size [ chunk-ext ] CRLF
                  chunk-data CRLF
chunk-size     = 1*HEXDIG
last-chunk     = 1*("0") [ chunk-ext ] CRLF

chunk-data     = 1*OCTET ; a sequence of chunk-size octets

The chunk-size field is a string of hex digits indicating the size of the chunk-data in octets. The chunked transfer coding is complete when a chunk with a chunk-size of zero is received, possibly followed by a trailer, and finally terminated by an empty line.

其中,chunk-size 字段是一个十六进制数字的字符串,代表 chunk-data 的字节大小。当接收到一个 chunk-size 值为零的分块(其后可能跟随一个分块尾部 trailer),并且最终以一个空行结束时,即代表 chunked 传输编码值是完整的。
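
To make the grammar concrete, here is a small Python sketch of a sender producing a chunked-body from a stream of byte strings. The generator name `encode_chunked` and the representation of trailers as `(name, value)` string pairs are assumptions for this example. Empty pieces are skipped because a chunk of size zero would be read as the last-chunk.

```python
from typing import Iterable, Iterator, Tuple


def encode_chunked(
    pieces: Iterable[bytes],
    trailers: Iterable[Tuple[str, str]] = (),
) -> Iterator[bytes]:
    """Yield the octets of a chunked-body for the given payload pieces."""
    for piece in pieces:
        if not piece:
            continue  # a zero-sized chunk would terminate the body early
        # chunk = chunk-size CRLF chunk-data CRLF, with the size in hex
        yield b"%X\r\n" % len(piece) + piece + b"\r\n"
    # last-chunk
    yield b"0\r\n"
    # trailer-part (optional)
    for name, value in trailers:
        yield name.encode("ascii") + b": " + value.encode("latin-1") + b"\r\n"
    # final empty line
    yield b"\r\n"
```

For instance, joining the output for the pieces `b"Wiki"` and `b"pedia in chunks."` gives `4\r\nWiki\r\n10\r\npedia in chunks.\r\n0\r\n\r\n`, where `10` is the hexadecimal size of the 16-octet second chunk.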

A recipient MUST be able to parse and decode the chunked transfer coding.

一个接收端 必须 能够解析(parse)和解码(decode)chunked 传输编码值。

4.1.1. 分块扩展 / Chunk Extensions

The chunked encoding allows each chunk to include zero or more chunk extensions, immediately following the chunk-size, for the sake of supplying per-chunk metadata (such as a signature or hash), mid-message control information, or randomization of message body size.

chunked 编码允许每一个分块包含零个或以上分块扩展chunk extensions chunk-ext,紧跟在 chunk-size 之后,用于支持 per-chunk 元数据(例如一个签名或哈希值)、mid-message 控制信息,或者随机消息体大小。

chunk-ext      = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )

chunk-ext-name = token
chunk-ext-val  = token / quoted-string

The chunked encoding is specific to each connection and is likely to be removed or recoded by each recipient (including intermediaries) before any higher-level application would have a chance to inspect the extensions. Hence, use of chunk extensions is generally limited to specialized HTTP services such as "long polling" (where client and server can have shared expectations regarding the use of chunk extensions) or for padding within an end-to-end secured connection.

chunked 编码是会对每一个连接而言的,而且它可能会在任何高层应用有机会检测到分块的扩展信息之前就被接收端(包括中间人)所移除或者重新编码。因此,分块扩展的使用通常限定于特定的 HTTP 服务中,例如“长轮询long polling”(一个客户端和服务端能够对分块扩展的使用达成共同的期望的地方)或者在一个端到端end-to-end安全连接之间的填充。

A recipient MUST ignore unrecognized chunk extensions. A server ought to limit the total length of chunk extensions received in a request to an amount reasonable for the services provided, in the same way that it applies length limitations and timeouts for other parts of a message, and generate an appropriate 4xx (Client Error) response if that amount is exceeded.

一个接收端 必须 忽略无法识别的分块扩展。一个服务端应该将所接收到的分块扩展的总长度限制到一个由相应服务提供的合理的值上,就像服务端对一个消息的其他部分应用长度限制和超时一样,并且生成一个合适的 4xx (Client Error) 响应,如果超过那个长度限制的话。
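
A recipient might handle the chunk-size line and its extensions along the following lines. This is a simplified sketch: `parse_chunk_header` and the `MAX_CHUNK_LINE` limit of 4096 octets are hypothetical, and splitting on ";" does not cope with a quoted-string that itself contains ";" or "=". Unrecognized extensions are parsed and returned, so the caller is free to ignore them.

```python
from typing import Dict, Optional, Tuple

MAX_CHUNK_LINE = 4096  # hypothetical limit; pick what suits the service
HEXDIGITS = frozenset(b"0123456789abcdefABCDEF")


def parse_chunk_header(line: bytes) -> Tuple[int, Dict[str, Optional[str]]]:
    """Parse 'chunk-size [ chunk-ext ]' (without the CRLF)."""
    if len(line) > MAX_CHUNK_LINE:
        # A server would generate an appropriate 4xx response here.
        raise ValueError("chunk-size line too long")
    size_part, *ext_parts = line.split(b";")
    # chunk-size = 1*HEXDIG; int(..., 16) alone would accept '0x10' or '+1'.
    if not size_part or not set(size_part) <= HEXDIGITS:
        raise ValueError("invalid chunk-size")
    extensions: Dict[str, Optional[str]] = {}
    for part in ext_parts:
        name, sep, value = part.partition(b"=")
        extensions[name.strip().decode("ascii")] = (
            value.strip().strip(b'"').decode("latin-1") if sep else None
        )
    return int(size_part, 16), extensions
```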

4.1.2. 分块尾部 / Chunked Trailer Part

A trailer allows the sender to include additional fields at the end of a chunked message in order to supply metadata that might be dynamically generated while the message body is sent, such as a message integrity check, digital signature, or post-processing status. The trailer fields are identical to header fields, except they are sent in a chunked trailer instead of the message's header section.

一个分块尾部trailer允许发送端在一个分块的消息的末尾处包含额外的头字段,为了提供可以在消息体被发送的时候动态生成的元数据,例如一个消息完整性检查message integrity check(MIC)、数字签名digital signature或者后处理状态post-processing status等。分块尾字段trailer fields等价于消息头字段header fields,除了它们是发送在一个分块尾部里,而不是在一个消息的消息头部里。

trailer-part   = *( header-field CRLF )

A sender MUST NOT generate a trailer that contains a field necessary for message framing (e.g., Transfer-Encoding and Content-Length), routing (e.g., Host), request modifiers (e.g., controls and conditionals in Section 5 of [RFC7231]), authentication (e.g., see [RFC7235] and [RFC6265]), response control data (e.g., see Section 7.1 of [RFC7231]), or determining how to process the payload (e.g., Content-Encoding, Content-Type, Content-Range, and Trailer).

发送端 禁止 生成包含以下用途所必需的字段的分块尾部:消息分帧(例如,Transfer-Encoding 和 Content-Length)、路由选择(例如,Host)、请求修饰符(例如,【RFC7231】章节 5 里定义的 controls 和 conditionals)、认证(例如,见【RFC7235】和【RFC6265】)、响应控制数据(例如,见【RFC7231】章节 7.1),以及决定如何处理有效载荷的字段(例如,Content-Encoding、Content-Type、Content-Range 和 Trailer)。

When a chunked message containing a non-empty trailer is received, the recipient MAY process the fields (aside from those forbidden above) as if they were appended to the message's header section. A recipient MUST ignore (or consider as an error) any fields that are forbidden to be sent in a trailer, since processing them as if they were present in the header section might bypass external security filters.

当接收到一个包含一个非空尾部non-empty trailer的分块消息时,接收端 可以 像处理头字段一样的方式来处理这些分块尾部里的字段(除了上述禁止的字段以外)。接收端 必须 忽略(或者认为是一个错误)任何禁止存放于分块尾部里的字段,这是因为如果像处理头字段一样处理它们的话可能会越过外部安全过滤机制。
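
One way a recipient could apply this rule is to filter trailer fields before merging them. In the sketch below, `FORBIDDEN_IN_TRAILER` contains only the field names spelled out in the text above. It is deliberately incomplete, since the authentication, request-modifier and response-control categories are described by reference rather than enumerated, so a real deployment would have to extend it. The function name `merge_trailers` is likewise an assumption.

```python
from typing import List, Tuple

# Only the fields named explicitly in the text; not an exhaustive list.
FORBIDDEN_IN_TRAILER = frozenset({
    "transfer-encoding", "content-length",   # message framing
    "host",                                  # routing
    "content-encoding", "content-type",      # payload processing
    "content-range", "trailer",
})

Field = Tuple[str, str]


def merge_trailers(headers: List[Field], trailers: List[Field]) -> List[Field]:
    """Append permitted trailer fields to a copy of the header fields."""
    merged = list(headers)
    for name, value in trailers:
        if name.lower() in FORBIDDEN_IN_TRAILER:
            continue  # ignore fields that must not be sent in a trailer
        merged.append((name, value))
    return merged
```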

Unless the request includes a TE header field indicating "trailers" is acceptable, as described in Section 4.3, a server SHOULD NOT generate trailer fields that it believes are necessary for the user agent to receive. Without a TE containing "trailers", the server ought to assume that the trailer fields might be silently discarded along the path to the user agent. This requirement allows intermediaries to forward a de-chunked message to an HTTP/1.0 recipient without buffering the entire response.

除非请求消息里包含一个 TE 头字段指明允许 "trailers"(详情见章节 4.3),否则服务器 不应该 生成它认为用户代理有必要接收到的分块尾字段(trailer fields)。在没有包含 "trailers" 的 TE 头字段时,服务器应该假设分块尾字段可能会在发送到用户代理的链路中被悄悄地丢弃掉。这样的要求使得中间人能够在不用缓冲(buffering)整个响应消息的情况下,将一个去除了分块编码(de-chunked)的消息转发到一个 HTTP/1.0 接收端。

4.1.3. 解码分块 / Decoding Chunked

A process for decoding the chunked transfer coding can be represented in pseudo-code as:

对分块传输编码进行解码的过程可以使用下列伪代码来表示:

length := 0
read chunk-size, chunk-ext (if any), and CRLF
while (chunk-size > 0) {
    read chunk-data and CRLF
    append chunk-data to decoded-body
    length := length + chunk-size
    read chunk-size, chunk-ext (if any), and CRLF
}
read trailer field
while (trailer field is not empty) {
    if (trailer field is allowed to be sent in a trailer) {
        append trailer field to existing header fields
    }
    read trailer-field
}
Content-Length := length
Remove "chunked" from Transfer-Encoding
Remove Trailer from existing header fields
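
The pseudo-code above could be rendered in Python roughly as follows. This is a sketch under stated assumptions: `decode_chunked` is a hypothetical name, the input is any binary file-like object whose `read(n)` returns `n` octets unless the stream ends (true for `io.BytesIO` and buffered readers), chunk extensions are skipped rather than interpreted, and filtering of forbidden trailer fields is left to the caller.

```python
import io
from typing import BinaryIO, List, Tuple

_HEX = b"0123456789abcdefABCDEF"


def decode_chunked(stream: BinaryIO) -> Tuple[bytes, List[Tuple[str, str]]]:
    """Return (decoded_body, trailer_fields) read from a chunked-body."""
    body = bytearray()
    while True:
        line = stream.readline()
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated chunk-size line")
        # Drop any chunk-ext; only the hexadecimal size is needed here.
        size_text = line[:-2].split(b";", 1)[0]
        if not size_text or any(c not in _HEX for c in size_text):
            raise ValueError("invalid chunk-size")
        size = int(size_text, 16)
        if size == 0:
            break  # last-chunk reached
        data = stream.read(size)
        if len(data) != size or stream.read(2) != b"\r\n":
            raise ValueError("truncated chunk-data")
        body += data
    trailers: List[Tuple[str, str]] = []
    while True:
        line = stream.readline()
        if not line.endswith(b"\r\n"):
            raise ValueError("truncated trailer")
        if line == b"\r\n":
            break  # the empty line ends the chunked-body
        name, sep, value = line[:-2].partition(b":")
        if not sep:
            raise ValueError("malformed trailer field")
        trailers.append((name.decode("ascii"), value.strip().decode("latin-1")))
    return bytes(body), trailers


body, trailers = decode_chunked(
    io.BytesIO(b"4\r\nWiki\r\n5;x=1\r\npedia\r\n0\r\nExpires: never\r\n\r\n")
)
```

After decoding, `len(body)` plays the role of `length` in the pseudo-code, which is the value a recipient would store as Content-Length once "chunked" is removed from Transfer-Encoding.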

4.2. 压缩编码值 / Compression Codings

The codings defined below can be used to compress the payload of a message.

下列编码值能够用于对一个消息里的有效载荷payload进行压缩。

4.2.1. "compress" 编码值 / Compress Coding

The "compress" coding is an adaptive Lempel-Ziv-Welch (LZW) coding [Welch] that is commonly produced by the UNIX file compression program "compress". A recipient SHOULD consider "x-compress" to be equivalent to "compress".

编码值 compress 是一种自适应的 LZW(Lempel-Ziv-Welch)编码【Welch】,通常由 UNIX 文件压缩程序 "compress" 生成。接收端 应该 将 x-compress 等同于 compress。

4.2.2. "deflate" 编码值 / Deflate Coding

The "deflate" coding is a "zlib" data format [RFC1950] containing a "deflate" compressed data stream [RFC1951] that uses a combination of the Lempel-Ziv (LZ77) compression algorithm and Huffman coding.

编码值 deflate 是一种 "zlib" 数据格式【RFC1950】,该数据格式包含一个使用 "deflate" 压缩的数据流【RFC1951】,这种数据流结合使用了 LZ77(Lempel-Ziv)压缩算法和哈夫曼编码(Huffman coding)。

Note: Some non-conformant implementations send the "deflate" compressed data without the zlib wrapper.

注意: 某些不符合规范的实现(implementations)在发送 "deflate" 压缩数据时并不带有 zlib 封装。
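
A recipient that wants to tolerate such senders could first try the zlib format and fall back to a raw deflate stream. The sketch below uses Python's standard `zlib` module; the function name `inflate_deflate_coding` is an assumption, and whether to accept the non-conformant form at all is a policy decision this example does not settle.

```python
import zlib


def inflate_deflate_coding(data: bytes) -> bytes:
    """Decode a 'deflate' coded payload, tolerating a missing zlib wrapper."""
    try:
        # Conformant form: zlib wrapper (RFC 1950) around the deflate stream.
        return zlib.decompress(data)
    except zlib.error:
        # Non-conformant form: raw deflate (RFC 1951); negative wbits
        # tells zlib not to expect a header or checksum.
        return zlib.decompress(data, -zlib.MAX_WBITS)


payload = b"hello world " * 10
wrapped = zlib.compress(payload)
raw_encoder = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
raw = raw_encoder.compress(payload) + raw_encoder.flush()
```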

4.2.3. "gzip" 编码值 / Gzip Coding

The "gzip" coding is an LZ77 coding with a 32-bit Cyclic Redundancy Check (CRC) that is commonly produced by the gzip file compression program [RFC1952]. A recipient SHOULD consider "x-gzip" to be equivalent to "gzip".

编码值 gzip 是一种带有 32 位循环冗余校验(CRC)的 LZ77 编码,通常由 gzip 文件压缩程序【RFC1952】生成。接收端 应该 将 x-gzip 等同于 gzip。

4.3. TE

The "TE" header field in a request indicates what transfer codings, besides chunked, the client is willing to accept in response, and whether or not the client is willing to accept trailer fields in a chunked transfer coding.

在一个请求消息中的头字段 TE 指明了客户端除了愿意接受 chunked 以外还愿意接受哪些传输编码的响应消息,以及客户端是否愿意在分块传输编码值中接受分块尾字段(trailer fields)。

The TE field-value consists of a comma-separated list of transfer coding names, each allowing for optional parameters (as described in Section 4), and/or the keyword "trailers". A client MUST NOT send the chunked transfer coding name in TE; chunked is always acceptable for HTTP/1.1 recipients.

TE 的字段值由一个以逗号分隔的传输编码值名称列表(其中每一项都允许带有可选参数,见章节 4)以及/或者关键词 "trailers" 组成。客户端 禁止 在 TE 中发送 chunked 传输编码值名称;对于 HTTP/1.1 接收端来说,chunked 总是可接受的。

TE        = #t-codings
t-codings = "trailers" / ( transfer-coding [ t-ranking ] )
t-ranking = OWS ";" OWS "q=" rank
rank      = ( "0" [ "." 0*3DIGIT ] )
            / ( "1" [ "." 0*3("0") ] )

Three examples of TE use are below.

以下是关于 TE 的三个例子。

TE: deflate
TE:
TE: trailers, deflate;q=0.5

The presence of the keyword "trailers" indicates that the client is willing to accept trailer fields in a chunked transfer coding, as defined in Section 4.1.2, on behalf of itself and any downstream clients. For requests from an intermediary, this implies that either: (a) all downstream clients are willing to accept trailer fields in the forwarded response; or, (b) the intermediary will attempt to buffer the response on behalf of downstream recipients. Note that HTTP/1.1 does not define any means to limit the size of a chunked response such that an intermediary can be assured of buffering the entire response.

TE 头字段里出现关键词 trailers 指明该客户端自身及其所有下游客户端都愿意在一个分块传输编码值里接受分块尾字段trailer fields(见章节 4.1.2)。对于来自一个中间人的请求,意味着以下两种情况:(a)该中间人的所有下游客户端都愿意接受分块尾字段,或者,(b)该中间人为了下游的接收端,将试图先缓冲整个响应消息。需要注意的是,HTTP/1.1 并没有对一个分块的响应消息chunked response的大小定义任何限制,因此,中间人并不能保证能够缓冲整个消息。

When multiple transfer codings are acceptable, the client MAY rank the codings by preference using a case-insensitive "q" parameter (similar to the qvalues used in content negotiation fields, Section 5.3.1 of [RFC7231]). The rank value is a real number in the range 0 through 1, where 0.001 is the least preferred and 1 is the most preferred; a value of 0 means "not acceptable".

当允许多个传输编码值时,客户端 可以 依据其偏好,通过使用一个不区分大小写的参数 q (类似于内容协商字段里的 qvalues,见【RFC7231】章节 5.3.1),来对这些编码值分配权重。权重值是一个 0 到 1 之间的实数,最小值是 0.001(优先级最低),最大值为 1(优先级最高),排序值为 0 代表“不接受这种传输编码”。

If the TE field-value is empty or if no TE field is present, the only acceptable transfer coding is chunked. A message with no transfer coding is always acceptable.

如果头字段 TE 不存在或者它的字段值为空,意味着唯一可接受的传输编码值是 chunked。没有应用任何传输编码值的消息总是可以接受的。
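
A server deciding which transfer codings it may apply could parse the TE field-value along these lines. The names `parse_te` and `is_acceptable` are hypothetical. Treating a coding listed without a "q" parameter as rank 1 is an assumption borrowed from how qvalues usually behave, since the text above does not state a default. Parameters other than "q" are ignored in this sketch.

```python
import re
from typing import Dict, Tuple

# rank = ( "0" [ "." 0*3DIGIT ] ) / ( "1" [ "." 0*3("0") ] )
_RANK = re.compile(r"0(\.\d{0,3})?|1(\.0{0,3})?")


def parse_te(field_value: str) -> Tuple[bool, Dict[str, float]]:
    """Return (accepts_trailers, {coding-name: rank}) for a TE field-value."""
    accepts_trailers = False
    ranks: Dict[str, float] = {}
    for element in field_value.split(","):
        element = element.strip()
        if not element:
            continue
        name, *params = [p.strip() for p in element.split(";")]
        name = name.lower()  # coding names are case-insensitive
        if name == "trailers":
            accepts_trailers = True
            continue
        rank = 1.0  # assumed default when no ranking is given
        for param in params:
            key, sep, value = param.partition("=")
            if sep and key.strip().lower() == "q":
                value = value.strip()
                if not _RANK.fullmatch(value):
                    raise ValueError(f"invalid rank: {value!r}")
                rank = float(value)
        ranks[name] = rank
    return accepts_trailers, ranks


def is_acceptable(coding: str, ranks: Dict[str, float]) -> bool:
    """chunked is always acceptable; a rank of 0 means 'not acceptable'."""
    coding = coding.lower()
    return coding == "chunked" or ranks.get(coding, 0.0) > 0
```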

Since the TE header field only applies to the immediate connection, a sender of TE MUST also send a "TE" connection option within the Connection header field (Section 6.1) in order to prevent the TE field from being forwarded by intermediaries that do not support its semantics.

因为头字段 TE 仅应用于直接连接(immediate connection),所以 TE 的发送端 必须 同时在 Connection 头字段(见章节 6.1)里发送一个 "TE" 连接选项(connection option),以防止 TE 字段被不支持其语义的中间人转发出去。

4.4. Trailer

When a message includes a message body encoded with the chunked transfer coding and the sender desires to send metadata in the form of trailer fields at the end of the message, the sender SHOULD generate a Trailer header field before the message body to indicate which fields will be present in the trailers. This allows the recipient to prepare for receipt of that metadata before it starts processing the body, which is useful if the message is being streamed and the recipient wishes to confirm an integrity check on the fly.

当一个消息包含了一个使用 chunked 编码的消息体,并且发送端希望通过在消息末尾附加分块尾字段trailer fields的方式来发送元数据metadata,那么发送端 应该 在消息体之前生成一个头字段 Trailer 来指定有哪些字段将会被出现在分块尾部中。这样使得接收端在处理消息体之前可以做好接收那些元数据的准备,如果这种消息将被流式发送,并且接收端希望在接收消息的同时对其进行完整性检验,那么这种设计将非常有用。

Trailer = 1#field-name

5. 消息路由 / Message Routing

HTTP request message routing is determined by each client based on the target resource, the client's proxy configuration, and establishment or reuse of an inbound connection. The corresponding response routing follows the same connection chain back to the client.

HTTP 请求消息路由是取决于每个客户端,基于目标资源、客户端的代理配置,以及入站连接inbound connection的创建或者复用。与之对应的响应路由跟随同样一条链路反向回到客户端。

5.1. 标识目标资源 / Identifying a Target Resource

HTTP is used in a wide variety of applications, ranging from general-purpose computers to home appliances. In some cases, communication options are hard-coded in a client's configuration. However, most HTTP clients rely on the same resource identification mechanism and configuration techniques as general-purpose Web browsers.

HTTP 使用于各种各样的应用中,从通用计算机到家用电器都有 HTTP 的身影。在某些情况下,通信选项是硬编码在客户端的配置里的,但是,大多数 HTTP 客户端依靠相同的资源识别方法以及配置技术,就像通用网页浏览器一样。

HTTP communication is initiated by a user agent for some purpose. The purpose is a combination of request semantics, which are defined in [RFC7231], and a target resource upon which to apply those semantics. A URI reference (Section 2.7) is typically used as an identifier for the "target resource", which a user agent would resolve to its absolute form in order to obtain the "target URI". The target URI excludes the reference's fragment component, if any, since fragment identifiers are reserved for client-side processing ([RFC3986], Section 3.5).

用户代理出于某种目的来初始化 HTTP 通信。该目的是由请求语义(定义在【RFC7231】),以及一个应用这些语义的目标资源target resource两者结合而成的。一个 URI 引用(URI reference,章节 2.7) 通常用于作为一个目标资源的定位符,用户代理可以将其解析resolve绝对形式absolute form来获得“目标 URI(target URI)”。如果 URI 引用里存在段落组件 fragment 的话,目标 URI 会排除掉 fragment,这是因为 fragment 标识是保留给客户端处理的(见【RFC3986】章节 3.5)。

5.2. 连接入站 / Connecting Inbound

Once the target URI is determined, a client needs to decide whether a network request is necessary to accomplish the desired semantics and, if so, where that request is to be directed.

一旦确定了目标 URI,客户端需要决定要想实现目标 URI 所代表的语义是否需要使用网络请求,如果是的话,请求会被导向到哪里。

If the client has a cache [RFC7234] and the request can be satisfied by it, then the request is usually directed there first.

如果客户端启用了缓存cache(【RFC7234】)并且满足该请求,那么该请求一般会优先导向到缓存里。

If the request is not satisfied by a cache, then a typical client will check its configuration to determine whether a proxy is to be used to satisfy the request. Proxy configuration is implementation-dependent, but is often based on URI prefix matching, selective authority matching, or both, and the proxy itself is usually identified by an "http" or "https" URI. If a proxy is applicable, the client connects inbound by establishing (or reusing) a connection to that proxy.

如果缓存没有满足该请求,那么一个典型的客户端将会检验自身的配置来决定是否需要使用代理proxy来满足该请求。代理配置是依赖于实现implementation-dependent的,但是通常基于 URI 前缀匹配URI prefix matching可选择的权威机构匹配selective authority matching,或者两者皆有。另外,代理自身通常是通过一个 "http" 或者 "https" URI 来标识的。如果代理是适用的,那么客户端通过建立(或复用)与该代理的连接来进行入站连接connect inbound

If no proxy is applicable, a typical client will invoke a handler routine, usually specific to the target URI's scheme, to connect directly to an authority for the target resource. How that is accomplished is dependent on the target URI scheme and defined by its associated specification, similar to how this specification defines origin server access for resolution of the "http" (Section 2.7.1) and "https" (Section 2.7.2) schemes.

如果没有适用的代理,那么客户端通常会执行一个处理程序例程,这个例程通常是特定于该目标 URI 的 scheme 的,来连接到一个指向到目标资源的 authority。这个程序例程如何才算是完成取决于目标 URI 的 scheme 以及定义该 scheme 的相关规范,类似于本规范如何定义对 "http"(章节 2.7.1)和 "https"(章节 2.7.2)这两种方案的解析来访问源服务器。

译注:authority 是一个 URI 的组成部分,表现为一个服务的 DNS 主机名称或者是 IP 地址。如果该服务不是使用默认端口的话,authority 还会包含具体的端口号,其中 "http" 方案的默认端口是 80,"https" 方案的默认端口是 443。而 host 就是一个服务的 DNS 主机名称或者 IP 地址,host 并不包含端口号。也就是说:当使用默认端口时,authority = host;当不使用默认端口时,authority = host + port,两者并不等同。因此,这里将 "authority" 翻译为“主机”不合适,索性就不翻译了。

HTTP requirements regarding connection management are defined in Section 6.

关于连接管理的相关 HTTP 规范要求定义在章节 6

5.3. 请求目标 / Request Target

Once an inbound connection is obtained, the client sends an HTTP request message (Section 3) with a request-target derived from the target URI. There are four distinct formats for the request-target, depending on both the method being requested and whether the request is to a proxy.

一旦获得了一个入站连接inbound connection,客户端会发送一个带有一个请求目标(request-target)的 HTTP 请求消息(章节 3)。请求目标从目标 URI 里推导得出。依据请求方法request method以及该请求是否是一个发送到代理proxy的请求来分,请求目标一共有四种不同的格式。

request-target = origin-form
               / absolute-form
               / authority-form
               / asterisk-form

5.3.1. 原始形式 / origin-form

The most common form of request-target is the origin-form.

请求目标最常见的形式是原始形式(origin-form)。

origin-form    = absolute-path [ "?" query ]

When making a request directly to an origin server, other than a CONNECT or server-wide OPTIONS request (as detailed below), a client MUST send only the absolute path and query components of the target URI as the request-target. If the target URI's path component is empty, the client MUST send "/" as the path within the origin-form of request-target. A Host header field is also sent, as defined in Section 5.4.

当生成一个直接导向到源服务器的请求时,除了一个 CONNECT 或者服务器范围内的 OPTIONS 请求(见下文)以外,客户端 必须 仅使用目标 URI 的绝对路径 absolute-path 组件以及查询字符串 query 组件作为请求目标。如果目标 URI 的 path 组件为空,客户端 必须 发送 "/" 作为请求目标的原始形式的 path。一个 Host 头字段同样会被发送,其定义见章节 5.4

译注:请求目标的原始形式由目标 URI 的绝对路径 absolute-path 组件以及查询字符串 query 组件组成。

For example, a client wishing to retrieve a representation of the resource identified as

例如,一个客户端希望从源服务器里获得这个资源的一种表示形式representation,资源对应的 URI 如下:

http://www.example.org/where?q=now

directly from the origin server would open (or reuse) a TCP connection to port 80 of the host "www.example.org" and send the lines:

客户端会打开(或者复用)一个 TCP 连接到 "www.example.org" 主机的 80 端口,并且发送以下几行

GET /where?q=now HTTP/1.1
Host: www.example.org

followed by the remainder of the request message.

以及随后的请求消息的其余部分。
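
The derivation of the origin-form and the accompanying Host field from a target URI can be sketched with Python's standard `urllib.parse`. The function name `origin_form_request` is an assumption for this example. Note that `urlsplit` cannot tell an empty query ("?" with nothing after it) from an absent one, so that edge case is not preserved here.

```python
from urllib.parse import urlsplit


def origin_form_request(target_uri: str, method: str = "GET") -> str:
    """Build the request-line and Host field for a direct request."""
    parts = urlsplit(target_uri)  # the fragment is split off and not used
    # An empty path is sent as "/".
    request_target = parts.path or "/"
    if parts.query:
        request_target += "?" + parts.query
    # Host is the authority without any userinfo and its "@" delimiter.
    host = parts.netloc.rpartition("@")[2]
    return f"{method} {request_target} HTTP/1.1\r\nHost: {host}\r\n"
```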

5.3.2. 绝对形式 / absolute-form

When making a request to a proxy, other than a CONNECT or server-wide OPTIONS request (as detailed below), a client MUST send the target URI in absolute-form as the request-target.

当生成一个发送到一个代理的请求时,除了一个 CONNECT 或者服务器范围内的 OPTIONS 请求(见下文)以外,客户端 必须 使用请求目标的绝对形式(absolute-form)。

absolute-form  = absolute-URI

The proxy is requested to either service that request from a valid cache, if possible, or make the same request on the client's behalf to either the next inbound proxy server or directly to the origin server indicated by the request-target. Requirements on such "forwarding" of messages are defined in Section 5.7.

代理被要求尽可能从一个有效的缓存中满足该请求,或者代表客户端向下一个入站代理服务器或者直接向请求目标所指定的源服务器发出同样的请求。关于消息的这种“转发”的相关要求,定义在章节 5.7 中。

An example absolute-form of request-line would be:

一个在请求行request line里使用绝对形式作为请求目标的例子如下:

GET http://www.example.org/pub/WWW/TheProject.html HTTP/1.1

To allow for transition to the absolute-form for all requests in some future version of HTTP, a server MUST accept the absolute-form in requests, even though HTTP/1.1 clients will only send them in requests to proxies.

为了允许在将来某个 HTTP 版本里将所有请求的请求目标转换为绝对形式,服务器 必须 接受请求目标是绝对形式的请求,即使 HTTP/1.1 客户端将仅向代理proxy发送这种请求。

5.3.3. 权威形式 / authority-form

The authority-form of request-target is only used for CONNECT requests (Section 4.3.6 of [RFC7231]).

请求目标的 authority-form 形式只用于 CONNECT 请求(【RFC7231】章节 4.3.6)。

authority-form = authority

When making a CONNECT request to establish a tunnel through one or more proxies, a client MUST send only the target URI's authority component (excluding any userinfo and its "@" delimiter) as the request-target. For example,

当生成一个 CONNECT 请求(用于建立一条贯穿一个或多个代理的隧道)时,客户端 必须 仅使用目标 URI 的 authority 组件(排除任何 userinfo 以及 "@" 分隔符)作为请求目标。例如:

CONNECT www.example.com:80 HTTP/1.1

5.3.4. 星号形式 / asterisk-form

The asterisk-form of request-target is only used for a server-wide OPTIONS request (Section 4.3.7 of [RFC7231]).

请求目标的 asterisk-form 形式只用于服务器范围内的 OPTIONS 请求(【RFC7231】章节 4.3.7)。

asterisk-form  = "*"

When a client wishes to request OPTIONS for the server as a whole, as opposed to a specific named resource of that server, the client MUST send only "*" (%x2A) as the request-target. For example,

当一个客户端希望获得服务器整体上的功能选项(与之相反的是该服务器的一个具体的命名资源)时,客户端 必须 仅使用 "*" (%x2A) 作为请求目标。例如:

OPTIONS * HTTP/1.1

If a proxy receives an OPTIONS request with an absolute-form of request-target in which the URI has an empty path and no query component, then the last proxy on the request chain MUST send a request-target of "*" when it forwards the request to the indicated origin server.

如果一个代理proxy接收到一个 OPTIONS 请求,该请求的请求目标为绝对形式,URI 的路径 path 为空并且没有 query 组件,那么,请求链路中的最后一个代理 必须 发送一个 "*" 作为请求目标,当它将请求转发到指定的源服务器的时候。

For example, the request

例如,请求:

OPTIONS http://www.example.org:8001 HTTP/1.1

would be forwarded by the final proxy as

会被最后一个代理服务器转发为:

OPTIONS * HTTP/1.1
Host: www.example.org:8001

after connecting to port 8001 of host "www.example.org".

在连接到 "www.example.org" 主机的 8001 端口之后。
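
As an illustration of the rewriting rule for the last proxy in the chain, the sketch below maps an absolute-form request-target to the target and Host value sent to the origin server. `final_proxy_target` is a hypothetical helper; it checks the raw URI for "?" because `urlsplit` alone cannot distinguish an empty query from a missing one.

```python
from typing import Tuple
from urllib.parse import urlsplit


def final_proxy_target(method: str, absolute_uri: str) -> Tuple[str, str]:
    """Return (request-target, Host value) to send to the origin server."""
    parts = urlsplit(absolute_uri)
    host = parts.netloc.rpartition("@")[2]  # authority without userinfo
    has_query = "?" in absolute_uri.split("#", 1)[0]
    # OPTIONS with an empty path and no query component becomes "*".
    if method == "OPTIONS" and parts.path == "" and not has_query:
        return "*", host
    target = parts.path or "/"
    if has_query:
        target += "?" + parts.query
    return target, host
```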

5.4. 主机 / Host

The "Host" header field in a request provides the host and port information from the target URI, enabling the origin server to distinguish among resources while servicing requests for multiple host names on a single IP address.

在一个请求消息中的 Host 头字段提供了来自目标 URI 的主机host以及端口port信息,当源服务器对同一个 IP 地址使用多个不同主机名称来处理众多请求时,Host 能够让源服务器可以区分该请求所对应的资源。

Host = uri-host [ ":" port ] ; Section 2.7.1

A client MUST send a Host header field in all HTTP/1.1 request messages. If the target URI includes an authority component, then a client MUST send a field-value for Host that is identical to that authority component, excluding any userinfo subcomponent and its "@" delimiter (Section 2.7.1). If the authority component is missing or undefined for the target URI, then a client MUST send a Host header field with an empty field-value.

客户端 必须 在其发送的所有 HTTP/1.1 请求消息里包含一个 Host 头字段。如果目标 URI 包含一个 authority 组件,那么客户端 必须 在其发送的请求消息里将 Host 的字段值指定为 authority 组件的内容,同时,排除掉 authority 内任何 userinfo 子组件以及它的 "@" 分隔符(章节 2.7.1)。如果目标 URI 里缺少或未定义 authority 组件,那么客户端 必须 在其发送的请求消息里包含一个字段值为空的 Host 头字段。

Since the Host field-value is critical information for handling a request, a user agent SHOULD generate Host as the first header field following the request-line.

因为对于如何处理一个请求消息来说,Host 头字段的内容是一个关键信息,所以用户代理 应该 将 Host 生成为头部的第一个字段,紧随于请求行(request-line)之后。

For example, a GET request to the origin server for http://www.example.org/pub/WWW/ would begin with:

例如,一个发送到源服务器的 GET 请求 http://www.example.org/pub/WWW/,其请求消息的开头为:

GET /pub/WWW/ HTTP/1.1
Host: www.example.org

A client MUST send a Host header field in an HTTP/1.1 request even if the request-target is in the absolute-form, since this allows the Host information to be forwarded through ancient HTTP/1.0 proxies that might not have implemented Host.

客户端 必须 在其发送的 HTTP/1.1 请求消息里包含一个 Host 头字段,即使是绝对形式(absolute-form)的请求目标,这是因为这样做使得该 Host 的信息能够穿透那些老旧的可能未实现 Host 的 HTTP/1.0 的代理而转发出去。

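Translator's note: The Host header field was introduced in HTTP/1.1, so to an HTTP/1.0 implementation it is an "unrecognized" header field. Under the HTTP/1.0 specification [RFC1945], unrecognized header fields are treated as entity header fields (Section 4.3 of [RFC1945]), and an unrecognized entity header field should be ignored by the recipient and forwarded by proxies (Section 7.1 of [RFC1945]).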

When a proxy receives a request with an absolute-form of request-target, the proxy MUST ignore the received Host header field (if any) and instead replace it with the host information of the request-target. A proxy that forwards such a request MUST generate a new Host field-value based on the received request-target rather than forward the received Host field-value.

当一个代理接收到一个带有以绝对形式表示的请求目标的请求消息时,代理 必须 忽略其接收到的 Host 头字段(如果有的话),并且将其替换为请求目标的主机信息。转发这种请求的代理 必须 基于其接收到的请求目标来生成一个新的 Host 字段值,而不是直接转发原本的 Host 字段值。

Since the Host header field acts as an application-level routing mechanism, it is a frequent target for malware seeking to poison a shared cache or redirect a request to an unintended server. An interception proxy is particularly vulnerable if it relies on the Host field-value for redirecting requests to internal servers, or for use as a cache key in a shared cache, without first verifying that the intercepted connection is targeting a valid IP address for that host.

因为 Host 头字段充当一个应用层的路由机制,它是恶意软件经常攻击的目标,例如用于毒害共享缓存,或者将请求重定向到非预期的服务器。如果一个拦截代理(interception proxy)在没有先验证被拦截的连接(intercepted connection)是否指向该主机的一个合法 IP 地址的情况下,就依赖 Host 的字段值来将请求重定向到内部服务器(internal server),或者将其用作共享缓存的缓存键(cache key),那么它会特别容易受到攻击。

A server MUST respond with a 400 (Bad Request) status code to any HTTP/1.1 request message that lacks a Host header field and to any request message that contains more than one Host header field or a Host header field with an invalid field-value.

如果一个 HTTP/1.1 请求消息缺少 Host 头字段,或者任何请求消息包含多于一个 Host 头字段,或者其 Host 头字段的字段值不合法,那么,服务器 必须 对这种请求响应一个 400 (Bad Request) 状态码。
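As a rough illustration of the client and server rules in this section, the Python sketch below derives a Host field-value from a target URI and checks a received header section. The helper names are hypothetical, and the regular expression is only an approximation of `uri-host [ ":" port ]`, not a full RFC 3986 parser.

```python
import re
from typing import List, Optional, Tuple
from urllib.parse import urlsplit

# Approximation (an assumption of this sketch): a bracketed IP literal or a
# reg-name/IPv4 string, optionally followed by ":" and decimal digits.
_HOST_RE = re.compile(
    r"(\[[0-9A-Fa-f:.]+\]|[A-Za-z0-9._~!$&'()*+,;=%-]*)(:[0-9]*)?"
)


def host_field_value(target_uri: str) -> str:
    """Client side: the authority of the target URI without userinfo and "@"."""
    return urlsplit(target_uri).netloc.rpartition("@")[2]


def host_error_status(version: str,
                      fields: List[Tuple[str, str]]) -> Optional[int]:
    """Server side: return 400 when the Host rules are violated, else None."""
    hosts = [value.strip() for name, value in fields if name.lower() == "host"]
    if len(hosts) > 1:
        return 400  # more than one Host header field, in any request
    if not hosts:
        # Only HTTP/1.1 requests are required to carry Host.
        return 400 if version == "HTTP/1.1" else None
    # An empty field-value is allowed; anything else must look like host[:port].
    return None if _HOST_RE.fullmatch(hosts[0]) else 400
```

A real server would validate the field-value with a complete URI grammar; the pattern here is deliberately loose and is meant only to show where the 400 (Bad Request) decision is made.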

5.5. 实际请求 URI / Effective Request URI

Since the request-target often contains only part of the user agent's target URI, a server reconstructs the intended target as an "effective request URI" to properly service the request. This reconstruction involves both the server's local configuration and information communicated in the request-target, Host header field, and connection context.

因为请求目标request target通常仅包含用户代理的目标 URI 的一部分(见章节 5.3),服务器需要重建reconstruct该 URI 的预定目标来正确处理请求,这种经重建的 URI 称之为 实际请求 URI(effective request URI)。重建的过程涉及到服务器的本地配置信息,以及相关联的请求目标、Host 头字段和连接的上下文connection context

For a user agent, the effective request URI is the target URI.

对于用户代理来说,实际请求 URI 就是目标 URI(target URI)

If the request-target is in absolute-form, the effective request URI is the same as the request-target. Otherwise, the effective request URI is constructed as follows:

  • If the server's configuration (or outbound gateway) provides a fixed URI scheme, that scheme is used for the effective request URI. Otherwise, if the request is received over a TLS-secured TCP connection, the effective request URI's scheme is "https"; if not, the scheme is "http".
  • If the server's configuration (or outbound gateway) provides a fixed URI authority component, that authority is used for the effective request URI. If not, then if the request-target is in authority-form, the effective request URI's authority component is the same as the request-target. If not, then if a Host header field is supplied with a non-empty field-value, the authority component is the same as the Host field-value. Otherwise, the authority component is assigned the default name configured for the server and, if the connection's incoming TCP port number differs from the default port for the effective request URI's scheme, then a colon (":") and the incoming port number (in decimal form) are appended to the authority component.
  • If the request-target is in authority-form or asterisk-form, the effective request URI's combined path and query component is empty. Otherwise, the combined path and query component is the same as the request-target.
  • The components of the effective request URI, once determined as above, can be combined into absolute-URI form by concatenating the scheme, "://", authority, and combined path and query component.

如果请求目标是绝对形式(absolute-form),那么实际请求 URI 与请求目标(request target)相同。否则,实际请求 URI 会使用以下方式来重建:

  • 如果服务器的配置信息(或者出站网关)提供了一个固定的 URI scheme,那么,这个 scheme 会用于实际请求 URI。否则,如果该请求是通过一个 TLS 安全的(TLS-secured)TCP 连接接收到的,那么实际请求 URI 的 scheme 为 "https";如果不是,scheme 为 "http"。
  • 如果服务器的配置信息(或者出站网关)提供了一个固定的 URI authority 组件,那么,这个 authority 会用于参与重建实际请求 URI。如果没有固定的 URI authority,并且如果请求目标是 authority-form 形式,那么实际请求 URI 的 authority 组件与请求目标相同,如果请求目标不是 authority-form 形式,并且如果 Host 头字段提供了一个非空的字段值,那么实际请求 URI 的 authority 组件与 Host 的字段值相同。否则,实际请求 URI 的 authority 组件会被赋值为服务器所配置的默认名称,并且如果连接的 TCP 输入端口号不是实际请求 URI 的 scheme 所对应的默认端口号,那么需要在实际请求 URI 的 authority 组件后附加一个冒号(":")以及输入端口号(十进制形式)。
  • 如果请求目标是 authority-form 或者 asterisk-form,那么实际请求 URI 的组合 path 与 query 组件为空。否则,组合 path 与 query 组件与请求目标相同。
  • 实际请求 URI 的各个组件一旦按上述步骤确定,就能够通过依次连结 scheme、"://"、authority 以及组合 path 与 query 组件,组合为 absolute-URI 形式。

Example 1: the following message received over an insecure TCP connection

例一:以下消息接收于一个不安全的 TCP 连接中:

GET /pub/WWW/TheProject.html HTTP/1.1
Host: www.example.org:8080

has an effective request URI of

它的实际请求 URI 是:

http://www.example.org:8080/pub/WWW/TheProject.html

Example 2: the following message received over a TLS-secured TCP connection

例二:以下消息接收于一个 TLS 安全的 TCP 连接中:

OPTIONS * HTTP/1.1
Host: www.example.org

has an effective request URI of

它的实际请求 URI 是:

https://www.example.org
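The reconstruction steps listed in this section can be sketched in Python as shown below. This is an illustrative reading of the algorithm under stated assumptions: the function and parameter names are hypothetical, authority-form is assumed to occur only with CONNECT, and any request-target that is not authority-form, asterisk-form, or origin-form (starting with "/") is treated as absolute-form.

```python
from typing import Optional

DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_request_uri(method: str, request_target: str,
                          host: Optional[str], *, tls: bool, port: int,
                          default_name: str,
                          fixed_scheme: Optional[str] = None,
                          fixed_authority: Optional[str] = None) -> str:
    """Rebuild the effective request URI from the request and its connection."""
    is_authority_form = method == "CONNECT"  # assumption of this sketch
    is_asterisk_form = request_target == "*"
    is_origin_form = request_target.startswith("/")
    if not (is_authority_form or is_asterisk_form or is_origin_form):
        return request_target  # absolute-form: used as is

    # Scheme: fixed configuration first, then the connection's security.
    scheme = fixed_scheme or ("https" if tls else "http")

    # Authority: fixed configuration, authority-form target, Host, default name.
    if fixed_authority:
        authority = fixed_authority
    elif is_authority_form:
        authority = request_target
    elif host:
        authority = host
    else:
        authority = default_name
        if port != DEFAULT_PORTS.get(scheme):
            authority += f":{port}"

    # Combined path and query: empty for authority-form and asterisk-form.
    if is_authority_form or is_asterisk_form:
        path_and_query = ""
    else:
        path_and_query = request_target
    return f"{scheme}://{authority}{path_and_query}"
```

With `tls=False` and the Host value `www.example.org:8080`, the request from Example 1 maps to `http://www.example.org:8080/pub/WWW/TheProject.html`; with `tls=True`, the OPTIONS request from Example 2 maps to `https://www.example.org`.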

Recipients of an HTTP/1.0 request that lacks a Host header field might need to use heuristics (e.g., examination of the URI path for something unique to a particular host) in order to guess the effective request URI's authority component.

如果一个 HTTP/1.0 请求消息缺少 Host 头字段,其接收端可能需要使用启发式方法(heuristics)(例如,检查 URI 的路径中是否有某个具体主机所独有的内容)来猜测实际请求 URI 的 authority 组件。

Once the effective request URI has been constructed, an origin server needs to decide whether or not to provide service for that URI via the connection in which the request was received. For example, the request might have been misdirected, deliberately or accidentally, such that the information within a received request-target or Host header field differs from the host or port upon which the connection has been made. If the connection is from a trusted gateway, that inconsistency might be expected; otherwise, it might indicate an attempt to bypass security filters, trick the server into delivering non-public content, or poison a cache. See Section 9 for security considerations regarding message routing.

一旦重建好了实际请求 URI,源服务器需要决定是否通过接收到该请求的连接来为这个 URI 提供服务。例如,该请求可能被有意或无意地误投(misdirected),以致接收到的请求目标或者 Host 头字段内的信息与建立该连接时所用的主机或端口不一致。如果该连接来自一个可信任的网关,那么这种不一致可能在预期之内;否则,这可能表示有人企图绕过安全过滤机制、欺骗服务器分发非公开的内容,或者毒害缓存。关于消息路由的安全注意事项见章节 9。

5.6. 将响应关联到请求 / Associating a Response to a Request

HTTP does not include a request identifier for associating a given request message with its corresponding one or more response messages. Hence, it relies on the order of response arrival to correspond exactly to the order in which requests are made on the same connection. More than one response message per request only occurs when one or more informational responses (1xx, see Section 6.2 of [RFC7231]) precede a final response to the same request.

HTTP 并不包含一个请求标记,用于关联一个给定请求消息和与之对应的一个或多个响应消息。因此,HTTP 依靠响应消息到达的顺序来一一对应在同一个连接中生成请求的顺序。出现响应消息数与请求消息数的比值大于 1 的情况仅当对该请求发送最终响应final response(任何非 1xx 状态码的响应消息)之前,对其发送了一个或多个信息性响应消息informational responses(状态码为 1xx 的响应消息,见【RFC7231】章节 6.2)。

A client that has more than one outstanding request on a connection MUST maintain a list of outstanding requests in the order sent and MUST associate each received response message on that connection to the highest ordered request that has not yet received a final (non-1xx) response.

在一个连接中,如果客户端有超过一个未完成的请求(outstanding request),客户端 必须 按发送顺序维护一个未完成请求的列表,并且 必须 将在该连接中接收到的每一个响应消息关联到列表中排在最前且尚未接收到最终响应(非 1xx 状态码)的请求。
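The bookkeeping required of a client can be sketched with a first-in, first-out queue. The class and method names below are hypothetical; the sketch only models the ordering rule and ignores everything else about a connection.

```python
from collections import deque


class ResponseMatcher:
    """Associates responses with requests purely by order on one connection."""

    def __init__(self) -> None:
        self._outstanding = deque()  # requests in the order they were sent

    def request_sent(self, request_id: str) -> None:
        self._outstanding.append(request_id)

    def response_received(self, status_code: int) -> str:
        """Return the request that this response belongs to."""
        if not self._outstanding:
            raise RuntimeError("response received with no outstanding request")
        if 100 <= status_code < 200:
            # Informational (1xx): the request is still awaiting its final response.
            return self._outstanding[0]
        # Final (non-1xx) response: the oldest outstanding request is complete.
        return self._outstanding.popleft()
```

For instance, after sending requests `a` and `b`, a 100 response and the following 200 response are both attributed to `a`, and the next final response is attributed to `b`.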

5.7. 消息转发 / Message Forwarding

As described in Section 2.3, intermediaries can serve a variety of roles in the processing of HTTP requests and responses. Some intermediaries are used to improve performance or availability. Others are used for access control or to filter content. Since an HTTP stream has characteristics similar to a pipe-and-filter architecture, there are no inherent limits to the extent an intermediary can enhance (or interfere) with either direction of the stream.

正如章节 2.3 所述,中间人能够在处理 HTTP 请求和响应中饰演多种角色。某些中间人用于提升性能或可用性,另外一些用于访问控制(access control)或者内容过滤(filter content)。因为一个 HTTP 流具有类似于管道-过滤器架构(pipe-and-filter architecture)的性质,所以中间人对流的任一方向的增强(或干扰)程度没有任何固有限制。

An intermediary not acting as a tunnel MUST implement the Connection header field, as specified in Section 6.1, and exclude fields from being forwarded that are only intended for the incoming connection.

不充当隧道tunnel的中间人 必须 实现章节 6.1 中所指定的 Connection 头字段,并且在转发消息时排除所有仅作用于传入连接incoming connection的字段。

An intermediary MUST NOT forward a message to itself unless it is protected from an infinite request loop. In general, an intermediary ought to recognize its own server names, including any aliases, local variations, or literal IP addresses, and respond to such requests directly.

中间人 禁止 转发一个消息到自身,除非它具有避免无限请求循环的保护机制。通常情况下,中间人应该能识别它自身的服务器名称,包括任何别名、本地变体或者字面 IP 地址,并直接响应这类请求。

5.7.1. Via

The "Via" header field indicates the presence of intermediate protocols and recipients between the user agent and the server (on requests) or between the origin server and the client (on responses), similar to the "Received" header field in email (Section 3.6.7 of [RFC5322]). Via can be used for tracking message forwards, avoiding request loops, and identifying the protocol capabilities of senders along the request/response chain.

Via 头字段表示在用户代理和服务器之间(对于请求消息)或者源服务器和客户端之间(对于响应消息)存在中间协议和中间接收端,类似于电子邮件中的 Received 头字段(【RFC5322】章节 3.6.7)。Via 能够用于跟踪消息的转发、避免请求循环(request loops),以及标识请求/响应链路中各个发送端的协议功能(protocol capabilities)。

Via = 1#( received-protocol RWS received-by [ RWS comment ] )

received-protocol = [ protocol-name "/" ] protocol-version
                    ; see Section 6.7
received-by       = ( uri-host [ ":" port ] ) / pseudonym
pseudonym         = token

Multiple Via field values represent each proxy or gateway that has forwarded the message. Each intermediary appends its own information about how the message was received, such that the end result is ordered according to the sequence of forwarding recipients.

Via 头字段的多个字段值分别表示曾经转发过该消息的每一个代理或者网关。每个中间人都向 Via 附加关于自身如何接收到该消息的信息,使得最终结果按照参与转发的接收端的先后顺序排列。

A proxy MUST send an appropriate Via header field, as described below, in each message that it forwards. An HTTP-to-HTTP gateway MUST send an appropriate Via header field in each inbound request message and MAY send a Via header field in forwarded response messages.

代理 必须 在其转发的所有消息里带有一个恰当的 Via 头字段,正如下面所描述的一样。一个 HTTP-to-HTTP 的网关 必须 在其发送的所有入站请求(inbound request)消息里带有一个恰当的 Via 头字段,并且 可以 在其转发的响应消息里带有一个 Via 头字段。

For each intermediary, the received-protocol indicates the protocol and protocol version used by the upstream sender of the message. Hence, the Via field value records the advertised protocol capabilities of the request/response chain such that they remain visible to downstream recipients; this can be useful for determining what backwards-incompatible features might be safe to use in response, or within a later request, as described in Section 2.6. For brevity, the protocol-name is omitted when the received protocol is HTTP.

对于每一个中间人,received-protocol 表示该消息的上游发送端所使用的协议及其版本。所以,Via 的字段值记录了请求/响应链路所声明的协议功能,使这些信息对下游接收端保持可见。它们能够用于确定在响应或者之后的请求中,哪些向后不兼容的功能可以安全地使用,正如章节 2.6 所述。为简洁起见,当所接收到的消息的协议是 HTTP 时,protocol-name 会被省略。

The received-by portion of the field value is normally the host and optional port number of a recipient server or client that subsequently forwarded the message. However, if the real host is considered to be sensitive information, a sender MAY replace it with a pseudonym. If a port is not provided, a recipient MAY interpret that as meaning it was received on the default TCP port, if any, for the received-protocol.

Via 字段值里的 received-by 部分通常是随后转发了该消息的接收端(服务器或者客户端)的主机名称和可选的端口号。但是,如果真实的主机被认为是敏感信息,发送端 可以 将其替换为一个化名(pseudonym)。如果未提供端口号,接收端 可以 将其解释为该消息是在 received-protocol 的默认 TCP 端口(如果有的话)上接收到的。

A sender MAY generate comments in the Via header field to identify the software of each recipient, analogous to the User-Agent and Server header fields. However, all comments in the Via field are optional, and a recipient MAY remove them prior to forwarding the message.

发送端 可以 在 Via 头字段里生成注释,用于标识每个接收端的软件,类似于 User-Agent 和 Server 头字段。但是,所有在 Via 头字段里的注释都是可选的,并且接收端 可以 在转发该消息之前移除它们。

For example, a request message could be sent from an HTTP/1.0 user agent to an internal proxy code-named "fred", which uses HTTP/1.1 to forward the request to a public proxy at p.example.net, which completes the request by forwarding it to the origin server at www.example.com. The request received by www.example.com would then have the following Via header field:

例如,某个 HTTP/1.0 的用户代理,发送了一个请求消息到一个代号为 "fred" 的内部代理里,该内部代理使用 HTTP/1.1 来将该请求消息转发到一个名为 p.example.net 的公共代理中,该公共代理转发该请求消息到达名为 www.example.com 的源服务器。那么,www.example.com 接收到的请求消息将会带有如下的 Via 头字段:

Via: 1.0 fred, 1.1 p.example.net

An intermediary used as a portal through a network firewall SHOULD NOT forward the names and ports of hosts within the firewall region unless it is explicitly enabled to do so. If not enabled, such an intermediary SHOULD replace each received-by host of any host behind the firewall by an appropriate pseudonym for that host.

中间人用作通往某个网络防火墙的入口的时候,不应当 转发在防火墙内部的主机的名称和端口号,除非它被明确允许这样做。如果没有允许,这种中间人 应当 使用恰当的化名来替换掉每一个在防火墙内部的且出现在 received-by 里的主机的名称。

An intermediary MAY combine an ordered subsequence of Via header field entries into a single such entry if the entries have identical received-protocol values. For example,

中间人 可以 将 Via 头字段条目的一个有序子序列合并为单个条目,如果这些条目的 received-protocol 的值相同的话。例如:

Via: 1.0 ricky, 1.1 ethel, 1.1 fred, 1.0 lucy

could be collapsed to

能够合并为:

Via: 1.0 ricky, 1.1 mertz, 1.0 lucy

A sender SHOULD NOT combine multiple entries unless they are all under the same organizational control and the hosts have already been replaced by pseudonyms. A sender MUST NOT combine entries that have different received-protocol values.

发送端 不应当 合并多个条目,除非它们都在相同的组织控制之下,并且主机已被替换为化名。发送端 禁止 合并 received-protocol 的值不相同的条目。
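How an intermediary might append its own Via entry, and use the field to spot a request it has already forwarded, can be sketched as follows. The helper names are hypothetical, comments are not generated, and the parser is naive: it splits on commas and would mis-handle a comment that itself contains a comma.

```python
from typing import Iterable, Optional


def append_via(existing: Optional[str], version: str, received_by: str,
               protocol_name: str = "HTTP") -> str:
    """Append an entry describing how this intermediary received the message."""
    # protocol-name is omitted for brevity when the received protocol is HTTP.
    if protocol_name.upper() == "HTTP":
        received_protocol = version
    else:
        received_protocol = f"{protocol_name}/{version}"
    entry = f"{received_protocol} {received_by}"
    return f"{existing}, {entry}" if existing else entry


def via_contains(via: str, own_names: Iterable[str]) -> bool:
    """Return True if any received-by value matches one of our own names."""
    names = {name.lower() for name in own_names}
    for entry in via.split(","):  # naive: ignores commas inside comments
        tokens = entry.split()
        if len(tokens) >= 2 and tokens[1].lower() in names:
            return True
    return False
```

Applied to the "fred" example above, appending `1.0 fred` and then `1.1 p.example.net` produces the field-value `1.0 fred, 1.1 p.example.net`. Matching on received-by is only a heuristic for loop detection, since two unrelated hosts could choose the same pseudonym.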

5.7.2. Transformations

Some intermediaries include features for transforming messages and their payloads. A proxy might, for example, convert between image formats in order to save cache space or to reduce the amount of traffic on a slow link. However, operational problems might occur when these transformations are applied to payloads intended for critical applications, such as medical imaging or scientific data analysis, particularly when integrity checks or digital signatures are used to ensure that the payload received is identical to the original.

某些中间人具有转换消息及其有效载荷payload的功能。代理可能出于节省缓存的存储空间或者在慢速连接中减少流量的总量的目的对消息进行转换,例如,图片格式的转换。但是,当这些转换被应用到提供给关键应用程序(例如医学图像或者科学数据分析)使用的有效载荷上的时候,特别是当使用完整性检验或者数字签名来保证所接收到的有效载荷与原始无异的时候,可能会引发业务上的问题。

An HTTP-to-HTTP proxy is called a "transforming proxy" if it is designed or configured to modify messages in a semantically meaningful way (i.e., modifications, beyond those required by normal HTTP processing, that change the message in a way that would be significant to the original sender or potentially significant to downstream recipients). For example, a transforming proxy might be acting as a shared annotation server (modifying responses to include references to a local annotation database), a malware filter, a format transcoder, or a privacy filter. Such transformations are presumed to be desired by whichever client (or client organization) selected the proxy.

如果一个 HTTP-to-HTTP 代理被设计或配置为以一种语义上有意义的方式来修改消息(也就是说,超出正常 HTTP 处理所需的修改,并且这种修改对原始发送端有意义,或者对下游接收端可能有意义),则称之为“转换代理(transforming proxy)”。例如,转换代理可以充当一个共享注释服务器(修改响应消息,让其包含对一个本地注释数据库的引用)、一个恶意软件过滤器、一个格式转码器,或者一个隐私过滤器。这类转换被假定为是选择了该代理的客户端(或者客户端所属的组织)所期望的。

If a proxy receives a request-target with a host name that is not a fully qualified domain name, it MAY add its own domain to the host name it received when forwarding the request. A proxy MUST NOT change the host name if the request-target contains a fully qualified domain name.

如果代理接收到的 request-target 中的主机名称不是一个完全限定域名(Fully Qualified Domain Name),它 可以 在转发该请求消息的时候将自己的域名添加到所接收到的主机名称上。如果该 request-target 包含了一个完全限定域名,那么代理 禁止 改变主机名称。

A proxy MUST NOT modify the "absolute-path" and "query" parts of the received request-target when forwarding it to the next inbound server, except as noted above to replace an empty path with "/" or "*".

当代理把接收到的 request-target 转发到其后的入站服务器inbound server的时候,除了上述提及到的使用 "/" 或者 "*" 来替换掉一个空的 path 以外(见章节 5.3.1 以及章节 5.3.4),代理 禁止 修改该 request-target 中的 absolute-path 以及 query 部分。

A proxy MAY modify the message body through application or removal of a transfer coding (Section 4).

代理 可以 通过应用或移除一个传输编码值transfer coding章节 4)来修改消息体。

A proxy MUST NOT transform the payload (Section 3.3 of [RFC7231]) of a message that contains a no-transform cache-control directive (Section 5.2 of [RFC7234]).

代理 禁止 对包含有 no-transform 缓存控制指令(cache-control directive,【RFC7234】章节 5.2)的消息中的有效载荷(payload,【RFC7231】章节 3.3)进行转换。

A proxy MAY transform the payload of a message that does not contain a no-transform cache-control directive. A proxy that transforms a payload MUST add a Warning header field with the warn-code of 214 ("Transformation Applied") if one is not already in the message (see Section 5.5 of [RFC7234]). A proxy that transforms the payload of a 200 (OK) response can further inform downstream recipients that a transformation has been applied by changing the response status code to 203 (Non-Authoritative Information) (Section 6.3.4 of [RFC7231]).

代理 可以 对不包含 no-transform 缓存控制指令的消息中的有效载荷进行转换。代理对有效载荷进行转换的时候,如果该消息尚未包含一个警告码为 214 ("Transformation Applied") 的 Warning 头字段,代理 必须 添加一个带有该警告码的 Warning 头字段(见【RFC7234】章节 5.5)。代理在转换一个状态码为 200 (OK) 的响应消息中的有效载荷的时候,可以通过将响应状态码改为 203 (Non-Authoritative Information) 来进一步通知下游接收端该消息已经被转换过(【RFC7231】章节 6.3.4)。
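The payload rules for a transforming proxy can be sketched as below. The function names are hypothetical, `-` is used as a placeholder warn-agent pseudonym, and the Cache-Control and Warning parsing is deliberately naive (it splits on commas and checks only the start of each Warning value).

```python
from typing import List, Tuple

Fields = List[Tuple[str, str]]
WARNING_214 = '214 - "Transformation Applied"'  # "-" is a placeholder warn-agent


def has_no_transform(fields: Fields) -> bool:
    """Return True if a Cache-Control field carries the no-transform directive."""
    for name, value in fields:
        if name.lower() == "cache-control":
            # Naive split: ignores commas inside quoted directive arguments.
            directives = [d.strip().lower() for d in value.split(",")]
            if "no-transform" in directives:
                return True
    return False


def mark_transformed(status: int, fields: Fields,
                     use_203: bool = False) -> Tuple[int, Fields]:
    """Record that the payload was transformed; refuse if no-transform is set."""
    if has_no_transform(fields):
        raise ValueError("no-transform is present: payload must not be transformed")
    out = list(fields)
    already_warned = any(
        name.lower() == "warning" and value.lstrip().startswith("214 ")
        for name, value in out
    )
    if not already_warned:
        out.append(("Warning", WARNING_214))
    if use_203 and status == 200:
        status = 203  # optional extra signal to downstream recipients
    return status, out
```

The `use_203` flag is off by default because changing 200 to 203 is described as something a proxy can do, not something it is required to do.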

A proxy SHOULD NOT modify header fields that provide information about the endpoints of the communication chain, the resource state, or the selected representation (other than the payload) unless the field's definition specifically allows such modification or the modification is deemed necessary for privacy or security.

代理 不应当 修改提供有关通信链路端点、资源状态,或者已选表示形式selected representation(除了有效载荷)的信息的头字段,除非该字段的定义明确允许这种修改或者该修改被认为对于保护隐私或安全性是必要的。

6. 连接管理 / Connection Management

HTTP messaging is independent of the underlying transport- or session-layer connection protocol(s). HTTP only presumes a reliable transport with in-order delivery of requests and the corresponding in-order delivery of responses. The mapping of HTTP request and response structures onto the data units of an underlying transport protocol is outside the scope of this specification.

HTTP 的消息交换是独立于底层传输层和会话层相关的连接协议。HTTP 仅假定有一个可靠的传输对请求进行按次序发送出去,以及与之对应的响应被按次序发送回来。至于如何将 HTTP 的请求和响应的结构映射到底层传输协议的数据单元上,并不在本规范探讨的范围之内。

As described in Section 5.2, the specific connection protocols to be used for an HTTP interaction are determined by client configuration and the target URI. For example, the "http" URI scheme (Section 2.7.1) indicates a default connection of TCP over IP, with a default TCP port of 80, but the client might be configured to use a proxy via some other connection, port, or protocol.

正如章节 5.2 描述的那样,具体使用哪一种连接协议与 HTTP 交互是取决于客户端的配置以及目标 URI 的。例如,"http" URI 方案(章节 2.7.1)表示一个在 IP 协议之上的 TCP 默认连接,使用默认的 80 TCP 端口,但是,客户端可能被配置为使用代理来途经其他连接、端口以及协议。

HTTP implementations are expected to engage in connection management, which includes maintaining the state of current connections, establishing a new connection or reusing an existing connection, processing messages received on a connection, detecting connection failures, and closing each connection. Most clients maintain multiple connections in parallel, including more than one connection per server endpoint. Most servers are designed to maintain thousands of concurrent connections, while controlling request queues to enable fair use and detect denial-of-service attacks.

6.1. 连接 / Connection

The "Connection" header field allows the sender to indicate desired control options for the current connection. In order to avoid confusing downstream recipients, a proxy or gateway MUST remove or replace any received connection options before forwarding the message.

Connection 头字段允许发送端去指定希望如何控制当前连接的选项。为了避免下游接收端的困惑,代理或者网关 必须 在转发消息的时候移除或替换出现在该消息中的任何连接选项connection options

Translator's note: "connection options" is a term of art; it refers specifically to the contents of the Connection header field's value.

When a header field aside from Connection is used to supply control information for or about the current connection, the sender MUST list the corresponding field-name within the Connection header field. A proxy or gateway MUST parse a received Connection header field before a message is forwarded and, for each connection-option in this field, remove any header field(s) from the message with the same name as the connection-option, and then remove the Connection header field itself (or replace it with the intermediary's own connection options for the forwarded message).

除了 Connection 头字段以外,当某个头字段被用于提供针对(或者关于)当前连接的控制信息时,发送端 必须 在 Connection 头字段里列出该头字段的名称。代理或者网关在转发消息之前,必须 解析所接收到的 Connection 头字段,并且对于其中的每一个连接选项(connection-option),从消息中移除与之同名的头字段,然后再移除 Connection 头字段自身(或者将其替换为中间人自己用于所转发消息的连接选项)。

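Translator's note: In the first sentence of this paragraph, "When a header field … is used to supply control information for or about the current connection, the sender MUST …", the phrase "for or about" expands to "for the current connection or about the current connection".

My reading of "supply control information for the current connection" is that some control has been applied to, or set on, the current connection, and that control information is then supplied. This is similar to applying a coding to the message body and then having to list that coding in the Transfer-Encoding header field.

Under that reading, "supply control information about the current connection" is easy to understand: no new control is applied to the current connection; the field merely lists the control information that the current connection already has.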

Hence, the Connection header field provides a declarative way of distinguishing header fields that are only intended for the immediate recipient ("hop-by-hop") from those fields that are intended for all recipients on the chain ("end-to-end"), enabling the message to be self-descriptive and allowing future connection-specific extensions to be deployed without fear that they will be blindly forwarded by older intermediaries.

因此,Connection 头字段提供了一种声明式的方式来区分哪些头字段是打算只作用于当前发送端的直接接收端的(“逐跳hop-by-hop”),哪些头字段是打算作用于链路中的所有接收端(“端到端end-to-end”)。这样,使消息能够自描述,同时,避免将来新增的连接专用的扩展connection-specific extensions被旧的(即不支持该扩展的)中间人盲转发blindly forward

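Translator's note: Header fields named in Connection apply only to the immediate recipient. For an explanation of "hop-by-hop" and "end-to-end", see the related translator's note in Section 2.1.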

The Connection header field's value has the following grammar:

Connection 头字段的值的语法如下:

Connection        = 1#connection-option
connection-option = token

Connection options are case-insensitive.

其中,连接选项 connection-option 是不区分大小写的。
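The processing that a proxy or gateway applies before forwarding can be sketched as a small filter over the header fields. The function name is hypothetical, and header fields are modelled as a list of (name, value) pairs.

```python
from typing import List, Sequence, Tuple

Fields = List[Tuple[str, str]]


def strip_connection_fields(fields: Fields,
                            own_options: Sequence[str] = ()) -> Fields:
    """Remove hop-by-hop fields named in Connection, then Connection itself."""
    options = set()
    for name, value in fields:
        if name.lower() == "connection":
            # Connection options are case-insensitive tokens in a comma list.
            options.update(
                token.strip().lower()
                for token in value.split(",")
                if token.strip()
            )
    forwarded = [
        (name, value)
        for name, value in fields
        if name.lower() != "connection" and name.lower() not in options
    ]
    if own_options:
        # Replace with the intermediary's own options for the next hop.
        forwarded.append(("Connection", ", ".join(own_options)))
    return forwarded
```

Given a message with `Connection: Keep-Alive, X-Hop`, both the `Keep-Alive` and `X-Hop` fields are dropped along with `Connection`, while end-to-end fields such as `Host` pass through unchanged. `X-Hop` is an invented field name used only for illustration.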

A sender MUST NOT send a connection option corresponding to a header field that is intended for all recipients of the payload. For example, Cache-Control is never appropriate as a connection option (Section 5.2 of [RFC7234]).

发送端 禁止 发送与某个意在作用于有效载荷的所有接收端的头字段相对应的连接选项。例如,Cache-Control 永远不适合作为一个连接选项(【RFC7234】章节 5.2)。

The connection options do not always correspond to a header field present in the message, since a connection-specific header field might not be needed if there are no parameters associated with a connection option. In contrast, a connection-specific header field that is received without a corresponding connection option usually indicates that the field has been improperly forwarded by an intermediary and ought to be ignored by the recipient.

连接选项并不总是对应到某个出现在消息中的头字段的,这是因为如果这种头字段并没有字段值(即关联到某个连接选项的参数),那么它就不必出现在消息头里了。与之相对的是,如果消息中带有某个连接专用的头字段,但并不存在一个与之对应的连接选项(即 Connection 里并没有包含与这个头字段名称一致的连接选项),出现这种情况通常表明该头字段是被某个中间人错误地转发过来的,并且应该被接收端所忽略掉。

When defining new connection options, specification authors ought to survey existing header field names and ensure that the new connection option does not share the same name as an already deployed header field. Defining a new connection option essentially reserves that potential field-name for carrying additional information related to the connection option, since it would be unwise for senders to use that field-name for anything else.

当定义新的连接选项时,规范的作者们应该审视已有的头字段名称,确保新的连接选项的名称与目前已部署的头字段名称不相冲突。定义一个新的连接选项基本上会一并将同名头字段名作为保留字,用来搭载与该连接选项相关的额外信息,所以,如果发送端将这种头字段用作他用是很不明智的。

The "close" connection option is defined for a sender to signal that this connection will be closed after completion of the response. For example,

连接选项 "close" 用于发送端向接收端发出信号——当前连接将会在完成响应后被关闭。例如:

Connection: close

in either the request or the response header fields indicates that the sender is going to close the connection after the current request/response is complete (Section 6.6).

如果连接选项 "close" 出现在请求消息或者响应消息中,表明发送端将会在完成目前的请求/响应后关闭该连接(章节 6.6)。

A client that does not support persistent connections MUST send the "close" connection option in every request message.

不支持持久连接(Persistent)的客户端 必须 在其发送的每个请求消息中包含有 "close" 连接选项。

A server that does not support persistent connections MUST send the "close" connection option in every response message that does not have a 1xx (Informational) status code.

不支持持久连接的服务器 必须 在其发送的每个状态码不是 1xx (Informational) 的响应消息中包含 "close" 连接选项。

6.2. 建立 / Establishment

It is beyond the scope of this specification to describe how connections are established via various transport- or session-layer protocols. Each connection applies to only one transport link.

关于连接如何经由传输层或会话层协议被建立,并不在本规范探讨的范围之内。每个连接仅适用于一个传输链路。

6.3. 持久 / Persistence

HTTP/1.1 defaults to the use of "persistent connections", allowing multiple requests and responses to be carried over a single connection. The "close" connection option is used to signal that a connection will not persist after the current request/response. HTTP implementations SHOULD support persistent connections.

HTTP/1.1 默认使用“持久连接”(persistent connections),允许在单个连接中搭载多个请求和响应。"close" 连接选项用于表明连接在当前请求/响应之后将不再维持。HTTP 的实现(implementations)应当 支持持久连接。

Translator's note: In Chinese, a persistent connection (持久连接) is also commonly called a "long connection" (长连接).

A recipient determines whether a connection is persistent or not based on the most recently received message's protocol version and Connection header field (if any):

  • If the "close" connection option is present, the connection will not persist after the current response; else,
  • If the received protocol is HTTP/1.1 (or later), the connection will persist after the current response; else,
  • If the received protocol is HTTP/1.0, the "keep-alive" connection option is present, the recipient is not a proxy, and the recipient wishes to honor the HTTP/1.0 "keep-alive" mechanism, the connection will persist after the current response; otherwise,
  • The connection will close after the current response.

接收端基于最近接收到的消息的协议版本以及 Connection 头字段(如果有的话)来确定一个连接是否是持久连接:

  • if 接收到的消息里出现 "close" 连接选项,then 连接将会在当前响应之后不再维持;
  • else if 接收到的协议版本是 HTTP/1.1(或者更新),then 连接将会在当前响应之后继续维持;
  • else if 接收到的协议版本是 HTTP/1.0,并且出现 "keep-alive" 连接选项,并且该接收端不是一个代理,并且该接收端希望遵循 HTTP/1.0 的 "keep-alive" 机制,then 连接将会在当前响应之后继续维持;
  • else 连接将会在当前响应之后关闭。
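The decision list above translates directly into a short function. The names below are hypothetical; the protocol version is passed as a `(major, minor)` tuple and the connection options as already split tokens.

```python
from typing import Iterable, Tuple


def connection_persists(version: Tuple[int, int],
                        connection_options: Iterable[str], *,
                        is_proxy: bool = False,
                        honor_keep_alive: bool = True) -> bool:
    """Decide persistence from the most recently received message."""
    options = {option.strip().lower() for option in connection_options}
    if "close" in options:
        return False
    if version >= (1, 1):
        return True
    if (version == (1, 0) and "keep-alive" in options
            and not is_proxy and honor_keep_alive):
        return True
    return False
```

The order of the checks matters: "close" is tested first, so it overrides the HTTP/1.1 default, and the HTTP/1.0 "keep-alive" branch is reached only for recipients that are not proxies.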

Translator's note: Open question: when describing a connection, what is the difference between "not persist" and "close"?

A client MAY send additional requests on a persistent connection until it sends or receives a "close" connection option or receives an HTTP/1.0 response without a "keep-alive" connection option.

客户端 可以 在一个持久连接中发送额外的请求消息,直到它发送或接收到一个带有 "close" 连接选项的消息,或者接收到一个没有 "keep-alive" 连接选项的 HTTP/1.0 响应消息为止。

In order to remain persistent, all messages on a connection need to have a self-defined message length (i.e., one not defined by closure of the connection), as described in Section 3.3. A server MUST read the entire request message body or close the connection after sending its response, since otherwise the remaining data on a persistent connection would be misinterpreted as the next request. Likewise, a client MUST read the entire response message body if it intends to reuse the same connection for a subsequent request.

为了保持持久,在一个连接中的所有消息都需要带有一个自定义(self-defined)的消息长度(也就是说,不是由连接的关闭来定义的长度),如章节 3.3 所述。服务器 必须 读取整个请求消息体,或者在发送其响应消息之后关闭该连接,因为如果不这样做,持久连接中剩余的数据会被误解为下一个请求消息。同样,如果客户端打算在接下来的请求中复用同一个连接,那么客户端 必须 读取整个响应消息体。

A proxy server MUST NOT maintain a persistent connection with an HTTP/1.0 client (see Section 19.7.1 of [RFC2068] for information and discussion of the problems with the Keep-Alive header field implemented by many HTTP/1.0 clients).

代理服务器 禁止 与一个 HTTP/1.0 客户端维持一个持久连接(关于许多 HTTP/1.0 客户端所实现的 Keep-Alive 头字段的问题的信息和讨论,见【RFC2068】章节 19.7.1)。

See Appendix A.1.2 for more information on backwards compatibility with HTTP/1.0 clients.

关于向后兼容 HTTP/1.0 客户端的更多信息,见附录 A.1.2

6.3.1. 请求重试 / Retrying Requests

Connections can be closed at any time, with or without intention. Implementations ought to anticipate the need to recover from asynchronous close events.

连接能够在任何时候有意或无意地被关闭。HTTP 实现(implementations)应当预料到需要从异步关闭事件中恢复。

When an inbound connection is closed prematurely, a client MAY open a new connection and automatically retransmit an aborted sequence of requests if all of those requests have idempotent methods (Section 4.2.2 of [RFC7231]). A proxy MUST NOT automatically retry non-idempotent requests.

当一个入站连接被过早地关闭,客户端 可以 开启一个新的连接并自动重传被中止的请求消息序列,如果所有这些请求消息的请求方法都是幂等(【RFC7231】章节 4.2.2)的话。代理 禁止 自动重试非幂等的请求。

A user agent MUST NOT automatically retry a request with a non-idempotent method unless it has some means to know that the request semantics are actually idempotent, regardless of the method, or some means to detect that the original request was never applied. For example, a user agent that knows (through design or configuration) that a POST request to a given resource is safe can repeat that request automatically. Likewise, a user agent designed specifically to operate on a version control repository might be able to recover from partial failure conditions by checking the target resource revision(s) after a failed connection, reverting or fixing any changes that were partially applied, and then automatically retrying the requests that failed.

用户代理 禁止 自动重试一个带有非幂等请求方法(non-idempotent method)的请求,除非它有某种方式得知,不管请求方法是什么,该请求的语义实际上是幂等的,或者有某种方式检测到原来的请求从未被应用。例如,如果某个用户代理(通过设计或配置)得知对某个给定资源的 POST 请求是安全的,那么它可以自动重复该请求。同样,某个专门设计为对版本控制仓库进行操作的用户代理,也许能够在一次连接失败后,通过检查目标资源的修订版本、撤回或修复被部分应用的更改,然后自动重试失败的请求,从而从部分失败的情况中恢复过来。

A client SHOULD NOT automatically retry a failed automatic retry.

对于一个已经自动重试失败的请求,客户端 不应当 再次自动重试。
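A client's retry decision can be sketched as a single predicate. The function and parameter names are hypothetical; the method set is the list of idempotent methods from RFC 7231 (PUT, DELETE, and the safe methods), and `known_idempotent` stands for whatever out-of-band knowledge a user agent might have about a particular request.

```python
from typing import Iterable

# Idempotent methods per RFC 7231: PUT, DELETE, and the safe methods.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def may_auto_retry(methods: Iterable[str], *,
                   is_retry_of_retry: bool = False,
                   known_idempotent: bool = False) -> bool:
    """Decide whether an aborted sequence of requests may be retried automatically."""
    if is_retry_of_retry:
        # A failed automatic retry should not itself be retried automatically.
        return False
    if known_idempotent:
        # The caller asserts the request semantics are idempotent regardless
        # of method; a proxy would never set this flag.
        return True
    # Method names are case-sensitive, so they are compared exactly.
    return all(method in IDEMPOTENT_METHODS for method in methods)
```

A proxy would call this without `known_idempotent`, since it must not automatically retry non-idempotent requests; only a user agent with specific knowledge of the target resource has grounds to set that flag.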

6.3.2. 流水线处理 / Pipelining

A client that supports persistent connections MAY "pipeline" its requests (i.e., send multiple requests without waiting for each response). A server MAY process a sequence of pipelined requests in parallel if they all have safe methods (Section 4.2.1 of [RFC7231]), but it MUST send the corresponding responses in the same order that the requests were received.

支持持久连接的客户端 可以 流水线式处理它的请求(也就是说,在不需要等待响应完成的情况下发送多个请求)。服务器 可以 并行处理一系列的流水线化的请求,如果它们所有都带有安全的请求方法(【RFC7231】章节 4.2.1)的话。但服务器 必须 以其接收到的请求消息的顺序发送对应的响应消息。

译注: Pipeline,译为“流水线”,国内大多数译为“管道”,但我认为“流水线”更容易理解,也更符合意境。试想一下,

pipeline 的优势是并行处理,可将一个大任务按一定顺序细分为一个个串联起来的小任务,pipeline 的并行处理实际上就是每个小任务不关心其他任务的完成进度,只需要不断完成当前任务就可以了。而所有小任务完成了,这个大任务就算完成了。这不就是“流水线作业”吗?

而“管道”就是一条管子,你可以任意连接,而且由于“管道壁”的存在,防止了管子内的流体被外界污染,体现了管道的密封性和隔离性,即可看为管道里是一个黑盒子,你不必管里面是如何运作的,但外界也不容易影响管道的内部。例如水管的主要作用是输送,其次是隔绝外界污染,同时可以灵活接驳。但“管道”并没有体现并行的意思。

有些 pipeline 还可以任意组装、增减操作步骤,“流水线”与“管道”都有这个意思。

A client that pipelines requests SHOULD retry unanswered requests if the connection closes before it receives all of the corresponding responses. When retrying pipelined requests after a failed connection (a connection not explicitly closed by the server in its last complete response), a client MUST NOT pipeline immediately after connection establishment, since the first remaining request in the prior pipeline might have caused an error response that can be lost again if multiple requests are sent on a prematurely closed connection (see the TCP reset problem described in Section 6.6).

客户端流水线处理请求的时候,如果连接在客户端接收到所有对应的响应消息之前被关闭了,客户端 应当 重试未得到应答的请求。当在连接失败(即服务器并未在其最后一个完整响应中显式关闭该连接)后重试流水线化的请求时,客户端 禁止 在连接建立之后立即进行流水线处理,这是因为之前流水线中剩余的第一个请求可能已经引发了一个错误响应,如果在一个被过早关闭的连接上发送多个请求,该错误响应可能会再一次丢失(见章节 6.6 所描述的 TCP 重置问题)。

Idempotent methods (Section 4.2.2 of [RFC7231]) are significant to pipelining because they can be automatically retried after a connection failure. A user agent SHOULD NOT pipeline requests after a non-idempotent method, until the final response status code for that method has been received, unless the user agent has a means to detect and recover from partial failure conditions involving the pipelined sequence.

幂等请求方法(【RFC7231】章节 4.2.2)对于流水线处理是很有意义的,这是因为在连接失败后它们能够被自动重试。用户代理 不应当 在一个非幂等请求方法之后继续流水线发送请求,直到接收到该请求方法所对应的最终响应状态码(final response status code),除非该用户代理有某种方法去检测涉及该流水线序列的部分失败情况并从中恢复。

译注:最终响应的状态码就是除了过渡性响应状态以外的状态码。过渡性状态码(interim response status code),即所有 1xx 的信息性状态码,见【RFC7231】章节 6.2。其对应的响应分别叫最终响应(final response)和过渡性响应(interim response)。
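
One way a user agent could honor the rule about non-idempotent methods is to cut its request queue into batches, each ending right after a non-idempotent request. The sketch below assumes that approach; `pipeline_batches` is a hypothetical helper, and the caller is assumed to wait for every final response of one batch before sending the next.

```python
# Idempotent methods per Section 4.2.2 of RFC 7231.
IDEMPOTENT_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE", "PUT", "DELETE"})


def pipeline_batches(methods):
    """Split a request sequence into batches that can each be pipelined.

    A batch ends right after a non-idempotent method, so nothing is sent
    behind that request until its final response has been received.
    """
    batches, current = [], []
    for method in methods:
        current.append(method)
        if method not in IDEMPOTENT_METHODS:
            batches.append(current)
            current = []
    if current:
        batches.append(current)
    return batches
```

For the sequence GET, GET, POST, GET, PUT this yields two batches: the first three requests, then the remaining two.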

An intermediary that receives pipelined requests MAY pipeline those requests when forwarding them inbound, since it can rely on the outbound user agent(s) to determine what requests can be safely pipelined. If the inbound connection fails before receiving a response, the pipelining intermediary MAY attempt to retry a sequence of requests that have yet to receive a response if the requests all have idempotent methods; otherwise, the pipelining intermediary SHOULD forward any received responses and then close the corresponding outbound connection(s) so that the outbound user agent(s) can recover accordingly.

接收流水线化的请求消息的中间人,当转发这些请求到站内inbound的时候 可以 流水线处理它们,这是因为中间人能够依靠站外outbound的用户代理来确定哪些请求消息能够被放心地使用流水线处理。如果在接收某个响应之前,入站连接失败,正在进行流水线处理的中间人 可以 试图去重试一系列有待接收响应的请求,如果这些请求都带有幂等请求方法的话;否则,该正在进行流水线处理的中间人 应当 转发任何接收到的响应,然后关闭对应的出站连接以便那些站外用户代理能够相应地恢复。

6.4. 并发 / Concurrency

A client ought to limit the number of simultaneous open connections that it maintains to a given server.

客户端应该限制与某个给定服务器同时维持打开的连接的数量。

Previous revisions of HTTP gave a specific number of connections as a ceiling, but this was found to be impractical for many applications. As a result, this specification does not mandate a particular maximum number of connections but, instead, encourages clients to be conservative when opening multiple connections.

HTTP 早前的修订版给出了一个连接数上限的具体数值,但最终发现对于许多应用来说,这是不切实际的。所以,本规范并不指定一个详细的连接数最大值,但是,取而代之的是,鼓励客户端在开启多连接的时候尽可能的保守。

Multiple connections are typically used to avoid the "head-of-line blocking" problem, wherein a request that takes significant server-side processing and/or has a large payload blocks subsequent requests on the same connection. However, each connection consumes server resources. Furthermore, using multiple connections can cause undesirable side effects in congested networks.

多连接通常用于避免队头堵塞("head-of-line blocking")的问题,即某个需要大量服务端处理时间和/或带有巨大有效载荷的请求,堵塞了同一连接中随后的其他请求。但是,每个连接都消耗服务端资源,而且,在拥堵的网络中使用多连接可能会引发不良副作用。

Note that a server might reject traffic that it deems abusive or characteristic of a denial-of-service attack, such as an excessive number of open connections from a single client.

需要注意的是,服务器可能会拒绝它认为属于滥用或具有拒绝服务攻击(denial-of-service attack)特征的流量,例如来自单个客户端的过多的已打开连接。
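
A client can keep itself conservative with a simple per-server counter. The following is a minimal sketch: the class name, its methods, and the default cap of 6 are assumptions chosen for illustration, since this specification deliberately does not mandate a number.

```python
import threading
from collections import defaultdict


class ConnectionLimiter:
    """Caps the number of simultaneously open connections per server."""

    def __init__(self, max_per_host=6):
        # 6 is an arbitrary illustrative default, not a value from the
        # specification.
        self._max = max_per_host
        self._lock = threading.Lock()
        self._open = defaultdict(int)

    def try_acquire(self, host):
        """Reserve a connection slot for host; return False if at the cap."""
        with self._lock:
            if self._open[host] >= self._max:
                return False
            self._open[host] += 1
            return True

    def release(self, host):
        """Give back a slot after the connection to host has been closed."""
        with self._lock:
            if self._open[host] > 0:
                self._open[host] -= 1
```

A caller that gets `False` from `try_acquire` would queue the request on an existing connection instead of opening another one.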

6.5. 失败和超时 / Failures and Timeouts

Servers will usually have some timeout value beyond which they will no longer maintain an inactive connection. Proxy servers might make this a higher value since it is likely that the client will be making more connections through the same proxy server. The use of persistent connections places no requirements on the length (or existence) of this timeout for either the client or the server.

服务器通常会带有某个超时值,超过该值之后,服务器将不再维持一个闲置的(inactive)连接。代理服务器可能会将该值设置得更高,这是因为客户端很有可能会通过同一个代理服务器创建更多的连接。持久连接的使用并不对客户端或服务器的超时时长(或者超时机制的存在)作任何要求。

A client or server that wishes to time out SHOULD issue a graceful close on the connection. Implementations SHOULD constantly monitor open connections for a received closure signal and respond to it as appropriate, since prompt closure of both sides of a connection enables allocated system resources to be reclaimed.

希望超时的客户端或服务器 应当 对连接发起一次优雅的关闭。所有 HTTP 实现(implementations) 应当 持续监视已打开的连接是否收到关闭信号,并且恰当地响应该信号,这是因为连接两端及时关闭可以使已分配的系统资源得到回收。

A client, server, or proxy MAY close the transport connection at any time. For example, a client might have started to send a new request at the same time that the server has decided to close the "idle" connection. From the server's point of view, the connection is being closed while it was idle, but from the client's point of view, a request is in progress.

客户端、服务器或者代理 可以 在任何时候关闭传输连接。例如,客户端可能刚开始去发送一个新的请求,但同时,服务器却决定关闭这个在此之前一直闲置的连接。站在服务器的立场来看,该连接将会被关闭是因为它一直闲置,但站在客户端的立场来看,连接正在传输某个请求。

A server SHOULD sustain persistent connections, when possible, and allow the underlying transport's flow-control mechanisms to resolve temporary overloads, rather than terminate connections with the expectation that clients will retry. The latter technique can exacerbate network congestion.

可以的话,服务器 应当 维护持久连接,并且允许底层传输的流量控制机制去解决临时过载的问题,而不是异常中断连接让客户端去重试。后者的方式可能会加剧网络的拥堵。

A client sending a message body SHOULD monitor the network connection for an error response while it is transmitting the request. If the client sees a response that indicates the server does not wish to receive the message body and is closing the connection, the client SHOULD immediately cease transmitting the body and close its side of the connection.

正在发送消息体的客户端 应当 在传输请求的同时监视网络连接中是否出现错误响应(error response)。如果该客户端观察到一个响应消息,表明服务器并不希望接收该消息体并且正在关闭这个连接,那么,客户端 应当 立即停止传输该消息体,并关闭客户端这一端的连接。
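
Monitoring the connection while sending a body can be sketched with `select`. The function below is a simplification under stated assumptions: its name and the `poll_timeout` parameter are hypothetical, and it stops as soon as any bytes (or end-of-stream) arrive, whereas a real client would parse the response and stop only if it is an error response announcing that the connection is closing.

```python
import select
import socket


def send_body_watching_for_response(sock, chunks, poll_timeout=0.0):
    """Send body chunks, stopping early if the peer has already replied.

    Returns (chunks_sent, early_response_seen).
    """
    sent = 0
    for chunk in chunks:
        # Poll for readability before each chunk. Readable means response
        # bytes are waiting or the peer has closed its side.
        readable, _, _ = select.select([sock], [], [], poll_timeout)
        if readable:
            return sent, True
        sock.sendall(chunk)
        sent += 1
    return sent, False
```

When the function reports an early response, the caller would read it and, if it is an error response with a "close" connection option, stop transmitting and close its side of the connection.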

6.6. 销毁 / Tear-down

The Connection header field (Section 6.1) provides a "close" connection option that a sender SHOULD send when it wishes to close the connection after the current request/response pair.

Connection 头字段(章节 6.1)提供了一个 "close" 连接选项,如果发送端希望在当前的请求/响应对完成之后关闭连接,发送端 应当 发送该选项。

A client that sends a "close" connection option MUST NOT send further requests on that connection (after the one containing "close") and MUST close the connection after reading the final response message corresponding to this request.

客户端发送了一个带有 "close" 连接选项的请求消息之后,禁止 在该(已标记为关闭的)连接上进一步发送请求消息,并且 必须 在读取完该请求消息所对应的最后一个响应消息之后关闭这个连接。

A server that receives a "close" connection option MUST initiate a close of the connection (see below) after it sends the final response to the request that contained "close". The server SHOULD send a "close" connection option in its final response on that connection. The server MUST NOT process any further requests received on that connection.

服务器接收到一个带有 "close" 连接选项的请求消息,在其发送完该请求消息所对应的最后一个响应消息之后,必须 进行关闭连接的流程(见下文)。服务器 应当 在该连接中的最后一个响应消息中带有一个 "close" 连接选项。其后,服务器 禁止 在该连接中处理任何进一步的请求消息。

A server that sends a "close" connection option MUST initiate a close of the connection (see below) after it sends the response containing "close". The server MUST NOT process any further requests received on that connection.

服务器发送了一个带有 "close" 连接选项的响应消息之后,必须 进行关闭连接的流程(见下文)。其后,服务器 禁止 在该连接中处理任何进一步的请求消息。

A client that receives a "close" connection option MUST cease sending requests on that connection and close the connection after reading the response message containing the "close"; if additional pipelined requests had been sent on the connection, the client SHOULD NOT assume that they will be processed by the server.

客户端接收到一个带有 "close" 连接选项的响应消息之后,必须 在该连接中停止发送请求,并在读取完该响应消息之后关闭这个连接。如果额外的流水线化的请求已经被发送到连接,客户端 不应当 假设服务器会处理它们。

If a server performs an immediate close of a TCP connection, there is a significant risk that the client will not be able to read the last HTTP response. If the server receives additional data from the client on a fully closed connection, such as another request that was sent by the client before receiving the server's response, the server's TCP stack will send a reset packet to the client; unfortunately, the reset packet might erase the client's unacknowledged input buffers before they can be read and interpreted by the client's HTTP parser.

如果服务器立即关闭 TCP 连接,会出现客户端将不再能够读取到最后一个 HTTP 响应的重大风险。如果服务器在一个完全关闭掉的连接中接收到来自客户端所发送的额外数据,例如客户端在接收到服务器的响应之前发送了其他请求,那么,服务器的 TCP 栈(TCP stack)将会发送一个重置消息数据包(Reset Packet)到该客户端;不幸的是,在客户端的未确认输入缓冲区unacknowledged input buffers能够被客户端的 HTTP 解析器读取并解释interpret之前,该缓冲区的数据可能会被这个重置数据包所抹去。

To avoid the TCP reset problem, servers typically close a connection in stages. First, the server performs a half-close by closing only the write side of the read/write connection. The server then continues to read from the connection until it receives a corresponding close by the client, or until the server is reasonably certain that its own TCP stack has received the client's acknowledgement of the packet(s) containing the server's last response. Finally, the server fully closes the connection.

为了避免 TCP 连接重置的问题,服务器一般会分阶段关闭一个连接。首先,服务器通过仅仅关闭该连接的写入端(一个连接有读/写两端)来实现连接的“半关闭”(half-close)。然后,服务器继续读取连接里的数据,直到服务器接收到来自客户端的相应关闭信号,或者直到服务器有理由确信它自身的 TCP 栈已经接收到客户端对包含服务器最后一个响应的数据包的确认。最后,服务器完全关闭这个连接。
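
The staged close described above maps onto the standard socket API roughly as follows. This is a sketch, not a definitive implementation: `lingering_close`, `timeout`, and `max_drain` are names and bounds assumed for the example, and the timeout stands in for "reasonably certain that the client has acknowledged the last response", which portable socket code cannot observe directly.

```python
import socket


def lingering_close(conn, timeout=2.0, max_drain=65536):
    """Close conn in stages: half-close, drain, then full close.

    timeout and max_drain are illustrative bounds, not values from the
    specification. Returns the number of octets discarded while draining.
    """
    drained = 0
    try:
        conn.shutdown(socket.SHUT_WR)   # stage 1: stop writing, keep reading
        conn.settimeout(timeout)
        while drained < max_drain:
            data = conn.recv(4096)      # stage 2: discard late client data
            if not data:                # b"" means the client closed its side
                break
            drained += len(data)
    except OSError:
        pass                            # timeout or reset: stop waiting
    finally:
        conn.close()                    # stage 3: full close
    return drained
```

Because the server keeps reading until the client closes, a late pipelined request does not trigger a reset, and the client can still read the last response from its receive buffer.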

It is unknown whether the reset problem is exclusive to TCP or might also be found in other transport connection protocols.

目前并不知道重置问题是否是 TCP 协议独有的,也有可能出现在其他传输连接协议上。

6.7. 升级 / Upgrade

The "Upgrade" header field is intended to provide a simple mechanism for transitioning from HTTP/1.1 to some other protocol on the same connection. A client MAY send a list of protocols in the Upgrade header field of a request to invite the server to switch to one or more of those protocols, in order of descending preference, before sending the final response. A server MAY ignore a received Upgrade header field if it wishes to continue using the current protocol on that connection. Upgrade cannot be used to insist on a protocol change.

Upgrade 头字段旨在提供一种简单的机制,用于在同一个连接上从 HTTP/1.1 过渡到其他协议。客户端 可以 在其发送的请求消息的 Upgrade 头字段中带有一个协议列表,按优先级从高到低排序(descending preference),来邀请(invite)服务器在发送最终响应之前切换到其中的一个或多个协议。服务器 可以 忽略其接收到的 Upgrade 头字段,如果它希望继续在该连接中使用当前协议的话。Upgrade 不能用于强制要求(insist on)变更协议。

译注:客户端可以“建议”服务器改变协议,但不能“要求”服务器改变协议。

Upgrade          = 1#protocol

protocol         = protocol-name ["/" protocol-version]
protocol-name    = token
protocol-version = token
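
As an illustration of the grammar above, a recipient might split an Upgrade field value into name/version pairs like this. The function name is an assumption for the example; the character class is the `tchar` set that makes up a `token` (Section 3.2.6).

```python
import re

# token = 1*tchar (Section 3.2.6); "/" is not a tchar, so it separates
# protocol-name from protocol-version unambiguously.
_TOKEN = r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"
_PROTOCOL = re.compile(rf"({_TOKEN})(?:/({_TOKEN}))?")


def parse_upgrade(value):
    """Parse an Upgrade field value into (name, version) pairs.

    version is None when the protocol has no "/" protocol-version part.
    """
    protocols = []
    for item in value.split(","):
        item = item.strip(" \t")
        if not item:
            continue  # tolerate empty list elements, as #rule recipients do
        match = _PROTOCOL.fullmatch(item)
        if match is None:
            raise ValueError(f"invalid protocol: {item!r}")
        protocols.append((match.group(1), match.group(2)))
    if not protocols:
        raise ValueError("Upgrade requires at least one protocol")
    return protocols
```

The list order is preserved, so the first pair is the sender's most preferred protocol.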

A server that sends a 101 (Switching Protocols) response MUST send an Upgrade header field to indicate the new protocol(s) to which the connection is being switched; if multiple protocol layers are being switched, the sender MUST list the protocols in layer-ascending order. A server MUST NOT switch to a protocol that was not indicated by the client in the corresponding request's Upgrade header field. A server MAY choose to ignore the order of preference indicated by the client and select the new protocol(s) based on other factors, such as the nature of the request or the current load on the server.

服务器发送一个带有 101 (Switching Protocols) 状态码的响应消息时,必须 带有一个 Upgrade 头字段用于指定连接将会被切换到哪一种(或多种)新的协议上。如果将会切换多个协议层,发送端 必须 按协议层从低到高的顺序Layer-ascending列出这些协议。服务器 禁止 切换到一个未被客户端(在对应的请求消息中的 Upgrade 头字段)所指定的协议。服务器 可以 选择忽略掉客户端所指定优先级顺序,依据其他因素来选择需要切换到哪一种的协议上,例如依据请求的性质或者当前服务器的负荷来选择。
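
The selection rule for a 101 response can be sketched as follows. `choose_protocol` is a hypothetical helper; it walks the server's own preference order, which the server is allowed to do, but only ever returns a protocol that the client listed. Exact string comparison is a simplifying assumption of this sketch.

```python
def choose_protocol(requested, supported):
    """Pick a protocol for a 101 response, or None to keep the current one.

    requested -- protocols from the client's Upgrade field, in client order
    supported -- protocols this server can switch to, in the server's order
    """
    offered = set(requested)
    for protocol in supported:
        # Never switch to something the client did not indicate.
        if protocol in offered:
            return protocol
    return None
```

Returning `None` corresponds to ignoring the Upgrade header field and answering in the current protocol.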

A server that sends a 426 (Upgrade Required) response MUST send an Upgrade header field to indicate the acceptable protocols, in order of descending preference.

服务器发送了一个带有 426 (Upgrade Required) 状态码的响应消息时,必须 带有一个 Upgrade 头字段用于指定服务器可接受切换到哪些协议,按优先级从高到低排序。

A server MAY send an Upgrade header field in any other response to advertise that it implements support for upgrading to the listed protocols, in order of descending preference, when appropriate for a future request.

服务器 可以 在其他响应消息(也就是说该响应消息所对应的请求消息并未带有 Upgrade 头字段)中带有 Upgrade 头字段,来声明服务器自身所支持的协议列表,按优先级从高到低排序,用于通知客户端将来使用哪些协议来发送请求更合适。

The following is a hypothetical example sent by a client:

假设某个客户端发送了以下一个请求消息:

GET /hello.txt HTTP/1.1
Host: www.example.com
Connection: upgrade
Upgrade: HTTP/2.0, SHTTP/1.3, IRC/6.9, RTA/x11

The capabilities and nature of the application-level communication after the protocol change is entirely dependent upon the new protocol(s) chosen. However, immediately after sending the 101 (Switching Protocols) response, the server is expected to continue responding to the original request as if it had received its equivalent within the new protocol (i.e., the server still has an outstanding request to satisfy after the protocol has been changed, and is expected to do so without requiring the request to be repeated).

在协议变更以后,应用层通信的能力和性质完全取决于所选择的(一个或多个)新协议。但是,在发送了 101 (Switching Protocols) 过渡性响应(interim response)之后,服务器需要紧接着继续响应原先的请求,就像服务器已经在新协议中接收到与之等价的请求一样(也就是说,在协议已经变更以后,服务器仍然有一个尚未完成的请求需要满足,并且需要在不要求客户端重复发送该请求的情况下满足它)。

译注:也就是说,在向客户端发送 101 (Switching Protocols) 过渡性响应之后就算真正切换到新协议了,但是,虽然在协议变更了,该服务器仍然未向客户端发送最终响应,因此,服务器需要以新的协议形式来响应这个请求,而不需要让客户端以新的协议形式来再一次发送该请求。

For example, if the Upgrade header field is received in a GET request and the server decides to switch protocols, it first responds with a 101 (Switching Protocols) message in HTTP/1.1 and then immediately follows that with the new protocol's equivalent of a response to a GET on the target resource. This allows a connection to be upgraded to protocols with the same semantics as HTTP without the latency cost of an additional round trip. A server MUST NOT switch protocols unless the received message semantics can be honored by the new protocol; an OPTIONS request can be honored by any protocol.

例如,服务器在一个 GET 请求中接收到 Upgrade 头字段,并决定切换协议,它首先使用 HTTP/1.1 响应一个 101 (Switching Protocols) 消息,然后马上以新协议中等价于对目标资源的 GET 响应的形式继续响应。这使得连接可以升级到与 HTTP 语义相同的协议,而不必付出一次额外往返的延迟成本。服务器 禁止 切换协议,除非其接收到的消息语义能够被新协议所满足(honored);一个 OPTIONS 请求能够被任何协议所满足。

The following is an example response to the above hypothetical request:

以下消息样例是对应上述假设的请求消息的响应消息:

HTTP/1.1 101 Switching Protocols
Connection: upgrade
Upgrade: HTTP/2.0

[... data stream switches to HTTP/2.0 with an appropriate response
(as defined by new protocol) to the "GET /hello.txt" request ...]

When Upgrade is sent, the sender MUST also send a Connection header field (Section 6.1) that contains an "upgrade" connection option, in order to prevent Upgrade from being accidentally forwarded by intermediaries that might not implement the listed protocols. A server MUST ignore an Upgrade header field that is received in an HTTP/1.0 request.

当发送一个带有 Upgrade 头字段的消息时,发送端 必须 使该消息同时带上一个包含有一个 "upgrade" 连接选项的 Connection 头字段(章节 6.1),以免 Upgrade 头字段被没有实现这些协议的中间人意外转发出去。服务端 必须 忽略接收自一个 HTTP/1.0 请求消息的 Upgrade 头字段。

A client cannot begin using an upgraded protocol on the connection until it has completely sent the request message (i.e., the client can't change the protocol it is sending in the middle of a message). If a server receives both an Upgrade and an Expect header field with the "100-continue" expectation (Section 5.1.1 of [RFC7231]), the server MUST send a 100 (Continue) response before sending a 101 (Switching Protocols) response.

客户端不能在连接中开始使用升级后的协议,直到它将请求消息完整地发送出去(也就是说,客户端不能在发送一个消息的中途改变协议)。如果服务器接收到的请求消息既带有 Upgrade 头字段,也带有包含 "100-continue" 期望值(expectation)的 Expect 头字段(【RFC7231】章节 5.1.1),服务器 必须 在发送 101 (Switching Protocols) 响应消息之前,先发送一个 100 (Continue) 响应消息。

The Upgrade header field only applies to switching protocols on top of the existing connection; it cannot be used to switch the underlying connection (transport) protocol, nor to switch the existing communication to a different connection. For those purposes, it is more appropriate to use a 3xx (Redirection) response (Section 6.4 of [RFC7231]).

Upgrade 头字段仅适用于在现有连接之上切换协议,它并不能用于切换底层的连接(传输)协议,也不能用于将现有的通信切换到另一个连接之上。要达到这些目的,使用一个 3xx (Redirection) 响应会更加合适(【RFC7231】章节 6.4)。

译注:Upgrade 既不能切换底层协议,也不能切换当前的连接,它只能切换当前连接所使用的协议。

This specification only defines the protocol name "HTTP" for use by the family of Hypertext Transfer Protocols, as defined by the HTTP version rules of Section 2.6 and future updates to this specification. Additional tokens ought to be registered with IANA using the registration procedure defined in Section 8.6.

本规范仅定义了供超文本传输协议家族使用的协议名称 "HTTP",其定义遵循章节 2.6 中的 HTTP 版本规则以及将来对本规范的更新。其他的令牌应该使用定义在章节 8.6 的登记手续到 IANA 进行登记。

7. ABNF 列表扩展:#rule / ABNF List Extension: #rule

A #rule extension to the ABNF rules of [RFC5234] is used to improve readability in the definitions of some header field values.

#rule 是对【RFC5234】ABNF 规则的扩展,用于提高某些头字段值定义的可读性。

A construct "#" is defined, similar to "*", for defining comma-delimited lists of elements. The full form is "<n>#<m>element" indicating at least <n> and at most <m> elements, each separated by a single comma (",") and optional whitespace (OWS).

定义了一个 "#" 结构,类似于 "*",用于定义以英文逗号分隔的元素列表。完整的形式是 "<n>#<m>element",表明至少 <n> 个、至多 <m> 个元素,每个元素以单个逗号(",")以及可选的空白(OWS)分隔。

In any production that uses the list construct, a sender MUST NOT generate empty list elements. In other words, a sender MUST generate lists that satisfy the following syntax:

在任何使用列表结构的场景中,发送端 禁止 生成一个空的列表元素。也就是说,发送端 必须 生成一个满足下列句法的列表:

1#element => element *( OWS "," OWS element )

and:

以及:

#element => [ 1#element ]

and for n >= 1 and m > 1:

对于 n >= 1 并且 m > 1,有:

<n>#<m>element => element <n-1>*<m-1>( OWS "," OWS element )

For compatibility with legacy list rules, a recipient MUST parse and ignore a reasonable number of empty list elements: enough to handle common mistakes by senders that merge values, but not so much that they could be used as a denial-of-service mechanism. In other words, a recipient MUST accept lists that satisfy the following syntax:

为了兼容历史遗留的列表规则,接收端 必须 解析(parse)并且忽略合理数量的空列表元素:其数量要足以处理发送端合并字段值时出现的常见错误,但又不能多到可以被用作拒绝服务攻击的手段。也就是说,接收端 必须 接受满足以下句法的列表:

#element => [ ( "," / element ) *( OWS "," [ OWS element ] ) ]

1#element => *( "," OWS ) element *( OWS "," [ OWS element ] )

Empty elements do not contribute to the count of elements present. For example, given these ABNF productions:

空元素不计入元素的数目,例如,给定以下 ABNF 规则:

example-list      = 1#example-list-elmt
example-list-elmt = token ; see Section 3.2.6 

Then the following are valid values for example-list (not including the double quotes, which are present for delimitation only):

那么,以下都是合乎 example-list 规则的值(不包含双引号,双引号仅用于对数值进行定界):

"foo,bar"
"foo ,bar,"
"foo , ,bar,charlie   "

In contrast, the following values would be invalid, since at least one non-empty element is required by the example-list production:

作为对比,以下都是不合乎 example-list 规则的值,这是因为 example-list 要求至少存在一个非空元素。

""
","
",   ,"

Appendix B shows the collected ABNF for recipients after the list constructs have been expanded.

附录 B 展示了在展开列表结构之后,供接收端使用的 ABNF 规则集合。
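
The recipient-side behavior for the example-list values above can be sketched in a few lines. The function name and the `minimum` parameter are assumptions for the example. The sketch is slightly more lenient than the grammar (it also accepts leading whitespace) and puts no cap on the number of empty elements, which a hardened parser would add.

```python
import re

# token = 1*tchar (Section 3.2.6)
_TOKEN = re.compile(r"[!#$%&'*+\-.^_`|~0-9A-Za-z]+")


def parse_token_list(value, minimum=1):
    """Parse a comma-delimited list of tokens, ignoring empty elements."""
    elements = []
    for item in value.split(","):
        item = item.strip(" \t")        # OWS is SP / HTAB
        if not item:
            continue                    # empty elements do not count
        if _TOKEN.fullmatch(item) is None:
            raise ValueError(f"not a token: {item!r}")
        elements.append(item)
    if len(elements) < minimum:
        raise ValueError("too few list elements")
    return elements
```

With `minimum=1` this accepts the three valid example values and rejects the three invalid ones, because empty elements never contribute to the count.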

8. IANA 注意事项 / IANA Considerations

8.1. 头字段登记 / Header Field Registration

HTTP header fields are registered within the "Message Headers" registry maintained at http://www.iana.org/assignments/message-headers/.

This document defines the following HTTP header fields, so the "Permanent Message Header Field Names" registry has been updated accordingly (see [BCP90]).

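+-------------------+----------+----------+---------------+
| Header Field Name | Protocol | Status   | Reference     |
+-------------------+----------+----------+---------------+
| Connection        | http     | standard | Section 6.1   |
| Content-Length    | http     | standard | Section 3.3.2 |
| Host              | http     | standard | Section 5.4   |
| TE                | http     | standard | Section 4.3   |
| Trailer           | http     | standard | Section 4.4   |
| Transfer-Encoding | http     | standard | Section 3.3.1 |
| Upgrade           | http     | standard | Section 6.7   |
| Via               | http     | standard | Section 5.7.1 |
+-------------------+----------+----------+---------------+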

Furthermore, the header field-name "Close" has been registered as "reserved", since using that name as an HTTP header field might conflict with the "close" connection option of the Connection header field (Section 6.1).

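+-------------------+----------+----------+---------------+
| Header Field Name | Protocol | Status   | Reference     |
+-------------------+----------+----------+---------------+
| Close             | http     | reserved | Section 8.1   |
+-------------------+----------+----------+---------------+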

The change controller is: "IETF (iesg@ietf.org) - Internet Engineering Task Force".

8.2. URI 方案登记 / URI Scheme Registration

IANA maintains the registry of URI Schemes [BCP115] at http://www.iana.org/assignments/uri-schemes/.

This document defines the following URI schemes, so the "Permanent URI Schemes" registry has been updated accordingly.

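+------------+------------------------------------+---------------+
| URI Scheme | Description                        | Reference     |
+------------+------------------------------------+---------------+
| http       | Hypertext Transfer Protocol        | Section 2.7.1 |
| https      | Hypertext Transfer Protocol Secure | Section 2.7.2 |
+------------+------------------------------------+---------------+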

8.3. 互联网媒体类型登记 / Internet Media Type Registration

IANA maintains the registry of Internet media types [BCP13] at http://www.iana.org/assignments/media-types.

This document serves as the specification for the Internet media types "message/http" and "application/http". The following has been registered with IANA.

8.3.1. Internet Media Type message/http

The message/http type can be used to enclose a single HTTP request or response message, provided that it obeys the MIME restrictions for all "message" types regarding line length and encodings.

Type name: message
Subtype name: http
Required parameters: N/A
Optional parameters: version, msgtype
  version: The HTTP-version number of the enclosed message (e.g., "1.1"). If not present, the version can be determined from the first line of the body.
  msgtype: The message type — "request" or "response". If not present, the type can be determined from the first line of the body.
Encoding considerations: only "7bit", "8bit", or "binary" are permitted
Security considerations: see Section 9
Interoperability considerations: N/A
Published specification: This specification (see Section 8.3.1).
Applications that use this media type: N/A
Fragment identifier considerations: N/A
Additional information: 
  Magic number(s): N/A
  Deprecated alias names for this type: N/A
  File extension(s): N/A
  Macintosh file type code(s): N/A
Person and email address to contact for further information: See Authors' Addresses section.
Intended usage: COMMON
Restrictions on usage: N/A
Author: See Authors' Addresses section.
Change controller: IESG

8.3.2. Internet Media Type application/http

The application/http type can be used to enclose a pipeline of one or more HTTP request or response messages (not intermixed).

Type name: application
Subtype name: http
Required parameters: N/A
Optional parameters: version, msgtype
  version: The HTTP-version number of the enclosed messages (e.g., "1.1"). If not present, the version can be determined from the first line of the body.
  msgtype: The message type — "request" or "response". If not present, the type can be determined from the first line of the body.
Encoding considerations: HTTP messages enclosed by this type are in "binary" format; use of an appropriate Content-Transfer-Encoding is required when transmitted via email.
Security considerations: see Section 9
Interoperability considerations: N/A
Published specification: This specification (see Section 8.3.2).
Applications that use this media type: N/A
Fragment identifier considerations: N/A
Additional information: 
  Deprecated alias names for this type: N/A
  Magic number(s): N/A
  File extension(s): N/A
  Macintosh file type code(s): N/A
Person and email address to contact for further information: See Authors' Addresses section.
Intended usage: COMMON
Restrictions on usage: N/A
Author: See Authors' Addresses section.
Change controller: IESG

8.4. 传输编码登记 / Transfer Coding Registry

The "HTTP Transfer Coding Registry" defines the namespace for transfer coding names. It is maintained at http://www.iana.org/assignments/http-parameters.

8.4.1. Procedure

Registrations MUST include the following fields:

  • Name
  • Description
  • Pointer to specification text

Names of transfer codings MUST NOT overlap with names of content codings (Section 3.1.2.1 of [RFC7231]) unless the encoding transformation is identical, as is the case for the compression codings defined in Section 4.2.

Values to be added to this namespace require IETF Review (see Section 4.1 of [RFC5226]), and MUST conform to the purpose of transfer coding defined in this specification.

Use of program names for the identification of encoding formats is not desirable and is discouraged for future encodings.

8.4.2. Registration

The "HTTP Transfer Coding Registry" has been updated with the registrations below:

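+------------+-------------------------------------+---------------+
| Name       | Description                         | Reference     |
+------------+-------------------------------------+---------------+
| chunked    | Transfer in a series of chunks      | Section 4.1   |
| compress   | UNIX "compress" data format [Welch] | Section 4.2.1 |
| deflate    | "deflate" compressed data           | Section 4.2.2 |
|            | ([RFC1951]) inside the "zlib" data  |               |
|            | format ([RFC1950])                  |               |
| gzip       | GZIP file format [RFC1952]          | Section 4.2.3 |
| x-compress | Deprecated (alias for compress)     | Section 4.2.1 |
| x-gzip     | Deprecated (alias for gzip)         | Section 4.2.3 |
+------------+-------------------------------------+---------------+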

8.5. 内容编码登记 / Content Coding Registration

IANA maintains the "HTTP Content Coding Registry" at http://www.iana.org/assignments/http-parameters.

The "HTTP Content Coding Registry" has been updated with the registrations below:

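+------------+-------------------------------------+---------------+
| Name       | Description                         | Reference     |
+------------+-------------------------------------+---------------+
| compress   | UNIX "compress" data format [Welch] | Section 4.2.1 |
| deflate    | "deflate" compressed data           | Section 4.2.2 |
|            | ([RFC1951]) inside the "zlib" data  |               |
|            | format ([RFC1950])                  |               |
| gzip       | GZIP file format [RFC1952]          | Section 4.2.3 |
| x-compress | Deprecated (alias for compress)     | Section 4.2.1 |
| x-gzip     | Deprecated (alias for gzip)         | Section 4.2.3 |
+------------+-------------------------------------+---------------+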

8.6. 升级令牌登记 / Upgrade Token Registry

The "Hypertext Transfer Protocol (HTTP) Upgrade Token Registry" defines the namespace for protocol-name tokens used to identify protocols in the Upgrade header field. The registry is maintained at http://www.iana.org/assignments/http-upgrade-tokens.

8.6.1. Procedure

Each registered protocol name is associated with contact information and an optional set of specifications that details how the connection will be processed after it has been upgraded.

Registrations happen on a "First Come First Served" basis (see Section 4.1 of [RFC5226]) and are subject to the following rules:

  1. A protocol-name token, once registered, stays registered forever.
  2. The registration MUST name a responsible party for the registration.
  3. The registration MUST name a point of contact.
  4. The registration MAY name a set of specifications associated with that token. Such specifications need not be publicly available.
  5. The registration SHOULD name a set of expected "protocol-version" tokens associated with that token at the time of registration.
  6. The responsible party MAY change the registration at any time. The IANA will keep a record of all such changes, and make them available upon request.
  7. The IESG MAY reassign responsibility for a protocol token. This will normally only be used in the case when a responsible party cannot be contacted.

This registration procedure for HTTP Upgrade Tokens replaces that previously defined in Section 7.2 of [RFC2817].

8.6.2. Upgrade Token Registration

The "HTTP" entry in the upgrade token registry has been updated with the registration below:

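+-------+-----------------------------+-------------------------------+-------------+
| Value | Description                 | Expected Version Tokens       | Reference   |
+-------+-----------------------------+-------------------------------+-------------+
| HTTP  | Hypertext Transfer Protocol | any DIGIT.DIGIT (e.g., "2.0") | Section 2.6 |
+-------+-----------------------------+-------------------------------+-------------+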

The responsible party is: "IETF (iesg@ietf.org) - Internet Engineering Task Force".

9. 安全注意事项 / Security Considerations

This section is meant to inform developers, information providers, and users of known security considerations relevant to HTTP message syntax, parsing, and routing. Security considerations about HTTP semantics and payloads are addressed in [RFC7231].

本章节是打算将已知的与 HTTP 消息句法、解析,以及路由相关的安全注意事项告知开发者、信息提供商,以及用户。关于 HTTP 的语义以及有效载荷的安全注意事项放在【RFC7231】处理。

9.1. 确立权威 / Establishing Authority

HTTP relies on the notion of an authoritative response: a response that has been determined by (or at the direction of) the authority identified within the target URI to be the most appropriate response for that request given the state of the target resource at the time of response message origination. Providing a response from a non-authoritative source, such as a shared cache, is often useful to improve performance and availability, but only to the extent that the source can be trusted or the distrusted response can be safely used.

HTTP 提出了一个“权威响应authoritative response”的概念——由目标 URI 所表明的 authority (权威机构、官方机构)所决定(或指导)的最合适于这个请求的一种响应。它给出了诞生响应消息那时候的目标资源状态。提供自一个非权威来源non-authoritative source——例如一个共享缓存——的响应,一般对于提升性能和可用性有很大作用,但是仅在该来源是受信任的或者该不受信任的响应可以被安全地使用的情况。

译注: "authoritative response" 译为“权威响应”、“官方响应”,也就是说,这种响应是来自客户端所期望的那一个源服务器的,而且其有效载荷没有被任何中间人修改(编码)过的。 另外,"authoritative" 是“有权威的,当局的,官方的”的意思,但并没有“授权”、“认证”的意思!某些人将 "authoritative response" 译为“授权响应、认证响应”都是错误的,个人认为是弄混了 "authority","authentication","certification","verification" 这几个单词的意思,例如 "certificate authority" 是“认证中心、认证机构”。

Unfortunately, establishing authority can be difficult. For example, phishing is an attack on the user's perception of authority, where that perception can be misled by presenting similar branding in hypertext, possibly aided by userinfo obfuscating the authority component (see Section 2.7.1). User agents can reduce the impact of phishing attacks by enabling users to easily inspect a target URI prior to making an action, by prominently distinguishing (or rejecting) userinfo when present, and by not sending stored credentials and cookies when the referring document is from an unknown or untrusted source.

不幸的是,确立 authority 不是一个容易的事。例如,钓鱼phishing是一种针对用户对于 authority 的认知perception的攻击,这种认知可能会因为在超文本里出现了相似的品牌branding,可能还会辅以 userinfo 来对 authority 组件进行混淆(见章节 2.7.1),从而造成误导。用户代理能够通过提供一种途径,让用户在执行操作之前可以轻易检测某个目标 URI 的真伪,来降低钓鱼攻击的影响。例如,明显区分(或拒绝) userinfo 组件的出现,并且当所指向的文档是来自某个未知或不受信任的来源的时候,不对其发送存储在客户端的授权证书和 cookies。
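
Distinguishing userinfo from the real host is straightforward with a URI parser. The sketch below uses `urllib.parse.urlsplit` from the Python standard library; `split_authority` is a hypothetical helper that a user interface could call before deciding how to display or whether to reject a link.

```python
from urllib.parse import urlsplit


def split_authority(uri):
    """Return (userinfo, host) so the real host can be shown prominently.

    userinfo is None when the authority component carries none.
    """
    parts = urlsplit(uri)
    userinfo = parts.username
    if userinfo is not None and parts.password is not None:
        userinfo = f"{userinfo}:{parts.password}"
    return userinfo, parts.hostname
```

For `http://www.example.com@evil.example/login` the helper reports `www.example.com` as userinfo and `evil.example` as the host, which is exactly the obfuscation the paragraph above warns about.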

When a registered name is used in the authority component, the "http" URI scheme (Section 2.7.1) relies on the user's local name resolution service to determine where it can find authoritative responses. This means that any attack on a user's network host table, cached names, or name resolution libraries becomes an avenue for attack on establishing authority. Likewise, the user's choice of server for Domain Name Service (DNS), and the hierarchy of servers from which it obtains resolution results, could impact the authenticity of address mappings; DNS Security Extensions (DNSSEC, [RFC4033]) are one way to improve authenticity.

当某个已登记的名称用在 authority 组件中的时候,"http" URI 方案(章节 2.7.1)依靠用户的本地名称解析服务来确定哪里能够找到权威响应authoritative responses。这意味着任何对于用户的网络主机映射表network host table、缓存名称,或者名称解析库的攻击都会成为一个攻击确立权威的隐患。同样,用户选择哪一个服务器用于域名解析服务domain name service(DNS),以及其获取域名解析结果的服务器所在层级,都有可能影响到地址映射的可靠性。DNS 安全性扩展DNS Security Extensions(DNSSEC,【RFC4033】)是一种提高可靠性的方式。

Furthermore, after an IP address is obtained, establishing authority for an "http" URI is vulnerable to attacks on Internet Protocol routing.

另外,在获得一个 IP 地址之后,确立一个 "http" URI 的 authority 在 IP 协议Internet Protocol路由方面也存在被攻击的风险。

The "https" scheme (Section 2.7.2) is intended to prevent (or at least reveal) many of these potential attacks on establishing authority, provided that the negotiated TLS connection is secured and the client properly verifies that the communicating server's identity matches the target URI's authority component (see [RFC2818]). Correctly implementing such verification can be difficult (see [Georgiev]).

"https" 方案(章节 2.7.2)意图避免(或者至少揭露)上述许多针对确立权威的潜在攻击,前提是协商得到的 TLS 连接是安全的,并且客户端正确地验证了当前通信的服务器的身份与目标 URI 的 authority 组件相匹配(见【RFC2818】)。正确实现这种验证可能不是一件容易的事(见【Georgiev】)。

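As one hedged illustration of the identity check that "https" depends on, Python's standard `ssl` module enables both certificate validation and host name checking when a context is created with `ssl.create_default_context()`. The two function names below are assumptions for the example; the sketch shows one library's defaults and is not a complete HTTPS client.

```python
import socket
import ssl


def make_verifying_context():
    """Return a TLS context that validates the chain and the host name."""
    context = ssl.create_default_context()
    # Both settings are already the defaults of this context; they are
    # checked here so that a later change cannot silently disable them.
    if context.verify_mode != ssl.CERT_REQUIRED or not context.check_hostname:
        raise RuntimeError("TLS verification is not enabled")
    return context


def connect_https(host, port=443, timeout=10.0):
    """Open a TLS connection whose certificate must match host."""
    raw = socket.create_connection((host, port), timeout=timeout)
    try:
        # server_hostname is the name taken from the URI's authority
        # component; the handshake fails if the certificate does not match.
        return make_verifying_context().wrap_socket(raw, server_hostname=host)
    except Exception:
        raw.close()
        raise
```

Passing the host from the target URI as `server_hostname` is what ties the verified identity to the authority component.
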
9.2. 中间人的风险 / Risks of Intermediaries

By their very nature, HTTP intermediaries are men-in-the-middle and, thus, represent an opportunity for man-in-the-middle attacks. Compromise of the systems on which the intermediaries run can result in serious security and privacy problems. Intermediaries might have access to security-related information, personal information about individual users and organizations, and proprietary information belonging to users and content providers. A compromised intermediary, or an intermediary implemented or configured without regard to security and privacy considerations, might be used in the commission of a wide range of potential attacks.

就其本质而言,HTTP 中间人(intermediaries)就是“中间人”(men-in-the-middle),因此也就为中间人攻击(man-in-the-middle attacks)提供了机会。如果运行中间人的系统被攻破,可能会导致严重的安全和隐私问题。中间人可能能够访问安全相关的信息、个人用户或组织的个人信息,以及属于用户或内容提供商的专有信息。一个已被攻破的中间人,或者一个在实现或配置时没有顾及安全和隐私注意事项的中间人,可能会被用于实施各种潜在攻击。

Intermediaries that contain a shared cache are especially vulnerable to cache poisoning attacks, as described in Section 8 of [RFC7234].

包含共享缓存的中间人特别容易受到缓存中毒攻击(cache poisoning attacks),正如【RFC7234】章节 8 所述那样。

Implementers need to consider the privacy and security implications of their design and coding decisions, and of the configuration options they provide to operators (especially the default configuration).

实现者(implementers)需要考虑其设计和编码决策,以及提供给运营者(operators)的配置选项(特别是默认配置)在隐私和安全方面的影响。

Users need to be aware that intermediaries are no more trustworthy than the people who run them; HTTP itself cannot solve this problem.

用户需要意识到,中间人并不比运行它们的人更值得信赖;HTTP 本身无法解决这个问题。

9.3. 通过协议元素长度进行攻击 / Attacks via Protocol Element Length

Because HTTP uses mostly textual, character-delimited fields, parsers are often vulnerable to attacks based on sending very long (or very slow) streams of data, particularly where an implementation is expecting a protocol element with no predefined length.

因为 HTTP 使用的主要是文本性的、以字符分隔的字段,所以解析器(parser)常常容易受到基于发送非常长(或非常慢)的数据流的攻击,特别是在某个实现(implementation)期望接收一个没有预定义长度的协议元素的时候。

To promote interoperability, specific recommendations are made for minimum size limits on request-line (Section 3.1.1) and header fields (Section 3.2). These are minimum recommendations, chosen to be supportable even by implementations with limited resources; it is expected that most implementations will choose substantially higher limits.

为了提高互操作性,本规范对请求行(章节 3.1.1)以及头字段(章节 3.2)的最小长度限制提出了具体建议。这些只是最低限度的建议,选取的是即使资源受限的实现(implementations)也能够支持的值;预期大多数实现会选择高得多的限制值。

A server can reject a message that has a request-target that is too long (Section 6.5.12 of [RFC7231]) or a request payload that is too large (Section 6.5.11 of [RFC7231]). Additional status codes related to capacity limits have been defined by extensions to HTTP [RFC6585].

服务器能够拒绝一个带有过长的请求目标(【RFC7231】章节 6.5.12),或者过大的请求有效载荷(【RFC7231】章节 6.5.11)的消息。作为对 HTTP 的扩展,额外的关于容量限制的响应状态码已经定义在【RFC6585】。

Recipients ought to carefully limit the extent to which they process other protocol elements, including (but not limited to) request methods, response status phrases, header field-names, numeric values, and body chunks. Failure to limit such processing can result in buffer overflows, arithmetic overflows, or increased vulnerability to denial-of-service attacks.

接收端应该小心地限制其处理其他协议元素的程度,包括(但不限于)请求方法、响应状态短语、头字段的名称、数字值以及消息体分块。如果不对此类处理加以限制,可能会导致缓冲区溢出、算术溢出,或者更容易遭受拒绝服务攻击(denial-of-service attacks)。
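
The limits discussed in this section can be sketched as a bounded reader for the request-line and header section. This is a minimal Python illustration, not a complete parser: the names `read_head`, `LimitExceeded`, `MAX_LINE`, and `MAX_HEADERS` are hypothetical, the 8000-octet line cap follows the minimum recommended in Section 3.1.1, and the cap of 100 header fields is an arbitrary assumption.

```python
import io

MAX_LINE = 8000     # assumed per-line cap in octets (Section 3.1.1 minimum)
MAX_HEADERS = 100   # assumed cap on the number of header fields


class LimitExceeded(Exception):
    """Raised when a protocol element exceeds a configured limit."""


def read_head(stream, max_line=MAX_LINE, max_headers=MAX_HEADERS):
    """Read the request-line and header lines, enforcing size limits."""
    lines = []
    while True:
        # readline(n) returns at most n octets, so an over-long line comes
        # back either too long or without its line terminator.
        line = stream.readline(max_line + 1)
        if len(line) > max_line or not line.endswith(b"\n"):
            raise LimitExceeded("line too long or truncated")
        if line in (b"\r\n", b"\n"):
            return lines  # empty line ends the header section
        if len(lines) > max_headers:  # request-line plus max_headers fields
            raise LimitExceeded("too many header fields")
        lines.append(line.rstrip(b"\r\n"))
```

A server built along these lines would typically map `LimitExceeded` to an appropriate 4xx response rather than continue buffering; a timeout on the underlying connection would still be needed to address very slow senders.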

9.4. 响应分割 / Response Splitting

Response splitting (a.k.a, CRLF injection) is a common technique, used in various attacks on Web usage, that exploits the line-based nature of HTTP message framing and the ordered association of requests to responses on persistent connections [Klein]. This technique can be particularly damaging when the requests pass through a shared cache.

响应分割(Response Splitting,也称为“CRLF 注入”)是一种在针对 Web 使用(Web usage)的各种攻击中常见的技术,它利用了 HTTP 消息分帧基于行(line-based)的性质,以及在持久连接中请求与响应按顺序关联的性质【Klein】。当请求经过一个共享缓存的时候,这种技术特别具有破坏性。

Response splitting exploits a vulnerability in servers (usually within an application server) where an attacker can send encoded data within some parameter of the request that is later decoded and echoed within any of the response header fields of the response. If the decoded data is crafted to look like the response has ended and a subsequent response has begun, the response has been split and the content within the apparent second response is controlled by the attacker. The attacker can then make any other request on the same persistent connection and trick the recipients (including intermediaries) into believing that the second half of the split is an authoritative answer to the second request.

响应分割利用了服务器(通常在一个应用服务器内)的一个弱点:攻击者能够在请求的某个参数里发送经过编码的数据,而这些数据随后会被解码,并回显(echo)到响应消息的某个头字段里。如果解码后的数据被精心设计得看起来像是该响应消息已经结束、后续的响应消息已经开始,那么该响应消息就被“分割”了,并且表面上的第二个响应消息的内容受控于攻击者。然后,攻击者就能够在同一个持久连接上发出任意其他请求,并欺骗接收端(包括中间人),让它们相信分割出来的后半部分是第二个请求的权威应答(authoritative answer)。

For example, a parameter within the request-target might be read by an application server and reused within a redirect, resulting in the same parameter being echoed in the Location header field of the response. If the parameter is decoded by the application and not properly encoded when placed in the response field, the attacker can send encoded CRLF octets and other content that will make the application's single response look like two or more responses.

例如,请求目标内的一个参数可能会被应用服务器读取,并复用在一个重定向响应内,导致同一个参数被回显到响应消息的 Location 头字段内。如果该参数被应用解码,并且在放入响应头字段的时候没有被正确地编码,那么,攻击者就能够发送经过编码的 CRLF 字节以及其他内容,让该应用的单一响应消息看起来像是两个或多个响应消息。

A common defense against response splitting is to filter requests for data that looks like encoded CR and LF (e.g., "%0D" and "%0A"). However, that assumes the application server is only performing URI decoding, rather than more obscure data transformations like charset transcoding, XML entity translation, base64 decoding, sprintf reformatting, etc. A more effective mitigation is to prevent anything other than the server's core protocol libraries from sending a CR or LF within the header section, which means restricting the output of header fields to APIs that filter for bad octets and not allowing application servers to write directly to the protocol stream.

预防响应分割的一个常见方式是过滤掉请求消息中看起来像是经过编码的 CR 以及 LF 的数据(例如,"%0D" 和 "%0A")。但是,这种做法假设了应用服务器仅执行 URI 解码,而不会执行更加隐晦的数据转换,例如字符集转码(charset transcoding)、XML 实体翻译(XML entity translation)、base64 解码、sprintf 字符串格式转换(reformatting)等等。一个更有效的缓解方式是阻止任何除了服务器自身的核心协议库以外的程序在消息头部中发送 CR 或者 LF,这意味着头字段只能通过会过滤非法字节的 API 来输出,同时不允许应用服务器将数据直接写入到协议流。
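
The second mitigation, restricting header output to an API that filters for bad octets, can be sketched as follows. This is a hypothetical Python helper (`set_header` and `HeaderInjectionError` are illustrative names, not part of any particular server framework); it rejects rather than strips the offending characters, which is one possible policy among several.

```python
class HeaderInjectionError(ValueError):
    """Raised when a header name or value contains a forbidden character."""


# CR and LF would end the header line; NUL is rejected as well to be safe.
_FORBIDDEN = ("\r", "\n", "\x00")


def set_header(headers: dict, name: str, value: str) -> None:
    """Store a response header only if it cannot break message framing."""
    for text in (name, value):
        if any(ch in text for ch in _FORBIDDEN):
            raise HeaderInjectionError("CR, LF, or NUL in header field")
    headers[name] = value
```

Because the check runs at the point of output, it applies no matter which decoding or transformation produced the value, which is the weakness of filtering "%0D" and "%0A" on input.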

9.5. 请求走私 / Request Smuggling

Request smuggling ([Linhart]) is a technique that exploits differences in protocol parsing among various recipients to hide additional requests (which might otherwise be blocked or disabled by policy) within an apparently harmless request. Like response splitting, request smuggling can lead to a variety of attacks on HTTP usage.

请求走私(Request Smuggling,【Linhart】)是一种利用多个接收端对协议解析的差异,在一个表面无害的请求里隐藏额外请求(这些请求在其他情况下可能会被策略阻止或者禁用)的技术。像响应分割(response splitting)一样,请求走私能够引发针对 HTTP 使用的各种攻击。

This specification has introduced new requirements on request parsing, particularly with regard to message framing in Section 3.3.3, to reduce the effectiveness of request smuggling.

本规范对请求解析(request parsing)引入了新的要求,特别是在消息分帧(message framing)方面(章节 3.3.3),以降低请求走私的有效性。
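
One way to picture the framing requirements is a strict check on the length-related header fields of a received request. The Python sketch below is an assumption-laden illustration: `check_framing` and `FramingError` are hypothetical names, and the policy of rejecting any message that carries both Transfer-Encoding and Content-Length is a deliberately strict choice (Section 3.3.3 says such a message ought to be handled as an error).

```python
class FramingError(ValueError):
    """Raised when the length-related header fields are ambiguous or invalid."""


def check_framing(fields):
    """Validate framing headers; `fields` is a list of (name, value) pairs."""
    names = [name.lower() for name, _ in fields]
    lengths = set()
    for name, value in fields:
        if name.lower() == "content-length":
            # A comma-separated list of values is treated like repeated fields.
            for part in value.split(","):
                lengths.add(part.strip())
    if "transfer-encoding" in names and lengths:
        raise FramingError("both Transfer-Encoding and Content-Length present")
    if len(lengths) > 1:
        raise FramingError("differing Content-Length values")
    for value in lengths:
        if not (value.isascii() and value.isdigit()):
            raise FramingError("invalid Content-Length value")
```

Rejecting ambiguous messages at the first recipient means that downstream parsers never get the chance to disagree about where one request ends and the next begins.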

9.6. 消息完整性 / Message Integrity

HTTP does not define a specific mechanism for ensuring message integrity, instead relying on the error-detection ability of underlying transport protocols and the use of length or chunk-delimited framing to detect completeness. Additional integrity mechanisms, such as hash functions or digital signatures applied to the content, can be selectively added to messages via extensible metadata header fields. Historically, the lack of a single integrity mechanism has been justified by the informal nature of most HTTP communication. However, the prevalence of HTTP as an information access mechanism has resulted in its increasing use within environments where verification of message integrity is crucial.

HTTP 并没有定义一个具体的机制来保证消息的完整性,而是依靠底层传输协议的错误检测能力(error-detection ability),以及使用长度或分块限定的分帧(chunk-delimited framing)来检测消息是否完整。额外的完整性机制,例如对内容应用哈希函数或者数字签名(digital signatures),可以通过可扩展的元数据头字段(extensible metadata header fields)选择性地添加到消息中。历史上,之所以没有单一的完整性机制,是因为大多数 HTTP 通信都是非正式的。但是,随着 HTTP 作为一种信息访问机制的普及,它越来越多地被用在消息完整性验证至关重要的环境中。

User agents are encouraged to implement configurable means for detecting and reporting failures of message integrity such that those means can be enabled within environments for which integrity is necessary. For example, a browser being used to view medical history or drug interaction information needs to indicate to the user when such information is detected by the protocol to be incomplete, expired, or corrupted during transfer. Such mechanisms might be selectively enabled via user agent extensions or the presence of message integrity metadata in a response. At a minimum, user agents ought to provide some indication that allows a user to distinguish between a complete and incomplete response message (Section 3.4) when such verification is desired.

鼓励用户代理实现可配置的方式来检测和报告消息完整性方面的故障,以使这些方式能够在必须保证完整性的环境下被启用。例如,一个用于浏览病史或者药物相互作用信息的浏览器,当协议检测到这些信息在传输过程中不完整、过期或者损坏的时候,需要向用户指出来。这些机制可以通过用户代理的扩展程序,或者响应中出现的消息完整性元数据,来被选择性地启用。至少,在需要进行此类验证的时候,用户代理应该提供某种指示,让用户能够区分完整和不完整的响应消息(章节 3.4)。
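
For a response framed by Content-Length, the distinction between a complete and an incomplete message can be sketched as below. The function name `read_body` is hypothetical, and the example assumes a blocking, file-like stream; it only shows how a recipient might surface the completeness flag, not how a user agent should present it.

```python
import io


def read_body(stream, content_length):
    """Read a length-delimited body and report whether it arrived in full."""
    chunks = []
    remaining = content_length
    while remaining > 0:
        chunk = stream.read(remaining)
        if not chunk:  # end of stream before the declared length was reached
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks), remaining == 0
```

A caller would keep the boolean alongside the body so that the user interface can mark a truncated response instead of silently displaying it.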

9.7. 消息保密性 / Message Confidentiality

HTTP relies on underlying transport protocols to provide message confidentiality when that is desired. HTTP has been specifically designed to be independent of the transport protocol, such that it can be used over many different forms of encrypted connection, with the selection of such transports being identified by the choice of URI scheme or within user agent configuration.

HTTP 依靠底层传输协议在需要的时候提供消息保密性。HTTP 已被专门设计为不依赖于具体的传输协议,使其能够应用在多种不同形式的加密连接之上;具体选用哪种传输方式,则由所选的 URI 方案或者用户代理的配置来确定。

The "https" scheme can be used to identify resources that require a confidential connection, as described in Section 2.7.2.

"https" 方案能够用于标识那些要求使用保密连接(confidential connection)的资源,正如章节 2.7.2 描述的那样。

9.8. 服务器日志信息的隐私 / Privacy of Server Log Information

A server is in the position to save personal data about a user's requests over time, which might identify their reading patterns or subjects of interest. In particular, log information gathered at an intermediary often contains a history of user agent interaction, across a multitude of sites, that can be traced to individual users.

服务器有能力长期保存与用户请求有关的个人数据,这些数据可能会暴露用户的阅读模式或者感兴趣的主题。特别是,在中间人处收集的日志信息通常包含用户代理与众多网站之间的交互历史,并且能够追溯到个人用户。

HTTP log information is confidential in nature; its handling is often constrained by laws and regulations. Log information needs to be securely stored and appropriate guidelines followed for its analysis. Anonymization of personal information within individual entries helps, but it is generally not sufficient to prevent real log traces from being re-identified based on correlation with other access characteristics. As such, access traces that are keyed to a specific client are unsafe to publish even if the key is pseudonymous.

HTTP 日志信息本质上属于机密信息;对它的处理通常受到法律法规的约束。日志信息需要被安全地存储,对它们的分析也需要遵循适当的准则。对单条日志记录中的个人信息进行匿名化是有帮助的,但这样做通常不足以阻止通过关联其他访问特征来重新识别出真实的日志痕迹。因此,公布与某个具体客户端相关联的访问痕迹(access traces)是不安全的,即使所用的标识只是一个化名。

To minimize the risk of theft or accidental publication, log information ought to be purged of personally identifiable information, including user identifiers, IP addresses, and user-provided query parameters, as soon as that information is no longer necessary to support operational needs for security, auditing, or fraud control.

为了将日志信息被窃取或者意外公布的风险降到最低,一旦不再需要用这些信息来支持安全、审计或者防欺诈(fraud control)方面的运营需求,就应当尽快清除日志信息里的个人可识别信息(personally identifiable information),包括用户标识符(user identifiers)、IP 地址,以及用户提供的查询参数(query parameters)。
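
The purging step can be illustrated with a small Python sketch that keeps only the non-identifying parts of a log entry. The entry layout (keys such as `time`, `client_ip`, `user`, `target`) and the function name `scrub_entry` are assumptions made for this example; real log formats differ, and when to purge is an operational decision outside the scope of the sketch.

```python
from urllib.parse import urlsplit


def scrub_entry(entry: dict) -> dict:
    """Return a copy of a log entry without personally identifiable fields."""
    # Keep only the path: the query component may hold user-provided data.
    path = urlsplit(entry["target"]).path
    # The user identifier and IP address are dropped by not copying them.
    return {
        "time": entry["time"],
        "method": entry["method"],
        "path": path,
        "status": entry["status"],
    }
```

As the text above notes, removing these fields reduces risk but does not by itself guarantee that the remaining traces cannot be re-identified through correlation.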

10. 鸣谢 / Acknowledgments

This edition of HTTP/1.1 builds on the many contributions that went into RFC 1945, RFC 2068, RFC 2145, and RFC 2616, including substantial contributions made by the previous authors, editors, and Working Group Chairs: Tim Berners-Lee, Ari Luotonen, Roy T. Fielding, Henrik Frystyk Nielsen, Jim Gettys, Jeffrey C. Mogul, Larry Masinter, and Paul J. Leach. Mark Nottingham oversaw this effort as Working Group Chair.

Since 1999, the following contributors have helped improve the HTTP specification by reporting bugs, asking smart questions, drafting or reviewing text, and evaluating open issues:

Adam Barth, Adam Roach, Addison Phillips, Adrian Chadd, Adrian Cole, Adrien W. de Croy, Alan Ford, Alan Ruttenberg, Albert Lunde, Alek Storm, Alex Rousskov, Alexandre Morgaut, Alexey Melnikov, Alisha Smith, Amichai Rothman, Amit Klein, Amos Jeffries, Andreas Maier, Andreas Petersson, Andrei Popov, Anil Sharma, Anne van Kesteren, Anthony Bryan, Asbjorn Ulsberg, Ashok Kumar, Balachander Krishnamurthy, Barry Leiba, Ben Laurie, Benjamin Carlyle, Benjamin Niven-Jenkins, Benoit Claise, Bil Corry, Bill Burke, Bjoern Hoehrmann, Bob Scheifler, Boris Zbarsky, Brett Slatkin, Brian Kell, Brian McBarron, Brian Pane, Brian Raymor, Brian Smith, Bruce Perens, Bryce Nesbitt, Cameron Heavon-Jones, Carl Kugler, Carsten Bormann, Charles Fry, Chris Burdess, Chris Newman, Christian Huitema, Cyrus Daboo, Dale Robert Anderson, Dan Wing, Dan Winship, Daniel Stenberg, Darrel Miller, Dave Cridland, Dave Crocker, Dave Kristol, Dave Thaler, David Booth, David Singer, David W. Morris, Diwakar Shetty, Dmitry Kurochkin, Drummond Reed, Duane Wessels, Edward Lee, Eitan Adler, Eliot Lear, Emile Stephan, Eran Hammer-Lahav, Eric D. Williams, Eric J. Bowman, Eric Lawrence, Eric Rescorla, Erik Aronesty, EungJun Yi, Evan Prodromou, Felix Geisendoerfer, Florian Weimer, Frank Ellermann, Fred Akalin, Fred Bohle, Frederic Kayser, Gabor Molnar, Gabriel Montenegro, Geoffrey Sneddon, Gervase Markham, Gili Tzabari, Grahame Grieve, Greg Slepak, Greg Wilkins, Grzegorz Calkowski, Harald Tveit Alvestrand, Harry Halpin, Helge Hess, Henrik Nordstrom, Henry S. Thompson, Henry Story, Herbert van de Sompel, Herve Ruellan, Howard Melman, Hugo Haas, Ian Fette, Ian Hickson, Ido Safruti, Ilari Liusvaara, Ilya Grigorik, Ingo Struck, J. Ross Nicoll, James Cloos, James H. Manger, James Lacey, James M. Snell, Jamie Lokier, Jan Algermissen, Jari Arkko, Jeff Hodges (who came up with the term 'effective Request-URI'), Jeff Pinner, Jeff Walden, Jim Luther, Jitu Padhye, Joe D. Williams, Joe Gregorio, Joe Orton, Joel Jaeggli, John C. Klensin, John C. Mallery, John Cowan, John Kemp, John Panzer, John Schneider, John Stracke, John Sullivan, Jonas Sicking, Jonathan A. Rees, Jonathan Billington, Jonathan Moore, Jonathan Silvera, Jordi Ros, Joris Dobbelsteen, Josh Cohen, Julien Pierre, Jungshik Shin, Justin Chapweske, Justin Erenkrantz, Justin James, Kalvinder Singh, Karl Dubost, Kathleen Moriarty, Keith Hoffman, Keith Moore, Ken Murchison, Koen Holtman, Konstantin Voronkov, Kris Zyp, Leif Hedstrom, Lionel Morand, Lisa Dusseault, Maciej Stachowiak, Manu Sporny, Marc Schneider, Marc Slemko, Mark Baker, Mark Pauley, Mark Watson, Markus Isomaki, Markus Lanthaler, Martin J. Duerst, Martin Musatov, Martin Nilsson, Martin Thomson, Matt Lynch, Matthew Cox, Matthew Kerwin, Max Clark, Menachem Dodge, Meral Shirazipour, Michael Burrows, Michael Hausenblas, Michael Scharf, Michael Sweet, Michael Tuexen, Michael Welzl, Mike Amundsen, Mike Belshe, Mike Bishop, Mike Kelly, Mike Schinkel, Miles Sabin, Murray S. Kucherawy, Mykyta Yevstifeyev, Nathan Rixham, Nicholas Shanks, Nico Williams, Nicolas Alvarez, Nicolas Mailhot, Noah Slater, Osama Mazahir, Pablo Castro, Pat Hayes, Patrick R. McManus, Paul E. Jones, Paul Hoffman, Paul Marquess, Pete Resnick, Peter Lepeska, Peter Occil, Peter Saint-Andre, Peter Watkins, Phil Archer, Phil Hunt, Philippe Mougin, Phillip Hallam-Baker, Piotr Dobrogost, Poul-Henning Kamp, Preethi Natarajan, Rajeev Bector, Ray Polk, Reto Bachmann-Gmuer, Richard Barnes, Richard Cyganiak, Rob Trace, Robby Simpson, Robert Brewer, Robert Collins, Robert Mattson, Robert O'Callahan, Robert Olofsson, Robert Sayre, Robert Siemer, Robert de Wilde, Roberto Javier Godoy, Roberto Peon, Roland Zink, Ronny Widjaja, Ryan Hamilton, S. Mike Dierken, Salvatore Loreto, Sam Johnston, Sam Pullara, Sam Ruby, Saurabh Kulkarni, Scott Lawrence (who maintained the original issues list), Sean B. Palmer, Sean Turner, Sebastien Barnoud, Shane McCarron, Shigeki Ohtsu, Simon Yarde, Stefan Eissing, Stefan Tilkov, Stefanos Harhalakis, Stephane Bortzmeyer, Stephen Farrell, Stephen Kent, Stephen Ludin, Stuart Williams, Subbu Allamaraju, Subramanian Moonesamy, Susan Hares, Sylvain Hellegouarch, Tapan Divekar, Tatsuhiro Tsujikawa, Tatsuya Hayashi, Ted Hardie, Ted Lemon, Thomas Broyer, Thomas Fossati, Thomas Maslen, Thomas Nadeau, Thomas Nordin, Thomas Roessler, Tim Bray, Tim Morgan, Tim Olsen, Tom Zhou, Travis Snoozy, Tyler Close, Vincent Murphy, Wenbo Zhu, Werner Baumann, Wilbur Streett, Wilfredo Sanchez Vega, William A. Rowe Jr., William Chan, Willy Tarreau, Xiaoshu Wang, Yaron Goland, Yngve Nysaeter Pettersen, Yoav Nir, Yogesh Bang, Yuchung Cheng, Yutaka Oiwa, Yves Lafon (long-time member of the editor team), Zed A. Shaw, and Zhong Yu.

See Section 16 of [RFC2616] for additional acknowledgements from prior revisions.

11. 参考资料 / References

11.1. 规范性参考资料 / Normative References

[RFC0793]
Postel, J., “Transmission Control Protocol”, STD 7, RFC 793, September 1981.
[RFC1950]
Deutsch, L. and J-L. Gailly, “ZLIB Compressed Data Format Specification version 3.3”, RFC 1950, May 1996.
[RFC1951]
Deutsch, P., “DEFLATE Compressed Data Format Specification version 1.3”, RFC 1951, May 1996.
[RFC1952]
Deutsch, P., Gailly, J-L., Adler, M., Deutsch, L., and G. Randers-Pehrson, “GZIP file format specification version 4.3”, RFC 1952, May 1996.
[RFC2119]
Bradner, S., “Key words for use in RFCs to Indicate Requirement Levels”, BCP 14, RFC 2119, March 1997.
[RFC3986]
Berners-Lee, T., Fielding, R., and L. Masinter, “Uniform Resource Identifier (URI): Generic Syntax”, STD 66, RFC 3986, January 2005.
[RFC5234]
Crocker, D., Ed. and P. Overell, “Augmented BNF for Syntax Specifications: ABNF”, STD 68, RFC 5234, January 2008.
[RFC7231]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content”, RFC 7231, June 2014.
[RFC7232]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests”, RFC 7232, June 2014.
[RFC7233]
Fielding, R., Ed., Lafon, Y., Ed., and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Range Requests”, RFC 7233, June 2014.
[RFC7234]
Fielding, R., Ed., Nottingham, M., Ed., and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Caching”, RFC 7234, June 2014.
[RFC7235]
Fielding, R., Ed. and J. Reschke, Ed., “Hypertext Transfer Protocol (HTTP/1.1): Authentication”, RFC 7235, June 2014.
[USASCII]
American National Standards Institute, “Coded Character Set – 7-bit American Standard Code for Information Interchange”, ANSI X3.4, 1986.
[Welch]
Welch, T., “A Technique for High-Performance Data Compression”, IEEE Computer 17(6), June 1984.

11.2. 信息性参考资料 / Informative References

[BCP115]
Hansen, T., Hardie, T., and L. Masinter, “Guidelines and Registration Procedures for New URI Schemes”, BCP 115, RFC 4395, February 2006.
[BCP13]
Freed, N., Klensin, J., and T. Hansen, “Media Type Specifications and Registration Procedures”, BCP 13, RFC 6838, January 2013.
[BCP90]
Klyne, G., Nottingham, M., and J. Mogul, “Registration Procedures for Message Header Fields”, BCP 90, RFC 3864, September 2004.
[Georgiev]
Georgiev, M., Iyengar, S., Jana, S., Anubhai, R., Boneh, D., and V. Shmatikov, “The Most Dangerous Code in the World: Validating SSL Certificates in Non-browser Software”, In Proceedings of the 2012 ACM Conference on Computer and Communications Security (CCS '12), pp. 38-49, October 2012, http://doi.acm.org/10.1145/2382196.2382204.
[ISO-8859-1]
International Organization for Standardization, “Information technology – 8-bit single-byte coded graphic character sets – Part 1: Latin alphabet No. 1”, ISO/IEC 8859-1:1998, 1998.
[Klein]
Klein, A., “Divide and Conquer - HTTP Response Splitting, Web Cache Poisoning Attacks, and Related Topics”, March 2004, http://packetstormsecurity.com/papers/general/whitepaper_httpresponse.pdf.
[Kri2001]
Kristol, D., “HTTP Cookies: Standards, Privacy, and Politics”, ACM Transactions on Internet Technology 1(2), November 2001, http://arxiv.org/abs/cs.SE/0105018.
[Linhart]
Linhart, C., Klein, A., Heled, R., and S. Orrin, “HTTP Request Smuggling”, June 2005, http://www.watchfire.com/news/whitepapers.aspx.
[RFC1919]
Chatel, M., “Classical versus Transparent IP Proxies”, RFC 1919, March 1996.
[RFC1945]
Berners-Lee, T., Fielding, R., and H. Nielsen, “Hypertext Transfer Protocol – HTTP/1.0”, RFC 1945, May 1996.
[RFC2045]
Freed, N. and N. Borenstein, “Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies”, RFC 2045, November 1996.
[RFC2047]
Moore, K., “MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text”, RFC 2047, November 1996.
[RFC2068]
Fielding, R., Gettys, J., Mogul, J., Nielsen, H., and T. Berners-Lee, “Hypertext Transfer Protocol – HTTP/1.1”, RFC 2068, January 1997.
[RFC2145]
Mogul, J., Fielding, R., Gettys, J., and H. Nielsen, “Use and Interpretation of HTTP Version Numbers”, RFC 2145, May 1997.
[RFC2616]
Fielding, R., Gettys, J., Mogul, J., Frystyk, H., Masinter, L., Leach, P., and T. Berners-Lee, “Hypertext Transfer Protocol – HTTP/1.1”, RFC 2616, June 1999.
[RFC2817]
Khare, R. and S. Lawrence, “Upgrading to TLS Within HTTP/1.1”, RFC 2817, May 2000.
[RFC2818]
Rescorla, E., “HTTP Over TLS”, RFC 2818, May 2000.
[RFC3040]
Cooper, I., Melve, I., and G. Tomlinson, “Internet Web Replication and Caching Taxonomy”, RFC 3040, January 2001.
[RFC4033]
Arends, R., Austein, R., Larson, M., Massey, D., and S. Rose, “DNS Security Introduction and Requirements”, RFC 4033, March 2005.
[RFC4559]
Jaganathan, K., Zhu, L., and J. Brezak, “SPNEGO-based Kerberos and NTLM HTTP Authentication in Microsoft Windows”, RFC 4559, June 2006.
[RFC5226]
Narten, T. and H. Alvestrand, “Guidelines for Writing an IANA Considerations Section in RFCs”, BCP 26, RFC 5226, May 2008.
[RFC5246]
Dierks, T. and E. Rescorla, “The Transport Layer Security (TLS) Protocol Version 1.2”, RFC 5246, August 2008.
[RFC5322]
Resnick, P., “Internet Message Format”, RFC 5322, October 2008.
[RFC6265]
Barth, A., “HTTP State Management Mechanism”, RFC 6265, April 2011.
[RFC6585]
Nottingham, M. and R. Fielding, “Additional HTTP Status Codes”, RFC 6585, April 2012.

附录 A:HTTP 版本历史 / Appendix A. HTTP Version History

HTTP has been in use since 1990. The first version, later referred to as HTTP/0.9, was a simple protocol for hypertext data transfer across the Internet, using only a single request method (GET) and no metadata. HTTP/1.0, as defined by [RFC1945], added a range of request methods and MIME-like messaging, allowing for metadata to be transferred and modifiers placed on the request/response semantics. However, HTTP/1.0 did not sufficiently take into consideration the effects of hierarchical proxies, caching, the need for persistent connections, or name-based virtual hosts. The proliferation of incompletely implemented applications calling themselves "HTTP/1.0" further necessitated a protocol version change in order for two communicating applications to determine each other's true capabilities.

HTTP 自 1990 年开始被使用。第一个版本,也就是后来被称为 HTTP/0.9 的版本,是一个用于在互联网中传输超文本数据的简单协议,只使用单一的请求方法(GET),而且不带有元数据。HTTP/1.0,由【RFC1945】所定义,新增了一系列的请求方法以及与 MIME 相类似的消息通信机制,允许传输元数据,并允许为请求/响应语义添加修饰符。但是,HTTP/1.0 并没有充分考虑多层代理、缓存、对持久连接的需要,以及基于名称的虚拟主机(name-based virtual hosts)所带来的影响。大量没有完整实现协议却自称为 "HTTP/1.0" 的应用不断扩散,进一步使得协议版本必须变更,以便两个正在通信的应用能够确定彼此的真实能力。

HTTP/1.1 remains compatible with HTTP/1.0 by including more stringent requirements that enable reliable implementations, adding only those features that can either be safely ignored by an HTTP/1.0 recipient or only be sent when communicating with a party advertising conformance with HTTP/1.1.

HTTP/1.1 仍保留了对 HTTP/1.0 的兼容:它包含更严格的要求来让实现(implementations)更加可靠,并且仅添加那些能够被 HTTP/1.0 接收端安全忽略的功能,或者只在对方声明其遵循 HTTP/1.1 的时候才会发送的功能。

HTTP/1.1 has been designed to make supporting previous versions easy. A general-purpose HTTP/1.1 server ought to be able to understand any valid request in the format of HTTP/1.0, responding appropriately with an HTTP/1.1 message that only uses features understood (or safely ignored) by HTTP/1.0 clients. Likewise, an HTTP/1.1 client can be expected to understand any valid HTTP/1.0 response.

HTTP/1.1 已被设计为使支持以前的版本变得容易。一个通用的 HTTP/1.1 服务器应该能够理解任何合法的 HTTP/1.0 格式的请求消息,并以 HTTP/1.1 消息正确响应,并且这个响应消息只使用 HTTP/1.0 客户端所能理解(或者安全忽略)的功能。同样,一个 HTTP/1.1 客户端应该能够理解任何合法的 HTTP/1.0 响应。

Since HTTP/0.9 did not support header fields in a request, there is no mechanism for it to support name-based virtual hosts (selection of resource by inspection of the Host header field). Any server that implements name-based virtual hosts ought to disable support for HTTP/0.9. Most requests that appear to be HTTP/0.9 are, in fact, badly constructed HTTP/1.x requests caused by a client failing to properly encode the request-target.

因为 HTTP/0.9 并不支持在请求消息中带有头字段,因此,没有任何机制能让它支持基于名称的虚拟主机(name-based virtual hosts,即通过检查 Host 头字段来选择资源)。任何实现了基于名称的虚拟主机的服务器都应该禁用对 HTTP/0.9 的支持。大多数看起来像是 HTTP/0.9 的请求消息,实际上是构造不当的 HTTP/1.x 请求消息,其原因是客户端没有正确地对 request-target 进行编码。

A.1. 相对 HTTP/1.0 的变化 / Changes from HTTP/1.0

This section summarizes major differences between versions HTTP/1.0 and HTTP/1.1.

本章节总结了 HTTP/1.0 与 HTTP/1.1 的主要区别。

A.1.1. 多宿主网络服务器 / Multihomed Web Servers

The requirements that clients and servers support the Host header field (Section 5.4), report an error if it is missing from an HTTP/1.1 request, and accept absolute URIs (Section 5.3) are among the most important changes defined by HTTP/1.1.

HTTP/1.1 与 HTTP/1.0 最重要的改变有如下:要求客户端和服务器支持 Host 头字段(章节 5.4),如果在一个 HTTP/1.1 请求消息中没有 Host 的话要报告错误;接受绝对 URI(absolute URIs,章节 5.3)。

译注:Multihomed 译作“多宿主”或“多重地址”。

Older HTTP/1.0 clients assumed a one-to-one relationship of IP addresses and servers; there was no other established mechanism for distinguishing the intended server of a request than the IP address to which that request was directed. The Host header field was introduced during the development of HTTP/1.1 and, though it was quickly implemented by most HTTP/1.0 browsers, additional requirements were placed on all HTTP/1.1 requests in order to ensure complete adoption. At the time of this writing, most HTTP-based services are dependent upon the Host header field for targeting requests.

旧的 HTTP/1.0 客户端会假定 IP 地址与服务器是一对一的关系;除了请求所发往的 IP 地址以外,没有任何其他既定的机制可以用来区分一个请求所期望到达的服务器。Host 头字段是在 HTTP/1.1 的制定过程中引入的,虽然它很快就被大多数 HTTP/1.0 浏览器所实现,但为了确保它被完整采用,仍对所有 HTTP/1.1 请求消息增加了额外的要求。在写这份文档的时候,大多数基于 HTTP 的服务都依赖 Host 头字段来确定请求的目标。

A.1.2. Keep-Alive 连接 / Keep-Alive Connections

In HTTP/1.0, each connection is established by the client prior to the request and closed by the server after sending the response. However, some implementations implement the explicitly negotiated ("Keep-Alive") version of persistent connections described in Section 19.7.1 of [RFC2068].

在 HTTP/1.0 里,每个连接由客户端在发送请求之前建立,并由服务器在发送响应之后关闭。但是,某些实现(implementations)实现了【RFC2068】章节 19.7.1 所描述的、经过明确协商("Keep-Alive")的持久连接版本。

Some clients and servers might wish to be compatible with these previous approaches to persistent connections, by explicitly negotiating for them with a "Connection: keep-alive" request header field. However, some experimental implementations of HTTP/1.0 persistent connections are faulty; for example, if an HTTP/1.0 proxy server doesn't understand Connection, it will erroneously forward that header field to the next inbound server, which would result in a hung connection.

某些客户端和服务器可能希望兼容上述这种持久连接的实现方式,通过带有一个 "Connection: keep-alive" 请求头字段来明确地进行协商。但是,HTTP/1.0 持久连接的某些实验性实现(experimental implementations)是有缺陷的,例如,如果一个 HTTP/1.0 代理服务器并不理解 Connection 头字段,它就会错误地将该头字段转发到下一个入站服务器,这样会导致连接挂起。

One attempted solution was the introduction of a Proxy-Connection header field, targeted specifically at proxies. In practice, this was also unworkable, because proxies are often deployed in multiple layers, bringing about the same problem discussed above.

一个曾经尝试过的解决方案是引入一个专门针对代理的 Proxy-Connection 头字段。实际上这样做也行不通,因为代理通常是多层部署的,同样会带来上述问题。

As a result, clients are encouraged not to send the Proxy-Connection header field in any requests.

所以,鼓励客户端不要在任何请求消息中带有 Proxy-Connection 头字段。

Clients are also encouraged to consider the use of Connection: keep-alive in requests carefully; while they can enable persistent connections with HTTP/1.0 servers, clients using them will need to monitor the connection for "hung" requests (which indicate that the client ought stop sending the header field), and this mechanism ought not be used by clients at all when a proxy is being used.

同样,鼓励客户端在请求消息中谨慎考虑是否使用 Connection: keep-alive;虽然它能够启用与 HTTP/1.0 服务器的持久连接,但使用它的客户端需要监视连接上是否存在被“挂起”(hung)的请求(这表明客户端应该停止发送该头字段),并且在使用了代理的情况下,客户端完全不应该使用这种机制。

A.1.3. 引入 Transfer-Encoding / Introduction of Transfer-Encoding

HTTP/1.1 introduces the Transfer-Encoding header field (Section 3.3.1). Transfer codings need to be decoded prior to forwarding an HTTP message over a MIME-compliant protocol.

HTTP/1.1 引入了 Transfer-Encoding 头字段(章节 3.3.1)。传输编码值需要在 HTTP 消息在某个遵循 MIME 协议的连接上转发之前进行解码。

A.2. 相对 RFC 2616 的变化 / Changes from RFC 2616

HTTP's approach to error handling has been explained. (Section 2.5)

说明了 HTTP 的错误处理方法。(章节 2.5)

The HTTP-version ABNF production has been clarified to be case-sensitive. Additionally, version numbers have been restricted to single digits, due to the fact that implementations are known to handle multi-digit version numbers incorrectly. (Section 2.6)

明确了 HTTP-version ABNF 规则为区分大小写。另外,限制了版本号的数值为一位数字(single digits),这是因为已知有些实现(implementations)不能正确处理多位数字的版本号。(章节 2.6)
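
The two constraints on HTTP-version (case-sensitive name, single-digit major and minor numbers) map directly onto a small regular expression. The Python sketch below uses a hypothetical function name, `parse_http_version`, and is meant only to illustrate the rule.

```python
import re

# "HTTP" must be uppercase, and each version number is exactly one ASCII
# digit; [0-9] is used instead of \d so non-ASCII digits are not accepted.
_HTTP_VERSION = re.compile(r"HTTP/([0-9])\.([0-9])")


def parse_http_version(token: str):
    """Return (major, minor) for a valid HTTP-version token."""
    match = _HTTP_VERSION.fullmatch(token)
    if match is None:
        raise ValueError("invalid HTTP-version: %r" % token)
    return int(match.group(1)), int(match.group(2))
```

Under this rule, tokens such as "http/1.1" and "HTTP/1.10" are rejected rather than normalized.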

Userinfo (i.e., username and password) are now disallowed in HTTP and HTTPS URIs, because of security issues related to their transmission on the wire. (Section 2.7.1)

现在,不再允许在 HTTP 和 HTTPS URI 中出现 userinfo(也就是用户名和密码)了,因为它在通信线路上传输时存在安全问题。(章节 2.7.1)

The HTTPS URI scheme is now defined by this specification; previously, it was done in Section 2.4 of [RFC2818]. Furthermore, it implies end-to-end security. (Section 2.7.2)

现在,HTTPS URI 方案已经被定义在本规范中;以前,它是定义在【RFC2818】章节 2.4 中。此外,HTTPS 意味着端到端安全。(章节 2.7.2)

HTTP messages can be (and often are) buffered by implementations; despite it sometimes being available as a stream, HTTP is fundamentally a message-oriented protocol. Minimum supported sizes for various protocol elements have been suggested, to improve interoperability. (Section 3)

HTTP 消息能够(而且通常会)被实现缓冲(buffer);尽管它有时候能够以流的形式来提供,HTTP 本质上来说是一种面向消息的协议(message-oriented protocol)。为了提高互操作性,本规范建议了各种协议元素的最小可支持的大小。(章节 3)

Invalid whitespace around field-names is now required to be rejected, because accepting it represents a security vulnerability. The ABNF productions defining header fields now only list the field value. (Section 3.2)

现在,要求拒绝字段名称(field-names)两端出现的非法空白字符,这是因为接受它们就意味着安全隐患。现在,定义头字段的 ABNF 规则只列出字段值(field value)。(章节 3.2)

译注:【RFC2616】中消息字段的 ABNF 规则为 message-header;本规范中头字段的 ABNF 规则为 header-field。

Rules about implicit linear whitespace between certain grammar productions have been removed; now whitespace is only allowed where specifically defined in the ABNF. (Section 3.2.3)

移除了在某些语法产生式之间隐含线性空白(linear whitespace)的规则;现在,空白只允许出现在 ABNF 中明确定义的位置。(章节 3.2.3)

Header fields that span multiple lines ("line folding") are deprecated. (Section 3.2.4)

废弃了横跨多行(即“折叠行”,line folding)的头字段。(章节 3.2.4)

The NUL octet is no longer allowed in comment and quoted-string text, and handling of backslash-escaping in them has been clarified. The quoted-pair rule no longer allows escaping control characters other than HTAB. Non-US-ASCII content in header fields and the reason phrase has been obsoleted and made opaque (the TEXT rule was removed). (Section 3.2.6)

NUL 字节不再允许出现在 comment(注释)和 quoted-string(以双引号包裹的字符串)里,同时,明确了在这两者中使用反斜杠转义的处理方式。quoted-pair 规则不再允许转义除 HTAB 以外的控制字符(Control Characters)。在头字段(header fields)和 reason-phrase 中存在的非 US-ASCII 的内容已被废弃,并被视为不透明数据(移除了 TEXT 规则)。(章节 3.2.6)

译注:关于不透明数据的解释见章节 3.2.4

Bogus Content-Length header fields are now required to be handled as errors by recipients. (Section 3.3.2)

现在,要求接收端将无效的 Content-Length 头字段作为错误来处理。(章节 3.3.2)

The algorithm for determining the message body length has been clarified to indicate all of the special cases (e.g., driven by methods or status codes) that affect it, and that new protocol elements cannot define such special cases. CONNECT is a new, special case in determining message body length. "multipart/byteranges" is no longer a way of determining message body length detection. (Section 3.3.3)

明确了确定消息体长度的算法,指出了影响它的所有特殊情况(例如,由方法或状态码决定),并且,新的协议元素不能再定义这种特殊情况。CONNECT 是新加入的影响消息体长度的特殊情况。"multipart/byteranges" 不再是一种确定消息体长度的方式。(章节 3.3.3)

The "identity" transfer coding token has been removed. (Sections 3.3 and 4)

Chunk length does not include the count of the octets in the chunk header and trailer. Line folding in chunk extensions is disallowed. (Section 4.1)
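
To make the length rule concrete, here is a small decoder sketch for a chunked body that is already fully in memory. The function name is hypothetical. Chunk extensions are skipped, and the trailer-part is neither parsed nor returned.

```python
import re

HEX = re.compile(rb"[0-9A-Fa-f]+")  # chunk-size = 1*HEXDIG


def decode_chunked(data: bytes) -> bytes:
    """Decode a complete chunked-body held in memory (hypothetical helper)."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)  # raises ValueError if absent
        # Anything after ";" on the size line is a chunk extension.
        size_text = data[pos:line_end].split(b";", 1)[0]
        if not HEX.fullmatch(size_text):
            raise ValueError("malformed chunk-size")
        size = int(size_text, 16)
        pos = line_end + 2
        if size == 0:  # last-chunk; the trailer-part is ignored here
            return bytes(body)
        # size counts chunk-data only, not the size line or the
        # CRLF that follows the data.
        chunk = data[pos:pos + size]
        if len(chunk) != size or data[pos + size:pos + size + 2] != b"\r\n":
            raise ValueError("truncated or malformed chunk")
        body += chunk
        pos += size + 2


wire = b"4\r\nWiki\r\n5;note=x\r\npedia\r\n0\r\n\r\n"
print(decode_chunked(wire))
```

In the example, the sizes 4 and 5 cover only "Wiki" and "pedia"; the size lines, the extension, and the CRLF delimiters are not counted.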

The meaning of the "deflate" content coding has been clarified. (Section 4.2.2)

The segment and query components of RFC 3986 have been used to define the request-target, instead of abs_path from RFC 1808. The asterisk-form of the request-target is only allowed with the OPTIONS method. (Section 5.3)
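
The four request-target forms and their method restrictions can be sketched as follows. The function is hypothetical and deliberately shallow: it looks only at the method and the leading characters of the target, and does not validate the target against the RFC 3986 grammar.

```python
def request_target_form(method: str, target: str) -> str:
    """Name the request-target form, or raise ValueError (hypothetical helper)."""
    if target == "*":
        if method != "OPTIONS":
            raise ValueError("asterisk-form is only allowed with OPTIONS")
        return "asterisk-form"
    if method == "CONNECT":
        return "authority-form"  # e.g. "www.example.com:80"
    if target.startswith("/"):
        return "origin-form"  # absolute-path [ "?" query ]
    scheme, colon, _rest = target.partition(":")
    if colon and scheme.isascii() and scheme[:1].isalpha():
        return "absolute-form"  # e.g. "http://www.example.org/"
    raise ValueError("unrecognized request-target")


print(request_target_form("GET", "/where?q=now"))
print(request_target_form("OPTIONS", "*"))
```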

The term "Effective Request URI" has been introduced. (Section 5.5)

Gateways do not need to generate Via header fields anymore. (Section 5.7.1)

Exactly when "close" connection options have to be sent has been clarified. Also, "hop-by-hop" header fields are required to appear in the Connection header field; just because they're defined as hop-by-hop in this specification doesn't exempt them. (Section 6.1)
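
For an intermediary, the practical consequence is that the Connection field value alone decides which fields are removed before forwarding. The sketch below assumes headers are held as a list of (name, value) pairs; the function name is hypothetical, and the comma split does not handle anything beyond plain tokens.

```python
def strip_hop_by_hop(headers: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Drop Connection and every field it nominates before forwarding."""
    nominated = {"connection"}
    for name, value in headers:
        if name.lower() == "connection":
            for option in value.split(","):
                option = option.strip(" \t").lower()
                if option:
                    nominated.add(option)
    return [(n, v) for n, v in headers if n.lower() not in nominated]


received = [
    ("Host", "example.org"),
    ("Connection", "keep-alive, X-Trace"),
    ("Keep-Alive", "timeout=5"),
    ("X-Trace", "abc"),
    ("Accept", "*/*"),
]
print(strip_hop_by_hop(received))
```

A field such as Keep-Alive is removed here only because the sender listed it in Connection; the sketch keeps no built-in list of hop-by-hop names.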

The limit of two connections per server has been removed. An idempotent sequence of requests is no longer required to be retried. The requirement to retry requests under certain circumstances when the server prematurely closes the connection has been removed. Also, some extraneous requirements about when servers are allowed to close connections prematurely have been removed. (Section 6.3)

The semantics of the Upgrade header field is now defined in responses other than 101 (this was incorporated from [RFC2817]). Furthermore, the ordering in the field value is now significant. (Section 6.7)

Empty list elements in list productions (e.g., a list header field containing ", ,") have been deprecated. (Section 7)
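
Deprecation here means that senders are not to generate empty elements while recipients still tolerate them. A minimal sketch of both sides follows, with hypothetical function names. The parser splits naively on commas, so it is only suitable for lists of plain tokens and would mis-split a quoted-string that contains a comma.

```python
def parse_list(field_value: str) -> list[str]:
    """Split a comma-separated field value, ignoring empty elements."""
    elements = []
    for part in field_value.split(","):
        part = part.strip(" \t")
        if part:
            elements.append(part)
    return elements


def generate_list(elements: list[str]) -> str:
    """Join elements the way a sender is expected to: no empty items."""
    return ", ".join(e for e in elements if e)


print(parse_list("foo , ,bar,"))
print(generate_list(["foo", "", "bar"]))
```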

Registration of Transfer Codings now requires IETF Review. (Section 8.4)

This specification now defines the Upgrade Token Registry, previously defined in Section 7.2 of [RFC2817]. (Section 8.6)

The expectation to support HTTP/0.9 requests has been removed. (Appendix A)

Issues with the Keep-Alive and Proxy-Connection header fields in requests are pointed out, with use of the latter being discouraged altogether. (Appendix A.1.2)

Appendix B. Collected ABNF

BWS = OWS

Connection = *( "," OWS ) connection-option *( OWS "," [ OWS
 connection-option ] )
Content-Length = 1*DIGIT

HTTP-message = start-line *( header-field CRLF ) CRLF [ message-body
 ]
HTTP-name = %x48.54.54.50 ; HTTP
HTTP-version = HTTP-name "/" DIGIT "." DIGIT
Host = uri-host [ ":" port ]

OWS = *( SP / HTAB )

RWS = 1*( SP / HTAB )

TE = [ ( "," / t-codings ) *( OWS "," [ OWS t-codings ] ) ]
Trailer = *( "," OWS ) field-name *( OWS "," [ OWS field-name ] )
Transfer-Encoding = *( "," OWS ) transfer-coding *( OWS "," [ OWS
 transfer-coding ] )

URI-reference = <URI-reference, see [RFC3986], Section 4.1>
Upgrade = *( "," OWS ) protocol *( OWS "," [ OWS protocol ] )

Via = *( "," OWS ) ( received-protocol RWS received-by [ RWS comment
 ] ) *( OWS "," [ OWS ( received-protocol RWS received-by [ RWS
 comment ] ) ] )

absolute-URI = <absolute-URI, see [RFC3986], Section 4.3>
absolute-form = absolute-URI
absolute-path = 1*( "/" segment )
asterisk-form = "*"
authority = <authority, see [RFC3986], Section 3.2>
authority-form = authority

chunk = chunk-size [ chunk-ext ] CRLF chunk-data CRLF
chunk-data = 1*OCTET
chunk-ext = *( ";" chunk-ext-name [ "=" chunk-ext-val ] )
chunk-ext-name = token
chunk-ext-val = token / quoted-string
chunk-size = 1*HEXDIG
chunked-body = *chunk last-chunk trailer-part CRLF
comment = "(" *( ctext / quoted-pair / comment ) ")"
connection-option = token
ctext = HTAB / SP / %x21-27 ; '!'-'''
 / %x2A-5B ; '*'-'['
 / %x5D-7E ; ']'-'~'
 / obs-text

field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
field-name = token
field-value = *( field-content / obs-fold )
field-vchar = VCHAR / obs-text
fragment = <fragment, see [RFC3986], Section 3.5>

header-field = field-name ":" OWS field-value OWS
http-URI = "http://" authority path-abempty [ "?" query ] [ "#"
 fragment ]
https-URI = "https://" authority path-abempty [ "?" query ] [ "#"
 fragment ]

last-chunk = 1*"0" [ chunk-ext ] CRLF

message-body = *OCTET
method = token

obs-fold = CRLF 1*( SP / HTAB )
obs-text = %x80-FF
origin-form = absolute-path [ "?" query ]

partial-URI = relative-part [ "?" query ]
path-abempty = <path-abempty, see [RFC3986], Section 3.3>
port = <port, see [RFC3986], Section 3.2.3>
protocol = protocol-name [ "/" protocol-version ]
protocol-name = token
protocol-version = token
pseudonym = token

qdtext = HTAB / SP / "!" / %x23-5B ; '#'-'['
 / %x5D-7E ; ']'-'~'
 / obs-text
query = <query, see [RFC3986], Section 3.4>
quoted-pair = "\" ( HTAB / SP / VCHAR / obs-text )
quoted-string = DQUOTE *( qdtext / quoted-pair ) DQUOTE

rank = ( "0" [ "." *3DIGIT ] ) / ( "1" [ "." *3"0" ] )
reason-phrase = *( HTAB / SP / VCHAR / obs-text )
received-by = ( uri-host [ ":" port ] ) / pseudonym
received-protocol = [ protocol-name "/" ] protocol-version
relative-part = <relative-part, see [RFC3986], Section 4.2>
request-line = method SP request-target SP HTTP-version CRLF
request-target = origin-form / absolute-form / authority-form /
 asterisk-form

scheme = <scheme, see [RFC3986], Section 3.1>
segment = <segment, see [RFC3986], Section 3.3>
start-line = request-line / status-line
status-code = 3DIGIT
status-line = HTTP-version SP status-code SP reason-phrase CRLF

t-codings = "trailers" / ( transfer-coding [ t-ranking ] )
t-ranking = OWS ";" OWS "q=" rank
tchar = "!" / "#" / "$" / "%" / "&" / "'" / "*" / "+" / "-" / "." /
 "^" / "_" / "`" / "|" / "~" / DIGIT / ALPHA
token = 1*tchar
trailer-part = *( header-field CRLF )
transfer-coding = "chunked" / "compress" / "deflate" / "gzip" /
 transfer-extension
transfer-extension = token *( OWS ";" OWS transfer-parameter )
transfer-parameter = token BWS "=" BWS ( token / quoted-string )

uri-host = <host, see [RFC3986], Section 3.2.2>
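
As a rough illustration of how these productions translate into a parser, the sketch below transcribes tchar, token, request-line, and header-field into regular expressions. It is a simplification under stated assumptions: lines are passed without their trailing CRLF, the request-target is only checked to be free of whitespace, the field value accepts any run of HTAB, SP, VCHAR, and obs-text, and obs-fold is not accepted. The function names are hypothetical.

```python
import re

# tchar and token, transcribed from the collected ABNF.
TOKEN = rb"[!#$%&'*+\-.^_`|~0-9A-Za-z]+"

# request-line without the trailing CRLF.
REQUEST_LINE = re.compile(
    rb"(" + TOKEN + rb") ([^ \t\r\n]+) HTTP/([0-9])\.([0-9])"
)

# header-field without the trailing CRLF: no whitespace is allowed
# between field-name and ":", and OWS around the value is dropped.
HEADER_FIELD = re.compile(
    rb"(" + TOKEN + rb"):[ \t]*([\t\x20-\x7e\x80-\xff]*?)[ \t]*"
)


def parse_request_line(line: bytes):
    """Return (method, target, (major, minor)) or raise ValueError."""
    m = REQUEST_LINE.fullmatch(line)
    if m is None:
        raise ValueError("malformed request-line")
    method, target, major, minor = m.groups()
    return method.decode("ascii"), target.decode("latin-1"), (int(major), int(minor))


def parse_header_field(line: bytes):
    """Return (field-name, field-value) or raise ValueError."""
    m = HEADER_FIELD.fullmatch(line)
    if m is None:
        raise ValueError("malformed header-field")
    return m.group(1).decode("ascii"), m.group(2).decode("latin-1")


print(parse_request_line(b"GET /hello.txt HTTP/1.1"))
print(parse_header_field(b"Host:  www.example.com \t"))
```

Because the field-name must be followed immediately by ":", a line such as "Host : www.example.com" is rejected, in line with the whitespace rule of Section 3.2.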

Index

  • A
    • absolute-form (of request-target) 5.3.2
    • accelerator 2.3
    • application/http Media Type 8.3.2
    • asterisk-form (of request-target) 5.3.4
    • authoritative response 9.1
    • authority-form (of request-target) 5.3.3
  • B
    • BCP115 8.2, 11.2
    • BCP13 8.3, 11.2
    • BCP90 8.1, 11.2
    • browser 2.1
  • C
    • cache 2.4
    • cacheable 2.4
    • captive portal 2.3
    • chunked (Coding Format) 3.3.1, 3.3.3, 4.1
    • client 2.1
    • close 3.2.1, 4.3, 5.7, 6.1, 6.1, 6.3.2, 6.6, 6.6, 6.7, 8.1, 8.1, A.2
    • compress (Coding Format) 4.2.1
    • connection 2.1
    • Connection header field 3.2.1, 4.3, 5.7, 6.1, 6.1, 6.3.2, 6.6, 6.6, 6.7, 8.1, 8.1, A.2
    • Content-Length header field 3.3.2, 8.1, A.2
  • D
    • deflate (Coding Format) 4.2.2
    • Delimiters 3.2.6
    • downstream 2.3
  • E
    • effective request URI 5.5
  • G
    • gateway 2.3
    • Georgiev 9.1, 11.2
    • Grammar
      • absolute-form 5.3, 5.3.2
      • absolute-path 2.7
      • absolute-URI 2.7
      • ALPHA 1.2
      • asterisk-form 5.3, 5.3.4
      • authority 2.7
      • authority-form 5.3, 5.3.3
      • BWS 3.2.3
      • chunk 4.1
      • chunk-data 4.1
      • chunk-ext 4.1, 4.1.1
      • chunk-ext-name 4.1.1
      • chunk-ext-val 4.1.1
      • chunk-size 4.1
      • chunked-body 4.1, 4.1.1
      • comment 3.2.6
      • Connection 6.1
      • connection-option 6.1
      • Content-Length 3.3.2
      • CR 1.2
      • CRLF 1.2
      • ctext 3.2.6
      • CTL 1.2
      • DIGIT 1.2
      • DQUOTE 1.2
      • field-content 3.2
      • field-name 3.2, 4.4
      • field-value 3.2
      • field-vchar 3.2
      • fragment 2.7
      • header-field 3.2, 4.1.2
      • HEXDIG 1.2
      • Host 5.4
      • HTAB 1.2
      • HTTP-message 3
      • HTTP-name 2.6
      • http-URI 2.7.1
      • HTTP-version 2.6
      • https-URI 2.7.2
      • last-chunk 4.1
      • LF 1.2
      • message-body 3.3
      • method 3.1.1
      • obs-fold 3.2
      • obs-text 3.2.6
      • OCTET 1.2
      • origin-form 5.3, 5.3.1
      • OWS 3.2.3
      • partial-URI 2.7
      • port 2.7
      • protocol-name 5.7.1
      • protocol-version 5.7.1
      • pseudonym 5.7.1
      • qdtext 3.2.6
      • query 2.7
      • quoted-pair 3.2.6
      • quoted-string 3.2.6
      • rank 4.3
      • reason-phrase 3.1.2
      • received-by 5.7.1
      • received-protocol 5.7.1
      • request-line 3.1.1
      • request-target 5.3
      • RWS 3.2.3
      • scheme 2.7
      • segment 2.7
      • SP 1.2
      • start-line 3.1
      • status-code 3.1.2
      • status-line 3.1.2
      • t-codings 4.3
      • t-ranking 4.3
      • tchar 3.2.6
      • TE 4.3
      • token 3.2.6
      • Trailer 4.4
      • trailer-part 4.1, 4.1.2
      • transfer-coding 4
      • Transfer-Encoding 3.3.1
      • transfer-extension 4
      • transfer-parameter 4
      • Upgrade 6.7
      • uri-host 2.7
      • URI-reference 2.7
      • VCHAR 1.2
      • Via 5.7.1
    • gzip (Coding Format) 4.2.3
  • H
    • header field 3
    • header section 3
    • headers 3
    • Host header field 5.3.1, 5.4, 8.1, A.1.1
    • http URI scheme 2.7.1
    • https URI scheme 2.7.2
  • I
    • inbound 2.3
    • interception proxy 2.3
    • intermediary 2.3
    • ISO-8859-1 3.2.4, 11.2
  • K
    • Klein 9.4, 11.2
    • Kri2001 3.2.2, 11.2
  • L
    • Linhart 9.5, 11.2
  • M
    • Media Type
      • application/http 8.3.2
      • message/http 8.3.1
    • message 2.1
    • message/http Media Type 8.3.1
    • method 3.1.1
  • N
    • non-transforming proxy 5.7.2
  • O
    • origin server 2.1
    • origin-form (of request-target) 5.3.1
    • outbound 2.3
  • P
    • phishing 9.1
    • proxy 2.3
  • R
    • recipient 2.1
    • request 2.1
    • request-target 3.1.1
    • resource 2.7
    • response 2.1
    • reverse proxy 2.3
    • RFC0793 2.7.1, 11.1
    • RFC1919 2.3, 11.2
    • RFC1945 2.6, 10, 11.2, A
    • RFC1950 4.2.2, 8.4.2, 8.5, 11.1
    • RFC1951 4.2.2, 8.4.2, 8.5, 11.1
    • RFC1952 4.2.3, 8.4.2, 8.5, 11.1
    • RFC2045 2.1, 3.3.1, 11.2
      • Section 6 3.3.1
    • RFC2047 3.2.4, 11.2
    • RFC2068 2.6, 6.3, 10, 11.2, A.1.2
      • Section 19.7.1 6.3, A.1.2
    • RFC2119 1.1, 11.1
    • RFC2145 1, 10, 11.2
    • RFC2616 1, 2.6, 10, 10, 11.2
      • Section 16 10
    • RFC2817 1, 8.6.1, 11.2, A.2, A.2
      • Section 7.2 8.6.1, A.2
    • RFC2818 1, 2.7.2, 9.1, 11.2, A.2
      • Section 2.4 A.2
    • RFC3040 2.3, 11.2
    • RFC3986 2.1, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7, 2.7.1, 2.7.1, 2.7.1, 2.7.3, 2.7.3, 2.7.3, 2.7.3, 5.1, 11.1
      • Section 2.1 2.7.3
      • Section 2.2 2.7.3
      • Section 3.1 2.7
      • Section 3.2 2.7
      • Section 3.2.1 2.7.1
      • Section 3.2.2 2.7, 2.7.1
      • Section 3.2.3 2.7
      • Section 3.3 2.7, 2.7
      • Section 3.4 2.7
      • Section 3.5 2.7, 2.7.1, 5.1
      • Section 4.1 2.7
      • Section 4.2 2.7
      • Section 4.3 2.7
      • Section 6 2.7.3
    • RFC4033 9.1, 11.2
    • RFC4559 2.3, 11.2
    • RFC5226 8.4.1, 8.6.1, 11.2
      • Section 4.1 8.4.1, 8.6.1
    • RFC5234 1.2, 1.2, 7, 11.1
      • Appendix B.1 1.2
    • RFC5246 2.3, 2.7.2, 11.2
    • RFC5322 2.1, 3, 5.7.1, 11.2
      • Section 3.6.7 5.7.1
    • RFC6265 2.7.2, 3.2.2, 4.1.2, 11.2
    • RFC6585 9.3, 11.2
    • RFC7231 1, 2.1, 2.1, 2.7, 2.7.1, 3.1.1, 3.1.1, 3.1.2, 3.2, 3.2.1, 3.3, 3.3, 3.3, 3.3.1, 3.3.1, 3.3.2, 3.3.2, 3.3.2, 4.1.2, 4.1.2, 4.3, 5.1, 5.3.3, 5.3.4, 5.6, 5.7.2, 5.7.2, 6.3.1, 6.3.2, 6.3.2, 6.7, 6.7, 8.4.1, 9, 9.3, 9.3, 11.1
      • Section 2 2.7
      • Section 3 3.3.2
      • Section 3.1.2.1 3.3.1, 8.4.1
      • Section 3.3 5.7.2
      • Section 4 3.1.1
      • Section 4.2.1 6.3.2
      • Section 4.2.2 6.3.1, 6.3.2
      • Section 4.3.1 2.1, 3.3
      • Section 4.3.2 3.3, 3.3.2
      • Section 4.3.6 3.3, 3.3.1, 3.3.2, 5.3.3
      • Section 4.3.7 5.3.4
      • Section 5 4.1.2
      • Section 5.1.1 6.7
      • Section 5.3.1 4.3
      • Section 6 2.7.1, 3.1.2
      • Section 6.2 5.6
      • Section 6.3.4 5.7.2
      • Section 6.4 6.7
      • Section 6.5.11 9.3
      • Section 6.5.12 3.1.1, 9.3
      • Section 7.1 4.1.2
      • Section 7.1.1.2 3.2
      • Section 8.3 3.2.1
      • Appendix A 2.1
    • RFC7232 1, 3.3.1, 3.3.2, 11.1
      • Section 4.1 3.3.1, 3.3.2
    • RFC7233 1, 11.1
    • RFC7234 1, 2.4, 3.4, 5.2, 5.7.2, 5.7.2, 6.1, 9.2, 11.1
      • Section 2 2.4
      • Section 3 3.4
      • Section 5.2 5.7.2, 6.1
      • Section 5.5 5.7.2
      • Section 8 9.2
    • RFC7235 1, 4.1.2, 11.1
  • S
    • sender 2.1
    • server 2.1
    • spider 2.1
  • T
    • target resource 5.1
    • target URI 5.1
    • TE header field 4, 4.1.2, 4.3, 8.1
    • Trailer header field 4.4, 8.1
    • Transfer-Encoding header field 3.3, 3.3.1, 4, 8.1, A.1.3
    • transforming proxy 5.7.2
    • transparent proxy 2.3
    • tunnel 2.3
  • U
    • Upgrade header field 5.7.1, 6.7, 8.1, A.2
    • upstream 2.3
    • URI scheme
      • http 2.7.1
      • https 2.7.2
    • USASCII 1.2, 3, 3.2.4, 11.1
    • user agent 2.1
  • V
    • Via header field 5.7.1, 8.1, A.2
  • W
    • Welch 4.2.1, 8.4.2, 8.5, 11.1

Authors' Addresses

Roy T. Fielding (editor)
Adobe Systems Incorporated
345 Park Ave
San Jose, CA 95110
USA
Email: fielding@gbiv.com
URI: http://roy.gbiv.com/

Julian F. Reschke (editor)
greenbytes GmbH
Hafenweg 16
Muenster, NW 48155
Germany
Email: julian.reschke@greenbytes.de
URI: http://greenbytes.de/tech/webdav/

阿多 (translator)

Footnotes:

1

Message framing: separating one message from the next.

2

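An access point (AP) is also called a wireless access point or a bridge. It operates mainly at the media access control (MAC) layer, bridging wireless stations and the wired local area network.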

3

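In document management, "authority" also carries the sense of organizing and establishing a standardized, uniform index. Authority control, for example, usually means consistently using and maintaining uniform authorized forms of names, subjects, and titles, which serve as headings in bibliographic records.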

  • Wikisource: Authority Control is the practice of creating and maintaining index terms for bibliographic material in a catalogue, and is particularly useful for assigning unique identifiers to people, works or subjects. When applied to Wikisource, it means maintaining links to a set of standard external catalogues.
  • Wikipedia: In library science, authority control is a process that organizes bibliographic information, for example in library catalogs by using a single, distinct spelling of a name (heading) or a numeric identifier for each topic. The word authority in authority control derives from the idea that the names of people, places, things, and concepts are authorized, i.e., they are established in one particular form.

4

Authoritative access. Section 5.7.2 of [RFC7230] contains the following description:

A proxy that transforms the payload of a 200 (OK) response can further inform downstream recipients that a transformation has been applied by changing the response status code to 203 (Non-Authoritative Information).

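When a client receives a 200 (OK) response from the origin server and no intermediary proxy has transformed or altered the message payload, the access is called "authoritative access" and the unmodified message is "authoritative information". When an intermediary proxy has modified a 200 (OK) response, the message is "non-authoritative information"; the proxy can signal this by changing the origin server's 200 (OK) status code to 203 (Non-Authoritative Information).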

5

Percent-encoding, also known as URL encoding.

6

Under what circumstances would the HEAD request method be changed to GET?

7

An unconditional request is a request message whose header section carries no conditional header fields. See [RFC7232] for details.

8

An inbound request is a request traveling toward the origin server.