Diff: draft-fielding-uri-rfc2396bis - rfc3986.txt

	< draft-fielding-uri-rfc2396bis		rfc3986.txt

	Network Working Group T. Berners-Lee		Network Working Group T. Berners-Lee

	Internet-Draft W3C/MIT		Request for Comments: 3986 W3C/MIT
	Updates: 1738 (if approved) R. Fielding		STD: 66 R. Fielding
	Obsoletes: 2732, 2396, 1808 (if approved) Day Software		Updates: 1738 Day Software
	L. Masinter		Obsoletes: 2732, 2396, 1808 L. Masinter
	Expires: March 26, 2005 Adobe		Category: Standards Track Adobe Systems
	September 25, 2004		January 2005

	Uniform Resource Identifier (URI): Generic Syntax		Uniform Resource Identifier (URI): Generic Syntax

	draft-fielding-uri-rfc2396bis-07

	Status of this Memo

	This document is an Internet-Draft and is subject to all provisions
	of section 3 of RFC 3667. By submitting this Internet-Draft, each
	author represents that any applicable patent or other IPR claims of
	which he or she is aware have been or will be disclosed, and any of
	which he or she become aware will be disclosed, in accordance with
	RFC 3668.


	Internet-Drafts are working documents of the Internet Engineering		Status of This Memo
	Task Force (IETF), its areas, and its working groups. Note that
	other groups may also distribute working documents as
	Internet-Drafts.

	Internet-Drafts are draft documents valid for a maximum of six months
	and may be updated, replaced, or obsoleted by other documents at any
	time. It is inappropriate to use Internet-Drafts as reference
	material or to cite them other than as "work in progress."

	The list of current Internet-Drafts can be accessed at
	<http://www.ietf.org/ietf/1id-abstracts.txt>.


	The list of Internet-Draft Shadow Directories can be accessed at		This document specifies an Internet standards track protocol for the
	<http://www.ietf.org/shadow.html>.		Internet community, and requests discussion and suggestions for
			improvements. Please refer to the current edition of the "Internet
			Official Protocol Standards" (STD 1) for the standardization state
			and status of this protocol. Distribution of this memo is unlimited.

	Copyright Notice		Copyright Notice


	Copyright (C) The Internet Society (2004).		Copyright (C) The Internet Society (2005).

	Abstract		Abstract

	A Uniform Resource Identifier (URI) is a compact sequence of		A Uniform Resource Identifier (URI) is a compact sequence of

	characters for identifying an abstract or physical resource. This		characters that identifies an abstract or physical resource. This
	specification defines the generic URI syntax and a process for		specification defines the generic URI syntax and a process for
	resolving URI references that might be in relative form, along with		resolving URI references that might be in relative form, along with
	guidelines and security considerations for the use of URIs on the		guidelines and security considerations for the use of URIs on the
	Internet. The URI syntax defines a grammar that is a superset of all		Internet. The URI syntax defines a grammar that is a superset of all

	valid URIs, such that an implementation can parse the common		valid URIs, allowing an implementation to parse the common components
	components of a URI reference without knowing the scheme-specific		of a URI reference without knowing the scheme-specific requirements
	requirements of every possible identifier. This specification does		of every possible identifier. This specification does not define a
	not define a generative grammar for URIs; that task is performed by		generative grammar for URIs; that task is performed by the individual
	the individual specifications of each URI scheme.		specifications of each URI scheme.

	Editorial Note

	Discussion of this draft and comments to the editors should be sent
	to the uri@w3.org mailing list. An issues list and version history
	is available at <http://gbiv.com/protocols/uri/rev-2002/issues.html>.

	Table of Contents		Table of Contents

	1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4		1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4

	1.1 Overview of URIs . . . . . . . . . . . . . . . . . . . . . 4		1.1. Overview of URIs . . . . . . . . . . . . . . . . . . . . 4
	1.1.1 Generic Syntax . . . . . . . . . . . . . . . . . . . . 6		1.1.1. Generic Syntax . . . . . . . . . . . . . . . . . 6
	1.1.2 Examples . . . . . . . . . . . . . . . . . . . . . . . 7		1.1.2. Examples . . . . . . . . . . . . . . . . . . . . 7
	1.1.3 URI, URL, and URN . . . . . . . . . . . . . . . . . . 7		1.1.3. URI, URL, and URN . . . . . . . . . . . . . . . 7
	1.2 Design Considerations . . . . . . . . . . . . . . . . . . 7		1.2. Design Considerations . . . . . . . . . . . . . . . . . 8
	1.2.1 Transcription . . . . . . . . . . . . . . . . . . . . 7		1.2.1. Transcription . . . . . . . . . . . . . . . . . 8
	1.2.2 Separating Identification from Interaction . . . . . . 9		1.2.2. Separating Identification from Interaction . . . 9
	1.2.3 Hierarchical Identifiers . . . . . . . . . . . . . . . 10		1.2.3. Hierarchical Identifiers . . . . . . . . . . . . 10
	1.3 Syntax Notation . . . . . . . . . . . . . . . . . . . . . 11		1.3. Syntax Notation . . . . . . . . . . . . . . . . . . . . 11
	2. Characters . . . . . . . . . . . . . . . . . . . . . . . . . . 11		2. Characters . . . . . . . . . . . . . . . . . . . . . . . . . . 11

	2.1 Percent-Encoding . . . . . . . . . . . . . . . . . . . . . 12		2.1. Percent-Encoding . . . . . . . . . . . . . . . . . . . . 12
	2.2 Reserved Characters . . . . . . . . . . . . . . . . . . . 12		2.2. Reserved Characters . . . . . . . . . . . . . . . . . . 12
	2.3 Unreserved Characters . . . . . . . . . . . . . . . . . . 13		2.3. Unreserved Characters . . . . . . . . . . . . . . . . . 13
	2.4 When to Encode or Decode . . . . . . . . . . . . . . . . . 13		2.4. When to Encode or Decode . . . . . . . . . . . . . . . . 14
	2.5 Identifying Data . . . . . . . . . . . . . . . . . . . . . 14		2.5. Identifying Data . . . . . . . . . . . . . . . . . . . . 14
	3. Syntax Components . . . . . . . . . . . . . . . . . . . . . . 16		3. Syntax Components . . . . . . . . . . . . . . . . . . . . . . 16

	3.1 Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . 16		3.1. Scheme . . . . . . . . . . . . . . . . . . . . . . . . . 17
	3.2 Authority . . . . . . . . . . . . . . . . . . . . . . . . 17		3.2. Authority . . . . . . . . . . . . . . . . . . . . . . . 17
	3.2.1 User Information . . . . . . . . . . . . . . . . . . . 18		3.2.1. User Information . . . . . . . . . . . . . . . . 18
	3.2.2 Host . . . . . . . . . . . . . . . . . . . . . . . . . 18		3.2.2. Host . . . . . . . . . . . . . . . . . . . . . . 18
	3.2.3 Port . . . . . . . . . . . . . . . . . . . . . . . . . 21		3.2.3. Port . . . . . . . . . . . . . . . . . . . . . . 22
	3.3 Path . . . . . . . . . . . . . . . . . . . . . . . . . . . 22		3.3. Path . . . . . . . . . . . . . . . . . . . . . . . . . . 22
	3.4 Query . . . . . . . . . . . . . . . . . . . . . . . . . . 23		3.4. Query . . . . . . . . . . . . . . . . . . . . . . . . . 23
	3.5 Fragment . . . . . . . . . . . . . . . . . . . . . . . . . 24		3.5. Fragment . . . . . . . . . . . . . . . . . . . . . . . . 24
	4. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25		4. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

	4.1 URI Reference . . . . . . . . . . . . . . . . . . . . . . 25		4.1. URI Reference . . . . . . . . . . . . . . . . . . . . . 25
	4.2 Relative Reference . . . . . . . . . . . . . . . . . . . . 26		4.2. Relative Reference . . . . . . . . . . . . . . . . . . . 26
	4.3 Absolute URI . . . . . . . . . . . . . . . . . . . . . . . 26		4.3. Absolute URI . . . . . . . . . . . . . . . . . . . . . . 27
	4.4 Same-document Reference . . . . . . . . . . . . . . . . . 27		4.4. Same-Document Reference . . . . . . . . . . . . . . . . 27
	4.5 Suffix Reference . . . . . . . . . . . . . . . . . . . . . 27		4.5. Suffix Reference . . . . . . . . . . . . . . . . . . . . 27

	5. Reference Resolution . . . . . . . . . . . . . . . . . . . . . 28		5. Reference Resolution . . . . . . . . . . . . . . . . . . . . . 28

	5.1 Establishing a Base URI . . . . . . . . . . . . . . . . . 28		5.1. Establishing a Base URI . . . . . . . . . . . . . . . . 28
	5.1.1 Base URI Embedded in Content . . . . . . . . . . . . . 29		5.1.1. Base URI Embedded in Content . . . . . . . . . . 29
	5.1.2 Base URI from the Encapsulating Entity . . . . . . . . 29		5.1.2. Base URI from the Encapsulating Entity . . . . . 29
	5.1.3 Base URI from the Retrieval URI . . . . . . . . . . . 30		5.1.3. Base URI from the Retrieval URI . . . . . . . . 30
	5.1.4 Default Base URI . . . . . . . . . . . . . . . . . . . 30		5.1.4. Default Base URI . . . . . . . . . . . . . . . . 30
	5.2 Relative Resolution . . . . . . . . . . . . . . . . . . . 30		5.2. Relative Resolution . . . . . . . . . . . . . . . . . . 30
	5.2.1 Pre-parse the Base URI . . . . . . . . . . . . . . . . 30		5.2.1. Pre-parse the Base URI . . . . . . . . . . . . . 31
	5.2.2 Transform References . . . . . . . . . . . . . . . . . 31		5.2.2. Transform References . . . . . . . . . . . . . . 31
	5.2.3 Merge Paths . . . . . . . . . . . . . . . . . . . . . 32		5.2.3. Merge Paths . . . . . . . . . . . . . . . . . . 32
	5.2.4 Remove Dot Segments . . . . . . . . . . . . . . . . . 32		5.2.4. Remove Dot Segments . . . . . . . . . . . . . . 33
	5.3 Component Recomposition . . . . . . . . . . . . . . . . . 34		5.3. Component Recomposition . . . . . . . . . . . . . . . . 35
	5.4 Reference Resolution Examples . . . . . . . . . . . . . . 34		5.4. Reference Resolution Examples . . . . . . . . . . . . . 35
	5.4.1 Normal Examples . . . . . . . . . . . . . . . . . . . 35		5.4.1. Normal Examples . . . . . . . . . . . . . . . . 36
	5.4.2 Abnormal Examples . . . . . . . . . . . . . . . . . . 35		5.4.2. Abnormal Examples . . . . . . . . . . . . . . . 36
	6. Normalization and Comparison . . . . . . . . . . . . . . . . . 36
	6.1 Equivalence . . . . . . . . . . . . . . . . . . . . . . . 37		6. Normalization and Comparison . . . . . . . . . . . . . . . . . 38
	6.2 Comparison Ladder . . . . . . . . . . . . . . . . . . . . 37		6.1. Equivalence . . . . . . . . . . . . . . . . . . . . . . 38
	6.2.1 Simple String Comparison . . . . . . . . . . . . . . . 38		6.2. Comparison Ladder . . . . . . . . . . . . . . . . . . . 39
	6.2.2 Syntax-based Normalization . . . . . . . . . . . . . . 39		6.2.1. Simple String Comparison . . . . . . . . . . . . 39
	6.2.3 Scheme-based Normalization . . . . . . . . . . . . . . 40		6.2.2. Syntax-Based Normalization . . . . . . . . . . . 40
	6.2.4 Protocol-based Normalization . . . . . . . . . . . . . 41		6.2.3. Scheme-Based Normalization . . . . . . . . . . . 41
	7. Security Considerations . . . . . . . . . . . . . . . . . . . 41		6.2.4. Protocol-Based Normalization . . . . . . . . . . 42
	7.1 Reliability and Consistency . . . . . . . . . . . . . . . 41		7. Security Considerations . . . . . . . . . . . . . . . . . . . 43
	7.2 Malicious Construction . . . . . . . . . . . . . . . . . . 42		7.1. Reliability and Consistency . . . . . . . . . . . . . . 43
	7.3 Back-end Transcoding . . . . . . . . . . . . . . . . . . . 42		7.2. Malicious Construction . . . . . . . . . . . . . . . . . 43
	7.4 Rare IP Address Formats . . . . . . . . . . . . . . . . . 43		7.3. Back-End Transcoding . . . . . . . . . . . . . . . . . . 44
	7.5 Sensitive Information . . . . . . . . . . . . . . . . . . 44		7.4. Rare IP Address Formats . . . . . . . . . . . . . . . . 45
	7.6 Semantic Attacks . . . . . . . . . . . . . . . . . . . . . 44		7.5. Sensitive Information . . . . . . . . . . . . . . . . . 45
	8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 45		7.6. Semantic Attacks . . . . . . . . . . . . . . . . . . . . 45
	9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 45		8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 46
	10. References . . . . . . . . . . . . . . . . . . . . . . . . . 46		9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 46
	10.1 Normative References . . . . . . . . . . . . . . . . . . . . 46		10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 46
	10.2 Informative References . . . . . . . . . . . . . . . . . . . 46		10.1. Normative References . . . . . . . . . . . . . . . . . . 46
	Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 48		10.2. Informative References . . . . . . . . . . . . . . . . . 47
	A. Collected ABNF for URI . . . . . . . . . . . . . . . . . . . . 49		A. Collected ABNF for URI . . . . . . . . . . . . . . . . . . . . 49

	B. Parsing a URI Reference with a Regular Expression . . . . . . 51		B. Parsing a URI Reference with a Regular Expression . . . . . . 50
	C. Delimiting a URI in Context . . . . . . . . . . . . . . . . . 52		C. Delimiting a URI in Context . . . . . . . . . . . . . . . . . 51
	D. Changes from RFC 2396 . . . . . . . . . . . . . . . . . . . . 53		D. Changes from RFC 2396 . . . . . . . . . . . . . . . . . . . . 53

	D.1 Additions . . . . . . . . . . . . . . . . . . . . . . . . 53		D.1. Additions . . . . . . . . . . . . . . . . . . . . . . . 53
	D.2 Modifications . . . . . . . . . . . . . . . . . . . . . . 54		D.2. Modifications . . . . . . . . . . . . . . . . . . . . . 53
	E. Instructions to RFC Editor . . . . . . . . . . . . . . . . . . 56		Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
	Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57		Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 60
	Intellectual Property and Copyright Statements . . . . . . . . 61		Full Copyright Statement . . . . . . . . . . . . . . . . . . . . . 61

	1. Introduction		1. Introduction

	A Uniform Resource Identifier (URI) provides a simple and extensible		A Uniform Resource Identifier (URI) provides a simple and extensible
	means for identifying a resource. This specification of URI syntax		means for identifying a resource. This specification of URI syntax
	and semantics is derived from concepts introduced by the World Wide		and semantics is derived from concepts introduced by the World Wide

	Web global information initiative, whose use of such identifiers		Web global information initiative, whose use of these identifiers
	dates from 1990 and is described in "Universal Resource Identifiers		dates from 1990 and is described in "Universal Resource Identifiers

	in WWW" [RFC1630], and is designed to meet the recommendations laid		in WWW" [RFC1630]. The syntax is designed to meet the
	out in "Functional Recommendations for Internet Resource Locators"		recommendations laid out in "Functional Recommendations for Internet
	[RFC1736] and "Functional Requirements for Uniform Resource Names"		Resource Locators" [RFC1736] and "Functional Requirements for Uniform
	[RFC1737].		Resource Names" [RFC1737].

	This document obsoletes [RFC2396], which merged "Uniform Resource		This document obsoletes [RFC2396], which merged "Uniform Resource
	Locators" [RFC1738] and "Relative Uniform Resource Locators"		Locators" [RFC1738] and "Relative Uniform Resource Locators"
	[RFC1808] in order to define a single, generic syntax for all URIs.		[RFC1808] in order to define a single, generic syntax for all URIs.

	It contains the updates from, and obsoletes, [RFC2732], which		It obsoletes [RFC2732], which introduced syntax for an IPv6 address.
	introduced syntax for IPv6 addresses. It excludes those portions of		It excludes portions of RFC 1738 that defined the specific syntax of
	RFC 1738 that defined the specific syntax of individual URI schemes;		individual URI schemes; those portions will be updated as separate
	those portions will be updated as separate documents. The process		documents. The process for registration of new URI schemes is
	for registration of new URI schemes is defined separately by [BCP35].		defined separately by [BCP35]. Advice for designers of new URI
	Advice for designers of new URI schemes can be found in [RFC2718].		schemes can be found in [RFC2718]. All significant changes from RFC
			2396 are noted in Appendix D.
	All significant changes from RFC 2396 are noted in Appendix D.

	This specification uses the terms "character" and "coded character		This specification uses the terms "character" and "coded character
	set" in accordance with the definitions provided in [BCP19], and		set" in accordance with the definitions provided in [BCP19], and
	"character encoding" in place of what [BCP19] refers to as a		"character encoding" in place of what [BCP19] refers to as a
	"charset".		"charset".


	1.1 Overview of URIs		1.1. Overview of URIs

	URIs are characterized as follows:		URIs are characterized as follows:

	Uniform		Uniform


	Uniformity provides several benefits: it allows different types of		Uniformity provides several benefits. It allows different types
	resource identifiers to be used in the same context, even when the		of resource identifiers to be used in the same context, even when
	mechanisms used to access those resources may differ; it allows		the mechanisms used to access those resources may differ. It
	uniform semantic interpretation of common syntactic conventions		allows uniform semantic interpretation of common syntactic
	across different types of resource identifiers; it allows		conventions across different types of resource identifiers. It
	introduction of new types of resource identifiers without		allows introduction of new types of resource identifiers without
	interfering with the way that existing identifiers are used; and,		interfering with the way that existing identifiers are used. It
	it allows the identifiers to be reused in many different contexts,		allows the identifiers to be reused in many different contexts,
	thus permitting new applications or protocols to leverage a		thus permitting new applications or protocols to leverage a pre-
	pre-existing, large, and widely-used set of resource identifiers.		existing, large, and widely used set of resource identifiers.

	Resource		Resource

	This specification does not limit the scope of what might be a		This specification does not limit the scope of what might be a
	resource; rather, the term "resource" is used in a general sense		resource; rather, the term "resource" is used in a general sense
	for whatever might be identified by a URI. Familiar examples		for whatever might be identified by a URI. Familiar examples
	include an electronic document, an image, a source of information		include an electronic document, an image, a source of information

	with consistent purpose (e.g., "today's weather report for Los		with a consistent purpose (e.g., "today's weather report for Los
	Angeles"), a service (e.g., an HTTP to SMS gateway), a collection		Angeles"), a service (e.g., an HTTP-to-SMS gateway), and a
	of other resources, and so on. A resource is not necessarily		collection of other resources. A resource is not necessarily
	accessible via the Internet; e.g., human beings, corporations, and		accessible via the Internet; e.g., human beings, corporations, and
	bound books in a library can also be resources. Likewise,		bound books in a library can also be resources. Likewise,
	abstract concepts can be resources, such as the operators and		abstract concepts can be resources, such as the operators and
	operands of a mathematical equation, the types of a relationship		operands of a mathematical equation, the types of a relationship
	(e.g., "parent" or "employee"), or numeric values (e.g., zero,		(e.g., "parent" or "employee"), or numeric values (e.g., zero,
	one, and infinity).		one, and infinity).

	Identifier		Identifier

	An identifier embodies the information required to distinguish		An identifier embodies the information required to distinguish
	what is being identified from all other things within its scope of		what is being identified from all other things within its scope of
	identification. Our use of the terms "identify" and "identifying"		identification. Our use of the terms "identify" and "identifying"
	refer to this purpose of distinguishing one resource from all		refer to this purpose of distinguishing one resource from all
	other resources, regardless of how that purpose is accomplished		other resources, regardless of how that purpose is accomplished

	(e.g., by name, address, context, etc.). These terms should not		(e.g., by name, address, or context). These terms should not be
	be mistaken as an assumption that an identifier defines or		mistaken as an assumption that an identifier defines or embodies
	embodies the identity of what is referenced, though that may be		the identity of what is referenced, though that may be the case
	the case for some identifiers. Nor should it be assumed that a		for some identifiers. Nor should it be assumed that a system
	system using URIs will access the resource identified: in many		using URIs will access the resource identified: in many cases,
	cases, URIs are used to denote resources without any intention		URIs are used to denote resources without any intention that they
	that they be accessed. Likewise, the "one" resource identified		be accessed. Likewise, the "one" resource identified might not be
	might not be singular in nature (e.g., a resource might be a named		singular in nature (e.g., a resource might be a named set or a
	set or a mapping that varies over time).		mapping that varies over time).


	A URI is an identifier, consisting of a sequence of characters		A URI is an identifier consisting of a sequence of characters
	matching the syntax rule named <URI> in Section 3, that enables		matching the syntax rule named <URI> in Section 3. It enables
	uniform identification of resources via a separately defined,		uniform identification of resources via a separately defined
	extensible set of naming schemes (Section 3.1). How that		extensible set of naming schemes (Section 3.1). How that
	identification is accomplished, assigned, or enabled is delegated to		identification is accomplished, assigned, or enabled is delegated to
	each scheme specification.		each scheme specification.

	This specification does not place any limits on the nature of a		This specification does not place any limits on the nature of a

	resource, the reasons why an application might wish to refer to a		resource, the reasons why an application might seek to refer to a
	resource, or the kinds of system that might use URIs for the sake of		resource, or the kinds of systems that might use URIs for the sake of
	identifying resources. This specification does not require that a		identifying resources. This specification does not require that a

	URI persists in identifying the same resource over all time, though		URI persists in identifying the same resource over time, though that
	that is a common goal of all URI schemes. Nevertheless, nothing in		is a common goal of all URI schemes. Nevertheless, nothing in this
	this specification prevents an application from limiting itself to		specification prevents an application from limiting itself to
	particular types of resources, or to a subset of URIs that maintains		particular types of resources, or to a subset of URIs that maintains
	characteristics desired by that application.		characteristics desired by that application.

	URIs have a global scope and are interpreted consistently regardless		URIs have a global scope and are interpreted consistently regardless
	of context, though the result of that interpretation may be in		of context, though the result of that interpretation may be in
	relation to the end-user's context. For example, "http://localhost/"		relation to the end-user's context. For example, "http://localhost/"
	has the same interpretation for every user of that reference, even		has the same interpretation for every user of that reference, even
	though the network interface corresponding to "localhost" may be		though the network interface corresponding to "localhost" may be
	different for each end-user: interpretation is independent of access.		different for each end-user: interpretation is independent of access.
	However, an action made on the basis of that reference will take		However, an action made on the basis of that reference will take
	place in relation to the end-user's context, which implies that an		place in relation to the end-user's context, which implies that an

	action intended to refer to a single, globally unique thing must use		action intended to refer to a globally unique thing must use a URI
	a URI that distinguishes that resource from all other things. URIs		that distinguishes that resource from all other things. URIs that
	that identify in relation to the end-user's local context should only		identify in relation to the end-user's local context should only be
	be used when the context itself is a defining aspect of the resource,		used when the context itself is a defining aspect of the resource,
	such as when an on-line help manual refers to a file on the		such as when an on-line help manual refers to a file on the end-
	end-user's filesystem (e.g., "file:///etc/hosts").		user's file system (e.g., "file:///etc/hosts").


	1.1.1 Generic Syntax		1.1.1. Generic Syntax

	Each URI begins with a scheme name, as defined in Section 3.1, that		Each URI begins with a scheme name, as defined in Section 3.1, that
	refers to a specification for assigning identifiers within that		refers to a specification for assigning identifiers within that
	scheme. As such, the URI syntax is a federated and extensible naming		scheme. As such, the URI syntax is a federated and extensible naming
	system wherein each scheme's specification may further restrict the		system wherein each scheme's specification may further restrict the
	syntax and semantics of identifiers using that scheme.		syntax and semantics of identifiers using that scheme.

	This specification defines those elements of the URI syntax that are		This specification defines those elements of the URI syntax that are
	required of all URI schemes or are common to many URI schemes. It		required of all URI schemes or are common to many URI schemes. It

	thus defines the syntax and semantics that are needed to implement a		thus defines the syntax and semantics needed to implement a scheme-
	scheme-independent parsing mechanism for URI references, such that		independent parsing mechanism for URI references, by which the
	the scheme-dependent handling of a URI can be postponed until the		scheme-dependent handling of a URI can be postponed until the
	scheme-dependent semantics are needed. Likewise, protocols and data		scheme-dependent semantics are needed. Likewise, protocols and data
	formats that make use of URI references can refer to this		formats that make use of URI references can refer to this

	specification as defining the range of syntax allowed for all URIs,		specification as a definition for the range of syntax allowed for all
	including those schemes that have yet to be defined, thus decoupling		URIs, including those schemes that have yet to be defined. This
	the evolution of identification schemes from the evolution of		decouples the evolution of identification schemes from the evolution
	protocols, data formats, and implementations that make use of URIs.		of protocols, data formats, and implementations that make use of
			URIs.


	A parser of the generic URI syntax is capable of parsing any URI		A parser of the generic URI syntax can parse any URI reference into
	reference into its major components; once the scheme is determined,		its major components. Once the scheme is determined, further
	further scheme-specific parsing can be performed on the components.		scheme-specific parsing can be performed on the components. In other
	In other words, the URI generic syntax is a superset of the syntax of		words, the URI generic syntax is a superset of the syntax of all URI
	all URI schemes.		schemes.
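
A minimal sketch of such a scheme-independent parser, in Python. The regular expression is the one given in Appendix B of RFC 3986; the function name split_uri_reference and the dictionary layout are assumptions made for this illustration and are not part of either document. The point it illustrates is that the split needs no knowledge of the scheme.

```python
import re

# Regular expression from Appendix B of RFC 3986. It splits any URI
# reference into its generic components without knowing the scheme.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri_reference(reference):
    """Return the five generic components (hypothetical helper).

    A value of None means the component is undefined (absent), which is
    distinct from a component that is present but empty.
    """
    match = URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


if __name__ == "__main__":
    for ref in (
        "ldap://[2001:db8::7]/c=GB?objectClass?one",
        "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
        "../g#frag",
    ):
        print(ref, "->", split_uri_reference(ref))
```

Scheme-specific handling (for example, interpreting the "ldap" query) would start from these components only after the scheme has been read.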


	1.1.2 Examples		1.1.2. Examples

	The following example URIs illustrate several URI schemes and		The following example URIs illustrate several URI schemes and
	variations in their common syntax components:		variations in their common syntax components:

	ftp://ftp.is.co.za/rfc/rfc1808.txt		ftp://ftp.is.co.za/rfc/rfc1808.txt

	http://www.ietf.org/rfc/rfc2396.txt		http://www.ietf.org/rfc/rfc2396.txt

	ldap://[2001:db8::7]/c=GB?objectClass?one		ldap://[2001:db8::7]/c=GB?objectClass?one

	mailto:John.Doe@example.com		mailto:John.Doe@example.com

	news:comp.infosystems.www.servers.unix		news:comp.infosystems.www.servers.unix

	tel:+1-816-555-1212		tel:+1-816-555-1212

	telnet://192.0.2.16:80/		telnet://192.0.2.16:80/

	urn:oasis:names:specification:docbook:dtd:xml:4.1.2		urn:oasis:names:specification:docbook:dtd:xml:4.1.2
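
As a rough illustration of how the common components vary across these examples, the sketch below runs a few of them through urlsplit from Python's standard urllib.parse module. The helper name describe and the selection of URIs are assumptions made for this example. Note that urlsplit reports an absent component as an empty string, so it does not distinguish "undefined" from "empty".

```python
from urllib.parse import urlsplit

# A subset of the example URIs listed above.
EXAMPLES = [
    "ftp://ftp.is.co.za/rfc/rfc1808.txt",
    "ldap://[2001:db8::7]/c=GB?objectClass?one",
    "tel:+1-816-555-1212",
    "telnet://192.0.2.16:80/",
    "urn:oasis:names:specification:docbook:dtd:xml:4.1.2",
]


def describe(uri):
    """Summarize which generic components a URI carries (hypothetical helper)."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "has_authority": bool(parts.netloc),
        "host": parts.hostname,  # None when there is no authority
        "port": parts.port,      # None when no port is given
        "path": parts.path,
        "query": parts.query,
    }


if __name__ == "__main__":
    for uri in EXAMPLES:
        print(uri, "->", describe(uri))
```

The "tel" and "urn" examples have no authority component at all, while the "ldap" example carries an IPv6 literal as its host and a query that itself contains a "?".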


	1.1.3 URI, URL, and URN		1.1.3. URI, URL, and URN

	A URI can be further classified as a locator, a name, or both. The		A URI can be further classified as a locator, a name, or both. The
	term "Uniform Resource Locator" (URL) refers to the subset of URIs		term "Uniform Resource Locator" (URL) refers to the subset of URIs
	that, in addition to identifying a resource, provide a means of		that, in addition to identifying a resource, provide a means of
	locating the resource by describing its primary access mechanism		locating the resource by describing its primary access mechanism
	(e.g., its network "location"). The term "Uniform Resource Name"		(e.g., its network "location"). The term "Uniform Resource Name"
	(URN) has been used historically to refer to both URIs under the		(URN) has been used historically to refer to both URIs under the
	"urn" scheme [RFC2141], which are required to remain globally unique		"urn" scheme [RFC2141], which are required to remain globally unique
	and persistent even when the resource ceases to exist or becomes		and persistent even when the resource ceases to exist or becomes
	unavailable, and to any other URI with the properties of a name.		unavailable, and to any other URI with the properties of a name.


	An individual scheme does not need to be classified as being just one		An individual scheme does not have to be classified as being just one
	of "name" or "locator". Instances of URIs from any given scheme may		of "name" or "locator". Instances of URIs from any given scheme may
	have the characteristics of names or locators or both, often		have the characteristics of names or locators or both, often
	depending on the persistence and care in the assignment of		depending on the persistence and care in the assignment of

	identifiers by the naming authority, rather than any quality of the		identifiers by the naming authority, rather than on any quality of
	scheme. Future specifications and related documentation should use		the scheme. Future specifications and related documentation should
	the general term "URI", rather than the more restrictive terms URL		use the general term "URI" rather than the more restrictive terms
	and URN [RFC3305].		"URL" and "URN" [RFC3305].


	1.2 Design Considerations		1.2. Design Considerations


	1.2.1 Transcription		1.2.1. Transcription

	The URI syntax has been designed with global transcription as one of		The URI syntax has been designed with global transcription as one of
	its main considerations. A URI is a sequence of characters from a		its main considerations. A URI is a sequence of characters from a
	very limited set: the letters of the basic Latin alphabet, digits,		very limited set: the letters of the basic Latin alphabet, digits,
	and a few special characters. A URI may be represented in a variety		and a few special characters. A URI may be represented in a variety

	of ways: e.g., ink on paper, pixels on a screen, or a sequence of		of ways; e.g., ink on paper, pixels on a screen, or a sequence of
	character encoding octets. The interpretation of a URI depends only		character encoding octets. The interpretation of a URI depends only

	on the characters used and not how those characters are represented		on the characters used and not on how those characters are
	in a network protocol.		represented in a network protocol.

	The goal of transcription can be described by a simple scenario.		The goal of transcription can be described by a simple scenario.
	Imagine two colleagues, Sam and Kim, sitting in a pub at an		Imagine two colleagues, Sam and Kim, sitting in a pub at an
	international conference and exchanging research ideas. Sam asks Kim		international conference and exchanging research ideas. Sam asks Kim
	for a location to get more information, so Kim writes the URI for the		for a location to get more information, so Kim writes the URI for the
	research site on a napkin. Upon returning home, Sam takes out the		research site on a napkin. Upon returning home, Sam takes out the
	napkin and types the URI into a computer, which then retrieves the		napkin and types the URI into a computer, which then retrieves the
	information to which Kim referred.		information to which Kim referred.

	There are several design considerations revealed by the scenario:		There are several design considerations revealed by the scenario:

	o A URI is a sequence of characters that is not always represented		o A URI is a sequence of characters that is not always represented
	as a sequence of octets.		as a sequence of octets.


	o A URI might be transcribed from a non-network source, and thus		o A URI might be transcribed from a non-network source and thus
	should consist of characters that are most likely to be able to be		should consist of characters that are most likely able to be
	entered into a computer, within the constraints imposed by		entered into a computer, within the constraints imposed by
	keyboards (and related input devices) across languages and		keyboards (and related input devices) across languages and
	locales.		locales.


	o A URI often needs to be remembered by people, and it is easier for		o A URI often has to be remembered by people, and it is easier for
	people to remember a URI when it consists of meaningful or		people to remember a URI when it consists of meaningful or
	familiar components.		familiar components.

	These design considerations are not always in alignment. For		These design considerations are not always in alignment. For
	example, it is often the case that the most meaningful name for a URI		example, it is often the case that the most meaningful name for a URI
	component would require characters that cannot be typed into some		component would require characters that cannot be typed into some
	systems. The ability to transcribe a resource identifier from one		systems. The ability to transcribe a resource identifier from one
	medium to another has been considered more important than having a		medium to another has been considered more important than having a
	URI consist of the most meaningful of components.		URI consist of the most meaningful of components.

	In local or regional contexts and with improving technology, users		In local or regional contexts and with improving technology, users
	might benefit from being able to use a wider range of characters;		might benefit from being able to use a wider range of characters;
	such use is not defined by this specification. Percent-encoded		such use is not defined by this specification. Percent-encoded
	octets (Section 2.1) may be used within a URI to represent characters		octets (Section 2.1) may be used within a URI to represent characters

	outside the range of the US-ASCII coded character set if such		outside the range of the US-ASCII coded character set if this
	representation is allowed by the scheme or by the protocol element in		representation is allowed by the scheme or by the protocol element in

	which the URI is referenced; such a definition should specify the		which the URI is referenced. Such a definition should specify the
	character encoding used to map those characters to octets prior to		character encoding used to map those characters to octets prior to
	being percent-encoded for the URI.		being percent-encoded for the URI.
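
Illustration (not part of either version being compared): a minimal Python sketch of why the character encoding has to be specified before percent-encoding. It assumes the standard library's urllib.parse module; the sample character and the two encodings are arbitrary choices for the example.

```python
from urllib.parse import quote, unquote

# The same character maps to different octets, and therefore to different
# percent-encoded forms, depending on the character encoding applied first.
utf8_form = quote("é", encoding="utf-8")      # two octets: C3 A9
latin1_form = quote("é", encoding="latin-1")  # one octet: E9

print(utf8_form)
print(latin1_form)

# Decoding only recovers the character if the same encoding is assumed.
round_trip = unquote(utf8_form, encoding="utf-8")
print(round_trip)
```

A consumer that assumes a different encoding from the producer would recover different characters from the same octets, which is why the paragraph above asks the scheme or protocol element to state the encoding.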


	1.2.2 Separating Identification from Interaction		1.2.2. Separating Identification from Interaction

	A common misunderstanding of URIs is that they are only used to refer		A common misunderstanding of URIs is that they are only used to refer

	to accessible resources. In fact, the URI alone only provides		to accessible resources. The URI itself only provides
	identification; access to the resource is neither guaranteed nor		identification; access to the resource is neither guaranteed nor

	implied by the presence of a URI. Instead, an operation (if any)		implied by the presence of a URI. Instead, any operation associated
	associated with a URI reference is defined by the protocol element,		with a URI reference is defined by the protocol element, data format
	data format attribute, or natural language text in which it appears.		attribute, or natural language text in which it appears.

	Given a URI, a system may attempt to perform a variety of operations		Given a URI, a system may attempt to perform a variety of operations

	on the resource, as might be characterized by such words as "access",		on the resource, as might be characterized by words such as "access",
	"update", "replace", or "find attributes". Such operations are		"update", "replace", or "find attributes". Such operations are
	defined by the protocols that make use of URIs, not by this		defined by the protocols that make use of URIs, not by this
	specification. However, we do use a few general terms for describing		specification. However, we do use a few general terms for describing
	common operations on URIs. URI "resolution" is the process of		common operations on URIs. URI "resolution" is the process of
	determining an access mechanism and the appropriate parameters		determining an access mechanism and the appropriate parameters

	necessary to dereference a URI; such resolution may require several		necessary to dereference a URI; this resolution may require several
	iterations. To use that access mechanism to perform an action on the		iterations. To use that access mechanism to perform an action on the
	URI's resource is to "dereference" the URI.		URI's resource is to "dereference" the URI.

	When URIs are used within information retrieval systems to identify		When URIs are used within information retrieval systems to identify
	sources of information, the most common form of URI dereference is		sources of information, the most common form of URI dereference is
	"retrieval": making use of a URI in order to retrieve a		"retrieval": making use of a URI in order to retrieve a
	representation of its associated resource. A "representation" is a		representation of its associated resource. A "representation" is a
	sequence of octets, along with representation metadata describing		sequence of octets, along with representation metadata describing
	those octets, that constitutes a record of the state of the resource		those octets, that constitutes a record of the state of the resource

	at the time that the representation is generated. Retrieval is		at the time when the representation is generated. Retrieval is
	achieved by a process that might include using the URI as a cache key		achieved by a process that might include using the URI as a cache key
	to check for a locally cached representation, resolution of the URI		to check for a locally cached representation, resolution of the URI
	to determine an appropriate access mechanism (if any), and		to determine an appropriate access mechanism (if any), and
	dereference of the URI for the sake of applying a retrieval		dereference of the URI for the sake of applying a retrieval
	operation. Depending on the protocols used to perform the retrieval,		operation. Depending on the protocols used to perform the retrieval,
	additional information might be supplied about the resource (resource		additional information might be supplied about the resource (resource
	metadata) and its relation to other resources.		metadata) and its relation to other resources.

	URI references in information retrieval systems are designed to be		URI references in information retrieval systems are designed to be

	late-binding: the result of an access is generally determined at the		late-binding: the result of an access is generally determined when it
	time it is accessed and may vary over time or due to other aspects of		is accessed and may vary over time or due to other aspects of the
	the interaction. Such references are created in order to be used in		interaction. These references are created in order to be used in the
	the future: what is being identified is not some specific result that		future: what is being identified is not some specific result that was
	was obtained in the past, but rather some characteristic that is		obtained in the past, but rather some characteristic that is expected
	expected to be true for future results. In such cases, the resource		to be true for future results. In such cases, the resource referred
	referred to by the URI is actually a sameness of characteristics as		to by the URI is actually a sameness of characteristics as observed
	observed over time, perhaps elucidated by additional comments or		over time, perhaps elucidated by additional comments or assertions
	assertions made by the resource provider.		made by the resource provider.

	Although many URI schemes are named after protocols, this does not		Although many URI schemes are named after protocols, this does not

	imply that use of such a URI will result in access to the resource		imply that use of these URIs will result in access to the resource
	via the named protocol. URIs are often used simply for the sake of		via the named protocol. URIs are often used simply for the sake of
	identification. Even when a URI is used to retrieve a representation		identification. Even when a URI is used to retrieve a representation
	of a resource, that access might be through gateways, proxies,		of a resource, that access might be through gateways, proxies,
	caches, and name resolution services that are independent of the		caches, and name resolution services that are independent of the

	protocol associated with the scheme name, and the resolution of some		protocol associated with the scheme name. The resolution of some
	URIs may require the use of more than one protocol (e.g., both DNS		URIs may require the use of more than one protocol (e.g., both DNS
	and HTTP are typically used to access an "http" URI's origin server		and HTTP are typically used to access an "http" URI's origin server
	when a representation isn't found in a local cache).		when a representation isn't found in a local cache).


	1.2.3 Hierarchical Identifiers		1.2.3. Hierarchical Identifiers

	The URI syntax is organized hierarchically, with components listed in		The URI syntax is organized hierarchically, with components listed in
	order of decreasing significance from left to right. For some URI		order of decreasing significance from left to right. For some URI
	schemes, the visible hierarchy is limited to the scheme itself:		schemes, the visible hierarchy is limited to the scheme itself:
	everything after the scheme component delimiter (":") is considered		everything after the scheme component delimiter (":") is considered
	opaque to URI processing. Other URI schemes make the hierarchy		opaque to URI processing. Other URI schemes make the hierarchy
	explicit and visible to generic parsing algorithms.		explicit and visible to generic parsing algorithms.

	The generic syntax uses the slash ("/"), question mark ("?"), and		The generic syntax uses the slash ("/"), question mark ("?"), and

	number sign ("#") characters for the purpose of delimiting components		number sign ("#") characters to delimit components that are
	that are significant to the generic parser's hierarchical		significant to the generic parser's hierarchical interpretation of an
	interpretation of an identifier. In addition to aiding the		identifier. In addition to aiding the readability of such
	readability of such identifiers through the consistent use of		identifiers through the consistent use of familiar syntax, this
	familiar syntax, this uniform representation of hierarchy across		uniform representation of hierarchy across naming schemes allows
	naming schemes allows scheme-independent references to be made		scheme-independent references to be made relative to that hierarchy.
	relative to that hierarchy.

	It is often the case that a group or "tree" of documents has been		It is often the case that a group or "tree" of documents has been
	constructed to serve a common purpose, wherein the vast majority of		constructed to serve a common purpose, wherein the vast majority of
	URI references in these documents point to resources within the tree		URI references in these documents point to resources within the tree

	rather than outside of it. Similarly, documents located at a		rather than outside it. Similarly, documents located at a particular
	particular site are much more likely to refer to other resources at		site are much more likely to refer to other resources at that site
	that site than to resources at remote sites. Relative referencing of		than to resources at remote sites. Relative referencing of URIs
	URIs allows document trees to be partially independent of their		allows document trees to be partially independent of their location
	location and access scheme. For instance, it is possible for a		and access scheme. For instance, it is possible for a single set of
	single set of hypertext documents to be simultaneously accessible and		hypertext documents to be simultaneously accessible and traversable
	traversable via each of the "file", "http", and "ftp" schemes if the		via each of the "file", "http", and "ftp" schemes if the documents
	documents refer to each other using relative references.		refer to each other with relative references. Furthermore, such
	Furthermore, such document trees can be moved, as a whole, without		document trees can be moved, as a whole, without changing any of the
	changing any of the relative references.		relative references.
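
Illustration (not part of either version being compared): a small Python sketch of the point above, using urllib.parse.urljoin from the standard library as one available resolver. The host name, paths, and the relative reference are hypothetical values chosen for the example.

```python
from urllib.parse import urljoin

# One relative reference, as it might appear inside a document in the tree.
relative_ref = "../images/logo.png"

# The same document tree exposed through three different schemes.
bases = [
    "http://example.com/docs/guide/index.html",
    "ftp://example.com/docs/guide/index.html",
    "file:///srv/mirror/docs/guide/index.html",
]

# The reference is resolved against whichever base the document was
# retrieved from; the reference text itself never changes.
targets = [urljoin(base, relative_ref) for base in bases]
for target in targets:
    print(target)
```

Because the reference only describes a position relative to the base, the tree can be served from any of the three locations without editing the documents.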

	A relative reference (Section 4.2) refers to a resource by describing		A relative reference (Section 4.2) refers to a resource by describing
	the difference within a hierarchical name space between the reference		the difference within a hierarchical name space between the reference
	context and the target URI. The reference resolution algorithm,		context and the target URI. The reference resolution algorithm,
	presented in Section 5, defines how such a reference is transformed		presented in Section 5, defines how such a reference is transformed

	to the target URI. Since relative references can only be used within		to the target URI. As relative references can only be used within
	the context of a hierarchical URI, designers of new URI schemes		the context of a hierarchical URI, designers of new URI schemes
	should use a syntax consistent with the generic syntax's hierarchical		should use a syntax consistent with the generic syntax's hierarchical
	components unless there are compelling reasons to forbid relative		components unless there are compelling reasons to forbid relative
	referencing within that scheme.		referencing within that scheme.

	NOTE: Previous specifications used the terms "partial URI" and		NOTE: Previous specifications used the terms "partial URI" and

	"relative URI" to denote a relative reference to a URI. Since		"relative URI" to denote a relative reference to a URI. As some
	some readers misunderstood those terms to mean that relative URIs		readers misunderstood those terms to mean that relative URIs are a
	are a subset of URIs, rather than a method of referencing URIs,		subset of URIs rather than a method of referencing URIs, this
	this specification simply refers to them as relative references.		specification simply refers to them as relative references.

	All URI references are parsed by generic syntax parsers when used.		All URI references are parsed by generic syntax parsers when used.

	However, since hierarchical processing has no effect on an absolute		However, because hierarchical processing has no effect on an absolute
	URI used in a reference unless it contains one or more dot-segments		URI used in a reference unless it contains one or more dot-segments
	(complete path segments of "." or "..", as described in Section 3.3),		(complete path segments of "." or "..", as described in Section 3.3),
	URI scheme specifications can define opaque identifiers by		URI scheme specifications can define opaque identifiers by
	disallowing use of slash characters, question mark characters, and		disallowing use of slash characters, question mark characters, and
	the URIs "scheme:." and "scheme:..".		the URIs "scheme:." and "scheme:..".


	1.3 Syntax Notation		1.3. Syntax Notation

	This specification uses the Augmented Backus-Naur Form (ABNF)		This specification uses the Augmented Backus-Naur Form (ABNF)
	notation of [RFC2234], including the following core ABNF syntax rules		notation of [RFC2234], including the following core ABNF syntax rules
	defined by that specification: ALPHA (letters), CR (carriage return),		defined by that specification: ALPHA (letters), CR (carriage return),
	DIGIT (decimal digits), DQUOTE (double quote), HEXDIG (hexadecimal		DIGIT (decimal digits), DQUOTE (double quote), HEXDIG (hexadecimal
	digits), LF (line feed), and SP (space). The complete URI syntax is		digits), LF (line feed), and SP (space). The complete URI syntax is
	collected in Appendix A.		collected in Appendix A.

	2. Characters		2. Characters

	The URI syntax provides a method of encoding data, presumably for the		The URI syntax provides a method of encoding data, presumably for the
	sake of identifying a resource, as a sequence of characters. The URI		sake of identifying a resource, as a sequence of characters. The URI
	characters are, in turn, frequently encoded as octets for transport		characters are, in turn, frequently encoded as octets for transport
	or presentation. This specification does not mandate any particular		or presentation. This specification does not mandate any particular
	character encoding for mapping between URI characters and the octets		character encoding for mapping between URI characters and the octets
	used to store or transmit those characters. When a URI appears in a		used to store or transmit those characters. When a URI appears in a
	protocol element, the character encoding is defined by that protocol;		protocol element, the character encoding is defined by that protocol;

	absent such a definition, a URI is assumed to be in the same		without such a definition, a URI is assumed to be in the same
	character encoding as the surrounding text.		character encoding as the surrounding text.

	The ABNF notation defines its terminal values to be non-negative		The ABNF notation defines its terminal values to be non-negative
	integers (codepoints) based on the US-ASCII coded character set		integers (codepoints) based on the US-ASCII coded character set

	[ASCII]. Since a URI is a sequence of characters, we must invert		[ASCII]. Because a URI is a sequence of characters, we must invert
	that relation in order to understand the URI syntax. Therefore, the		that relation in order to understand the URI syntax. Therefore, the
	integer values used by the ABNF must be mapped back to their		integer values used by the ABNF must be mapped back to their
	corresponding characters via US-ASCII in order to complete the syntax		corresponding characters via US-ASCII in order to complete the syntax
	rules.		rules.

	A URI is composed from a limited set of characters consisting of		A URI is composed from a limited set of characters consisting of
	digits, letters, and a few graphic symbols. A reserved subset of		digits, letters, and a few graphic symbols. A reserved subset of
	those characters may be used to delimit syntax components within a		those characters may be used to delimit syntax components within a

	URI, while the remaining characters, including both the unreserved		URI while the remaining characters, including both the unreserved set
	set and those reserved characters not acting as delimiters, define		and those reserved characters not acting as delimiters, define each
	each component's identifying data.		component's identifying data.


	2.1 Percent-Encoding		2.1. Percent-Encoding

	A percent-encoding mechanism is used to represent a data octet in a		A percent-encoding mechanism is used to represent a data octet in a
	component when that octet's corresponding character is outside the		component when that octet's corresponding character is outside the
	allowed set or is being used as a delimiter of, or within, the		allowed set or is being used as a delimiter of, or within, the
	component. A percent-encoded octet is encoded as a character		component. A percent-encoded octet is encoded as a character
	triplet, consisting of the percent character "%" followed by the two		triplet, consisting of the percent character "%" followed by the two
	hexadecimal digits representing that octet's numeric value. For		hexadecimal digits representing that octet's numeric value. For
	example, "%20" is the percent-encoding for the binary octet		example, "%20" is the percent-encoding for the binary octet
	"00100000" (ABNF: %x20), which in US-ASCII corresponds to the space		"00100000" (ABNF: %x20), which in US-ASCII corresponds to the space
	character (SP). Section 2.4 describes when percent-encoding and		character (SP). Section 2.4 describes when percent-encoding and
	decoding is applied.		decoding is applied.

	pct-encoded = "%" HEXDIG HEXDIG		pct-encoded = "%" HEXDIG HEXDIG

	The uppercase hexadecimal digits 'A' through 'F' are equivalent to		The uppercase hexadecimal digits 'A' through 'F' are equivalent to

	the lowercase digits 'a' through 'f', respectively. Two URIs that		the lowercase digits 'a' through 'f', respectively. If two URIs
	differ only in the case of hexadecimal digits used in percent-encoded		differ only in the case of hexadecimal digits used in percent-encoded

	octets are equivalent. For consistency, URI producers and		octets, they are equivalent. For consistency, URI producers and
	normalizers should use uppercase hexadecimal digits for all		normalizers should use uppercase hexadecimal digits for all percent-
	percent-encodings.		encodings.
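
Illustration (not part of either version being compared): a Python sketch of the percent-encoded triplet and of the case-insensitivity of its hexadecimal digits. The helper name uppercase_pct_hex is hypothetical; it is one possible way a normalizer could apply the uppercase recommendation, not a prescribed algorithm.

```python
import re
from urllib.parse import quote, unquote

# The space character (octet 0x20) becomes the triplet "%20".
encoded_space = quote(" ")
print(encoded_space)

# Lowercase and uppercase hex digits decode to the same octet.
same_octet = unquote("%7e") == unquote("%7E")
print(same_octet)

def uppercase_pct_hex(uri: str) -> str:
    """Uppercase the hex digits of every percent-encoded triplet.

    Hypothetical helper: only the triplets are touched, the rest of the
    URI is left exactly as it was.
    """
    return re.sub(r"%[0-9A-Fa-f]{2}", lambda m: m.group(0).upper(), uri)

normalized = uppercase_pct_hex("http://example.com/a%2fb%3a")
print(normalized)
```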


	2.2 Reserved Characters		2.2. Reserved Characters

	URIs include components and subcomponents that are delimited by		URIs include components and subcomponents that are delimited by
	characters in the "reserved" set. These characters are called		characters in the "reserved" set. These characters are called
	"reserved" because they may (or may not) be defined as delimiters by		"reserved" because they may (or may not) be defined as delimiters by
	the generic syntax, by each scheme-specific syntax, or by the		the generic syntax, by each scheme-specific syntax, or by the
	implementation-specific syntax of a URI's dereferencing algorithm.		implementation-specific syntax of a URI's dereferencing algorithm.
	If data for a URI component would conflict with a reserved		If data for a URI component would conflict with a reserved
	character's purpose as a delimiter, then the conflicting data must be		character's purpose as a delimiter, then the conflicting data must be

	percent-encoded before forming the URI.		percent-encoded before the URI is formed.

	reserved = gen-delims / sub-delims		reserved = gen-delims / sub-delims

	gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"		gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

	sub-delims = "!" / "$" / "&" / "'" / "(" / ")"		sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
	/ "*" / "+" / "," / ";" / "="		/ "*" / "+" / "," / ";" / "="

	The purpose of reserved characters is to provide a set of delimiting		The purpose of reserved characters is to provide a set of delimiting
	characters that are distinguishable from other data within a URI.		characters that are distinguishable from other data within a URI.
	URIs that differ in the replacement of a reserved character with its		URIs that differ in the replacement of a reserved character with its

	corresponding percent-encoded octet are not equivalent.		corresponding percent-encoded octet are not equivalent. Percent-
	Percent-encoding a reserved character, or decoding a percent-encoded		encoding a reserved character, or decoding a percent-encoded octet
	octet that corresponds to a reserved character, will change how the		that corresponds to a reserved character, will change how the URI is
	URI is interpreted by most applications. Thus, characters in the		interpreted by most applications. Thus, characters in the reserved
	reserved set are protected from normalization and are therefore safe		set are protected from normalization and are therefore safe to be
	to be used by scheme-specific and producer-specific algorithms for		used by scheme-specific and producer-specific algorithms for
	delimiting data subcomponents within a URI.		delimiting data subcomponents within a URI.


	A subset of the reserved characters (gen-delims) are used as		A subset of the reserved characters (gen-delims) is used as
	delimiters of the generic URI components described in Section 3. A		delimiters of the generic URI components described in Section 3. A
	component's ABNF syntax rule will not use the reserved or gen-delims		component's ABNF syntax rule will not use the reserved or gen-delims
	rule names directly; instead, each syntax rule lists the characters		rule names directly; instead, each syntax rule lists the characters

	allowed within that component (i.e., not delimiting it) and any of		allowed within that component (i.e., not delimiting it), and any of
	those characters that are also in the reserved set are "reserved" for		those characters that are also in the reserved set are "reserved" for
	use as subcomponent delimiters within the component. Only the most		use as subcomponent delimiters within the component. Only the most
	common subcomponents are defined by this specification; other		common subcomponents are defined by this specification; other
	subcomponents may be defined by a URI scheme's specification, or by		subcomponents may be defined by a URI scheme's specification, or by
	the implementation-specific syntax of a URI's dereferencing		the implementation-specific syntax of a URI's dereferencing
	algorithm, provided that such subcomponents are delimited by		algorithm, provided that such subcomponents are delimited by
	characters in the reserved set allowed within that component.		characters in the reserved set allowed within that component.

	URI producing applications should percent-encode data octets that		URI producing applications should percent-encode data octets that

	correspond to characters in the reserved set. However, if a reserved		correspond to characters in the reserved set unless these characters
	character is found in a URI component and no delimiting role is known		are specifically allowed by the URI scheme to represent data in that
	for that character, then it should be interpreted as representing the		component. If a reserved character is found in a URI component and
	data octet corresponding to that character's encoding in US-ASCII.		no delimiting role is known for that character, then it must be
			interpreted as representing the data octet corresponding to that
			character's encoding in US-ASCII.
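
Illustration (not part of either version being compared): a Python sketch of a URI producer that percent-encodes reserved characters appearing in data while leaving the delimiters literal. The "key=value&key=value" query layout is a common convention assumed for the example, not something defined by the generic syntax, and build_query is a hypothetical helper.

```python
from urllib.parse import quote

def build_query(pairs):
    """Join key/value pairs into a query string.

    "&" and "=" stay literal only where they act as delimiters; any
    occurrence inside the data is percent-encoded (safe="" means no
    reserved character is exempted).
    """
    return "&".join(
        f"{quote(key, safe='')}={quote(value, safe='')}"
        for key, value in pairs
    )

query = build_query([("company", "AT&T"), ("expr", "a=b")])
print(query)
```

Without the encoding step, the "&" inside "AT&T" would be indistinguishable from the delimiter between the two pairs.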


	2.3 Unreserved Characters		2.3. Unreserved Characters

	Characters that are allowed in a URI but do not have a reserved		Characters that are allowed in a URI but do not have a reserved
	purpose are called unreserved. These include uppercase and lowercase		purpose are called unreserved. These include uppercase and lowercase
	letters, decimal digits, hyphen, period, underscore, and tilde.		letters, decimal digits, hyphen, period, underscore, and tilde.

	unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"		unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

	URIs that differ in the replacement of an unreserved character with		URIs that differ in the replacement of an unreserved character with
	its corresponding percent-encoded US-ASCII octet are equivalent: they		its corresponding percent-encoded US-ASCII octet are equivalent: they
	identify the same resource. However, URI comparison implementations		identify the same resource. However, URI comparison implementations

	do not always perform normalization prior to comparison Section 6.		do not always perform normalization prior to comparison (see Section
	For consistency, percent-encoded octets in the ranges of ALPHA		6). For consistency, percent-encoded octets in the ranges of ALPHA
	(%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),		(%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E),
	underscore (%5F), or tilde (%7E) should not be created by URI		underscore (%5F), or tilde (%7E) should not be created by URI
	producers and, when found in a URI, should be decoded to their		producers and, when found in a URI, should be decoded to their

	corresponding unreserved character by URI normalizers.		corresponding unreserved characters by URI normalizers.
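
Illustration (not part of either version being compared): a Python sketch of the normalization step described above, decoding only those percent-encoded octets that correspond to unreserved characters. The function name decode_unreserved and the sample URI are hypothetical.

```python
import re
import string

# ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = set(string.ascii_letters + string.digits + "-._~")

def decode_unreserved(uri: str) -> str:
    """Replace %XX with its character only when that character is unreserved."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Reserved and other octets keep their percent-encoded form.
        return char if char in UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", replace, uri)

normalized = decode_unreserved("http://example.com/%7Euser/a%2Fb%2Dc")
print(normalized)
```

Here "%7E" and "%2D" are decoded to "~" and "-", while "%2F" is left alone because "/" is a reserved character.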


	2.4 When to Encode or Decode		2.4. When to Encode or Decode


	Under normal circumstances, the only time that octets within a URI		Under normal circumstances, the only time when octets within a URI
	are percent-encoded is during the process of producing the URI from		are percent-encoded is during the process of producing the URI from

	its component parts. It is during that process that an		its component parts. This is when an implementation determines which
	implementation determines which of the reserved characters are to be		of the reserved characters are to be used as subcomponent delimiters
	used as subcomponent delimiters and which can be safely used as data.		and which can be safely used as data. Once produced, a URI is always
	Once produced, a URI is always in its percent-encoded form.		in its percent-encoded form.

	When a URI is dereferenced, the components and subcomponents		When a URI is dereferenced, the components and subcomponents
	significant to the scheme-specific dereferencing process (if any)		significant to the scheme-specific dereferencing process (if any)
	must be parsed and separated before the percent-encoded octets within		must be parsed and separated before the percent-encoded octets within

	those components can be safely decoded, since otherwise the data may		those components can be safely decoded, as otherwise the data may be
	be mistaken for component delimiters. The only exception is for		mistaken for component delimiters. The only exception is for
	percent-encoded octets corresponding to characters in the unreserved		percent-encoded octets corresponding to characters in the unreserved
	set, which can be decoded at any time. For example, the octet		set, which can be decoded at any time. For example, the octet
	corresponding to the tilde ("~") character is often encoded as "%7E"		corresponding to the tilde ("~") character is often encoded as "%7E"
	by older URI processing implementations; the "%7E" can be replaced by		by older URI processing implementations; the "%7E" can be replaced by
	"~" without changing its interpretation.		"~" without changing its interpretation.

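Illustration (not part of either version being compared): a Python sketch of why components are separated before their percent-encoded octets are decoded. It uses urllib.parse from the standard library; the URI and the helper names are hypothetical.

```python
from urllib.parse import urlsplit, unquote

def path_segments(uri: str) -> list:
    """Split the path on literal "/" first, then decode each segment."""
    raw_segments = urlsplit(uri).path.split("/")
    return [unquote(segment) for segment in raw_segments if segment]

def path_segments_wrong_order(uri: str) -> list:
    """Decode first, then split: encoded data is mistaken for a delimiter."""
    decoded_path = unquote(urlsplit(uri).path)
    return [segment for segment in decoded_path.split("/") if segment]

uri = "http://example.com/reports/a%2Fb"
correct = path_segments(uri)
mistaken = path_segments_wrong_order(uri)
print(correct)
print(mistaken)
```

The first function keeps "a/b" as a single segment; the second splits it in two because the decoded slash is treated as a path delimiter.
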
	Because the percent ("%") character serves as the indicator for		Because the percent ("%") character serves as the indicator for

	percent-encoded octets, it must be percent-encoded as "%25" in order		percent-encoded octets, it must be percent-encoded as "%25" for that
	for that octet to be used as data within a URI. Implementations must		octet to be used as data within a URI. Implementations must not
	not percent-encode or decode the same string more than once, since		percent-encode or decode the same string more than once, as decoding
	decoding an already decoded string might lead to misinterpreting a		an already decoded string might lead to misinterpreting a percent
	percent data octet as the beginning of a percent-encoding, or vice		data octet as the beginning of a percent-encoding, or vice versa in
	versa in the case of percent-encoding an already percent-encoded		the case of percent-encoding an already percent-encoded string.
	string.
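The hazard of encoding or decoding twice can be seen in a small Python sketch using the standard `urllib.parse` functions. The sample string and the helper names `encode_once` and `decode_once` are illustrative assumptions, not taken from the specification.

```python
from urllib.parse import quote, unquote


def encode_once(data: str) -> str:
    """Percent-encode everything outside the unreserved set, exactly once."""
    return quote(data, safe="")


def decode_once(encoded: str) -> str:
    """Percent-decode exactly once."""
    return unquote(encoded)


# Literal data that happens to contain the three characters "%25".
DATA = "100%25 sure"
ENCODED = encode_once(DATA)               # the "%" data octet becomes "%25"
ROUND_TRIP = decode_once(ENCODED)         # correct: identical to DATA
DECODED_TWICE = decode_once(ROUND_TRIP)   # wrong: "%25" in the data is consumed
ENCODED_TWICE = encode_once(ENCODED)      # wrong: one decode no longer yields DATA
```

A second decode silently turns the data "%25" into "%", and a second encode produces a string that a single decode cannot restore.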


	2.5 Identifying Data		2.5. Identifying Data

	URI characters provide identifying data for each of the URI		URI characters provide identifying data for each of the URI
	components, serving as an external interface for identification		components, serving as an external interface for identification
	between systems. Although the presence and nature of the URI		between systems. Although the presence and nature of the URI

	production interface is hidden from clients that use its URIs, and		production interface is hidden from clients that use its URIs (and is
	thus beyond the scope of the interoperability requirements defined by		thus beyond the scope of the interoperability requirements defined by

	this specification, it is a frequent source of confusion and errors		this specification), it is a frequent source of confusion and errors
	in the interpretation of URI character issues. Implementers need to		in the interpretation of URI character issues. Implementers have to
	be aware that there are multiple character encodings involved in the		be aware that there are multiple character encodings involved in the
	production and transmission of URIs: local name and data encoding,		production and transmission of URIs: local name and data encoding,
	public interface encoding, URI character encoding, data format		public interface encoding, URI character encoding, data format
	encoding, and protocol encoding.		encoding, and protocol encoding.


	The first encoding of identifying data is the one in which the local		Local names, such as file system names, are stored with a local
	names or data are stored. URI producing applications (a.k.a., origin		character encoding. URI producing applications (e.g., origin
	servers) will typically use the local encoding as the basis for		servers) will typically use the local encoding as the basis for
	producing meaningful names. The URI producer will transform the		producing meaningful names. The URI producer will transform the

	local encoding to one that is suitable for a public interface, and		local encoding to one that is suitable for a public interface and
	then transform the public interface encoding into the restricted set		then transform the public interface encoding into the restricted set
	of URI characters (reserved, unreserved, and percent-encodings).		of URI characters (reserved, unreserved, and percent-encodings).
	Those characters are, in turn, encoded as octets to be used as a		Those characters are, in turn, encoded as octets to be used as a
	reference within a data format (e.g., a document charset), and such		reference within a data format (e.g., a document charset), and such
	data formats are often subsequently encoded for transmission over		data formats are often subsequently encoded for transmission over
	Internet protocols.		Internet protocols.

	For most systems, an unreserved character appearing within a URI		For most systems, an unreserved character appearing within a URI
	component is interpreted as representing the data octet corresponding		component is interpreted as representing the data octet corresponding
	to that character's encoding in US-ASCII. Consumers of URIs assume		to that character's encoding in US-ASCII. Consumers of URIs assume

	that the letter "X" corresponds to the octet "01011000", and there is		that the letter "X" corresponds to the octet "01011000", and even
	no harm in making that assumption even when it is incorrect. A		when that assumption is incorrect, there is no harm in making it. A
	system that internally provides identifiers in the form of a		system that internally provides identifiers in the form of a
	different character encoding, such as EBCDIC, will generally perform		different character encoding, such as EBCDIC, will generally perform
	character translation of textual identifiers to UTF-8 [STD63] (or		character translation of textual identifiers to UTF-8 [STD63] (or
	some other superset of the US-ASCII character encoding) at an		some other superset of the US-ASCII character encoding) at an
	internal interface, thereby providing more meaningful identifiers		internal interface, thereby providing more meaningful identifiers

	than simply percent-encoding the original octets.		than those resulting from simply percent-encoding the original
			octets.

	For example, consider an information service that provides data,		For example, consider an information service that provides data,

	stored locally using an EBCDIC-based filesystem, to clients on the		stored locally using an EBCDIC-based file system, to clients on the
	Internet through an HTTP server. When an author creates a file on		Internet through an HTTP server. When an author creates a file with
	that filesystem with the name "Laguna Beach", their expectation is		the name "Laguna Beach" on that file system, the "http" URI
	that the "http" URI corresponding to that resource would also contain		corresponding to that resource is expected to contain the meaningful
	the meaningful string "Laguna%20Beach". If, however, that server		string "Laguna%20Beach". If, however, that server produces URIs by
	produces URIs using an overly-simplistic raw octet mapping, then the		using an overly simplistic raw octet mapping, then the result would
	result would be a URI containing		be a URI containing "%D3%81%87%A4%95%81@%C2%85%81%83%88". An
	"%D3%81%87%A4%95%81@%C2%85%81%83%88". An internal transcoding		internal transcoding interface fixes this problem by transcoding the
	interface fixes that problem by transcoding the local name to a		local name to a superset of US-ASCII prior to producing the URI.
	superset of US-ASCII prior to producing the URI. Naturally, proper		Naturally, proper interpretation of an incoming URI on such an
	interpretation of an incoming URI on such an interface requires that		interface requires that percent-encoded octets be decoded (e.g.,
	percent-encoded octets be decoded (e.g., "%20" to SP) before the		"%20" to SP) before the reverse transcoding is applied to obtain the
	reverse transcoding is applied to obtain the local name.		local name.
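The "Laguna Beach" example can be reproduced with a short Python sketch. It assumes the local file system uses the `cp037` EBCDIC code page (the text only says "EBCDIC-based"), and the three function names are hypothetical. The "@" is kept unencoded in the raw mapping only to mirror the string shown above, where the EBCDIC space octet 0x40 appears as "@".

```python
from urllib.parse import quote, unquote

LOCAL_CODEC = "cp037"  # assumption: one common EBCDIC code page


def raw_octet_segment(local_name: str) -> str:
    """Overly simplistic mapping: percent-encode the stored EBCDIC octets."""
    return quote(local_name.encode(LOCAL_CODEC), safe="@")


def transcoded_segment(local_name: str) -> str:
    """Transcode to UTF-8 (a superset of US-ASCII) before percent-encoding."""
    return quote(local_name.encode("utf-8"), safe="")


def local_octets_from_segment(segment: str) -> bytes:
    """Incoming direction: percent-decode first, then reverse the transcoding."""
    return unquote(segment, encoding="utf-8").encode(LOCAL_CODEC)
```

Applied to the name "Laguna Beach", the first function yields the opaque string from the example and the second yields "Laguna%20Beach"; the third recovers the octets as stored locally.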

	In some cases, the internal interface between a URI component and the		In some cases, the internal interface between a URI component and the
	identifying data that it has been crafted to represent is much less		identifying data that it has been crafted to represent is much less
	direct than a character encoding translation. For example, portions		direct than a character encoding translation. For example, portions

	of a URI might reflect a query on non-ASCII data, numeric coordinates		of a URI might reflect a query on non-ASCII data, or numeric
	on a map, etc. Likewise, a URI scheme may define components with		coordinates on a map. Likewise, a URI scheme may define components
	additional encoding requirements that are applied prior to forming		with additional encoding requirements that are applied prior to
	the component and producing the URI.		forming the component and producing the URI.

	When a new URI scheme defines a component that represents textual		When a new URI scheme defines a component that represents textual

	data consisting of characters from the Unicode character set [UCS],		data consisting of characters from the Universal Character Set [UCS],
	the data should be encoded first as octets according to the UTF-8		the data should first be encoded as octets according to the UTF-8
	character encoding [STD63], and then only those octets that do not		character encoding [STD63]; then only those octets that do not
	correspond to characters in the unreserved set should be		correspond to characters in the unreserved set should be percent-
	percent-encoded. For example, the character A would be represented		encoded. For example, the character A would be represented as "A",
	as "A", the character LATIN CAPITAL LETTER A WITH GRAVE would be		the character LATIN CAPITAL LETTER A WITH GRAVE would be represented
	represented as "%C3%80", and the character KATAKANA LETTER A would be		as "%C3%80", and the character KATAKANA LETTER A would be represented
	represented as "%E3%82%A2".		as "%E3%82%A2".
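The three character examples can be checked with a minimal Python sketch. The function name `encode_text_component` is an assumption for illustration; it relies on `urllib.parse.quote`, which leaves the unreserved characters unencoded when `safe` is empty.

```python
from urllib.parse import quote


def encode_text_component(text: str) -> str:
    """Encode text as UTF-8 octets, then percent-encode the octets
    that do not correspond to unreserved characters."""
    return quote(text.encode("utf-8"), safe="")


LATIN_A = encode_text_component("A")
A_WITH_GRAVE = encode_text_component("\u00c0")   # LATIN CAPITAL LETTER A WITH GRAVE
KATAKANA_A = encode_text_component("\u30a2")     # KATAKANA LETTER A
```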

	3. Syntax Components		3. Syntax Components

	The generic URI syntax consists of a hierarchical sequence of		The generic URI syntax consists of a hierarchical sequence of
	components referred to as the scheme, authority, path, query, and		components referred to as the scheme, authority, path, query, and
	fragment.		fragment.

	URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]		URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

	hier-part = "//" authority path-abempty		hier-part = "//" authority path-abempty
	/ path-absolute		/ path-absolute
	/ path-rootless		/ path-rootless
	/ path-empty		/ path-empty


	The scheme and path components are required, though path may be empty		The scheme and path components are required, though the path may be
	(no characters). When authority is present, the path must either be		empty (no characters). When authority is present, the path must
	empty or begin with a slash ("/") character. When authority is not		either be empty or begin with a slash ("/") character. When
	present, the path cannot begin with two slash characters ("//").		authority is not present, the path cannot begin with two slash
	These restrictions result in five different ABNF rules for a path		characters ("//"). These restrictions result in five different ABNF
	(Section 3.3), only one of which will match any given URI reference.		rules for a path (Section 3.3), only one of which will match any
			given URI reference.

	The following are two example URIs and their component parts:		The following are two example URIs and their component parts:

	  foo://example.com:8042/over/there?name=ferret#nose		  foo://example.com:8042/over/there?name=ferret#nose
	  \_/   \______________/\_________/ \_________/ \__/		  \_/   \______________/\_________/ \_________/ \__/
	   |           |            |            |        |		   |           |            |            |        |
	scheme     authority       path        query   fragment		scheme     authority       path        query   fragment
	   |   _____________________|__		   |   _____________________|__
	  / \ /                        \		  / \ /                        \
	  urn:example:animal:ferret:nose		  urn:example:animal:ferret:nose
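The decomposition of the two example URIs can be sketched in Python. The regular expression below is the non-validating one given in Appendix B of RFC 3986 for separating a URI reference into components; the function name `split_uri` and the dictionary layout are illustrative assumptions.

```python
import re

# Non-validating component splitter (regular expression from RFC 3986, Appendix B).
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)


def split_uri(uri: str) -> dict:
    """Separate a URI reference into its five generic components.

    A component that is absent is reported as None; the path is always
    present but may be the empty string.
    """
    match = URI_REFERENCE.match(uri)  # every group is optional, so this always matches
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }
```

For the "urn" example, the sketch reports no authority, query, or fragment, and the whole remainder after "urn:" as the path.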


	3.1 Scheme		3.1. Scheme

	Each URI begins with a scheme name that refers to a specification for		Each URI begins with a scheme name that refers to a specification for
	assigning identifiers within that scheme. As such, the URI syntax is		assigning identifiers within that scheme. As such, the URI syntax is
	a federated and extensible naming system wherein each scheme's		a federated and extensible naming system wherein each scheme's
	specification may further restrict the syntax and semantics of		specification may further restrict the syntax and semantics of
	identifiers using that scheme.		identifiers using that scheme.

	Scheme names consist of a sequence of characters beginning with a		Scheme names consist of a sequence of characters beginning with a
	letter and followed by any combination of letters, digits, plus		letter and followed by any combination of letters, digits, plus

	("+"), period ("."), or hyphen ("-"). Although scheme is		("+"), period ("."), or hyphen ("-"). Although schemes are case-
	case-insensitive, the canonical form is lowercase and documents that		insensitive, the canonical form is lowercase and documents that
	specify schemes must do so using lowercase letters. An		specify schemes must do so with lowercase letters. An implementation
	implementation should accept uppercase letters as equivalent to		should accept uppercase letters as equivalent to lowercase in scheme
	lowercase in scheme names (e.g., allow "HTTP" as well as "http"), for		names (e.g., allow "HTTP" as well as "http") for the sake of
	the sake of robustness, but should only produce lowercase scheme		robustness but should only produce lowercase scheme names for
	names, for consistency.		consistency.

	scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )		scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
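A minimal Python sketch of the scheme rule follows: accept either case on input, reject names that do not match the ABNF, and produce lowercase. The function name `normalize_scheme` is hypothetical.

```python
import re

# scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
SCHEME = re.compile(r"[A-Za-z][A-Za-z0-9+.-]*")


def normalize_scheme(scheme: str) -> str:
    """Validate a scheme name and return its canonical lowercase form."""
    if not SCHEME.fullmatch(scheme):
        raise ValueError("not a valid scheme name: %r" % scheme)
    return scheme.lower()
```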

	Individual schemes are not specified by this document. The process		Individual schemes are not specified by this document. The process
	for registration of new URI schemes is defined separately by [BCP35].		for registration of new URI schemes is defined separately by [BCP35].
	The scheme registry maintains the mapping between scheme names and		The scheme registry maintains the mapping between scheme names and
	their specifications. Advice for designers of new URI schemes can be		their specifications. Advice for designers of new URI schemes can be
	found in [RFC2718]. URI scheme specifications must define their own		found in [RFC2718]. URI scheme specifications must define their own

	syntax such that all strings matching their scheme-specific syntax		syntax so that all strings matching their scheme-specific syntax will
	will also match the <absolute-URI> grammar, as described in		also match the <absolute-URI> grammar, as described in Section 4.3.
	Section 4.3.

	When presented with a URI that violates one or more scheme-specific		When presented with a URI that violates one or more scheme-specific
	restrictions, the scheme-specific resolution process should flag the		restrictions, the scheme-specific resolution process should flag the
	reference as an error rather than ignore the unused parts; doing so		reference as an error rather than ignore the unused parts; doing so
	reduces the number of equivalent URIs and helps detect abuses of the		reduces the number of equivalent URIs and helps detect abuses of the

	generic syntax that might indicate the URI has been constructed to		generic syntax, which might indicate that the URI has been
	mislead the user (Section 7.6).		constructed to mislead the user (Section 7.6).


	3.2 Authority		3.2. Authority

	Many URI schemes include a hierarchical element for a naming		Many URI schemes include a hierarchical element for a naming

	authority, such that governance of the name space defined by the		authority so that governance of the name space defined by the
	remainder of the URI is delegated to that authority (which may, in		remainder of the URI is delegated to that authority (which may, in
	turn, delegate it further). The generic syntax provides a common		turn, delegate it further). The generic syntax provides a common
	means for distinguishing an authority based on a registered name or		means for distinguishing an authority based on a registered name or
	server address, along with optional port and user information.		server address, along with optional port and user information.

	The authority component is preceded by a double slash ("//") and is		The authority component is preceded by a double slash ("//") and is
	terminated by the next slash ("/"), question mark ("?"), or number		terminated by the next slash ("/"), question mark ("?"), or number
	sign ("#") character, or by the end of the URI.		sign ("#") character, or by the end of the URI.

	authority = [ userinfo "@" ] host [ ":" port ]		authority = [ userinfo "@" ] host [ ":" port ]

	URI producers and normalizers should omit the ":" delimiter that		URI producers and normalizers should omit the ":" delimiter that
	separates host from port if the port component is empty. Some		separates host from port if the port component is empty. Some
	schemes do not allow the userinfo and/or port subcomponents.		schemes do not allow the userinfo and/or port subcomponents.
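One way to separate an authority into its subcomponents, and to reassemble it without the ":" when the port is empty, is sketched below in Python. The function names are assumptions; the sketch does not validate userinfo or host characters, and it treats the first "]" as the end of a bracketed IP literal.

```python
def split_authority(authority: str):
    """Return (userinfo, host, port); userinfo and port are None when absent."""
    userinfo, at, hostport = authority.rpartition("@")
    if not at:
        userinfo = None
    if hostport.startswith("["):
        # Bracketed IP literal: colons inside the brackets are not port delimiters.
        end = hostport.find("]")
        if end == -1:
            raise ValueError("unterminated IP literal")
        host, rest = hostport[: end + 1], hostport[end + 1:]
        if rest and not rest.startswith(":"):
            raise ValueError("unexpected data after IP literal")
        port = rest[1:] if rest else None
    else:
        host, colon, port = hostport.partition(":")
        if not colon:
            port = None
    if port is not None and not all(c in "0123456789" for c in port):
        raise ValueError("port must consist of decimal digits")
    return userinfo, host, port


def join_authority(userinfo, host, port) -> str:
    """Reassemble an authority, omitting ':' when the port is absent or empty."""
    result = host if userinfo is None else userinfo + "@" + host
    if port:
        result += ":" + port
    return result
```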

	If a URI contains an authority component, then the path component		If a URI contains an authority component, then the path component

	must either be empty or begin with a slash ("/") character.		must either be empty or begin with a slash ("/") character. Non-
	Non-validating parsers (those that merely separate a URI reference		validating parsers (those that merely separate a URI reference into
	into its major components) will often ignore the subcomponent		its major components) will often ignore the subcomponent structure of
	structure of authority, treating it as an opaque string from the		authority, treating it as an opaque string from the double-slash to
	double-slash to the first terminating delimiter, until such time as		the first terminating delimiter, until such time as the URI is
	the URI is dereferenced.		dereferenced.


	3.2.1 User Information		3.2.1. User Information

	The userinfo subcomponent may consist of a user name and, optionally,		The userinfo subcomponent may consist of a user name and, optionally,
	scheme-specific information about how to gain authorization to access		scheme-specific information about how to gain authorization to access
	the resource. The user information, if present, is followed by a		the resource. The user information, if present, is followed by a
	commercial at-sign ("@") that delimits it from the host.		commercial at-sign ("@") that delimits it from the host.

	userinfo = *( unreserved / pct-encoded / sub-delims / ":" )		userinfo = *( unreserved / pct-encoded / sub-delims / ":" )

	Use of the format "user:password" in the userinfo field is		Use of the format "user:password" in the userinfo field is
	deprecated. Applications should not render as clear text any data		deprecated. Applications should not render as clear text any data
	after the first colon (":") character found within a userinfo		after the first colon (":") character found within a userinfo
	subcomponent unless the data after the colon is the empty string		subcomponent unless the data after the colon is the empty string
	(indicating no password). Applications may choose to ignore or		(indicating no password). Applications may choose to ignore or

	reject such data when received as part of a reference, and should		reject such data when it is received as part of a reference and
	reject the storage of such data in unencrypted form. The passing of		should reject the storage of such data in unencrypted form. The
	authentication information in clear text has proven to be a security		passing of authentication information in clear text has proven to be
	risk in almost every case where it has been used.		a security risk in almost every case where it has been used.
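The rendering rule for "user:password" can be sketched as follows in Python. The function name `render_userinfo` and the fixed "****" mask are illustrative choices, not something the specification prescribes; an application might equally drop the data after the colon altogether.

```python
def render_userinfo(userinfo: str) -> str:
    """Return a display form that never shows data after the first colon."""
    user, colon, secret = userinfo.partition(":")
    if colon and secret:
        # Fixed-width mask, so the length of the hidden data is not revealed.
        return user + ":" + "****"
    # No colon, or an empty string after it (no password): safe to show as-is.
    return userinfo
```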

	Applications that render a URI for the sake of user feedback, such as		Applications that render a URI for the sake of user feedback, such as
	in graphical hypertext browsing, should render userinfo in a way that		in graphical hypertext browsing, should render userinfo in a way that
	is distinguished from the rest of a URI, when feasible. Such		is distinguished from the rest of a URI, when feasible. Such
	rendering will assist the user in cases where the userinfo has been		rendering will assist the user in cases where the userinfo has been
	misleadingly crafted to look like a trusted domain name		misleadingly crafted to look like a trusted domain name
	(Section 7.6).		(Section 7.6).


	3.2.2 Host		3.2.2. Host

	The host subcomponent of authority is identified by an IP literal		The host subcomponent of authority is identified by an IP literal

	encapsulated within square brackets, an IPv4 address in		encapsulated within square brackets, an IPv4 address in dotted-
	dotted-decimal form, or a registered name. The host subcomponent is		decimal form, or a registered name. The host subcomponent is case-
	case-insensitive. The presence of a host subcomponent within a URI		insensitive. The presence of a host subcomponent within a URI does
	does not imply that the scheme requires access to the given host on		not imply that the scheme requires access to the given host on the
	the Internet. In many cases, the host syntax is used only for the		Internet. In many cases, the host syntax is used only for the sake
	sake of reusing the existing registration process created and		of reusing the existing registration process created and deployed for
	deployed for DNS, thus obtaining a globally unique name without the		DNS, thus obtaining a globally unique name without the cost of
	cost of deploying another registry. However, such use comes with its		deploying another registry. However, such use comes with its own
	own costs: domain name ownership may change over time for reasons not		costs: domain name ownership may change over time for reasons not
	anticipated by the URI producer. In other cases, the data within the		anticipated by the URI producer. In other cases, the data within the
	host component identifies a registered name that has nothing to do		host component identifies a registered name that has nothing to do
	with an Internet host. We use the name "host" for the ABNF rule		with an Internet host. We use the name "host" for the ABNF rule

	because that is its most common purpose, not its only purpose, and		because that is its most common purpose, not its only purpose.
	thus should not be considered as semantically limiting the data
	within it.

	host = IP-literal / IPv4address / reg-name		host = IP-literal / IPv4address / reg-name

	The syntax rule for host is ambiguous because it does not completely		The syntax rule for host is ambiguous because it does not completely
	distinguish between an IPv4address and a reg-name. In order to		distinguish between an IPv4address and a reg-name. In order to
	disambiguate the syntax, we apply the "first-match-wins" algorithm:		disambiguate the syntax, we apply the "first-match-wins" algorithm:
	If host matches the rule for IPv4address, then it should be		If host matches the rule for IPv4address, then it should be
	considered an IPv4 address literal and not a reg-name. Although host		considered an IPv4 address literal and not a reg-name. Although host
	is case-insensitive, producers and normalizers should use lowercase		is case-insensitive, producers and normalizers should use lowercase
	for registered names and hexadecimal addresses for the sake of		for registered names and hexadecimal addresses for the sake of
	uniformity, while only using uppercase letters for percent-encodings.		uniformity, while only using uppercase letters for percent-encodings.
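The "first-match-wins" rule might be implemented along these lines in Python. The function name `classify_host` is hypothetical, and the sketch only checks the surrounding brackets of an IP literal rather than validating its contents; the decimal-octet pattern follows the dec-octet ABNF rule.

```python
import re

# dec-octet: 0-9, 10-99, 100-199, 200-249, 250-255 (no leading zeros)
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"
IPV4ADDRESS = re.compile(r"(?:%s\.){3}%s" % (DEC_OCTET, DEC_OCTET))


def classify_host(host: str) -> str:
    """Classify a host subcomponent, trying IPv4address before reg-name."""
    if host.startswith("[") and host.endswith("]"):
        return "IP-literal"
    if IPV4ADDRESS.fullmatch(host):
        return "IPv4address"
    # Anything else, including dotted strings with out-of-range octets,
    # falls through to reg-name.
    return "reg-name"
```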

	A host identified by an Internet Protocol literal address, version 6		A host identified by an Internet Protocol literal address, version 6
	[RFC3513] or later, is distinguished by enclosing the IP literal		[RFC3513] or later, is distinguished by enclosing the IP literal
	within square brackets ("[" and "]"). This is the only place where		within square brackets ("[" and "]"). This is the only place where
	square bracket characters are allowed in the URI syntax. In		square bracket characters are allowed in the URI syntax. In
	anticipation of future, as-yet-undefined IP literal address formats,		anticipation of future, as-yet-undefined IP literal address formats,

	an optional version flag may be used to indicate such a format		an implementation may use an optional version flag to indicate such a
	explicitly rather than relying on heuristic determination.		format explicitly rather than rely on heuristic determination.

	IP-literal = "[" ( IPv6address / IPvFuture ) "]"		IP-literal = "[" ( IPv6address / IPvFuture ) "]"

	IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )		IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )

	The version flag does not indicate the IP version; rather, it		The version flag does not indicate the IP version; rather, it
	indicates future versions of the literal format. As such,		indicates future versions of the literal format. As such,

	implementations must not provide the version flag for existing IPv4		implementations must not provide the version flag for the existing
	and IPv6 literal addresses. If a URI containing an IP-literal that		IPv4 and IPv6 literal address forms described below. If a URI
	starts with "v" (case-insensitive), indicating that the version flag		containing an IP-literal that starts with "v" (case-insensitive),
	is present, is dereferenced by an application that does not know the		indicating that the version flag is present, is dereferenced by an
	meaning of that version flag, then the application should return an		application that does not know the meaning of that version flag, then
	appropriate error for "address mechanism not supported".		the application should return an appropriate error for "address
			mechanism not supported".

	A host identified by an IPv6 literal address is represented inside		A host identified by an IPv6 literal address is represented inside
	the square brackets without a preceding version flag. The ABNF		the square brackets without a preceding version flag. The ABNF
	provided here is a translation of the text definition of an IPv6		provided here is a translation of the text definition of an IPv6

	literal address provided in [RFC3513]. A 128-bit IPv6 address is		literal address provided in [RFC3513]. This syntax does not support
	divided into eight 16-bit pieces. Each piece is represented		IPv6 scoped addressing zone identifiers.
	numerically in case-insensitive hexadecimal, using one to four
	hexadecimal digits (leading zeroes are permitted). The eight encoded		A 128-bit IPv6 address is divided into eight 16-bit pieces. Each
	pieces are given most-significant first, separated by colon		piece is represented numerically in case-insensitive hexadecimal,
	characters. Optionally, the least-significant two pieces may instead		using one to four hexadecimal digits (leading zeroes are permitted).
	be represented in IPv4 address textual format. A sequence of one or		The eight encoded pieces are given most-significant first, separated
	more consecutive zero-valued 16-bit pieces within the address may be		by colon characters. Optionally, the least-significant two pieces
	elided, omitting all their digits and leaving exactly two consecutive		may instead be represented in IPv4 address textual format. A
	colons in their place to mark the elision.		sequence of one or more consecutive zero-valued 16-bit pieces within
			the address may be elided, omitting all their digits and leaving
			exactly two consecutive colons in their place to mark the elision.

	IPv6address = 6( h16 ":" ) ls32		IPv6address = 6( h16 ":" ) ls32
	/ "::" 5( h16 ":" ) ls32		/ "::" 5( h16 ":" ) ls32
	/ [ h16 ] "::" 4( h16 ":" ) ls32		/ [ h16 ] "::" 4( h16 ":" ) ls32
	/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32		/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
	/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32		/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
	/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32		/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
	/ [ *4( h16 ":" ) h16 ] "::" ls32		/ [ *4( h16 ":" ) h16 ] "::" ls32
	/ [ *5( h16 ":" ) h16 ] "::" h16		/ [ *5( h16 ":" ) h16 ] "::" h16
	/ [ *6( h16 ":" ) h16 ] "::"		/ [ *6( h16 ":" ) h16 ] "::"
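The effect of the "::" elision can be illustrated with Python's standard `ipaddress` module. This is only a sketch: `ipaddress` is a general-purpose parser, not an implementation of the ABNF above, so the two may disagree on edge cases. The function name `expand_ipv6` is an assumption, and the explicit "%" check reflects that zone identifiers are outside this syntax.

```python
import ipaddress


def expand_ipv6(literal: str) -> str:
    """Return all eight 16-bit pieces of an IPv6 literal, four hex digits each."""
    if "%" in literal:
        raise ValueError("zone identifiers are not part of this syntax")
    return ipaddress.IPv6Address(literal).exploded


def ipv4_tail(literal: str) -> int:
    """Return the least-significant 32 bits, i.e. the two pieces that may
    be written in IPv4 textual format."""
    return int(ipaddress.IPv6Address(literal)) & 0xFFFFFFFF
```

For example, `expand_ipv6("2001:db8::7")` restores the five elided zero pieces between "db8" and "7".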

	skipping to change at page 20, line 38		skipping to change at page 20, line 48

	IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet		IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet

	dec-octet = DIGIT ; 0-9		dec-octet = DIGIT ; 0-9
	/ %x31-39 DIGIT ; 10-99		/ %x31-39 DIGIT ; 10-99
	/ "1" 2DIGIT ; 100-199		/ "1" 2DIGIT ; 100-199
	/ "2" %x30-34 DIGIT ; 200-249		/ "2" %x30-34 DIGIT ; 200-249
	/ "25" %x30-35 ; 250-255		/ "25" %x30-35 ; 250-255

	A host identified by a registered name is a sequence of characters		A host identified by a registered name is a sequence of characters

	that is usually intended for lookup within a locally-defined host or		usually intended for lookup within a locally defined host or service
	service name registry, though the URI's scheme-specific semantics may		name registry, though the URI's scheme-specific semantics may require
	require that a specific registry (or fixed name table) be used		that a specific registry (or fixed name table) be used instead. The
	instead. The most common name registry mechanism is the Domain Name		most common name registry mechanism is the Domain Name System (DNS).
	System (DNS). A registered name intended for lookup in the DNS uses		A registered name intended for lookup in the DNS uses the syntax
	the syntax defined in Section 3.5 of [RFC1034] and Section 2.1 of		defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123].
	[RFC1123]. Such a name consists of a sequence of domain labels		Such a name consists of a sequence of domain labels separated by ".",
	separated by ".", each domain label starting and ending with an		each domain label starting and ending with an alphanumeric character
	alphanumeric character and possibly also containing "-" characters.		and possibly also containing "-" characters. The rightmost domain
	The rightmost domain label of a fully qualified domain name in DNS		label of a fully qualified domain name in DNS may be followed by a
	may be followed by a single "." and should be followed by one if it		single "." and should be if it is necessary to distinguish between
	is necessary to distinguish between the complete domain name and some		the complete domain name and some local domain.
	local domain.
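The label syntax described above can be checked with a small Python sketch. The function name `is_dns_syntax_name` is hypothetical. The sketch covers only what the text states (alphanumeric first and last characters, interior hyphens, an optional trailing dot, and the 255-character limit recommended later in this section); it does not enforce other DNS limits.

```python
import re

# A label starts and ends with an alphanumeric and may contain "-" in between.
LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_dns_syntax_name(name: str) -> bool:
    """Check whether a registered name follows the DNS host name syntax."""
    if len(name) > 255:
        return False
    if name.endswith("."):
        # A single trailing dot marks a fully qualified domain name.
        name = name[:-1]
    if not name:
        return False
    return all(LABEL.fullmatch(label) for label in name.split("."))
```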

	reg-name = *( unreserved / pct-encoded / sub-delims )		reg-name = *( unreserved / pct-encoded / sub-delims )

	If the URI scheme defines a default for host, then that default		If the URI scheme defines a default for host, then that default
	applies when the host subcomponent is undefined or when the		applies when the host subcomponent is undefined or when the
	registered name is empty (zero length). For example, the "file" URI		registered name is empty (zero length). For example, the "file" URI

	scheme is defined such that no authority, an empty host, and		scheme is defined so that no authority, an empty host, and
	"localhost" all mean the end-user's machine, whereas the "http"		"localhost" all mean the end-user's machine, whereas the "http"

	scheme considers a missing authority or empty host to be invalid.		scheme considers a missing authority or empty host invalid.

	This specification does not mandate a particular registered name		This specification does not mandate a particular registered name

	lookup technology and therefore does not restrict the syntax of		lookup technology and therefore does not restrict the syntax of reg-
	reg-name beyond that necessary for interoperability. Instead, it		name beyond what is necessary for interoperability. Instead, it
	delegates the issue of registered name syntax conformance to the		delegates the issue of registered name syntax conformance to the
	operating system of each application performing URI resolution, and		operating system of each application performing URI resolution, and
	that operating system decides what it will allow for the purpose of		that operating system decides what it will allow for the purpose of
	host identification. A URI resolution implementation might use DNS,		host identification. A URI resolution implementation might use DNS,
	host tables, yellow pages, NetInfo, WINS, or any other system for		host tables, yellow pages, NetInfo, WINS, or any other system for

	lookup of registered names. However, a globally-scoped naming		lookup of registered names. However, a globally scoped naming
	system, such as DNS fully-qualified domain names, is necessary for		system, such as DNS fully qualified domain names, is necessary for
	URIs that are intended to have global scope. URI producers should		URIs intended to have global scope. URI producers should use names
	use names that conform to the DNS syntax, even when use of DNS is not		that conform to the DNS syntax, even when use of DNS is not
	immediately apparent, and should limit such names to no more than 255		immediately apparent, and should limit these names to no more than
	characters in length.		255 characters in length.

	The reg-name syntax allows percent-encoded octets in order to		The reg-name syntax allows percent-encoded octets in order to
	represent non-ASCII registered names in a uniform way that is		represent non-ASCII registered names in a uniform way that is

	independent of the underlying name resolution technology; such		independent of the underlying name resolution technology. Non-ASCII
	non-ASCII characters must first be encoded according to UTF-8 [STD63]		characters must first be encoded according to UTF-8 [STD63], and then
	and then each octet of the corresponding UTF-8 sequence must be		each octet of the corresponding UTF-8 sequence must be percent-
	percent-encoded to be represented as URI characters. URI producing		encoded to be represented as URI characters. URI producing
	applications must not use percent-encoding in host unless it is used		applications must not use percent-encoding in host unless it is used
	to represent a UTF-8 character sequence. When a non-ASCII registered		to represent a UTF-8 character sequence. When a non-ASCII registered
	name represents an internationalized domain name intended for		name represents an internationalized domain name intended for
	resolution via the DNS, the name must be transformed to the IDNA		resolution via the DNS, the name must be transformed to the IDNA
	encoding [RFC3490] prior to name lookup. URI producers should		encoding [RFC3490] prior to name lookup. URI producers should

	provide such registered names in the IDNA encoding, rather than a		provide these registered names in the IDNA encoding, rather than a
	percent-encoding, if they wish to maximize interoperability with		percent-encoding, if they wish to maximize interoperability with
	legacy URI resolvers.		legacy URI resolvers.
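
The two encodings described above can be compared side by side. The following is a minimal Python sketch, not part of either document; the host name "bücher.example" is a made-up illustration, and the built-in "idna" codec is assumed to be an acceptable stand-in for the IDNA encoding of [RFC3490].

```python
from urllib.parse import quote

# Hypothetical non-ASCII registered name, used only for illustration.
name = "bücher.example"

# Generic form: encode as UTF-8, then percent-encode each resulting octet.
percent_encoded = quote(name, safe="")

# IDNA form, recommended for names intended for resolution via the DNS.
idna_encoded = name.encode("idna").decode("ascii")

print(percent_encoded)  # b%C3%BCcher.example
print(idna_encoded)     # xn--bcher-kva.example
```

Both strings identify the same name; the second is the one a legacy resolver is more likely to handle.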


	3.2.3 Port		3.2.3. Port

	The port subcomponent of authority is designated by an optional port		The port subcomponent of authority is designated by an optional port
	number in decimal following the host and delimited from it by a		number in decimal following the host and delimited from it by a
	single colon (":") character.		single colon (":") character.

	port = *DIGIT		port = *DIGIT

	A scheme may define a default port. For example, the "http" scheme		A scheme may define a default port. For example, the "http" scheme
	defines a default port of "80", corresponding to its reserved TCP		defines a default port of "80", corresponding to its reserved TCP
	port number. The type of port designated by the port number (e.g.,		port number. The type of port designated by the port number (e.g.,

	TCP, UDP, SCTP, etc.) is defined by the URI scheme. URI producers		TCP, UDP, SCTP) is defined by the URI scheme. URI producers and
	and normalizers should omit the port component and its ":" delimiter		normalizers should omit the port component and its ":" delimiter if
	if port is empty or its value would be the same as the scheme's		port is empty or if its value would be the same as that of the
	default.		scheme's default.
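
The port-omission advice above can be sketched as a small normalizer. This is an illustrative Python sketch under stated assumptions: the DEFAULT_PORTS table and the function name drop_default_port are hypothetical, and a real normalizer would take defaults from each scheme's specification.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table of scheme defaults, for illustration only.
DEFAULT_PORTS = {"http": 80, "https": 443}

def drop_default_port(uri):
    """Remove ":" and port when the port is empty or the scheme default."""
    parts = urlsplit(uri)
    # The port, if any, follows the last colon of the authority.
    head, sep, tail = parts.netloc.rpartition(":")
    if not sep:
        return uri  # no colon at all, so no port delimiter to remove
    is_empty = tail == ""
    is_default = (
        tail.isascii()
        and tail.isdigit()
        and int(tail) == DEFAULT_PORTS.get(parts.scheme)
    )
    if is_empty or is_default:
        parts = parts._replace(netloc=head)
    return urlunsplit(parts)

print(drop_default_port("http://example.com:80/a"))    # http://example.com/a
print(drop_default_port("http://example.com:/a"))      # http://example.com/a
print(drop_default_port("http://example.com:8080/a"))  # http://example.com:8080/a
```

An IP-literal host such as "[::1]" ends in "]", so its internal colons are never mistaken for a port delimiter by this check.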


	3.3 Path		3.3. Path

	The path component contains data, usually organized in hierarchical		The path component contains data, usually organized in hierarchical
	form, that, along with data in the non-hierarchical query component		form, that, along with data in the non-hierarchical query component
	(Section 3.4), serves to identify a resource within the scope of the		(Section 3.4), serves to identify a resource within the scope of the
	URI's scheme and naming authority (if any). The path is terminated		URI's scheme and naming authority (if any). The path is terminated
	by the first question mark ("?") or number sign ("#") character, or		by the first question mark ("?") or number sign ("#") character, or
	by the end of the URI.		by the end of the URI.

	If a URI contains an authority component, then the path component		If a URI contains an authority component, then the path component
	must either be empty or begin with a slash ("/") character. If a URI		must either be empty or begin with a slash ("/") character. If a URI

	skipping to change at page 23, line 9 ¶		skipping to change at page 23, line 22 ¶
	("/") character. A path is always defined for a URI, though the		("/") character. A path is always defined for a URI, though the
	defined path may be empty (zero length). Use of the slash character		defined path may be empty (zero length). Use of the slash character
	to indicate hierarchy is only required when a URI will be used as the		to indicate hierarchy is only required when a URI will be used as the
	context for relative references. For example, the URI		context for relative references. For example, the URI
	<mailto:fred@example.com> has a path of "fred@example.com", whereas		<mailto:fred@example.com> has a path of "fred@example.com", whereas
	the URI <foo://info.example.com?fred> has an empty path.		the URI <foo://info.example.com?fred> has an empty path.
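
The two path examples above can be checked with a general-purpose parser. The sketch below assumes Python's urllib.parse.urlsplit as a convenient splitter; it is not a validating parser for the generic syntax.

```python
from urllib.parse import urlsplit

# No authority component: everything after the scheme is the path.
mail = urlsplit("mailto:fred@example.com")

# Authority present and immediately followed by "?": the path is empty.
foo = urlsplit("foo://info.example.com?fred")

print(repr(mail.path))                        # 'fred@example.com'
print(repr(foo.path), foo.netloc, foo.query)  # '' info.example.com fred
```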

	The path segments "." and "..", also known as dot-segments, are		The path segments "." and "..", also known as dot-segments, are
	defined for relative reference within the path name hierarchy. They		defined for relative reference within the path name hierarchy. They
	are intended for use at the beginning of a relative-path reference		are intended for use at the beginning of a relative-path reference

	(Section 4.2) for indicating relative position within the		(Section 4.2) to indicate relative position within the hierarchical
	hierarchical tree of names. This is similar to their role within		tree of names. This is similar to their role within some operating
	some operating systems' file directory structure to indicate the		systems' file directory structures to indicate the current directory
	current directory and parent directory, respectively. However,		and parent directory, respectively. However, unlike in a file
	unlike a file system, these dot-segments are only interpreted within		system, these dot-segments are only interpreted within the URI path
	the URI path hierarchy and are removed as part of the resolution		hierarchy and are removed as part of the resolution process (Section
	process (Section 5.2).		5.2).
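
To make the removal of dot-segments concrete, here is an illustrative Python sketch of a removal routine in the spirit of the resolution process of Section 5.2. The function name remove_dot_segments and the buffer-based structure are assumptions of this sketch and are not quoted from the text above.

```python
def remove_dot_segments(path):
    """Interpret and remove "." and ".." segments from a URI path."""
    output = []  # completed segments, each including its leading "/"
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]              # "/./x" becomes "/x"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]              # "/../x" becomes "/x"
            if output:
                output.pop()             # step back up one level
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (up to the next "/") to the output.
            end = path.find("/", 1)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)

print(remove_dot_segments("/a/b/c/./../../g"))    # /a/g
print(remove_dot_segments("mid/content=5/../6"))  # mid/6
```

Unlike a file system, nothing here consults a directory tree: the segments are interpreted purely as text within the path hierarchy.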

	Aside from dot-segments in hierarchical paths, a path segment is		Aside from dot-segments in hierarchical paths, a path segment is

	considered opaque by the generic syntax. URI-producing applications		considered opaque by the generic syntax. URI producing applications
	often use the reserved characters allowed in a segment for the		often use the reserved characters allowed in a segment to delimit
	purpose of delimiting scheme-specific or dereference-handler-specific		scheme-specific or dereference-handler-specific subcomponents. For
	subcomponents. For example, the semicolon (";") and equals ("=")		example, the semicolon (";") and equals ("=") reserved characters are
	reserved characters are often used for delimiting parameters and		often used to delimit parameters and parameter values applicable to
	parameter values applicable to that segment. The comma (",")		that segment. The comma (",") reserved character is often used for
	reserved character is often used for similar purposes. For example,		similar purposes. For example, one URI producer might use a segment
	one URI producer might use a segment like "name;v=1.1" to indicate a		such as "name;v=1.1" to indicate a reference to version 1.1 of
	reference to version 1.1 of "name", whereas another might use a		"name", whereas another might use a segment such as "name,1.1" to
	segment like "name,1.1" to indicate the same. Parameter types may be		indicate the same. Parameter types may be defined by scheme-specific
	defined by scheme-specific semantics, but in most cases the syntax of		semantics, but in most cases the syntax of a parameter is specific to
	a parameter is specific to the implementation of the URI's		the implementation of the URI's dereferencing algorithm.
	dereferencing algorithm.


	3.4 Query		3.4. Query

	The query component contains non-hierarchical data that, along with		The query component contains non-hierarchical data that, along with
	data in the path component (Section 3.3), serves to identify a		data in the path component (Section 3.3), serves to identify a
	resource within the scope of the URI's scheme and naming authority		resource within the scope of the URI's scheme and naming authority
	(if any). The query component is indicated by the first question		(if any). The query component is indicated by the first question
	mark ("?") character and terminated by a number sign ("#") character		mark ("?") character and terminated by a number sign ("#") character
	or by the end of the URI.		or by the end of the URI.

	query = *( pchar / "/" / "?" )		query = *( pchar / "/" / "?" )

	The characters slash ("/") and question mark ("?") may represent data		The characters slash ("/") and question mark ("?") may represent data
	within the query component. Beware that some older, erroneous		within the query component. Beware that some older, erroneous

	implementations may not handle such data correctly when used as the		implementations may not handle such data correctly when it is used as
	base URI for relative references (Section 5.1), apparently because		the base URI for relative references (Section 5.1), apparently
	they fail to to distinguish query data from path data when looking		because they fail to distinguish query data from path data when
	for hierarchical separators. However, since query components are		looking for hierarchical separators. However, as query components
	often used to carry identifying information in the form of		are often used to carry identifying information in the form of
	"key=value" pairs, and one frequently used value is a reference to		"key=value" pairs and one frequently used value is a reference to
	another URI, it is sometimes better for usability to avoid		another URI, it is sometimes better for usability to avoid percent-
	percent-encoding those characters.		encoding those characters.
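
As an illustration of slash and question mark carried as query data, the following Python sketch splits a URI whose query value is itself a URI left unencoded. The host names and the "ref" key are hypothetical, and urllib.parse.urlsplit is assumed as the splitter.

```python
from urllib.parse import urlsplit

# Hypothetical URI: the "ref" value is another URI, not percent-encoded.
uri = "http://example.com/search?ref=http://other.example/a?b=c#top"
parts = urlsplit(uri)

# The query starts at the first "?" and ends at the first "#";
# later "/" and "?" characters are ordinary query data.
print(parts.path)      # /search
print(parts.query)     # ref=http://other.example/a?b=c
print(parts.fragment)  # top
```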


	3.5 Fragment		3.5. Fragment

	The fragment identifier component of a URI allows indirect		The fragment identifier component of a URI allows indirect
	identification of a secondary resource by reference to a primary		identification of a secondary resource by reference to a primary
	resource and additional identifying information. The identified		resource and additional identifying information. The identified
	secondary resource may be some portion or subset of the primary		secondary resource may be some portion or subset of the primary
	resource, some view on representations of the primary resource, or		resource, some view on representations of the primary resource, or
	some other resource defined or described by those representations. A		some other resource defined or described by those representations. A
	fragment identifier component is indicated by the presence of a		fragment identifier component is indicated by the presence of a
	number sign ("#") character and terminated by the end of the URI.		number sign ("#") character and terminated by the end of the URI.

	fragment = *( pchar / "/" / "?" )		fragment = *( pchar / "/" / "?" )

	The semantics of a fragment identifier are defined by the set of		The semantics of a fragment identifier are defined by the set of
	representations that might result from a retrieval action on the		representations that might result from a retrieval action on the
	primary resource. The fragment's format and resolution is therefore		primary resource. The fragment's format and resolution is therefore
	dependent on the media type [RFC2046] of a potentially retrieved		dependent on the media type [RFC2046] of a potentially retrieved
	representation, even though such a retrieval is only performed if the		representation, even though such a retrieval is only performed if the
	URI is dereferenced. If no such representation exists, then the		URI is dereferenced. If no such representation exists, then the

	semantics of the fragment are considered unknown and, effectively,		semantics of the fragment are considered unknown and are effectively
	unconstrained. Fragment identifier semantics are independent of the		unconstrained. Fragment identifier semantics are independent of the
	URI scheme and thus cannot be redefined by scheme specifications.		URI scheme and thus cannot be redefined by scheme specifications.


	Individual media types may define their own restrictions on, or		Individual media types may define their own restrictions on or
	structure within, the fragment identifier syntax for specifying		structures within the fragment identifier syntax for specifying
	different types of subsets, views, or external references that are		different types of subsets, views, or external references that are
	identifiable as secondary resources by that media type. If the		identifiable as secondary resources by that media type. If the
	primary resource has multiple representations, as is often the case		primary resource has multiple representations, as is often the case
	for resources whose representation is selected based on attributes of		for resources whose representation is selected based on attributes of
	the retrieval request (a.k.a., content negotiation), then whatever is		the retrieval request (a.k.a., content negotiation), then whatever is
	identified by the fragment should be consistent across all of those		identified by the fragment should be consistent across all of those

	representations: each representation should either define the		representations. Each representation should either define the
	fragment such that it corresponds to the same secondary resource,		fragment so that it corresponds to the same secondary resource,
	regardless of how it is represented, or the fragment should be left		regardless of how it is represented, or should leave the fragment
	undefined by the representation (i.e., not found).		undefined (i.e., not found).

	As with any URI, use of a fragment identifier component does not		As with any URI, use of a fragment identifier component does not
	imply that a retrieval action will take place. A URI with a fragment		imply that a retrieval action will take place. A URI with a fragment
	identifier may be used to refer to the secondary resource without any		identifier may be used to refer to the secondary resource without any
	implication that the primary resource is accessible or will ever be		implication that the primary resource is accessible or will ever be
	accessed.		accessed.

	Fragment identifiers have a special role in information retrieval		Fragment identifiers have a special role in information retrieval
	systems as the primary form of client-side indirect referencing,		systems as the primary form of client-side indirect referencing,

	allowing an author to specifically identify those aspects of an		allowing an author to specifically identify aspects of an existing
	existing resource that are only indirectly provided by the resource		resource that are only indirectly provided by the resource owner. As
	owner. As such, the fragment identifier is not used in the		such, the fragment identifier is not used in the scheme-specific
	scheme-specific processing of a URI; instead, the fragment identifier		processing of a URI; instead, the fragment identifier is separated
	is separated from the rest of the URI prior to a dereference, and		from the rest of the URI prior to a dereference, and thus the
	thus the identifying information within the fragment itself is		identifying information within the fragment itself is dereferenced
	dereferenced solely by the user agent and regardless of the URI		solely by the user agent, regardless of the URI scheme. Although
	scheme. Although this separate handling is often perceived to be a		this separate handling is often perceived to be a loss of
	loss of information, particularly in regards to accurate redirection		information, particularly for accurate redirection of references as
	of references as resources move over time, it also serves to prevent		resources move over time, it also serves to prevent information
	information providers from denying reference authors the right to		providers from denying reference authors the right to refer to
	selectively refer to information within a resource. Indirect		information within a resource selectively. Indirect referencing also
	referencing also provides additional flexibility and extensibility to		provides additional flexibility and extensibility to systems that use
	systems that use URIs, since new media types are easier to define and		URIs, as new media types are easier to define and deploy than new
	deploy than new schemes of identification.		schemes of identification.

	The characters slash ("/") and question mark ("?") are allowed to		The characters slash ("/") and question mark ("?") are allowed to
	represent data within the fragment identifier. Beware that some		represent data within the fragment identifier. Beware that some

	older, erroneous implementations may not handle such data correctly		older, erroneous implementations may not handle this data correctly
	when used as the base URI for relative references (Section 5.1).		when it is used as the base URI for relative references (Section
			5.1).

	4. Usage		4. Usage

	When applications make reference to a URI, they do not always use the		When applications make reference to a URI, they do not always use the

	full form of reference defined by the "URI" syntax rule. In order to		full form of reference defined by the "URI" syntax rule. To save
	save space and take advantage of hierarchical locality, many Internet		space and take advantage of hierarchical locality, many Internet
	protocol elements and media type formats allow an abbreviation of a		protocol elements and media type formats allow an abbreviation of a

	URI, while others restrict the syntax to a particular form of URI.		URI, whereas others restrict the syntax to a particular form of URI.
	We define the most common forms of reference syntax in this		We define the most common forms of reference syntax in this
	specification because they impact and depend upon the design of the		specification because they impact and depend upon the design of the
	generic syntax, requiring a uniform parsing algorithm in order to be		generic syntax, requiring a uniform parsing algorithm in order to be
	interpreted consistently.		interpreted consistently.


	4.1 URI Reference		4.1. URI Reference

	URI-reference is used to denote the most common usage of a resource		URI-reference is used to denote the most common usage of a resource
	identifier.		identifier.

	URI-reference = URI / relative-ref		URI-reference = URI / relative-ref

	A URI-reference is either a URI or a relative reference. If the		A URI-reference is either a URI or a relative reference. If the
	URI-reference's prefix does not match the syntax of a scheme followed		URI-reference's prefix does not match the syntax of a scheme followed
	by its colon separator, then the URI-reference is a relative		by its colon separator, then the URI-reference is a relative
	reference.		reference.

	A URI-reference is typically parsed first into the five URI		A URI-reference is typically parsed first into the five URI
	components, in order to determine what components are present and		components, in order to determine what components are present and

	whether or not the reference is relative, after which each component		whether the reference is relative. Then, each component is parsed
	is parsed for its subparts and their validation. The ABNF of		for its subparts and their validation. The ABNF of URI-reference,
	URI-reference, along with the "first-match-wins" disambiguation rule,		along with the "first-match-wins" disambiguation rule, is sufficient
	is sufficient to define a validating parser for the generic syntax.		to define a validating parser for the generic syntax. Readers
	Readers familiar with regular expressions should see Appendix B for		familiar with regular expressions should see Appendix B for an
	an example of a non-validating URI-reference parser that will take		example of a non-validating URI-reference parser that will take any
	any given string and extract the URI components.		given string and extract the URI components.
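
For readers who want to try the non-validating approach, the sketch below wraps the regular expression given in Appendix B of RFC 3986 in a small Python helper. The helper name split_uri_reference and the dictionary keys are choices made for this sketch only.

```python
import re

# Regular expression from Appendix B of RFC 3986 (non-validating).
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri_reference(ref):
    """Break a URI reference into its five components.

    A component that is absent is reported as None; the path is always
    defined, though it may be the empty string.
    """
    m = URI_REFERENCE.match(ref)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

print(split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related"))
```

Because every group in the expression is optional, it matches any input string; it extracts components but does not check that they are well formed.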


	4.2 Relative Reference		4.2. Relative Reference

	A relative reference takes advantage of the hierarchical syntax		A relative reference takes advantage of the hierarchical syntax

	(Section 1.2.3) in order to express a URI reference relative to the		(Section 1.2.3) to express a URI reference relative to the name space
	name space of another hierarchical URI.		of another hierarchical URI.

	relative-ref = relative-part [ "?" query ] [ "#" fragment ]		relative-ref = relative-part [ "?" query ] [ "#" fragment ]

	relative-part = "//" authority path-abempty		relative-part = "//" authority path-abempty
	/ path-absolute		/ path-absolute
	/ path-noscheme		/ path-noscheme
	/ path-empty		/ path-empty

	The URI referred to by a relative reference, also known as the target		The URI referred to by a relative reference, also known as the target
	URI, is obtained by applying the reference resolution algorithm of		URI, is obtained by applying the reference resolution algorithm of
	Section 5.		Section 5.

	A relative reference that begins with two slash characters is termed		A relative reference that begins with two slash characters is termed
	a network-path reference; such references are rarely used. A		a network-path reference; such references are rarely used. A
	relative reference that begins with a single slash character is		relative reference that begins with a single slash character is
	termed an absolute-path reference. A relative reference that does		termed an absolute-path reference. A relative reference that does
	not begin with a slash character is termed a relative-path reference.		not begin with a slash character is termed a relative-path reference.

	A path segment that contains a colon character (e.g., "this:that")		A path segment that contains a colon character (e.g., "this:that")

	cannot be used as the first segment of a relative-path reference		cannot be used as the first segment of a relative-path reference, as
	because it would be mistaken for a scheme name. Such a segment must		it would be mistaken for a scheme name. Such a segment must be
	be preceded by a dot-segment (e.g., "./this:that") to make a		preceded by a dot-segment (e.g., "./this:that") to make a relative-
	relative-path reference.		path reference.
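
The colon rule above lends itself to a one-line guard. This Python sketch is illustrative only; the helper name as_relative_path_reference is hypothetical, and urllib.parse.urlsplit is used merely to show how the unguarded form is misread.

```python
from urllib.parse import urlsplit

def as_relative_path_reference(path):
    """Prefix "./" when the first segment contains a colon."""
    first_segment = path.split("/", 1)[0]
    return "./" + path if ":" in first_segment else path

# Unguarded, the first segment is mistaken for a scheme name.
print(urlsplit("this:that").scheme)             # this

guarded = as_relative_path_reference("this:that")
print(guarded)                                  # ./this:that
print(repr(urlsplit(guarded).scheme))           # ''
```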


	4.3 Absolute URI		4.3. Absolute URI

	Some protocol elements allow only the absolute form of a URI without		Some protocol elements allow only the absolute form of a URI without
	a fragment identifier. For example, defining a base URI for later		a fragment identifier. For example, defining a base URI for later
	use by relative references calls for an absolute-URI syntax rule that		use by relative references calls for an absolute-URI syntax rule that
	does not allow a fragment.		does not allow a fragment.

	absolute-URI = scheme ":" hier-part [ "?" query ]		absolute-URI = scheme ":" hier-part [ "?" query ]


	URI scheme specifications must define their own syntax such that all		URI scheme specifications must define their own syntax so that all
	strings matching their scheme-specific syntax will also match the		strings matching their scheme-specific syntax will also match the

	<absolute-URI> grammar. Scheme specifications are not responsible		<absolute-URI> grammar. Scheme specifications will not define
	for defining fragment identifier syntax or usage, regardless of its		fragment identifier syntax or usage, regardless of its applicability
	applicability to resources identifiable via that scheme, since		to resources identifiable via that scheme, as fragment identification
	fragment identification is orthogonal to scheme definition. However,		is orthogonal to scheme definition. However, scheme specifications
	scheme specifications are encouraged to include a wide range of		are encouraged to include a wide range of examples, including
	examples, including examples that show use of the scheme's URIs with		examples that show use of the scheme's URIs with fragment identifiers
	fragment identifiers when such usage is appropriate.		when such usage is appropriate.


	4.4 Same-document Reference		4.4. Same-Document Reference

	When a URI reference refers to a URI that is, aside from its fragment		When a URI reference refers to a URI that is, aside from its fragment
	component (if any), identical to the base URI (Section 5.1), that		component (if any), identical to the base URI (Section 5.1), that
	reference is called a "same-document" reference. The most frequent		reference is called a "same-document" reference. The most frequent
	examples of same-document references are relative references that are		examples of same-document references are relative references that are
	empty or include only the number sign ("#") separator followed by a		empty or include only the number sign ("#") separator followed by a
	fragment identifier.		fragment identifier.


	When a same-document reference is dereferenced for the purpose of a		When a same-document reference is dereferenced for a retrieval
	retrieval action, the target of that reference is defined to be		action, the target of that reference is defined to be within the same
	within the same entity (representation, document, or message) as the		entity (representation, document, or message) as the reference;
	reference; therefore, a dereference should not result in a new		therefore, a dereference should not result in a new retrieval action.
	retrieval action.
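
A same-document check can be sketched as follows. This Python example assumes urllib.parse.urljoin as the resolver and compares the strings without any normalization; the function name is_same_document and the example URIs are hypothetical.

```python
from urllib.parse import urldefrag, urljoin

def is_same_document(base, ref):
    """True if ref resolves to base, ignoring any fragment component."""
    target = urljoin(base, ref)
    return urldefrag(target).url == urldefrag(base).url

base = "http://example.com/doc?x=1"
print(is_same_document(base, "#sec"))   # True
print(is_same_document(base, ""))       # True
print(is_same_document(base, "other"))  # False
```

Since no normalization is applied, an equivalent but differently spelled reference would be reported as a different document by this sketch.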

	Normalization of the base and target URIs prior to their comparison,		Normalization of the base and target URIs prior to their comparison,

	as described in Section 6.2.2 and Section 6.2.3, is allowed but		as described in Sections 6.2.2 and 6.2.3, is allowed but rarely
	rarely performed in practice. Normalization may increase the set of		performed in practice. Normalization may increase the set of same-
	same-document references, which may be of benefit to some caching		document references, which may be of benefit to some caching
	applications. As such, reference authors should not assume that a		applications. As such, reference authors should not assume that a
	slightly different, though equivalent, reference URI will (or will		slightly different, though equivalent, reference URI will (or will
	not) be interpreted as a same-document reference by any given		not) be interpreted as a same-document reference by any given
	application.		application.


	4.5 Suffix Reference		4.5. Suffix Reference

	The URI syntax is designed for unambiguous reference to resources and		The URI syntax is designed for unambiguous reference to resources and
	extensibility via the URI scheme. However, as URI identification and		extensibility via the URI scheme. However, as URI identification and
	usage have become commonplace, traditional media (television, radio,		usage have become commonplace, traditional media (television, radio,
	newspapers, billboards, etc.) have increasingly used a suffix of the		newspapers, billboards, etc.) have increasingly used a suffix of the
	URI as a reference, consisting of only the authority and path		URI as a reference, consisting of only the authority and path
	portions of the URI, such as		portions of the URI, such as

	www.w3.org/Addressing/		www.w3.org/Addressing/

	or simply a DNS registered name on its own. Such references are		or simply a DNS registered name on its own. Such references are

	primarily intended for human interpretation, rather than for		primarily intended for human interpretation rather than for machines,
	machines, with the assumption that context-based heuristics are		with the assumption that context-based heuristics are sufficient to
	sufficient to complete the URI (e.g., most registered names beginning		complete the URI (e.g., most registered names beginning with "www"
	with "www" are likely to have a URI prefix of "http://"). Although		are likely to have a URI prefix of "http://"). Although there is no
	there is no standard set of heuristics for disambiguating a URI		standard set of heuristics for disambiguating a URI suffix, many
	suffix, many client implementations allow them to be entered by the		client implementations allow them to be entered by the user and
	user and heuristically resolved.		heuristically resolved.


	While this practice of using suffix references is common, it should		Although this practice of using suffix references is common, it
	be avoided whenever possible and never used in situations where		should be avoided whenever possible and should never be used in
	long-term references are expected. The heuristics noted above will		situations where long-term references are expected. The heuristics
	change over time, particularly when a new URI scheme becomes popular,		noted above will change over time, particularly when a new URI scheme
	and are often incorrect when used out of context. Furthermore, they		becomes popular, and are often incorrect when used out of context.
	can lead to security issues along the lines of those described in		Furthermore, they can lead to security issues along the lines of
	[RFC1535].		those described in [RFC1535].


	Since a URI suffix has the same syntax as a relative-path reference,		As a URI suffix has the same syntax as a relative-path reference, a
	a suffix reference cannot be used in contexts where a relative		suffix reference cannot be used in contexts where a relative
	reference is expected. As a result, suffix references are limited to		reference is expected. As a result, suffix references are limited to

	those places where there is no defined base URI, such as dialog boxes		places where there is no defined base URI, such as dialog boxes and
	and off-line advertisements.		off-line advertisements.

	5. Reference Resolution		5. Reference Resolution

	This section defines the process of resolving a URI reference within		This section defines the process of resolving a URI reference within

	a context that allows relative references, such that the result is a		a context that allows relative references so that the result is a
	string matching the <URI> syntax rule of Section 3.		string matching the <URI> syntax rule of Section 3.


	5.1 Establishing a Base URI		5.1. Establishing a Base URI


	The term "relative" implies that there exists a "base URI" against		The term "relative" implies that a "base URI" exists against which
	which the relative reference is applied. Aside from fragment-only		the relative reference is applied. Aside from fragment-only
	references (Section 4.4), relative references are only usable when a		references (Section 4.4), relative references are only usable when a
	base URI is known. A base URI must be established by the parser		base URI is known. A base URI must be established by the parser
	prior to parsing URI references that might be relative. A base URI		prior to parsing URI references that might be relative. A base URI

	must conform to the <absolute-URI> syntax rule (Section 4.3): if the		must conform to the <absolute-URI> syntax rule (Section 4.3). If the
	base URI is obtained from a URI reference, then that reference must		base URI is obtained from a URI reference, then that reference must
	be converted to absolute form and stripped of any fragment component		be converted to absolute form and stripped of any fragment component

	prior to use as a base URI.		prior to its use as a base URI.

	The base URI of a reference can be established in one of four ways,		The base URI of a reference can be established in one of four ways,
	discussed below in order of precedence. The order of precedence can		discussed below in order of precedence. The order of precedence can
	be thought of in terms of layers, where the innermost defined base		be thought of in terms of layers, where the innermost defined base
	URI has the highest precedence. This can be visualized graphically		URI has the highest precedence. This can be visualized graphically

	as:		as follows:


	.----------------------------------------------------------.		.----------------------------------------------------------.
	|  .----------------------------------------------------.  |		|  .----------------------------------------------------.  |
	|  |  .----------------------------------------------.  |  |		|  |  .----------------------------------------------.  |  |
	|  |  |  .----------------------------------------.  |  |  |		|  |  |  .----------------------------------------.  |  |  |
	|  |  |  |  .----------------------------------.  |  |  |  |		|  |  |  |  .----------------------------------.  |  |  |  |
	|  |  |  |  |       <relative-reference>       |  |  |  |  |		|  |  |  |  |       <relative-reference>       |  |  |  |  |
	|  |  |  |  `----------------------------------'  |  |  |  |		|  |  |  |  `----------------------------------'  |  |  |  |
	|  |  |  | (5.1.1) Base URI embedded in content   |  |  |  |		|  |  |  | (5.1.1) Base URI embedded in content   |  |  |  |
	|  |  |  `----------------------------------------'  |  |  |		|  |  |  `----------------------------------------'  |  |  |
	|  |  | (5.1.2) Base URI of the encapsulating entity |  |  |		|  |  | (5.1.2) Base URI of the encapsulating entity |  |  |
	|  |  |         (message, representation, or none)   |  |  |		|  |  |         (message, representation, or none)   |  |  |
	|  |  `----------------------------------------------'  |  |		|  |  `----------------------------------------------'  |  |
	|  | (5.1.3) URI used to retrieve the entity            |  |		|  | (5.1.3) URI used to retrieve the entity            |  |
	|  `----------------------------------------------------'  |		|  `----------------------------------------------------'  |
	| (5.1.4) Default Base URI (application-dependent)         |		| (5.1.4) Default Base URI (application-dependent)         |
	`----------------------------------------------------------'		`----------------------------------------------------------'


	5.1.1 Base URI Embedded in Content		5.1.1. Base URI Embedded in Content

	Within certain media types, a base URI for relative references can be		Within certain media types, a base URI for relative references can be

	embedded within the content itself such that it can be readily		embedded within the content itself so that it can be readily obtained
	obtained by a parser. This can be useful for descriptive documents,		by a parser. This can be useful for descriptive documents, such as
	such as tables of content, which may be transmitted to others through		tables of contents, which may be transmitted to others through
	protocols other than their usual retrieval context (e.g., E-Mail or		protocols other than their usual retrieval context (e.g., email or
	USENET news).		USENET news).

	It is beyond the scope of this specification to specify how, for each		It is beyond the scope of this specification to specify how, for each
	media type, a base URI can be embedded. The appropriate syntax, when		media type, a base URI can be embedded. The appropriate syntax, when
	available, is described by the data format specification associated		available, is described by the data format specification associated
	with each media type.		with each media type.


	5.1.2 Base URI from the Encapsulating Entity		5.1.2. Base URI from the Encapsulating Entity

	If no base URI is embedded, the base URI is defined by the		If no base URI is embedded, the base URI is defined by the
	representation's retrieval context. For a document that is enclosed		representation's retrieval context. For a document that is enclosed
	within another entity, such as a message or archive, the retrieval		within another entity, such as a message or archive, the retrieval

	context is that entity; thus, the default base URI of a		context is that entity. Thus, the default base URI of a
	representation is the base URI of the entity in which the		representation is the base URI of the entity in which the
	representation is encapsulated.		representation is encapsulated.

	A mechanism for embedding a base URI within MIME container types		A mechanism for embedding a base URI within MIME container types
	(e.g., the message and multipart types) is defined by MHTML		(e.g., the message and multipart types) is defined by MHTML
	[RFC2557]. Protocols that do not use the MIME message header syntax,		[RFC2557]. Protocols that do not use the MIME message header syntax,

	but do allow some form of tagged metadata to be included within		but that do allow some form of tagged metadata to be included within
	messages, may define their own syntax for defining a base URI as part		messages, may define their own syntax for defining a base URI as part
	of a message.		of a message.


	5.1.3 Base URI from the Retrieval URI		5.1.3. Base URI from the Retrieval URI

	If no base URI is embedded and the representation is not encapsulated		If no base URI is embedded and the representation is not encapsulated
	within some other entity, then, if a URI was used to retrieve the		within some other entity, then, if a URI was used to retrieve the
	representation, that URI shall be considered the base URI. Note that		representation, that URI shall be considered the base URI. Note that
	if the retrieval was the result of a redirected request, the last URI		if the retrieval was the result of a redirected request, the last URI
	used (i.e., the URI that resulted in the actual retrieval of the		used (i.e., the URI that resulted in the actual retrieval of the
	representation) is the base URI.		representation) is the base URI.


	5.1.4 Default Base URI		5.1.4. Default Base URI

	If none of the conditions described above apply, then the base URI is		If none of the conditions described above apply, then the base URI is

	defined by the context of the application. Since this definition is		defined by the context of the application. As this definition is
	necessarily application-dependent, failing to define a base URI using		necessarily application-dependent, failing to define a base URI by
	one of the other methods may result in the same content being		using one of the other methods may result in the same content being
	interpreted differently by different types of application.		interpreted differently by different types of applications.

	A sender of a representation containing relative references is		A sender of a representation containing relative references is
	responsible for ensuring that a base URI for those references can be		responsible for ensuring that a base URI for those references can be
	established. Aside from fragment-only references, relative		established. Aside from fragment-only references, relative
	references can only be used reliably in situations where the base URI		references can only be used reliably in situations where the base URI

	is well-defined.		is well defined.
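The order of precedence in Section 5.1 can be sketched in a few lines of Python. This is only an illustration: the function name `establish_base_uri` and its four parameters are hypothetical, and each candidate is assumed to be in absolute form already, so the sketch only strips the fragment.

```python
from urllib.parse import urldefrag


def establish_base_uri(embedded=None, encapsulating=None,
                       retrieval=None, default=None):
    """Return the highest-precedence base URI that is defined.

    The parameters mirror Sections 5.1.1 through 5.1.4 (hypothetical
    names); None means that source did not supply a base URI.
    """
    for candidate in (embedded, encapsulating, retrieval, default):
        if candidate is not None:
            # A base URI must be stripped of any fragment component.
            return urldefrag(candidate).url
    raise ValueError("no base URI can be established")
```

A caller that knows only the retrieval URI would pass it as `retrieval=...`; an embedded base, when present, wins over every other source.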


	5.2 Relative Resolution		5.2. Relative Resolution

	This section describes an algorithm for converting a URI reference		This section describes an algorithm for converting a URI reference
	that might be relative to a given base URI into the parsed components		that might be relative to a given base URI into the parsed components
	of the reference's target. The components can then be recomposed, as		of the reference's target. The components can then be recomposed, as
	described in Section 5.3, to form the target URI. This algorithm		described in Section 5.3, to form the target URI. This algorithm
	provides definitive results that can be used to test the output of		provides definitive results that can be used to test the output of
	other implementations. Applications may implement relative reference		other implementations. Applications may implement relative reference

	resolution using some other algorithm, provided that the results		resolution by using some other algorithm, provided that the results
	match what would be given by this algorithm.		match what would be given by this one.


	5.2.1 Pre-parse the Base URI		5.2.1. Pre-parse the Base URI

	The base URI (Base) is established according to the procedure of		The base URI (Base) is established according to the procedure of
	Section 5.1 and parsed into the five main components described in		Section 5.1 and parsed into the five main components described in
	Section 3. Note that only the scheme component is required to be		Section 3. Note that only the scheme component is required to be
	present in a base URI; the other components may be empty or		present in a base URI; the other components may be empty or
	undefined. A component is undefined if its associated delimiter does		undefined. A component is undefined if its associated delimiter does
	not appear in the URI reference; the path component is never		not appear in the URI reference; the path component is never
	undefined, though it may be empty.		undefined, though it may be empty.


	Normalization of the base URI, as described in Section 6.2.2 and		Normalization of the base URI, as described in Sections 6.2.2 and
	Section 6.2.3, is optional. A URI reference must be transformed to		6.2.3, is optional. A URI reference must be transformed to its
	its target URI before it can be normalized.		target URI before it can be normalized.


	5.2.2 Transform References		5.2.2. Transform References

	For each URI reference (R), the following pseudocode describes an		For each URI reference (R), the following pseudocode describes an
	algorithm for transforming R into its target URI (T):		algorithm for transforming R into its target URI (T):

	-- The URI reference is parsed into the five URI components		-- The URI reference is parsed into the five URI components
	--		--
	(R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R);		(R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R);

	-- A non-strict parser may ignore a scheme in the reference		-- A non-strict parser may ignore a scheme in the reference
	-- if it is identical to the base URI's scheme.		-- if it is identical to the base URI's scheme.

	skipping to change at page 32, line 5 ¶		skipping to change at page 32, line 38 ¶
	         endif;		         endif;
	         T.query = R.query;		         T.query = R.query;
	      endif;		      endif;
	      T.authority = Base.authority;		      T.authority = Base.authority;
	   endif;		   endif;
	   T.scheme = Base.scheme;		   T.scheme = Base.scheme;
	endif;		endif;

	T.fragment = R.fragment;		T.fragment = R.fragment;


	5.2.3 Merge Paths		5.2.3. Merge Paths

	The pseudocode above refers to a "merge" routine for merging a		The pseudocode above refers to a "merge" routine for merging a
	relative-path reference with the path of the base URI. This is		relative-path reference with the path of the base URI. This is
	accomplished as follows:		accomplished as follows:

	o If the base URI has a defined authority component and an empty		o If the base URI has a defined authority component and an empty
	path, then return a string consisting of "/" concatenated with the		path, then return a string consisting of "/" concatenated with the
	reference's path; otherwise,		reference's path; otherwise,


	o Return a string consisting of the reference's path component		o return a string consisting of the reference's path component
	appended to all but the last segment of the base URI's path (i.e.,		appended to all but the last segment of the base URI's path (i.e.,
	excluding any characters after the right-most "/" in the base URI		excluding any characters after the right-most "/" in the base URI
	path, or excluding the entire base URI path if it does not contain		path, or excluding the entire base URI path if it does not contain
	any "/" characters).		any "/" characters).
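The two merge rules above can be sketched in Python as follows. The parameter names are illustrative, and the sketch assumes that an undefined authority component is represented as None while an empty path is the empty string.

```python
def merge(base_authority, base_path, ref_path):
    """Merge a relative-path reference with the base path (Section 5.2.3)."""
    # First rule: defined authority and empty base path.
    if base_authority is not None and base_path == "":
        return "/" + ref_path
    # Second rule: keep the base path up to and including its right-most
    # "/". When the base path has no "/", rpartition yields an empty head
    # and separator, so the whole base path is excluded.
    head, sep, _last_segment = base_path.rpartition("/")
    return head + sep + ref_path
```

With the base path "/b/c/d;p" used in Section 5.4, merging the reference path "g" gives "/b/c/g".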


	5.2.4 Remove Dot Segments		5.2.4. Remove Dot Segments

	The pseudocode also refers to a "remove_dot_segments" routine for		The pseudocode also refers to a "remove_dot_segments" routine for
	interpreting and removing the special "." and ".." complete path		interpreting and removing the special "." and ".." complete path
	segments from a referenced path. This is done after the path is		segments from a referenced path. This is done after the path is
	extracted from a reference, whether or not the path was relative, in		extracted from a reference, whether or not the path was relative, in
	order to remove any invalid or extraneous dot-segments prior to		order to remove any invalid or extraneous dot-segments prior to
	forming the target URI. Although there are many ways to accomplish		forming the target URI. Although there are many ways to accomplish
	this removal process, we describe a simple method using two string		this removal process, we describe a simple method using two string
	buffers.		buffers.

	1. The input buffer is initialized with the now-appended path		1. The input buffer is initialized with the now-appended path
	components and the output buffer is initialized to the empty		components and the output buffer is initialized to the empty
	string.		string.


	2. While the input buffer is not empty, loop:		2. While the input buffer is not empty, loop as follows:

	A. If the input buffer begins with a prefix of "../" or "./",		A. If the input buffer begins with a prefix of "../" or "./",
	then remove that prefix from the input buffer; otherwise,		then remove that prefix from the input buffer; otherwise,


	B. If the input buffer begins with a prefix of "/./" or "/.",		B. if the input buffer begins with a prefix of "/./" or "/.",
	where "." is a complete path segment, then replace that		where "." is a complete path segment, then replace that
	prefix with "/" in the input buffer; otherwise,		prefix with "/" in the input buffer; otherwise,


	C. If the input buffer begins with a prefix of "/../" or "/..",		C. if the input buffer begins with a prefix of "/../" or "/..",
	where ".." is a complete path segment, then replace that		where ".." is a complete path segment, then replace that
	prefix with "/" in the input buffer and remove the last		prefix with "/" in the input buffer and remove the last
	segment and its preceding "/" (if any) from the output		segment and its preceding "/" (if any) from the output
	buffer; otherwise,		buffer; otherwise,


	D. If the input buffer consists only of "." or "..", then remove		D. if the input buffer consists only of "." or "..", then remove
	that from the input buffer; otherwise,		that from the input buffer; otherwise,


	E. Move the first path segment in the input buffer to the end of		E. move the first path segment in the input buffer to the end of
	the output buffer, including the initial "/" character (if		the output buffer, including the initial "/" character (if
	any) and any subsequent characters up to, but not including,		any) and any subsequent characters up to, but not including,
	the next "/" character or the end of the input buffer.		the next "/" character or the end of the input buffer.

	3. Finally, the output buffer is returned as the result of		3. Finally, the output buffer is returned as the result of
	remove_dot_segments.		remove_dot_segments.
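The numbered steps above translate fairly directly into Python. The following is a minimal sketch rather than a reference implementation; the variable names `inp` and `out` stand for the input and output buffers and are not part of the specification.

```python
def remove_dot_segments(path):
    """Apply steps 1 to 3 of Section 5.2.4 using two string buffers."""
    inp, out = path, ""                      # step 1
    while inp:                               # step 2
        if inp.startswith("../"):            # 2A
            inp = inp[3:]
        elif inp.startswith("./"):           # 2A
            inp = inp[2:]
        elif inp.startswith("/./"):          # 2B: "/./x" becomes "/x"
            inp = inp[2:]
        elif inp == "/.":                    # 2B: "." is the whole segment
            inp = "/"
        elif inp.startswith("/../") or inp == "/..":   # 2C
            inp = "/" + inp[4:]
            # Drop the last output segment and its preceding "/" (if any).
            out = out.rpartition("/")[0]
        elif inp in (".", ".."):             # 2D
            inp = ""
        else:                                # 2E
            # Move one segment, including its leading "/" (if any).
            end = inp.find("/", 1)
            if end == -1:
                end = len(inp)
            out += inp[:end]
            inp = inp[end:]
    return out                               # step 3
```

Checking for "/../" or an exact "/.." keeps a segment such as "/..." from being mistaken for a dot-segment.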

	Note that dot-segments are intended for use in URI references to		Note that dot-segments are intended for use in URI references to
	express an identifier relative to the hierarchy of names in the base		express an identifier relative to the hierarchy of names in the base
	URI. The remove_dot_segments algorithm respects that hierarchy by		URI. The remove_dot_segments algorithm respects that hierarchy by

	removing extra dot-segments rather than treating them as an error or		removing extra dot-segments rather than treat them as an error or
	leaving them to be misinterpreted by dereference implementations.		leaving them to be misinterpreted by dereference implementations.

	The following illustrates how the above steps are applied for two		The following illustrates how the above steps are applied for two

	example merged paths, showing the state of the two buffers after each		examples of merged paths, showing the state of the two buffers after
	step.		each step.

	STEP   OUTPUT BUFFER         INPUT BUFFER		STEP   OUTPUT BUFFER         INPUT BUFFER

	 1 :                         /a/b/c/./../../g		 1 :                         /a/b/c/./../../g
	 2E:   /a                    /b/c/./../../g		 2E:   /a                    /b/c/./../../g
	 2E:   /a/b                  /c/./../../g		 2E:   /a/b                  /c/./../../g
	 2E:   /a/b/c                /./../../g		 2E:   /a/b/c                /./../../g
	 2B:   /a/b/c                /../../g		 2B:   /a/b/c                /../../g
	 2C:   /a/b                  /../g		 2C:   /a/b                  /../g
	 2C:   /a                    /g		 2C:   /a                    /g

	skipping to change at page 33, line 46 ¶		skipping to change at page 34, line 35 ¶

	STEP   OUTPUT BUFFER         INPUT BUFFER		STEP   OUTPUT BUFFER         INPUT BUFFER

	 1 :                         mid/content=5/../6		 1 :                         mid/content=5/../6
	 2E:   mid                   /content=5/../6		 2E:   mid                   /content=5/../6
	 2E:   mid/content=5         /../6		 2E:   mid/content=5         /../6
	 2C:   mid                   /6		 2C:   mid                   /6
	 2E:   mid/6		 2E:   mid/6

	Some applications may find it more efficient to implement the		Some applications may find it more efficient to implement the

	remove_dot_segments algorithm using two segment stacks rather than		remove_dot_segments algorithm by using two segment stacks rather than
	strings.		strings.

	Note: Beware that some older, erroneous implementations will fail		Note: Beware that some older, erroneous implementations will fail
	to separate a reference's query component from its path component		to separate a reference's query component from its path component
	prior to merging the base and reference paths, resulting in an		prior to merging the base and reference paths, resulting in an
	interoperability failure if the query component contains the		interoperability failure if the query component contains the
	strings "/../" or "/./".		strings "/../" or "/./".


	5.3 Component Recomposition		5.3. Component Recomposition

	Parsed URI components can be recomposed to obtain the corresponding		Parsed URI components can be recomposed to obtain the corresponding
	URI reference string. Using pseudocode, this would be:		URI reference string. Using pseudocode, this would be:

	result = ""		result = ""

	if defined(scheme) then		if defined(scheme) then
	   append scheme to result;		   append scheme to result;
	   append ":" to result;		   append ":" to result;
	endif;		endif;

	skipping to change at page 34, line 42 ¶		skipping to change at page 35, line 42 ¶
	endif;		endif;

	return result;		return result;

	Note that we are careful to preserve the distinction between a		Note that we are careful to preserve the distinction between a
	component that is undefined, meaning that its separator was not		component that is undefined, meaning that its separator was not
	present in the reference, and a component that is empty, meaning that		present in the reference, and a component that is empty, meaning that
	the separator was present and was immediately followed by the next		the separator was present and was immediately followed by the next
	component separator or the end of the reference.		component separator or the end of the reference.
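The undefined-versus-empty distinction described above can be illustrated with a short Python sketch. This diff skips the middle of the pseudocode, so the handling of authority, path, and query below is an assumption modeled on the visible scheme step; None stands for an undefined component and "" for an empty one.

```python
def recompose(scheme, authority, path, query, fragment):
    """Rebuild a URI reference string from parsed components (Section 5.3)."""
    result = ""
    if scheme is not None:
        result += scheme + ":"
    if authority is not None:
        result += "//" + authority
    result += path                  # the path is never undefined
    if query is not None:
        result += "?" + query       # an empty query still emits "?"
    if fragment is not None:
        result += "#" + fragment
    return result
```

Passing `query=""` yields a trailing "?", whereas `query=None` omits the delimiter entirely, so a reference that is parsed and recomposed keeps its original delimiters.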


	5.4 Reference Resolution Examples		5.4. Reference Resolution Examples


	Within a representation with a well-defined base URI of		Within a representation with a well defined base URI of

	http://a/b/c/d;p?q		http://a/b/c/d;p?q

	a relative reference is transformed to its target URI as follows.		a relative reference is transformed to its target URI as follows.


	5.4.1 Normal Examples		5.4.1. Normal Examples

	"g:h" = "g:h"		"g:h" = "g:h"
	"g" = "http://a/b/c/g"		"g" = "http://a/b/c/g"
	"./g" = "http://a/b/c/g"		"./g" = "http://a/b/c/g"
	"g/" = "http://a/b/c/g/"		"g/" = "http://a/b/c/g/"
	"/g" = "http://a/g"		"/g" = "http://a/g"
	"//g" = "http://g"		"//g" = "http://g"
	"?y" = "http://a/b/c/d;p?y"		"?y" = "http://a/b/c/d;p?y"
	"g?y" = "http://a/b/c/g?y"		"g?y" = "http://a/b/c/g?y"
	"#s" = "http://a/b/c/d;p?q#s"		"#s" = "http://a/b/c/d;p?q#s"

	skipping to change at page 35, line 31 ¶		skipping to change at page 36, line 31 ¶
	"" = "http://a/b/c/d;p?q"		"" = "http://a/b/c/d;p?q"
	"." = "http://a/b/c/"		"." = "http://a/b/c/"
	"./" = "http://a/b/c/"		"./" = "http://a/b/c/"
	".." = "http://a/b/"		".." = "http://a/b/"
	"../" = "http://a/b/"		"../" = "http://a/b/"
	"../g" = "http://a/b/g"		"../g" = "http://a/b/g"
	"../.." = "http://a/"		"../.." = "http://a/"
	"../../" = "http://a/"		"../../" = "http://a/"
	"../../g" = "http://a/g"		"../../g" = "http://a/g"


	5.4.2 Abnormal Examples		5.4.2. Abnormal Examples

	Although the following abnormal examples are unlikely to occur in		Although the following abnormal examples are unlikely to occur in
	normal practice, all URI parsers should be capable of resolving them		normal practice, all URI parsers should be capable of resolving them

	consistently. Each example uses the same base as above.		consistently. Each example uses the same base as that above.

	Parsers must be careful in handling cases where there are more ".."		Parsers must be careful in handling cases where there are more ".."
	segments in a relative-path reference than there are hierarchical		segments in a relative-path reference than there are hierarchical
	levels in the base URI's path. Note that the ".." syntax cannot be		levels in the base URI's path. Note that the ".." syntax cannot be
	used to change the authority component of a URI.		used to change the authority component of a URI.

	"../../../g" = "http://a/g"		"../../../g" = "http://a/g"
	"../../../../g" = "http://a/g"		"../../../../g" = "http://a/g"

	Similarly, parsers must remove the dot-segments "." and ".." when		Similarly, parsers must remove the dot-segments "." and ".." when

	skipping to change at page 36, line 25 ¶		skipping to change at page 37, line 29 ¶
	"./../g" = "http://a/b/g"		"./../g" = "http://a/b/g"
	"./g/." = "http://a/b/c/g/"		"./g/." = "http://a/b/c/g/"
	"g/./h" = "http://a/b/c/g/h"		"g/./h" = "http://a/b/c/g/h"
	"g/../h" = "http://a/b/c/h"		"g/../h" = "http://a/b/c/h"
	"g;x=1/./y" = "http://a/b/c/g;x=1/y"		"g;x=1/./y" = "http://a/b/c/g;x=1/y"
	"g;x=1/../y" = "http://a/b/c/y"		"g;x=1/../y" = "http://a/b/c/y"

	Some applications fail to separate the reference's query and/or		Some applications fail to separate the reference's query and/or
	fragment components from the path component before merging it with		fragment components from the path component before merging it with
	the base path and removing dot-segments. This error is rarely		the base path and removing dot-segments. This error is rarely

	noticed, since typical usage of a fragment never includes the		noticed, as typical usage of a fragment never includes the hierarchy
	hierarchy ("/") character, and the query component is not normally		("/") character and the query component is not normally used within
	used within relative references.		relative references.

	"g?y/./x" = "http://a/b/c/g?y/./x"		"g?y/./x" = "http://a/b/c/g?y/./x"
	"g?y/../x" = "http://a/b/c/g?y/../x"		"g?y/../x" = "http://a/b/c/g?y/../x"
	"g#s/./x" = "http://a/b/c/g#s/./x"		"g#s/./x" = "http://a/b/c/g#s/./x"
	"g#s/../x" = "http://a/b/c/g#s/../x"		"g#s/../x" = "http://a/b/c/g#s/../x"

	Some parsers allow the scheme name to be present in a relative		Some parsers allow the scheme name to be present in a relative
	reference if it is the same as the base URI scheme. This is		reference if it is the same as the base URI scheme. This is
	considered to be a loophole in prior specifications of partial URI		considered to be a loophole in prior specifications of partial URI

	[RFC1630]. Its use should be avoided, but is allowed for backward		[RFC1630]. Its use should be avoided but is allowed for backward
	compatibility.		compatibility.

	"http:g" = "http:g" ; for strict parsers		"http:g" = "http:g" ; for strict parsers
	/ "http://a/b/c/g" ; for backward compatibility		/ "http://a/b/c/g" ; for backward compatibility
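The examples in Section 5.4 make a convenient regression table for any resolver. As one possible check, the sketch below runs a subset of them through Python's standard `urllib.parse.urljoin`, which is documented as following RFC 3986 semantics since Python 3.5. The names `BASE`, `EXPECTED`, and `check_examples` are illustrative only.

```python
from urllib.parse import urljoin

BASE = "http://a/b/c/d;p?q"

# A subset of the normal and abnormal examples from Section 5.4.
EXPECTED = {
    "g": "http://a/b/c/g",
    "./g": "http://a/b/c/g",
    "g/": "http://a/b/c/g/",
    "/g": "http://a/g",
    "//g": "http://g",
    "g?y": "http://a/b/c/g?y",
    "#s": "http://a/b/c/d;p?q#s",
    "../g": "http://a/b/g",
    "../../g": "http://a/g",
    "../../../g": "http://a/g",
    "g/../h": "http://a/b/c/h",
    "g?y/../x": "http://a/b/c/g?y/../x",
}


def check_examples(resolve=urljoin):
    """Return (reference, got, wanted) for every example that differs."""
    return [(ref, resolve(BASE, ref), want)
            for ref, want in EXPECTED.items()
            if resolve(BASE, ref) != want]
```

Any other resolver that takes a base URI and a reference can be passed as `resolve` to run the same table against it.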

	6. Normalization and Comparison		6. Normalization and Comparison

	One of the most common operations on URIs is simple comparison:		One of the most common operations on URIs is simple comparison:

	determining if two URIs are equivalent without using the URIs to		determining whether two URIs are equivalent without using the URIs to
	access their respective resource(s). A comparison is performed every		access their respective resource(s). A comparison is performed every
	time a response cache is accessed, a browser checks its history to		time a response cache is accessed, a browser checks its history to
	color a link, or an XML parser processes tags within a namespace.		color a link, or an XML parser processes tags within a namespace.
	Extensive normalization prior to comparison of URIs is often used by		Extensive normalization prior to comparison of URIs is often used by

	spiders and indexing engines to prune a search space or reduce		spiders and indexing engines to prune a search space or to reduce
	duplication of request actions and response storage.		duplication of request actions and response storage.


	URI comparison is performed in respect to some particular purpose,		URI comparison is performed for some particular purpose. Protocols
	and implementations with differing purposes will often be subject to		or implementations that compare URIs for different purposes will
	differing design trade-offs in regards to how much effort should be		often be subject to differing design trade-offs in regards to how
	spent in reducing aliased identifiers. This section describes a		much effort should be spent in reducing aliased identifiers. This
	variety of methods that may be used to compare URIs, the trade-offs		section describes various methods that may be used to compare URIs,
	between them, and the types of applications that might use them.		the trade-offs between them, and the types of applications that might
			use them.


	6.1 Equivalence		6.1. Equivalence


	Since URIs exist to identify resources, presumably they should be		Because URIs exist to identify resources, presumably they should be
	considered equivalent when they identify the same resource. However,		considered equivalent when they identify the same resource. However,

	such a definition of equivalence is not of much practical use, since		this definition of equivalence is not of much practical use, as there
	there is no way for an implementation to compare two resources that		is no way for an implementation to compare two resources unless it
	are not under its own control. For this reason, determination of		has full knowledge or control of them. For this reason,
	equivalence or difference of URIs is based on string comparison,		determination of equivalence or difference of URIs is based on string
	perhaps augmented by reference to additional rules provided by URI		comparison, perhaps augmented by reference to additional rules
	scheme definitions. We use the terms "different" and "equivalent" to		provided by URI scheme definitions. We use the terms "different" and
	describe the possible outcomes of such comparisons, but there are		"equivalent" to describe the possible outcomes of such comparisons,
	many application-dependent versions of equivalence.		but there are many application-dependent versions of equivalence.

	Even though it is possible to determine that two URIs are equivalent,		Even though it is possible to determine that two URIs are equivalent,

	URI comparison is not sufficient to determine if two URIs identify		URI comparison is not sufficient to determine whether two URIs
	different resources. For example, an owner of two different domain		identify different resources. For example, an owner of two different
	names could decide to serve the same resource from both, resulting in		domain names could decide to serve the same resource from both,
	two different URIs. Therefore, comparison methods are designed to		resulting in two different URIs. Therefore, comparison methods are
	minimize false negatives while strictly avoiding false positives.		designed to minimize false negatives while strictly avoiding false
			positives.

	In testing for equivalence, applications should not directly compare		In testing for equivalence, applications should not directly compare
	relative references; the references should be converted to their		relative references; the references should be converted to their

	respective target URIs before comparison. When URIs are being		respective target URIs before comparison. When URIs are compared to
	compared for the purpose of selecting (or avoiding) a network action,		select (or avoid) a network action, such as retrieval of a
	such as retrieval of a representation, fragment components (if any)		representation, fragment components (if any) should be excluded from
	should be excluded from the comparison.		the comparison.
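A comparison that excludes fragments, as recommended above for selecting a network action, might look like the following Python sketch. It assumes both arguments are already target URIs (not relative references) and uses a hypothetical helper name; the fragment is taken to begin at the first "#".

```python
def same_for_retrieval(uri_a, uri_b):
    """Compare two target URIs while ignoring any fragment component."""
    # Everything from the first "#" onward is the fragment.
    return uri_a.split("#", 1)[0] == uri_b.split("#", 1)[0]
```

Splitting the string by hand, rather than parsing and rebuilding it, leaves the rest of each URI untouched, so the check cannot introduce a false positive through unintended normalization.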


	6.2 Comparison Ladder		6.2. Comparison Ladder

	A variety of methods are used in practice to test URI equivalence.		A variety of methods are used in practice to test URI equivalence.
	These methods fall into a range, distinguished by the amount of		These methods fall into a range, distinguished by the amount of
	processing required and the degree to which the probability of false		processing required and the degree to which the probability of false
	negatives is reduced. As noted above, false negatives cannot be		negatives is reduced. As noted above, false negatives cannot be
	eliminated. In practice, their probability can be reduced, but this		eliminated. In practice, their probability can be reduced, but this
	reduction requires more processing and is not cost-effective for all		reduction requires more processing and is not cost-effective for all
	applications.		applications.

	If this range of comparison practices is considered as a ladder, the		If this range of comparison practices is considered as a ladder, the

	following discussion will climb the ladder, starting with those		following discussion will climb the ladder, starting with practices
	practices that are cheap but have a relatively higher chance of		that are cheap but have a relatively higher chance of producing false
	producing false negatives, and proceeding to those that have higher		negatives, and proceeding to those that have higher computational
	computational cost and lower risk of false negatives.		cost and lower risk of false negatives.


	6.2.1 Simple String Comparison		6.2.1. Simple String Comparison


	If two URIs, considered as character strings, are identical, then it		If two URIs, when considered as character strings, are identical,
	is safe to conclude that they are equivalent. This type of		then it is safe to conclude that they are equivalent. This type of
	equivalence test has very low computational cost and is in wide use		equivalence test has very low computational cost and is in wide use
	in a variety of applications, particularly in the domain of parsing.		in a variety of applications, particularly in the domain of parsing.

	Testing strings for equivalence requires some basic precautions.		Testing strings for equivalence requires some basic precautions.
	This procedure is often referred to as "bit-for-bit" or		This procedure is often referred to as "bit-for-bit" or
	"byte-for-byte" comparison, which is potentially misleading. Testing		"byte-for-byte" comparison, which is potentially misleading. Testing

	of strings for equality is normally based on pairwise comparison of		strings for equality is normally based on pair comparison of the
	the characters that make up the strings, starting from the first and		characters that make up the strings, starting from the first and
	proceeding until both strings are exhausted and all characters found		proceeding until both strings are exhausted and all characters are
	to be equal, a pair of characters compares unequal, or one of the		found to be equal, until a pair of characters compares unequal, or
	strings is exhausted before the other.		until one of the strings is exhausted before the other.


	Such character comparisons require that each pair of characters be		This character comparison requires that each pair of characters be
	put in comparable form. For example, should one URI be stored in a		put in comparable form. For example, should one URI be stored in a

	byte array in EBCDIC encoding, and the second be in a Java String		byte array in EBCDIC encoding and the second in a Java String object
	object (UTF-16), bit-for-bit comparisons applied naively will produce		(UTF-16), bit-for-bit comparisons applied naively will produce
	errors. It is better to speak of equality on a		errors. It is better to speak of equality on a character-for-
	character-for-character rather than byte-for-byte or bit-for-bit		character basis rather than on a byte-for-byte or bit-for-bit basis.
	basis. In practical terms, character-by-character comparisons should		In practical terms, character-by-character comparisons should be done
	be done codepoint-by-codepoint after conversion to a common character		codepoint-by-codepoint after conversion to a common character
	encoding.		encoding.
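
As a rough illustration of the point above, the following Python sketch compares a URI held as EBCDIC bytes with one held as a native string. The function name same_uri and the choice of the "cp037" code page are assumptions made for this example; the text only says "EBCDIC" without naming a code page.

```python
def same_uri(ebcdic_bytes: bytes, text: str, codec: str = "cp037") -> bool:
    """Compare character-for-character after decoding to a common form.

    "cp037" is one EBCDIC code page shipped with Python; it is an
    assumption here, not something the specification prescribes.
    """
    # Python compares str objects code point by code point.
    return ebcdic_bytes.decode(codec) == text


uri = "http://example.com/a"
stored = uri.encode("cp037")  # the same URI, stored as EBCDIC octets

# A naive byte-level comparison against another encoding of the same
# characters does not match, even though the URIs are identical.
naive_match = stored == uri.encode("utf-16-le")

# Decoding first makes the comparison meaningful.
character_match = same_uri(stored, uri)
```

Here naive_match reflects the misleading "byte-for-byte" test, while character_match reflects the comparison the text recommends.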

	False negatives are caused by the production and use of URI aliases.		False negatives are caused by the production and use of URI aliases.
	Unnecessary aliases can be reduced, regardless of the comparison		Unnecessary aliases can be reduced, regardless of the comparison

	method, by consistently providing URI references in an		method, by consistently providing URI references in an already-
	already-normalized form (i.e., a form identical to what would be		normalized form (i.e., a form identical to what would be produced
	produced after normalization is applied, as described below).		after normalization is applied, as described below).
	Protocols and data formats often choose to limit some URI comparisons
	to simple string comparison, based on the theory that people and		Protocols and data formats often limit some URI comparisons to simple
			string comparison, based on the theory that people and
	implementations will, in their own best interest, be consistent in		implementations will, in their own best interest, be consistent in
	providing URI references, or at least consistent enough to negate any		providing URI references, or at least consistent enough to negate any
	efficiency that might be obtained from further normalization.		efficiency that might be obtained from further normalization.


	6.2.2 Syntax-based Normalization		6.2.2. Syntax-Based Normalization

	Implementations may use logic based on the definitions provided by		Implementations may use logic based on the definitions provided by
	this specification to reduce the probability of false negatives.		this specification to reduce the probability of false negatives.

	Such processing is moderately higher in cost than		This processing is moderately higher in cost than character-for-
	character-for-character string comparison. For example, an		character string comparison. For example, an application using this
	application using this approach could reasonably consider the		approach could reasonably consider the following two URIs equivalent:
	following two URIs equivalent:

	example://a/b/c/%7Bfoo%7D		example://a/b/c/%7Bfoo%7D
	eXAMPLE://a/./b/../b/%63/%7bfoo%7d		eXAMPLE://a/./b/../b/%63/%7bfoo%7d

	Web user agents, such as browsers, typically apply this type of URI		Web user agents, such as browsers, typically apply this type of URI
	normalization when determining whether a cached response is		normalization when determining whether a cached response is
	available. Syntax-based normalization includes such techniques as		available. Syntax-based normalization includes such techniques as
	case normalization, percent-encoding normalization, and removal of		case normalization, percent-encoding normalization, and removal of
	dot-segments.		dot-segments.


	6.2.2.1 Case Normalization		6.2.2.1. Case Normalization

	For all URIs, the hexadecimal digits within a percent-encoding		For all URIs, the hexadecimal digits within a percent-encoding
	triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore		triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore
	should be normalized to use uppercase letters for the digits A-F.		should be normalized to use uppercase letters for the digits A-F.

	When a URI uses components of the generic syntax, the component		When a URI uses components of the generic syntax, the component
	syntax equivalence rules always apply; namely, that the scheme and		syntax equivalence rules always apply; namely, that the scheme and
	host are case-insensitive and therefore should be normalized to		host are case-insensitive and therefore should be normalized to
	lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is		lowercase. For example, the URI <HTTP://www.EXAMPLE.com/> is
	equivalent to <http://www.example.com/>. The other generic syntax		equivalent to <http://www.example.com/>. The other generic syntax
	components are assumed to be case-sensitive unless specifically		components are assumed to be case-sensitive unless specifically
	defined otherwise by the scheme (see Section 6.2.3).		defined otherwise by the scheme (see Section 6.2.3).


	6.2.2.2 Percent-Encoding Normalization		6.2.2.2. Percent-Encoding Normalization

	The percent-encoding mechanism (Section 2.1) is a frequent source of		The percent-encoding mechanism (Section 2.1) is a frequent source of
	variance among otherwise identical URIs. In addition to the case		variance among otherwise identical URIs. In addition to the case
	normalization issue noted above, some URI producers percent-encode		normalization issue noted above, some URI producers percent-encode
	octets that do not require percent-encoding, resulting in URIs that		octets that do not require percent-encoding, resulting in URIs that

	are equivalent to their non-encoded counterparts. Such URIs should		are equivalent to their non-encoded counterparts. These URIs should
	be normalized by decoding any percent-encoded octet that corresponds		be normalized by decoding any percent-encoded octet that corresponds
	to an unreserved character, as described in Section 2.3.		to an unreserved character, as described in Section 2.3.


	6.2.2.3 Path Segment Normalization		6.2.2.3. Path Segment Normalization

	The complete path segments "." and ".." are intended only for use		The complete path segments "." and ".." are intended only for use
	within relative references (Section 4.1) and are removed as part of		within relative references (Section 4.1) and are removed as part of
	the reference resolution process (Section 5.2). However, some		the reference resolution process (Section 5.2). However, some
	deployed implementations incorrectly assume that reference resolution		deployed implementations incorrectly assume that reference resolution

	is not necessary when the reference is already a URI, and thus fail		is not necessary when the reference is already a URI and thus fail to
	to remove dot-segments when they occur in non-relative paths. URI		remove dot-segments when they occur in non-relative paths. URI
	normalizers should remove dot-segments by applying the		normalizers should remove dot-segments by applying the
	remove_dot_segments algorithm to the path, as described in		remove_dot_segments algorithm to the path, as described in
	Section 5.2.4.		Section 5.2.4.
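
The three syntax-based steps described in 6.2.2.1 through 6.2.2.3 can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the function names are hypothetical, the component split uses the regular expression given in Appendix B of the specification, and the dot-segment loop follows the steps of Section 5.2.4.

```python
import re

# Component-splitting expression from Appendix B of the specification.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _fix_triplet(match):
    # Decode triplets for unreserved characters; uppercase the hex of the rest.
    ch = chr(int(match.group(1), 16))
    return ch if ch in UNRESERVED else "%" + match.group(1).upper()


def normalize_percent(text):
    return re.sub(r"%([0-9A-Fa-f]{2})", _fix_triplet, text)


def normalize_authority(authority):
    userinfo, at, hostport = normalize_percent(authority).rpartition("@")
    # lower() also lowercases hex digits, so fix triplet case again afterwards.
    return userinfo + at + normalize_percent(hostport.lower())


def remove_dot_segments(path):
    out = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]          # "/./" becomes "/"
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]          # "/../" becomes "/"
            if out:
                out.pop()
        elif path == "/..":
            path = "/"
            if out:
                out.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with any leading "/") to the output.
            idx = path.find("/", 1)
            if idx == -1:
                out.append(path)
                path = ""
            else:
                out.append(path[:idx])
                path = path[idx:]
    return "".join(out)


def normalize(uri):
    scheme, authority, path, query, fragment = URI_RE.match(uri).group(2, 4, 5, 7, 9)
    out = ""
    if scheme is not None:
        out += scheme.lower() + ":"
    if authority is not None:
        out += "//" + normalize_authority(authority)
    # Percent-decoding runs first so that an encoded "." is seen as a dot.
    out += remove_dot_segments(normalize_percent(path))
    if query is not None:
        out += "?" + normalize_percent(query)
    if fragment is not None:
        out += "#" + normalize_percent(fragment)
    return out
```

Applied to the pair of example URIs in Section 6.2.2, normalize() is intended to map both to the same string, so that a simple string comparison of the results reports them as equivalent.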


	6.2.3 Scheme-based Normalization		6.2.3. Scheme-Based Normalization

	The syntax and semantics of URIs vary from scheme to scheme, as		The syntax and semantics of URIs vary from scheme to scheme, as
	described by the defining specification for each scheme.		described by the defining specification for each scheme.
	Implementations may use scheme-specific rules, at further processing		Implementations may use scheme-specific rules, at further processing
	cost, to reduce the probability of false negatives. For example,		cost, to reduce the probability of false negatives. For example,

	since the "http" scheme makes use of an authority component, has a		because the "http" scheme makes use of an authority component, has a
	default port of "80", and defines an empty path to be equivalent to		default port of "80", and defines an empty path to be equivalent to
	"/", the following four URIs are equivalent:		"/", the following four URIs are equivalent:

	http://example.com		http://example.com
	http://example.com/		http://example.com/
	http://example.com:/		http://example.com:/
	http://example.com:80/		http://example.com:80/

	In general, a URI that uses the generic syntax for authority with an		In general, a URI that uses the generic syntax for authority with an

	empty path should be normalized to a path of "/"; likewise, an		empty path should be normalized to a path of "/". Likewise, an
	explicit ":port", where the port is empty or the default for the		explicit ":port", for which the port is empty or the default for the
	scheme, is equivalent to one where the port and its ":" delimiter are		scheme, is equivalent to one where the port and its ":" delimiter are

	elided, and thus should be removed by scheme-based normalization.		elided and thus should be removed by scheme-based normalization. For
	For example, the second URI above is the normal form for the "http"		example, the second URI above is the normal form for the "http"
	scheme.		scheme.
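
The "http" rules above might be sketched in Python as shown below. The function name scheme_normalize and the DEFAULT_PORTS table are assumptions for illustration; only the "http" entry is taken from the text. The component split again uses the Appendix B regular expression.

```python
import re

URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")

# Hypothetical lookup table; only "http" -> "80" comes from the text above.
DEFAULT_PORTS = {"http": "80"}


def scheme_normalize(uri):
    scheme, authority, path, query, fragment = URI_RE.match(uri).group(2, 4, 5, 7, 9)
    if scheme is None or authority is None:
        return uri  # nothing to do without a scheme and an authority

    # Drop ":port" when the port is empty or the scheme default. An IPv6
    # literal ends in "]", so its inner colons never look like a port here.
    host, colon, port = authority.rpartition(":")
    if colon and port in ("", DEFAULT_PORTS.get(scheme.lower())):
        authority = host

    # An empty path with an authority is normalized to "/".
    if path == "":
        path = "/"

    out = scheme + "://" + authority + path
    # Delimiters of empty components are kept: "?" alone is still significant.
    if query is not None:
        out += "?" + query
    if fragment is not None:
        out += "#" + fragment
    return out


examples = [
    "http://example.com",
    "http://example.com/",
    "http://example.com:/",
    "http://example.com:80/",
]
normalized = {scheme_normalize(u) for u in examples}
```

In this sketch the set normalized collapses to a single member, the second URI in the list.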

	Another case where normalization varies by scheme is in the handling		Another case where normalization varies by scheme is in the handling
	of an empty authority component or empty host subcomponent. For many		of an empty authority component or empty host subcomponent. For many
	scheme specifications, an empty authority or host is considered an		scheme specifications, an empty authority or host is considered an
	error; for others, it is considered equivalent to "localhost" or the		error; for others, it is considered equivalent to "localhost" or the
	end-user's host. When a scheme defines a default for authority and a		end-user's host. When a scheme defines a default for authority and a
	URI reference to that default is desired, the reference should be		URI reference to that default is desired, the reference should be
	normalized to an empty authority for the sake of uniformity, brevity,		normalized to an empty authority for the sake of uniformity, brevity,
	and internationalization. If, however, either the userinfo or port		and internationalization. If, however, either the userinfo or port

	subcomponent is non-empty, then the host should be given explicitly		subcomponents are non-empty, then the host should be given explicitly
	even if it matches the default.		even if it matches the default.

	Normalization should not remove delimiters when their associated		Normalization should not remove delimiters when their associated
	component is empty unless licensed to do so by the scheme		component is empty unless licensed to do so by the scheme
	specification. For example, the URI "http://example.com/?" cannot be		specification. For example, the URI "http://example.com/?" cannot be
	assumed to be equivalent to any of the examples above. Likewise, the		assumed to be equivalent to any of the examples above. Likewise, the
	presence or absence of delimiters within a userinfo subcomponent is		presence or absence of delimiters within a userinfo subcomponent is
	usually significant to its interpretation. The fragment component is		usually significant to its interpretation. The fragment component is
	not subject to any scheme-based normalization; thus, two URIs that		not subject to any scheme-based normalization; thus, two URIs that
	differ only by the suffix "#" are considered different regardless of		differ only by the suffix "#" are considered different regardless of
	the scheme.		the scheme.


	Some schemes define additional subcomponents that consist of		Some schemes define additional subcomponents that consist of case-
	case-insensitive data, giving an implicit license to normalizers to		insensitive data, giving an implicit license to normalizers to
	convert such data to a common case (e.g., all lowercase). For		convert this data to a common case (e.g., all lowercase). For
	example, URI schemes that define a subcomponent of path to contain an		example, URI schemes that define a subcomponent of path to contain an
	Internet hostname, such as the "mailto" URI scheme, cause that		Internet hostname, such as the "mailto" URI scheme, cause that
	subcomponent to be case-insensitive and thus subject to case		subcomponent to be case-insensitive and thus subject to case
	normalization (e.g., "mailto:Joe@Example.COM" is equivalent to		normalization (e.g., "mailto:Joe@Example.COM" is equivalent to

	"mailto:Joe@example.com" even though the generic syntax considers the		"mailto:Joe@example.com", even though the generic syntax considers
	path component to be case-sensitive).		the path component to be case-sensitive).

	Other scheme-specific normalizations are possible.		Other scheme-specific normalizations are possible.


	6.2.4 Protocol-based Normalization		6.2.4. Protocol-Based Normalization


	Web spiders, for which substantial effort to reduce the incidence of		Substantial effort to reduce the incidence of false negatives is
	false negatives is often cost-effective, are observed to implement		often cost-effective for web spiders. Therefore, they implement even
	even more aggressive techniques in URI comparison. For example, if		more aggressive techniques in URI comparison. For example, if they
	they observe that a URI such as		observe that a URI such as

	http://example.com/data		http://example.com/data

	redirects to a URI differing only in the trailing slash		redirects to a URI differing only in the trailing slash

	http://example.com/data/		http://example.com/data/

	they will likely regard the two as equivalent in the future. This		they will likely regard the two as equivalent in the future. This
	kind of technique is only appropriate when equivalence is clearly		kind of technique is only appropriate when equivalence is clearly
	indicated by both the result of accessing the resources and the		indicated by both the result of accessing the resources and the
	common conventions of their scheme's dereference algorithm (in this		common conventions of their scheme's dereference algorithm (in this
	case, use of redirection by HTTP origin servers to avoid problems		case, use of redirection by HTTP origin servers to avoid problems
	with relative references).		with relative references).

	7. Security Considerations		7. Security Considerations


	A URI does not in itself pose a security threat. However, since URIs		A URI does not in itself pose a security threat. However, as URIs
	are often used to provide a compact set of instructions for access to		are often used to provide a compact set of instructions for access to
	network resources, care must be taken to properly interpret the data		network resources, care must be taken to properly interpret the data
	within a URI, to prevent that data from causing unintended access,		within a URI, to prevent that data from causing unintended access,
	and to avoid including data that should not be revealed in plain		and to avoid including data that should not be revealed in plain
	text.		text.


	7.1 Reliability and Consistency		7.1. Reliability and Consistency


	There is no guarantee that, having once used a given URI to retrieve		There is no guarantee that once a URI has been used to retrieve
	some information, the same information will be retrievable by that		information, the same information will be retrievable by that URI in
	URI in the future. Nor is there any guarantee that the information		the future. Nor is there any guarantee that the information
	retrievable via that URI in the future will be observably similar to		retrievable via that URI in the future will be observably similar to
	that retrieved in the past. The URI syntax does not constrain how a		that retrieved in the past. The URI syntax does not constrain how a

	given scheme or authority apportions its name space or maintains it		given scheme or authority apportions its namespace or maintains it
	over time. Such a guarantee can only be obtained from the person(s)		over time. Such guarantees can only be obtained from the person(s)
	controlling that name space and the resource in question. A specific		controlling that namespace and the resource in question. A specific
	URI scheme may define additional semantics, such as name persistence,		URI scheme may define additional semantics, such as name persistence,
	if those semantics are required of all naming authorities for that		if those semantics are required of all naming authorities for that
	scheme.		scheme.


	7.2 Malicious Construction		7.2. Malicious Construction


	It is sometimes possible to construct a URI such that an attempt to		It is sometimes possible to construct a URI so that an attempt to
	perform a seemingly harmless, idempotent operation, such as the		perform a seemingly harmless, idempotent operation, such as the
	retrieval of a representation, will in fact cause a possibly damaging		retrieval of a representation, will in fact cause a possibly damaging

	remote operation to occur. The unsafe URI is typically constructed		remote operation. The unsafe URI is typically constructed by
	by specifying a port number other than that reserved for the network		specifying a port number other than that reserved for the network
	protocol in question. The client unwittingly contacts a site that is		protocol in question. The client unwittingly contacts a site running
	running a different protocol service and data within the URI contains		a different protocol service, and data within the URI contains
	instructions that, when interpreted according to this other protocol,		instructions that, when interpreted according to this other protocol,
	cause an unexpected operation. A frequent example of such abuse has		cause an unexpected operation. A frequent example of such abuse has
	been the use of a protocol-based scheme with a port component of		been the use of a protocol-based scheme with a port component of
	"25", thereby fooling user agent software into sending an unintended		"25", thereby fooling user agent software into sending an unintended
	or impersonating message via an SMTP server.		or impersonating message via an SMTP server.

	Applications should prevent dereference of a URI that specifies a TCP		Applications should prevent dereference of a URI that specifies a TCP
	port number within the "well-known port" range (0 - 1023) unless the		port number within the "well-known port" range (0 - 1023) unless the
	protocol being used to dereference that URI is compatible with the		protocol being used to dereference that URI is compatible with the
	protocol expected on that well-known port. Although IANA maintains a		protocol expected on that well-known port. Although IANA maintains a
	registry of well-known ports, applications should make such		registry of well-known ports, applications should make such
	restrictions user-configurable to avoid preventing the deployment of		restrictions user-configurable to avoid preventing the deployment of
	new services.		new services.

	When a URI contains percent-encoded octets that match the delimiters		When a URI contains percent-encoded octets that match the delimiters
	for a given resolution or dereference protocol (for example, CR and		for a given resolution or dereference protocol (for example, CR and

	LF characters for the TELNET protocol), such percent-encoded octets		LF characters for the TELNET protocol), these percent-encodings must
	must not be decoded before transmission across that protocol.		not be decoded before transmission across that protocol. Transfer of
	Transfer of the percent-encoding, which might violate the protocol,		the percent-encoding, which might violate the protocol, is less
	is less harmful than allowing decoded octets to be interpreted as		harmful than allowing decoded octets to be interpreted as additional
	additional operations or parameters, perhaps triggering an unexpected		operations or parameters, perhaps triggering an unexpected and
	and possibly harmful remote operation.		possibly harmful remote operation.


	7.3 Back-end Transcoding		7.3. Back-End Transcoding

	When a URI is dereferenced, the data within it is often parsed by		When a URI is dereferenced, the data within it is often parsed by
	both the user agent and one or more servers. In HTTP, for example, a		both the user agent and one or more servers. In HTTP, for example, a
	typical user agent will parse a URI into its five major components,		typical user agent will parse a URI into its five major components,
	access the authority's server, and send it the data within the		access the authority's server, and send it the data within the
	authority, path, and query components. A typical server will take		authority, path, and query components. A typical server will take
	that information, parse the path into segments and the query into		that information, parse the path into segments and the query into
	key/value pairs, and then invoke implementation-specific handlers to		key/value pairs, and then invoke implementation-specific handlers to
	respond to the request. As a result, a common security concern for		respond to the request. As a result, a common security concern for
	server implementations that handle a URI, either as a whole or split		server implementations that handle a URI, either as a whole or split
	into separate components, is proper interpretation of the octet data		into separate components, is proper interpretation of the octet data
	represented by the characters and percent-encodings within that URI.		represented by the characters and percent-encodings within that URI.

	Percent-encoded octets must be decoded at some point during the		Percent-encoded octets must be decoded at some point during the
	dereference process. Applications must split the URI into its		dereference process. Applications must split the URI into its

	components and subcomponents prior to decoding the octets, since		components and subcomponents prior to decoding the octets, as
	otherwise the decoded octets might be mistaken for delimiters.		otherwise the decoded octets might be mistaken for delimiters.
	Security checks of the data within a URI should be applied after		Security checks of the data within a URI should be applied after
	decoding the octets. Note, however, that the "%00" percent-encoding		decoding the octets. Note, however, that the "%00" percent-encoding
	(NUL) may require special handling and should be rejected if the		(NUL) may require special handling and should be rejected if the
	application is not expecting to receive raw data within a component.		application is not expecting to receive raw data within a component.
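
The ordering requirement above (split first, decode afterwards) can be illustrated with a short Python sketch. The helper name path_segments is hypothetical, and rejecting "%00" outright is just one possible policy for an application that does not expect raw data in path segments.

```python
from urllib.parse import unquote


def path_segments(raw_path):
    """Split a still-encoded path on "/", then decode each segment."""
    segments = []
    for raw in raw_path.split("/")[1:]:
        if "%00" in raw:
            # Assumed policy: this application never expects NUL in a segment.
            raise ValueError("unexpected %00 in path segment")
        segments.append(unquote(raw))
    return segments


raw = "/a%2Fb/c"

# Correct order: two segments, the first containing a literal "/".
split_then_decode = path_segments(raw)

# Wrong order: the decoded "/" is mistaken for a segment delimiter.
decode_then_split = unquote(raw).split("/")[1:]
```

The two results differ in the number of segments, which is exactly the confusion between data and delimiters that the text warns about.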

	Special care should be taken when the URI path interpretation process		Special care should be taken when the URI path interpretation process

	involves the use of a back-end filesystem or related system		involves the use of a back-end file system or related system
	functions. Filesystems typically assign an operational meaning to		functions. File systems typically assign an operational meaning to
	special characters, such as the "/", "\", ":", "[", and "]"		special characters, such as the "/", "\", ":", "[", and "]"

	characters, and special device names like ".", "..", "...", "aux",		characters, and to special device names like ".", "..", "...", "aux",
	"lpt", etc. In some cases, merely testing for the existence of such		"lpt", etc. In some cases, merely testing for the existence of such
	a name will cause the operating system to pause or invoke unrelated		a name will cause the operating system to pause or invoke unrelated
	system calls, leading to significant security concerns regarding		system calls, leading to significant security concerns regarding
	denial of service and unintended data transfer. It would be		denial of service and unintended data transfer. It would be
	impossible for this specification to list all such significant		impossible for this specification to list all such significant

	characters and device names; implementers should research the		characters and device names. Implementers should research the
	reserved names and characters for the types of storage device that		reserved names and characters for the types of storage device that

	may be attached to their application and restrict the use of data		may be attached to their applications and restrict the use of data
	obtained from URI components accordingly.		obtained from URI components accordingly.


	7.4 Rare IP Address Formats		7.4. Rare IP Address Formats


	Although the URI syntax for IPv4address only allows the common,		Although the URI syntax for IPv4address only allows the common
	dotted-decimal form of IPv4 address literal, many implementations		dotted-decimal form of IPv4 address literal, many implementations
	that process URIs make use of platform-dependent system routines,		that process URIs make use of platform-dependent system routines,
	such as gethostbyname() and inet_aton(), to translate the string		such as gethostbyname() and inet_aton(), to translate the string
	literal to an actual IP address. Unfortunately, such system routines		literal to an actual IP address. Unfortunately, such system routines
	often allow and process a much larger set of formats than those		often allow and process a much larger set of formats than those
	described in Section 3.2.2.		described in Section 3.2.2.

	For example, many implementations allow dotted forms of three		For example, many implementations allow dotted forms of three
	numbers, wherein the last part is interpreted as a 16-bit quantity		numbers, wherein the last part is interpreted as a 16-bit quantity
	and placed in the right-most two bytes of the network address (e.g.,		and placed in the right-most two bytes of the network address (e.g.,

	a Class B network). Likewise, a dotted form of two numbers means the		a Class B network). Likewise, a dotted form of two numbers means
	last part is interpreted as a 24-bit quantity and placed in the right		that the last part is interpreted as a 24-bit quantity and placed in
	most three bytes of the network address (Class A), and a single		the right-most three bytes of the network address (Class A), and a
	number (without dots) is interpreted as a 32-bit quantity and stored		single number (without dots) is interpreted as a 32-bit quantity and
	directly in the network address. Adding further to the confusion,		stored directly in the network address. Adding further to the
	some implementations allow each dotted part to be interpreted as		confusion, some implementations allow each dotted part to be
	decimal, octal, or hexadecimal, as specified in the C language (i.e.,		interpreted as decimal, octal, or hexadecimal, as specified in the C
	a leading 0x or 0X implies hexadecimal; otherwise, a leading 0		language (i.e., a leading 0x or 0X implies hexadecimal; a leading 0
	implies octal; otherwise, the number is interpreted as decimal).		implies octal; otherwise, the number is interpreted as decimal).

	These additional IP address formats are not allowed in the URI syntax		These additional IP address formats are not allowed in the URI syntax
	due to differences between platform implementations. However, they		due to differences between platform implementations. However, they
	can become a security concern if an application attempts to filter		can become a security concern if an application attempts to filter
	access to resources based on the IP address in string literal format.		access to resources based on the IP address in string literal format.

	If such filtering is performed, literals should be converted to		If this filtering is performed, literals should be converted to
	numeric form and filtered based on the numeric value, rather than a		numeric form and filtered based on the numeric value, and not on a
	prefix or suffix of the string form.		prefix or suffix of the string form.
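
A sketch of the recommended approach, in Python, is given below. The parser parse_lenient_ipv4 is a hypothetical helper that mimics the one-to-four-part forms and the C-style number bases described above; the forms a real platform routine accepts may differ. The blocked network 10.0.0.0/8 is likewise only an example.

```python
import ipaddress
import re

HEX = re.compile(r"0[xX][0-9a-fA-F]+")
OCT = re.compile(r"0[0-7]*")
DEC = re.compile(r"[1-9][0-9]*")


def _c_number(token):
    # Leading 0x or 0X means hexadecimal, a leading 0 means octal, else decimal.
    if HEX.fullmatch(token):
        return int(token[2:], 16)
    if OCT.fullmatch(token):
        return int(token, 8)
    if DEC.fullmatch(token):
        return int(token, 10)
    raise ValueError("not a number: %r" % token)


def parse_lenient_ipv4(text):
    """Return the 32-bit value for a 1- to 4-part dotted literal."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 4:
        raise ValueError("too many parts")
    *head, last = [_c_number(p) for p in parts]
    if any(n > 0xFF for n in head):
        raise ValueError("leading part out of range")
    width = 8 * (4 - len(head))  # the last part fills the remaining bytes
    if last >= 1 << width:
        raise ValueError("last part out of range")
    value = 0
    for n in head:
        value = (value << 8) | n
    return (value << width) | last


BLOCKED = ipaddress.ip_network("10.0.0.0/8")  # example policy only


def is_blocked(literal):
    # Filter on the numeric value, never on a prefix of the string form.
    return ipaddress.ip_address(parse_lenient_ipv4(literal)) in BLOCKED
```

A filter that only checked whether the string begins with "10." would let a spelling such as "167772161" or "0x0A.0.1" through, whereas the numeric check treats them all as the same address.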


	7.5 Sensitive Information		7.5. Sensitive Information

	URI producers should not provide a URI that contains a username or		URI producers should not provide a URI that contains a username or

	password which is intended to be secret: URIs are frequently		password that is intended to be secret. URIs are frequently
	displayed by browsers, stored in clear text bookmarks, and logged by		displayed by browsers, stored in clear text bookmarks, and logged by
	user agent history and intermediary applications (proxies). A		user agent history and intermediary applications (proxies). A
	password appearing within the userinfo component is deprecated and		password appearing within the userinfo component is deprecated and
	should be considered an error (or simply ignored) except in those		should be considered an error (or simply ignored) except in those
	rare cases where the 'password' parameter is intended to be public.		rare cases where the 'password' parameter is intended to be public.


	7.6 Semantic Attacks		7.6. Semantic Attacks

	Because the userinfo subcomponent is rarely used and appears before		Because the userinfo subcomponent is rarely used and appears before
	the host in the authority component, it can be used to construct a		the host in the authority component, it can be used to construct a

	URI that is intended to mislead a human user by appearing to identify		URI intended to mislead a human user by appearing to identify one
	one (trusted) naming authority while actually identifying a different		(trusted) naming authority while actually identifying a different
	authority hidden behind the noise. For example		authority hidden behind the noise. For example


	ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm		ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm

	might lead a human user to assume that the host is 'cnn.example.com',		might lead a human user to assume that the host is 'cnn.example.com',
	whereas it is actually '10.0.0.1'. Note that a misleading userinfo		whereas it is actually '10.0.0.1'. Note that a misleading userinfo
	subcomponent could be much longer than the example above.		subcomponent could be much longer than the example above.
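
As a small illustration, the Python sketch below parses a URI of the shape described above (userinfo that resembles a trusted name, followed by "@" and the real host) and reports which part actually names the host. It relies on urlsplit from the Python standard library; the describe_authority helper is a hypothetical name introduced for this example.

```python
from urllib.parse import urlsplit


def describe_authority(uri):
    """Return (userinfo, host) so the two can be displayed distinctly."""
    parts = urlsplit(uri)
    return parts.username, parts.hostname


misleading = "ftp://cnn.example.com&story=breaking_news@10.0.0.1/top_story.htm"
userinfo, host = describe_authority(misleading)
```

A user agent could use such a split to render the userinfo differently from the host, which is the kind of mitigation discussed in this section.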


	A misleading URI, such as the one above, is an attack on the user's		A misleading URI, such as that above, is an attack on the user's
	preconceived notions about the meaning of a URI, rather than an		preconceived notions about the meaning of a URI rather than an attack
	attack on the software itself. User agents may be able to reduce the		on the software itself. User agents may be able to reduce the impact
	impact of such attacks by distinguishing the various components of		of such attacks by distinguishing the various components of the URI
	the URI when rendered, such as by using a different color or tone to		when they are rendered, such as by using a different color or tone to
	render userinfo if any is present, though there is no general		render userinfo if any is present, though there is no panacea. More
	panacea. More information on URI-based semantic attacks can be found		information on URI-based semantic attacks can be found in [Siedzik].
	in [Siedzik].

	8. IANA Considerations		8. IANA Considerations

	URI scheme names, as defined by <scheme> in Section 3.1, form a		URI scheme names, as defined by <scheme> in Section 3.1, form a

	registered name space that is managed by IANA according to the		registered namespace that is managed by IANA according to the
	procedures defined in [BCP35]. No IANA actions are required by this		procedures defined in [BCP35]. No IANA actions are required by this
	document.		document.


	9. Acknowledgments		9. Acknowledgements

	This specification is derived from RFC 2396 [RFC2396], RFC 1808		This specification is derived from RFC 2396 [RFC2396], RFC 1808

	[RFC1808], and RFC 1738 [RFC1738]; the acknowledgments in those		[RFC1808], and RFC 1738 [RFC1738]; the acknowledgements in those
	documents still apply. It also incorporates the update (with		documents still apply. It also incorporates the update (with
	corrections) for IPv6 literals in the host syntax, as defined by		corrections) for IPv6 literals in the host syntax, as defined by

	Robert M. Hinden, Brian E. Carpenter, and Larry Masinter in		Robert M. Hinden, Brian E. Carpenter, and Larry Masinter in
	[RFC2732]. In addition, contributions by Gisle Aas, Reese Anschultz,		[RFC2732]. In addition, contributions by Gisle Aas, Reese Anschultz,
	Daniel Barclay, Tim Bray, Mike Brown, Rob Cameron, Jeremy Carroll,		Daniel Barclay, Tim Bray, Mike Brown, Rob Cameron, Jeremy Carroll,

	Dan Connolly, Adam M. Costello, John Cowan, Jason Diamond, Martin		Dan Connolly, Adam M. Costello, John Cowan, Jason Diamond, Martin
	Duerst, Stefan Eissing, Clive D.W. Feather, Al Gilman, Tony Hammond,		Duerst, Stefan Eissing, Clive D.W. Feather, Al Gilman, Tony Hammond,
	Elliotte Harold, Pat Hayes, Henry Holtzman, Ian B. Jacobs, Michael		Elliotte Harold, Pat Hayes, Henry Holtzman, Ian B. Jacobs, Michael
	Kay, John C. Klensin, Graham Klyne, Dan Kohn, Bruce Lilly, Andrew		Kay, John C. Klensin, Graham Klyne, Dan Kohn, Bruce Lilly, Andrew
	Main, Dave McAlpin, Ira McDonald, Michael Mealling, Ray Merkert,		Main, Dave McAlpin, Ira McDonald, Michael Mealling, Ray Merkert,
	Stephen Pollei, Julian Reschke, Tomas Rokicki, Miles Sabin, Kai		Stephen Pollei, Julian Reschke, Tomas Rokicki, Miles Sabin, Kai
	Schaetzl, Mark Thomson, Ronald Tschalaer, Norm Walsh, Marc Warne,		Schaetzl, Mark Thomson, Ronald Tschalaer, Norm Walsh, Marc Warne,
	Stuart Williams, and Henry Zongaro are gratefully acknowledged.		Stuart Williams, and Henry Zongaro are gratefully acknowledged.

	10. References		10. References


	10.1 Normative References		10.1. Normative References

	[ASCII] American National Standards Institute, "Coded Character		[ASCII] American National Standards Institute, "Coded Character
	Set -- 7-bit American Standard Code for Information		Set -- 7-bit American Standard Code for Information
	Interchange", ANSI X3.4, 1986.		Interchange", ANSI X3.4, 1986.

	[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax		[RFC2234] Crocker, D. and P. Overell, "Augmented BNF for Syntax
	Specifications: ABNF", RFC 2234, November 1997.		Specifications: ABNF", RFC 2234, November 1997.


	[STD63] Yergeau, F., "UTF-8, a transformation format of ISO		[STD63] Yergeau, F., "UTF-8, a transformation format of
	10646", STD 63, RFC 3629, November 2003.		ISO 10646", STD 63, RFC 3629, November 2003.

	[UCS] International Organization for Standardization,		[UCS] International Organization for Standardization,
	"Information Technology - Universal Multiple-Octet Coded		"Information Technology - Universal Multiple-Octet Coded
	Character Set (UCS)", ISO/IEC 10646:2003, December 2003.		Character Set (UCS)", ISO/IEC 10646:2003, December 2003.


	10.2 Informative References		10.2. Informative References

	[BCP19] Freed, N. and J. Postel, "IANA Charset Registration		[BCP19] Freed, N. and J. Postel, "IANA Charset Registration
	Procedures", BCP 19, RFC 2978, October 2000.		Procedures", BCP 19, RFC 2978, October 2000.

	[BCP35] Petke, R. and I. King, "Registration Procedures for URL		[BCP35] Petke, R. and I. King, "Registration Procedures for URL
	Scheme Names", BCP 35, RFC 2717, November 1999.		Scheme Names", BCP 35, RFC 2717, November 1999.


	[RFC0952] Harrenstien, K., Stahl, M. and E. Feinler, "DoD Internet		[RFC0952] Harrenstien, K., Stahl, M., and E. Feinler, "DoD Internet
	host table specification", RFC 952, October 1985.		host table specification", RFC 952, October 1985.

	[RFC1034] Mockapetris, P., "Domain names - concepts and facilities",		[RFC1034] Mockapetris, P., "Domain names - concepts and facilities",
	STD 13, RFC 1034, November 1987.		STD 13, RFC 1034, November 1987.

	[RFC1123] Braden, R., "Requirements for Internet Hosts - Application		[RFC1123] Braden, R., "Requirements for Internet Hosts - Application
	and Support", STD 3, RFC 1123, October 1989.		and Support", STD 3, RFC 1123, October 1989.

	[RFC1535] Gavron, E., "A Security Problem and Proposed Correction		[RFC1535] Gavron, E., "A Security Problem and Proposed Correction

	With Widely Deployed DNS Software", RFC 1535, October		With Widely Deployed DNS Software", RFC 1535,
	1993.		October 1993.

	[RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A		[RFC1630] Berners-Lee, T., "Universal Resource Identifiers in WWW: A
	Unifying Syntax for the Expression of Names and Addresses		Unifying Syntax for the Expression of Names and Addresses
	of Objects on the Network as used in the World-Wide Web",		of Objects on the Network as used in the World-Wide Web",
	RFC 1630, June 1994.		RFC 1630, June 1994.

	[RFC1736] Kunze, J., "Functional Recommendations for Internet		[RFC1736] Kunze, J., "Functional Recommendations for Internet
	Resource Locators", RFC 1736, February 1995.		Resource Locators", RFC 1736, February 1995.


	[RFC1737] Masinter, L. and K. Sollins, "Functional Requirements for		[RFC1737] Sollins, K. and L. Masinter, "Functional Requirements for
	Uniform Resource Names", RFC 1737, December 1994.		Uniform Resource Names", RFC 1737, December 1994.


	[RFC1738] Berners-Lee, T., Masinter, L. and M. McCahill, "Uniform		[RFC1738] Berners-Lee, T., Masinter, L., and M. McCahill, "Uniform
	Resource Locators (URL)", RFC 1738, December 1994.		Resource Locators (URL)", RFC 1738, December 1994.


	[RFC1808] Fielding, R., "Relative Uniform Resource Locators", RFC		[RFC1808] Fielding, R., "Relative Uniform Resource Locators",
	1808, June 1995.		RFC 1808, June 1995.

	[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail		[RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
	Extensions (MIME) Part Two: Media Types", RFC 2046,		Extensions (MIME) Part Two: Media Types", RFC 2046,
	November 1996.		November 1996.

	[RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.		[RFC2141] Moats, R., "URN Syntax", RFC 2141, May 1997.


	[RFC2396] Berners-Lee, T., Fielding, R. and L. Masinter, "Uniform		[RFC2396] Berners-Lee, T., Fielding, R., and L. Masinter, "Uniform
	Resource Identifiers (URI): Generic Syntax", RFC 2396,		Resource Identifiers (URI): Generic Syntax", RFC 2396,
	August 1998.		August 1998.


	[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S. and D.		[RFC2518] Goland, Y., Whitehead, E., Faizi, A., Carter, S., and D.
	Jensen, "HTTP Extensions for Distributed Authoring --		Jensen, "HTTP Extensions for Distributed Authoring --
	WEBDAV", RFC 2518, February 1999.		WEBDAV", RFC 2518, February 1999.


	[RFC2557] Palme, F., Hopmann, A., Shelness, N. and E. Stefferud,		[RFC2557] Palme, J., Hopmann, A., and N. Shelness, "MIME
	"MIME Encapsulation of Aggregate Documents, such as HTML		Encapsulation of Aggregate Documents, such as HTML
	(MHTML)", RFC 2557, March 1999.		(MHTML)", RFC 2557, March 1999.


	[RFC2718] Masinter, L., Alvestrand, H., Zigmond, D. and R. Petke,		[RFC2718] Masinter, L., Alvestrand, H., Zigmond, D., and R. Petke,
	"Guidelines for new URL Schemes", RFC 2718, November 1999.		"Guidelines for new URL Schemes", RFC 2718, November 1999.


	[RFC2732] Hinden, R., Carpenter, B. and L. Masinter, "Format for		[RFC2732] Hinden, R., Carpenter, B., and L. Masinter, "Format for
	Literal IPv6 Addresses in URL's", RFC 2732, December 1999.		Literal IPv6 Addresses in URL's", RFC 2732, December 1999.


	[RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint W3C/		[RFC3305] Mealling, M. and R. Denenberg, "Report from the Joint
	IETF URI Planning Interest Group: Uniform Resource		W3C/IETF URI Planning Interest Group: Uniform Resource
	Identifiers (URIs), URLs, and Uniform Resource Names		Identifiers (URIs), URLs, and Uniform Resource Names
	(URNs): Clarifications and Recommendations", RFC 3305,		(URNs): Clarifications and Recommendations", RFC 3305,
	August 2002.		August 2002.


	[RFC3490] Faltstrom, P., Hoffman, P. and A. Costello,		[RFC3490] Faltstrom, P., Hoffman, P., and A. Costello,
	"Internationalizing Domain Names in Applications (IDNA)",		"Internationalizing Domain Names in Applications (IDNA)",
	RFC 3490, March 2003.		RFC 3490, March 2003.

	[RFC3513] Hinden, R. and S. Deering, "Internet Protocol Version 6		[RFC3513] Hinden, R. and S. Deering, "Internet Protocol Version 6
	(IPv6) Addressing Architecture", RFC 3513, April 2003.		(IPv6) Addressing Architecture", RFC 3513, April 2003.

	[Siedzik] Siedzik, R., "Semantic Attacks: What's in a URL?",		[Siedzik] Siedzik, R., "Semantic Attacks: What's in a URL?",
	April 2001, <http://www.giac.org/practical/gsec/		April 2001, <http://www.giac.org/practical/gsec/
	Richard_Siedzik_GSEC.pdf>.		Richard_Siedzik_GSEC.pdf>.


	Authors' Addresses

	Tim Berners-Lee
	World Wide Web Consortium
	Massachusetts Institute of Technology
	77 Massachusetts Avenue
	Cambridge, MA 02139
	USA

	Phone: +1-617-253-5702
	Fax: +1-617-258-5999
	EMail: [email protected]
	URI: http://www.w3.org/People/Berners-Lee/

	Roy T. Fielding
	Day Software
	5251 California Ave., Suite 110
	Irvine, CA 92617
	USA

	Phone: +1-949-679-2960
	Fax: +1-949-679-2972
	EMail: [email protected]
	URI: http://roy.gbiv.com/

	Larry Masinter
	Adobe Systems Incorporated
	345 Park Ave
	San Jose, CA 95110
	USA

	Phone: +1-408-536-3024
	EMail: [email protected]
	URI: http://larry.masinter.net/

	Appendix A. Collected ABNF for URI		Appendix A. Collected ABNF for URI


	URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]		URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]


	hier-part = "//" authority path-abempty		hier-part = "//" authority path-abempty
	/ path-absolute		/ path-absolute
	/ path-rootless		/ path-rootless
	/ path-empty		/ path-empty


	URI-reference = URI / relative-ref		URI-reference = URI / relative-ref


	absolute-URI = scheme ":" hier-part [ "?" query ]		absolute-URI = scheme ":" hier-part [ "?" query ]


	relative-ref = relative-part [ "?" query ] [ "#" fragment ]		relative-ref = relative-part [ "?" query ] [ "#" fragment ]


	relative-part = "//" authority path-abempty		relative-part = "//" authority path-abempty
	/ path-absolute		/ path-absolute
	/ path-noscheme		/ path-noscheme
	/ path-empty		/ path-empty


	scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )		scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )


	authority = [ userinfo "@" ] host [ ":" port ]		authority = [ userinfo "@" ] host [ ":" port ]
	userinfo = *( unreserved / pct-encoded / sub-delims / ":" )		userinfo = *( unreserved / pct-encoded / sub-delims / ":" )
	host = IP-literal / IPv4address / reg-name		host = IP-literal / IPv4address / reg-name
	port = *DIGIT		port = *DIGIT


	IP-literal = "[" ( IPv6address / IPvFuture ) "]"		IP-literal = "[" ( IPv6address / IPvFuture ) "]"


	IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )		IPvFuture = "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" )


	IPv6address = 6( h16 ":" ) ls32		IPv6address = 6( h16 ":" ) ls32
	/ "::" 5( h16 ":" ) ls32		/ "::" 5( h16 ":" ) ls32
	/ [ h16 ] "::" 4( h16 ":" ) ls32		/ [ h16 ] "::" 4( h16 ":" ) ls32
	/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32		/ [ *1( h16 ":" ) h16 ] "::" 3( h16 ":" ) ls32
	/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32		/ [ *2( h16 ":" ) h16 ] "::" 2( h16 ":" ) ls32
	/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32		/ [ *3( h16 ":" ) h16 ] "::" h16 ":" ls32
	/ [ *4( h16 ":" ) h16 ] "::" ls32		/ [ *4( h16 ":" ) h16 ] "::" ls32
	/ [ *5( h16 ":" ) h16 ] "::" h16		/ [ *5( h16 ":" ) h16 ] "::" h16
	/ [ *6( h16 ":" ) h16 ] "::"		/ [ *6( h16 ":" ) h16 ] "::"


	h16 = 1*4HEXDIG		h16 = 1*4HEXDIG
	ls32 = ( h16 ":" h16 ) / IPv4address		ls32 = ( h16 ":" h16 ) / IPv4address
	IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet		IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
			dec-octet = DIGIT ; 0-9
			/ %x31-39 DIGIT ; 10-99
			/ "1" 2DIGIT ; 100-199
			/ "2" %x30-34 DIGIT ; 200-249
			/ "25" %x30-35 ; 250-255


	dec-octet = DIGIT ; 0-9		reg-name = *( unreserved / pct-encoded / sub-delims )
	/ %x31-39 DIGIT ; 10-99
	/ "1" 2DIGIT ; 100-199
	/ "2" %x30-34 DIGIT ; 200-249
	/ "25" %x30-35 ; 250-255


	reg-name = *( unreserved / pct-encoded / sub-delims )		path = path-abempty ; begins with "/" or is empty
			/ path-absolute ; begins with "/" but not "//"
			/ path-noscheme ; begins with a non-colon segment
			/ path-rootless ; begins with a segment
			/ path-empty ; zero characters


	path = path-abempty ; begins with "/" or is empty		path-abempty = *( "/" segment )
	/ path-absolute ; begins with "/" but not "//"		path-absolute = "/" [ segment-nz *( "/" segment ) ]
	/ path-noscheme ; begins with a non-colon segment		path-noscheme = segment-nz-nc *( "/" segment )
	/ path-rootless ; begins with a segment		path-rootless = segment-nz *( "/" segment )
	/ path-empty ; zero characters		path-empty = 0<pchar>


	path-abempty = *( "/" segment )		segment = *pchar
	path-absolute = "/" [ segment-nz *( "/" segment ) ]		segment-nz = 1*pchar
	path-noscheme = segment-nz-nc *( "/" segment )		segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
	path-rootless = segment-nz *( "/" segment )		; non-zero-length segment without any colon ":"
	path-empty = 0<pchar>


	segment = *pchar		pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
	segment-nz = 1*pchar
	segment-nz-nc = 1*( unreserved / pct-encoded / sub-delims / "@" )
	; non-zero-length segment without any colon ":"


	pchar = unreserved / pct-encoded / sub-delims / ":" / "@"		query = *( pchar / "/" / "?" )


	query = *( pchar / "/" / "?" )		fragment = *( pchar / "/" / "?" )
	fragment = *( pchar / "/" / "?" )


	pct-encoded = "%" HEXDIG HEXDIG		pct-encoded = "%" HEXDIG HEXDIG


	unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"		unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
	reserved = gen-delims / sub-delims		reserved = gen-delims / sub-delims
	gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"		gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
	sub-delims = "!" / "$" / "&" / "'" / "(" / ")"		sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
	/ "*" / "+" / "," / ";" / "="		/ "*" / "+" / "," / ";" / "="

	Appendix B. Parsing a URI Reference with a Regular Expression		Appendix B. Parsing a URI Reference with a Regular Expression


	Since the "first-match-wins" algorithm is identical to the "greedy"		As the "first-match-wins" algorithm is identical to the "greedy"
	disambiguation method used by POSIX regular expressions, it is		disambiguation method used by POSIX regular expressions, it is
	natural and commonplace to use a regular expression for parsing the		natural and commonplace to use a regular expression for parsing the
	potential five components of a URI reference.		potential five components of a URI reference.

	The following line is the regular expression for breaking-down a		The following line is the regular expression for breaking-down a
	well-formed URI reference into its components.		well-formed URI reference into its components.

	^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?		^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
	 12            3  4          5       6  7        8 9		 12            3  4          5       6  7        8 9


	skipping to change at page 51, line 39 ¶		skipping to change at page 51, line 29 ¶
	$3 = //www.ics.uci.edu		$3 = //www.ics.uci.edu
	$4 = www.ics.uci.edu		$4 = www.ics.uci.edu
	$5 = /pub/ietf/uri/		$5 = /pub/ietf/uri/
	$6 = <undefined>		$6 = <undefined>
	$7 = <undefined>		$7 = <undefined>
	$8 = #Related		$8 = #Related
	$9 = Related		$9 = Related

	where <undefined> indicates that the component is not present, as is		where <undefined> indicates that the component is not present, as is
	the case for the query component in the above example. Therefore, we		the case for the query component in the above example. Therefore, we

	can determine the value of the four components and fragment as		can determine the value of the five components as

	scheme = $2		scheme = $2
	authority = $4		authority = $4
	path = $5		path = $5
	query = $7		query = $7
	fragment = $9		fragment = $9


	and, going in the opposite direction, we can recreate a URI reference		Going in the opposite direction, we can recreate a URI reference from
	from its components using the algorithm of Section 5.3.		its components by using the algorithm of Section 5.3.
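The component extraction above can be sketched in Python with the standard re module. This is an illustration under stated assumptions rather than a reference implementation: the function name split_uri_reference is hypothetical, the expression is written with its "*" quantifiers, and Python uses a backtracking matcher rather than a POSIX one, although for this expression the captured groups should be the same.

```python
import re

# The Appendix B expression; groups 2, 4, 5, 7 and 9 hold the components.
URI_REFERENCE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri_reference(reference):
    """Return the five components; None marks a component that is not present."""
    # Every part of the expression is optional, so a match is always found.
    match = URI_REFERENCE.match(reference)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }

parts = split_uri_reference("http://www.ics.uci.edu/pub/ietf/uri/#Related")
```

Here None plays the role of <undefined>: parts["query"] is None, while parts["path"] is always a string, possibly empty.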

	Appendix C. Delimiting a URI in Context		Appendix C. Delimiting a URI in Context

	URIs are often transmitted through formats that do not provide a		URIs are often transmitted through formats that do not provide a
	clear context for their interpretation. For example, there are many		clear context for their interpretation. For example, there are many
	occasions when a URI is included in plain text; examples include text		occasions when a URI is included in plain text; examples include text

	sent in electronic mail, USENET news messages, and, most importantly,		sent in email, USENET news, and on printed paper. In such cases, it
	printed on paper. In such cases, it is important to be able to		is important to be able to delimit the URI from the rest of the text,
	delimit the URI from the rest of the text, and in particular from		and in particular from punctuation marks that might be mistaken for
	punctuation marks that might be mistaken for part of the URI.		part of the URI.

	In practice, URIs are delimited in a variety of ways, but usually		In practice, URIs are delimited in a variety of ways, but usually
	within double-quotes "http://example.com/", angle brackets		within double-quotes "http://example.com/", angle brackets

	<http://example.com/>, or just using whitespace		<http://example.com/>, or just by using whitespace:

	http://example.com/		http://example.com/

	These wrappers do not form part of the URI.		These wrappers do not form part of the URI.

	In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may		In some cases, extra whitespace (spaces, line-breaks, tabs, etc.) may

	need to be added to break a long URI across lines. The whitespace		have to be added to break a long URI across lines. The whitespace
	should be ignored when extracting the URI.		should be ignored when the URI is extracted.

	No whitespace should be introduced after a hyphen ("-") character.		No whitespace should be introduced after a hyphen ("-") character.
	Because some typesetters and printers may (erroneously) introduce a		Because some typesetters and printers may (erroneously) introduce a

	hyphen at the end of line when breaking a line, the interpreter of a		hyphen at the end of line when breaking it, the interpreter of a URI
	URI containing a line break immediately after a hyphen should ignore		containing a line break immediately after a hyphen should ignore all
	all whitespace around the line break, and should be aware that the		whitespace around the line break and should be aware that the hyphen
	hyphen may or may not actually be part of the URI.		may or may not actually be part of the URI.

	Using <> angle brackets around each URI is especially recommended as		Using <> angle brackets around each URI is especially recommended as
	a delimiting style for a reference that contains embedded whitespace.		a delimiting style for a reference that contains embedded whitespace.

	The prefix "URL:" (with or without a trailing space) was formerly		The prefix "URL:" (with or without a trailing space) was formerly
	recommended as a way to help distinguish a URI from other bracketed		recommended as a way to help distinguish a URI from other bracketed
	designators, though it is not commonly used in practice and is no		designators, though it is not commonly used in practice and is no
	longer recommended.		longer recommended.

	For robustness, software that accepts user-typed URI should attempt		For robustness, software that accepts user-typed URI should attempt
	to recognize and strip both delimiters and embedded whitespace.		to recognize and strip both delimiters and embedded whitespace.


	For example, the text:		For example, the text

	Yes, Jim, I found it under "http://www.w3.org/Addressing/",		Yes, Jim, I found it under "http://www.w3.org/Addressing/",
	but you can probably pick it up from <ftp://foo.example.		but you can probably pick it up from <ftp://foo.example.
	com/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/		com/rfc/>. Note the warning in <http://www.ics.uci.edu/pub/
	ietf/uri/historical.html#WARNING>.		ietf/uri/historical.html#WARNING>.

	contains the URI references		contains the URI references

	http://www.w3.org/Addressing/		http://www.w3.org/Addressing/
	ftp://foo.example.com/rfc/		ftp://foo.example.com/rfc/
	http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING		http://www.ics.uci.edu/pub/ietf/uri/historical.html#WARNING
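A minimal sketch of such an extraction, in Python, is given below. It assumes that every span wrapped in double-quotes or angle brackets is a URI reference, which real text will not always satisfy, and it does not attempt the hyphen-at-line-break heuristic. The names DELIMITED and extract_delimited_uris are hypothetical.

```python
import re

# Text wrapped in double-quotes, or in <> angle brackets (line breaks allowed).
DELIMITED = re.compile(r'"([^"]*)"|<([^<>]*)>')

def extract_delimited_uris(text):
    """Return the delimited references in order, with embedded whitespace removed."""
    found = []
    for match in DELIMITED.finditer(text):
        candidate = match.group(1) if match.group(1) is not None else match.group(2)
        # Whitespace inside the delimiters is ignored when extracting.
        candidate = re.sub(r"\s+", "", candidate)
        # Tolerate the formerly recommended "URL:" prefix.
        if candidate.startswith("URL:"):
            candidate = candidate[len("URL:"):]
        found.append(candidate)
    return found

sample = (
    'Yes, Jim, I found it under "http://www.w3.org/Addressing/",\n'
    "but you can probably pick it up from <ftp://foo.example.\n"
    "com/rfc/>.  Note the warning in <http://www.ics.uci.edu/pub/\n"
    "ietf/uri/historical.html#WARNING>.\n"
)
references = extract_delimited_uris(sample)
```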

	Appendix D. Changes from RFC 2396		Appendix D. Changes from RFC 2396


	D.1 Additions		D.1. Additions

	An ABNF rule for URI has been introduced to correspond to one common		An ABNF rule for URI has been introduced to correspond to one common
	usage of the term: an absolute URI with optional fragment.		usage of the term: an absolute URI with optional fragment.

	IPv6 (and later) literals have been added to the list of possible		IPv6 (and later) literals have been added to the list of possible
	identifiers for the host portion of an authority component, as		identifiers for the host portion of an authority component, as
	described by [RFC2732], with the addition of "[" and "]" to the		described by [RFC2732], with the addition of "[" and "]" to the
	reserved set and a version flag to anticipate future versions of IP		reserved set and a version flag to anticipate future versions of IP
	literals. Square brackets are now specified as reserved within the		literals. Square brackets are now specified as reserved within the

	authority component and not allowed outside their use as delimiters		authority component and are not allowed outside their use as
	for an IP literal within host. In order to make this change without		delimiters for an IP literal within host. In order to make this
	changing the technical definition of the path, query, and fragment		change without changing the technical definition of the path, query,
	components, those rules were redefined to directly specify the		and fragment components, those rules were redefined to directly
	characters allowed.		specify the characters allowed.


	Since [RFC2732] defers to [RFC3513] for definition of an IPv6 literal		As [RFC2732] defers to [RFC3513] for definition of an IPv6 literal
	address, which unfortunately lacks an ABNF description of		address, which, unfortunately, lacks an ABNF description of
	IPv6address, we created a new ABNF rule for IPv6address that matches		IPv6address, we created a new ABNF rule for IPv6address that matches
	the text representations defined by Section 2.2 of [RFC3513].		the text representations defined by Section 2.2 of [RFC3513].
	Likewise, the definition of IPv4address has been improved in order to		Likewise, the definition of IPv4address has been improved in order to
	limit each decimal octet to the range 0-255.		limit each decimal octet to the range 0-255.
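The effect of the dec-octet rule can be illustrated by translating it into a Python regular expression. This is a sketch for illustration; the names DEC_OCTET, IPV4_ADDRESS, and is_ipv4address are assumptions, not part of the specification.

```python
import re

# The five dec-octet alternatives, listed from 250-255 down to 0-9.
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9][0-9]|[0-9])"

# IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet
IPV4_ADDRESS = re.compile(r"(?:%s\.){3}%s" % (DEC_OCTET, DEC_OCTET))

def is_ipv4address(text):
    """True when the whole string matches the IPv4address rule."""
    return IPV4_ADDRESS.fullmatch(text) is not None
```

Under this translation "192.0.2.16" is accepted, while "256.0.0.1" and "01.2.3.4" are rejected, because no alternative allows an octet above 255 or one with a leading zero.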


	Section 6 (Section 6) on URI normalization and comparison has been		Section 6, on URI normalization and comparison, has been completely
	completely rewritten and extended using input from Tim Bray and		rewritten and extended by using input from Tim Bray and discussion
	discussion within the W3C Technical Architecture Group.		within the W3C Technical Architecture Group.


	D.2 Modifications		D.2. Modifications

	The ad-hoc BNF syntax of RFC 2396 has been replaced with the ABNF of		The ad-hoc BNF syntax of RFC 2396 has been replaced with the ABNF of
	[RFC2234]. This change required all rule names that formerly		[RFC2234]. This change required all rule names that formerly
	included underscore characters to be renamed with a dash instead. In		included underscore characters to be renamed with a dash instead. In
	addition, a number of syntax rules have been eliminated or simplified		addition, a number of syntax rules have been eliminated or simplified
	to make the overall grammar more comprehensible. Specifications that		to make the overall grammar more comprehensible. Specifications that
	refer to the obsolete grammar rules may be understood by replacing		refer to the obsolete grammar rules may be understood by replacing
	those rules according to the following table:		those rules according to the following table:

	+----------------+--------------------------------------------------+		+----------------+--------------------------------------------------+
	| obsolete rule  | translation                                      |		| obsolete rule  | translation                                      |
	+----------------+--------------------------------------------------+		+----------------+--------------------------------------------------+
	| absoluteURI    | absolute-URI                                     |		| absoluteURI    | absolute-URI                                     |
	| relativeURI    | relative-part [ "?" query ]                      |		| relativeURI    | relative-part [ "?" query ]                      |
	| hier_part      | ( "//" authority path-abempty /                  |		| hier_part      | ( "//" authority path-abempty /                  |

	|                | path-absolute ) [ "?" query ]                    |		|                | path-absolute ) [ "?" query ]                    |
	|                |                                                  |		|                |                                                  |
	| opaque_part    | path-rootless [ "?" query ]                      |		| opaque_part    | path-rootless [ "?" query ]                      |
	| net_path       | "//" authority path-abempty                      |		| net_path       | "//" authority path-abempty                      |
	| abs_path       | path-absolute                                    |		| abs_path       | path-absolute                                    |
	| rel_path       | path-rootless                                    |		| rel_path       | path-rootless                                    |
	| rel_segment    | segment-nz-nc                                    |		| rel_segment    | segment-nz-nc                                    |
	| reg_name       | reg-name                                         |		| reg_name       | reg-name                                         |
	| server         | authority                                        |		| server         | authority                                        |
	| hostport       | host [ ":" port ]                                |		| hostport       | host [ ":" port ]                                |
	| hostname       | reg-name                                         |		| hostname       | reg-name                                         |

	skipping to change at page 55, line 5 ¶		skipping to change at page 54, line 42 ¶
	|                | / "(" / ")"                                      |		|                | / "(" / ")"                                      |
	|                |                                                  |		|                |                                                  |
	| escaped        | pct-encoded                                      |		| escaped        | pct-encoded                                      |
	| hex            | HEXDIG                                           |		| hex            | HEXDIG                                           |
	| alphanum       | ALPHA / DIGIT                                    |		| alphanum       | ALPHA / DIGIT                                    |
	+----------------+--------------------------------------------------+		+----------------+--------------------------------------------------+

	Use of the above obsolete rules for the definition of scheme-specific		Use of the above obsolete rules for the definition of scheme-specific
	syntax is deprecated.		syntax is deprecated.


	Section 2 on characters has been rewritten to explain what characters		Section 2, on characters, has been rewritten to explain what
	are reserved, when they are reserved, and why they are reserved even		characters are reserved, when they are reserved, and why they are
	when not used as delimiters by the generic syntax. The mark		reserved, even when they are not used as delimiters by the generic
	characters that are typically unsafe to decode, including the		syntax. The mark characters that are typically unsafe to decode,
	exclamation mark ("!"), asterisk ("*"), single-quote ("'"), and open		including the exclamation mark ("!"), asterisk ("*"), single-quote
	and close parentheses ("(" and ")"), have been moved to the reserved		("'"), and open and close parentheses ("(" and ")"), have been moved
	set in order to clarify the distinction between reserved and		to the reserved set in order to clarify the distinction between
	unreserved and hopefully answer the most common question of scheme		reserved and unreserved and, hopefully, to answer the most common
	designers. Likewise, the section on percent-encoded characters has		question of scheme designers. Likewise, the section on
	been rewritten, and URI normalizers are now given license to decode		percent-encoded characters has been rewritten, and URI normalizers
	any percent-encoded octets corresponding to unreserved characters.		are now given license to decode any percent-encoded octets
	In general, the terms "escaped" and "unescaped" have been replaced		corresponding to unreserved characters. In general, the terms
	with "percent-encoded" and "decoded", respectively, to reduce		"escaped" and "unescaped" have been replaced with "percent-encoded"
	confusion with other forms of escape mechanisms.		and "decoded", respectively, to reduce confusion with other forms of
			escape mechanisms.
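The license given to normalizers can be sketched as follows: only percent-encoded octets that correspond to unreserved characters are decoded, and every other percent-encoding is left untouched. The names UNRESERVED, PCT_ENCODED, and decode_unreserved are assumptions made for this illustration.

```python
import re

# unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)

# pct-encoded = "%" HEXDIG HEXDIG
PCT_ENCODED = re.compile(r"%([0-9A-Fa-f]{2})")

def decode_unreserved(uri):
    """Decode only the percent-encoded octets that stand for unreserved characters."""
    def replace(match):
        char = chr(int(match.group(1), 16))
        # Reserved and other octets keep their percent-encoded form.
        return char if char in UNRESERVED else match.group(0)
    return PCT_ENCODED.sub(replace, uri)

normalized = decode_unreserved("http://example.com/%7Euser/%41%2Fb")
```

In this example "%7E" and "%41" become "~" and "A", while "%2F" stays encoded because "/" is reserved. Each octet is examined once, so "%2541" is not decoded a second time.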

	The ABNF for URI and URI-reference has been redesigned to make them		The ABNF for URI and URI-reference has been redesigned to make them

	more friendly to LALR parsers and reduce complexity. As a result,		more friendly to LALR parsers and to reduce complexity. As a result,
	the layout form of syntax description has been removed, along with		the layout form of syntax description has been removed, along with
	the uric, uric_no_slash, opaque_part, net_path, abs_path, rel_path,		the uric, uric_no_slash, opaque_part, net_path, abs_path, rel_path,
	path_segments, rel_segment, and mark rules. All references to		path_segments, rel_segment, and mark rules. All references to
	"opaque" URIs have been replaced with a better description of how the		"opaque" URIs have been replaced with a better description of how the
	path component may be opaque to hierarchy. The relativeURI rule has		path component may be opaque to hierarchy. The relativeURI rule has
	been replaced with relative-ref to avoid unnecessary confusion over		been replaced with relative-ref to avoid unnecessary confusion over

	whether or not they are a subset of URI. The ambiguity regarding the		whether they are a subset of URI. The ambiguity regarding the
	parsing of URI-reference as a URI or a relative-ref with a colon in		parsing of URI-reference as a URI or a relative-ref with a colon in
	the first segment has been eliminated through the use of five		the first segment has been eliminated through the use of five
	separate path matching rules.		separate path matching rules.

	The fragment identifier has been moved back into the section on		The fragment identifier has been moved back into the section on
	generic syntax components and within the URI and relative-ref rules,		generic syntax components and within the URI and relative-ref rules,
	though it remains excluded from absolute-URI. The number sign ("#")		though it remains excluded from absolute-URI. The number sign ("#")
	character has been moved back to the reserved set as a result of		character has been moved back to the reserved set as a result of
	reintegrating the fragment syntax.		reintegrating the fragment syntax.

	The ABNF has been corrected to allow the path component to be empty.		The ABNF has been corrected to allow the path component to be empty.
	This also allows an absolute-URI to consist of nothing after the		This also allows an absolute-URI to consist of nothing after the
	"scheme:", as is present in practice with the "dav:" namespace		"scheme:", as is present in practice with the "dav:" namespace

	[RFC2518] and the "about:" scheme used internally by many WWW browser		[RFC2518] and with the "about:" scheme used internally by many WWW
	implementations. The ambiguity regarding the boundary between		browser implementations. The ambiguity regarding the boundary
	authority and path has been eliminated through the use of five		between authority and path has been eliminated through the use of
	separate path matching rules.		five separate path matching rules.

	Registry-based naming authorities that use the generic syntax are now		Registry-based naming authorities that use the generic syntax are now
	defined within the host rule. This change allows current		defined within the host rule. This change allows current
	implementations, where whatever name provided is simply fed to the		implementations, where whatever name provided is simply fed to the
	local name resolution mechanism, to be consistent with the		local name resolution mechanism, to be consistent with the

	specification and removes the need to re-specify DNS name formats		specification. It also removes the need to re-specify DNS name
	here. It also allows the host component to contain percent-encoded		formats here. Furthermore, it allows the host component to contain
	octets, which is necessary to enable internationalized domain names		percent-encoded octets, which is necessary to enable
	to be provided in URIs, processed in their native character encodings		internationalized domain names to be provided in URIs, processed in
	at the application layers above URI processing, and passed to an IDNA		their native character encodings at the application layers above URI
	library as a registered name in the UTF-8 character encoding. The		processing, and passed to an IDNA library as a registered name in the
	server, hostport, hostname, domainlabel, toplabel, and alphanum rules		UTF-8 character encoding. The server, hostport, hostname,
	have been removed.		domainlabel, toplabel, and alphanum rules have been removed.
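The processing chain described above (percent-encoded octets in the host, decoded as UTF-8 by the application, then handed to an IDNA library) can be sketched as follows. This is only an illustration under stated assumptions: the function name reg_name_to_ascii and the sample host are hypothetical, and Python's built-in "idna" codec, which implements the older IDNA 2003 rules, stands in for whatever IDNA library an application would actually use.

```python
from urllib.parse import unquote


def reg_name_to_ascii(host: str) -> bytes:
    """Hypothetical helper: turn a percent-encoded reg-name into an ASCII name.

    Step 1: decode the percent-encoded octets as UTF-8.
    Step 2: pass the resulting Unicode name to an IDNA implementation
            (here, Python's built-in "idna" codec, an assumption).
    """
    unicode_name = unquote(host, encoding="utf-8", errors="strict")
    return unicode_name.encode("idna")


if __name__ == "__main__":
    # "b%C3%BCcher.example" is an invented host used only for illustration.
    print(reg_name_to_ascii("b%C3%BCcher.example"))
```

A host that contains no percent-encoded octets passes through unchanged, so the same code path can serve both plain DNS names and internationalized ones.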

	The resolving relative references algorithm of [RFC2396] has been		The resolving relative references algorithm of [RFC2396] has been
	rewritten using pseudocode for this revision to improve clarity and		rewritten with pseudocode for this revision to improve clarity and
	fix the following issues:		fix the following issues:

	o [RFC2396] section 5.2, step 6a, failed to account for a base URI		o [RFC2396] section 5.2, step 6a, failed to account for a base URI
	with no path.		with no path.

	o Restored the behavior of [RFC1808] where, if the reference		o Restored the behavior of [RFC1808] where, if the reference
	contains an empty path and a defined query component, then the		contains an empty path and a defined query component, the target
	target URI inherits the base URI's path component.		URI inherits the base URI's path component.

	o The determination of whether a URI reference is a same-document		o The determination of whether a URI reference is a same-document
	reference has been decoupled from the URI parser, simplifying the		reference has been decoupled from the URI parser, simplifying the
	URI processing interface within applications in a way consistent		URI processing interface within applications in a way consistent
	with the internal architecture of deployed URI processing		with the internal architecture of deployed URI processing
	implementations. The determination is now based on comparison to		implementations. The determination is now based on comparison to
	the base URI after transforming a reference to absolute form,		the base URI after transforming a reference to absolute form,
	rather than on the format of the reference itself. This change		rather than on the format of the reference itself. This change
	may result in more references being considered "same-document"		may result in more references being considered "same-document"
	under this specification than would be under the rules given in		under this specification than there would be under the rules given
	RFC 2396, especially when normalization is used to reduce aliases.		in RFC 2396, especially when normalization is used to reduce
	However, it does not change the status of existing same-document		aliases. However, it does not change the status of existing
	references.		same-document references.

	o Separated the path merge routine into two routines: merge, for		o Separated the path merge routine into two routines: merge, for
	describing combination of the base URI path with a relative-path		describing combination of the base URI path with a relative-path
	reference, and remove_dot_segments, for describing how to remove		reference, and remove_dot_segments, for describing how to remove
	the special "." and ".." segments from a composed path. The		the special "." and ".." segments from a composed path. The
	remove_dot_segments algorithm is now applied to all URI reference		remove_dot_segments algorithm is now applied to all URI reference
	paths in order to match common implementations and improve the		paths in order to match common implementations and to improve the
	normalization of URIs in practice. This change only impacts the		normalization of URIs in practice. This change only impacts the
	parsing of abnormal references and same-scheme references wherein		parsing of abnormal references and same-scheme references wherein
	the base URI has a non-hierarchical path.		the base URI has a non-hierarchical path.
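The path handling described in the list above can be sketched in Python as follows. The merge and remove_dot_segments routines follow the descriptions in Sections 5.2.3 and 5.2.4 of RFC 3986; resolve_path is a hypothetical wrapper that covers only the case where the reference has neither a scheme nor an authority, and it is an illustration rather than a complete resolver.

```python
def merge(base_has_authority: bool, base_path: str, ref_path: str) -> str:
    """Combine a base path with a relative-path reference."""
    if base_has_authority and base_path == "":
        # A base URI with an authority and no path is treated as "/".
        return "/" + ref_path
    # Keep everything up to and including the last "/" of the base path.
    return base_path[: base_path.rfind("/") + 1] + ref_path


def remove_dot_segments(path: str) -> str:
    """Remove the special "." and ".." segments from a path."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = "/" + path[3:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../") or path == "/..":
            path = "/" + path[4:]
            if output:
                output.pop()  # drop the last segment already emitted
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/", if any) to output.
            end = path.find("/", 1)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def resolve_path(base_has_authority: bool, base_path: str, ref_path: str) -> str:
    """Hypothetical wrapper: target path for a reference with no scheme or authority."""
    if ref_path == "":
        return base_path  # empty reference path inherits the base path
    if ref_path.startswith("/"):
        return remove_dot_segments(ref_path)
    return remove_dot_segments(merge(base_has_authority, base_path, ref_path))


if __name__ == "__main__":
    print(resolve_path(True, "", "g"))                 # base URI with no path
    print(resolve_path(True, "/b/c/d;p", ""))          # e.g. a "?y" reference
    print(resolve_path(True, "/b/c/d;p", "../../../g"))  # abnormal reference
```

Keeping merge and remove_dot_segments as separate functions mirrors the split described above and lets remove_dot_segments be applied to absolute reference paths as well, without going through merge.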


	Appendix E. Instructions to RFC Editor

	Prior to publication as an RFC, please remove this section and the
	"Editorial Note" that appears after the Abstract. If [BCP35] or any
	of the normative references are updated prior to publication, the
	associated reference in this document can be safely updated as well.
	This document has been produced using the xml2rfc tool set; the XML
	version can be obtained via the URI listed in the editorial note.

	Index		Index


	A		A
	ABNF 11		ABNF 11
	absolute 26		absolute 27
	absolute-path 26		absolute-path 26
	absolute-URI 26		absolute-URI 27
	access 9		access 9
	authority 16, 17		authority 17, 18


	B		B
	base URI 28		base URI 28


	C		C
	character encoding 4		character encoding 4
	character 4		character 4
	characters 11		characters 8, 11
	coded character set 4		coded character set 4


	D		D
	dec-octet 20		dec-octet 20
	dereference 9		dereference 9
	dot-segments 22		dot-segments 23


	F		F
	fragment 16, 24		fragment 16, 24


	G		G
	gen-delims 12		gen-delims 13
	generic syntax 6		generic syntax 6


	H		H
	h16 19		h16 20
	hier-part 16		hier-part 16
	hierarchical 10		hierarchical 10
	host 18		host 18


	I		I
	identifier 5		identifier 5
	IP-literal 19		IP-literal 19
	IPv4 20		IPv4 20
	IPv4address 20		IPv4address 19, 20
	IPv6 19		IPv6 19
	IPv6address 19, 20		IPv6address 19, 20
	IPvFuture 19		IPvFuture 19


	L		L
	locator 7		locator 7
	ls32 19		ls32 20


	M		M
	merge 32		merge 32


	N		N
	name 7		name 7
	network-path 26		network-path 26


	P		P
	path 16, 22		path 16, 22, 26
	path-abempty 22		path-abempty 22
	path-absolute 22		path-absolute 22
	path-empty 22		path-empty 22
	path-noscheme 22		path-noscheme 22
	path-rootless 22		path-rootless 22
	path-abempty 16		path-abempty 16, 22, 26
	path-absolute 16		path-absolute 16, 22, 26
	path-empty 16		path-empty 16, 22, 26
	path-rootless 16		path-rootless 16, 22
	pchar 22		pchar 23
	pct-encoded 12		pct-encoded 12
	percent-encoding 12		percent-encoding 12
	port 21		port 22


	Q		Q
	query 16, 23		query 16, 23


	R		R
	reg-name 20		reg-name 21
	registered name 20		registered name 20
	relative 10, 28		relative 10, 28
	relative-path 26		relative-path 26
	relative-ref 26		relative-ref 26
	remove_dot_segments 32		remove_dot_segments 33
	representation 9		representation 9
	reserved 12		reserved 12
	resolution 9, 28		resolution 9, 28
	resource 5		resource 5
	retrieval 9		retrieval 9


	S		S
	same-document 27		same-document 27
	sameness 9		sameness 9
	scheme 16, 16		scheme 16, 17
	segment 22		segment 22, 23
	segment-nz 22		segment-nz 23
	segment-nz-nc 22		segment-nz-nc 23
	sub-delims 12		sub-delims 13
	suffix 27		suffix 27


	T		T
	transcription 7		transcription 8


	U		U
	uniform 4		uniform 4
	unreserved 13
	URI grammar
	absolute-URI 26
	ALPHA 11
	authority 16, 17
	CR 11
	dec-octet 20
	DIGIT 11
	DQUOTE 11
	fragment 16, 24, 26
	gen-delims 12
	h16 19
	HEXDIG 11
	hier-part 16
	host 17, 18
	IP-literal 19
	IPv4address 20
	IPv6address 19, 20
	IPvFuture 19
	LF 11
	ls32 19
	mark 13
	OCTET 11
	path 22
	path-abempty 16, 22
	path-absolute 16, 22
	path-empty 16, 22
	path-noscheme 22
	path-rootless 16, 22
	pchar 22, 23, 24
	pct-encoded 12
	port 17, 21
	query 16, 23, 26, 26
	reg-name 20
	relative-ref 25, 26
	reserved 12
	scheme 16, 16, 26
	segment 22
	segment-nz 22
	segment-nz-nc 22
	SP 11
	sub-delims 12
	unreserved 13		unreserved 13

	URI 16, 25		URI grammar
			absolute-URI 27
			ALPHA 11
			authority 18
			CR 11
			dec-octet 20
			DIGIT 11
			DQUOTE 11
			fragment 24
			gen-delims 13
			h16 20
			HEXDIG 11
			hier-part 16
			host 19
			IP-literal 19
			IPv4address 20
			IPv6address 20
			IPvFuture 19
			LF 11
			ls32 20
			OCTET 11
			path 22
			path-abempty 22
			path-absolute 22
			path-empty 22
			path-noscheme 22
			path-rootless 22
			pchar 23
			pct-encoded 12
			port 22
			query 24
			reg-name 21
			relative-ref 26
			reserved 13
			scheme 17
			segment 23
			segment-nz 23
			segment-nz-nc 23
			SP 11
			sub-delims 13
			unreserved 13
			URI 16
			URI-reference 25
			userinfo 18
			URI 16
	URI-reference 25		URI-reference 25

	userinfo 17, 18		URL 7
	URI 16		URN 7
	URI-reference 25		userinfo 18
	URL 7
	URN 7
	userinfo 17, 18


	Intellectual Property Statement		Authors' Addresses

			Tim Berners-Lee
			World Wide Web Consortium
			Massachusetts Institute of Technology
			77 Massachusetts Avenue
			Cambridge, MA 02139
			USA

			Phone: +1-617-253-5702
			Fax: +1-617-258-5999
			EMail: [email protected]
			URI: http://www.w3.org/People/Berners-Lee/

			Roy T. Fielding
			Day Software
			5251 California Ave., Suite 110
			Irvine, CA 92617
			USA

			Phone: +1-949-679-2960
			Fax: +1-949-679-2972
			EMail: [email protected]
			URI: http://roy.gbiv.com/

			Larry Masinter
			Adobe Systems Incorporated
			345 Park Ave
			San Jose, CA 95110
			USA

			Phone: +1-408-536-3024
			EMail: [email protected]
			URI: http://larry.masinter.net/

			Full Copyright Statement

			Copyright (C) The Internet Society (2005).

			This document is subject to the rights, licenses and restrictions
			contained in BCP 78, and except as set forth therein, the authors
			retain all their rights.

			This document and the information contained herein are provided on an
			"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
			OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
			ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
			INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
			INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
			WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

			Intellectual Property

	The IETF takes no position regarding the validity or scope of any		The IETF takes no position regarding the validity or scope of any
	Intellectual Property Rights or other rights that might be claimed to		Intellectual Property Rights or other rights that might be claimed to
	pertain to the implementation or use of the technology described in		pertain to the implementation or use of the technology described in
	this document or the extent to which any license under such rights		this document or the extent to which any license under such rights
	might or might not be available; nor does it represent that it has		might or might not be available; nor does it represent that it has
	made any independent effort to identify any such rights. Information		made any independent effort to identify any such rights. Information
	on the procedures with respect to rights in RFC documents can be		on the IETF's procedures with respect to rights in IETF Documents can
	found in BCP 78 and BCP 79.		be found in BCP 78 and BCP 79.

	Copies of IPR disclosures made to the IETF Secretariat and any		Copies of IPR disclosures made to the IETF Secretariat and any
	assurances of licenses to be made available, or the result of an		assurances of licenses to be made available, or the result of an
	attempt made to obtain a general license or permission for the use of		attempt made to obtain a general license or permission for the use of
	such proprietary rights by implementers or users of this		such proprietary rights by implementers or users of this
	specification can be obtained from the IETF on-line IPR repository at		specification can be obtained from the IETF on-line IPR repository at
	http://www.ietf.org/ipr.		http://www.ietf.org/ipr.

	The IETF invites any interested party to bring to its attention any		The IETF invites any interested party to bring to its attention any
	copyrights, patents or patent applications, or other proprietary		copyrights, patents or patent applications, or other proprietary
	rights that may cover technology that may be required to implement		rights that may cover technology that may be required to implement
	this standard. Please address the information to the IETF at		this standard. Please address the information to the IETF at ietf-
	[email protected].		[email protected].

	Disclaimer of Validity

	This document and the information contained herein are provided on an
	"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
	OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
	ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
	INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
	INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
	WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

	Copyright Statement

	Copyright (C) The Internet Society (2004). This document is subject
	to the rights, licenses and restrictions contained in BCP 78, and
	except as set forth therein, the authors retain all their rights.


	Acknowledgment		Acknowledgement

	Funding for the RFC Editor function is currently provided by the		Funding for the RFC Editor function is currently provided by the
	Internet Society.		Internet Society.

End of changes. 326 change blocks.
	1075 lines changed or deleted		1041 lines changed or added
This html diff was produced by rfcdiff 1.46. The latest version is available from http://tools.ietf.org/tools/rfcdiff/