Document type:        Active Internet-Draft (individual in art area)
Authors:              Martijn Koster, Gary Illyes, Henner Zeller,
                      Lizzi Harvey
Last updated:         2020-12-13 (latest revision 2020-12-08)
Replaces:             draft-rep-wg-topic
Stream:               IETF
Intended RFC status:  Informational
Stream WG state:      Submitted to IESG for Publication
Document shepherd:    Ted Hardie
Shepherd write-up:    last changed 2020-12-08
IESG state:           AD Evaluation::Revised I-D Needed
Consensus:            Boilerplate Unknown
Responsible AD:       Murray Kucherawy
Send notices to:      Ted Hardie


Network Working Group                                          M. Koster
Internet-Draft                                Stalworthy Computing, Ltd.
Intended status: Informational                                 G. Illyes
Expires: June 5, 2021                                          H. Zeller
                                                               L. Harvey
                                                                  Google
                                                       December 08, 2020


                       Robots Exclusion Protocol
                          draft-koster-rep-04

Abstract

   This document standardizes and extends the "Robots Exclusion
   Protocol" method originally defined by Martijn Koster in 1996 for
   service owners to control how content served by their services may
   be accessed, if at all, by automatic clients known as crawlers.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This document may not be modified, and derivative works of it may
   not be created, except to format it for publication as an RFC or to
   translate it into languages other than English.

   This Internet-Draft will expire on June 5, 2021.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Terminology
   2.  Specification
     2.1.  Protocol definition
     2.2.  Formal syntax
       2.2.1.  The user-agent line
       2.2.2.  The Allow and Disallow lines
       2.2.3.  Special characters
       2.2.4.  Other records
     2.3.  Access method
       2.3.1.  Access results
     2.4.  Caching
     2.5.  Limits
     2.6.  Security Considerations
     2.7.  IANA Considerations
   3.  Examples
     3.1.  Simple example
     3.2.  Longest Match
   4.  References
     4.1.  Normative References
     4.2.  URIs
   Author's Address

1.  Introduction

   This document applies to services that provide resources that
   clients can access through URIs as defined in RFC3986 [1].  For
   example, in the context of HTTP, a browser is a client that displays
   the content of a web page.

   Crawlers are automated clients.  Search engines for instance have
   crawlers to recursively traverse links for indexing as defined in
   RFC8288 [2].

   It may be inconvenient for service owners if crawlers visit the
   entirety of their URI space.  This document specifies the rules that
   crawlers MUST obey when accessing URIs.

   These rules are not a form of access authorization.

1.1.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",