This library deals with the analysis and construction of URLs (Uniform Resource Locators). URLs are the basis for communicating the locations of resources (data) on the web. A URL consists of a protocol identifier (e.g. HTTP, FTP) and a protocol-specific syntax that further defines the location. URLs are standardized in RFC-1738.
The implementation in this library covers only a small portion of the defined protocols. Though the initial implementation followed RFC-1738 strictly, the current implementation is more relaxed in order to deal with the frequent violations of the standard encountered in practical use.
This library contains code by Jan Wielemaker, who wrote the initial version, and Lukas Faulstich, who added various extensions.
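The parts list handled by parse_url/2 consists of components of the form Name(Value). Defined components include:

protocol(Protocol)
The protocol used. This is, after the optional url: prefix, an identifier separated from the remainder of the URL using :. parse_url/2 assumes the http protocol if no protocol is specified and the URL can be parsed as a valid HTTP URL. In addition to the protocols specified in RFC-1738, the file: protocol is supported as well.

path(Path)
The path specification for the ftp, http and file protocols. If no path appears, the library generates the path /.

search(ListOfNameValue)
The search specification of an HTTP URL. This is the part after the ?, normally used to transfer data from HTML forms that use the `GET' method. In the URL it consists of a www-form-encoded list of Name=Value pairs. This is mapped to a list of Prolog Name=Value terms with decoded names and values.

fragment(Fragment)
The fragment specification of an HTTP URL. This is the part after the # character.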
The example below illustrates all this for an HTTP URL.
?- parse_url('http://swi.psy.uva.nl/message.cgi?msg=Hello+World%21#x', P).

P = [ protocol(http),
      host('swi.psy.uva.nl'),
      fragment(x),
      search([ msg = 'Hello World!' ]),
      path('/message.cgi')
    ].
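Since the result is an ordinary Prolog list, individual components can be picked out with memberchk/2 regardless of their order. The following is a minimal sketch; the helper name url_search_value/3 is hypothetical and not part of the library.

```prolog
:- use_module(library(url)).

% url_search_value(+URL, +Name, -Value)
% Hypothetical helper: parse URL and look up the decoded value of
% the search parameter Name. Fails if the URL has no search part
% or no parameter with that name.
url_search_value(URL, Name, Value) :-
    parse_url(URL, Parts),
    memberchk(search(Pairs), Parts),
    memberchk(Name=Value, Pairs).
```

For the URL used in the example, the query url_search_value('http://swi.psy.uva.nl/message.cgi?msg=Hello+World%21#x', msg, V) should bind V to 'Hello World!'.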
By instantiating the parts-list this predicate can be used to create a URL.
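As a sketch of this reverse mode, the helper below calls parse_url/2 with an unbound URL and an instantiated parts list. The name build_http_url/4 and the choice of components are assumptions made for illustration only.

```prolog
:- use_module(library(url)).

% build_http_url(+Host, +Path, +Pairs, -URL)
% Hypothetical helper: assemble an HTTP URL from a host, a path and
% a list of Name=Value search pairs. The URL argument of
% parse_url/2 is left unbound so that the predicate constructs it.
build_http_url(Host, Path, Pairs, URL) :-
    parse_url(URL, [ protocol(http),
                     host(Host),
                     path(Path),
                     search(Pairs)
                   ]).
```

A query such as build_http_url('swi.psy.uva.nl', '/message.cgi', [msg='Hello World'], URL) should yield an atom that parse_url/2 can analyse back into the same host, path and search components.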
Action Location [HTTP/HttpVersion]
Location is either an atom or a code-list.
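The sketch below splits such a first line into its fields and analyses the location. It assumes that this passage belongs to the library's http_location/2 predicate and that the fields are separated by single spaces; the helper name request_line_parts/3 is hypothetical.

```prolog
:- use_module(library(url)).

% request_line_parts(+Line, -Action, -Parts)
% Hypothetical helper: split a line of the form
% "Action Location [HTTP/HttpVersion]" on spaces, then analyse the
% location into path, search and fragment components.
request_line_parts(Line, Action, Parts) :-
    split_string(Line, " ", "", [ActionS, LocationS | _]),
    atom_string(Action, ActionS),
    atom_string(Location, LocationS),   % http_location/2 accepts an atom
    http_location(Parts, Location).
```

For the line "GET /message.cgi?msg=Hello+World%21 HTTP/1.1", Action should be 'GET' and Parts should contain path('/message.cgi') together with the decoded search list.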
Encoding implies mapping spaces to +, preserving alphanumeric characters, mapping newlines to %0D%0A and mapping anything else to %XX. When decoding, newlines appear as a single newline (10) character.
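The two directions can be sketched as follows, assuming this passage describes the library's www_form_encode/2 predicate, which encodes when its first argument is bound and decodes otherwise. The wrapper names encode_value/2 and decode_value/2 are hypothetical.

```prolog
:- use_module(library(url)).

% encode_value(+Plain, -Encoded)
% Hypothetical wrapper: www-form-encode the atom Plain.
encode_value(Plain, Encoded) :-
    www_form_encode(Plain, Encoded).

% decode_value(+Encoded, -Plain)
% Hypothetical wrapper: run the same predicate in the decoding
% direction by leaving the first argument unbound.
decode_value(Encoded, Plain) :-
    www_form_encode(Plain, Encoded).
```

For example, decode_value('Hello+World%21', X) should give X = 'Hello World!', matching the search value in the parse_url/2 example, and encode_value('Hello World', X) should give X = 'Hello+World'.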