Client
HTTP.jl provides a wide range of HTTP client functionality, mostly exposed through the HTTP.request
family of functions. This document aims to walk through the various ways requests can be configured to accomplish whatever your goal may be.
Basic Usage
The standard form for making requests is:
HTTP.request(method, url [, headers [, body]]; <keyword arguments>]) -> HTTP.Response
First, let's walk through the positional arguments.
Method
method
refers to the HTTP method (sometimes known as "verb"), including GET, POST, PUT, DELETE, PATCH, TRACE, etc. It can be provided either as a String
like HTTP.request("GET", ...)
, or a Symbol
like HTTP.request(:GET, ...)
. There are also convenience methods for the most common methods:
HTTP.get(...)
HTTP.post(...)
HTTP.put(...)
HTTP.delete(...)
HTTP.patch(...)
HTTP.head(...)
These methods operate identically to HTTP.request
, except the method
argument is "builtin" via function name.
Url
For the request url
argument, the URIs.jl package is used to parse this String
argument into a URI
object, which detects the HTTP scheme, user info, host, port (if any), path, query parameters, fragment, etc. The host and port will be used to actually make a connection to the remote server and send data to and receive data from. Query parameters can be included in the url
String
itself, or passed as a separate query
keyword argument to HTTP.request
, which will be discussed in more detail later.
Headers
The headers
argument is an optional list of header name-value pairs to be included in the request. They can be provided as a Vector
of Pair
s, like ["header1" => "value1", "header2" => "value2"]
, or a Dict
, like Dict("header1" => "value1", "header2" => "value2")
. Header names do not have to be unique, so any argument passed will be converted to Vector{Pair}
. By default, HTTP.request
will include a few headers automatically, including "Host" => url.host
, "Accept" => "*/*"
, "User-Agent" => "HTTP.jl/1.0"
, and "Content-Length" => body_length
. These can be overwritten by providing them yourself, like HTTP.get(url, ["Accept" => "application/json"])
, or you can prevent the header from being included by default by including it yourself with an empty string as value, like HTTP.get(url, ["Accept" => ""])
, following the curl convention. There are keyword arguments that control the inclusion/setting of other headers, like basicauth
, detect_content_type
, cookies
, etc. that will be discussed in more detail later.
Body
The optional body
argument makes up the "body" of the sent request and is only used for HTTP methods that expect a body, like POST and PUT. A variety of objects are supported:
- a
Dict
orNamedTuple
to be serialized as the "application/x-www-form-urlencoded" content type, soDict("nm" => "val")
will be sent in the request body likenm=val
- any
AbstractString
orAbstractVector{UInt8}
which will be sent "as is" for the request body - a readable
IO
stream or anyIO
-like typeT
for whicheof(T)
andreadavailable(T)
are defined. This stream will be read and sent untileof
istrue
. This object should support themark
/reset
methods if request retires are desired (if not, no retries will be attempted). - Any collection or iterable of the above (
Dict
,AbstractString
,AbstractVector{UInt8}
, orIO
) which will result in a "Transfer-Encoding=chunked" request body, where each iterated element will be sent as a separate chunk - a
HTTP.Form
, which will be serialized as the "multipart/form-data" content-type
Response
Upon successful completion, HTTP.request
returns a HTTP.Response
object. It includes the following fields:
status
: the HTTP status code of the response, e.g.200
for normal responseheaders
: aVector
ofString
Pair
s for each name=value pair in the response headers. Convenience methods for working with headers include:HTTP.hasheader(resp, key)
to check if header exists;HTTP.header(resp, key)
retrieve the value of a header with namekey
;HTTP.headers(resp, key)
retrieve a list of all the headers with namekey
, since headers can have duplicate namesbody
: aVector{UInt8}
of the response body bytes. Alternatively, anIO
object can be provided via theresponse_stream
keyword argument to have the response body streamed as it is received.
Keyword Arguments
A number of keyword arguments are provided to give fine-tuned control over the request process.
query
Query parameters are included in the url of the request like http://httpbin.org/anything?q1=v1&q2=v2
, where the string ?q1=v1&q2=v2
after the question mark represent the "query parameters". They are essentially a list of key-value pairs. Query parameters can be included in the url itself when calling HTTP.request
, like HTTP.request(:GET, "http://httpbin.org/anything?q1=v1&q2=v2")
, but oftentimes, it's convenient to generate and pass them programmatically. To do this, pass an object that iterates String
Pair
s to the query
keyword argument, like the following examples:
HTTP.get(url; query=Dict("x1" => "y1", "x2" => "y2")
HTTP.get(url; query=["x1" => "y1", "x1" => "y2"]
: this form allows duplicate key valuesHTTP.get(url; query=[("x1", "y1)", ("x2", "y2")]
response_stream
By default, the HTTP.Response
body is returned as a Vector{UInt8}
. There may be scenarios, however, where more control is desired, like downloading large files, where it's preferable to stream the response body directly out to file or into some other IO
object. By passing a writeable IO
object to the response_stream
keyword argument, the response body will not be fully materialized and will be written to as it is received from the remote connection. Note that in the presence of request redirects and retries, multiple requests end up being made in a single call to HTTP.request
by default (configurable via the redirects
and retry
keyword arguments). If response_stream
is provided and a request is redirected or retried, the response_stream
is not written to until the final request is completed (either the redirect is successfully followed, or the request doesn't need to be retried, etc.).
Examples
Stream body to file:
io = open("get_data.txt", "w")
r = HTTP.request("GET", "http://httpbin.org/get", response_stream=io)
close(io)
println(read("get_data.txt", String))
Stream body through buffer:
r = HTTP.get("http://httpbin.org/get", response_stream=IOBuffer())
println(String(take!(r.body)))
verbose
HTTP requests can be a pain sometimes. We get it. It can be tricky to get the headers, query parameters, or expected body in just the right format. For convenience, the verbose
keyword argument is provided to enable debug logging for the duration of the call to HTTP.request
. This can be helpful to "peek under the hood" of what all goes on in the process of making request: are redirects returned and followed? Is the connection not getting made to the right host? Is some error causing the request to be retried unexpectedly? Currently, the verbose
keyword argument supports passing increasing levels to enable more and more verbose logging, from 0
(no debug logging, the default), up to 3
(most verbose, probably too much for anyone but package developers).
If you're running into a real head-scratcher, don't hesitate to open an issue and include the problem you're running into; it's most helpful when you can include the output of passing verbose=3
, so package maintainers can see a detailed view of what's going on.
Connection keyword arguments
connect_timeout
When a connection is attempted to a remote host, sometimes the connection is unable to be established for whatever reason. Passing a non-zero connect_timetout
value will cause HTTP.request
to wait that many seconds before giving up and throwing an error.
pool
Many remote web services/APIs have rate limits or throttling in place to avoid bad actors from abusing their service. They may prevent too many requests over a time period or they may prevent too many connections being simultaneously open from the same client. By default, when HTTP.request
opens a remote connection, it remembers the exact host:port combination and will keep the connection open to be reused by subsequent requests to the same host:port. The pool
keyword argument specifies a specific HTTP.Pool
object to be used for controlling the maximum number of concurrent connections allowed to be happening across the pool. It's constructed via HTTP.Pool(max::Int)
. Requests attempted when the maximum is already hit will block until previous requests finish. The idle_timeout
keyword argument can be passed to HTTP.request
to control how long it's been since a connection was lasted used in order to be considered 'valid'; otherwise, "stale" connections will be discarded.
readtimeout
After a connection is established and a request is sent, a response is expected. If a non-zero value is passed to the readtimeout
keyword argument, HTTP.request
will wait to receive a response that many seconds before throwing an error. Passing readtimeout = 0
disables any timeout checking and is the default.
status_exception
When a non-2XX HTTP status code is received in a response, this is meant to convey some error condition. 3XX responses typically deal with "redirects" where the request should actually try a different url (these are followed automatically by default in HTTP.request
, though up to a limit; see redirect
). 4XX status codes typically mean the remote server thinks something is wrong in how the request is made. 5XX typically mean something went wrong on the server-side when responding. By default, as mentioned previously, HTTP.request
will attempt to follow redirect responses, and retry "retryable" requests (where the status code and original request method allow). If, after redirects/retries, a response still has a non-2XX response code, the default behavior is to throw an HTTP.StatusError
exception to signal that the request didn't succeed. This behavior can be disabled by passing status_exception=false
, where the HTTP.Response
object will be returned with the non-2XX status code intact.
logerrors
If true
, HTTP.StatusError
, HTTP.TimeoutError
, HTTP.IOError
, and HTTP.ConnectError
will be logged via @error
as they happen, regardless of whether the request is then retried or not. Useful for debugging or monitoring requests where there's worry of certain errors happening but ignored because of retries.
logtag
If provided, will be used as the tag for error logging. Useful for debugging or monitoring requests.
observelayers
If true
, enables the HTTP.observelayer
to wrap each client-side "layer" to track the amount of time spent in each layer as a request is processed. This can be useful for debugging performance issues. Note that when retries or redirects happen, the time spent in each layer is cumulative, as noted by the [layer]_count
. The metrics are stored in the Request.context
dictionary, and can be accessed like HTTP.get(...).request.context
.
basicauth
By default, if "user info" is detected in the request url, like http://user:password@host
, the Authorization: Basic
header will be added to the request headers before the request is sent. While not very common, some APIs use this form of authentication to verify requests. This automatic adding of the header can be disabled by passing basicauth=false
.
canonicalize_headers
In the HTTP specification, header names are case insensitive, yet it is sometimes desirable to send/receive headers in a more predictable format. Passing canonicalize_headers=true
(false by default) will reformat all request and response headers to use the Canonical-Camel-Dash-Format.
proxy
In certain network environments, connections to a proxy must first be made and external requests are then sent "through" the proxy. HTTP.request
supports this workflow by allowing the passing of a proxy url via the proxy
keyword argument. Alternatively, it's a common pattern for HTTP libraries to check for the http_proxy
, HTTP_PROXY
, https_proxy
, HTTPS_PROXY
, and no_proxy
environment variables and, if present, be used for these kind of proxied requests. Note that environment variables typically should be set prior to starting the Julia process; alternatively, the withenv
Julia function can be used to temporarily modify the current process environment variables, so it could be used like:
resp = withenv("http_proxy" => proxy_url) do
HTTP.request(:GET, url)
end
The no_proxy
argument is typically a comma-separated list of urls that should not use the proxy, and is parsed when the HTTP.jl package is loaded, and thus won't work with the withenv
method mentioned.
detect_content_type
By default, the Content-Type
header is not included by HTTP.request
. To automatically detect various content types (html, xml, pdf, images, zip, gzip, JSON) and set this header, pass detect_content_type=true
.
decompress
By default, when the response includes the Content-Encoding: gzip
header, the response body will be decompressed. To avoid this behavior, pass decompress=false
. This keyword also controls the automatic inclusion of the Accept-Encoding: gzip
header in the request being sent.
Retry arguments
retry
Controls overall whether requests will be retried at all; pass retry=false
to disable all retries.
retries
Controls the total number of retries that will be attempted. Can also disable all retries by passing retries = 0
. Note that for a request to be retried, in addition to the retry
and retries
keyword arguments, must be "retryable", which includes the following requirements:
- Request body must be static (string or bytes) or an
IO
the supports themark
/reset
interface - The request method must be idempotent as defined by RFC-7231, which includes GET, HEAD, OPTIONS, TRACE, PUT, and DELETE (not POST or PATCH).
- If the method isn't idempotent, can pass
retry_non_idempotent=true
keyword argument to retry idempotent requests - The retry limit hasn't been reached, as specified by
retries
keyword argument - The "failed" response must have one of the following status codes: 403, 408, 409, 429, 500, 502, 503, 504, 599.
retry_non_idempotent
By default, this keyword argument is false
, which controls whether non-idempotent requests will be retried (POST or PATCH requests).
retry_delays
Allows providing a custom ExponentialBackOff
object to control the delay between retries. Default is ExponentialBackOff(n = retries)
.
retry_check
Allows providing a custom function to control whether a retry should be attempted. The function should accept 5 arguments: the delay state, exception, request, response (an HTTP.Response
object if a request was successfully made, otherwise nothing
), and resp_body
response body (which may be nothing
if there is no response yet, otherwise a Vector{UInt8}
), and return true
if a retry should be attempted. So in traditional nomenclature, the function would have the form f(s, ex, req, resp, resp_body) -> Bool
.
Redirect Arguments
redirect
This keyword argument controls whether responses that specify a redirect (via 3XX status code + Location
header) will be "followed" by issuing a follow up request to the specified Location
. There are certain rules/logic that are followed when deciding whether to redirect and how the redirected request will be made:
- The
Location
redirect url must be "valid" with an appropriate scheme and host; if the new location is relative to the original request url, the new url is resolved by callingURIs.resolvereference
- The response status code is one of: 301, 302, 307, or 308
- The method of the redirected request may change: 307 or 308 status codes will not change the method, 303 means only a GET request is allowed, otherwise, if the
redirect_method
keyword argument is provided, it will be used. If not provided, the redirected request will default to a GET request. - We'll only make a redirect request if we haven't made too many redirect attempts already, as controlled by the
redirect_limit
keyword argument (default 3) - The original request headers will, by default, be forwarded in the redirected request, unless the new url is an entirely new host, then
Cookie
andAuthorization
headers will be removed.
redirect_limit
Controls how many redirects will be "followed" by making additional requests in the case of a redirected response url recursively returning redirect responses and so on. In addition to redirect=false
, passing redirect_limit=0
will also disable any redirect behavior all together.
redirect_method
May control the method that will be used in the redirected request. For 307 or 308 status codes, the same method will be used by default. For 303, only GET requests are allowed. For other status codes (301, 302), this keyword argument will be used to determine what the redirect request method will be. Passing redirect_method=:same
will result in the same method as the original request. Otherwise, passing redirect_method=:GET
, or any other valid method as String
or Symbol
, will result in that method for the redirected request.
forwardheaders
Controls whether original request headers will be included in the redirected request. true
by default. Pass forwardheaders=false
to disable headers being used in redirected requests. By design, if the new redirect location url is a different host than the original host, "sensitive" headers will not be forwarded, including Cookie
and Authorization
headers.
SSL Arguments
require_ssl_verification
Controls whether the SSL configuration for a secure connection is verified in the handshake process. true
by default. Should only be set to false
if developing against a local server that can be completely trusted.
sslconfig
Allows specifying a custom MbedTLS.SSLConfig
configuration to be used in the secure connection handshake process to verify the connection. A custom cert and key file can be passed to construct a custom SSLConfig
like MbedTLS.SSLConfig(cert_file, key_file)
.
Cookie Arguments
cookies
Controls if and how a request Cookie
header is set for a request. By default, cookies
is true
, which means the cookiejar
(an internal global by default) will be checked for previously received Set-Cookie
headers from past responses for the request domain and, if found, will be included in the outgoing request. Passing cookies=false
will disable any automatic cookie tracking/setting. For more granular control, the Cookie
header can be set manually in the headers
request argument and cookies=false
is passed. Otherwise, a Dict{String, String}
can be passed in the cookies
argument, which should be cookie name-value pairs to be serialized into the Cookie
header for the current request. If cookies=false
or an empty Dict
is passed in the cookies
keyword argument, no Set-Cookie
response headers will be parsed and stored for use in future requests. In the automatic case of including previously received cookies, verification is done to ensure the cookie hasn't expired, matches the correct domain, etc.
cookiejar
If cookies are "enabled" (either by passing cookies=true
or passing a non-empty Dict
via cookies
keyword argument), this keyword argument specifies the HTTP.CookieJar
object that should be used to store received Set-Cookie
cookie name-value pairs from responses. Cookies are stored for the appropriate host/domain and will, by default, be included in future requests when the host/domain match and other conditions are met (cookie hasn't expired, etc.). By default, a single global CookieJar
is used, which is threadsafe. To pass a custom CookieJar
, first create one: jar = HTTP.CookieJar()
, then pass like HTTP.get(...; cookiejar=jar)
.
Streaming Requests
HTTP.open
Allows a potentially more convenient API when the request and/or response bodies need to be streamed. Works like:
HTTP.open(method, url, [, headers]; kw...) do io
write(io, body)
[startread(io) -> HTTP.Response]
while !eof(io)
readavailable(io) -> AbstractVector{UInt8}
end
end -> HTTP.Response
Where the io
argument provided to the function body is an HTTP.Stream
object, a custom IO
that represents an open connection that is ready to be written to in order to send the request body, and/or read from to receive the response body. Note that startread(io)
should be called before calling readavailable
to ensure the response status line and headers are received and parsed appropriately. Calling eof(io)
will return true until the response body has been completely received. Note that the returned HTTP.Response
from HTTP.open
will not have a .body
field since the body was read in the function body.
Download
A download
function is provided for similar functionality to Downloads.download
.
Client-side Middleware (Layers)
An HTTP.Layer
is an abstract type to represent a client-side middleware. A layer is any function of the form f(::Handler) -> Handler
, where Handler
is a function of the form f(::Request) -> Response
. Note that this Handler
definition is the same from the server-side documentation. It may also be apparent that a Layer
is the same as the Middleware
interface from server-side, which is true, but we define Layer
to clarify the client-side distinction and its unique usage.
Creating custom layers can be a convenient way to "enhance" the HTTP.request
process with custom functionality. It might be a layer that computes a special authorization header, or modifies the body in some way, or treats the response specially. Oftentimes, layers are application or domain-specific, where certain domain knowledge can be used to improve or simplify the request process. Layers can also be used to enforce the usage of certain keyword arguments if desired.
Custom layers can be deployed in one of two ways:
HTTP.@client
: Create a custom "client" with shorthand verb definitions, but which include custom layers; only these new verb methods will use the custom layers.HTTP.pushlayer!
/HTTP.poplayer!
: Allows globally adding and removing layers from the default HTTP.jl layer stack; all http requests will then use the custom layers
Quick Examples
module Auth
using HTTP
function auth_layer(handler)
# returns a `Handler` function; check for a custom keyword arg `authcreds` that
# a user would pass like `HTTP.get(...; authcreds=creds)`.
# We also accept trailing keyword args `kw...` and pass them along later.
return function(req; authcreds=nothing, kw...)
# only apply the auth layer if the user passed `authcreds`
if authcreds !== nothing
# we add a custom header with stringified auth creds
HTTP.setheader(req, "X-Auth-Creds" => string(authcreds))
end
# pass the request along to the next layer by calling `auth_layer` arg `handler`
# also pass along the trailing keyword args `kw...`
return handler(req; kw...)
end
end
# Create a new client with the auth layer added
HTTP.@client [auth_layer]
end # module
# Can now use custom client like:
Auth.get(url; authcreds=creds) # performs GET request with auth_layer layer included
# Or can include layer globally in all HTTP.jl requests
HTTP.pushlayer!(Auth.auth_layer)
# Now can use normal HTTP.jl methods and auth_layer will be included
HTTP.get(url; authcreds=creds)
For more ideas or examples on how client-side layers work, it can be useful to see how HTTP.request
is built on layers internally, in the /src/clientlayers
source code directory.