NGINX proxy pitfalls related to DNS resolving¶
If you're using proxy_pass and your endpoint's IP addresses can change over time, please read this article to avoid misunderstandings about how nginx works.
TL;DR¶
If you want to force nginx to re-resolve your endpoints, you should:
- Use variables in the proxy_pass directive, e.g. proxy_pass https://$endpoint/;, where $endpoint can be set manually or extracted from the location regex.
- Make sure that your endpoint isn't used in other locations without variables, because in that case resolving won't work. To fix this, move the endpoint domain to an upstream or use variables in proxy_pass in all locations.
- You can have both resolve and non-resolve locations for the same domain.
- When a variable is used in the proxy_pass directive, the Location header is no longer adjusted. To get around this, simply set proxy_redirect.
Explanatory example¶
location /api/ {
    proxy_pass http://api.com/;
}
In this case nginx will resolve api.com only once, at startup (or reload). But in some cases your endpoint can resolve to different IPs over time, e.g. if you're using a load balancer that does magic failover via DNS mapping. If api.com starts pointing to another IP, your proxying will fail.
Finding the solution¶
Add a resolver directive¶
You can check the official nginx documentation and find the resolver directive description there:
location /api/ {
    resolver 8.8.8.8;
    proxy_pass https://api.com/;
}
No, it will not work. Even this will not work:
location /api/ {
    resolver 8.8.8.8 valid=1s;
    proxy_pass https://api.com/;
}
That's because nginx doesn't respect the resolver directive in this case. It will resolve api.com only at startup (or reload) using the system resolver (/etc/resolv.conf), even if the real TTL of the A/AAAA record for api.com is 1s.
Add variables¶
You can google a bit and find that nginx tries to resolve the proxy endpoint at runtime when variables are used. The official documentation for the proxy_pass directive mentions this too. OK, I think this should be mentioned in the resolver description as well, but let's try anyway:
location = /proxy/ {
    set $endpoint proxy.com;
    resolver 8.8.8.8 valid=10s;
    proxy_pass https://$endpoint/;
}
nginx will re-resolve proxy.com every 10s, on the requests that arrive after the cached answer has expired. These configurations work too:
set $endpoint api.com;
location ~ ^/api/(.*)$ {
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/$1$is_args$args;
}
location ~ ^/(?<dest_proxy>[\w-]+)(?:/(?<path_proxy>.*))? {
    resolver 8.8.8.8 ipv6=off valid=60s;
    proxy_pass https://${dest_proxy}.example.com/${path_proxy}$is_args$args;
}
Note that nginx will start even without a resolver directive, but requests will fail with 502 at runtime, because "no resolver defined to resolve".
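As a rough sketch of that failure mode (the /no-resolver/ location name is hypothetical and only used for illustration), a configuration like the following should pass the config check and start, yet answer with 502, because nginx has nothing to resolve the variable with at request time:

```nginx
# Hypothetical location: the config loads, but requests fail at runtime.
location = /no-resolver/ {
    set $endpoint proxy.com;
    # No resolver directive here or in the enclosing server/http blocks,
    # so nginx cannot resolve $endpoint when a request arrives.
    proxy_pass https://$endpoint/;
}
```

Adding a resolver at the location, server, or http level should make the location work again.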
Caveats¶
location = /api_version/ {
    proxy_pass https://api.com/version/;
}
location ~ ^/api/(.*)$ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/$1$is_args$args;
}
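Here api.com is also used without variables in the /api_version/ location, so resolving won't work. Use variables everywhere to make it work as expected: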
location = /api_version/ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/version/;
}
location ~ ^/api/(.*)$ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/$1$is_args$args;
}
You can move set to the server block and resolver to the server or http block (or use include) to avoid copy-paste. Note that set is not allowed directly in the http block. I also assume that it will increase performance a bit, but I haven't tested it.
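A minimal sketch of that refactoring, assuming both locations proxy to the same api.com endpoint (the server_name below is a placeholder, not taken from a real setup):

```nginx
server {
    listen 80;
    server_name example.com;       # placeholder name

    # Declared once and shared by every location in this server block.
    resolver 8.8.8.8 valid=60s;
    set $endpoint api.com;

    location = /api_version/ {
        proxy_pass https://$endpoint/version/;
    }

    location ~ ^/api/(.*)$ {
        proxy_pass https://$endpoint/$1$is_args$args;
    }
}
```

Both locations still use a variable in proxy_pass, so the caveat described above should not apply.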
If the response from the proxied server contains a Location header, as in the case of a redirect, nginx will automatically rewrite it as needed. However, if variables are used in proxy_pass, this must be done explicitly via proxy_redirect:
location = /api_version/ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/version/;
    proxy_redirect https://$endpoint/ /;
}
location ~ ^/api/(.*)$ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/$1$is_args$args;
    proxy_redirect https://$endpoint/ /;
}
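For comparison, the nginx documentation describes the default proxy_redirect behavior as replacing the proxy_pass URL prefix with the location prefix. A sketch that mirrors that default for the /api_version/ location, assuming the proxied server redirects to URLs beneath https://api.com/version/, might look like this:

```nginx
location = /api_version/ {
    set $endpoint api.com;
    resolver 8.8.8.8 valid=60s;
    proxy_pass https://$endpoint/version/;
    # Mirrors what "proxy_redirect default" would do without variables:
    # the https://api.com/version/ prefix in Location is replaced
    # with the /api_version/ prefix of this location.
    proxy_redirect https://$endpoint/version/ /api_version/;
}
```

Which mapping is right depends on what the proxied server actually puts into Location, so treat both variants as starting points rather than a rule.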
Upstreams¶
If you're using nginx plus, you can use the resolve parameter of the upstream server directive; check out the documentation. I assume that it will be more efficient, because the documentation says it "monitors changes of the IP addresses that correspond to a domain name of the server", while the solutions listed above query DNS on particular requests. But if you're using open source nginx, no honey is available for you, at least in the versions tested here (the nginx documentation notes that resolve became available in open source starting with 1.27.3). No money, no honey.
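For builds where resolve is available, a minimal sketch could look like the following. The upstream name api_backend and the zone size are assumptions made for illustration. Per the nginx documentation, resolve requires the upstream group to live in shared memory (zone) and a resolver defined in the http block or in the upstream block:

```nginx
http {
    # Needed by the resolve parameter below.
    resolver 8.8.8.8 valid=60s;

    upstream api_backend {              # hypothetical upstream name
        zone api_backend 64k;           # shared memory zone, required for resolve
        server api.com:443 resolve;     # re-resolved without a reload
    }

    server {
        listen 80;

        location /api/ {
            proxy_pass https://api_backend/;
            proxy_set_header Host api.com;
        }
    }
}
```

No variables are needed in proxy_pass here, so the default proxy_redirect handling stays in effect.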
Two in one¶
You can have both resolve and non-resolve locations for the same domain:
upstream proxy {
    server proxy.com:443;
}
server {
    listen 80;
    server_name fillo.me;
    location = /proxy-with-resolve/ {
        set $endpoint proxy.com;
        resolver 8.8.8.8 valid=1s;
        proxy_pass https://$endpoint/;
    }
    location = /proxy-without-resolve/ {
        proxy_pass https://proxy/;
        proxy_set_header Host proxy.com;
    }
}
Yes, http://fillo.me/proxy-with-resolve/ will resolve proxy.com every 1s on particular requests, while http://fillo.me/proxy-without-resolve/ will not resolve proxy.com (nginx will resolve proxy.com once, at startup/reload). This magic works because the upstream directive is used.
Another example:
upstream api_version {
    server version.api.com:443;
}
server {
    listen 80;
    server_name fillo.me;
    location = /api_version/ {
        proxy_pass https://api_version/version/;
        proxy_set_header Host version.api.com;
    }
    location ~ ^/api/(?<dest_proxy>[\w-]+)(?:/(?<path_proxy>.*))? {
        resolver 8.8.8.8 valid=60s;
        proxy_pass https://${dest_proxy}.api.com/${path_proxy}$is_args$args;
    }
}
Tested on¶
- 1.9.6
- 1.10.1
I think it works for many other versions too.
Further research¶
- This issue says that changing HTTPS to HTTP helps. Check how protocol changes affect the examples above.
- Compare performance with and without resolving.
- Compare performance with different variable scopes.
- How to force upstream resolving.