Difference between revisions of "Nginx"

From WhyAskWhy.org Wiki
m (Fixed language so it would match the previous modifer)
m (Added a link to "Understanding the Nginx Configuration Inheritance Model" (need to re-read it))
 
(10 intermediate revisions by the same user not shown)
Line 6: Line 6:
 
{{WorkInProgress}}
 
{{WorkInProgress}}
  
I've been using Nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bitten by the differences between it and [[Apache]], which I have much more experience with. The one difference that stands out the most to me is the <code>location</code> blocks.
+
== Subpages ==
  
For the most part, [[Apache]] directives are matched in a top-down manner, with later directives overwriting earlier ones if duplicated. Nginx directives, on the other hand, seem to operate a little differently.
+
{{Special:PrefixIndex/Nginx/}}
  
== Location blocks ==
 
  
{{InfoBox
+
== nginx is thorny ==
|warning
 
|Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases <ref name="prefix-none">[http://nginx.org/en/docs/http/ngx_http_core_module.html#location In versions from 0.7.1 to 0.8.41, if a request matched the prefix location without the "=" and "^~" prefixes, the search also terminated and regular expressions were not checked.]</ref>
 
}}
 
  
According to the official documentation <ref>[http://nginx.org/en/docs/http/request_processing.html#simple_php_site_configuration How nginx processes a request, A simple PHP configuration]</ref>, this is how nginx processes a request:
+
I've been using Nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bitten by the differences between it and [[Apache]], which I have much more experience with.
  
<blockquote>
+
{{InfoBox|warning|Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases <ref name="prefix-none" />}}
nginx first searches for the most specific prefix location given by literal strings regardless of the listed order. In the configuration above the only prefix location is "/" and since it matches any request it will be used as a last resort. Then nginx checks locations given by regular expression in the order listed in the configuration file. The first matching expression stops the search and nginx will use this location. If no regular expression matches a request, then nginx uses the most specific prefix location found earlier.
 
</blockquote>
 
  
The same thing, said a little differently according to Samuel at the RackCorp Industry Blog <ref>[http://blog.rackcorp.com/?p=31 Nginx location and rewrite configuration made easy]</ref>:
 
  
<blockquote>
+
=== Directives ===
The best way to think of things is that as a request comes in, Nginx will scan through the configuration to find a "location" line that matches the request. There are TWO modes that nginx uses to scan through the configuration file: '''literal string matching''' and '''regular expression checks'''. Nginx first scans through ALL literal string location entries in the order that they occur in the configuration file, and secondly scans through ALL the regular expression location entries in the order that they occur in the configuration file. So be aware – ''location ordering order DOES matter''.
 
</blockquote>
 
  
 +
==== Case-sensitive ====
  
''The information below is heavily borrowed from the [http://wiki.nginx.org/HttpCoreModule#location Official nginx wiki].''
+
For the most part, [[Apache]] directives are matched in a top-down manner, with later directives overwriting earlier ones if duplicated. Nginx directives, on the other hand, seem to operate a little differently. One example is that in [[Apache]], directive names in the configuration files are case-insensitive <ref name="apache-directives" />, but in nginx they are case-sensitive.
  
{| class="wikitable"
 
|-
 
|'''Syntax:'''
 
|'''location''' [ <code>=</code> <nowiki>|</nowiki> <code>~</code> <nowiki>|</nowiki> <code>~*</code> <nowiki>|</nowiki> <code>^~</code> ] ''uri''  { ... }<br/>'''location''' {  } <code>@</code>  ''name''  { ... }
 
|-
 
|'''Default:'''
 
|
 
|-
 
|'''Context:'''
 
|server<br />location
 
|-
 
|'''Reference:'''
 
|[http://nginx.org/en/docs/http/ngx_http_core_module.html#location location]
 
|}
 
  
This directive allows different configurations depending on the request URI. It can be configured using both literal strings and regular expressions. To use regular expressions, you must use a prefix:
+
==== Array values ====
*  <code>~</code> for case sensitive matching
 
*  <code>~*</code> for case insensitive matching
 
  
{{InfoBox
+
Another gotcha is setting array values <ref name="nginx-array-values" />. It may not be immediately obvious, but when using a directive like <code>access_log</code> you are setting an array value. Because you are able to define multiple access logs, each use of the directive at the same configuration level adds another entry to the array.
|info
 
|There is no syntax for NOT matching a regular expression. Instead, match the target regular expression and assign an empty block, then use <code>location /</code> to match anything else.
 
}}
 
  
 +
So in a configuration like this:
  
{| class="wikitable"
+
<syntaxhighlight lang="nginx">
|+Location Modifiers
+
http {
|-
+
    include      /etc/nginx/mime.types;
!Search Order
+
    default_type  application/octet-stream;
!Modifier
 
!Description
 
!Match Type
 
!Stops search on match
 
|-
 
|1st
 
|align="center"|<code>=</code>
 
|Location URI must match the specified pattern exactly
 
|Simple string
 
|align="center"|Yes
 
|-
 
|2nd
 
|align="center"|<code>^~</code>
 
|The location URI must begin with the specified pattern
 
|Simple string
 
|align="center"|Yes
 
|-
 
|3rd
 
|align="center"|<code>(None)</code>
 
|The location URI must begin with the specified pattern
 
|Simple string
 
|align="center"|No<ref name="prefix-none" />
 
|-
 
|4th
 
|align="center"|<code>~</code>
 
|The requested URI must be a case-sensitive match to the specified regular expression<ref name="case-insensitive-os">For case-insensitive operating systems such as Mac OS X and Cygwin, matching with prefix strings ignores a case (0.7.7). However, comparison is limited to one-byte locales.</ref>
 
|[[wikipedia:Perl_Compatible_Regular_Expressions|Perl Compatible Regular Expressions]]
 
|align="center"|Yes (first match)
 
|-
 
|4th
 
|align="center"|<code>~*</code>
 
|The requested URI must be a case-insensitive match to the specified regular expression
 
|[[wikipedia:Perl_Compatible_Regular_Expressions|Perl Compatible Regular Expressions]]
 
|align="center"|Yes (first match)
 
|-
 
|N/A
 
|align="center"|<code>@</code>
 
|Defines a named <code>location</code> block.<ref>The prefix "@" specifies a named location. Such locations are not used during normal processing of requests, they are intended only to process internally redirected requests (see
 
[http://wiki.nginx.org/HttpCoreModule#error_page error_page], [http://wiki.nginx.org/HttpCoreModule#try_files try_files]).</ref>
 
|Simple string
 
|align="center"|Yes
 
|}
 
  
 +
    # Found in default nginx 1.2.2 conf file
 +
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
 +
                      '$status $body_bytes_sent "$http_referer" '
 +
                      '"$http_user_agent" "$http_x_forwarded_for"';
  
The order in which <code>location</code> directives are checked is as follows:
+
    log_format vhost_combined_debugging '$server_name $remote_addr - $remote_user [$time_local] '
 +
                    '"(request = \'$request\')" $status $bytes_sent '
 +
                    '(request_filename = \'$request_filename\') $request_uri (args = \'$args\') '
 +
                    '"$http_referer" "$http_user_agent"';
  
# Directives with the <code>=</code> prefix that match the query exactly (literal string). If found, searching stops.
+
    access_log /var/log/nginx/access.log main;
#  All remaining directives with conventional strings. If this match used the <code>^~</code> prefix, searching stops.
 
# Regular expressions, in the order they are defined in the configuration file.
 
#  If #3 yielded a match, that result is used. Otherwise, the match from #2 is used.
 
  
 +
    # more conf statements here
  
-- The following needs to be cleaned up --
+
    server {
 +
        # more conf statements here
  
The matching is performed against a normalized URI, after decoding a text encoded in the “%XX” form, resolving references to relative path components “.” and “..”, and possible compression of two or more adjacent slashes into a single slash.
+
        # Enabled/disabled as needed for troubleshooting
 +
        # Note: See explanation below about the first access_log that was defined
 +
        access_log  /var/log/nginx/www.access.log vhost_combined_debugging;
  
A location can either be defined by a prefix string, or by a regular expression. Regular expressions are specified by prepending them with the “~*” prefix (for case-insensitive matching), or with the “~” prefix (for case-sensitive matching). To find a location matching a given request, nginx first checks locations defined using the prefix strings (prefix locations). Among them, the most specific one is searched. Then regular expressions are checked, in the order of their appearance in a configuration file. A search of regular expressions terminates on the first match, and the corresponding configuration is used. If no match with a regular expression is found then a configuration of the most specific prefix location is used.
+
        # more conf statements here
 +
    }
  
Locations can be nested, with some exceptions mentioned below.
+
}
 +
</syntaxhighlight>
  
 +
Defining that second <code>access_log</code> inside the <code>server</code> block resets the array at that level, so the access log inherited from the <code>http</code> level is no longer written for that server. This is unfortunate, as it is easy to forget when you expect the default behavior to be one of inheritance. I've personally been bitten by this and only figured it out by observation.
  
 +
Maxim Dounin explains it as:
  
 +
<blockquote>
 +
Basically, when you set array directive at certan (sic) level this clears everything inherited from upper levels for this array. This applies to other array directives as well (proxy_add_header, access_log, etc.).
 +
</blockquote>
  
Details below.
 
  
To determine which ''location'' directive matches a particular query, the literal strings are checked first. Literal strings match the beginning portion of the query - the most specific match will be used. Afterwards, regular expressions are checked in the order defined in the configuration file. The first regular expression to match the query will stop the search. If no regular expression matches are found, the result from the literal string search is used.
+
==== Location blocks ====
  
For case-insensitive operating systems, like Mac OS X or Windows with Cygwin, literal string matching is done in a case-insensitive way (0.7.7). However, comparison is limited to single-byte locales only.
+
This one directive is, to me, the biggest difference between how Apache and nginx process requests. The <code>location</code> blocks gave me a lot of trouble until I went through the nginx documentation and wiki pages. I also made the mistake of placing everything in include files, which often muddied the picture for me. If you're new to nginx and are having trouble getting certain <code>location</code> blocks to match, I recommend ''not'' placing the <code>location</code> blocks in include files until you are comfortable with how nginx handles requests. See [[Nginx/Location]] for more information.
  
Regular expressions may contain captures (0.7.40), which can then be used in other directives.
 
  
It is possible to disable regular expression checks after literal string matching by using the "^~" prefix. If the most specific matching literal location has this prefix, regular expressions aren't checked.
 
  
The "=" prefix forces an '''exact''' (literal) match between the request URI and the ''location'' parameter. When matched, the search stops immediately.  A useful application is that if the request "/" occurs frequently, it's better to use "location = /", as that will speed up the processing of this request a bit, since the search will stop after the first comparison.
+
=== Additional Info ===
  
On an exact match with a literal location without the "=" or "^~" prefixes, the search is also terminated immediately.<ref>I've asked on the nginx forums whether this is still true with recent versions because it sounds like [http://nginx.org/en/docs/http/ngx_http_core_module.html#location this page contradicts that information]</ref>
+
* [http://kbeezie.com/view/securing-nginx-php/2/ Securing nginx and PHP]
 
+
* [http://www.ruby-forum.com/topic/151853 History of nginx]
It is important to know that nginx does the comparison against decoded URIs. For example, if you wish to match "/images/%20/test", then you must use "/images/ /test" to determine the location.
+
* [http://articles.slicehost.com/2010/8/27/customizing-nginx-web-logs Customizing nginx web logs]
 
+
* [http://blog.martinfjordvald.com/2012/08/understanding-the-nginx-configuration-inheritance-model/ Understanding the Nginx Configuration Inheritance Model]
Example:
 
 
 
<syntaxhighlight lang="nginx">
 
location  = / {
 
  # matches the query / only.
 
  [ configuration A ]
 
}
 
location  / {
 
  # matches any query, since all queries begin with /, but regular
 
  # expressions and any longer conventional blocks will be
 
  # matched first.
 
  [ configuration B ]
 
}
 
location ^~ /images/ {
 
  # matches any query beginning with /images/ and halts searching,
 
  # so regular expressions will not be checked.
 
  [ configuration C ]  
 
}
 
location ~* \.(gif|jpg|jpeg)$ {
 
  # matches any request ending in gif, jpg, or jpeg. However, all
 
  # requests to the /images/ directory will be handled by
 
  # Configuration C. 
 
  [ configuration D ]  
 
}
 
</syntaxhighlight>
 
  
Example requests:
 
  
*  / -> configuration A
+
==== References ====
* /documents/document.html -> configuration B
 
* /images/1.gif -> configuration C
 
* /documents/1.jpg -> configuration D
 
  
Note that you could define these 4 configurations in any order and the results would remain the same.
+
<references>
 +
<ref name="prefix-none">[http://nginx.org/en/docs/http/ngx_http_core_module.html#location In versions from 0.7.1 to 0.8.41, if a request matched the prefix location without the "=" and "^~" prefixes, the search also terminated and regular expressions were not checked.]</ref>
 +
<ref name="apache-directives">[http://httpd.apache.org/docs/2.2/configuring.html Apache 2.2.x Configuration Files]</ref>
 +
<ref name="nginx-array-values">[http://permalink.gmane.org/gmane.comp.web.nginx.english/4022 possible fastcgi_params bug or undocumented feature]</ref>
 +
</references>
  
The prefix "@" specifies a named location. Such locations are not used during normal processing of requests, they are intended only to process internally redirected requests (see
 
[http://wiki.nginx.org/HttpCoreModule#error_page error_page], [http://wiki.nginx.org/HttpCoreModule#try_files try_files]).
 
  
=== References ===
+
==== Changelogs ====
  
<references />
+
* [http://www.nginx.org/en/CHANGES All releases, including development releases]
 
+
* [http://nginx.org/en/CHANGES-1.2 1.2.x series]
=== Additional Info ===
 
  
* [http://library.linode.com/web-servers/nginx/configuration/basic#sph_location-configuration Linode Library, Basic Nginx Configuration, Location Configuration]
 
* [http://wiki.nginx.org/HttpCoreModule#location nginx wiki - HttpCoreModule] (Official)
 
* [http://nginx.org/en/docs/http/ngx_http_core_module.html#location nginx module documentation, nginx core http] (Official)
 
* [http://stackoverflow.com/questions/8437613/nginx-location-directive-matching-order stackoverflow - location directive matching order]
 
* [http://www.ruby-forum.com/topic/151853 History of nginx]
 
* [http://kbeezie.com/view/securing-nginx-php/2/ Securing nginx and PHP]
 
  
 +
==== Books ====
  
==== Changelogs ====
+
Unfortunately there are not a lot of books on nginx out there, and the ones that are available are starting to show their age.  
 
 
* [http://www.nginx.org/en/CHANGES All releases, including development releases]
 
* [http://www.nginx.org/en/CHANGES 1.2.x series]
 
  
 +
===== Nginx HTTP Server =====
  
==== Books ====
+
The one I got started with is [http://www.packtpub.com/nginx-http-server-for-web-applications/book Nginx HTTP Server], and while it doesn't cover the current nginx versions, it is a good place to start. Just make sure to check any suggestions it offers against current nginx documentation and wiki pages before putting the configurations into production.
  
* [http://www.packtpub.com/nginx-http-server-for-web-applications/book PacktPub - Nginx HTTP Server]
 
 
{| class="wikitable sortable" style="text-align: center;"
 
{| class="wikitable sortable" style="text-align: center;"
 
|+nginx versions covered (Jun 2010)
 
|+nginx versions covered (Jun 2010)

Latest revision as of 08:26, 6 September 2012




The following content is a Work In Progress and may contain broken links, incomplete directions or other errors. Once the initial work is complete this notice will be removed. Please contact me via Twitter with any questions and I'll try to help you out.


Subpages


nginx is thorny

I've been using Nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bitten by the differences between it and Apache, which I have much more experience with.


Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases [1]


Directives

Case-sensitive

For the most part, Apache directives are matched in a top-down manner, with later directives overwriting earlier ones if duplicated. Nginx directives, on the other hand, seem to operate a little differently. One example is that in Apache, directive names in the configuration files are case-insensitive [2], but in nginx they are case-sensitive.
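
As a small illustration (a sketch, not taken from a real configuration; the host name is a placeholder I made up), nginx only accepts directive names in the lowercase form listed in its documentation, so the commented-out line below would be rejected as an unknown directive if it were enabled:

<syntaxhighlight lang="nginx">
server {
    listen       80;            # accepted: matches the documented directive name
    server_name  example.com;   # placeholder host name, an assumption for this sketch

    # Listen     80;            # rejected if uncommented: nginx does not know "Listen"
}
</syntaxhighlight>

Running nginx -t after each change is a cheap way to catch this kind of mistake before reloading.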


Array values

Another gotcha is setting array values [3]. It may not be immediately obvious, but when using a directive like access_log you are setting an array value. Because you are able to define multiple access logs, each use of the directive at the same configuration level adds another entry to the array.

So in a configuration like this:

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    # Found in default nginx 1.2.2 conf file
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format vhost_combined_debugging '$server_name $remote_addr - $remote_user [$time_local] '
                    '"(request = \'$request\')" $status $bytes_sent '
                    '(request_filename = \'$request_filename\') $request_uri (args = \'$args\') '
                    '"$http_referer" "$http_user_agent"';

    access_log  /var/log/nginx/access.log  main;

    # more conf statements here

    server {
        # more conf statements here

        # Enabled/disabled as needed for troubleshooting
        # Note: See explanation below about the first access_log that was defined
        access_log  /var/log/nginx/www.access.log vhost_combined_debugging;

        # more conf statements here
    }

}

Defining that second access_log inside the server block resets the array at that level, so the access log inherited from the http level is no longer written for that server. This is unfortunate, as it is easy to forget when you expect the default behavior to be one of inheritance. I've personally been bitten by this and only figured it out by observation.

Maxim Dounin explains it as:

Basically, when you set array directive at certan (sic) level this clears everything inherited from upper levels for this array. This applies to other array directives as well (proxy_add_header, access_log, etc.).
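
If the goal is to keep the http-level log and also add the debugging log for one server, one approach that should work is to repeat the inherited access_log inside the server block, since several access_log directives on the same level are all honored. This is only a sketch based on the behavior described above, reusing the log format names and paths from the earlier example:

<syntaxhighlight lang="nginx">
# Fragment only: the log_format definitions for "main" and
# "vhost_combined_debugging" are assumed to exist as in the example above.
http {
    access_log  /var/log/nginx/access.log  main;

    server {
        # Any access_log here replaces what was inherited from http {},
        # so the http-level log is listed again to keep it active.
        access_log  /var/log/nginx/access.log      main;
        access_log  /var/log/nginx/www.access.log  vhost_combined_debugging;
    }
}
</syntaxhighlight>

The cost is some duplication: if the http-level log path or format changes, each server block that repeats it has to be updated as well.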


Location blocks

This one directive is, to me, the biggest difference between how Apache and nginx process requests. The location blocks gave me a lot of trouble until I went through the nginx documentation and wiki pages. I also made the mistake of placing everything in include files, which often muddied the picture for me. If you're new to nginx and are having trouble getting certain location blocks to match, I recommend not placing the location blocks in include files until you are comfortable with how nginx handles requests. See Nginx/Location for more information.
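
To make the matching rules a little more concrete, here is a minimal sketch with all location blocks kept inline in a single server block. The response bodies are only labels I made up so that the chosen block is visible when testing; the behavior described in the comments follows the official location documentation for current releases, so keep the version warning above in mind for older ones.

<syntaxhighlight lang="nginx">
server {
    listen  80;

    # Exact match: used only for the request "/"; the search stops here.
    location = / {
        return 200 "exact";
    }

    # Plain prefix match: remembered as a fallback, but regular
    # expressions are still checked and win if one of them matches.
    location / {
        return 200 "prefix";
    }

    # Prefix match with ^~: if this is the longest matching prefix,
    # regular expressions are not checked at all.
    location ^~ /images/ {
        return 200 "images";
    }

    # Case-insensitive regular expression, checked in file order.
    location ~* \.(gif|jpg|jpeg)$ {
        return 200 "regex";
    }
}
</syntaxhighlight>

With this in place, a request for /documents/1.jpg should be answered by the regular expression block, while /images/1.gif should be answered by the ^~ block because regular expressions are skipped for it.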


Additional Info


References

  1. In versions from 0.7.1 to 0.8.41, if a request matched the prefix location without the "=" and "^~" prefixes, the search also terminated and regular expressions were not checked.
  2. Apache 2.2.x Configuration Files
  3. possible fastcgi_params bug or undocumented feature


Changelogs


Books

Unfortunately there are not a lot of books on nginx out there, and the ones that are available are starting to show their age.

Nginx HTTP Server

The one I got started with is Nginx HTTP Server, and while it doesn't cover the current nginx versions, it is a good place to start. Just make sure to check any suggestions it offers against current nginx documentation and wiki pages before putting the configurations into production.

nginx versions covered (Jun 2010)
Stable    Dev       Legacy
0.7.66    0.8.40    0.5.38, 0.6.39