Difference between revisions of "Nginx"

From WhyAskWhy.org Wiki
m (Replaced geshi tags with syntaxhighlight tags)
m (Added a link to "Understanding the Nginx Configuration Inheritance Model" (need to re-read it))
 
(14 intermediate revisions by the same user not shown)
Line 6: Line 6:
 
{{WorkInProgress}}
 
  
I've been using Nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bit by the differences between it and [[Apache]], which I have much more experience with. The one difference that stands out the most to me is the <code>location</code> blocks.
+
== Subpages ==
  
For the most part, [[Apache]] directives are matched in a top down manner with later directives overwriting earlier ones if duplicated. Nginx directives on the other hand seem to operate a little differently.
+
{{Special:PrefixIndex/Nginx/}}
  
== Location blocks ==
 
  
{{InfoBox
+
== nginx is thorny ==
|warning
 
|Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases <ref>[http://nginx.org/en/docs/http/ngx_http_core_module.html#location In versions from 0.7.1 to 0.8.41, if a request matched the prefix location without the “=” and “^~” prefixes, the search also terminated and regular expressions were not checked.]</ref>
 
}}
 
  
According to the official documentation <ref>[http://nginx.org/en/docs/http/request_processing.html#simple_php_site_configuration How nginx processes a request, A simple PHP configuration]</ref>, this is how nginx processes a request:
+
I've been using nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bitten by the differences between it and [[Apache]], which I have much more experience with.
  
<blockquote>
+
{{InfoBox|warning|Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases <ref name="prefix-none" />}}
nginx first searches for the most specific prefix location given by literal strings regardless of the listed order. In the configuration above the only prefix location is "/" and since it matches any request it will be used as a last resort. Then nginx checks locations given by regular expression in the order listed in the configuration file. The first matching expression stops the search and nginx will use this location. If no regular expression matches a request, then nginx uses the most specific prefix location found earlier.
 
</blockquote>
 
 
 
The same thing, said a little differently according to Samuel at the RackCorp Industry Blog <ref>[http://blog.rackcorp.com/?p=31 Nginx location and rewrite configuration made easy]</ref>:
 
  
<blockquote>
 
The best way to think of things is that as a request comes in, Nginx will scan through the configuration to find a "location" line that matches the request. There are TWO modes that nginx uses to scan through the configuration file: '''literal string matching''' and '''regular expression checks'''. Nginx first scans through ALL literal string location entries in the order that they occur in the configuration file, and secondly scans through ALL the regular expression location entries in the order that they occur in the configuration file. So be aware – ''location ordering order DOES matter''.
 
</blockquote>
 
  
 +
=== Directives ===
  
''The information below is heavily borrowed from the [http://wiki.nginx.org/HttpCoreModule#location Official nginx wiki].''
+
==== Case-sensitive ====
  
{| class="wikitable sortable"
+
For the most part, [[Apache]] directives are matched in a top-down manner, with later directives overwriting earlier ones if duplicated. Directives in nginx, on the other hand, seem to operate a little differently. One example: in [[Apache]], directive names in the configuration files are case-insensitive <ref name="apache-directives" />, whereas in nginx they are case-sensitive.
|-
 
|'''Syntax:'''
 
|'''location''' [ <code>=</code> <nowiki>|</nowiki> <code>~</code> <nowiki>|</nowiki> <code>~*</code> <nowiki>|</nowiki> <code>^~</code> ] ''uri'' { ... }<br/>'''location''' <code>@</code>''name'' { ... }
 
|-
 
|'''Default:'''
 
|
 
|-
 
|'''Context:'''
 
|server<br />location
 
|-
 
|'''Reference:'''
 
|[http://nginx.org/en/docs/http/ngx_http_core_module.html#location location]
 
|}
 
  
This directive allows different configurations depending on the URI. It can be configured using both literal strings and regular expressions. To use regular expressions, you must use a prefix:
 
#  <code>~</code> for case sensitive matching
 
#  <code>~*</code> for case insensitive matching
 
# there is no syntax for NOT matching a regular expression. Instead, match the target regular expression and assign an empty block, then use <code>location /</code> to match anything else.
 
  
The order in which <code>location</code> directives are checked is as follows:
+
==== Array values ====
  
#  Directives with the <code>=</code> prefix that match the query exactly (literal string). If found, searching stops.
+
Another gotcha is setting array values <ref name="nginx-array-values" />. It may not be immediately obvious, but when using a directive like <code>access_log</code> you are setting an array value. Because you are able to define multiple access logs, each use of the directive at the same configuration level adds another entry to the array.
#  All remaining directives with conventional strings. If this match used the <code>^~</code> prefix, searching stops.
 
#  Regular expressions, in the order they are defined in the configuration file.
 
#  If #3 yielded a match, that result is used. Otherwise, the match from #2 is used.
 
  
Details below.
+
So in a configuration like this:
  
To determine which ''location'' directive matches a particular query, the literal strings are checked first. Literal strings match the beginning portion of the query - the most specific match will be used. Afterwards, regular expressions are checked in the order defined in the configuration file. The first regular expression to match the query will stop the search. If no regular expression matches are found, the result from the literal string search is used.
+
<syntaxhighlight lang="nginx">
 +
http {
 +
    include      /etc/nginx/mime.types;
 +
    default_type  application/octet-stream;
  
For case-insensitive operating systems, like Mac OS X or Windows with Cygwin, literal string matching is done in a case-insensitive way (0.7.7). However, comparison is limited to single-byte locales only.
+
    # Found in default nginx 1.2.2 conf file
 +
    log_format  main '$remote_addr - $remote_user [$time_local] "$request" '
 +
                      '$status $body_bytes_sent "$http_referer" '
 +
                      '"$http_user_agent" "$http_x_forwarded_for"';
  
Regular expressions may contain captures (0.7.40), which can then be used in other directives.
+
    log_format vhost_combined_debugging '$server_name $remote_addr - $remote_user [$time_local] '
 +
                    '"(request = \'$request\')" $status $bytes_sent '
 +
                    '(request_filename = \'$request_filename\') $request_uri (args = \'$args\') '
 +
                    '"$http_referer" "$http_user_agent"';
  
It is possible to disable regular expression checks after literal string matching by using the "^~" prefix. If the most specific matching literal location has this prefix, regular expressions aren't checked.
+
    access_log  /var/log/nginx/access.log main;
  
The "=" prefix forces an '''exact''' (literal) match between the request URI and the ''location'' parameter. When matched, the search stops immediately.  A useful application is that if the request "/" occurs frequently, it's better to use "location = /", as that will speed up the processing of this request a bit, since the search will stop after the first comparison.
+
    # more conf statements here
  
On exact match with literal location without "=" or "^~" prefixes search is also immediately terminated.
+
    server {
 +
        # more conf statements here
  
It is important to know that nginx does the comparison against decoded URIs. For example, if you wish to match "/images/%20/test", then you must use "/images/ /test" to determine the location.
+
        # Enabled/disabled as needed for troubleshooting
 +
        # Note: See explanation below about the first access_log that was defined
 +
        access_log  /var/log/nginx/www.access.log vhost_combined_debugging;
  
Example:
+
        # more conf statements here
 +
    }
  
<syntaxhighlight lang="nginx">
 
location  = / {
 
  # matches the query / only.
 
  [ configuration A ]
 
}
 
location  / {
 
  # matches any query, since all queries begin with /, but regular
 
  # expressions and any longer conventional blocks will be
 
  # matched first.
 
  [ configuration B ]
 
}
 
location ^~ /images/ {
 
  # matches any query beginning with /images/ and halts searching,
 
  # so regular expressions will not be checked.
 
  [ configuration C ]
 
}
 
location ~* \.(gif|jpg|jpeg)$ {
 
  # matches any request ending in gif, jpg, or jpeg. However, all
 
  # requests to the /images/ directory will be handled by
 
  # Configuration C. 
 
  [ configuration D ]
 
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
Example requests:
+
Defining that second <code>access_log</code> inside the <code>server</code> block resets the array at that level, so the access log inherited from the <code>http</code> block is unset for that server. This is unfortunate, as it is easy to forget when you expect the default behavior to be one of inheritance. I've personally been bitten by this and only figured it out by observation.
 +
 
 +
Maxim Dounin explains it as:
 +
 
 +
<blockquote>
 +
Basically, when you set array directive at certan (sic) level this clears everything inherited from upper levels for this array. This applies to other array directives as well (proxy_add_header, access_log, etc.).
 +
</blockquote>
  
*  / -> configuration A
 
* /documents/document.html -> configuration B
 
* /images/1.gif -> configuration C
 
* /documents/1.jpg -> configuration D
 
  
Note that you could define these 4 configurations in any order and the results would remain the same.
+
==== Location blocks ====
  
The prefix "@" specifies a named location. Such locations are not used during normal processing of requests, they are intended only to process internally redirected requests (see
+
This one directive is, to me, the biggest difference between how Apache and nginx process requests. The <code>location</code> blocks gave me a lot of trouble until I went through the nginx documentation and wiki pages. I also made the mistake of placing everything in include files, which often muddied the picture for me. If you're new to nginx and are having trouble getting certain <code>location</code> blocks to match, I recommend ''not'' placing the <code>location</code> blocks in include files until you are comfortable with how nginx handles requests. See [[Nginx/Location]] for more information.
[http://wiki.nginx.org/HttpCoreModule#error_page|error_page], [http://wiki.nginx.org/HttpCoreModule#try_files|try_files]).
 
  
=== References ===
 
  
<references />
 
  
 
=== Additional Info ===
 
  
* [http://library.linode.com/web-servers/nginx/configuration/basic#sph_location-configuration Linode Library, Basic Nginx Configuration, Location Configuration]
+
* [http://kbeezie.com/view/securing-nginx-php/2/ Securing nginx and PHP]
* [http://wiki.nginx.org/HttpCoreModule#location nginx wiki - HttpCoreModule] (Official)
 
* [http://nginx.org/en/docs/http/ngx_http_core_module.html#location nginx module documentation, nginx core http] (Official)
 
* [http://stackoverflow.com/questions/8437613/nginx-location-directive-matching-order stackoverflow - location directive matching order]
 
 
* [http://www.ruby-forum.com/topic/151853 History of nginx]
 
* [http://kbeezie.com/view/securing-nginx-php/2/ Securing nginx and PHP]
+
* [http://articles.slicehost.com/2010/8/27/customizing-nginx-web-logs Customizing nginx web logs]
 +
* [http://blog.martinfjordvald.com/2012/08/understanding-the-nginx-configuration-inheritance-model/ Understanding the Nginx Configuration Inheritance Model]
 +
 
 +
 
 +
==== References ====
 +
 
 +
<references>
 +
<ref name="prefix-none">[http://nginx.org/en/docs/http/ngx_http_core_module.html#location In versions from 0.7.1 to 0.8.41, if a request matched the prefix location without the "=" and "^~" prefixes, the search also terminated and regular expressions were not checked.]</ref>
 +
<ref name="apache-directives">[http://httpd.apache.org/docs/2.2/configuring.html Apache 2.2.x Configuration Files]</ref>
 +
<ref name="nginx-array-values">[http://permalink.gmane.org/gmane.comp.web.nginx.english/4022 possible fastcgi_params bug or undocumented feature]</ref>
 +
</references>
  
  
Line 130: Line 98:
  
 
* [http://www.nginx.org/en/CHANGES All releases, including development releases]
 
* [http://www.nginx.org/en/CHANGES 1.2.x series]
+
* [http://nginx.org/en/CHANGES-1.2 1.2.x series]
  
  
 
==== Books ====
 
  
* [http://www.packtpub.com/nginx-http-server-for-web-applications/book PacktPub - Nginx HTTP Server]
+
Unfortunately there are not a lot of books on nginx out there, and the ones that are available are starting to show their age.
 +
 
 +
===== Nginx HTTP Server =====
 +
 
 +
The one I got started with is [http://www.packtpub.com/nginx-http-server-for-web-applications/book Nginx HTTP Server], and while it doesn't cover the current nginx versions, it is a good place to start. Just make sure to check any suggestions it offers against the current nginx documentation and wiki pages before putting the configurations into production.
 +
 
 
{| class="wikitable sortable" style="text-align: center;"
 
 
|+nginx versions covered (Jun 2010)
 

Latest revision as of 09:26, 6 September 2012




The following content is a Work In Progress and may contain broken links, incomplete directions or other errors. Once the initial work is complete this notice will be removed. Please contact me via Twitter with any questions and I'll try to help you out.


Subpages


nginx is thorny

I've been using nginx for a few years now and plan to do so for many more, but it wasn't an easy setup to get used to. In fact, I'm still getting bitten by the differences between it and Apache, which I have much more experience with.


Pay close attention to what version of nginx a book or web resource is based on as behavior may change between releases [1]


Directives

Case-sensitive

For the most part, Apache directives are matched in a top-down manner, with later directives overwriting earlier ones if duplicated. Directives in nginx, on the other hand, seem to operate a little differently. One example: in Apache, directive names in the configuration files are case-insensitive [2], whereas in nginx they are case-sensitive.
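
As a small illustration of the case-sensitivity point, here is a sketch of a server block. The port and server name are placeholders chosen for the example, not taken from any real setup. nginx is expected to accept only the lowercase directive names and to report a capitalized spelling as an unknown directive when the configuration is tested:

<syntaxhighlight lang="nginx">
server {
    # Directive names are case-sensitive, so they are written in lowercase.
    listen       8080;       # placeholder port (assumption)
    server_name  localhost;  # placeholder name (assumption)

    # A capitalized spelling such as the commented-out line below is not
    # recognized; nginx should reject it as an unknown directive.
    # Listen     8080;
}
</syntaxhighlight>

Running the usual configuration test (<code>nginx -t</code>) before reloading is a cheap way to catch this kind of mistake when porting habits over from Apache.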


Array values

Another gotcha is setting array values [3]. It may not be immediately obvious, but when using a directive like access_log you are setting an array value. Because you are able to define multiple access logs, each use of the directive at the same configuration level adds another entry to the array.

So in a configuration like this:

http {
    include       /etc/nginx/mime.types;
    default_type  application/octet-stream;

    # Found in default nginx 1.2.2 conf file
    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    log_format vhost_combined_debugging '$server_name $remote_addr - $remote_user [$time_local] '
                    '"(request = \'$request\')" $status $bytes_sent '
                    '(request_filename = \'$request_filename\') $request_uri (args = \'$args\') '
                    '"$http_referer" "$http_user_agent"';

    access_log  /var/log/nginx/access.log  main;

    # more conf statements here

    server {
        # more conf statements here

        # Enabled/disabled as needed for troubleshooting
        # Note: See explanation below about the first access_log that was defined
        access_log  /var/log/nginx/www.access.log vhost_combined_debugging;

        # more conf statements here
    }

}

Defining that second access_log inside the server block resets the array at that level, so the access log inherited from the http block is unset for that server. This is unfortunate, as it is easy to forget when you expect the default behavior to be one of inheritance. I've personally been bitten by this and only figured it out by observation.

Maxim Dounin explains it as:

Basically, when you set array directive at certan (sic) level this clears everything inherited from upper levels for this array. This applies to other array directives as well (proxy_add_header, access_log, etc.).
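
One way to keep both logs for that server is to list every access_log you want inside the server block itself, since the inherited entry is dropped as soon as the block defines its own. The fragment below is only a sketch and assumes the main and vhost_combined_debugging formats from the configuration above are already defined at the http level:

<syntaxhighlight lang="nginx">
server {
    # Declaring access_log here replaces whatever was inherited from http,
    # so the general log is repeated explicitly next to the debugging log.
    access_log  /var/log/nginx/access.log      main;
    access_log  /var/log/nginx/www.access.log  vhost_combined_debugging;

    # more conf statements here
}
</syntaxhighlight>

The cost is some repetition in each server block, but what gets logged no longer depends on remembering what the http level defined.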


Location blocks

This one directive is, to me, the biggest difference between how Apache and nginx process requests. The location blocks gave me a lot of trouble until I went through the nginx documentation and wiki pages. I also made the mistake of placing everything in include files, which often muddied the picture for me. If you're new to nginx and are having trouble getting certain location blocks to match, I recommend not placing the location blocks in include files until you are comfortable with how nginx handles requests. See Nginx/Location for more information.
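
When a location block refuses to match, it can help to test with a throwaway server whose blocks only announce themselves. The following is a minimal sketch for a scratch setup, not a production configuration: the port, server name and response texts are made up for illustration, and the expected matches in the comments follow the matching order described in the official nginx documentation for current releases.

<syntaxhighlight lang="nginx">
server {
    listen       8080;       # placeholder port (assumption)
    server_name  localhost;  # placeholder name (assumption)

    # Exact match: only a request for "/" itself ends up here.
    location = / {
        return 200 "exact match on /\n";
    }

    # Prefix match used as a last resort, e.g. /documents/document.html
    location / {
        return 200 "prefix match on /\n";
    }

    # Prefix match that skips the regular expression checks, e.g. /images/1.gif
    location ^~ /images/ {
        return 200 "prefix match on /images/ (regular expressions skipped)\n";
    }

    # Case-insensitive regular expression, e.g. /documents/1.jpg
    location ~* \.(gif|jpg|jpeg)$ {
        return 200 "regular expression match on image extension\n";
    }
}
</syntaxhighlight>

Requesting a few URLs against this server (with curl, for example) shows which block answered. Once the behavior makes sense, the same blocks can be moved into include files with less guesswork.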


Additional Info


References


Changelogs


Books

Unfortunately there are not a lot of books on nginx out there, and the ones that are available are starting to show their age.

Nginx HTTP Server

The one I got started with is Nginx HTTP Server, and while it doesn't cover the current nginx versions, it is a good place to start. Just make sure to check any suggestions it offers against the current nginx documentation and wiki pages before putting the configurations into production.

{| class="wikitable sortable" style="text-align: center;"
|+nginx versions covered (Jun 2010)
! Stable !! Dev !! Legacy
|-
| 0.7.66 || 0.8.40 || 0.5.38, 0.6.39
|}