Varnish: a time, resource and money saver. A good friend for journalists and bloggers

Let’s call him “Jim”. Jim had this problem: his blog was so successful that he had to upgrade its server. For a few months, it worked. But soon the problem came back: hundreds of page views per second, with MySQL and Apache so busy that the server seemed to be melting.

One simple update on Facebook and the load rockets to 40, even 50. The WordPress cache plug-ins and tweaks were already at their limit. Nothing else to do but upgrade the hardware again?

Jim doesn’t like the idea. Ads are paying well, but raising the operational costs will cut into the profit.

Solution: an HTTP cache/proxy/accelerator as a front end, sparing Apache and MySQL most of the work.

I had heard about Varnish before, but at that time it looked too complex for my simple problems, and for Jim’s as well. But as I was planning a new topics mash-up for a new client (a Belgian television website), I decided to take a second look; if it delivers what it promises, this HTTP accelerator can save a lot of money.

We tested it and here’s the conclusion: for Jim, using Varnish made a night-and-day difference. Now, more than 90% of the page requests don’t even reach Apache + WordPress caching + MySQL. Server load went down to the 6-10 range, and now he can engage more readers by promoting his articles on Facebook.

And how did we do it?

I read about it for two days. Everything, from blog analysis to documentation to examples and stories of implementation. Google was my dedicated, helping friend.

On the third day, action. It was so simple and easy. It’s harder to describe than to do :)

Varnish has good install guides. Replace them I will not; look here for what you need, if you need it.

1. Preparation

I lost a few hours trying to understand how Varnish treats domains. You can serve any site with a Varnish front end. Some projects will perform better with a dedicated Varnish server. But the majority can share the same machine for both the proxy/accelerator and the web server.

The problem comes with the default HTTP port. One can have Varnish and Apache (or nginx) sharing the port with some tricks. The documentation points to examples that move Apache to a different port. But why complicate? I made it simple: Apache listens on port 80 on one IP address, Varnish listens on port 80 on another IP address. As long as you have at least one spare IP address…

This solution has side benefits. First, you don’t have to change anything in your httpd conf files. One simple DNS modification is enough: point the site’s A record at the IP address Varnish listens on. For my case, and Jim’s, the simpler, the better. We both operate several domains on the same server and messing with httpd confs is always a headache.

2. Varnish and Plesk
I no longer use Plesk, but Jim does. After much reading and testing, the best thing to do was to edit httpd.conf and replace the catch-all Listen directive with narrower ones.

We commented out this line…

# Listen 80

… and added one line for each of Jim’s server IPs …

Listen your-ip-1:80
Listen your-ip-2:80
...
Listen your-ip-8:80

… except one: the IP address we reserved for Varnish.

It’s known that Plesk rewrites Apache conf files for each domain. But apparently it doesn’t touch httpd.conf — at least, not in Jim’s case (Plesk 9.5.2).

3. Configuration
Varnish has two configurations. First, you tell the daemon how it should operate. That can be done on the command line or in the Varnish daemon configuration file. Read the documentation for exhaustive coverage. What follows is just a working example: mine.

DAEMON_OPTS="-a 127.0.0.1:80 \
-f /etc/varnish/default.vcl \
-s malloc,3G"

The first line (-a) sets the address and port Varnish itself listens on, not Apache’s; Apache’s address belongs in the backend definition of the VCL file. Depending on your configuration, it can be 127.0.0.1, localhost or a real IP, as long as Apache is not bound to that same address and port.
The second line points to the second configuration file, a more complex one.
The third line tells Varnish to cache pages in memory, using no more than 3 GB (memory generally performs better than disk, but you can use disk as well).

Now, the VCL configuration (VCL 3.0 is described here).

There aren’t many examples of VCL for WordPress installations out there, and some of them didn’t work for me. Cookies are the problem. After some reading and testing, I came to this VCL configuration, a mix of solutions from others with a touch of my own (like req.grace):

backend origin {
    .host = "127.0.0.1";
    .port = "80";
}

sub vcl_recv {
    set req.backend = origin; # using our unique backend
    # MAGIC: Varnish can keep serving cached pages for up to 2 hours
    # past their TTL when Apache is slow or down.
    set req.grace = 2h;
    # set standard proxied ip header for getting original remote address
    set req.http.X-Forwarded-For = client.ip;

    # logged in users must always pass
    if (req.url ~ "^/wp-(login|admin)" || req.http.Cookie ~ "wordpress_logged_in_") {
        return (pass);
    }

    # I have a community section I don't want to cache
    if (req.url ~ "^/community/") {
        return (pass);
    }

    # don't cache search results
    if (req.url ~ "\?s=") {
        return (pass);
    }

    # always pass through posted requests and those with basic auth
    if (req.request == "POST" || req.http.Authorization) {
        return (pass);
    }

    # else ok to fetch a cached page
    unset req.http.Cookie;
    return (lookup);
}

sub vcl_fetch {
    # keep objects for 2 hours past their TTL so they can be served in grace
    set beresp.grace = 2h;

    # remove some headers we never want to see
    unset beresp.http.Server;
    unset beresp.http.X-Powered-By;

    # only allow cookies to be set if we're in admin or community area
    if (beresp.http.Set-Cookie && req.url !~ "/wp-(login|admin)" && req.url !~ "/community") {
        unset beresp.http.Set-Cookie;
    }

    # don't cache response to posted requests or those with basic auth
    if (req.request == "POST" || req.http.Authorization) {
        return (hit_for_pass);
    }

    # only cache status ok
    if (beresp.status != 200) {
        return (hit_for_pass);
    }

    # don't cache search results
    if (req.url ~ "\?s=") {
        return (hit_for_pass);
    }

    # else ok to cache the response
    set beresp.ttl = 300s;
    return (deliver);
}
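
A note on the grace settings: as far as I know, Varnish 3 only falls back to stale pages for a dead backend once it considers that backend sick, and it can only know that through a health probe. The fragment below is a minimal sketch of that idea, not part of the configuration I actually run. The probe URL, every timing value and the 30s grace for a healthy backend are assumptions for illustration, not tuned settings.

```vcl
backend origin {
    .host = "127.0.0.1";
    .port = "80";
    # Assumed probe settings (illustrative, untuned): request "/" every 5s
    # and consider the backend healthy if 3 of the last 5 checks succeeded.
    .probe = {
        .url = "/";
        .interval = 5s;
        .timeout = 2s;
        .window = 5;
        .threshold = 3;
    }
}

sub vcl_recv {
    set req.backend = origin;
    if (req.backend.healthy) {
        # Backend is up: tolerate only slightly stale pages.
        set req.grace = 30s;
    } else {
        # Backend is marked sick: accept pages up to 2 hours past their TTL.
        set req.grace = 2h;
    }
}
```

To try it, the .probe block would go into the existing backend definition and the if/else would take the place of the single req.grace line in vcl_recv. The beresp.grace = 2h line in vcl_fetch stays as it is, since that is what keeps expired objects around long enough to be served. Keep in mind that the probe hits Apache directly, so pick a URL that is cheap to render.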

Do you have working examples to share?