OL: using HAProxy as a better proxy balancer 

To distribute the requests between the webnodes we decided to put HAProxy between the NGINX webserver and the apps running on the webnodes. HAProxy allows us to balance requests between the webnodes with more granularity than NGINX.
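
The resulting request path, with the ports used in the configuration below, is roughly:

    client -> NGINX (ol-www1) -> HAProxy (ol-www1:7072) -> gunicorn apps (ol-web1:7071 and ol-web2:7071)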

The HAProxy statistics report can be checked at http://openlibrary.org/admin?stats

To do so we installed the HAProxy server on ol-www1, listening on port 7072, with the following /etc/haproxy/haproxy.cfg:

global
        log 127.0.0.1   daemon info
        maxconn 4096
        user haproxy
        group haproxy
        daemon
 
defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        retries 3
        option redispatch
        maxconn 2000
        contimeout      9000
        clitimeout      7200000
        srvtimeout      7200000
        stats uri     /admin?stats
        stats refresh 5s
        # these are added so the client ip comes through
        option httpclose
        option forwardfor
        option forceclose
 
 
listen  ol-web-app 0.0.0.0:7072
        mode    http
        balance roundrobin
        option httpchk GET /
        timeout check 3000
        #option httpchk GET /solr/select?rows=0&q=*:*
        # maxconn caps the concurrent requests sent to each node; health checks run
        # every 5s, 2 successes mark a node up and 2 failures mark it down
        server  web1 ol-web1:7071 maxconn 23 check inter 5000 rise 2 fall 2
        server  web2 ol-web2:7071 maxconn 15 check inter 5000 rise 2 fall 2
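
After editing the file, assuming the standard Debian/Ubuntu HAProxy package on ol-www1, the configuration can be validated and the service reloaded with something like:

    haproxy -c -f /etc/haproxy/haproxy.cfg
    sudo /etc/init.d/haproxy reload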

With this config file, requests are distributed round-robin between ol-web1 and ol-web2, with at most 23 and 15 concurrent requests per node respectively. The asymmetry between the nodes is due to the different number of gunicorn instances running on each.

Then we changed the NGINX configuration at /etc/nginx/sites-enabled/openlibrary.conf, setting the upstream to point to the HAProxy server:

upstream webnodes {
    server 127.0.0.1:7072;
}
 
...
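
The "..." stands for the rest of openlibrary.conf, which is not reproduced here. For reference, the upstream is typically used from a location block along these lines (standard NGINX proxy directives; a sketch, not the exact Open Library config):

    location / {
        proxy_pass http://webnodes;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }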