Nathan Evans' Nemesis of the Moment

Targeting Mono in Visual Studio 2012

Posted in .NET Framework, Unix Environment, Windows Environment by Nathan B. Evans on February 13, 2013

These steps are known good on my Windows 8 machine, with Visual Studio 2012 w/ Update 1 and Mono 2.10.9.

  1. Install Mono for Windows, from http://www.go-mono.com/mono-downloads/download.html
    Choose a decent path, which for me was C:\Program Files (x86)\Mono-2.10.9
  2. Load an elevated administrative Command Prompt (Top tip: On Windows 8, hit WinKey+X then choose "Command Prompt (Admin)")
  3. From this Command Prompt, execute the following commands (in order):
    $ cd "C:\Program Files (x86)\Reference Assemblies\Microsoft\Framework\.NETFramework\v4.0\Profile"
    $ mklink /d Mono "C:\Program Files (x86)\Mono-2.10.9\lib\mono\4.0"
    $ cd Mono
    $ mkdir RedistList
    $ cd RedistList
    $ notepad FrameworkList.xml
  4. Notepad will start and ask about creating a new file, choose Yes.
  5. Now paste in this text and Save the file:
    <?xml version="1.0" encoding="UTF-8"?>
    <FileList ToolsVersion="4.0" RuntimeVersion="4.0" Name="Mono 2.10.9 Profile" Redist="Mono_2.10.9">
    </FileList>
  6. From the same Command Prompt, type:
    $ regedit
  7. In the Registry Editor, navigate to: HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v4.0.30319\SKUs\ and create a new Key folder called .NETFramework,Version=v4.0,Profile=Mono
  8. Now navigate to HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v4.0.30319\SKUs\ and create the same Key folder again here (this step is only necessary on x64 machines, but since practically all software developers use those, you'll probably need to do it!)
  9. Now load up VS2012 and a project. Go to the Project Properties (Alt+Enter while the project is selected in Solution Explorer).
  10. Choose the Application tab on the left and look for the “Target framework” drop-down list.
  11. On this list you should see an entry called “Mono 2.10.9 Profile”.
  12. Select it, and Visual Studio should then convert your project to target the Mono framework. You’ll notice that it will re-reference all your various System assemblies and if you study the filenames they will point to the Mono ones that were created during Step #3.

Note: I was scratching my head at first as I kept getting an error from Visual Studio saying:

This application requires one of the following versions of the .NET Framework:
.NETFramework,Version=v4.0,Profile=Mono

Do you want to install this .NET Framework version now?

Turns out that even on an x64 machine you MUST add both Registry key SKUs (see Steps #7 and #8). It is not enough to just add the Wow6432Node key; you must add the other as well. I can only assume this is because VS2012 is a 32-bit application. But maybe it’s also affected by whether you’re compiling to Any CPU, x64 or x86… who knows. It doesn’t really matter as long as this fixes it, which it does!
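
If you'd rather script Steps #7 and #8 than click through regedit, something along these lines would do it. This is just a rough sketch (the class and method names are mine): it assumes you run it from an elevated, 64-bit (or Any CPU on x64) process, otherwise WOW64 registry redirection will get in the way.

using Microsoft.Win32;

internal static class MonoSkuKeys {
    private const string Sku = ".NETFramework,Version=v4.0,Profile=Mono";

    public static void Create() {
        // Step #7: the "native" registry view.
        CreateEmptyKey(@"SOFTWARE\Microsoft\.NETFramework\v4.0.30319\SKUs\" + Sku);

        // Step #8: the 32-bit (Wow6432Node) view, which only exists on x64 machines.
        CreateEmptyKey(@"SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v4.0.30319\SKUs\" + Sku);
    }

    private static void CreateEmptyKey(string path) {
        using (Registry.LocalMachine.CreateSubKey(path)) { }
    }
}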

Building and executing your first program

Now that your development environment is nicely setup, you can proceed and build your first program.

The Mono equivalent of MSBuild is called XBuild (not sure why they didn’t call it MBuild or something!). You can build your .csproj by doing the following:

  1. Load the Mono Command Prompt (it will be on your Start Menu/Screen, just search for “mono”).
  2. Change directory to your project folder.
  3. Execute the following command to build your project using the Mono compiler:
    $ xbuild /p:TargetFrameworkProfile=""
    Note: You must specify the blank TargetFrameworkProfile parameter as otherwise the compiler will issue warnings along the lines of:

    Unable to find framework corresponding to the target framework moniker ‘.NETFramework,Version=v4.0,Profile=Mono’. Framework assembly references will be resolved from the GAC, which might not be the intended behavior.

  4. Hopefully you won’t have any errors from the build…
  5. Now you can run your program using the Mono runtime, to do this:
    $ mono --gc=sgen "bin\Debug\helloworld.exe"
    Note: You'll definitely want to use the "Sgen" garbage collector (hence the parameter) as the default one in Mono is unbearably slow.
  6. You can do a quick “smoke test” to verify everything is in order with both your compilation and execution. Have your program execute something like:
    Console.Write(typeof (Console).Assembly.CodeBase);
    … and this should print out a path similar to:
    file:///C:/PROGRA~2/MONO-2~1.9/lib/mono/4.0/mscorlib.dll
    I’ve no idea why it prints it out using 8.3 filename format, but there you go! You’ll notice that if you run your program outside of the Mono runtime then it will pick up the Microsoft CLR version from the GAC.
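
For completeness, a trivial console program is all you need for the smoke test; something like this:

using System;

internal static class Program {
    private static void Main() {
        Console.WriteLine("Hello from Mono (hopefully)!");

        // Prints the location of the mscorlib this process was actually loaded against.
        // Under Mono it points into the Mono lib folder; under the Microsoft CLR it points at the GAC copy.
        Console.WriteLine(typeof (Console).Assembly.CodeBase);
    }
}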

Simulating the P of CAP on a Riak cluster

Posted in .NET Framework, Automation, Unix Environment by Nathan B. Evans on February 10, 2013

When developing and testing a distributed system, one of the essential concerns you will deal with is eventual consistency (EC). There are plenty of articles covering EC so I’m not going to dwell on that much here. What I am going to talk about is testing, particularly of the automated kind.

Testing an eventually consistent system is difficult because everything is transient and unpredictable. Unless you tune your consistency values, such as N and DW, you’re offered no guarantees about the extent of propagation of your commit around the cluster. And whilst consistency tuning may be acceptable for some tests, it most definitely won’t be acceptable for tests that cover concurrent write concerns such as your sibling resolution protocols.

What is “partition tolerance”?

This is where a single node or a group of nodes in the cluster becomes segregated from the rest of the cluster. I liken it to a “net split” on an IRC network. The system continues operating but it has been split into two or more parts. When a node has become segregated from the rest of the cluster, it does not necessarily mean that its clients cannot still reach it. Therefore all nodes can continue to accept writes on the dataset.

Generally speaking, a partition event is a transient situation. It may last a few seconds, a few minutes, hours, days or even weeks. But the expectation is that eventually the partition will be repaired and the cluster returned to full health.

Fig. 1: State changes of a cluster during a partition event

In the diagram (Fig. 1) there is a cluster comprised of three nodes, and this is the sequence of events:

  1. N2 becomes network partitioned from N1 and N3.
    N1 and N3 can continue replicating between one another.
  2. Client(s) connected to either N1 or N3 perform two writes (indicated by the green and orange).
  3. Client(s) connected to N2 perform three writes (purple, red, blue).
  4. When the network partition is resolved, the cluster begins to heal by merging the commits between the nodes.
  5. The two commits from step 2 (that were already replicated between N1 and N3) are propagated onto N2.
  6. The purple, red and blue commits on N2 are propagated onto N1 and N3.
  7. The cluster is now fully healed.

Simulating a partition

I have a Riak cluster of three nodes running on Windows Azure and I needed some way to deterministically simulate a network partition scenario. The solution that I came up with was quite simple. I basically wrote some iptables scripts that temporarily firewalled certain IP traffic on the LAN in order to prevent the selected node from communicating with any other nodes.

To erect the partition:

# Simulates a network partition scenario by blocking all TCP LAN traffic for a node.
# This will prevent the node from talking to other nodes in the cluster.
# The rules are not persisted, so a restart of the iptables service (or indeed the whole box) will reset things to normal.

# First add special exemption to allow loopback traffic on the LAN interface.
# Without this, riak-admin gets confused and thinks the local node is down when it isn't.
sudo iptables -I OUTPUT -p tcp -d $(hostname -i) -j ACCEPT
sudo iptables -I INPUT -p tcp -s $(hostname -i) -j ACCEPT

# Now block all other LAN traffic.
sudo iptables -I OUTPUT 2 -p tcp -d 10.0.0.0/8 -j REJECT
sudo iptables -I INPUT 2 -p tcp -s 10.0.0.0/8 -j REJECT

To tear down the partition:

# Restarts the iptables service, thereby resetting any temporary rules that were applied to it.
sudo service iptables restart

Disclaimer: These are just rough shell scripts designed for use on a test bed development environment.

I then came up with a fairly neat little class that wraps up the concern of a network partition:

internal class NetworkPartition : IDisposable {
    private readonly string _nodeName;
    private bool _closed;

    private NetworkPartition(string nodeName) {
        _nodeName = nodeName;
    }

    /// <summary>
    /// Creates a temporary network partition by segregating (at the IP firewall level) a particular node from all other nodes in the cluster.
    /// </summary>
    /// <param name="nodeName">The name of the node to be network partitioned. The name must exist as a PuTTY "Saved Session".</param>
    /// <returns>An object that can be disposed when the partition is to be removed.</returns>
    public static IDisposable Create(string nodeName) {
        var np = new NetworkPartition(nodeName);
        Plink.Execute("simulate-network-partition.sh", nodeName);
        return np;
    }

    private void Close() {
        if (_closed)
            return;

        Plink.Execute("restart-iptables.sh", _nodeName);

        _closed = true;
    }

    public void Dispose() {
        Close();
    }
}

Which allows me to write test cases in the following way:

RingStatus.AssertOkay("riak03");

using (NetworkPartition.Create("riak03")) {
    RingStatus.AssertDegraded("riak03");

    // TODO: Do other stuff here whilst riak03 is network partitioned from the rest of the cluster.
}

RingReady.Wait("riak03");
RingStatus.AssertOkay("riak03");

Yes, RingStatus and RingReady aren’t documented here. But they’re pretty simple.
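
For the curious, here is roughly the shape they take. This is only a sketch, not the actual code: it assumes a hypothetical Plink.ExecuteWithOutput helper (which runs a remote command over SSH and returns its stdout) and leans on riak-admin's ringready command, which prints TRUE or FALSE depending on whether all reachable nodes agree on the ring.

using System;
using System.Threading;

internal static class RingStatus {
    public static void AssertOkay(string nodeName) {
        // riak-admin ringready prints "TRUE ..." when all reachable nodes agree on the ring.
        if (!Plink.ExecuteWithOutput("riak-admin ringready", nodeName).Contains("TRUE"))
            throw new InvalidOperationException("Expected the ring to be okay on " + nodeName);
    }

    public static void AssertDegraded(string nodeName) {
        if (Plink.ExecuteWithOutput("riak-admin ringready", nodeName).Contains("TRUE"))
            throw new InvalidOperationException("Expected the ring to be degraded on " + nodeName);
    }
}

internal static class RingReady {
    public static void Wait(string nodeName) {
        // Polls until the ring reports ready again (e.g. after a partition has been healed).
        while (!Plink.ExecuteWithOutput("riak-admin ringready", nodeName).Contains("TRUE"))
            Thread.Sleep(2000);
    }
}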

Obviously as part of this work I had to write a quick and dirty wrapper around the PuTTY plink.exe tool. This tool is basically a CLI version of PuTTY and it is very good for scripting automated tasks.
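
The real wrapper lives in the repository mentioned below, but a minimal sketch of that kind of wrapper might look like the following. It assumes plink.exe is on the PATH, and it hands the local script file to plink via the -m option (plink then runs the file's contents as the remote command); -load picks up the saved session and -batch suppresses interactive prompts.

using System;
using System.Diagnostics;

internal static class Plink {
    /// <summary>
    /// Runs a local shell script on a remote node via PuTTY's plink.exe.
    /// The node name must exist as a PuTTY "Saved Session".
    /// </summary>
    public static void Execute(string scriptFile, string nodeName) {
        var psi = new ProcessStartInfo {
            FileName = "plink.exe",
            Arguments = string.Format("-load \"{0}\" -batch -m \"{1}\"", nodeName, scriptFile),
            UseShellExecute = false,
            CreateNoWindow = true
        };

        using (var process = Process.Start(psi)) {
            process.WaitForExit();

            if (process.ExitCode != 0)
                throw new InvalidOperationException(
                    "plink exited with code " + process.ExitCode + " for node " + nodeName);
        }
    }
}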

My solution could be improved to support creating partitions that consist of more than just one node, but for me the added value of this would be very small at this stage. Maybe I will add support for this later when I get into stress testing territory!

Source

You can view it over on GitHub; the namespace is tentatively called “ClusterTools”. Bear in mind it’s currently held inside another project but if there’s enough interest I will make it standalone. There has been talk on the CorrugatedIron project (which is a Riak client library for .NET) about starting up a Contrib library, so maybe this could be one of its first contributions.

Riak timing out during startup

Posted in Databases, Unix Environment by Nathan B. Evans on February 6, 2013

Recently I’ve been working on some interesting projects involving the eventually consistent Riak key-value database. Today I encountered a puzzling issue with a fresh cluster I was deploying. I had seemingly done everything identically to previous deployments, except something was causing it to fail to start up when in service mode.

[administrator@riak01 ~]$ sudo service riak start
Starting Riak: Riak failed to start within 15 seconds,
see the output of 'riak console' for more information.
If you want to wait longer, set the environment variable
WAIT_FOR_ERLANG to the number of seconds to wait.
[FAILED]

Bizarrely, in console mode it would start up fine. This led me to believe it was some sort of user or permissions issue, but I wasn’t totally sure. Perhaps I had accidentally executed Riak as another user and some sort of lock file was created? Or was it perhaps a performance issue with the “Shared Core” Azure VMs I was using?

First I followed Riak’s advice by increasing the WAIT_FOR_ERLANG environment variable; I tried 30 and then 60 seconds. But this made no difference at all. I’m not even sure Riak was using my new value, as it still kept on printing out “15 seconds” as its reason for failing to start.

I researched some more and many places on the interweb were suggesting to purge the /var/lib/riak/ring/ directory (don’t do this if you have valuable data stored on the Riak instance). I tried this, but it also had no effect.

But it turned out that the solution was incredibly simple. Riak had created some sort of temporary lock directory at /tmp/riak. All I had to do was delete this directory and, hey presto, Riak would now start perfectly fine as a service!

$ sudo rm -r /tmp/riak

There may be more posts on the subject of Riak soon. 🙂

PS: I am using Riak version 1.2.1, on CentOS 6.3.


Cultural learnings of HA-Proxy, for make benefit…

Posted in Unix Environment by Nathan B. Evans on March 3, 2011

I’ve been setting up lots and lots of small details on our HA-Proxy cluster this week. This post is just a small digest of some of the things I have learnt.

The option nolinger is considered harmful.

I read somewhere that this option should be enabled because it frees up socket resources quicker and doesn’t leave them lying around when blatantly dead. I enabled it and thought nothing more of it. Having forgotten I had done so, I then started noticing strange behaviours. Most telling was that HA-Proxy’s webstats UI would truncate abruptly before completing. Fortunately, Willy Tarreau (the author/maintainer) was very quick to respond to my pestering e-mails and after seeing my Wireshark trace he immediately had a few ideas of what could be causing it. After following his suggestion to avoid the “no linger” option, I removed it from my configuration and the problem went away.

Therefore: “option nolinger considered harmful.” You’ve been warned!

Webstats UI has “hidden” administrative functions

While reading the infamous “wall of text” that is the HA-Proxy documentation, I came across a neat option called “stats admin”. It enables a single piece of extra functionality (at least it does in v1.4.11) that will let you flag servers as being online or offline. This is useful if you’re planning to take one or more servers out of a backend’s pool, perhaps for maintenance. I would wager that Willy intends to add more administrative features over time, so adding this one to your config now could save you some effort in the future.

Of course, it is not likely that you will want such a sensitive function to be exposed to everyone that uses webstats. So it is fortunate then that this option supports a condition expression. I set mine up like the following:

userlist UsersFor_HAProxyStatistics
  group admin users admin
  user admin insecure-password godwouldntbeupthislate
  user stats insecure-password letmein

listen HAProxy-Statistics *:81
  mode http
  stats enable
  stats uri /haproxy?stats
  stats refresh 60s
  stats show-node
  stats show-legends
  acl AuthOkay_ReadOnly http_auth(UsersFor_HAProxyStatistics)
  acl AuthOkay_Admin http_auth_group(UsersFor_HAProxyStatistics) admin
  stats http-request auth realm HAProxy-Statistics unless AuthOkay_ReadOnly
  stats admin if AuthOkay_Admin

Request/response rewriting is mutually exclusive of keep-alive connections

At least in current versions, HA-Proxy doesn’t seem to be able to perform rewriting on connections that have been kept alive. It is limited to analysing only the first request and response. Any further requests that occur on that connection will go unanalysed. So if you are doing request or response rewriting, it is imperative that you set a special option to ensure that a connection can only be used once.

In my case, I just added the following to my frontend definition.

option http-server-close

Identifying your frontend from your backend

I was creating some rules to ensure that a particular URL could only be accessed through my HTTPS frontend. I wanted to prevent unencrypted HTTP access to this URL because it was using HTTP Basic authentication which uses clear text passwords across the wire.

Fortunately, HA-Proxy supports a fairly neat way of doing this by means of tagging your frontend with a unique identifier which can then be matched against by the backend.

First of all, I setup my frontends like the following:

frontend Public-HTTP
  id 80
  mode http
  bind *:80
  option http-server-close
  default_backend Web-Farm

frontend Public-HTTPS
  id 8443
  mode http
  # Note: Port 8443 because the true 443 is being terminated by Stunnel, which then forwards to this 8443.
  bind *:8443
  option http-server-close
  default_backend Web-Farm

Then in my backend I cleared a space for defining “reusable” ACLs and then added the protective rule for the URL in question:

backend Web-Farm
  mode http
  balance roundrobin
  option httpchk
  server Web0 172.16.61.181:80 check
  server Web1 172.16.61.182:80 check

  # Common/useful ACLs
  acl ViaFrontend_PublicHttp fe_id 80
  acl ViaFrontend_PublicHttps fe_id 8443

  # Application security for: /MyWebPage/
  acl PathIs_MyWebPage path_beg -i /mywebpage
  http-request deny if PathIs_MyWebPage !ViaFrontend_PublicHttps

The piece of magic that makes this all work is the fe_id ACL criterion. Note that the “fe” stands for “frontend”.

Note the http-request deny rule combines two ACLs by boolean AND’ing them; HA-Proxy defaults to AND’ing. If you want to OR, just type “or” or “||”. Negation is done in the normal C way by using an exclamation symbol, as shown in the above example. I tend to avoid the “unless” statement as I prefer the explicitness of using “if” and then negating. But that’s just my personal preference as a long-time coder 🙂

Now if a user tries to visit http://.../MyWebPage they will get a big fat ugly 403 Forbidden error.

HTTP Basic authentication is finally very basic to do!

I came across a stumbling block this week. I assumed that Microsoft IIS, one of the best web servers available, could do HTTP Basic authentication, i.e. clear-text passwords over the wire, validated against some sort of clear-text password file or database. Turns out that while IIS does support HTTP Basic auth, it doesn’t support any form of simple backend. You have to validate against either the web server’s local Windows user accounts, or against Active Directory. Great. The web page in question was just a little hacky thing we knocked up to get a customer of ours out of a hole. We didn’t want to be creating maintenance headaches for ourselves by creating a local user account on each web server in the farm, nor did we fancy creating them an AD account. They don’t even belong to our company!

Fortunately (that word again), and despite how poorly documented it is, HA-Proxy *does* support this!

First of all you need to create a userlist that will contain your users/groups that you will authenticate against:

userlist UsersFor_AcmeCorp
  user joebloggs insecure-password letmein

Then in your backend, you need to create an ACL that uses the http_auth criterion. And lastly, create an http-request auth rule that will cause the appropriate 401 Unauthorized and WWW-Authenticate: Basic response to be generated if the authentication has failed.

backend HttpServers
  .. normal backend stuff goes here as usual ..
  acl AuthOkay_AcmeCorp http_auth(UsersFor_AcmeCorp)
  http-request auth realm AcmeCorp if !AuthOkay_AcmeCorp

Remove sensitive IIS / ASP.NET response headers

Security unconscious folk need not apply.

It’s a slight security risk to be leaking your precise IIS and ASP.NET version numbers. Whilst these headers can be turned off in IIS configuration, I think stripping them is more a concern for your frontend load balancer, i.e. HA-Proxy. The reason I believe this is that the headers can be useful for debugging on the internal LAN/VPN inside your company; only when they are about to touch the WAN do they become dangerous. Therefore:

frontend Public-HTTP
  # Remove headers that expose security-sensitive information.
  rspidel ^Server:.*$
  rspidel ^X-Powered-By:.*$
  rspidel ^X-AspNet-Version:.*$

HTTPS and separation of concerns

I don’t know about Apache, but IIS 7.5 can have some annoying (but arguably expected) behaviours when HA-Proxy is passing traffic where the client believes it has an end-to-end HTTPS connection with the web server. My setup involves Stunnel terminating the SSL connection, and from that point on it is just standard HTTP traffic to the backend servers. This means the backend servers don’t actually need to be listening on HTTPS/443 at all. However, when GET requests come in to them using the https:/ scheme they can get a bit confused (or argumentative, I’m undecided). IIS seems to like sending back a 302 redirect response, with a Location header that uses the http:/ scheme. So then of course the web browser will follow the redirect to either a URL that doesn’t exist, or one which does exist but is itself merely a redirect back to the https:/ scheme! Infinite loop, anyone?

The way to solve this is request rewriting, through some clever use of regular expressions.

frontend Public-HTTPS
  id 8443
  mode http
  bind *:8443
  option http-server-close
  default_backend Web-Farm

  # Rewrite requests so that they are passed to the backend as http:/ schemed requests.
  # This may be required if the backend web servers don't like handling https schemed requests over non-https transport.
  # I didn't use this in the end - but it might come in handy in the future so I left it commented out.
  # reqirep ^(\w+\ )https:/(/.*)$ \1http:/\2

  # Rewrite responses containing a Location header with HTTP scheme using the relative path.
  # We could alternatively just rewrite the http:/ to be https:/ but then it could break off-site redirects.
  rspirep ^Location:\s*http://.*?\.acmecorp.co.tld(/.*)$ Location:\ \1
  rspirep ^Location:(.*\?\w+=)http(%3a%2f%2f.*?\.acmecorp.co.tld%2f.*)$ Location:\ \1https\2

The first rspirep in the above example is the most important. The second is something more specific to a particular web application we’re hosting that uses a ?Redirect=http://yada.yada style query string in certain places.

The rsprep / rspirep rule (the i means case-insensitive matching) is very powerful. The only downside is that you do need to be fairly fluent with regular expressions. It requires only two parameters: the first is your regular expression and the second is your string replacement.

The string replacement that occurs in the second parameter supports expansion based upon indexed capture groups from the regular expression that was matched. This is useful for merging very specific pieces from the match back into the replacement string, as I am doing in the example above. They take the form of \1, \2 and so on, where the number indicates the capture group index. Capture groups are denoted in the regular expression by using parentheses, if you didn’t know.
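
If you want to see the capture group mechanics in isolation, here is the first rspirep from above expressed with .NET's Regex class. This is purely illustrative: HA-Proxy has its own regex engine and uses \1 where .NET uses $1, and acmecorp.co.tld is of course a placeholder domain.

using System;
using System.Text.RegularExpressions;

internal static class RspirepDemo {
    private static void Main() {
        const string header = "Location: http://www.acmecorp.co.tld/MyWebPage/Login.aspx";

        // Group 1 captures everything from the first "/" after the host name onwards.
        var rewritten = Regex.Replace(
            header,
            @"^Location:\s*http://.*?\.acmecorp\.co\.tld(/.*)$",
            "Location: $1",
            RegexOptions.IgnoreCase);

        Console.WriteLine(rewritten);   // prints: Location: /MyWebPage/Login.aspx
    }
}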

Truly “live” updates on the Webstats UI

One of the first things I noticed in the hours after deploying HA-Proxy is that the webstats counters held for each frontend, listen and backend section are not actually updated as frequently as they perhaps ought to be. Indeed, the counters for any given connection are not accumulated until that connection has ended. This is bad if your application(s) tend to hold open long-duration connections, as it reduces the usefulness of HA-Proxy’s reporting. I’m sure Willy had very good performance reasons for doing it this way, as is alluded to in the documentation. Fortunately there is a very simple workaround in the form of the contstats option.

Simply add the following to your proxy and benefit from higher accuracy webstats:

option contstats

Until next time…


Safely pairing HA-Proxy with virtual network interface providers like Keepalived or Heartbeat

Posted in Unix Environment by Nathan B. Evans on March 1, 2011

This is sort of a follow-up to the Deploying HA-Proxy + Keepalived with Mercurial for distributed config post.

During testing we were coming across an issue where the HA-Proxy instance running on the slave member of our cluster would fail to bind some of its frontend proxies:

Starting haproxy: [ALERT] : Starting proxy Public-HTTPS: cannot bind socket

After some head scratching I noticed that the problem was only arising on those proxies that explicitly defined the IP address of a virtual interface that was being managed by Keepalived (or maybe Heartbeat for you).

This is because both of these High-Availability clustering systems use a rather simplistic design whereby the “shared” virtual IP is only installed on the active node in the cluster; the nodes in a dormant state (i.e. the slaves) do not actually have those virtual IPs assigned to them while they remain dormant. It’s a sort of “IP address hot-swapping” design. I learnt this by executing a simple command, first from the master server:

$ ip a
<snipped stuff for brevity>
2: seth0:  mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:15:5d:28:7d:19 brd ff:ff:ff:ff:ff:ff
    inet 172.16.61.151/24 brd 172.16.61.255 scope global seth0
    inet 172.16.61.150/24 brd 172.16.61.255 scope global secondary seth0:0
    inet 172.16.61.159/24 brd 172.16.61.255 scope global secondary seth0:1
    inet6 fe80::215:5dff:fe28:7d19/64 scope link
       valid_lft forever preferred_lft forever
<snipped trailing stuff for brevity>

Then again, from the slave server:

$ ip a
<snipped stuff for brevity>
2: seth0:  mtu 1500 qdisc pfifo_fast qlen 1000
    link/ether 00:15:5d:2d:9c:11 brd ff:ff:ff:ff:ff:ff
    inet 172.16.61.152/24 brd 172.16.61.255 scope global seth0
    inet6 fe80::215:5dff:fe2d:9c11/64 scope link
       valid_lft forever preferred_lft forever
<snipped trailing stuff for brevity>

Unfortunately this behaviour can cause problems for programs like HA-Proxy which have been configured to expect specific IP addresses to exist on the server. I was considering working around it by writing some scripts that hook events within the HA cluster to handle stopping and starting HA-Proxy when needed. But this approach seemed clunky and unintuitive. So I dug a little deeper and came across a bit of a gem hidden away in the depths of the Linux networking stack. It is a simple boolean setting called “net.ipv4.ip_nonlocal_bind” and it allows a program like HA-Proxy to create listening sockets on IP addresses that are not currently assigned to the server. It was created specially for this situation.

So in the end the fix was as simple as adding/updating the /etc/sysctl.conf file to include the following key/value pair (apply it immediately with sysctl -p, or just reboot):

net.ipv4.ip_nonlocal_bind=1

My previous experience of setting up these low-level High-Availability clusters was with the Windows Server feature called Network Load Balancing (NLB). This works quite differently from Keepalived and Heartbeat. It relies upon some low-level ARP hacking/trickery and some sort of distributed time-splicing algorithm. But it does ensure that each node in the cluster (whether in a master or slave position) remains allocated with the virtual IP address(es) at all times. I suppose there is always more than one way to crack an egg…


Deploying HA-Proxy + Keepalived with Mercurial for distributed config

Posted in Automation, Source Control, Unix Environment by Nathan B. Evans on February 27, 2011

Something I have learnt (and re-learnt) too many times to count: one of the strange wonders of working for a startup company is that the most bizarre tasks can land in your lap seemingly with no warning.

We’ve recently been doing a big revamp of our data centre environment, including two shiny new Hyper-V hosts, a Sonicwall firewall and the decommissioning of lots of legacy hardware that doesn’t support virtualisation. As part of all this work we needed to put in place several capabilities for routing application requests on our SaaS platform:

  1. Expose HTTP/80 and HTTPS/443 endpoints on the public web and route incoming requests based upon URL to specific (and possibly many) private internal servers.
  2. Expose a separate and “special” TCP 443 endpoint (on public web) that isn’t really HTTPS at all but will be used for tunnelling of our TCP application protocol. We intend to use this when we acquire pilot programme customers that don’t want the “hassle” of modifying anything on their network firewalls/proxies. Yes, really. Even worse, it will inspect the source IP address and, from that, determine what customer it is and then route it to the appropriate private internal server and port number.
  3. Expose various other TCP ports on public web and map these (in as traditional “port map” style as possible) directly to one specific private internal server.
  4. Be easy to change the configuration and be scriptable, so we can tick off the “continuous deployment” check box.
  5. Configuration changes must never tamper with existing connections.
  6. Optional bonus, be source controllable.

My first suggestion was that we would write some PowerShell scripts to access the Sonicwall firewall through SSH and control its firewall tables directly. This was the plan for several months in fact, whilst everything was getting put in place inside the data centre. I knew full well it wouldn’t be easy. First there were some political issues inside the company with regard to a developer (me) having access to a central firewall. Second, I knew that creating and testing the scripts would be difficult, and that the CLI on the Sonicwall would surely not be as good as a Cisco’s.

I knew I could achieve #1 and #3 easily on a Sonicwall, like with any router really. But #2 was a little bit of an unknown as, frankly, I doubted if a Sonicwall could do it without jumping through a ton of usability hoops. #4 and #6 were the greatest unknown. I know you can export a Sonicwall’s configuration from the web interface. But it comes down as a binary file; which sort of made me doubt whether the CLI could do it properly as some form of text file. And of course if you can’t get the configuration as a text file then it’s not really going to be truly source controllable either, so that’s #6 out.

Fortunately an alternative (and better!) solution presented itself in the form of HA-Proxy. I’ve been hearing more and more positive things about this over the past couple of years, most notably from Stack Exchange. And having recently (finally!) shed my long-time slight phobia of Linux, I decided to have a go at setting it up this weekend on a virtual machine.

The only downside was that as soon as you move some of your routing decisions away from your core firewall, you start to get a bit worried about server failure. So naturally we had to ensure that whatever we came up with involving HA-Proxy could be deployed as a clustered master-master or master-slave style solution. That would mean that if our VM host “A” had a failure then Mr Backup over there, “B”, could immediately take up the load.

It seems that Stack Exchange chose to use the Linux-HA Heartbeat system for providing their master-slave cluster behaviour. In the end we opted for Keepalived instead. It is more or less the same thing, except that it’s apparently more geared towards load balancers and proxies such as HA-Proxy, whereas Heartbeat is designed more for situations where you only ever want one active server (i.e. master-slave(s)). Keepalived just seems more flexible in the event that we decide to switch to a master-master style cluster in the future.

HA-Proxy Configuration

Here’s the basic /etc/haproxy/haproxy.conf that I came up with to meet requirements #1, #2 and #3.

#
# Global settings for HA-Proxy.
global
	daemon
	maxconn 8192

#
# Default settings for all sections, unless overridden.
defaults
	mode http

	# Known-good TCP timeouts.
	timeout connect 5000ms
	timeout client 20000ms
	timeout server 20000ms

	# Prevents zombie connections hanging around holding resources.
	option nolinger

#
# Host HA-Proxy's web stats on Port 81.
listen HAProxy-Statistics *:81
	mode http
	stats enable
	stats uri /haproxy?stats
	stats refresh 20s
	stats show-node
	stats show-legends
	stats auth admin:letmein

#
# Front-ends
#
#########
	#
	# Public HTTP/80 endpoint.
	frontend Public-HTTP
		mode http
		bind *:80
		default_backend Web-Farm

	#
	# Public HTTPS/443 endpoint.
	frontend Public-HTTPS
		mode tcp
		bind 172.16.61.150:443
		default_backend Web-Farm-SSL

	#
	# A "fake" HTTPS endpoint that is used for tunnelling some customers based on the source IP address.
	# Note: At no point is this a true TLS/SSL connection!
	# Note 2: This only works if the customer network allows TCP 443 outbound without passing through an internal proxy (... which most of ours do).
	frontend Public-AppTunnel
		mode tcp

		#
		# Bind to a different interface so as not to conflict with Public-HTTPS (above).
		bind 172.16.61.159:443

		#
		# Pilot Customer 2 (testing)
		acl IsFrom_PilotCustomer2 src 213.213.213.0/24
		use_backend App-PilotCustomer2 if IsFrom_PilotCustomer2

#
# Back-ends
#
# General
#
#########
	#
	# IIS 7.5 web servers.
	backend Web-Farm
		mode http
		balance roundrobin
		option httpchk
		server Web0 172.16.61.181:80 check
		server Web1 172.16.61.182:80 check

	#
	# IIS 7.5 web servers, that expose HTTPS/443.
	# Note: This is probably not the best way, but it works for now. Need to investigate using the stunnel solution.
	backend Web-Farm-SSL
		mode tcp
		balance roundrobin
		server Web0 172.16.61.181:443 check
		server Web1 172.16.61.182:443 check

#
# Back-ends
#
# Application Servers (TCP bespoke protocol)
#
#########
	#
	# Customer 1
	listen App-Customer1
		mode tcp
		bind *:35007
		server AppLive0 172.16.61.12:35007 check

	#
	# Pilot Customer 2 (testing)
	listen App-PilotCustomer2
		mode tcp
		bind *:35096
		server AppLive0 172.16.61.12:35096 check

I doubt the file will remain this small for long. It’ll probably be 15x bigger in a week or two 🙂

Keepalived Configuration

And here’s the /etc/keepalived/keepalived.conf file.

vrrp_instance_VI_1 {
	state MASTER
	interface seth0
	virtual_router_id 51
	! this priority (below) should be higher on the master server, than on the slave.
	! a bit of a pain as it makes Mercurial'ising this config more difficult - anyone know a solution?
	priority 200
	advert_int 1
	authentication {
		auth_type PASS
		auth_pass some_secure_password_goes_here
	}
	virtual_ipaddress {
		172.16.61.150
		172.16.61.159
	}
}

It is rather straightforward as far as Keepalived configurations go. It is effectively no different to a Windows Server Network Load Balancing (NLB) deployment, with the right options to give the master-slave behaviour. Note the only reason I’ve specified two virtual IP addresses is because I need to use TCP port 443 twice (for different purposes). These will be port mapped on the Sonicwall to different public IP addresses, of course.

Mercurial, auto-propagation script for haproxy.conf

#!/bin/sh
cd /etc/haproxy/

#
# Check whether remote repo contains new changesets.
# Otherwise we have no work to do and can abort.
if hg incoming; then
  #
  #
  echo "The HA-Proxy remote repo contains new changesets. Pulling changesets..."
  hg pull

  #
  # Update to the working directory to latest revision.
  echo "Updating HA-Proxy configuration to latest revision..."
  hg update -C

  #
  # Re-initialize the HA-Proxy by informing the running instance
  # to close its listen sockets and then load a new instance to
  # recapture those sockets. This ensures that no active
  # connections are dropped like a full restart would cause.
  echo "Reloading HA-Proxy with new configuration..."
  /etc/init.d/haproxy reload

else
  echo "The HA-Proxy local repo is already up to date."
fi

I turned the whole /etc/haproxy/ directory into a Mercurial repository. The script above was also included in this directory (to gain free version control!), called sync-haproxy-conf.sh. I cloned this repository onto our central Mercurial master server.

It is then just a case of setting up a basic “* * * * * /etc/haproxy/sync-haproxy-conf.sh” cronjob so that the script above gets executed every minute (don’t worry it’s not exactly going to generate much load).

This is very cool because we can use the slave HA-Proxy server as a sort of testing ground. We can modify the config on that server quite a lot and test against it (by connecting directly to its IP rather than the clustered/virtual IP provided by Keepalived). Then once we’ve got the config just right we can commit it to the Mercurial repository and push the changeset(s) to the master server. Within 60 seconds the other server (or servers, possibly, in your case!) will run the synchronisation script.

One very neat thing about the newer versions of HA-Proxy (I deployed version 1.4.11) is that they have an /etc/init.d script that already includes everything you need for doing configuration file rebinds/reloads. This is great because what actually happens is that HA-Proxy will send a special signal to the old process so that it stops listening on the front-end sockets. Then it will attempt to start the new instance based upon the new configuration. If this fails, it will send another signal to the “old” (but now resurrected) process telling it to resume listening. Otherwise the old process will eventually exit once all its existing client connections have ended. This is brilliant because it meets, and rather elegantly exceeds, our expectations for requirement #5.

Given that our HA-Proxy instances will contain far more meticulous configuration detail than even our Sonicwall, I think this solution based upon Mercurial is simply brilliant. We have what is effectively a test and slave server all-in-one, and an hg revert or hg rollback command is of course only a click away.

It’s still a work in progress but so far I’m very pleased with the progress with HA-Proxy.

CentOS 5.5 losing time synchronisation on Hyper-V R2

Posted in Unix Environment, Windows Environment by Nathan B. Evans on February 21, 2011

Recently I deployed a CentOS 5.5 x64 guest on a Server 2008 R2 Hyper-V host for running a basic mail server. However, I quickly noticed that the guest was losing time sync very rapidly, drifting forwards by several minutes per hour. This was a surprise as I had already installed the Hyper-V Linux Integration components from Microsoft, which I had assumed would just take care of everything for me. Not so, apparently.

I might also add that Dovecot (an IMAP/POP3 server) kept crashing because the periodic NTP sync was so far out that it resulted in the guest’s clock actually going BACK in time! This appears to be an acknowledged bug in Dovecot, but it is understandable why they don’t feel a pressing need to fix it; though I understand the 2.0 release has. For the record, I was running the 1.0.7.7.el5 release.

Anyway, it turned out that the solution was very simple.

Modify the /boot/grub/grub.conf as follows:


Essentially you need to modify the lines that start with the word “kernel” and add the following extra options onto the end:

  • clock=pit

    This sets the clock source to use the Programmable Interrupt Timer (PIT). This is a fairly low level way for the kernel to track time and it works best with Hyper-V and Linux.

  • notsc

    This is included more as belt-and-braces than anything, because setting the PIT clock source (above) should already imply this setting. But I include it for pure expressiveness 🙂

  • divider=10

    This adjusts the PIT frequency resolution to be accurate to 10 milliseconds (which is perfectly sufficient for most applications). This isn’t strictly required but it will reduce some CPU load caused by the VM. If the VM will be running time sensitive calculations a lot (such as say a VoIP server or gaming server) then you probably shouldn’t include this option.

Once you’ve done this, save the file and reboot the box.

The time should now be synchronized precisely with the Hyper-V host!

PS: I’ve not tested whether this solution will work without the Hyper-V Linux Integration Components installed, but I believe that it will, as it operates independently of Hyper-V’s clock synchronisation mechanism and relies purely on the virtualised Programmable Interrupt Timer that is exposed by Hyper-V. The PIT clock source will remain unchanged whether you have the Linux Integration Components installed or not.

PPS: The VMware knowledge base has a great article on this subject: http://kb.vmware.com/kb/1006427. Take it with a slight pinch of salt (because our subject here is Hyper-V, not VMware), but it certainly gives several more options and ideas to try for different Linux distributions, 32-bit vs 64-bit, and so on.
