What is a firewall?

Films where the enemy is attacking often include a comment about how "they're breaching our firewalls", yet most of the time the firewall just does as it's told.  It is possible to breach a firewall, subverting it to do something else, but this post will concentrate on what a firewall is in the first place - apologies if it gets a bit technical!

At a very basic level, a firewall is a list of rules that define what connections will be allowed or denied.  For example, I might create a rule that says allow my laptop access to the Internet on HTTP and HTTPS.  If that's my only rule, and my laptop tries to reach the Internet on FTP, SSH or anything that's not HTTP / HTTPS it'll probably be prevented from doing so.  Generally speaking a firewall will work from the top of the list down to the bottom and will stop processing the rules when the traffic matches a rule.

To liken a (basic) firewall to the real world, you could consider it like a bouncer at an exclusive nightclub.  You arrive at the door, give your name and the bouncer checks a list.  If your name is on the list you get through; if not, you're turned away.

I use a reasonably easy to understand way of writing firewall rules in documentation and this is the format I'll be using in this post:

ALLOW <SOURCE> TO <DESTINATION> ON <PORT/NAME> [NAT]

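In the format above I'm listing the source and destination for the traffic, the port the traffic is connecting to and whether Network Address Translation (NAT) should be used.  A rule that blocks traffic starts with DENY instead of ALLOW.  Some firewalls also let you restrict a rule to certain times, but not all of them support this, so time restrictions aren't part of this format.  An example set of rules is below: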

ALLOW 192.168.1.23 TO 10.212.89.6 ON SSH
ALLOW 192.168.1.23 TO 10.212.89.7 ON UDP5060
ALLOW 192.168.1.23 TO 8.8.8.8 ON DNS NAT
DENY 192.168.1.23 TO 8.8.4.4 ON DNS
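
To make the top-down, first-match behaviour a bit more concrete, here's a minimal Python sketch that reads rules written in the format above and checks a connection against them.  The names (Rule, parse_rule, first_match) are ones I've made up for illustration, and it only compares exact strings - a real firewall matches on far more than that.

```python
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    action: str       # "ALLOW" or "DENY"
    source: str
    destination: str
    service: str      # port or service name, e.g. "SSH" or "UDP5060"
    nat: bool


def parse_rule(line: str) -> Rule:
    """Parse 'ACTION <SOURCE> TO <DESTINATION> ON <PORT/NAME> [NAT]'."""
    parts = line.split()
    if len(parts) not in (6, 7) or parts[2] != "TO" or parts[4] != "ON":
        raise ValueError(f"unrecognised rule: {line!r}")
    if parts[0] not in ("ALLOW", "DENY"):
        raise ValueError(f"unknown action in rule: {line!r}")
    if len(parts) == 7 and parts[6] != "NAT":
        raise ValueError(f"unexpected trailing word in rule: {line!r}")
    return Rule(parts[0], parts[1], parts[3], parts[5], len(parts) == 7)


def first_match(rules, source, destination, service) -> Optional[Rule]:
    """Work down the list and stop at the first rule that matches."""
    for rule in rules:
        if (rule.source, rule.destination, rule.service) == (source, destination, service):
            return rule
    return None  # nothing matched; what happens next is up to the firewall


RULES = [parse_rule(line) for line in (
    "ALLOW 192.168.1.23 TO 10.212.89.6 ON SSH",
    "ALLOW 192.168.1.23 TO 10.212.89.7 ON UDP5060",
    "ALLOW 192.168.1.23 TO 8.8.8.8 ON DNS NAT",
    "DENY 192.168.1.23 TO 8.8.4.4 ON DNS",
)]

hit = first_match(RULES, "192.168.1.23", "8.8.4.4", "DNS")
print(hit.action if hit else "no rule matched")  # DENY
```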

How does the firewall know the source, destination and port?

As traffic passes through the firewall the packet's[1] header is read.  The header lists where the traffic has come from, where it's going and what it's connecting to (the port).  Using a tool like Wireshark we're able to see this information.  The screenshot below shows a packet with a source (Src) of 10.0.2.15, a destination (Dst) of 1.1.1.1 and a destination port (Dst Port) of 53, which is DNS.  We can also see this traffic used the User Datagram Protocol (UDP), although I won't be discussing UDP here.

Wireshark showing the packet's source, destination and destination port.
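
If you'd like to see roughly what "reading the header" means, the sketch below builds a bare IPv4 and UDP header using the addresses from the Wireshark example and then pulls the same fields back out.  It's an illustration only: the function names are mine, the source port of 50000 is an assumption (it isn't given above), the checksums are left at zero and there's no payload.

```python
import socket
import struct


def build_packet(src: str, dst: str, src_port: int, dst_port: int) -> bytes:
    """Build a 20-byte IPv4 header followed by an 8-byte UDP header."""
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        (4 << 4) | 5,            # version 4, header length of 5 x 32-bit words
        0,                       # DSCP / ECN
        28,                      # total length: 20 (IP) + 8 (UDP)
        0,                       # identification
        0,                       # flags and fragment offset
        64,                      # time to live
        17,                      # protocol number 17 = UDP
        0,                       # header checksum (not calculated here)
        socket.inet_aton(src),   # source address
        socket.inet_aton(dst),   # destination address
    )
    udp_header = struct.pack("!HHHH", src_port, dst_port, 8, 0)
    return ip_header + udp_header


def read_headers(packet: bytes) -> dict:
    """Extract the fields a basic firewall rule cares about."""
    ip_header_length = (packet[0] & 0x0F) * 4  # in bytes
    src_port, dst_port = struct.unpack(
        "!HH", packet[ip_header_length:ip_header_length + 4]
    )
    return {
        "src": socket.inet_ntoa(packet[12:16]),
        "dst": socket.inet_ntoa(packet[16:20]),
        "protocol": packet[9],
        "src_port": src_port,
        "dst_port": dst_port,
    }


info = read_headers(build_packet("10.0.2.15", "1.1.1.1", 50000, 53))
print(info["src"], info["dst"], info["dst_port"])  # 10.0.2.15 1.1.1.1 53
```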

Default rules

The default rule, sometimes called the implicit rule, is the rule at the bottom of the list.  What it says varies by firewall, but it's not uncommon for the default rule to block traffic.  This way, when traffic doesn't match a rule it'll simply be denied.  Earlier I said if I only had one rule, and my traffic didn't match it, the traffic would probably be denied - that's all up to the default rule.
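
As a rough sketch of the idea, the default rule is simply what gets returned when the loop over the rules runs out.  The evaluate function and the "my-laptop" name below are made up for illustration, and I've assumed a default of DENY because that's the common case described above.

```python
def evaluate(rules, source, service, default_action="DENY"):
    """Return the action of the first matching rule, else the default rule."""
    for action, rule_source, rule_services in rules:
        if source == rule_source and service in rule_services:
            return action
    return default_action  # the implicit rule at the bottom of the list


# The single rule from earlier: my laptop may use HTTP and HTTPS.
rules = [("ALLOW", "my-laptop", {"HTTP", "HTTPS"})]

print(evaluate(rules, "my-laptop", "HTTPS"))                        # ALLOW
print(evaluate(rules, "my-laptop", "FTP"))                          # DENY
print(evaluate(rules, "my-laptop", "FTP", default_action="ALLOW"))  # ALLOW
```

The last line shows why it's worth checking what your firewall's default rule actually says: with a default of ALLOW, the same FTP traffic gets through.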

Firewalls don't always show their default rule in the same way they show the normal rules.  For example, iptables shows the default rule on the chain (a chain is a list of rules, as some firewalls have multiple chains branched from the main list) in parentheses like this: Chain INPUT (policy ACCEPT).

On modern versions of Windows (certainly 7 and above, including server variants) the default rule is controlled on the properties page:

The default rule behaviour on the Windows firewall.

Does rule order matter?

By working from the top down, and stopping when there's a match, the firewall is able to be specific and broad at the same time.  For example, consider I have a network with a subnet of 192.168.1.0/24 (up to 254 devices, with IPs from 192.168.1.1 to 192.168.1.254).  I might want to tell my firewall that all devices can connect to the Internet except the PC at the sign-in desk (we'll call it 192.168.1.15).  To do this I create two rules:

ALLOW 192.168.1.0/24 TO The-Internet ON ALL-Ports NAT

and

DENY 192.168.1.15 TO The-Internet ON ALL-Ports

Rule order matters because the firewall will stop processing as soon as it finds a match.  If the rules on the firewall were in the order I've listed above, the sign-in desk PC (192.168.1.15) would have access to the Internet because it's part of 192.168.1.0/24 and so matches the ALLOW rule first.  Instead we need to swap the rules round so the DENY is processed first.

DENY 192.168.1.15 TO The-Internet ON ALL-Ports
ALLOW 192.168.1.0/24 TO The-Internet ON ALL-Ports NAT
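
Here's a small Python sketch of the two orderings using the standard ipaddress module.  It only looks at the source address (the destination and port are the same in both rules), the decide function is my own name, and the DENY returned when nothing matches is an assumed default rule.

```python
import ipaddress


def decide(rules, source_ip):
    """Return the action of the first rule whose source covers source_ip."""
    address = ipaddress.ip_address(source_ip)
    for action, source in rules:
        # A single IP such as 192.168.1.15 is treated as a /32 network.
        if address in ipaddress.ip_network(source):
            return action
    return "DENY"  # assumed default rule


wrong_order = [("ALLOW", "192.168.1.0/24"), ("DENY", "192.168.1.15")]
right_order = [("DENY", "192.168.1.15"), ("ALLOW", "192.168.1.0/24")]

print(decide(wrong_order, "192.168.1.15"))  # ALLOW - the DENY is never reached
print(decide(right_order, "192.168.1.15"))  # DENY
print(decide(right_order, "192.168.1.20"))  # ALLOW
```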

Correctly ordering rules is easy when there's only a handful of them but as the ruleset grows, particularly in a business or larger organisation, rules can end up in the wrong order.  This can make some rules completely ineffective while others grant excessive access.  Some firewalls, like the WatchGuard series of products, will automatically order your rules while others (Cisco ASA, Fortinet's FortiGate, iptables) require you to review the ruleset and place your new entry in the correct place.
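
When you're reviewing a ruleset by hand, one thing to look for is a rule that can never match because a broader rule sits above it.  The sketch below is one hedged way to spot that in Python; shadowed_rules is a hypothetical helper of mine, it compares source addresses only, and it isn't how any particular firewall product does its ordering.

```python
import ipaddress


def shadowed_rules(rules):
    """Return the positions of rules fully covered by an earlier rule's source."""
    networks = [ipaddress.ip_network(source) for _, source in rules]
    shadowed = []
    for position, network in enumerate(networks):
        if any(network.subnet_of(earlier) for earlier in networks[:position]):
            shadowed.append(position)
    return shadowed


wrong_order = [("ALLOW", "192.168.1.0/24"), ("DENY", "192.168.1.15")]
right_order = [("DENY", "192.168.1.15"), ("ALLOW", "192.168.1.0/24")]

print(shadowed_rules(wrong_order))  # [1] - the DENY can never be reached
print(shadowed_rules(right_order))  # []
```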

What about return traffic?  Stateful firewalls

So far we've looked at how firewalls know to let traffic pass through, but what about traffic replies?  Accessing a resource is normally done through requesting and receiving information:

  1. Browser: "Hello https://blog.jonsdocs.org.uk, please can I see your webpage?"
  2. Web server: "Sure, here it is!"

When the initial traffic passes through a firewall to get the web page over HTTPS, the firewall lets the traffic out because there's a rule for it:

ALLOW 192.168.1.213 TO blog.jonsdocs.org.uk ON HTTPS NAT

As you can see, there isn't a rule that says "let me have the answer from the web server" but so long as we're using a stateful firewall we will get the answer.  By being stateful the firewall is keeping track of connections that go out and it lets the reply data come back through.  If you like, it creates a temporary allow rule automatically.
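
To show the "temporary allow rule" idea, here's a very simplified Python sketch of a stateful firewall.  The StatefulFirewall class is invented for this post and is only an assumption about the general approach: real firewalls also track ports, protocol state and timeouts, none of which appear here.

```python
class StatefulFirewall:
    def __init__(self, allowed_outbound):
        # Each rule is a (source, destination, service) tuple.
        self.allowed_outbound = set(allowed_outbound)
        self.connections = set()  # connections we've let out

    def outbound(self, source, destination, service):
        """Check outgoing traffic against the rules and remember what passes."""
        if (source, destination, service) in self.allowed_outbound:
            self.connections.add((source, destination, service))
            return "ALLOW"
        return "DENY"

    def inbound(self, source, destination, service):
        """Allow traffic only if it's the reply to a tracked connection."""
        if (destination, source, service) in self.connections:
            return "ALLOW"
        return "DENY"


fw = StatefulFirewall([("192.168.1.213", "blog.jonsdocs.org.uk", "HTTPS")])

print(fw.inbound("blog.jonsdocs.org.uk", "192.168.1.213", "HTTPS"))   # DENY
print(fw.outbound("192.168.1.213", "blog.jonsdocs.org.uk", "HTTPS"))  # ALLOW
print(fw.inbound("blog.jonsdocs.org.uk", "192.168.1.213", "HTTPS"))   # ALLOW
```

The web server's traffic is denied until the browser has made its request, then allowed back in without any extra rule being written.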

As I mention in my plea to software developers and support companies, quite often companies don't understand about stateful firewalls so they request needless additional rules.

Data breaches - the firewall's fault?

I mentioned at the beginning that it is sometimes possible to subvert a firewall to behave as its owners don't want it to, perhaps letting attackers in.  It's worth noting though that a lot of data breaches are actually the result of problems caused by system administrators or developers.  For example, an administrator backs up their database and leaves the file downloadable via the company website.  When a third party downloads that backup (the breach) it's not the firewall's fault - it has just seen a request that matches a rule so has allowed it.

Is that it?

At a conceptual level, firewalls really are that easy to understand - they're just a list of rules.  Firewalls have become a lot more usable over time and now offer many features under the guise of a Next Generation Firewall.  I'll look at those in a future blog post.


Banner image, firewall, from Openclipart.org.

[1] Packet - a unit of data on a network