Conclusive UK VAT Number Regular Expression

This morning I have been trying to find information regarding the official government stance on the format of UK VAT numbers. I posted my original regular expression for UK VAT number formats on RegexLib.com some time ago, but it has since received a public comment pointing out that the UK format is not always 9 digits and can sometimes be 12. Puzzled by this, I set off to clarify.

I searched the HM Customs & Excise website to try to get the information from the horse's mouth. I came across a great resource that lists every country in the EU and the relevant VAT number format.

However, I came across a small, little, tiny hitch. In a fit of anti-EU madness, the author forgot to include the UK as one of those countries that belong to the EU!! I think that says a lot about British views on the UK’s EU membership!

Anyway, I did manage to find the specifications for all EU VAT numbers (as of 06 DEC 2005), on the EU central government website, and they are listed at the end of this post.

Hence my original regular expression:

^[1-9]\d{8}$

Changes to this, so as to include the 12-digit numbers used by branch traders:

^(([1-9]\d{8})|([1-9]\d{11}))$
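As a quick sanity check, here is a minimal Python sketch of that pattern. Python is my choice for illustration, the helper name `is_uk_vat_digits` is made up, and the sketch assumes any spaces have already been removed from the number.

```python
import re

# Pattern from the post: 9 digits (standard) or 12 digits (branch traders),
# neither starting with a zero.
UK_VAT_DIGITS = re.compile(r"^(([1-9]\d{8})|([1-9]\d{11}))$")

def is_uk_vat_digits(value: str) -> bool:
    """Hypothetical helper: True if value is a 9- or 12-digit number."""
    return UK_VAT_DIGITS.match(value) is not None

print(is_uk_vat_digits("123456789"))     # True: 9 digits
print(is_uk_vat_digits("123456789001"))  # True: 12 digits
print(is_uk_vat_digits("1234567890"))    # False: 10 digits
```

Anything between 9 and 12 digits is rejected, because each alternative must consume the whole string.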

Then, if we allow an optional prefix of the EU country code for the UK, which is “GB”, we have:

^(GB)?(([1-9]\d{8})|([1-9]\d{11}))$
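One thing worth checking when writing the prefix: in regex syntax `[GB]` is a character class meaning "G or B", so `([GB])*` accepts any run of those two letters, whereas `(GB)?` accepts only the literal prefix, zero or one times. A small Python sketch of the difference follows; the candidate strings are made up, and only the 9-digit form is used to keep it short.

```python
import re

# Character class: any number of "G" or "B" characters before the digits.
char_class = re.compile(r"^([GB])*[1-9]\d{8}$")
# Group: the literal text "GB", optional.
literal = re.compile(r"^(GB)?[1-9]\d{8}$")

def compare(candidate: str) -> tuple:
    """Return (matches char_class, matches literal) for one candidate."""
    return (bool(char_class.match(candidate)), bool(literal.match(candidate)))

for candidate in ["GB123456789", "123456789", "BBG123456789", "G123456789"]:
    print(candidate, compare(candidate))
```

The first two candidates pass both patterns; the last two pass only the character-class version, which is probably not what anyone wants from a validator.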

Following on to add the other minor formats (Government Departments and Health Authorities), you get this:

^(GB)?(([1-9]\d{8})|([1-9]\d{11})|(GD[1-9]\d{2})|(HA[1-9]\d{2}))$
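To show how the full pattern might be used in practice, here is a rough Python sketch. The function name `is_uk_vat` is hypothetical, and stripping spaces and upper-casing before matching is my own assumption, made because the EU listing writes the number in spaced blocks (for example GB999 9999 99).

```python
import re

# Optional literal "GB" prefix, then one of the four formats from the post.
UK_VAT = re.compile(
    r"^(GB)?(([1-9]\d{8})|([1-9]\d{11})|(GD[1-9]\d{2})|(HA[1-9]\d{2}))$"
)

def is_uk_vat(raw: str) -> bool:
    """Hypothetical helper: remove whitespace, upper-case, then match."""
    normalised = re.sub(r"\s+", "", raw).upper()
    return UK_VAT.match(normalised) is not None

print(is_uk_vat("GB123 4567 89"))      # True: standard format with spaces
print(is_uk_vat("gb123 4567 89 001"))  # True: branch trader
print(is_uk_vat("GBGD123"))            # True: government department
print(is_uk_vat("GB12345678"))         # False: only 8 digits
```

Note that the `[1-9]` in every alternative rejects a leading zero, so a value such as GBGD023 fails this pattern even though the EU structure GBGD999 does not itself rule it out.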

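If you need a named-groups regular expression, try the following (with the ignore-pattern-whitespace option switched on; the group names are only suggestions, so rename them to suit):
^
(GB)?
(
(?<standard>[1-9]\d{8})
|
(?<branch>[1-9]\d{11})
|
(?<government>GD[1-9]\d{2})
|
(?<health>HA[1-9]\d{2})
)
$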
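For anyone working in Python, the same idea can be sketched with `re.VERBOSE`, which is Python's equivalent of ignoring pattern whitespace. Python spells named groups `(?P<name>...)`. The group names and the `classify` helper below are my own inventions for illustration, not part of any official specification.

```python
import re

UK_VAT_NAMED = re.compile(
    r"""
    ^
    (GB)?                               # optional country prefix
    (
        (?P<standard>[1-9]\d{8})        # 9 digits
      |
        (?P<branch>[1-9]\d{11})         # 12 digits, branch traders
      |
        (?P<government>GD[1-9]\d{2})    # Government Departments
      |
        (?P<health>HA[1-9]\d{2})        # Health Authorities
    )
    $
    """,
    re.VERBOSE,
)

def classify(value: str):
    """Return the name of the alternative that matched, or None."""
    match = UK_VAT_NAMED.match(value)
    if match is None:
        return None
    # Only one named alternative is non-None after a successful match.
    return next(name for name, text in match.groupdict().items() if text is not None)

print(classify("GB123456789"))  # standard
print(classify("GBHA599"))      # health
```

Named groups earn their keep here because the caller can tell which kind of number was supplied without re-testing each alternative separately.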

The EU itself does provide the full breakdown of tax number formats for all EU countries, and as of today the following are given:

Member State | Structure | Format*

AT-Austria | ATU99999999 (1) | 1 block of 9 characters
BE-Belgium | BE999999999 or BE0999999999 (2) | 1 block of 9 digits or 1 block of 10 digits (3)
CY-Cyprus | CY99999999L | 1 block of 9 characters
CZ-Czech Republic | CZ99999999 or CZ999999999 or CZ9999999999 | 1 block of either 8, 9 or 10 digits
DE-Germany | DE999999999 | 1 block of 9 digits
DK-Denmark | DK99 99 99 99 | 4 blocks of 2 digits
EE-Estonia | EE999999999 | 1 block of 9 digits
EL-Greece | EL999999999 | 1 block of 9 digits
ES-Spain | ESX9999999X (4) | 1 block of 9 characters
FI-Finland | FI99999999 | 1 block of 8 digits
FR-France | FRXX 999999999 | 1 block of 2 characters, 1 block of 9 digits
GB-United Kingdom | GB999 9999 99 or GB999 9999 99 999 (5) or GBGD999 (6) or GBHA999 (7) | 1 block of 3 digits, 1 block of 4 digits and 1 block of 2 digits; or the above followed by a block of 3 digits; or 1 block of 5 characters
HU-Hungary | HU99999999 | 1 block of 8 digits
IE-Ireland | IE9S99999L | 1 block of 8 characters
IT-Italy | IT99999999999 | 1 block of 11 digits
LT-Lithuania | LT999999999 or LT999999999999 | 1 block of 9 digits, or 1 block of 12 digits
LU-Luxembourg | LU99999999 | 1 block of 8 digits
LV-Latvia | LV99999999999 | 1 block of 11 digits
MT-Malta | MT99999999 | 1 block of 8 digits
NL-The Netherlands | NL999999999B99 (8) | 1 block of 12 characters
PL-Poland | PL9999999999 | 1 block of 10 digits
PT-Portugal | PT999999999 | 1 block of 9 digits
SE-Sweden | SE999999999999 | 1 block of 12 digits
SI-Slovenia | SI99999999 | 1 block of 8 digits
SK-Slovakia | SK9999999999 | 1 block of 10 digits

Remarks:

*: Format excludes 2 letter alpha prefix
9: A digit
X: A letter or a digit
S: A letter; a digit; “+” or “*”
L: A letter

Notes:

1: The 1st position following the prefix is always “U”.
2: The first digit following the prefix is always zero (‘0’).
3: The VAT number of a Belgian trader can appear in either of these two formats. The (new) 10-digit format is the result of adding a leading zero to the (old) 9-digit format.
4: The first and last characters may be alpha or numeric; but they may not both be numeric.
5: Identifies branch traders.
6: Identifies Government Departments.
7: Identifies Health Authorities.
8: The 10th position following the prefix is always “B”.
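The symbols in the Remarks list map fairly directly onto regular expression fragments, so a structure string can be turned into a pattern mechanically. The Python sketch below is only an illustration under a few assumptions of mine: the function name `template_to_regex` is hypothetical, the structure is passed in without the trailing note number, letters are taken to be upper-case A to Z, and spaces between blocks are treated as optional. It does not enforce rules that live only in the notes, such as note 4 for Spain.

```python
import re

# Placeholder symbols from the Remarks list; any other character in the
# body of a structure (for example "U", "B", "GD", "HA") is a literal.
SYMBOLS = {
    "9": r"\d",
    "X": r"[A-Z0-9]",
    "S": r"[A-Z0-9+*]",
    "L": r"[A-Z]",
}

def template_to_regex(structure: str) -> re.Pattern:
    """Turn a structure such as 'IE9S99999L' into a compiled pattern."""
    # The first two characters are the country prefix and are always literal,
    # even when they happen to be "L", "S" or "X" (as in "LT" or "ES").
    prefix, body = structure[:2], structure[2:]
    parts = [re.escape(prefix)]
    for char in body:
        if char == " ":
            parts.append(r"\s?")  # assumption: block separators are optional
        else:
            parts.append(SYMBOLS.get(char, re.escape(char)))
    return re.compile("^" + "".join(parts) + "$")

dutch = template_to_regex("NL999999999B99")
print(bool(dutch.match("NL123456789B01")))  # True
print(bool(dutch.match("NL123456789A01")))  # False: 10th position must be "B"
```

Countries with several structures, such as the UK or Belgium, would need one pattern per structure, tried in turn.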