Sunday, March 16, 2008

Divisibility Testing and Pattern Matching

How do you tell if a number is divisible by 10? Everybody knows you just look to see if the last digit is 0 or not. There are similar rules for checking divisibility by 5 (ends in 0 or 5) and 2 (ends in an even number). What I think is interesting is that these rules are not mathematical in the usual sense. Rather, they are based on simple textual pattern matching. That is, they don't rely on manipulating values, they just require inspecting the textual representation of the value in question.

If you are going to do pattern matching on text, the natural tool is regular expressions. Using the usual regular expression syntax, the above rules can be expressed as 0$, [05]$, and [02468]$, respectively. Those are the easy ones. Can we code more complicated divisibility tests as regular expressions? How about checking for divisibility by 4?

The mathematical shortcut for checking if 4 divides a number is to see if it divides just the last 2 digits. This works because if n = 100 a + b, 4 | n iff 4 | 100 a + b iff 4 | b. If we choose a and b so that b < 100, b will be exactly the last 2 digits of n. So we know we can get away with looking at just the last 2 digits; is there a regular expression we can use? Yes: ([02468][048])|([13579][26])$ works! Unsurprisingly, our regular expression considers only the last 2 digits.

So far, I've been considering the easy cases - testing for divisibility by combinations of 2 and 5 (which are easy since we're using the base 10 representation). You may know the shortcut for testing for divisibility by 3: add up the digits of the number, and if this sum is itself divisible by 3 then so was the original number. Is it possible to code this test as a regular expression?

To be continued...

[Erratum: the regular expression above for divisibility by 4 requires 2 digits, so it won't match 0, 4 and 8]

No comments: