For determining divisibility by 3, such an enumeration isn't available to us (we have to consider every digit since there is no power of 10 evenly divisible by 3). How do we know that it's possible to write a regular expression that matches on multiples of 3? Because we can write a finite state automaton that tests for divisibility by 3.
enum State
{
Accept,
Rem1,
Rem2
}
bool IsDivisibleBy3(char* num)
{
State state = State.Accept;
for (; *num; ++num)
{
switch (*num)
{
case '0': case '3': case '6': case '9':
break;
case '1': case '4': case '7':
if (state == State.Accept)
state = State.Rem1;
else if (state == State.Rem1)
state = State.Rem2;
else
state = State.Accept;
break;
case '2': case '5': case '8':
if (state == State.Accept)
state = State.Rem2;
else if (state == State.Rem1)
state = State.Accept;
else
state = State.Rem1;
break;
}
}
return s == State.Accept;
}
I wrote it in this style to make explicit that it represents a finite state automaton. It's essentially an obfuscated version of what you might actually write to test for divisibility by 3:
bool IsDivisibleBy3(char* num)
{
int remainder = 0;
for (; *num; ++num)
{
remainder = ((10 * remainder + *num - '0') % 3);
}
return remainder == 0;
}
This itself is just a slight modification of the code to read in a number and see if 3 divides it, with the important change that this one won't overflow on large input. Notice how close this code comes to implementing the rule that 3 divides a number if 3 divides the sum of its digits!
Since there exists an FSA that checks for divisibility by 3, there must be a regular expression that codes this FSA. To simplify the definition, we'll introduce some character classes:
Z = [0369], O = [147], T = [258]
These partition the digits into equivalence classes modulo 3 (and correspond to the different cases in the code - coincidence?). Then, a regular expression to test for divisibility by 3 is:
^(Z|(OZ*OZ*(TZ*O)*Z*O)|(TZ*TZ*(OZ*T)*Z*T)|(O(OZ*T)*Z*T)|(TZ*(TZ*O)Z*O))+$
If you're not into the whole brevity thing, that expands to:
^([0369] | ([147][0369]*[147][0369]*([258][0369]*[147])*[0369]*[147]) | ([258][0369]*[258][0369]*([147][0369]*[258])*[0369]*[258]) | ([147]([147][0369]*[258])*[0369]*[258]) | ([258][0369]*([258][0369]*[147])[0369]*[147]))+$
So now we have regular expressions for testing for divisibility by 2, 3, 4, 5, and 10. We could make one for 8 (it would be similar to the one for 4). Testing for divisibility by 6 is equivalent to testing for both 2 and 3, both of which we can do (but which it would not be easy to write a standard RE for, though I believe there are extensions that implement an And operator). Alternately, we could modify the one above to make sure that the last digit is even (ick - that one is left for the reader). What would a regular expression that tests for divisibility by 7 look like?
No comments:
Post a Comment