This document discusses algorithms for converting between Roman numerals and Hindu-Arabic numerals. It analyzes the differences between the two systems and develops an analytical method for checking the validity of Roman numerals and performing the conversions. The method involves checking for invalid characters, repetitions of numerals, position restrictions, and invalid subtractive sequences. The document also compares the analytical method to more intuitive and computer-based approaches. An appendix includes sample C code to implement the analytical conversion algorithm.
Conversion of Roman Numbers to Hindu-Arabic Numbers
1. Conversion of Roman Numbers to Hindu-Arabic
Sualeh Fatehi
1993
Although it may seem easy to learn how to form Roman numbers from Hindu-Arabic, and vice versa, the
analysis is quite involved. An analysis of the inter-conversions would lead to the development of a
method or methods of performing these conversions easily. Based on these algorithms, computer
programs could be written to interconvert Roman and Hindu-Arabic numbers, while at the same time
checking that they are not performing the task on incorrectly formed Roman numbers. As the following
discussion will show, checking for wrongly formed Roman numbers is more difficult than the actual
conversions.
Roman numbers are formed differently from Hindu-Arabic numbers, for although they are based on the
radix 10, only two symbols are available for values below ten, namely I and V. In addition to these
there are other symbols for numbers ten and greater: X, L, C, D and M, in that order.
Therefore, the values of Roman numerals in order are:
Numeral   Value      Numeral   Value
I             1      V             5
X            10      L            50
C           100      D           500
M          1000
Table 1
Those on the left may be termed as belonging to the ones group, and those on the right to the fives
group.
Other numbers are formed by additive or subtractive sequences of these numerals: VI = 6 (additive) and
IV = 4 (subtractive). Also, there is no zero, even to be used as a placeholder.
This is in contrast to Hindu-Arabic numbers, which have ten digits (being based on 10, and having one
digit for every number below ten, with zero being used as a placeholder or as a null value). The value of
the digit depends on its position in the number: 9 followed by two other digits indicates a magnitude of
900, but if followed by three digits indicates 9000. Therefore, the Hindu-Arabic system of ciphering is a
positional one consisting solely of additive sequences.
Because of the positional property of the Hindu-Arabic system, conversion of Hindu-Arabic to Roman is
extremely simple, and can be done simply by looking up a table (Table 2) for each digit in a particular
position, and collating the sequence.
Digit   Thousands   Hundreds   Tens   Units   Sequence
0       -           -          -      -       -
1       M           C          X      I       additive
2       MM          CC         XX     II      additive
3       MMM         CCC        XXX    III     additive
4       -           CD         XL     IV      subtractive
5       -           D          L      V       additive
6       -           DC         LX     VI      additive
7       -           DCC        LXX    VII     additive
8       -           DCCC       LXXX   VIII    additive
9       -           CM         XC     IX      subtractive
Table 2
Thus, 1964 = 1/9/6/4 = M/CM/LX/IV = MCMLXIV, and 1001 = 1/0/0/1 = M///I = MI.
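The table lookup can be sketched in C as follows. This is only an illustration of the idea, not the program given in the appendices: the names TABLE2 and to_roman are hypothetical, the dashes of Table 2 are stored as empty strings, and the range is assumed to be 1 to 3999.

```c
#include <string.h>

/* Table 2 as a lookup: one row per decimal place (thousands first),
   one column per digit 0..9. Empty strings stand for the dashes. */
static const char *const TABLE2[4][10] = {
    {"", "M", "MM", "MMM", "",   "",  "",   "",    "",     ""},
    {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"},
    {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"},
    {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"}
};

/* Writes the Roman form of n into out, which must hold at least
   16 bytes (the longest result, MMMDCCCLXXXVIII, has 15 numerals).
   Returns 1 on success, 0 if n is outside 1..3999. */
int to_roman(int n, char *out)
{
    int place = 1000;
    int row;

    out[0] = '\0';
    if (n < 1 || n > 3999)
        return 0;
    for (row = 0; row < 4; row++) {
        strcat(out, TABLE2[row][n / place]);  /* look up this digit */
        n %= place;                           /* drop the digit just used */
        place /= 10;
    }
    return 1;
}
```

Calling to_roman(1964, buf) collates M, CM, LX and IV into MCMLXIV, exactly as in the worked example above.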
The reverse process, though, is not so easy, because we do not know in advance where to place the
breaks. So, we use another technique, and that is of remembering the value of the previous Roman
numeral read. If the current value (the value of the digit being read, while the digits are read from right to
left) is greater than or equal to the previous one read, add it, or else subtract it.
For example (the marks compare each numeral with the one to its right):
M > C < M > L > X > I < V
MCMLXIV = 5 - 1 + 10 + 50 + 1000 - 100 + 1000 = 1964
Using this technique, the reverse conversion from Roman to Hindu-Arabic becomes equally easy.
If a computer program is to be written to convert Roman numbers to Hindu-Arabic, it should be able to
check whether the input is invalid. The above method of conversion has the drawback that it will
translate IIV to 5 and IM to 999, both of which are wrong: it cannot identify invalid Roman numbers.
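A minimal C sketch of this right-to-left technique is given below, to make the drawback concrete. The function names numeral_value and naive_from_roman are hypothetical, and returning 0 for an unrecognised character is an assumption of the sketch.

```c
#include <string.h>

/* Value of a single Roman numeral, or 0 if c is not one. */
static int numeral_value(char c)
{
    switch (c) {
    case 'I': return 1;
    case 'V': return 5;
    case 'X': return 10;
    case 'L': return 50;
    case 'C': return 100;
    case 'D': return 500;
    case 'M': return 1000;
    default:  return 0;
    }
}

/* Reads the numerals from right to left; a value at least as large
   as the one read just before it is added, a smaller one is
   subtracted. Apart from rejecting unknown characters (result 0),
   no validity checking is done. */
int naive_from_roman(const char *roman)
{
    int total = 0, previous = 0;
    int i;

    for (i = (int)strlen(roman) - 1; i >= 0; i--) {
        int current = numeral_value(roman[i]);
        if (current == 0)
            return 0;
        if (current >= previous)
            total += current;
        else
            total -= current;
        previous = current;
    }
    return total;
}
```

With this sketch, naive_from_roman("MCMLXIV") gives 1964, but naive_from_roman("IIV") gives 5 and naive_from_roman("IM") gives 999, which is the behaviour criticised above.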
We may therefore make an observation that to form a valid Roman number, there are restrictions on
both the position of a digit and the number of times it may appear consecutively. (These restrictions do
not apply to the Hindu-Arabic system, in which if all the characters making up a number are digits, the
number is valid.)
We now proceed to examine all cases of unacceptable Roman numbers inductively.
The first case, of course, is that of a character appearing in a Roman number that is not an acceptable
Roman digit (for example CMA). This is trivial, and easily disposed of.
Secondly, we take up the criterion of repetition. In an additive sequence, no digit may appear more than
thrice, if it is of the ones group, or more than once if it belongs to the fives group: for instance VIIII and
VVI are invalid. Also, in a subtractive sequence, no member of the ones group may appear more than
once, and a fives group member may not appear at all. (This condition would render subtractive
sequences like IIX and IXX invalid.)
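The repetition criterion might be checked along the following lines. This is a sketch under stated assumptions rather than the method of the appendix: a run of equal numerals is treated as part of a subtractive sequence when it is immediately followed by a larger numeral or immediately preceded by a smaller one, and the names numeral_value, is_fives and repetition_ok are hypothetical.

```c
#include <string.h>

/* Value of a single Roman numeral, or 0 if c is not one. */
static int numeral_value(char c)
{
    switch (c) {
    case 'I': return 1;    case 'V': return 5;
    case 'X': return 10;   case 'L': return 50;
    case 'C': return 100;  case 'D': return 500;
    case 'M': return 1000; default:  return 0;
    }
}

/* 1 if the value belongs to the fives group (V, L, D). */
static int is_fives(int v)
{
    return v == 5 || v == 50 || v == 500;
}

/* Returns 1 if no numeral is repeated more often than allowed. */
int repetition_ok(const char *roman)
{
    size_t n = strlen(roman);
    size_t i = 0;

    while (i < n) {
        int v = numeral_value(roman[i]);
        size_t start = i;
        int limit;

        if (v == 0)
            return 0;                      /* not a Roman numeral */
        while (i < n && roman[i] == roman[start])
            i++;                           /* skip to the end of the run */

        limit = is_fives(v) ? 1 : 3;       /* additive limits */
        if (i < n && numeral_value(roman[i]) > v)
            limit = is_fives(v) ? 0 : 1;   /* run is being subtracted */
        else if (start > 0 && numeral_value(roman[start - 1]) < v)
            limit = 1;                     /* run follows a subtracted numeral */
        if (i - start > (size_t)limit)
            return 0;
    }
    return 1;
}
```

Under these assumptions VIIII, VVI, IIX and IXX are all rejected, while VIII and MCMLXIV pass.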
Thirdly, we take up position, by consideration of value. In additive sequences, two Roman numerals of
any value may be juxtaposed (MI = 1001). However, in subtractive sequences only the following may
immediately precede the digits given in Table 3 below.
Numeral   Preceded by      Numeral   Preceded by
I         -                V         I
X         I                L         X
C         X                D         C
M         C
Table 3 (This may be observed from Table 2.)
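Table 3 is small enough to be written directly as a function. The sketch below is illustrative only; the names subtractive_prefix and subtractive_pair_ok are hypothetical.

```c
/* For a numeral c, returns the only numeral that may immediately
   precede it in a subtractive sequence (Table 3), or '\0' if there
   is none. */
char subtractive_prefix(char c)
{
    switch (c) {
    case 'V': case 'X': return 'I';
    case 'L': case 'C': return 'X';
    case 'D': case 'M': return 'C';
    default:            return '\0';   /* I, or not a numeral at all */
    }
}

/* Returns 1 if the two numerals, in this order, form an allowed
   subtractive pair such as IV or CM. */
int subtractive_pair_ok(char small, char large)
{
    char allowed = subtractive_prefix(large);
    return allowed != '\0' && allowed == small;
}
```

For instance, subtractive_pair_ok('C', 'M') holds, whereas subtractive_pair_ok('X', 'M') and subtractive_pair_ok('V', 'X') do not.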
Fourthly, we come to general subtractive sequences. Subtractive sequences can only be two symbols
long (Table 2). Hence by examining every possible set of three consecutive numerals in a Roman
number, we can identify invalid numbers. This examination has to be done by value, and also by group.
The first column in Table 4 below, which is marked `Group', shows what follows in the same row. If a
row is marked as 151, the first and third numerals of a triplet found in this row are of the ones group,
and the second of the fives group. In the same table, the heading of each column describes how the
values of the three numerals move. The heading of column 3, for instance, indicates three numerals
rising in value and then falling, the third being the smallest of the three. Other headings may be
similarly understood.
In this table also only the smallest possible triplet of numerals for each type is given, because for the
other triplets of the same type, the conclusions drawn will hold good if the triplet is not invalidated by
the repetition or position criteria. Taking XCI, we find other triplets of the same type: XMI, CMX, CMI.
Here we find that XMI has already been invalidated by the above criterion of position, while for CMX
and CMI we may draw the same conclusions as for XCI: that they are valid. The two parts of Table 4
taken together cover all possible types of triplets, and are the basis for forming general rules of what is
possible and not possible.
Group   1     2     3     4     5     6     7     8     9     10    11    12
111     IXC   CXI   XCI   ICX   XIC   CIX   IXI   XIX   IXX   XXI   XII   IIX
115     IXL   CXV   XCV   IXV   XIL   XIV   -     -     -     XXV   -     IIV
151     IVX   XVI   XLI   ILX   XVC   CVX   IVI   XVX   -     -     -     -
511     VXC   LXI   VXI   VCX   VIX   LIX   -     -     VXX   -     VII   -
551     VLC   LVI   VLI   VLX   LVC   LVX   -     -     -     VVI   -     VVX
515     VXL   LXV   LCV   VCL   VIL   LIV   VXV   VIV   -     -     -     -
155     IVL   CLV   XLV   ILV   XVL   CVL   -     -     IVV   -     XVV   -
555     VLD   DLV   LDV   VDL   LVD   DVL   VLV   LVL   VLL   LLV   LVV   VVL
Column headings (movement of the values from the first numeral to the third):
1 rising, rising; 2 falling, falling; 3 rising, falling, third smallest; 4 rising, falling, first smallest;
5 falling, rising, third largest; 6 falling, rising, first largest; 7 rising, falling back to the first;
8 falling, rising back to the first; 9 rising, level; 10 level, falling; 11 falling, level; 12 level, rising.
(In the original table, invalid triplets are shown in red and valid ones in green.)
Table 4
An alternative way of forming Roman numbers is writing IIII for 4 and VIIII for 9, and so on. This
convention of representing Roman numbers is not considered in this discussion, because it renders a
detailed analysis unnecessary. The conditions for checking validity are greatly reduced. As long as a
member of the ones group appears not more than four times, and a member of the fives group not
more than once, the Roman number is valid if all the numerals appear in descending order (this being so
because subtractive sequences are eliminated). Conversion is then done by simply adding up all the
values represented by the numerals making up the number. The other difference between the former
method, and this alternative method is the magnitude of the largest number that can be represented. If
4 is written as IIII, then by the same token, 4000 is written as MMMM, and so the largest number that it
is possible to represent is 4999. As against this, using the former convention, the largest number that
can be represented is 3999.
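Under this alternative convention the whole check and conversion fits in one short loop. The following C sketch assumes exactly the conditions stated above (descending order, at most four of a ones group numeral, at most one of a fives group numeral); the names numeral_value and from_additive_roman are hypothetical.

```c
/* Value of a single Roman numeral, or 0 if c is not one. */
static int numeral_value(char c)
{
    switch (c) {
    case 'I': return 1;    case 'V': return 5;
    case 'X': return 10;   case 'L': return 50;
    case 'C': return 100;  case 'D': return 500;
    case 'M': return 1000; default:  return 0;
    }
}

/* Converts a Roman number written in the purely additive convention
   (IIII for 4, VIIII for 9). Returns 0 for invalid input. */
int from_additive_roman(const char *roman)
{
    int total = 0, previous = 0, run = 0;
    int i;

    for (i = 0; roman[i] != '\0'; i++) {
        int current = numeral_value(roman[i]);
        int fives = (current == 5 || current == 50 || current == 500);

        if (current == 0)
            return 0;                  /* invalid character */
        if (previous != 0 && current > previous)
            return 0;                  /* not in descending order */
        run = (current == previous) ? run + 1 : 1;
        if (run > (fives ? 1 : 4))
            return 0;                  /* repeated too often */
        total += current;              /* conversion is plain addition */
        previous = current;
    }
    return total;
}
```

The largest number, MMMMDCCCCLXXXXVIIII, converts to 4999, while a subtractive form such as IV is rejected because it is not in descending order.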
We now proceed to develop an analytical procedure for performing the conversion:
1. If there are any invalid characters, the Roman number is invalid.
2.
a. If a member of the ones group appears more than thrice consecutively, the number is
invalid.
b. If a member of the fives group appears more than once consecutively, the number is
invalid.
3. The numerals listed against each numeral in Table 5 cannot appear anywhere before it.
Numeral   Barred before it      Numeral   Barred before it
I         -                     V         -
X         V                     L         I, V
C         I, V, L               D         I, V, X, L
M         I, V, X, L, D
Table 5
Table 5 may be succinctly expressed by the rule that if a number is preceded by another that is
smaller, the smaller value may only be either one-tenth or one-fifth of the larger.
4. Examining every possible consecutive set of three numerals, (in MCMLXIV: MCM, CML, MLX, LXI
and XIV are examined), a set is invalid if:
a. the last is greater than the first (this checks for the combinations appearing in columns
1, 4, 5, 9 and 12 of Table 4, all of which are invalid)
b. the first and last digits of the triplet are the same, and different from the second; but if all
three belong to the ones group, and the second is less than the first, the triplet is valid
(this condition is developed from columns 7 and 8 of Table 4)
A single invalid triplet makes the whole Roman number invalid.
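Conditions 3 and 4 can be sketched in C by working with the position of each numeral in the string "IVXLCDM", so that even positions are the ones group and odd positions the fives group. This mirrors the idea used in Appendix 1, but the functions below (numeral_index, position_ok, triplets_ok) are hypothetical names introduced only for this illustration.

```c
#include <string.h>

/* Position of c in "IVXLCDM" (even = ones group, odd = fives group),
   or -1 if c is not a Roman numeral. */
static int numeral_index(char c)
{
    const char *order = "IVXLCDM";
    const char *p = (c != '\0') ? strchr(order, c) : NULL;
    return p ? (int)(p - order) : -1;
}

/* Condition 3: a smaller numeral anywhere before a larger one must be
   one-tenth or one-fifth of it, that is, a ones group numeral at most
   two positions below the larger one. */
int position_ok(const char *roman)
{
    size_t n = strlen(roman), i, j;

    for (i = 0; i < n; i++)
        if (numeral_index(roman[i]) < 0)
            return 0;                        /* invalid character */
    for (j = 1; j < n; j++)
        for (i = 0; i < j; i++) {
            int a = numeral_index(roman[i]);
            int b = numeral_index(roman[j]);
            if (a < b && (a % 2 != 0 || a < b - 2))
                return 0;
        }
    return 1;
}

/* Condition 4: examine every set of three consecutive numerals. */
int triplets_ok(const char *roman)
{
    size_t n = strlen(roman), i;

    for (i = 0; i + 2 < n; i++) {
        int a = numeral_index(roman[i]);
        int b = numeral_index(roman[i + 1]);
        int c = numeral_index(roman[i + 2]);

        if (a < 0 || b < 0 || c < 0)
            return 0;                        /* invalid character */
        if (c > a)
            return 0;                        /* 4(a): last exceeds first */
        if (a == c && a != b &&
            !(b < a && a % 2 == 0 && b % 2 == 0))
            return 0;                        /* 4(b), with its exception */
    }
    return 1;
}
```

With these sketches IM fails position_ok, IXI and IXC fail triplets_ok, and XIX and MCMLXIV pass both.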
Conditions 3 and 4(a) both check for a wide variety of invalid sequences, but the checks in condition 4
are easier to perform. In the above procedure, if conditions 3 and 4 are interchanged, an invalid Roman
number will be detected faster, with less effort. However, a number can be known to be valid only after
all the checks have been performed.
Although it is possible to write a computer program to check analytically for invalid input, and then
convert, this is certainly an inefficient way of solving the problem. Taking advantage of the computer as
a calculating machine, and the simplicity of back and forth conversions, the easiest computer solution to
the problem of converting Roman numbers to Hindu-Arabic, and checking for invalid input would be:
1. Convert the Roman number to its equivalent Hindu-Arabic using the method of remembering
the previous digit.
2. Convert back from Hindu-Arabic to Roman.
3. Check the original Roman number against the converted one obtained from step 2. If they are
the same, the Roman number is valid, so output the Hindu-Arabic equivalent. If different, the
input is to be rejected.
Finally, it is unlikely that the human mind works on conversions from Roman to Hindu-Arabic using
an analytical method. It is equally unlikely that it uses the method that a computer may be
programmed to use. It seems to work by inserting breaks in the Roman number, analysing for
absurdities, changing the positions of the breaks, and reanalysing: repeatedly, until either a
solution is found, or all possible positions of breaks are exhausted.
For example, 1961 (MCMLXI) may be analysed as follows:
MC/M/L/XI absurd
M/CM/L/XI not correct
M/CM/LX/I =1961
(It may be parenthetically noted that while the first system of breaks is definitely wrong, the second may
be acceptable to a person not rigidly Hindu-Arabic.)
Another example: VIV
V/IV absurd
VI/V also absurd
All combinations of breaks having been explored, the number is declared to be nonsensical.
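A mechanical stand-in for this trial of breaks can be sketched as a small backtracking search. It is not a model of how the mind works; it merely assumes that a sensible piece between two breaks is one of the entries of Table 2, taken in the order thousands, hundreds, tens, units. The names TABLE2, try_breaks and from_roman_by_breaks are hypothetical.

```c
#include <string.h>

/* Table 2: one row per decimal place (thousands first), one column
   per digit 0..9. Empty strings stand for the dashes. */
static const char *const TABLE2[4][10] = {
    {"", "M", "MM", "MMM", "",   "",  "",   "",    "",     ""},
    {"", "C", "CC", "CCC", "CD", "D", "DC", "DCC", "DCCC", "CM"},
    {"", "X", "XX", "XXX", "XL", "L", "LX", "LXX", "LXXX", "XC"},
    {"", "I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX"}
};

/* Tries every way of breaking s into one piece per decimal place,
   starting at the given row. Returns the value of the first set of
   breaks that uses up the whole of s, or -1 if none does. */
static int try_breaks(const char *s, int row, int place)
{
    int digit;

    if (row == 4)
        return (*s == '\0') ? 0 : -1;    /* leftovers mean absurdity */
    for (digit = 9; digit >= 0; digit--) {
        const char *piece = TABLE2[row][digit];
        size_t len = strlen(piece);
        int rest;

        if (digit > 0 && len == 0)
            continue;                    /* a dash in Table 2 */
        if (strncmp(s, piece, len) != 0)
            continue;                    /* this piece does not fit here */
        rest = try_breaks(s + len, row + 1, place / 10);
        if (rest >= 0)
            return digit * place + rest; /* these breaks make sense */
    }
    return -1;                           /* all break positions exhausted */
}

/* Returns the Hindu-Arabic value, or 0 if no set of breaks works. */
int from_roman_by_breaks(const char *roman)
{
    int value = try_breaks(roman, 0, 1000);
    return value > 0 ? value : 0;
}
```

For MCMLXI the search settles on M/CM/LX/I and returns 1961; for VIV every placement of breaks leaves something over, so 0 is returned.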
Thus, having seen three different methods of conversion of Roman numbers to Hindu-Arabic decimal,
that is an analytical method, a method suited to computers, and one that is intuitive, we realise that
each is good for a particular situation. A computer program that follows the analytical method will take
more time than one that converts by the second method. On the other hand, if one wants to manually
convert a Roman number, and be absolutely sure of not making a mistake, one will not rely on intuition,
and probably opt for the analytical algorithm.
Appendix 1
/* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Converts Roman numbers to Hindu-Arabic
   by an analytical method.

   Program by Sualeh Fatehi.
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */

/* Converts Roman numbers to Hindu-Arabic decimal equivalents,
 * by an analytical method, taking care of invalid input,
 * for which the equivalent is 0
 */

/* specification of include files */
#include <stdio.h>
#include <string.h>
/* end specification of include files */

/* definition of global constants */
#define TRUE 1
#define FALSE 0
#define MAXLEN 20
#define MAXNUMERALS 7
/* end definition of global constants */

/* function prototypes */
int main (void);
int convert (const char *);
/* end function prototypes */


int main (void)
{

  /* declaration of automatic variables */
  char roman[MAXLEN] = "";
  /* end declaration of automatic variables */

  /* get the Roman number (fgets is used in place of the unsafe
     gets), and strip the trailing newline */
  if (fgets (roman, sizeof roman, stdin) == NULL) return 1;
  roman[strcspn (roman, "\n")] = '\0';

  /* convert it to Hindu-Arabic, and print the result */
  printf ("%d\n", convert (roman));
  return 0;

} /* end main */


int convert (const char *roman)
  /* converts a Roman number to its Hindu-Arabic equivalent,
   * using an analytical method, which checks for invalid input:
   * the converted value is returned, or 0 if the string was
   * an invalid Roman number
   */
{

  /* declaration of automatic variables */
  static const struct {
    char romanchar;
    int value;
  } numerals[MAXNUMERALS] = {{'I', 1}, {'V', 5}, {'X', 10},
                             {'L', 50}, {'C', 100}, {'D', 500},
                             {'M', 1000}};
  int offset[MAXLEN];   /* offset of each numeral in 'numerals' */
  int hindu_arabic = 0;
  int maxrepeat = 3, repeatcnt = 0;
  int length, loop, aloop;
  int current = 0, previous = 0;
  int found;
  /* end declaration of automatic variables */

  /* find the length of the roman number: no valid Roman number
     is as long as MAXLEN characters */
  if (strlen (roman) >= MAXLEN) return 0;
  length = (int) strlen (roman);

  /* examine a character at a time, and exit if the
     character is not a Roman numeral: also store the
     offsets of the numeral, for future comparison */
  for (aloop = 0; aloop < length; aloop++) {

    /* examine the character for a valid Roman numeral, and find
       which one it is: only uppercase I, V, X, L, C, D and M are
       valid */
    found = FALSE;
    for (loop = 0; loop < MAXNUMERALS; loop++) {
      if (numerals[loop].romanchar == roman[aloop]) {
        found = TRUE;
        offset[aloop] = loop;
        break;
      } /* end if */
    } /* end for */
    if (!found) return 0;

  } /* end for */

  /* begin the analytical examination, reading from right to left */
  for (aloop = length - 1; aloop >= 0; aloop--) {

    /* define current and previous values */
    current = numerals[offset[aloop]].value;
    previous = ((aloop + 1) <= (length - 1)) ?
      numerals[offset[aloop + 1]].value : 0;

    /* check if the numeral has been repeated more often than
       allowed ('maxrepeat') by keeping a count of the number
       of times it appears ('repeatcnt'): even offsets are the
       ones group, odd offsets the fives group */
    if (current != previous) {
      repeatcnt = 0;
      if (offset[aloop] % 2 == 0)
        maxrepeat = 3;
      else
        maxrepeat = 1;
    } /* end if */
    if (current == previous)
      repeatcnt++;
    if (current < previous)
      maxrepeat = 1;
    if (repeatcnt >= maxrepeat)
      return 0;

    /* checking each triplet of numerals for invalid sets */
    if (aloop > 1) {
      /* check if a smaller numeral appears two places before
         a larger one */
      if (numerals[offset[aloop - 2]].value < current)
        return 0;
      /* if the first and third are equal, and the second
         different, the triplet (and Roman number) is
         invalid: except in the case where all belong to the
         ones group (I, X, C, M), and the second is less
         than the first */
      if (offset[aloop] == offset[aloop - 2]) {
        if ((offset[aloop] != offset[aloop - 1]) &&
            !((offset[aloop] > offset[aloop - 1])
              && offset[aloop] % 2 == 0
              && offset[aloop - 1] % 2 == 0))
          return 0;
      } /* end if */
    } /* end if */

    /* check if a numeral appears before one that it is not
       allowed to appear before */
    for (loop = 0; loop < aloop; loop++) {
      if (offset[loop] < offset[aloop]) {
        if ((offset[loop] % 2) != 0)
          return 0;
        if (offset[loop] < offset[aloop] - 2)
          return 0;
      } /* end if */
    } /* end for */

    /* add value to the accumulator, or subtract, depending on the
       previous value read */
    if (current < previous)
      hindu_arabic -= current;
    else
      hindu_arabic += current;

  } /* end for */

  /* return the converted value */
  return hindu_arabic;

} /* end convert */

/* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */
Appendix 2
/* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   Converts Roman numbers to Hindu-Arabic
   by converting to and back.

   Program by Sualeh Fatehi.
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */

/* Converts Roman numbers to Hindu-Arabic decimal equivalents,
 * by converting first using the method of remembering the
 * previous value, and converting back to Roman, and checking
 * against the original: this takes care of invalid input, for
 * which the equivalent is 0
 */

/* specification of include files */
#include <stdio.h>
#include <string.h>
/* end specification of include files */

/* definition of global constants */
#define MAXLEN 20
#define MAXNUMERALS 7
/* end definition of global constants */

/* function prototypes */
int main (void);
int convert (const char *);
/* end function prototypes */


int main (void)
{

  /* declaration of automatic variables */
  char roman[MAXLEN] = "";
  /* end declaration of automatic variables */

  /* get the Roman number (fgets is used in place of the unsafe
     gets), and strip the trailing newline */
  if (fgets (roman, sizeof roman, stdin) == NULL) return 1;
  roman[strcspn (roman, "\n")] = '\0';

  /* convert it to Hindu-Arabic, and print the result */
  printf ("%d\n", convert (roman));
  return 0;

} /* end main */


int convert (const char *roman)
  /* converts Roman numbers to Hindu-Arabic decimal equivalents,
   * by converting first using the method of remembering the
   * previous value, and converting back to Roman, and checking
   * against the original: this takes care of invalid input, for
   * which the equivalent is 0
   */
{

  /* declaration of automatic variables */
  static const struct {
    char romanchar;
    int value;
  } numerals[MAXNUMERALS] = {{'I', 1}, {'V', 5}, {'X', 10},
                             {'L', 50}, {'C', 100}, {'D', 500},
                             {'M', 1000}};
  int hindu_arabic = 0, copy = 0;
  int length, current = 0, previous = 0;
  char newroman[MAXLEN] = "";
  int size, digit, loop, place = 1000;
  /* end declaration of automatic variables */

  /* converting the Roman number to Hindu-Arabic: although
   * checking for invalid Roman numerals is done, there is no
   * checking for invalid sequences: 0 is returned on
   * encountering an invalid Roman numeral
   */

  /* no valid Roman number is as long as MAXLEN characters */
  if (strlen (roman) >= MAXLEN) return 0;

  /* examine a character at a time, from right to left */
  for (length = (int) strlen (roman) - 1; length >= 0; length--) {

    /* examine the character for a valid Roman numeral, and find
       which one it is: only uppercase I, V, X, L, C, D and M are
       valid */
    current = 0;
    for (loop = 0; loop < MAXNUMERALS; loop++)
      if (numerals[loop].romanchar == roman[length]) {
        current = numerals[loop].value;
        break;
      } /* end if .. for */
    /* return if an invalid digit is found */
    if (current == 0) return 0;

    /* add value to the accumulator, or subtract, depending on the
       previous value read */
    if (current < previous)
      hindu_arabic -= current;
    else
      hindu_arabic += current;

    /* store the previous values, and repeat the loop */
    previous = current;

  } /* end for */


  /* converting from Hindu-Arabic to Roman: each digit is
   * isolated, and the equivalent found, which is then
   * concatenated to the Roman number; it returns 0 if the
   * Hindu-Arabic number is out of range (1 to 3999)
   */

  /* check if the number is within the range */
  if (hindu_arabic < 1 || hindu_arabic > 3999) return 0;

  /* make a copy, which will be broken up for its digits */
  copy = hindu_arabic;

  length = 0;

  for (size = 4; size > 0; size--) {

    /* taking a digit at a time, starting with the most
       significant one */
    digit = copy / place;
    copy %= place;
    place /= 10;

    /* converting each digit into its equivalent Roman form, and
       concatenating: numerals[2*size-2] is the ones group numeral
       for this place, numerals[2*size-1] the fives group numeral
       (the range check above keeps the thousands digit below 4) */
    switch (digit) {
      case 0:
        break;
      case 9:
        newroman[length++] = numerals[2*size-2].romanchar;
        newroman[length++] = numerals[2*size].romanchar;
        break;
      case 4:
        newroman[length++] = numerals[2*size-2].romanchar;
        digit++;
        /* fall through */
      default:
        if (digit >= 5) {
          newroman[length++] = numerals[2*size-1].romanchar;
          digit -= 5;
        } /* end if */
        for (loop = 0; loop < digit; loop++)
          newroman[length++] = numerals[2*size-2].romanchar;
        break;
    } /* end switch */
  } /* end for */
  newroman[length] = '\0';

  /* if the converted form is the same as the Roman number
     input, return the Hindu-Arabic equivalent, else the input is
     invalid: signal this by returning 0 */
  if (strcmp (roman, newroman) == 0)
    return hindu_arabic;
  else
    return 0;

} /* end convert */

/* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */