A rule is simply an if-then statement. It comprises two parts: an antecedent and a classification. A rule captures a data point, or sample, if that sample satisfies the rule's antecedent. For instance, a rule may be any of the following:
If age > 25 then predict favorite sport = tennis
If location = Eastern US then predict average temperature = 55 degrees C
If likes pie then predict chance of a heart attack before 50 = 30%
The antecedent in the last example above is "likes pie", while the classification is "chance of a heart attack before 50 = 30%". A rule is generated by simply looking at which classification results in the highest accuracy when paired with a given antecedent. For instance, if, in the third example above, 32.4% of the people in the training set who like pie had a heart attack before 50, then "chance of a heart attack before 50 = 30%" would be a better classification for that antecedent than "chance of a heart attack before 50 = 60%".
In the algorithms presented on this webpage, we use only rules with binary antecedents and binary classifications. For example,
If age > 25 then predict likes sports = true
If IQ > 180 AND is mathematician = true then predict Fields medal winner = true
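As a rough illustration of how a binary rule could be generated, the sketch below pairs an antecedent with whichever label is most common among the training samples it captures, since that label gives the highest accuracy on those samples. The function name best_classification, the dictionary-based samples, and the toy data are assumptions made for this example; they are not taken from the algorithms presented on this webpage.

```python
from typing import Callable, Dict, List, Optional, Tuple

Sample = Dict[str, object]


def best_classification(
    antecedent: Callable[[Sample], bool],
    samples: List[Sample],
    labels: List[bool],
) -> Optional[Tuple[bool, float]]:
    """Return (label, accuracy) for the label that is most accurate on the
    samples captured by the antecedent, or None if nothing is captured."""
    # Keep only the labels of samples that satisfy the antecedent.
    captured = [label for s, label in zip(samples, labels) if antecedent(s)]
    if not captured:
        return None
    positives = sum(captured)
    negatives = len(captured) - positives
    prediction = positives >= negatives  # ties go to True (arbitrary choice)
    accuracy = max(positives, negatives) / len(captured)
    return prediction, accuracy


# Hypothetical toy data: does each person like sports?
samples = [{"age": 30}, {"age": 40}, {"age": 26}, {"age": 20}, {"age": 22}]
labels = [True, True, False, False, True]

result = best_classification(lambda s: s["age"] > 25, samples, labels)
print(result)
```

On this toy data the antecedent age > 25 captures three samples, two of which like sports, so the sketch would produce the rule "If age > 25 then predict likes sports = true" with an accuracy of 2/3 on the captured samples.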
A rule list is simply a group of rules in a particular order, followed by a default rule. For instance,
A rule list can be used to classify samples. For example, if you were to evaluate an individual who is 50 years old, lives in the Western US, and wears glasses against this rule list, she would be predicted not to like sports. However, if she were 24 years old, the rule list would predict that she does like sports.
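The evaluation described above can be sketched in a few lines: the rules are tried in order, the first rule whose antecedent captures the sample supplies the prediction, and the default rule applies if no antecedent is satisfied. The rule list below is hypothetical, chosen only so that it agrees with the two predictions described above; the field names (age, location, wears_glasses) and the classify function are likewise assumptions for illustration.

```python
from typing import Callable, Dict, List, Tuple

Sample = Dict[str, object]
Rule = Tuple[Callable[[Sample], bool], bool]  # (antecedent, classification)


def classify(rule_list: List[Rule], default: bool, sample: Sample) -> bool:
    """Return the classification of the first rule that captures the sample,
    falling back to the default rule if no antecedent is satisfied."""
    for antecedent, classification in rule_list:
        if antecedent(sample):
            return classification
    return default


# Hypothetical rule list predicting "likes sports" (an assumed example,
# not the rule list from the original page).
rule_list: List[Rule] = [
    (lambda s: s["age"] < 25, True),
    (lambda s: s["location"] == "Western US" and s["wears_glasses"], False),
]
default_prediction = True  # default rule: predict likes sports = true

person = {"age": 50, "location": "Western US", "wears_glasses": True}
younger = {**person, "age": 24}

print(classify(rule_list, default_prediction, person))   # False
print(classify(rule_list, default_prediction, younger))  # True
```

Because the rules are checked in order, the 24-year-old is captured by the first rule and never reaches the second one, which is why the order of a rule list matters.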
The advantage of rule lists as predictive models is that they are human-interpretable, as opposed to black-box models such as neural nets. Proponents of such non-interpretable models often claim that they are necessary to achieve greater accuracy, but we have shown that it is possible for rule lists to have comparable or even greater accuracy.