Programming Assignment
Question One
Closed patterns
An itemset is a closed pattern if none of its immediate supersets has the same support as the itemset (Han and Kamber 231). That is, if there is no item x outside an itemset A such that every transaction containing A also contains x, then A is a closed pattern.
Maximal patterns
These are frequent patterns that have no frequent super-pattern. That is, if an itemset is frequent and none of its immediate supersets is frequent, then the itemset is said to be maximal frequent (Han and Kamber 231).
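The two definitions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical four-transaction database and an absolute minimum support count of 2; neither comes from the assignment, and the names `is_closed` and `is_maximal` are chosen here for illustration only.

```python
from itertools import combinations

# Hypothetical toy database (NOT from the assignment), one set per transaction.
transactions = [
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
]
MIN_COUNT = 2  # assumed absolute minimum support count


def support_count(itemset):
    """Number of transactions that contain every item of itemset."""
    return sum(1 for t in transactions if itemset <= t)


items = sorted(set().union(*transactions))

# Brute-force enumeration of all frequent itemsets (fine for a toy example).
frequent = {}
for size in range(1, len(items) + 1):
    for combo in combinations(items, size):
        count = support_count(set(combo))
        if count >= MIN_COUNT:
            frequent[frozenset(combo)] = count


def immediate_supersets(itemset):
    """All itemsets obtained by adding exactly one more item."""
    return [itemset | {x} for x in items if x not in itemset]


def is_closed(itemset):
    # Closed: no immediate superset has the same support.
    return all(support_count(s) < frequent[itemset]
               for s in immediate_supersets(itemset))


def is_maximal(itemset):
    # Maximal: no immediate superset is frequent.
    return all(support_count(s) < MIN_COUNT
               for s in immediate_supersets(itemset))


closed = sorted("".join(sorted(s)) for s in frequent if is_closed(s))
maximal = sorted("".join(sorted(s)) for s in frequent if is_maximal(s))
print("closed: ", closed)   # ['a', 'ab', 'abc']
print("maximal:", maximal)  # ['abc']
```

In this toy data every maximal pattern is also closed, but {a} and {a, b} are closed without being maximal, which illustrates that the maximal patterns form a subset of the closed ones.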
Support
Support is the frequency with which a rule occurs in the transactions (Han and Kamber 230). That is, given the rule (implication expression) A → B, where A and B are disjoint itemsets, support is the fraction of the transactions that contain both A and B.
Confidence
Confidence is an estimate of the conditional probability of B given A (Han and Kamber 230). For the implication expression A → B, confidence measures how often the items in B appear in the transactions that contain A.
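Given the rule below:
{Bread, Honey} → Eggs
where σ(X) is the number of transactions containing itemset X and |T| is the total number of transactions,
the support is s = σ({Bread, Honey, Eggs}) / |T| = 2/5 = 0.4,
while the confidence is c = σ({Bread, Honey, Eggs}) / σ({Bread, Honey}) = 2/3 ≈ 0.67.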
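The arithmetic above can be reproduced with a short sketch. The transaction table is not shown in this section, so the five transactions below are hypothetical, chosen only so that the counts match the worked example (two transactions contain Bread, Honey and Eggs; three contain Bread and Honey). The function names are likewise illustrative.

```python
# Hypothetical transactions (NOT from the assignment); they are constructed
# so that sigma({Bread, Honey, Eggs}) = 2 and sigma({Bread, Honey}) = 3.
transactions = [
    {"Bread", "Honey", "Eggs"},
    {"Bread", "Honey", "Eggs", "Milk"},
    {"Bread", "Honey"},
    {"Milk", "Eggs"},
    {"Bread", "Milk"},
]


def sigma(itemset):
    """Support count: number of transactions containing the whole itemset."""
    return sum(1 for t in transactions if itemset <= t)


def support(antecedent, consequent):
    """Fraction of all transactions containing antecedent and consequent."""
    return sigma(antecedent | consequent) / len(transactions)


def confidence(antecedent, consequent):
    """Fraction of antecedent transactions that also contain the consequent."""
    return sigma(antecedent | consequent) / sigma(antecedent)


s = support({"Bread", "Honey"}, {"Eggs"})
c = confidence({"Bread", "Honey"}, {"Eggs"})
print(f"support = {s:.2f}, confidence = {c:.2f}")  # support = 0.40, confidence = 0.67
```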
Question Two
Frequent itemsets
Using absolute support counts, and counting an item that is duplicated within a transaction only once per TID, the database is scanned to generate the frequent 1-itemsets. There are five transactions, so the minimum support of 60% corresponds to a support count of 3 (3/5). Items with support counts of 1 or 2 are therefore eliminated. The database is then scanned a second time to find the frequent 2-itemsets. The five frequent items give ten candidate pairs; each pair is counted across the TIDs, and pairs with a support count below 3 are pruned. The database is then scanned again to produce the frequent 3-itemsets: the frequent pairs {k,e}, {o,k} and {o,e} join to form {o,k,e}. As a result, the frequent itemsets are: {e}, {k}, {y}, {o}, {m}, {k,e}, {m,k}, {o,e}, {k,y}, {o,k}, {o,k,e}.
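The level-wise procedure described above can be sketched as follows. The transaction table is not reproduced in this section, so the five transactions below are an assumption: they are the item sets spelled by the words monkey, donkey, make, mucky and cookie, which is consistent with the counts and itemsets reported here but should be checked against the actual assignment data. The function name `apriori` is illustrative.

```python
from itertools import combinations

# ASSUMPTION: the five transactions are not given in this section; these
# are assumed from the textbook exercise. Using sets counts the repeated
# "o" in "cookie" only once per TID.
transactions = [
    set("monkey"),
    set("donkey"),
    set("make"),
    set("mucky"),
    set("cookie"),
]
MIN_COUNT = 3  # 60% of 5 transactions


def apriori(transactions, min_count):
    """Return {itemset: support count} for all frequent itemsets."""

    def count(candidates):
        # One scan of the database per level.
        return {c: sum(1 for t in transactions if c <= t) for c in candidates}

    singletons = {frozenset([i]) for t in transactions for i in t}
    level = {c: n for c, n in count(singletons).items() if n >= min_count}
    frequent = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = {c: n for c, n in count(candidates).items() if n >= min_count}
        frequent.update(level)
        k += 1
    return frequent


frequent = apriori(transactions, MIN_COUNT)
for itemset, n in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print("".join(sorted(itemset)), n)
```

Under the assumed data, the sketch yields the same eleven frequent itemsets listed above, with {k,e} at a support count of 4 and {o,k,e} at 3.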
Strong association rules
The largest frequent itemset is {o,k,e}. Taking two of its items as the antecedent and the remaining item as the consequent gives 3!/(1!2!) = 3 possible association rules.
Association rules from {o,k,e}:
Rule 1: {o,k} → e
Confidence = support of {o, k, e} / support of {o,k} = 3 / 3 = 100%
Hence, Rule 1 is a strong association rule.
Rule 2: {o,e} → k
Confidence = support of {o,k,e} / support of {o,e} = 3 / 3 = 100%
Hence, Rule 2 is a strong association rule.
Rule 3: {k,e} → o
Confidence = support of {o,k,e} / support of {k,e} = 3 / 4 = 75%
Hence, Rule 3 is a strong association rule provided the minimum confidence threshold does not exceed 75%.
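The three confidence calculations above can be reproduced from the support counts quoted in this section. The sketch below uses only those counts; the `strong_rules` helper and its `min_conf` parameter are hypothetical, since the section does not state the minimum confidence threshold.

```python
# Support counts as quoted in the calculations above.
support_count = {
    frozenset("oke"): 3,
    frozenset("ok"): 3,
    frozenset("oe"): 3,
    frozenset("ke"): 4,
}

itemset = frozenset("oke")

# One rule per choice of single-item consequent; the other two items
# form the antecedent.
confidences = {}
for consequent in sorted(itemset):
    antecedent = itemset - {consequent}
    rule = "".join(sorted(antecedent)) + "->" + consequent
    confidences[rule] = support_count[itemset] / support_count[antecedent]


def strong_rules(min_conf):
    """Rules whose confidence meets a given threshold (threshold is assumed)."""
    return sorted(r for r, c in confidences.items() if c >= min_conf)


for rule, conf in confidences.items():
    print(f"{rule}: {conf:.0%}")
# ko->e: 100%
# eo->k: 100%
# ek->o: 75%
```

Calling `strong_rules` with different thresholds shows how the verdict on Rule 3 depends on the chosen minimum confidence: it is returned for a threshold of 0.75 but not for 0.80, while Rules 1 and 2 pass at any threshold.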
Works Cited
Han, Jiawei, and Micheline Kamber. Data Mining: Concepts and Techniques. Amsterdam: Elsevier, 2006. Internet resource.