CISC 7700X Final Exam

1. c
2. b
3. a
4. d
5. c
6. b
7. a
8. d
9. c
10. b
11. a

12. 0.7778 (7/9)
    Given: P(L)=0.5, P(S|L)=0.7, P(S|-L)=0.2
    P(L|S) = P(S|L)P(L) / P(S)
           = P(S|L)P(L) / ( P(S|L)P(L) + P(S|-L)P(-L) )
           = (0.7*0.5) / (0.7*0.5 + 0.2*0.5)
           = 0.7778

13. Not enough data: we don't know P(S,A|L) or P(S,A).
    Given: P(A|L)=0.6, P(A|-L)=0.15
    P(L|S,A) = P(S,A|L)P(L) / P(S,A)

14. 0.9333 (14/15)
    Given the same values as above, with the naive assumption P(S,A|L) = P(A|L)P(S|L):
    P(L|S,A) = P(S,A|L)P(L) / P(S,A)
             = P(A|L)P(S|L)P(L) / ( P(A|L)P(S|L)P(L) + P(A|-L)P(S|-L)P(-L) )
             = 0.6*0.7*0.5 / (0.6*0.7*0.5 + 0.15*0.2*0.5)
             = 0.9333
    Another way to solve it is to reuse the result of q12:
    P(L|S,A) = P(A|L)P(L|S) / P(A|S)
             = P(A|L)P(L|S) / ( P(A|L)P(L|S) + P(A|-L)P(-L|S) )
             = (0.6*0.7778) / (0.6*0.7778 + 0.15*(1-0.7778))
             = 0.9333

15. a
16. d
17. c

18. Suppose n=100; then storing P(x_1,...,x_n|c) would require a table with at least 2^100 entries. Similarly, if our model has 2^100 parameters, we would need far more than 2^100 training instances to fill in the probability estimates. Also, with a table that large, we would essentially be memorizing the input and recalling it at classification time, which would not generalize well.
    Naive Bayes turns P(x_1,...,x_n|c) into P(x_1|c)P(x_2|c)...P(x_n|c); if n=100, we would have 100 small tables instead.

19. a
20. d
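The arithmetic in q12 and q14 can be checked with a minimal sketch; the variable names below are illustrative, not part of the exam:

```python
# Check the Bayes arithmetic for questions 12 and 14.
# Given values from the exam:
p_l = 0.5          # P(L)
p_s_l = 0.7        # P(S|L)
p_s_not_l = 0.2    # P(S|-L)
p_a_l = 0.6        # P(A|L)
p_a_not_l = 0.15   # P(A|-L)

# Q12: P(L|S) via Bayes' rule, expanding P(S) by total probability.
p_l_given_s = (p_s_l * p_l) / (p_s_l * p_l + p_s_not_l * (1 - p_l))

# Q14: P(L|S,A) under the naive assumption P(S,A|L) = P(S|L)P(A|L).
num = p_a_l * p_s_l * p_l
p_l_given_sa = num / (num + p_a_not_l * p_s_not_l * (1 - p_l))

print(round(p_l_given_s, 4))   # 0.7778
print(round(p_l_given_sa, 4))  # 0.9333
```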