CISC 7700X Final Exam

1. b
2. 0; (25 + 25 -50)/3
3. 0.7812;  exp(log(1+0.25)+log(1+0.25)+log(1-0.50))
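
Answers 2 and 3 can be checked numerically. The following is a minimal sketch, assuming the three period returns are +25%, +25%, and -50% as the work shown implies; the variable names are illustrative only.

```python
import math

# Period returns implied by the work for answers 2 and 3 (assumed).
returns = [0.25, 0.25, -0.50]

# Answer 2: arithmetic mean of the returns.
mean_return = sum(returns) / len(returns)

# Answer 3: compounded growth factor, summing log-returns as in the key.
growth = math.exp(sum(math.log(1 + r) for r in returns))

print(mean_return)
print(growth)
```

The mean return is zero, yet the compounded value falls to roughly 0.78 of the starting amount, which is why the two answers differ.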
4. b
5. b
6. b
7. b; about 10: sqrt(2*7^2) = sqrt(98) ~ 9.9
8. b; if we find a random widget, there is a 50% chance that its serial number, 959569, is within the interquartile range of all serial numbers.
9. c
10. d
11. e; invalid; p(x,y)=p(x|y)p(y)=p(y|x)p(x)
12. d
13. b
14. a
15. c
16. 0.15
  % P(fraud|amnt) = P(amnt|fraud)P(fraud) / ( P(amnt|fraud)P(fraud) + P(amnt|-fraud)P(-fraud) ) 
  %                 (0.9 * 0.001) / (0.9*0.001 + 0.005*0.999) = 0.1526717557251908
17. 0.038
  % P(fraud|st) = P(st|fraud)P(fraud) / ( P(st|fraud)P(fraud) + P(st|-fraud)P(-fraud) ) 
  %                 (0.8 * 0.001) / (  0.8 * 0.001 + 0.02 * 0.999 ) = 0.0384985563041386
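
The Bayes' rule arithmetic in answers 16 and 17 can be reproduced with a short script. This is a sketch using the probabilities from the work above; the function name `posterior` and its parameter names are hypothetical, chosen for illustration.

```python
def posterior(p_e_given_h, p_e_given_not_h, prior):
    """Bayes' rule for a binary hypothesis H given one piece of evidence E."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1 - prior)
    return numerator / evidence

# Answer 16: P(fraud|amnt), with P(amnt|fraud)=0.9, P(amnt|-fraud)=0.005.
p_fraud_given_amnt = posterior(0.9, 0.005, 0.001)

# Answer 17: P(fraud|st), with P(st|fraud)=0.8, P(st|-fraud)=0.02.
p_fraud_given_st = posterior(0.8, 0.02, 0.001)

print(round(p_fraud_given_amnt, 3), round(p_fraud_given_st, 3))
```

Both posteriors stay small because the prior P(fraud) = 0.001 is so low that even strong single pieces of evidence are outweighed by the much larger non-fraud population.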
18. not enough data; we don't know P(amnt,st|fraud) or P(amnt,st|-fraud)
  % answer is: P(fraud|amnt,st) = P(amnt,st|fraud)P(fraud) / ( P(amnt,st|fraud)P(fraud) + P(amnt,st|-fraud)P(-fraud) )
  % but we are given only the per-feature likelihoods, not the joint P(amnt,st|fraud) and P(amnt,st|-fraud)
19. 0.878
  % P(fraud|amnt,st) = P(amnt,st|fraud)P(fraud) / (P(amnt,st|fraud)P(fraud) + P(amnt,st|-fraud)P(-fraud) )
  %   naive assumption: P(amnt,st|fraud) = P(amnt|fraud)P(st|fraud), and likewise P(amnt,st|-fraud) = P(amnt|-fraud)P(st|-fraud)
  %   P(amnt|fraud)P(st|fraud)P(fraud) / ( P(amnt|fraud)P(st|fraud)P(fraud) + P(amnt|-fraud)P(st|-fraud)P(-fraud) )
  %    (0.9 * 0.8 * 0.001) / ( 0.9 * 0.8 * 0.001 + 0.02 * 0.005 * 0.999) = 0.8781558726673985
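
The naive Bayes calculation in answer 19 can be sketched the same way. This assumes conditional independence of the amount and state features given the class, as the answer does; `naive_bayes_posterior` is a hypothetical helper name, and `math.prod` requires Python 3.8 or later.

```python
import math

def naive_bayes_posterior(likelihoods_h, likelihoods_not_h, prior):
    """Posterior of H given several features assumed independent given the class."""
    numerator = math.prod(likelihoods_h) * prior
    alternative = math.prod(likelihoods_not_h) * (1 - prior)
    return numerator / (numerator + alternative)

# Features: (amnt, st). Values taken from the work for answers 16 and 17.
p_fraud_given_both = naive_bayes_posterior(
    likelihoods_h=[0.9, 0.8],         # P(amnt|fraud), P(st|fraud)
    likelihoods_not_h=[0.005, 0.02],  # P(amnt|-fraud), P(st|-fraud)
    prior=0.001,                      # P(fraud)
)

print(round(p_fraud_given_both, 3))
```

Combining the two features multiplies their likelihood ratios (180 and 40), which is what lifts the posterior from 0.15 and 0.038 individually to about 0.88 together.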
20. One way to fix these kinds of issues is to keep statistics per customer. If a particular customer travels a lot, an out-of-state transaction should not raise red flags, so adjust P(out-of-state|fraud) for that customer.