Where do False Positive rate and False Negative rate sit in the confusion matrix?

If you could only use the top 25% of instances, which model would you go for?
If you could go for more?

Looking at this graph, if you wanted to target the top 25% or fewer of customers, you’d choose the classification tree model; if you wanted to go further down the list, you should choose Naive Bayes (NB).
What are the equations for sensitivity and specificity?
How do you remember them?
Sensitivity refers to the true positive rate: TP / (TP + FN).
Specificity refers to the true negative rate: TN / (TN + FP).
To remember sensitivity, view it as how well the model detects actual positives: true positives relative to all actual positives, including those wrongly identified as negative (false negatives).
A false negative is an actual positive to which the model falsely assigned a negative.
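The two rates can be computed directly from the four cells of the confusion matrix. A minimal sketch, assuming hypothetical counts invented purely for illustration; the function and variable names are not from the notes:

```python
def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of actual positives the model caught."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of actual negatives the model caught."""
    return tn / (tn + fp)


# Hypothetical confusion-matrix counts (made up for this example).
tp, fn, tn, fp = 40, 10, 45, 5

sens = sensitivity(tp, fn)  # 40 / (40 + 10)
spec = specificity(tn, fp)  # 45 / (45 + 5)

# The error rates are the complements of the two rates above.
false_negative_rate = 1 - sens  # FN / (TP + FN)
false_positive_rate = 1 - spec  # FP / (TN + FP)
```

Both rates are computed within one actual class: sensitivity and the false negative rate use only the actual positives, specificity and the false positive rate use only the actual negatives.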

What do ROC curves allow for?
ROC curves make it easy to identify the best threshold for making a decision.
What does the AUC allow you to do?
The AUC can help you decide which classification method is better.
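One way to see why the AUC supports this comparison: it equals the probability that a randomly chosen positive is scored above a randomly chosen negative. A minimal sketch under that interpretation, using made-up scores for two hypothetical models (`model_a`, `model_b`):

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for two hypothetical models.
labels = [1, 1, 0, 1, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
model_b = [0.6, 0.3, 0.7, 0.5, 0.4, 0.8]

auc_a = auc_from_scores(model_a, labels)  # 8 of 9 pairs ranked correctly
auc_b = auc_from_scores(model_b, labels)  # 2 of 9 pairs ranked correctly
better = "A" if auc_a > auc_b else "B"
```

The pairwise loop is quadratic in the number of instances, which is fine for a small illustration but not how one would compute AUC on a large test set.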
What should we know about ranking classifiers?
What are profit curves and how do you choose a classifier?
Choosing a classifier:
How do you build a profit curve and what are the critical conditions for the suitability of a profit curve?
How do you build it?
Critical conditions of the suitability of profit curves
What is a ROC curve and when is it used?
What are important points in the ROC space?
How do you evaluate points in a ROC curve?
Which side is more important?
When thinking about ranking classifiers, for logistic regression, what does the threshold translate to?
The threshold translates to the probability scale on the y-axis of the sigmoid plot.
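A small sketch of that idea: the linear score is mapped through the sigmoid to a probability, and the threshold is a cut on that probability. The score values and the function name `classify` are assumptions made for illustration only:

```python
import math


def sigmoid(z: float) -> float:
    """Map a linear score (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(z: float, threshold: float = 0.5) -> int:
    """Predict positive when the estimated probability reaches the threshold."""
    return 1 if sigmoid(z) >= threshold else 0


# Made-up linear scores for three instances.
scores = [-2.0, 0.0, 1.5]
probabilities = [sigmoid(z) for z in scores]

labels_default = [classify(z) for z in scores]                # cut at 0.5
labels_strict = [classify(z, threshold=0.9) for z in scores]  # cut at 0.9
```

Raising the threshold from 0.5 to 0.9 turns every prediction in this toy example negative, which is the same as moving the cut line higher up the y-axis of the sigmoid plot.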
How is a ranking model with a threshold applied to a ROC curve?
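Rank the instances by classifier value and lower the threshold one instance at a time.
If the instance that now falls above the threshold is actually positive, move upward (TP).
If it is actually negative, move right (FP).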
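The stepping rule can be sketched as a walk down the ranked list. This is a minimal illustration that assumes distinct scores (no ties) and uses made-up data; `roc_points` is a hypothetical helper name:

```python
def roc_points(scores, labels):
    """Return ROC points (FPR, TPR), one per step down the ranked list."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    # Highest score first: lowering the threshold admits one instance at a time.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]  # threshold above every score: nothing predicted positive
    for _, label in ranked:
        if label == 1:
            tp += 1  # actual positive now above the threshold: step up
        else:
            fp += 1  # actual negative now above the threshold: step right
        points.append((fp / n_neg, tp / n_pos))
    return points


# Made-up scores and true labels (1 = positive, 0 = negative).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]
points = roc_points(scores, labels)
```

The walk always starts at (0, 0) and ends at (1, 1); a good ranking climbs before it moves right.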
What is the AUC?
What do we want to see in a Cumulative Response curve?
It plots the percentage of positives targeted against the percentage of test instances targeted.
What we want to see is that if we target 20% of the population, we'll find more than 20% of our positive cases.
What can we calculate/derive from the cumulative response curve?
We can calculate lift, which is the advantage the model gives us over the baseline at a given percentage targeted, by dividing the percentage of positives targeted by the percentage of total instances targeted.
Lift should be above 1 to outperform the baseline.
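As a rough sketch of how both curves come out of the same ranked list, assuming made-up scores and labels and hypothetical helper names:

```python
def cumulative_response(scores, labels):
    """Return (fraction targeted, fraction of positives captured) per list position."""
    ranked = [y for _, y in sorted(zip(scores, labels),
                                   key=lambda pair: pair[0], reverse=True)]
    n = len(ranked)
    n_pos = sum(ranked)
    curve = []
    found = 0
    for i, y in enumerate(ranked, start=1):
        found += y  # count positives seen so far
        curve.append((i / n, found / n_pos))
    return curve


def lift_curve(curve):
    """Lift = fraction of positives captured / fraction of instances targeted."""
    return [(x, y / x) for x, y in curve]


# Made-up data: 10 instances, 4 of them positive.
scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]

curve = cumulative_response(scores, labels)
lift = lift_curve(curve)

targeted_20, captured_20 = curve[1]  # after targeting the top 2 of 10
lift_20 = lift[1][1]                 # 0.5 / 0.2
```

In this toy data the top 20% of the list holds half of the positives, giving a lift of 2.5 at that point; once the whole list is targeted the lift falls back to 1.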
What is the Cumulative Response Curve?
What is the Lift curve?
Lift: amount by which a classifier concentrates the positive examples above the negative examples
How much more prevalent is the positive class in the selected sub-population over the distribution of the class in the entire population? E.g. if the prevalence of target class A is 1% in the entire population and in our sub-population it is 5% we have a corresponding lift of 5!
What are important things to consider about cumulative response and lift curves?
What is each of the graphs/plots sensitive to, and what are their advantages?

What equation do you need to calculate the profit curve?
The expected profit equation.
You can also use the modified version with conditionals.
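The notes do not spell the equation out. The sketch below assumes the common cost-benefit formulation: expected profit is the sum, over the four confusion-matrix cells, of the cell's probability times its benefit or cost. The conditional version factors each probability into a class prior times a rate. All counts and monetary values are invented, and costs are entered as negative benefits:

```python
def expected_profit(tp, fp, fn, tn, b_tp, c_fp, c_fn, b_tn):
    """Joint form: sum of p(cell) * value(cell) over the confusion matrix."""
    total = tp + fp + fn + tn
    return (tp * b_tp + fp * c_fp + fn * c_fn + tn * b_tn) / total


def expected_profit_conditional(p_pos, tpr, fpr, b_tp, c_fp, c_fn, b_tn):
    """Conditional form: class priors multiplied by the per-class rates."""
    p_neg = 1 - p_pos
    from_positives = p_pos * (tpr * b_tp + (1 - tpr) * c_fn)
    from_negatives = p_neg * ((1 - fpr) * b_tn + fpr * c_fp)
    return from_positives + from_negatives


# Invented counts and values: a targeted positive earns 99, a targeted
# negative costs 1 (entered as -1), untargeted instances earn nothing.
tp, fp, fn, tn = 30, 20, 10, 140
b_tp, c_fp, c_fn, b_tn = 99.0, -1.0, 0.0, 0.0

joint = expected_profit(tp, fp, fn, tn, b_tp, c_fp, c_fn, b_tn)

p_pos = (tp + fn) / (tp + fp + fn + tn)  # class prior of positives
tpr = tp / (tp + fn)
fpr = fp / (fp + tn)
conditional = expected_profit_conditional(p_pos, tpr, fpr, b_tp, c_fp, c_fn, b_tn)
```

Both forms give the same value per instance for the same confusion matrix. The conditional form is convenient when the class prior in deployment differs from the test set, because `p_pos` can be changed without re-estimating the rates.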
