Filling Gaps Flashcards

(72 cards)

1
Q

What does CTR stand for?

A

click-through rate

2
Q

What is the A group in an A/B test?

A

the control group

3
Q

What is the B group in an A/B test?

A

the treatment group

4
Q

How would we compute conversion rates in an A/B test?

A

for each variant, the number of users who converted divided by the number of users shown that variant (computed separately for A and for B)
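
As a minimal sketch of that calculation, the snippet below divides conversions by users shown for each variant. The counts and the helper name `conversion_rate` are hypothetical, chosen only for illustration.

```python
# Hypothetical counts for illustration; not taken from the flashcards.
shown = {"A": 1000, "B": 1000}      # users shown each variant
converted = {"A": 100, "B": 130}    # users who converted in each variant


def conversion_rate(conversions: int, users_shown: int) -> float:
    """Conversions divided by users shown, for a single variant."""
    if users_shown <= 0:
        raise ValueError("users_shown must be positive")
    return conversions / users_shown


# One rate per variant: A and B are never mixed in the same ratio.
rates = {variant: conversion_rate(converted[variant], shown[variant])
         for variant in shown}
print(rates)
```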

5
Q

What would be the null hypothesis and alternative hypothesis for an A/B test with click-through rates?

A

H_0: p_a = p_b, no difference
H_1: p_a != p_b, there is a difference

6
Q

after identifying the null hypothesis for an A/B test, what should we do?

A

compute the pooled proportion, use it to compute the standard error, and then use both group proportions and the standard error to compute the z-score
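
The three steps in this answer can be sketched as a two-proportion z-test. This is one common formulation, not necessarily the one the deck's author had in mind: the function name, the sign convention (B minus A), and the example counts are assumptions.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H_0: p_a = p_b."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Step 1: pooled proportion under H_0 (both groups share one rate).
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Step 2: standard error of the difference, using the pooled proportion.
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    # Step 3: z-score. Assumed sign convention: positive z means B is higher.
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 100/1000 conversions in A, 130/1000 in B.
z, p_value = two_proportion_z_test(100, 1000, 130, 1000)
print(f"z = {z:.3f}, p = {p_value:.4f}")
```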

7
Q

what is the equivalent of 5% significance level in z values?

A

the critical z values for a two-tailed test are ±1.96
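
As a quick check of where that number comes from, the sketch below looks up the critical value from the standard normal distribution using Python's standard library. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05  # significance level

# A two-tailed test puts alpha / 2 in each tail,
# so the cutoff is the 97.5th percentile of the standard normal.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
print(round(z_crit, 2))  # 1.96
```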

8
Q

what can we conclude if our z-score was larger than 2?

A

we'd reject H_0 at the 5% level, since 2 > 1.96; if z is computed as B minus A, a positive z means B performs better

9
Q

When would we use a t-test?

A

use it when we're comparing the averages (means) of a continuous metric between groups

10
Q

what is an example of a metric that tells us we need to use a t-test?

A

average time spent, purchase amount

11
Q

what is an example of a metric that tells us we need to use a two-proportion z-test?

A

click-through rate, conversion rate

12
Q

what is an example of a situation that tells us we need to use a Mann-Whitney U test?

A

a metric whose data are non-normal (for example, heavily skewed), especially with small samples

13
Q

When would we use a Mann-Whitney U test?

A

as a non-parametric alternative to the two-sample t-test
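
To make the idea concrete, here is a minimal sketch of the U statistic itself, computed by direct pair counting. It only produces the statistic, not a p-value. The function name and the sample values are hypothetical.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b) by comparing every pair across the two samples.

    U_a counts pairs where the value from A exceeds the value from B;
    ties contribute 0.5. No normality assumption is needed because only
    the ordering of the values matters, not their size.
    """
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Hypothetical skewed data: one extreme value in group A.
group_a = [1.2, 3.4, 2.2, 50.0]
group_b = [4.1, 5.6, 6.3, 7.0]
u_a, u_b = mann_whitney_u(group_a, group_b)
print(u_a, u_b)
```

Note that the outlier 50.0 counts only as "larger than every value in B", so it cannot dominate the result the way it would dominate a mean.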

14
Q

What is another time we would use a two-proportion z-test?

A

when the data are binary, like clicked (1) or did not click (0)

15
Q

What does a two-sample t-test measure?

A

whether the means differ significantly

16
Q

What is an example that a two-sample t-test would be good for?

A

comparing the average time on a site for version A vs. version B
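
A minimal sketch of that comparison follows. It assumes Welch's version of the two-sample t-test (unequal variances), which the cards do not specify, and it stops at the t statistic and degrees of freedom because the standard library has no t distribution. The function name and the timing data are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees of freedom) for Welch's two-sample t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error of each sample mean (sample variance / n).
    se_sq_a = variance(sample_a) / n_a
    se_sq_b = variance(sample_b) / n_b
    # Assumed sign convention: positive t means B has the higher mean.
    t = (mean(sample_b) - mean(sample_a)) / math.sqrt(se_sq_a + se_sq_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se_sq_a + se_sq_b) ** 2 / (
        se_sq_a ** 2 / (n_a - 1) + se_sq_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical time on site, in seconds, for each version.
time_a = [30.0, 32.0, 34.0, 36.0, 38.0]
time_b = [36.0, 38.0, 40.0, 42.0, 44.0]
t_stat, df = welch_t(time_a, time_b)
print(t_stat, df)
```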

17
Q

What is the chi-squared test of independence?

A

tests whether two categorical variables are related

18
Q

what is an example that we would use chi-squared test of independence?

A

does gender affect click behavior?
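
A sketch of how such a question could be tested: build a contingency table of observed counts, compute the counts expected if the two variables were unrelated, and sum the squared differences. The table values and the function name are hypothetical. The cutoff 3.841 is the 5% critical value of the chi-squared distribution with 1 degree of freedom.

```python
def chi_squared_independence(table):
    """Return (statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


# Hypothetical counts. Rows: two user groups. Columns: clicked, did not click.
table = [[30, 70],
         [50, 50]]
stat, df = chi_squared_independence(table)
print(stat, df, stat > 3.841)  # compare against the 5% cutoff for df = 1
```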

19
Q

what is the chi-squared goodness-of-fit test?

A

tests whether one categorical variable follows an expected distribution

20
Q

what is an example of when we would use a chi-squared goodness-of-fit test?

A

are clicks evenly distributed across 3 button colors?
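
The button-color question can be sketched as follows. The click counts and the function name are hypothetical. The cutoff 5.991 is the 5% critical value of the chi-squared distribution with 2 degrees of freedom.

```python
def chi_squared_goodness_of_fit(observed, expected_probs):
    """Return (statistic, degrees of freedom) for one categorical variable."""
    total = sum(observed)
    stat = 0.0
    for count, prob in zip(observed, expected_probs):
        expected = total * prob  # count expected under the assumed distribution
        stat += (count - expected) ** 2 / expected
    return stat, len(observed) - 1


# Hypothetical clicks per button color.
clicks = {"red": 50, "green": 30, "blue": 40}
# "Evenly distributed" means each color is expected to get 1/3 of the clicks.
even_split = [1 / 3] * len(clicks)
stat, df = chi_squared_goodness_of_fit(list(clicks.values()), even_split)
print(stat, df, stat > 5.991)  # compare against the 5% cutoff for df = 2
```

With these made-up counts the statistic falls below the cutoff, so an even split would not be rejected at the 5% level.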

21
Q

What is an easy way to tell if we need to use a chi-squared test?

A

the question sounds like "are these two categorical things connected?"

22
Q

what is a true positive?

A

you said yes and you were right

23
Q

what is a false positive?

A

you said yes but were wrong

24
Q

what is a false negative?

A

you said no but were wrong

25
what is a true negative?
you said no and were right
26
what is accuracy?
how often the model is right
27
what is precision?
of all the things the model said "yes" to, how many were actually yes?
28
what is recall?
of all the real yes things, how many did the model catch?
29
what is the F1 score?
the harmonic mean of precision and recall, which balances the two
30
what is specificity?
how many real "no" things did it correctly say no to?
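
The five metrics on the cards above can all be computed from the four confusion-matrix counts. The sketch below uses the standard formulas. The function name and the counts are hypothetical, and it assumes every denominator is nonzero.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from confusion-matrix counts.

    Assumes each denominator is nonzero (no guarding for empty classes).
    """
    precision = tp / (tp + fp)      # of everything called "yes", how much was yes
    recall = tp / (tp + fn)         # of all real "yes" cases, how many were caught
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),   # how often it was right
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),  # of all real "no" cases, how many got "no"
        "f1": 2 * precision * recall / (precision + recall),  # harmonic mean
    }


# Hypothetical counts for 100 predictions.
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```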
31
What is an example of a true positive?
sick person correctly diagnosed as sick
32
what is an example of a false positive?
healthy person told they're sick
33
what is an example of a false negative?
sick person told they're healthy
34
what is an example of a true negative?
healthy person correctly told they're healthy
35
what is a real-world example of precision?
when the doctor says someone is sick, how often is it true?
36
what is a real-world example of recall?
of all sick people, how many did the doctor find?
37
what is a real-world example of accuracy?
how often did the doctor get it right overall?
38
what is a taxonomy a fancy word for?
organized categories
39
why are taxonomies important?
they group similar things, find information faster, compare things fairly
40
what is hierarchy?
an ordering of categories from big (general) to small (specific)
41
what is a node?
one point (a single category or item) in that chain
42
what is a parent?
the category above
43
what is a child?
the category below
44
what are siblings?
categories on the same level
45
how are taxonomies like maps?
they tell you where something belongs and how it connects to other things
46
what is a trunk?
the main idea
47
what are the branches?
big groups
48
what are leaves?
specific items
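
The vocabulary on the cards above (parent, child, siblings, leaves) can be sketched with a small child-to-parent mapping. The food categories and helper names are hypothetical, chosen only for illustration.

```python
# Each key is a child category; its value is that child's parent.
parent_of = {
    "Fruit": "Food",
    "Vegetable": "Food",
    "Apple": "Fruit",
    "Banana": "Fruit",
    "Carrot": "Vegetable",
}


def children(node):
    """Categories directly below the given node."""
    return sorted(child for child, parent in parent_of.items() if parent == node)


def siblings(node):
    """Other categories that share the given node's parent."""
    parent = parent_of.get(node)
    if parent is None:
        return []  # the root (trunk) has no siblings
    return [child for child in children(parent) if child != node]


def leaves():
    """Nodes with no children: the most specific items."""
    return sorted(node for node in parent_of if not children(node))


def path_to_root(node):
    """Walk from a node up through its parents to the trunk."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path


print(children("Fruit"), siblings("Apple"), leaves(), path_to_root("Apple"))
```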
49
what is a category?
a named group of similar things
50
what is labeling?
assigning items to categories
51
what is data labeling?
giving names or tags to examples so the model can learn from them later
52
why do we label data?
because computers don't know what they're looking at, so labels turn raw data into something meaningful
53
if we add a label to a photo of an apple, what are we using it for?
image recognition
54
if we add a label of "positive sentiment" to a text example, what are we using it for?
sentiment analysis
55
if we add a label of "bark" to an audio clip, what are we using it for?
sound detection
56
if we are adding a label of "car" to a video of traffic, what are we using it for?
object detection, for example in self-driving cars
57
how do we do a classification label?
pick one label for the whole thing
58
how do we create a bounding box label?
draw a box around an object in an image
59
how do we create a segmentation label?
color every pixel of an object
60
how do we create an entity tagging label?
mark key words in text
61
how do we create sentiment labeling?
mark how something feels
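
As a rough illustration of how the label types above differ, the sketch below stores three of them as simple records. The class names, field names, and example values are all hypothetical and do not follow any particular annotation tool's format.

```python
from dataclasses import dataclass


@dataclass
class ClassificationLabel:
    """One label for the whole item (also fits sentiment labels)."""
    item_id: str
    label: str


@dataclass
class BoundingBoxLabel:
    """A box drawn around one object in an image, in pixel coordinates."""
    item_id: str
    label: str
    x_min: int
    y_min: int
    x_max: int
    y_max: int

    def area(self) -> int:
        return (self.x_max - self.x_min) * (self.y_max - self.y_min)


@dataclass
class EntityTag:
    """A key word or phrase marked in text by character offsets."""
    item_id: str
    label: str
    start: int  # inclusive
    end: int    # exclusive


text = "Ada visited Paris"
sentiment = ClassificationLabel(item_id="review-1", label="positive")
box = BoundingBoxLabel("photo-1", "apple", x_min=10, y_min=20, x_max=60, y_max=50)
tag = EntityTag("sentence-1", "LOCATION", start=12, end=17)
print(sentiment.label, box.area(), text[tag.start:tag.end])
```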
62
what is a data annotator?
the person labeling the data
63
what is a data annotation?
the act of marking or highlighting the data
64
what is data governance?
knowing who manages the data, where it comes from, and how it's used
65
what is data quality?
making sure the data is accurate, timely, complete, and consistent so it's fit for purpose for whoever needs it
66
what is data modeling?
it gives structure to the data by defining how we represent real-world entities and their relationships in a way that's scalable
67
what is data lifecycle management?
focuses on how data is created, maintained, improved, and retired over time
68
What is the consistency property of a good taxonomy?
same logic at every level
69
What is the completeness property of a good taxonomy?
all relevant categories are represented
70
What is the single inheritance property of a good taxonomy?
each child has one parent
71
What is the proper granularity property of a good taxonomy?
categories at the same level have a consistent amount of detail
72
What is the no cycles property of a good taxonomy?
should form a tree, not loops
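
Two of these properties, single inheritance and no cycles, are easy to check mechanically. The sketch below is one possible check over a list of (parent, child) pairs. The function name, the message wording, and the example categories are hypothetical.

```python
def check_taxonomy(edges):
    """Return a list of problems found in (parent, child) pairs."""
    problems = []
    parent_of = {}
    # Single inheritance: each child may be assigned only one parent.
    for parent, child in edges:
        if child in parent_of and parent_of[child] != parent:
            problems.append(f"{child} has more than one parent")
        else:
            parent_of[child] = parent
    # No cycles: walking upward from any node must never revisit a node.
    for start in parent_of:
        seen = {start}
        node = start
        while node in parent_of:
            node = parent_of[node]
            if node in seen:
                problems.append(f"cycle involving {start}")
                break
            seen.add(node)
    return problems


good = [("Food", "Fruit"), ("Fruit", "Apple")]
two_parents = good + [("Snack", "Apple")]
looped = [("A", "B"), ("B", "A")]

print(check_taxonomy(good))
print(check_taxonomy(two_parents))
print(check_taxonomy(looped))
```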