What is a CNN?
A deep learning architecture for data with a grid-like topology, such as images. It applies learned convolutional filters whose weights are shared across spatial positions.
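As a rough illustration of the sliding-filter operation at the core of a CNN, here is a minimal pure-Python sketch. The function name `conv2d_valid`, the toy image, and the kernel values are made up for this example. It assumes no padding and a stride of 1, and it computes cross-correlation, which is what deep learning libraries usually mean by "convolution".

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)
    and sum the elementwise products at each position."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            acc = 0.0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        out.append(row)
    return out


# Toy 3x3 "image" and a 2x2 kernel (illustrative values only).
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
kernel = [
    [1, 0],
    [0, -1],
]
feature_map = conv2d_valid(image, kernel)
```

The same four kernel weights are reused at every position, which is the weight sharing that keeps convolutional layers small compared with fully connected ones.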
What is an RNN?
A neural network for sequential data that carries a hidden state from step to step, so each output depends on the current input and on earlier inputs.
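The recurrence can be sketched with a single scalar hidden unit. This is a simplified illustration, not a full RNN layer: the name `rnn_forward` and the weight values `w_x`, `w_h`, and `b` are assumptions chosen for the example, and real networks use weight matrices learned during training.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a one-unit RNN: h_t = tanh(w_x * x_t + w_h * h_{t-1} + b)."""
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)
        states.append(h)
    return states


# Only the first input is non-zero, yet later states stay non-zero
# because the hidden state carries information forward.
states = rnn_forward([1.0, 0.0, 0.0])
```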
What is a transformer model?
An architecture built on self-attention, which lets every position in a sequence weigh every other position. It excels at NLP tasks.
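A minimal sketch of scaled dot-product self-attention follows. To keep it short it assumes the queries, keys, and values are all the raw input vectors, whereas a real transformer first applies separate learned projections and uses multiple heads. The names `self_attention` and `tokens` are illustrative.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """x is a list of token vectors. Assumes Q = K = V = x
    (no learned projections), which is a simplification."""
    d = len(x[0])
    out = []
    for q in x:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in x]
        weights = softmax(scores)
        # Output is the attention-weighted average of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, x)) for j in range(d)])
    return out


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = self_attention(tokens)
```

Each output vector is a weighted average of all input vectors, so every position can draw on every other position in one step.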
What is ReLU?
Activation function: f(x) = max(0, x). Its gradient is 1 for positive inputs, which helps mitigate vanishing gradients.
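A small sketch of ReLU and its derivative, using the illustrative names `relu` and `relu_grad`. The derivative at exactly 0 is undefined; treating it as 0 here is a common convention, not the only one.

```python
def relu(x):
    """ReLU: pass positive values through, clamp the rest to zero."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise
    (the value at x == 0 is a convention)."""
    return 1.0 if x > 0 else 0.0
```

Because the gradient is exactly 1 wherever the unit is active, it does not shrink as it is multiplied through many layers, unlike the gradients of saturating activations.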
What is sigmoid activation?
Activation function: f(x) = 1 / (1 + e^(-x)). Outputs values in (0, 1), often used as the output unit for binary classification.
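The sigmoid function 1 / (1 + e^(-x)) can be sketched as below. The two-branch form is one common way to avoid overflow for large negative inputs; the function name `sigmoid` is illustrative.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow in math.exp."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # For negative x, exp(x) is small, so this form cannot overflow.
    z = math.exp(x)
    return z / (1.0 + z)
```

An output such as `sigmoid(0.0)`, which is 0.5, can be read as the predicted probability of the positive class.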
What is softmax?
Converts logits into a probability distribution over classes.
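A minimal softmax sketch, with example logits chosen only for illustration. Subtracting the maximum logit before exponentiating is a standard stability trick and does not change the result.

```python
import math


def softmax(logits):
    """Exponentiate the shifted logits and normalize so they sum to 1."""
    shift = max(logits)  # guards against overflow in math.exp
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax([1.0, 2.0, 3.0])
```

The outputs are all positive, sum to 1, and preserve the ordering of the logits, so the largest logit gets the largest probability.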
What is cross-entropy loss?
A loss function for classification that compares predicted probabilities to true labels. It grows as the probability assigned to the correct class shrinks.
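For a single example with a one-hot label, cross-entropy reduces to the negative log of the probability predicted for the true class. The sketch below assumes that case; the function name `cross_entropy` and the probability vectors are illustrative.

```python
import math


def cross_entropy(probs, true_class):
    """Cross-entropy for one example with a one-hot label:
    the negative log-probability of the true class."""
    return -math.log(probs[true_class])


# Same true class, two different predictions.
confident_right = cross_entropy([0.1, 0.8, 0.1], true_class=1)
confident_wrong = cross_entropy([0.8, 0.1, 0.1], true_class=1)
```

The loss is small when the model puts high probability on the correct class and large when it confidently picks the wrong one.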
What is dropout?
A regularization method that randomly deactivates neurons during training to prevent overfitting.
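One common variant, often called inverted dropout, can be sketched as follows. It scales the surviving activations by 1 / (1 - p) during training so that nothing needs to change at inference time. The function name `dropout`, its parameters, and the seed are assumptions for this example, and p is assumed to be less than 1.

```python
import random


def dropout(activations, p=0.5, training=True, rng=random):
    """Inverted dropout: zero each activation with probability p
    and rescale the survivors by 1 / (1 - p). Assumes p < 1."""
    if not training or p == 0.0:
        return list(activations)  # identity at inference time
    keep = 1.0 - p
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)  # fixed seed so the example is repeatable
dropped = dropout([1.0, 1.0, 1.0, 1.0], p=0.5, rng=rng)
```

Each call during training draws a fresh random mask, so the network cannot rely on any single neuron being present.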
What is batch normalization?
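A technique that normalizes each layer's inputs over the mini-batch to zero mean and unit variance, then applies a learnable scale and shift. It speeds up and stabilizes training.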
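The training-time computation for a single feature can be sketched as below. The name `batch_norm` and the sample values are illustrative; `gamma` and `beta` stand in for the learnable scale and shift and are fixed here. At inference time, implementations typically use running averages of the mean and variance instead of batch statistics, which this sketch omits.

```python
import math


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n  # biased variance
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


# One feature observed across a mini-batch of four examples.
normalized = batch_norm([2.0, 4.0, 6.0, 8.0])
```

The small constant `eps` keeps the division well defined when the batch variance is close to zero.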
What is the Adam optimizer?
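Adaptive Moment Estimation: an optimizer that combines momentum (a running average of gradients) with RMSProp-style step sizes (a running average of squared gradients).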
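The update rule can be sketched for a single scalar parameter. The function name `adam_minimize`, the toy objective (x - 3)^2, the learning rate, and the step count are assumptions for illustration; the beta and epsilon values are the commonly cited defaults.

```python
import math


def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=500):
    """Minimize a scalar function with Adam, given its gradient."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g      # momentum: average of gradients
        v = beta2 * v + (1 - beta2) * g * g  # RMSProp part: average of g^2
        m_hat = m / (1 - beta1 ** t)         # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Toy objective f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x_min = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
```

The division by the square root of `v_hat` makes the step size roughly independent of the gradient's scale, which is the adaptive part of the method.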