Training Flashcards by Rich Alberth

Training

What is labeled data?

Input data used to train a model, paired with a reliable description of what it is (the ground truth).

Training

What are the top-level classifications of ML algorithms?

Supervised learning, unsupervised learning, and reinforcement learning

Training

How does Supervised Learning work?

Algorithms are trained on labeled data.

Training

What is the goal of supervised learning?

Learn a mapping function that can predict the output for new, unseen input data
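
As a minimal sketch of "learning a mapping function", the example below fits a straight line to a handful of labeled points and then predicts the output for an input it has never seen. The data and the names `fit_line` and `predict` are made up for illustration; real supervised learning usually involves far more data and a more flexible model.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept to labeled pairs by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical labeled data: each input x comes with its true output y.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)


def predict(x):
    """Apply the learned mapping to a new input."""
    return slope * x + intercept


print(predict(10.0))  # x = 10.0 was not in the training data
```

The labels (`ys`) are what make this supervised: the fit is driven entirely by the gap between predicted and true outputs.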

Training

How does unsupervised learning work?

Trained on unlabeled data

Training

Goal of unsupervised learning?

Discover inherent patterns, structures, or relationships within the input data
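
One way to picture "discovering structure" is clustering. The sketch below runs a simplified two-cluster k-means on unlabeled one-dimensional points; the data, the function name `kmeans_1d`, and the choice to start the centers at the minimum and maximum are all assumptions made for illustration.

```python
def kmeans_1d(points, iterations=10):
    """Group 1-D points into two clusters; no labels are used anywhere."""
    # Assumed initialization: start the two centers at the extremes.
    centers = [min(points), max(points)]
    clusters = [[], []]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        clusters = [[], []]
        for p in points:
            nearest = 0 if abs(p - centers[0]) <= abs(p - centers[1]) else 1
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers, clusters


# Hypothetical unlabeled data with two obvious groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points)
print(centers)
print(clusters)
```

The algorithm is never told which group a point belongs to; the grouping emerges from the distances between the points themselves.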

Reinforcement Learning

What is reinforcement learning?

A learning approach, distinct from supervised and unsupervised learning, in which the algorithm is given rewards or penalties for its actions and learns from this feedback to improve its decision-making over time.
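
The reward-and-penalty loop can be sketched with a two-action "bandit" problem. Everything here is a toy assumption: the environment always penalizes action 0 with -1 and rewards action 1 with +1, and the agent uses epsilon-greedy exploration, which is only one of many possible strategies. The name `run_bandit` is hypothetical.

```python
import random


def run_bandit(steps=200, epsilon=0.1, seed=0):
    """Learn action values from reward feedback with epsilon-greedy exploration."""
    rng = random.Random(seed)
    rewards = [-1.0, 1.0]    # assumed environment: action 0 penalized, action 1 rewarded
    estimates = [0.0, 0.0]   # the agent's current estimate of each action's value
    counts = [0, 0]          # how often each action has been tried
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore: pick a random action
        else:
            action = max(range(2), key=lambda a: estimates[a])  # exploit
        reward = rewards[action]
        counts[action] += 1
        # Incremental average: nudge the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit()
best_action = max(range(2), key=lambda a: estimates[a])
print(estimates, counts, best_action)
```

No one tells the agent which action is correct; it only sees the consequences of what it tried, and its choices improve as the estimates do.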

Reinforcement Learning

What data is used for Semi-Supervised learning?

A small amount of labeled data combined with a larger amount of unlabeled data

Reinforcement Learning

What is Reinforcement Learning usually used for?

Teaching AI to play games, and teaching robots to navigate and manipulate objects

Reinforcement Learning

Example of Reinforcement Learning in healthcare?

Optimize treatment plans

Reinforcement Learning

Example of Reinforcement Learning in finance?

Trading strategies

Reinforcement Learning

Are images and videos structured or unstructured data?

Unstructured

RLHF

What is RLHF?

Reinforcement Learning from Human Feedback

RLHF

Why bother with RLHF?

To better align the model's outputs with human values

RLHF

How do you do RLHF?

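Train a separate reward model: ask humans which of two generated answers they prefer (for example, which sounds more human), and fit the reward model to those comparisons. Then use the reward model's scores to tune the real model.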
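
The reward-model step can be sketched as learning from pairwise comparisons. This is a heavily simplified illustration, not how production systems are built: each answer is reduced to a single made-up "naturalness" feature, the reward is just `w * feature`, and the pairwise logistic (Bradley-Terry style) objective is an assumed choice. The second stage, tuning the real model against the reward model, is not shown.

```python
import math


def train_reward_model(comparisons, learning_rate=0.1, epochs=200):
    """Fit a one-weight reward model so preferred answers score higher.

    Each comparison is (preferred_feature, rejected_feature), where the
    feature is a hypothetical numeric summary of an answer.
    """
    w = 0.0
    for _ in range(epochs):
        for preferred, rejected in comparisons:
            # Modeled probability that the human picks the preferred answer.
            margin = w * (preferred - rejected)
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on log(p): push the preferred score upward.
            w += learning_rate * (1.0 - p) * (preferred - rejected)
    return w


# Hypothetical human judgments: the first answer in each pair was preferred.
comparisons = [(0.9, 0.2), (0.8, 0.1), (0.7, 0.3)]
w = train_reward_model(comparisons)


def reward(feature):
    """Score an answer with the learned reward model."""
    return w * feature


print(reward(0.9) > reward(0.2))
```

Once trained, a reward model like this can score new answers without asking a human each time, which is what makes it usable as the feedback signal during tuning.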
