Pricing Flashcards

(15 cards)

1
Q

Pricing

What is the charge model for text prompts and generated output?

A

Charged per input token and per output token
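As a rough illustration of per-token billing, the sketch below adds up the cost of one request. The function name and both prices are hypothetical placeholders, not real rates; actual prices vary by model and region.

```python
def text_request_cost(input_tokens: int, output_tokens: int,
                      price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Cost of one request when input and output tokens are billed separately.

    Prices are expressed per 1,000 tokens (an assumption for this sketch).
    """
    input_cost = (input_tokens / 1000) * price_in_per_1k
    output_cost = (output_tokens / 1000) * price_out_per_1k
    return input_cost + output_cost


# Hypothetical prices, for illustration only.
cost = text_request_cost(2000, 500, price_in_per_1k=0.003, price_out_per_1k=0.015)
print(f"Estimated request cost: {cost:.4f}")
```

Output tokens are often priced higher than input tokens, so long responses can dominate the bill even when prompts are short.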

2
Q

Pricing

Charge model for embedding models?

A

Charged for every input token processed

3
Q

Pricing

Charge model for image models?

A

Charge for each image generated

4
Q

Pricing

How is batch inference priced compared to non-batch (on-demand) inference?

A

Batch can be up to 50% cheaper
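To make the "up to 50%" figure concrete, here is a minimal sketch of the arithmetic. The function name is hypothetical, and the default discount is the upper bound stated on the card; the discount that actually applies depends on the model.

```python
def batch_cost(on_demand_cost: float, discount: float = 0.5) -> float:
    """Cost of a workload run in batch mode.

    `discount` is the fraction taken off the on-demand price. The default
    of 0.5 is the "up to 50%" ceiling, not a guaranteed rate.
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be between 0 and 1")
    return on_demand_cost * (1.0 - discount)


# A workload costing 10.00 on demand costs 5.00 at the full 50% discount.
print(batch_cost(10.0))
```

The trade-off is latency: batch jobs return results later rather than in real time, so the saving only suits workloads that can wait.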

5
Q

Pricing

How does Provisioned work?

A

Purchase model units for a certain time (1 month, 6 months, …)

6
Q

Pricing

What do you get with Provisioned Throughput?

A

Guaranteed throughput (max tokens in/out per minute)

7
Q

Pricing

What types of models support Provisioned Throughput?

A

Base, fine-tuned, and custom models

8
Q

Pricing

Cheapest of the Model Improvement techniques?

A

Prompt engineering – no training, nothing special

9
Q

Pricing

Next-cheapest of the Model Improvement techniques?

A

RAG: the model doesn’t have to know everything, because the knowledge is stored externally and retrieved at query time

10
Q

Pricing

Next cheapest of the model improvement techniques after RAG?

A

Instruction-based fine-tuning

11
Q

Pricing

Most expensive of the Model Improvement techniques?

A

Domain adaptation fine-tuning

12
Q

Pricing

How can Provisioned Throughput save costs?

A

It generally can’t: it’s reserved capacity, bought for guaranteed throughput rather than for savings

13
Q

Pricing

How can Temperature save costs?

A

It can’t: temperature changes how varied the output is, not what each token costs

14
Q

Pricing

How can Top-K / Top-P save costs?

A

It can’t: Top-K and Top-P change which tokens are sampled, not what each token costs

15
Q

Pricing

Main cost driver for LLMs?

A

Number of input and output tokens
