Pricing
Charge model for text prompting and output?
Charged per Token in and Token out
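The per-token charge model above can be sketched as a small cost function. The rates and token counts below are illustrative placeholders, not real AWS prices:

```python
# Sketch: token-based billing for a text model.
# price_in_per_1k / price_out_per_1k are hypothetical rates, not actual AWS pricing.
def text_cost(tokens_in: int, tokens_out: int,
              price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Input and output tokens are billed at separate per-1K-token rates."""
    return (tokens_in / 1000) * price_in_per_1k + (tokens_out / 1000) * price_out_per_1k

# e.g. 2,000 input tokens and 500 output tokens at $0.003 / $0.015 per 1K tokens
print(round(text_cost(2000, 500, 0.003, 0.015), 4))  # 0.0135
```

Note that output tokens are typically priced higher than input tokens, which is why verbose responses drive cost.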
Pricing
Charge model for embedding models?
Charged for every input token processed
Pricing
Charge model for image models?
Charge for each image generated
Pricing
How does Batch inference pricing compare to on-demand?
Batch can be up to 50% cheaper
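The "up to 50%" discount above is a simple multiplier on the on-demand price. A minimal sketch, assuming the maximum stated discount applies:

```python
# Sketch: batch inference cost at an assumed discount vs. on-demand.
# The 50% figure is the maximum stated discount; actual savings vary by model.
def batch_cost(on_demand_cost: float, discount: float = 0.50) -> float:
    """Apply the batch discount to an on-demand cost."""
    return on_demand_cost * (1 - discount)

# e.g. a workload costing $10.00 on-demand
print(batch_cost(10.00))  # 5.0
```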
Pricing
How does Provisioned Throughput work?
Purchase model units for a certain time (1 month, 6 months, …)
Pricing
What do you get with Provisioned Throughput?
Guaranteed throughput (max tokens in/out per minute)
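The two cards above imply a fixed-fee model: you pay per model unit for the commitment term regardless of actual usage. A sketch with hypothetical rates and terms:

```python
# Sketch: Provisioned Throughput is a fixed fee per model unit for the whole
# commitment term, independent of tokens actually processed.
# The hourly rate and hours are illustrative, not real AWS figures.
def provisioned_cost(model_units: int, hourly_rate: float, hours_committed: int) -> float:
    """Fixed cost = units x rate x committed hours; usage does not change it."""
    return model_units * hourly_rate * hours_committed

# e.g. 2 model units at a hypothetical $20/hour for a 1-month (730 h) commitment
print(provisioned_cost(2, 20.0, 730))  # 29200.0
```

This fixed-fee structure is why a later card notes that Provisioned Throughput cannot itself save costs: the fee is owed whether the capacity is used or idle.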
Pricing
What types of models support Provisioned Throughput?
Base, Fine-tuned, and custom models
Pricing
Cheapest of the Model Improvement techniques?
Prompt engineering – no model training and no extra infrastructure required
Pricing
Next-cheapest of the Model Improvement techniques?
RAG: knowledge is stored externally, so the model doesn’t need to be trained on it
Pricing
Next cheapest of the model improvement techniques after RAG?
Instruction-based fine-tuning
Pricing
Most expensive of the Model Improvement techniques?
Domain adaptation fine-tuning
Pricing
How can Provisioned Throughput save costs?
It can’t – it’s reserved capacity billed at a fixed fee whether you use it or not
Pricing
How can Temperature save costs?
It can’t – temperature only changes output randomness, not token usage
Pricing
How can Top-K / Top-P save costs?
They can’t – they only shape the sampling distribution, not token usage
Pricing
Main cost driver for LLMs?
Number of input and output tokens