01 About Prompt Engineering

Zero-shot

Zero-shot learning: give the model a task with no examples at all.

The model doesn't need to be retrained for every new task, because it has already absorbed the patterns of countless task formats from its pretraining corpus. The classic example: you ask the model to classify reviews as positive or negative without ever defining what "positive" or "negative" means. The model draws on the classification patterns it learned during pretraining and executes directly; that's Zero-shot.

This concept is often used to highlight the core advantage of large models over traditional machine learning. Because large models are trained on vast corpora, they have seen virtually every common task format humans can describe and have internalized the corresponding processing patterns, so they can deliver solid results with no additional training.

However, scenarios that require complex reasoning, precise format control, or deep domain knowledge often demand Few-shot or even Fine-tuning.
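Concretely, a zero-shot prompt is nothing more than the task instruction plus the input. A minimal sketch (the wording and labels here are illustrative, not tied to any particular model or API):

```python
def zero_shot_prompt(review: str) -> str:
    """Build a zero-shot sentiment-classification prompt: the task is
    stated directly, with no worked examples for the model to imitate."""
    return (
        "Classify the following review as positive or negative.\n"
        f"Review: {review}\n"
        "Label:"
    )

prompt = zero_shot_prompt("The battery died after two days.")
```

The model is expected to complete the trailing "Label:" using the classification pattern it learned during pretraining.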

Few-shot

Few-shot learning — provide a few examples via the prompt and let the model imitate the pattern.

Commonly used for tasks with strict format requirements, like outputting JSON in a precise format.

It's also useful for tasks with ambiguous definitions. For example, if you ask a large model to summarize an article, it doesn't know what kind of summary you want; a few examples in the prompt teach it. Favor quality over quantity: around three examples is usually enough, and diversity among them matters more than raw count.
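The pattern can be sketched as a prompt builder that prepends a handful of labeled demonstrations before the real input; the wording and labels are illustrative:

```python
def few_shot_prompt(examples: list[tuple[str, str]], review: str) -> str:
    """Build a few-shot prompt: each (input, label) pair demonstrates
    the task pattern before the real input is presented."""
    lines = ["Classify each review as positive or negative.", ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    lines.append(f"Review: {review}")
    lines.append("Label:")  # the model completes this line
    return "\n".join(lines)

# Three diverse examples: quality and variety over quantity.
examples = [
    ("Fast shipping and works perfectly.", "positive"),
    ("Broke within a week, very disappointed.", "negative"),
    ("Decent for the price, but the manual is useless.", "negative"),
]
prompt = few_shot_prompt(examples, "The screen is gorgeous.")
```

The same structure works for format control: make each example's label the exact JSON shape you want, and the model will imitate it.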

Fine-tuning

Zero-shot and Few-shot fall under prompt engineering. Fine-tuning refers to using a prepared dataset to perform another round of training, updating model parameters to improve performance in specific domains. For instruction fine-tuning, datasets typically range from 500 to 5,000 examples.

It's generally used for scenarios with extremely strict format or style control, or domains with very narrow and deep expertise.

It can also be used to reduce model inference costs — replacing large model calls with a fine-tuned smaller model can reduce costs by 10-100x.
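An instruction fine-tuning dataset is commonly stored as JSONL, one example per line. A sketch of what a single entry might look like, using the widely used chat-style "messages" convention (the exact field names depend on the framework you train with):

```python
import json

# One training example per JSONL line. The "messages" schema shown here
# is a common convention for instruction tuning, not a universal standard.
examples = [
    {
        "messages": [
            {"role": "user", "content": "Classify: 'Great value, would buy again.'"},
            {"role": "assistant", "content": "positive"},
        ]
    },
]

jsonl = "\n".join(json.dumps(e, ensure_ascii=False) for e in examples)
```

A few hundred to a few thousand such lines, as noted above, is a typical dataset size for instruction fine-tuning.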

