When to Use This Guide
Use this guide when you want to get better, more reliable outputs from any large language model. Whether you are writing prompts for ChatGPT, Claude, Gemini, Llama, or any other model, the principles here apply universally. This is the foundation that makes every other prompt in this library work.
Prompt Engineering: How to Write Effective Prompts
Prompt engineering is the discipline of crafting inputs to large language models (LLMs) that produce reliable, high-quality outputs. It is not guesswork. It is a repeatable skill built on clear principles.
This page covers the core techniques, from basic structure to advanced strategies, with practical examples you can use immediately.
The Anatomy of a Great Prompt
Every effective prompt has five structural elements. Missing any of them degrades output quality.
| Element | What It Does | Example |
|---|---|---|
| Role | Sets the persona and expertise level | “You are a senior backend engineer” |
| Context | Provides background the model needs | “The codebase uses FastAPI with SQLAlchemy” |
| Task | States exactly what you want done | “Review this function for security vulnerabilities” |
| Format | Specifies the output structure | “Return a numbered list with severity ratings” |
| Constraints | Sets boundaries and rules | “Do not suggest changes to the database schema” |
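Put together, the example column from the table above reads as one complete prompt:

```
You are a senior backend engineer. The codebase uses FastAPI with
SQLAlchemy. Review this function for security vulnerabilities.
Return a numbered list with severity ratings. Do not suggest
changes to the database schema.
```

Each sentence maps to one element: role, context, task, format, constraint. A prompt missing any one of these forces the model to guess.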
Core Techniques
System vs User Messages
OpenAI models use a three-role message structure:
| Role | Purpose | When to Use |
|---|---|---|
| system | Sets persistent behavior, personality, rules | Once at the start. Put role, constraints, and format rules here |
| user | The human’s input | Every turn. Contains the actual task or question |
| assistant | The model’s response | Included in history for context continuity |
{
"model": "gpt-4o",
"messages": [
{"role": "system", "content": "You are a concise code reviewer. Reply only with the issue and fix. No explanations."},
{"role": "user", "content": "Review this: def get_user(id): return db.query(f'SELECT * FROM users WHERE id = {id}')"}
]
}
The system message persists across the entire conversation. Put all your rules, persona, and format instructions there.
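In code, that means the system message is appended once and every subsequent turn is accumulated after it. A minimal sketch of history management for a chat-completions-style API (the `send` callable is a placeholder for a real API call, not a real client method):

```python
# The system message is added once; user and assistant turns accumulate after it.
SYSTEM = {"role": "system",
          "content": "You are a concise code reviewer. Reply only with the issue and fix."}

history = [SYSTEM]

def ask(question: str, send=lambda msgs: "stub reply") -> str:
    """Append the user turn, get a reply, and record it as an assistant turn."""
    history.append({"role": "user", "content": question})
    reply = send(history)  # real code would call the chat API here
    history.append({"role": "assistant", "content": reply})
    return reply

ask("Review this: def get_user(id): ...")
ask("Now check the error handling.")
# history now holds: system, user, assistant, user, assistant
```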
Claude uses a system parameter (separate from messages) plus user and assistant turns:
{
"model": "claude-sonnet-4-20250514",
"system": "You are a concise code reviewer. Reply only with the issue and fix. No explanations.",
"messages": [
{"role": "user", "content": "Review this: def get_user(id): return db.query(f'SELECT * FROM users WHERE id = {id}')"}
]
}
Claude adheres closely to its system prompt, which makes it the best place for behavioral rules, output format, and constraints.
Gemini uses system_instruction at the model level:
{
"model": "gemini-2.0-flash",
"system_instruction": "You are a concise code reviewer. Reply only with the issue and fix.",
"contents": [
{"role": "user", "parts": [{"text": "Review this function..."}]}
]
}
Gemini also supports responseMimeType: "application/json" with a responseSchema for enforced structured output — no prompt tricks needed.
Open-source models (Llama, Mistral, etc.) use chat templates. The system prompt goes in the template header:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are a concise code reviewer. Reply only with the issue and fix.<|eot_id|>
<|start_header_id|>user<|end_header_id|>
Review this function...<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
System prompt adherence varies by model. Llama 3.3 70B follows system prompts reliably. Smaller models may need reinforcement in the user message.
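The template shown above can be rendered from a message list with a small helper. This is a hand-rolled sketch of the Llama 3 format for illustration; in practice, `tokenizer.apply_chat_template` from Hugging Face `transformers` handles this for you:

```python
def llama3_format(messages: list[dict]) -> str:
    """Render messages into the Llama 3 chat template, leaving the
    assistant header open so the model continues from there."""
    out = "<|begin_of_text|>"
    for m in messages:
        out += (f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
                f"{m['content']}<|eot_id|>")
    # Open assistant header: the model generates its reply after this.
    out += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return out

prompt = llama3_format([
    {"role": "system", "content": "You are a concise code reviewer."},
    {"role": "user", "content": "Review this function..."},
])
```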
Temperature and Sampling Parameters
Temperature controls randomness. It is the single most misunderstood parameter in prompt engineering.
| Temperature | Behavior | Best For |
|---|---|---|
| 0.0 | Near-deterministic. Same input almost always produces the same output (GPU nondeterminism can still cause tiny variations). | Code generation, factual Q&A, classification, data extraction |
| 0.3 - 0.5 | Slightly varied. Minor wording changes between runs. | Technical writing, summarization, structured analysis |
| 0.7 - 0.8 | Creative variation. Different approaches each run. | Marketing copy, brainstorming, creative writing |
| 1.0+ | High randomness. Unpredictable, sometimes incoherent. | Experimental ideation only. Rarely useful in production. |
Other important parameters:
| Parameter | What It Controls | Recommendation |
|---|---|---|
| top_p | Nucleus sampling: restricts sampling to the smallest token set whose cumulative probability reaches p | Keep at 1.0 unless you know what you are doing |
| max_tokens | Maximum output length | Set explicitly to avoid unexpected truncation. 4096 for long outputs |
| stop | Stop sequences: the model stops generating when it emits one | Useful for structured output: stop at \n\n or --- |
| frequency_penalty | Penalizes repeated tokens | 0.0 default. Increase to 0.3-0.5 for varied creative writing |
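Mechanically, temperature divides the model's logits before the softmax, so lower values sharpen the distribution toward the top token. A self-contained illustration:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature, then apply softmax. Lower T -> peakier."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # near one-hot on the top token
high = softmax_with_temperature(logits, 1.5)  # much flatter distribution
```

At T=0.2 the top token takes almost all the probability mass; at T=1.5 the alternatives stay live, which is exactly why high temperatures produce varied (and occasionally incoherent) output.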
Advanced Techniques
Technique: Prompt Chaining
Break complex tasks into a sequence of simpler prompts, where each prompt’s output feeds into the next.
Example — Research Report Pipeline:
Prompt 1 (Research): "Search for the top 5 competitors in the
AI writing assistant market. For each, list: name, pricing,
key features, and target audience."
Prompt 2 (Analysis): "Given this competitor data: [output from Prompt 1].
Identify the three biggest gaps in the market that no competitor
is addressing well."
Prompt 3 (Strategy): "Given these market gaps: [output from Prompt 2].
Write a positioning strategy for a new AI writing tool targeting
[specific audience]. Include messaging pillars and differentiation."
Why it works: Each prompt is simple and focused. The model does one thing well instead of juggling everything at once. Error rates drop dramatically.
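A chain like this is just sequential function composition. Sketched below with a stub `call_llm` standing in for a real API call (the function name and abbreviated prompts are illustrative):

```python
def call_llm(prompt: str) -> str:
    """Placeholder for a real API call; echoes a tag for demonstration."""
    return f"<output of: {prompt[:40]}...>"

def research_pipeline(audience: str) -> str:
    """Three focused prompts, each consuming the previous output."""
    competitors = call_llm(
        "Search for the top 5 competitors in the AI writing assistant market..."
    )
    gaps = call_llm(
        f"Given this competitor data: {competitors}\n"
        "Identify the three biggest gaps in the market."
    )
    strategy = call_llm(
        f"Given these market gaps: {gaps}\n"
        f"Write a positioning strategy targeting {audience}."
    )
    return strategy
```

A side benefit of this structure: each intermediate output can be logged and inspected, so when the pipeline fails you know exactly which step broke.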
Technique: Self-Consistency Verification
Ask the model to generate multiple answers and then evaluate which one is best. This reduces reasoning errors on complex tasks, since a single flawed chain of reasoning is unlikely to survive comparison against alternatives.
Answer the following question three different ways, using different
reasoning approaches each time. Then compare your three answers
and select the one you are most confident in. Explain why.
Question: Should a 50-person startup adopt microservices or
a modular monolith?
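When you control the API directly, the same idea can be automated: sample several answers at a nonzero temperature and take the most common one. A minimal sketch, where `sample_llm` is a stand-in for a real sampling call:

```python
from collections import Counter

def self_consistent_answer(question, sample_llm, n=5):
    """Sample n answers and return the most frequent one (majority vote),
    plus the agreement ratio as a rough confidence signal."""
    answers = [sample_llm(question) for _ in range(n)]
    answer, count = Counter(answers).most_common(1)[0]
    return answer, count / n

# Demo with a canned sampler that disagrees once.
canned = iter(["monolith", "monolith", "microservices", "monolith", "monolith"])
answer, confidence = self_consistent_answer(
    "Should a 50-person startup adopt microservices or a modular monolith?",
    lambda q: next(canned),
)
```

Majority voting works best when answers can be normalized to a comparable form (a label, a number, a short phrase); for free-form prose, the in-prompt "compare and select" version above is the practical option.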
Technique: Constraint Stacking
Layer constraints to progressively narrow the output space. Each constraint eliminates a category of bad output.
Write a product description for wireless headphones.
Constraints:
1. Exactly 3 paragraphs
2. First paragraph: the problem (max 2 sentences)
3. Second paragraph: the solution with 3 specific features
4. Third paragraph: social proof + CTA
5. Reading level: 8th grade (Flesch-Kincaid)
6. No superlatives (no "best", "amazing", "incredible")
7. Include one specific number (battery life, weight, or price)
Each constraint eliminates a category of generic output. The result is specific and structured.
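Stacked constraints have another advantage: they make the output machine-checkable. A sketch of a validator for a few of the rules above (the banned-word list and checks are illustrative, not exhaustive):

```python
import re

BANNED = {"best", "amazing", "incredible"}

def check_description(text: str) -> list[str]:
    """Return a list of violated constraints (empty list = passes)."""
    problems = []
    paragraphs = [p for p in text.split("\n\n") if p.strip()]
    if len(paragraphs) != 3:
        problems.append(f"expected 3 paragraphs, got {len(paragraphs)}")
    words = {w.lower().strip(".,!") for w in text.split()}
    if words & BANNED:
        problems.append(f"contains superlatives: {sorted(words & BANNED)}")
    if not re.search(r"\d", text):
        problems.append("no specific number found")
    return problems

draft = ("Tangled cables ruin commutes.\n\n"
         "Our headphones fix that.\n\n"
         "Join 10,000 listeners today.")
```

If validation fails, feed the violation list back to the model as a correction prompt; this retry loop is often cheaper than over-engineering the original prompt.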
Technique: Meta-Prompting
Use the AI to improve your prompts. This is recursive prompt engineering.
I want to write a prompt that generates SQL queries from natural
language questions. Here is my current prompt:
[paste your prompt]
Analyze this prompt for weaknesses. Then rewrite it to be more
robust, handling edge cases like ambiguous table names, aggregate
queries, and JOINs across 3+ tables.
This is especially useful when a prompt works 80% of the time but fails on edge cases. The model can often identify the failure patterns you missed.
Technique: Output Scaffolding
Give the model a partial output structure and ask it to fill in the blanks. This makes format compliance far more reliable than describing the format in prose.
Fill in the sections marked [FILL] based on the codebase analysis:
## Architecture Overview
[FILL: 2-3 sentence summary of the system architecture]
## Tech Stack
| Layer | Technology | Version |
|-------|-----------|---------|
| Frontend | [FILL] | [FILL] |
| Backend | [FILL] | [FILL] |
| Database | [FILL] | [FILL] |
## Key Design Decisions
1. [FILL: Decision about data storage] — Rationale: [FILL]
2. [FILL: Decision about auth] — Rationale: [FILL]
3. [FILL: Decision about deployment] — Rationale: [FILL]
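A scaffold pairs naturally with a post-check that every slot was actually filled. A minimal sketch (the abbreviated scaffold here is illustrative):

```python
import re

SCAFFOLD = """## Architecture Overview
[FILL: summary]

## Tech Stack
| Frontend | [FILL] | [FILL] |
"""

def unfilled_slots(text: str) -> list[str]:
    """Return any [FILL...] markers still present in the model's output."""
    return re.findall(r"\[FILL[^\]]*\]", text)

completed = (SCAFFOLD
             .replace("[FILL: summary]", "A FastAPI monolith.")
             .replace("[FILL]", "React"))
```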
Common Mistakes
Avoid these patterns. They are the most frequent causes of poor output.
| Mistake | Why It Fails | Fix |
|---|---|---|
| “Be creative” | Too vague. The model has no constraints to work within. | Specify the type of creativity: “Write 3 alternative headlines using metaphor” |
| Wall of text prompt | The model loses focus in long, unstructured paragraphs | Use sections, headers, numbered lists, and labeled blocks |
| No examples | The model guesses your preferred format | Add 2-3 examples of the exact output you want |
| Asking for perfection | “Write the perfect email” gives the model no concrete target to hit | Ask for a specific draft: “Write a 3-paragraph cold email following the AIDA framework” |
| Contradictory instructions | “Be concise but thorough” paralyzes the model | Choose one priority and be explicit: “Prioritize brevity. Max 200 words.” |
| Ignoring the system prompt | Putting everything in the user message wastes the system prompt | Move rules, persona, and format to system. Keep task in user message. |
| No output format spec | The model picks a random format each time | Always specify: JSON, markdown, table, numbered list, etc. |
| Assuming context | The model does not remember previous sessions | Include all relevant context in each prompt. Models are stateless between sessions. |
Quick Reference: Prompt Template
Use this template as a starting point for any new prompt:
ROLE:
[Who is the AI in this context? What expertise does it have?]
CONTEXT:
[What background does the model need? What has happened before?]
TASK:
[What exactly should the model do? Be specific.]
FORMAT:
[How should the output be structured? Give an example if possible.]
CONSTRAINTS:
[What should the model NOT do? What are the boundaries?]
EXAMPLES (optional):
[Input/output pairs showing the pattern you want]
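As a sketch, the template can be filled programmatically so every prompt in a codebase carries all five sections (the function and field names here are illustrative):

```python
def build_prompt(role: str, context: str, task: str,
                 format_spec: str, constraints: str) -> str:
    """Assemble the five structural elements into a single prompt string."""
    sections = [
        ("ROLE", role),
        ("CONTEXT", context),
        ("TASK", task),
        ("FORMAT", format_spec),
        ("CONSTRAINTS", constraints),
    ]
    return "\n\n".join(f"{label}:\n{text}" for label, text in sections)

prompt = build_prompt(
    role="You are a senior backend engineer.",
    context="The codebase uses FastAPI with SQLAlchemy.",
    task="Review this function for security vulnerabilities.",
    format_spec="Return a numbered list with severity ratings.",
    constraints="Do not suggest changes to the database schema.",
)
```

Making the sections required function arguments means a prompt cannot ship with one of the five elements silently missing.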
Related Pages
Technical Documentation Prompt
Generate comprehensive developer docs from any codebase using a structured 5-phase process.
User Documentation Prompt
Create end-user help centers and knowledge bases organized around user journeys.
Landing Page Audit Prompt
Run a structured conversion audit on any landing page with evidence-based recommendations.
Market Research Prompt
Turn any AI with web search into a professional market research analyst.
Web Scraping Framework Prompt
Build production-grade web scrapers that output clean markdown for LLM consumption.