In 2023, a job posting from Anthropic advertised a "Prompt Engineer and Librarian" position with a salary range of $175,000-$335,000. The posting went viral, generating equal parts fascination and scepticism. Fascination because the salary range was startling for a role that seemed, on its surface, to involve writing instructions to a chatbot. Scepticism because critics questioned whether prompt engineering was a real discipline or an artefact of a transitional moment in AI development that would disappear as models improved.

Both reactions were partially right. Prompt engineering is a real and consequential skill in the current moment of AI deployment. The gap between an ad hoc interaction with a large language model and a carefully engineered system prompt — with examples, guardrails, and structured output requirements — can be the difference between a product that works reliably and one that fails unpredictably. At the same time, the most sceptical critics have a point: the portions of prompt engineering that involve manually crafting better phrasing are genuinely being automated by model improvements, and the long-term trajectory of the role is uncertain in ways that most in-demand technical skills are not.

This article provides an honest account of what prompt engineers actually do, which parts of the job description are substantive and which are marketing, what the genuine salary market looks like (rather than the top of the range cited in viral posts), who actually hires for this role, and the unresolved question of whether prompt engineering is a durable career or a transitional specialisation.

"The most important skill in prompt engineering is not knowing the perfect phrasing — it is knowing how to systematically test what works and why." — Common observation in applied AI circles


Key Definitions

Large language model (LLM): A neural network trained on large text datasets that can generate text, answer questions, summarise documents, write code, and perform many other language-based tasks. Examples include GPT-4, Claude, Gemini, and Llama.

System prompt: Instructions given to an LLM before the user conversation begins, typically invisible to the end user. The system prompt establishes the model's persona, constraints, output format, and behavioural guidelines for an application. System prompt design is the foundational skill of prompt engineering.

Few-shot prompting: A prompting technique in which examples of desired input-output pairs are included in the prompt, teaching the model the pattern to follow through demonstration rather than instruction. More reliable than zero-shot instruction for complex tasks.

Chain-of-thought prompting: A technique that asks the model to show its reasoning step-by-step before providing a final answer, which consistently improves performance on complex reasoning tasks. Introduced by Wei et al. at Google in 2022 and now a standard technique.

Retrieval-augmented generation (RAG): An architecture that combines an LLM with a retrieval system, allowing the model to access relevant documents or data at query time rather than relying only on information in its training data. Prompt engineering is a significant component of RAG system design.

Hallucination: A phenomenon where an LLM generates factually incorrect information with apparent confidence. Managing hallucination risk through prompt design and evaluation is one of the most important functions of a prompt engineer in production systems.


Prompt Engineer Salary by Employer Type (US, 2024)

Employer Type Role Title Total Compensation
AI model companies (Anthropic, OpenAI, Cohere) Prompt Engineer / AI Safety Researcher $175,000-$335,000
Large tech companies (Google, Microsoft, Meta) Applied AI Specialist / LLM Application Developer $130,000-$220,000
AI-native startups AI Product Engineer / Prompt Specialist $100,000-$180,000
Consulting firms (Accenture, Deloitte, McKinsey) AI Consultant / AI Implementation Specialist $90,000-$150,000
Enterprise organisations (healthcare, finance, legal) AI Operations Analyst / AI Content Specialist $70,000-$120,000
Freelance / contract Prompt Engineer (specialist domains) $30-$200/hour

The $175,000-$335,000 range represents the very top of the market. Glassdoor 2024 data shows the US median for roles explicitly titled "prompt engineer" at approximately $85,000-$110,000, with most roles sitting in enterprise and consulting contexts rather than AI company contexts.


What a Prompt Engineer Actually Does

The role is sufficiently new and varied that a clear, universal job description does not exist. But the core activities that appear across prompt engineering roles can be grouped into several categories.

System Prompt Design and Iteration

The most fundamental task is designing the system prompts that govern how an AI application behaves. A system prompt for a customer service chatbot might specify the company's tone of voice, the products the model should and should not discuss, how to handle complaints, what to say when it does not know the answer, and the format of responses. A system prompt for a medical documentation assistant might specify clinical terminology standards, privacy requirements, handling of uncertainty, and escalation procedures.

This is more than writing; it is a form of specification work. A poorly designed system prompt produces an AI application that behaves inconsistently, hallucinates confidently, refuses reasonable requests, or fails to enforce important constraints. The iteration process is the core of the work: prompt engineers test their prompts against large and varied input sets — adversarial inputs, edge cases, unusual phrasings, multilingual queries — and observe where behaviour breaks down. This cycle is closer to software testing methodology than to creative writing.

Evaluation Framework Development

One of the most underappreciated aspects of prompt engineering is building the systems used to evaluate whether prompts are working. Human evaluation of LLM outputs does not scale: if a model processes thousands of queries per day, a person cannot review each response. Prompt engineers design evaluation frameworks — sets of test cases, automated scoring criteria, and human rating protocols — that allow systematic measurement of output quality.

Evaluation design is technically demanding. What counts as a correct response to an open-ended question? How do you measure consistency across paraphrased versions of the same query? How do you detect subtle failures in instruction-following? The quality of an evaluation framework directly determines how much reliable signal the prompt engineer has to work with during iteration.

RAG System Design and Optimisation

Many enterprise AI applications use retrieval-augmented generation: the LLM is connected to a knowledge base of documents, and at query time, relevant passages are retrieved and included in the prompt context. The performance of these systems depends heavily on prompt design decisions: how retrieved documents are formatted for inclusion, how the model is instructed to use versus override retrieved information, how uncertainty and contradictions in retrieved content are handled.

This is substantive technical work requiring understanding of both LLM behaviour and retrieval system architecture. Prompt engineers working on RAG systems typically collaborate closely with ML engineers who build the retrieval components.

Red Teaming and Safety Testing

Prompt engineers at AI companies and large enterprise deployers spend significant time on adversarial testing — attempting to elicit harmful, deceptive, or policy-violating outputs from the models they are working with. This involves designing prompts intended to bypass safety measures, identifying jailbreak patterns, and stress-testing the robustness of guardrails.

The findings from red teaming feed directly into improved system prompt design and, for AI companies, into model training and safety research. This work requires creativity, methodical thinking, and willingness to systematically explore uncomfortable edge cases.

Documentation and Knowledge Management

The "Librarian" component of Anthropic's 2023 posting points to a genuine operational need: as organisations deploy AI systems at scale, the prompt library — the collection of system prompts, few-shot examples, evaluation cases, and documented design decisions — becomes a significant knowledge asset. Prompt engineers maintain version-controlled prompt libraries, create guidelines for how prompts should be structured and tested, and train colleagues who are beginning to work with AI systems.


Who Actually Hires Prompt Engineers

AI product companies are the most explicit employers — they hire people whose primary job is improving the prompts that make their AI products work. This category includes both model providers and AI-native startups building on top of those models.

Healthcare technology companies are increasingly deploying AI for clinical documentation, prior authorisation, patient communication, and diagnostic support — all domains where reliable prompt design is critical and errors have real consequences.

Legal technology firms use AI for contract analysis, legal research, and document review, where precision in prompt design determines whether the system produces usable output or hallucinations that waste lawyer time.

Marketing technology platforms deploy AI for content generation, personalisation, and campaign analysis — and need prompt engineers to maintain output quality and brand consistency across millions of automated interactions.

Financial services firms use AI for document analysis, compliance checking, customer communication, and research — domains with regulatory constraints that make reliable prompt behaviour especially important.

Government agencies in the US, UK, and EU are beginning to deploy AI for public service delivery and regulatory analysis, creating demand for practitioners who understand both the technical and governance dimensions of prompt design.


Is Prompt Engineering a Long-Term Career?

This is the question the field genuinely cannot answer yet.

The case for durability: As AI systems become more embedded in every industry, the skill of reliably directing those systems toward specific outcomes will become more valuable. The most sophisticated prompt engineering work — evaluation framework design, red teaming, RAG system optimisation — is already being recognised as ML engineering territory. People developing this deeper technical capability have a strong career trajectory.

The case for transitioning: Model improvements are steadily reducing the gap between naive and optimised prompts for well-defined tasks. Automatic prompt optimisation research — methods that tune prompts algorithmically rather than manually — is advancing. The simplest prompt engineering work is increasingly doable by domain experts with no specialised training.

The most likely scenario: The narrow definition of prompt engineering — the standalone craft of writing better prompts — will consolidate into broader roles rather than growing as an independent profession. The broader definition — applied AI system design including evaluation, testing, and reliability engineering — will grow and be called something else: applied AI engineer, AI product engineer, LLM application developer.

People entering the field with this understanding are better positioned than those who treat prompt engineering as a permanent destination rather than a platform for building adjacent AI engineering skills.


Skills Needed

Strong written communication: The foundation of the work. Precision in language, understanding of ambiguity, and the ability to write instructions that are interpreted consistently across varied inputs.

Systematic thinking: The ability to design experiments, identify variables, and evaluate results objectively. Prompt engineering without systematic testing is iterative guessing.

LLM API familiarity: Understanding how to use OpenAI, Anthropic, Hugging Face, and other APIs programmatically — including parameters like temperature, top-p, and context window management.

Basic Python: Scripting ability to automate prompt testing, process outputs at scale, and integrate with data pipelines. Not deep software engineering, but functional programming ability.

Domain expertise: In most enterprise applications, prompt engineering without domain knowledge produces generic results. Healthcare prompt engineers who understand clinical workflows are significantly more effective than generalists.


Practical Takeaways

Prompt engineering as currently practiced is most valuable as a component of broader AI system development skills rather than as a standalone specialisation. The most employable practitioners combine prompt expertise with ML engineering, product management, or deep domain knowledge.

If you are entering this space, prioritise building systematic evaluation skills over polishing individual prompt phrasing. The ability to measure whether your prompts work — rigorously, at scale, and against adversarial cases — is what separates practitioners who create reliable AI systems from those who create impressive demos.

Treat the current high-profile salaries at AI companies as an indicator of market demand for the skill set, not as a reliable baseline for what the role pays broadly. Build the technical depth that makes you employable across multiple AI roles, not just the one that had a viral job posting.


References

  1. Anthropic. "Prompt Engineer and Librarian Job Posting." Anthropic.com, 2023.
  2. Brown, T. et al. "Language Models are Few-Shot Learners." Advances in Neural Information Processing Systems 33, 2020.
  3. Wei, J. et al. "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models." Advances in Neural Information Processing Systems 35, 2022.
  4. Lewis, P. et al. "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks." Advances in Neural Information Processing Systems 33, 2020.
  5. Ouyang, L. et al. "Training Language Models to Follow Instructions with Human Feedback." Advances in Neural Information Processing Systems 35, 2022.
  6. Perez, E., Kiela, D., & Cho, K. "True Few-Shot Learning with Language Models." Advances in Neural Information Processing Systems 34, 2021.
  7. Anthropic. "The Claude Model Card." Anthropic.com, 2024.
  8. OpenAI. "GPT-4 Technical Report." OpenAI.com, 2023.
  9. Glassdoor. "Prompt Engineer Salary Data." Glassdoor.com, 2024.
  10. LinkedIn Economic Graph. "Emerging Jobs in AI." LinkedIn, 2024.
  11. White, J. et al. "A Prompt Pattern Catalog to Enhance Prompt Engineering with ChatGPT." arXiv:2302.11382, 2023.
  12. Zhou, Y. et al. "Large Language Models Are Human-Level Prompt Engineers." arXiv:2211.01910, 2022.

Frequently Asked Questions

What does a prompt engineer actually do?

A prompt engineer designs, tests, and iterates the system prompts and evaluation frameworks that make AI applications behave reliably. The work is closer to QA and applied AI research than to creative writing.

How much does a prompt engineer earn?

AI company specialists earn \(175,000-\)335,000, but Glassdoor 2024 data shows the broader median at \(85,000-\)110,000, with most roles sitting in enterprise or consulting contexts rather than at Anthropic or OpenAI.

Who hires prompt engineers?

AI model companies hire most explicitly, but healthcare tech, legal tech, marketing platforms, and financial services firms all hire people with prompt engineering skills, usually as part of broader AI implementation roles.

Is prompt engineering a long-term career or a transitional role?

The narrow craft of writing better prompts is likely to consolidate into broader AI engineering roles. The deeper skills — evaluation design, RAG optimisation, red teaming — have strong long-term demand under different job titles.

What skills do you need to become a prompt engineer?

Strong written communication, systematic testing methodology, LLM API familiarity, basic Python scripting, and domain expertise in the relevant industry. Evaluation framework design is the most underrated skill to develop.