AI Red Teamer, LLM Generalist

Handshake (Recruitment Platform company)

Seattle, United States (Mid level)

Data & AI

About the role

Stress-test large language models to ensure AI safety and robustness.

• As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them.
• Your work directly supports AI safety and model robustness for leading research labs.

Key Responsibilities

• Craft creative prompts and multi-turn scenarios to stress-test AI guardrails across diverse risk categories
• Discover ways around safety filters, restrictions, and defenses using jailbreak, evasion, and prompt injection techniques
• Explore edge cases to provoke disallowed, harmful, or incorrect outputs
• Evaluate and score model responses against structured harm taxonomies and severity rubrics
• Document experiments clearly, including what you tried, why you tried it, and what it revealed

Requirements

• Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
• Intuition for crafting adversarial prompts; familiarity with jailbreak or evasion techniques is a strong plus
• Creative, adversarial problem-solving skills
• Clear and thoughtful written communication
• Strong ethical judgment and the ability to separate adversarial thinking from personal values

Tech: LLMs, OpenAI API, Anthropic API, Gemini API

Level: Mid