AI Red Teamer, LLM Generalist
Handshake
Recruitment Platform company
Seattle, United States
Mid
Data & AI
About the role
Stress-test large language models to ensure AI safety and robustness.
- As an AI Red Teamer, you will stress-test large language models by intentionally trying to break them.
- Your work directly supports AI safety and model robustness for leading research labs.
Key Responsibilities
- Craft creative prompts and multi-turn scenarios to stress-test AI guardrails across diverse risk categories
- Discover ways around safety filters, restrictions, and defenses using jailbreak, evasion, and prompt injection techniques
- Explore edge cases to provoke disallowed, harmful, or incorrect outputs
- Evaluate and score model responses against structured harm taxonomies and severity rubrics
- Document experiments clearly, including what you tried, why you tried it, and what it revealed
Requirements
- Strong hands-on experience using multiple LLMs (ChatGPT, Claude, Gemini, open-source models, etc.)
- Intuition for crafting adversarial prompts; familiarity with jailbreak or evasion techniques is a strong plus
- Creative, adversarial problem-solving skills
- Clear and thoughtful written communication
- Strong ethical judgment and the ability to separate adversarial thinking from personal values
Tech stack
LLMs, OpenAI API, Anthropic API, Gemini API
Match insights
Tech: LLMs, OpenAI API, Anthropic API, Gemini API
Level: Mid