As an AI QA Engineer (Multilingual) at Scaled Cognition, you will be the last line of defense for our model's quality. You'll sit at the critical intersection of data engineering, quality assurance, and linguistics, ensuring our LLM training data and evaluation sets are flawless. You'll get your hands dirty: meticulously inspecting data and making direct code contributions to fix issues. If you love the idea of turning messy, imperfect data into gold and have the technical chops to automate parts of that cleanup, you will thrive here.
What you'll do:
- Meticulously inspect, review, and grade LLM training data, evaluation test cases, and model outputs to ensure quality and accuracy.
- Maintain local development environments to run test pipelines, investigate edge cases, and submit PRs via Git/GitHub to update our training repositories.
- Act as a technical data detective, diving deep into training data to spot and triage error cases.
- Leverage LLMs as internal tools to translate, verify, and maintain our cross-lingual datasets.
- Collaborate closely with the engineering team to refine our evaluation criteria and improve our data pipelines.
You might be the right person for the job if you:
- Have an obsessive attention to detail and get a dopamine hit from finding the one edge case or bad translation that broke a prompt.
- Are a builder who doesn't mind the weeds. You understand that high-quality AI is built on rigorous, sometimes repetitive data inspection, and you embrace that reality.
- Are technically self-sufficient. You're comfortable navigating a terminal, running Python scripts locally, and managing your own version control.
- Love languages and understand the linguistic nuances required for high-quality translation and cross-lingual model evaluation.
- Thrive in a fast-paced environment where you can take ownership of the data quality that directly drives model performance.
Key qualifications:
- Strong technical background with hands-on coding experience (Python preferred) and proficiency with Git/GitHub.
- Fluency in English and native or near-native proficiency in at least one other language.
- Deep understanding of Large Language Models, their failure modes (hallucinations, formatting errors), and effective prompting techniques.
- Proven experience in Quality Assurance, Data Quality, or Data Engineering, with a track record of auditing and maintaining large datasets.
- Exceptional written communication skills across multiple languages.