Pangram Labs is hiring an AI Tooling and Evaluation Engineer to join our AI research team. You'll work directly with research scientists and engineers to build the internal tools, platforms, and systems that power the model development lifecycle, from data collection and annotation through evaluation and deployment monitoring.
This is a full-stack software engineering role with a research support focus. You will spend most of your time building web applications, APIs, and data pipelines, but you'll need enough ML fluency to understand what the research team needs and why.
Responsibilities
- Build evaluation infrastructure for standardized model assessments and sign-off, including dashboards and reporting tools to track performance over time
- Maintain our Label Studio deployment for data annotation campaigns
- Build browser-based interfaces for exploratory data analysis on training datasets
- Build research support systems to monitor customer usage of the platform
- Improve and support synthetic data generation workflows
Requirements
- B.S. or M.S. (or equivalent) in Computer Science or a related field
- 3+ years of experience as a software engineer, with strong full-stack skills
- Proficiency in Python and at least one modern frontend framework
- Experience building internal tools, admin panels, and dashboards
- Comfortable with databases, REST APIs, and basic DevOps
- Familiarity with ML concepts
- Ability to work autonomously and translate ambiguous research needs into concrete software solutions
Nice to have
- Experience with ML experiment tracking tools such as Weights & Biases
- Experience with Label Studio or other annotation platforms
- Background in AI/NLP research
- Experience with data visualization libraries
- Experience with evaluation and/or QA of large-scale ML systems
What we offer
- In-person role in our Downtown Brooklyn office; occasional remote work is okay
- Flexible hours