
Python Infrastructure Engineer - Model Evaluation
Job Description
Posted on: May 26, 2026
Python Infrastructure Engineer - Model Evaluation (AI Training)
About The Role
What if your Python expertise could directly shape how the world's most advanced AI models are built, evaluated, and improved? We're looking for a Senior Python Infrastructure Engineer to design and build the data pipelines, evaluation harnesses, and annotation tooling that power next-generation AI systems at leading research labs. This is a fully remote contract role with serious technical depth: the kind of work that ships to production and influences model quality at scale.
- Organization: Alignerr
- Type: Hourly Contract
- Location: Remote
- Commitment: 20–40 hours/week
What You'll Do
- Design, build, and optimize high-performance Python systems supporting AI data pipelines and model evaluation workflows
- Develop full-stack tooling and backend services for large-scale data annotation, validation, and quality control
- Build and maintain evaluation harnesses that integrate with inference frameworks and benchmarking pipelines
- Improve reliability, performance, and safety across existing Python codebases
- Instrument systems with observability tooling and metrics collection to monitor model performance and system health
- Identify bottlenecks and edge cases in data and system behavior, and implement scalable, maintainable fixes
- Collaborate with data, research, and engineering teams through synchronous design reviews and async communication
Who You Are
- Native or fluent English speaker with clear written and verbal communication skills
- 3–5+ years of professional experience writing production-grade Python
- Full-stack developer with a strong systems programming background
- Experienced in building evaluation harnesses for ML models and integrating them with inference frameworks
- Strong grasp of observability, metrics collection, and system reliability practices
- Able to commit 20–40 hours per week with consistent availability
Nice to Have
- Prior experience with data annotation pipelines, data quality systems, or model evaluation infrastructure
- Familiarity with AI/ML workflows, model training, or benchmarking frameworks
- Experience with distributed systems or internal developer tooling
- Background working directly with AI labs or ML research teams
Why Join Us
- Work on real production systems at the frontier of AI development alongside leading research labs
- Fully remote and flexible — work from wherever you do your best work
- Freelance autonomy with the structure of high-impact, technically challenging projects
- Make a direct, measurable contribution to how next-generation AI models are evaluated and improved
- Potential for ongoing work and contract extension as new projects launch

