Senior ML Engineer
The client has satellite-derived observations of real-world events, ground-truthed at the property level across geographies. They are turning this asset into production risk models that are calibrated, validated, and built to withstand rigorous external scrutiny. They are now looking for a Senior ML Engineer to own the training and calibration infrastructure for these models. You will work alongside scientists and analytics leads who define the modelling problem; your responsibility is to ensure the models train correctly, efficiently, and reproducibly, with access to the compute resources the task requires. Producing well-calibrated outputs is central to this role: scores that are statistically meaningful and externally defensible, not just good at ranking.
Start: whenever possible, flexible
Duration: end of year, extension possible
Location: Espoo
Work type: Hybrid
Language: English
Allocation: 100%
Responsibilities:
- Training Infrastructure: Set up and maintain a reproducible ML environment across the compute spectrum: local development, GPU cloud (AWS), and HPC; ensure training is fast, consistent, and repeatable
- Model Training: Scale an existing training pipeline from research prototype to production, covering large labelled datasets and scoring across millions of properties
- Calibration: Implement and validate probability calibration, ensuring model outputs are statistically meaningful and externally defensible, not just good at ranking
- Experimentation: Build a rigorous experimentation framework with reproducible runs and clear data, feature, and model provenance; design validation strategies appropriate for geospatial data, and drive systematic hyperparameter optimisation and model selection
- ML/ModelOps: Manage experiment tracking, model versioning, and artifact lineage; maintain clean, reliable training and scoring pipelines for reproducible deployment
- Documentation: Produce model documentation that satisfies external technical review
- Communication: Communicate results clearly to non-technical stakeholders
- Collaboration: Work closely with Data Engineers to build reliable, scalable training and scoring pipelines, and with Data Scientists to ensure features, labels, evaluation metrics, and calibration approaches are scientifically sound and production-ready
Requirements (must haves):
- Education: Master's degree or higher in computer science, machine learning, statistics, applied mathematics, or related quantitative field
- Experience: 5+ years of professional industry experience training ML models in production settings, with significant experience optimising model performance for large-scale datasets, including training and inference (e.g., parallelisation, distributed execution, or GPU acceleration)
- Calibration: Hands-on experience with probability calibration; you have debugged calibration curves and know when they break and why
- Evaluation: Strong grasp of evaluation for imbalanced classification: beyond accuracy, into calibration metrics and ranking quality
- Optimisation: Systematic hyperparameter optimisation at scale; experience with automated search frameworks
- ML/ModelOps: Experiment tracking, model registry, and artifact management in practice, not just in theory, including reproducibility, versioning, and reliable model deployment workflows
- Foundations: Strong Python, Pandas / NumPy / scikit-learn; cloud compute experience (AWS) with GPU instances and distributed training or inference workloads
- Modern Tooling: Pragmatic use of AI tooling (Cursor, Claude, Copilot) as a core part of the development workflow
Nice to haves:
- Experience shipping ML-powered features in a product development context (agile, CI/CD, production monitoring), not just research or offline analysis
- Spatial cross-validation: you know why random CV leaks in geospatial problems
- Uncertainty quantification: quantile regression, conformal prediction
- HPC experience (LUMI, SLURM-based clusters)
- Databricks ML Runtime, AWS RDS/Aurora, or PostGIS experience
- Insurance, catastrophe modelling, or climate risk vocabulary
- Tabular deep learning (TabNet, FT-Transformer) as comparison baselines
- Tech stack: Python, gradient boosting libraries, experiment tracking tooling, cloud compute (GPU)
Please note that the client will do a security screening (incl. SUPO, where required) for the chosen candidate.
Interested? Please contact Lisa_Witted in Slack / lisa.sandstrom@witted.com