Open position
Engineering Architect, AI & HPC Compilation
Why this role exists
Every AI silicon company ships a compiler. Almost none of them ship a good one. The gap between what MLIR can represent and what custom hardware can execute is where performance goes to die — and most internal teams lack the compiler depth to close it. They build a functional pipeline that leaves 40% of the hardware’s capability unused, declare victory, and move on.
The same gap exists in HPC. Scientific codes — whether written in Fortran, C++, or increasingly Julia — push hardware to its theoretical limits. The compilation path from a high-level computation through MLIR or LLVM to machine code that saturates the memory subsystem and fills every vector lane is where the real performance engineering happens.
We don’t stop at functional. VRULL builds the MLIR and LLVM pipelines, the graph-lowering strategies, and the kernel optimisations that extract what the silicon was actually designed to deliver. We work at the boundary that most teams avoid: where ML framework semantics, HPC runtime requirements, compiler IR, and hardware constraints all collide.
This is compilation work that doesn’t exist in textbooks yet. Custom matrix extensions, non-standard data types, inference pipelines and simulation kernels that need to hit latency and throughput targets on hardware that’s still in simulation. AI-assisted workflows let you iterate at the speed the problem demands — exploring lowering strategies, generating kernel variants, prototyping passes — while your architectural understanding ensures the output is correct.
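To give a flavour of the kind of transformation this involves: a toy, purely illustrative sketch of tiling a matrix multiply, the sort of loop restructuring an MLIR lowering pass performs on the way from framework IR to hardware-shaped code. The tile size and both functions here are hypothetical examples, not VRULL's pipeline.

```python
def matmul_naive(A, B, M, N, K):
    """High-level semantics: C[i][j] = sum over k of A[i][k] * B[k][j]."""
    C = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            for k in range(K):
                C[i][j] += A[i][k] * B[k][j]
    return C

def matmul_tiled(A, B, M, N, K, T=2):
    """Same computation after a tiling 'lowering': each loop is split so a
    T x T tile can stay resident in a hypothetical register/cache budget."""
    C = [[0.0] * N for _ in range(M)]
    for i0 in range(0, M, T):
        for j0 in range(0, N, T):
            for k0 in range(0, K, T):
                # Intra-tile loops; min() handles edge tiles when the
                # problem size is not a multiple of T.
                for i in range(i0, min(i0 + T, M)):
                    for j in range(j0, min(j0 + T, N)):
                        for k in range(k0, min(k0 + T, K)):
                            C[i][j] += A[i][k] * B[k][j]
    return C
```

The two functions compute identical results; the tiled form is the one a compiler must prove equivalent and then map onto matrix extensions, vector lanes, and the memory hierarchy of the target.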
What you’ll do
- Design and build MLIR-based compilation pipelines for custom AI and HPC silicon — from framework IR to hardware-specific code generation
- Develop graph-lowering strategies and kernel optimisations that close the gap between what models and scientific codes need and what hardware provides
- Work with ISA design teams to prove that proposed extensions are compilable and that the compiler can actually exploit them — for both AI inference and HPC workloads
- Build the compilation infrastructure for matrix-computing extensions, custom instructions, and non-standard data types
- Bridge MLIR and LLVM: ensure that high-level optimisation decisions carry through to the backend code generation that matters
- Explore compilation paths for modern languages — Julia’s type-specialised compilation model is a natural fit for hardware-aware code generation
- Contribute to MLIR and LLVM upstream and maintain presence in both compiler communities
What we’re looking for
- Deep experience with MLIR and/or LLVM — dialects, passes, lowering pipelines, not just usage
- Understanding of AI framework internals (PyTorch, TensorFlow) and/or HPC runtime patterns — how computation is represented at the top of the stack and how that maps to hardware
- The ability to trace a performance problem from a model or simulation kernel through the compilation pipeline to the generated machine code
- Experience with at least one hardware target’s ISA at the level needed to write code generation
- Interest in modern language compilation — Julia, domain-specific languages, and the compilation models that break the Fortran/C++ duopoly in HPC
- Active engagement with the MLIR and LLVM communities — contributions, conference talks, working-group participation
What sets you apart
- Experience building compilation pipelines for custom or pre-silicon hardware
- Knowledge of quantisation, sparsity, mixed-precision compilation, or HPC-specific optimisations (stencil codes, FFT, sparse linear algebra)
- A track record of closing the gap between “functionally correct” and “actually fast” on real AI or HPC workloads
- Familiarity with Julia’s compiler internals or experience with Flang/gfortran on performance-critical HPC codes
- The ability to work at the intersection of domains that most engineers know only one of: ML/HPC frameworks, MLIR/LLVM infrastructure, and hardware architecture
Most AI compiler roles are about maintaining an existing pipeline. This one is about building pipelines for hardware that doesn’t exist yet — for workloads that span inference, training, and scientific simulation — and making them good enough that the hardware is worth building.
Interested in this role?
Send your CV and a note about why this role interests you to careers@vrull.eu.