Kernel Developer

About the Role

We are looking for a C/C++ engineer to design and optimize high-performance compute kernels for AI accelerators.

This is not Linux kernel development, and it does not involve drivers, operating systems, or kernel modules.
In this context, “kernels” are user-space compute kernels: the routines that implement tensor operations in deep learning workloads.

You will work at the lowest level of AI performance engineering — where software meets specialized hardware.

Responsibilities

Design and implement high-performance compute (operator) kernels in C/C++
Develop core tensor operations
Optimize performance for AI accelerators (latency, throughput, and efficiency)
Apply low-level optimization techniques
Profile, benchmark, and tune kernels to eliminate performance bottlenecks
Contribute to internal libraries and runtime systems for AI workloads

Requirements

Strong proficiency in C/C++
Experience with performance-critical software development
Strong understanding of low-level optimization techniques
Understanding of CPU/GPU or accelerator architecture fundamentals
Ability to analyze and debug complex systems
Experience working with large, complex codebases
Strong communication and teamwork skills

Nice to Have

Experience with GPU kernel programming (CUDA / ROCm / OpenCL)
Experience with Triton or similar kernel programming frameworks
Knowledge of instruction set architectures (ISA)
Familiarity with compiler technologies (e.g., LLVM-based stacks)
Experience with distributed communication frameworks (NCCL, MPI, libfabric)
Understanding of deep learning models

What We Offer

Highly competitive salary, employment contract (Umowa o Pracę), and a comprehensive benefits package, including Medicover healthcare coverage
Work on the performance-critical compute layer for next-generation AI accelerators
Direct impact on deep learning model efficiency and latency
Collaboration with experts in hardware, compilers, and systems
Challenging low-level performance engineering problems at the hardware–software boundary
