Eivind Liland

Parallel Accelerator Specialist

Hardware & Software Consultant

I work across the entire stack: from RTL design and verification to low-level software and parallel algorithms. Whether architecting out-of-order SIMT cores and parallel memory subsystems for AI and compute, or writing highly optimized GPU compute shaders and bare-metal rendering engines, I love the challenge of finding trade-offs that maximize throughput with minimal power.

Get in touch

Global Remote

On-site Berlin

Video introduction

What I Do

Parallel Architectures

Compute architecture for GPUs, accelerators, and AI processors. The core problems in modern AI hardware (memory bandwidth walls, dataflow scheduling, keeping arithmetic units fed) are problems I've been working on since before they had an AI label. My experience ranges from single embedded processors to systems with thousands of parallel execution units. Available for hands-on design or high-level advisory.

Software Development

I've built software at every level of abstraction, from bare-metal C/C++, assembly, and extreme-constraint optimization to GPU-accelerated computing and algorithms in Python. Whether the problem is wringing cycles out of constrained hardware or writing procedural generators for massively parallel systems, I can help.

RTL Design

Digital design in SystemVerilog. I write RTL with verification in mind from the start, with constrained-random testbenches alongside each module, so what I hand over is already substantially validated.

Verification

UVM-based verification from module-level constrained-random to system-level integration. If you're building custom silicon or FPGA systems and need verification methodology or execution, I can step in at any level.

Things I've Built

GPU Architecture

Designed out-of-order warp scheduling logic for a mobile GPU core
Built custom memory hierarchy for bandwidth-constrained AI accelerator
Prototyped novel register file design reducing area by 30%

RTL & Silicon

Full RTL implementation of a RISC-V vector extension subset
UVM testbench infrastructure for a multi-million-gate SoC
FPGA prototype of custom matrix multiply unit

Software & Algorithms

Procedural real-time rendering algorithms under extreme memory/size constraints
GPU-accelerated computational fluid dynamics solver
Custom shader compiler backend for proprietary GPU ISA
Real-time signal processing pipeline on embedded DSP

Systems & Integration

End-to-end verification environment for PCIe Gen4 controller
Driver stack for custom AI inference accelerator
Performance modeling framework for early-stage architecture exploration

These are placeholders — replace with your actual projects and accomplishments.

Where I've Been

ARM Mali GPU

Early employee at Falanx Microsystems, a startup in Norway that built the Mali GPU from scratch. Contributed across the stack, from RTL design and verification of the GPU to bare-metal software and pre-silicon tech demos for early FPGA prototypes. ARM acquired us in 2006, and Mali powers billions of devices today.

Swarm64

Co-founded Swarm64. We built FPGA-based hardware that accelerated database computation — massively parallel, deployed in the cloud. Partnered with Intel and Xilinx. Acquired by ServiceNow.

Orbital Machines

Founded Orbital Machines, a sociocratic newspace startup that contributed to a number of space industry vehicles. Wrote Python software for designing and optimizing 3D propellant pump geometries for rocket engines.

How I Work

I partner with teams on a contract basis — whether that means short-term consulting, extended project work, or strategic advisory.

I'm available for remote work globally, or on-site in the Berlin area. I'm always happy to start with a quick conversation to find the engagement model that best suits your needs.

Ready to talk?

From high-level architectural guidance to hands-on implementation and verification, let's discuss how I can help accelerate your next project.

Get in Touch