This is an urgent project to port a single CUDA kernel to ROCm/HIP. We are implementing the Mamba state space model on AMD GPUs. You can see the CUDA implementation here: https://github.com/state-spaces/mamba

Specifically, we need someone to rewrite this CUDA kernel for ROCm/HIP: https://github.com/state-spaces/mamba/blob/main/csrc/selective_scan/selective_scan_fwd_fp32.cu

The above kernel uses templates from this file: https://github.com/state-spaces/mamba/blob/main/csrc/selective_scan/selective_scan_fwd_kernel.cuh

You will receive a 50% bonus if it is done by Sunday, or a 25% bonus if it is done by Monday.

This is a very urgent project that requires an immediate start and working over the weekend. Make an offer if you think you can pull this off!

Hourly Range: $50.00-$150.00

Posted On: March 22, 2024 16:23 UTC
Category: Firmware Development
Skills: C++, CUDA, GPU

Country: United States
