Lack of FP16/BF16 Precision Support for GEMM Kernels on RISCV #5279

Open

Description

@Srangrang

Hello,

We are currently porting LLMs to RISCV for inference, and due to hardware constraints we need to keep computation in FP16 or BF16 precision. We are using the widely adopted OpenBLAS as the underlying compute library, but the performance did not meet our expectations.

This appears to be due to the lack of pure-FP16 or mixed-precision GEMM kernels for the RISCV architecture in OpenBLAS: sbgemm kernels exist for other architectures, but they are not available for RISCV.
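
For context, below is a minimal sketch of how the existing BF16 mixed-precision interface is called on the architectures that do support it, assuming an OpenBLAS build with SBGEMM enabled; the `to_bf16` helper is our own illustration (plain truncation, no rounding), not an OpenBLAS API:

```c
#include <cblas.h>

/* OpenBLAS exposes bfloat16 as a 16-bit integer container; this
 * illustrative helper truncates a float to bf16 (no rounding). */
static bfloat16 to_bf16(float f) {
    union { float f; unsigned int u; } v = { f };
    return (bfloat16)(v.u >> 16);
}

int main(void) {
    enum { M = 2, N = 2, K = 2 };
    bfloat16 A[M * K], B[K * N];
    float C[M * N] = { 0 };

    for (int i = 0; i < M * K; i++) A[i] = to_bf16(1.0f);
    for (int i = 0; i < K * N; i++) B[i] = to_bf16(2.0f);

    /* C = 1.0 * A * B + 0.0 * C, row-major:
     * BF16 inputs, FP32 accumulation and output */
    cblas_sbgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                 M, N, K, 1.0f, A, K, B, N, 0.0f, C, N);
    return 0;
}
```

This BF16-in, FP32-out pattern is exactly the mixed-precision behavior we are missing on RISCV.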

We noticed that the RISC-V instruction set manual released on May 8, 2025 already covers mixed-precision FP16 and BF16 operations under RVV 1.0 (see the Instruction Set Manual), and that the RVV intrinsics have been updated accordingly for FP16 mixed precision (see the RVV Intrinsic Document).
Based on the vectorized sgemm kernel, we have implemented a mixed-precision GEMM kernel for RISCV, which we believe should be named shgemm, similar to the approach in #2767; a sketch of the inner loop it builds on follows.
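
To illustrate the approach, here is a minimal, self-contained sketch (not our actual kernel) of the widening inner loop such a kernel relies on: FP16 inputs multiplied and accumulated in FP32 using the updated RVV 1.0 intrinsics. It assumes a toolchain with Zvfh support (e.g. `-march=rv64gcv_zvfh`); the function name `shdot_sketch` and its signature are illustrative only:

```c
#include <riscv_vector.h>
#include <stddef.h>

/* FP16-input, FP32-accumulate dot product: the widening pattern at the
 * heart of a mixed-precision (shgemm-style) micro-kernel. */
float shdot_sketch(size_t n, const _Float16 *x, const _Float16 *y) {
    size_t vlmax = __riscv_vsetvlmax_e32m2();
    /* FP32 accumulator, zeroed across all lanes */
    vfloat32m2_t vacc = __riscv_vfmv_v_f_f32m2(0.0f, vlmax);

    size_t vl;
    for (; n > 0; n -= vl, x += vl, y += vl) {
        vl = __riscv_vsetvl_e16m1(n);
        vfloat16m1_t vx = __riscv_vle16_v_f16m1(x, vl);
        vfloat16m1_t vy = __riscv_vle16_v_f16m1(y, vl);
        /* vfwmacc widens f16*f16 into the f32 accumulator; the _tu
         * (tail-undisturbed) variant keeps untouched lanes at zero */
        vacc = __riscv_vfwmacc_vv_f32m2_tu(vacc, vx, vy, vl);
    }

    /* horizontal sum of the FP32 accumulator */
    vfloat32m1_t vzero = __riscv_vfmv_v_f_f32m1(0.0f, 1);
    vfloat32m1_t vsum = __riscv_vfredusum_vs_f32m2_f32m1(vacc, vzero, vlmax);
    return __riscv_vfmv_f_s_f32m1_f32(vsum);
}
```

Because LMUL doubles across the widening (e16m1 to e32m2), the same vl covers both the operand and accumulator registers regardless of VLEN, so the pattern works unchanged on both ZVL128B and ZVL256B targets.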

We would like to know whether there are plans to add FP16/BF16 GEMM kernel support for the RISCV architecture in OpenBLAS, specifically for the RISC-V 64-bit targets with a VLEN of 128 bits (RISCV64_ZVL128B) and 256 bits (RISCV64_ZVL256B). If not, we would be happy to contribute our implementation and provide the code.
