Description
Nice project! I tried the BLAS package Octavian built on top of LoopVectorization, the performance is amazing.
Now I wish to make a BLAS package for Tropcal algebra (a simple semi-ring algebra by replacing + with max, * with +) and more complicated algebra (see the CountingTropical bellow). Wondering how difficult it may be.
Now if I try using the Tropical type, the error information shows
julia> using TropicalNumbers, Octavian
julia> A, B = randn(1000, 1000), randn(1000, 1000);
julia> A2, B2 = Tropical.(A), Tropical.(B);
julia> Octavian.matmul(A2, B2);
ERROR: MethodError: no method matching stridedpointer(::Matrix{TropicalF64})
Closest candidates are:
stridedpointer(::Ptr{T}, ::StaticInt{C}, ::StaticInt{B}, ::Val{R}, ::X, ::O) where {T<:Union{Bool, Float32, Float64, Int16, Int32, Int64, Int8, UInt16, UInt32, UInt64, UInt8, VectorizationBase.Bit}, C, B, R, N, X<:Tuple{Vararg{Any, N}}, O<:Tuple{Vararg{Any, N}}} at /home/leo/.julia/packages/VectorizationBase/bYkVf/src/strided_pointers/stridedpointers.jl:73
stridedpointer(::BitVector) at /home/leo/.julia/packages/VectorizationBase/bYkVf/src/strided_pointers/stridedpointers.jl:206
stridedpointer(::BitArray{N}) where N at /home/leo/.julia/packages/VectorizationBase/bYkVf/src/strided_pointers/stridedpointers.jl:207
...
Stacktrace:
[1] zstridedpointer
@ ~/.julia/packages/VectorizationBase/bYkVf/src/strided_pointers/stridedpointers.jl:95 [inlined]
[2] _matmul!
@ ~/.julia/packages/Octavian/KUsJ4/src/matmul.jl:232 [inlined]
[3] matmul(A::Matrix{TropicalF64}, B::Matrix{TropicalF64})
@ Octavian ~/.julia/packages/Octavian/KUsJ4/src/matmul.jl:191
[4] top-level scope
@ ./timing.jl:206 [inlined]
[5] top-level scope
@ ./REPL[19]:0
References
- The functions define tropical numbers (L38 and L48)
-
The functions define counting-tropical numbers (L10-21)
https://github.com/TensorBFS/TropicalNumbers.jl/blob/547edebf9abd3247299975c712e59e08eff2df75/src/counting_tropical.jl#L10 -
Tropical blas is useful in solving combinatoric optimization problems: https://arxiv.org/abs/2008.06888 , in this work, we used the generic matmul on GPU, and it pushed the state of the art in solving spin glasses, potts model and 2-sat counting et. al. I wish the BLAS on CPU can also get a proper speed.