# BLAS Pitfalls with Mac OSX Accelerate Framework

Today I'm going to talk about bugs you can encounter when using the Apple (i.e. Mac OSX) Accelerate framework and BLAS functions like dot product, norm, and sum. I hope that describing these bugs here saves you a headache in the future. This will also be a fun chance to use f2py and the fortran_magic Jupyter extension.

%load_ext fortranmagic


Let's start with a very simple routine written in Fortran to see how fortran_magic works.

%%fortran
subroutine f(x, y, z)
real(4), intent(in) :: x, y
real(4), intent(out) :: z
z = x + y
end subroutine f

print(f(1, 2))

3.0


We can also work with arrays.

%%fortran
subroutine fa(x, y, z, N)
integer, intent(in) :: N
real(4), dimension(n), intent(in) :: x, y
real(4), dimension(n), intent(out) :: z
z = x + y
end subroutine fa

from numpy import array
a = array([1, 2, 3])
b = array([4, 5, 6])
print(fa(a, b))

[5. 7. 9.]


Now we can try rewriting the above function to call BLAS.

%%fortran
subroutine fb(x, y, z, n)
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x, y
real(4), dimension(n), intent(out) :: z
real(4), parameter :: fac = 1.0
call scopy(n, x, 1, z, 1)
call saxpy(n, fac, y, 1, z, 1)
end subroutine fb

print(fb(a, b))

[5. 7. 9.]


BLAS has functions, in addition to subroutines. For example, to compute the dot product.

%%fortran
subroutine fc(x, y, res, n)
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x, y
real(4), intent(out) :: res
res = sdot(n, x, 1, y, 1)
end subroutine fc

print(fc(a, b))

0.0


It seems like something has gone wrong. Did I make a mistake in my fortran routine? Maybe we should debug a little.

%%fortran
subroutine fc(x, y, res, n)
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x, y
real(4), intent(out) :: res
write(*, *) x
write(*, *) y
write(*, *) dot_product(x, y)
res = sdot(n, x, 1, y, 1)
write(*, *) res
end subroutine fc

print(fc(a, b))

0.0   1.00000000       2.00000000       3.00000000
4.00000000       5.00000000       6.00000000
32.0000000
0.00000000


Nothing seems wrong with the code. What is worse, the intrinsic dot_product works, but BLAS doesn't! There is a trick to fix this though.

%%fortran
subroutine fc(x, y, res, n)
real(8), external :: sdot
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x, y
real(8), intent(out) :: res
res = sdot(n, x, 1, y, 1)
end subroutine fc

print(fc(a, b))

32.0


It seems I had to coach my Fortran routine to believe that the returned value is double precision. Yet, it's sdot, and if we look at the documentation it should definitely return a single precision floating point number. Let's try some other routines.

%%fortran
subroutine fd(x, res, n)
real(8), external :: snrm2
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x
real(8), intent(out) :: res
res = snrm2(n, x, 1)
end subroutine fd

print(fd(a))

3.7416574954986572


Indeed, for any of the functions in BLAS LEVEL-1, I think you will suffer through this. As I mentioned before though, this is a problem specific to Apple's Accelerate framework. Let's try linking against openblas instead.

%%fortran --extra '/Users/wddawson/miniconda3/envs/fmagic/lib/libopenblas.0.dylib'
subroutine fe(x, res, n)
integer, intent(in) :: n
real(4), dimension(n), intent(in) :: x
real(4), intent(out) :: res
res = snrm2(n, x, 1)
end subroutine fe

print(fe(a))

3.7416574954986572


If you find yourself calling BLAS directly, you might want to take some care to provide a backup plan if your code is being compiled and executed on a Mac.