Recently, we embarked on the adventure of porting a “Monte Carlo radiative transfer in curved spacetimes” code from multi-threaded CPUs to GPUs. This work matters for three reasons:

- It is fun!
- Most general relativistic radiative transfer codes currently available neglect inverse Compton scattering, which matters to our group because we want to understand the nonthermal emission of active galactic nuclei
- Radiative transfer can be *slow*; on the other hand, it can usually be made massively parallel with a couple of algorithmic improvements

These are some of the reasons why we decided to port a radiative transfer code to GPUs. We have reasons to believe that a modern GPU such as a GTX 1080 Ti can deliver speed-ups of a factor of ~100 compared to a parallel OpenMP (CPU) version of the code. This could be a game-changer, allowing much faster modelling of the radiation from accreting black holes.

We are collaborating with colleagues from the computer science department at the university (Alfredo Goldmann and Matheus Tavares Bernardino). This is work in progress and we have some exciting preliminary results, which we unfortunately cannot share publicly yet. We hope to publish a paper reporting these results soon and to release the GPU-accelerated black hole radiative transfer code on GitHub.

In the meantime, we have one software deliverable from this project which may be useful for other researchers: we partially ported some functions of the GNU Scientific Library (GSL) to CUDA. More specifically, we ported the following functions:

- gsl_ran_dir_3d: generate random unit vectors (directions) in 3D space
- gsl_sf_bessel_K0_scaled_e, gsl_sf_bessel_K1_scaled_e, gsl_sf_bessel_Kn: modified Bessel functions of the second kind (exponentially scaled K0 and K1, and integer-order Kn)
- gsl_ran_chisq: generate random numbers drawn from the chi-square distribution
- cheb_eval_e: evaluation of a Chebyshev series
- gsl_sf_lnfact_e, gsl_sf_fact_e: log-factorial and factorial
- gsl_ran_gamma_double: Gamma distribution
- gsl_sf_psi_int_e: Digamma (Psi) function for positive integer arguments
- gsl_poly_eval: polynomial evaluation
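
To give a flavour of what porting such routines involves, here is a minimal sketch of two of the simpler ones: polynomial evaluation by Horner's scheme and Chebyshev series evaluation by the Clenshaw recurrence. This is not the cuSL source. The names `CUSL_HD`, `sketch_poly_eval` and `sketch_cheb_eval` are hypothetical, and the signatures are simplified (no GSL error estimate or result struct). The macro only adds the `__host__ __device__` qualifiers when the file is compiled with nvcc, so the same code also builds as plain C.

```c
/* Hypothetical macro: marks a function as callable from both host and
   device under nvcc, and expands to nothing for a plain C compiler. */
#ifdef __CUDACC__
#define CUSL_HD __host__ __device__
#else
#define CUSL_HD
#endif

/* Horner's scheme: evaluates c[0] + c[1]*x + ... + c[len-1]*x^(len-1).
   Assumes len >= 1. Name and signature are illustrative only. */
CUSL_HD static inline double sketch_poly_eval(const double c[], int len, double x)
{
    double ans = c[len - 1];
    for (int i = len - 1; i > 0; i--) {
        ans = c[i - 1] + x * ans;
    }
    return ans;
}

/* Clenshaw recurrence for a Chebyshev series on the interval [a, b]:
   f(x) = c[0]/2 + sum_{j=1..order} c[j] * T_j(y),
   where y maps x from [a, b] to [-1, 1].
   Name and signature are illustrative only. */
CUSL_HD static inline double sketch_cheb_eval(const double c[], int order,
                                              double a, double b, double x)
{
    double d = 0.0;
    double dd = 0.0;
    const double y = (2.0 * x - a - b) / (b - a);
    const double y2 = 2.0 * y;

    for (int j = order; j >= 1; j--) {
        const double temp = d;
        d = y2 * d - dd + c[j];
        dd = temp;
    }
    return y * d - dd + 0.5 * c[0];
}
```

Neither function allocates memory, calls into a library, or touches global state, which is what makes routines like these straightforward to call from inside a CUDA kernel, one evaluation per thread.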

We called this CUDA port of GSL *cuSL*. It is publicly available on GitHub. We hope it will be useful to other researchers who are doing mathematical physics computations with CUDA on NVIDIA GPUs.