I won’t be replicating the Python code here, as the first version,
which dynamically grows the sequence, would be horribly slow in R, and
the others don’t apply directly.
Here’s a plain implementation that first initialises the results
variable res before applying the Fibonacci formula F[i] = F[i-1] + F[i-2].
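A sketch of such an implementation could look like this (the function and variable names follow the post; the exact body is my reconstruction, and it assumes n > 2):

```r
## Pre-allocate the results variable, then fill it in
## using the recurrence F[i] = F[i-1] + F[i-2]
fib <- function(n) {
  res <- numeric(n)
  res[1] <- res[2] <- 1   ## F[1] = F[2] = 1
  for (i in 3:n)          ## assumes n > 2
    res[i] <- res[i - 1] + res[i - 2]
  res
}

fib(10)
##  [1]  1  1  2  3  5  8 13 21 34 55
```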
Let’s now benchmark the function for n = 1000 numbers.
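The microsecond/millisecond medians below suggest the microbenchmark package was used; assuming so, the timing could be run along these lines (fib is repeated so the chunk is self-contained):

```r
library(microbenchmark)

## fib as defined above
fib <- function(n) {
  res <- numeric(n)
  res[1] <- res[2] <- 1
  for (i in 3:n) res[i] <- res[i - 1] + res[i - 2]
  res
}

## Time repeated evaluations of fib(1000)
microbenchmark(fib(1000), times = 100)
```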
About 1.259 milliseconds. Not great, but
reasonable. We’ll compare these timings with Python later.
Byte compiling
The first optimisation we can do is to byte compile the fib function
using the cmpfun function from the compiler package.
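Something along these lines (fib is repeated so the chunk is self-contained):

```r
library(compiler)

## fib as defined above
fib <- function(n) {
  res <- numeric(n)
  res[1] <- res[2] <- 1
  for (i in 3:n) res[i] <- res[i - 1] + res[i - 2]
  res
}

## Byte compile the plain R implementation
cmpfib <- cmpfun(fib)
```

cmpfib can then be benchmarked exactly like fib above.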
We improve the median timing 10-fold, reaching 120.52 microseconds.
That’s a neat improvement for very little extra effort (but note
that, in my experience, byte compiling will not always give such
benefits, if any).
Using Rcpp
The famous Rcpp package is of course the way to
go when efficiency is key. The package is nowadays so mature and well
documented, and has such a clean R/C++ interface and elegant
syntactic sugar,
that the overhead of calling C/C++ has become substantially smaller
and is certainly worth the extra effort.
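A C++ version of the same loop can be compiled on the fly with Rcpp::cppFunction; the sketch below is my reconstruction, not necessarily the post's exact C++ code (note the 0-based indexing):

```r
library(Rcpp)

## C++ translation of the same algorithm
cppFunction('
NumericVector fibc(int n) {
  NumericVector res(n);
  res[0] = res[1] = 1;
  for (int i = 2; i < n; i++)
    res[i] = res[i - 1] + res[i - 2];
  return res;
}')
```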
A median 5.51 microseconds, that’s
certainly competitive.
Summary
Let’s summarise our timings and benchmark the plain R implementation
fib, the byte compiled version cmpfib, and the C++ version fibc.
Of course, all this only makes sense if the results are actually
identical.
Here, I’m only running the code for the 10th Fibonacci number.
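Putting it together, a standalone sketch that re-defines the three functions, checks the results agree, and benchmarks them:

```r
library(microbenchmark)
library(compiler)
library(Rcpp)

## Plain R implementation
fib <- function(n) {
  res <- numeric(n)
  res[1] <- res[2] <- 1
  for (i in 3:n) res[i] <- res[i - 1] + res[i - 2]
  res
}

## Byte compiled version
cmpfib <- cmpfun(fib)

## C++ version
cppFunction('
NumericVector fibc(int n) {
  NumericVector res(n);
  res[0] = res[1] = 1;
  for (int i = 2; i < n; i++) res[i] = res[i - 1] + res[i - 2];
  return res;
}')

## The timings only make sense if the results agree
stopifnot(identical(fib(10), cmpfib(10)),
          identical(fib(10), as.numeric(fibc(10))))

## Benchmark all three implementations
microbenchmark(fib(1000), cmpfib(1000), fibc(1000), times = 100)
```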
Comparing with Python
The
Python examples
used n = 10000 to run the benchmarks. Let’s run our code with the
same input and compare.
The first Python implementation was
which times, on my computer, at
The implementation using numba, a just-in-time compilation library
for Python (which I failed to install locally), made a huge
improvement: 45 microseconds, along the lines of our Rcpp
implementation. (Stuart claimed that this was “probably too much of a
difference, something fishy is probably going on here” - not sure
why.)
Anyway, the byte compiled R implementation comes pretty close to the
standard (and other non-jit) Python implementations. The real
difference, however, is in the output. In R, we reach the limit of
2^1024 at fib(1447)
whereas in Python
I don't know of a way to bypass this limitation in R.
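The ceiling comes from R's double-precision numbers, which overflow to Inf beyond roughly 1.8 × 10^308:

```r
.Machine$double.xmax   ## largest representable double, about 1.797693e+308
2^1023                 ## still finite
2^1024                 ## overflows
## [1] Inf
```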
EDIT Below is a
solution by Tim Yates, using arbitrary
precision integers, that breaks the 1447 barrier.
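The gist, as I understand it, is to run the same loop over arbitrary precision (bigz) integers from the gmp package; the sketch below uses my own function name fibz and is not Tim's verbatim code:

```r
library(gmp)

## Same algorithm as fib, but accumulating arbitrary precision
## bigz integers instead of doubles
fibz <- function(n) {
  res <- as.bigz(rep(1, n))
  for (i in 3:n)
    res[i] <- res[i - 1] + res[i - 2]
  res
}

fibz(1500)[1500]   ## finite, well past the 1447 barrier
```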
Unfortunately, this comes at a substantial cost in terms of execution
time (thanks to Willem Ligtenberg for suggesting
to check this) and, to a lesser extent, size.