🐢
You know, when I was writing my crispy doom install script for Raspberry Pi, I was thinking: how slow is bash, really?
Bourne-again shell scripts are for gluing commands together, not for crunching numbers of course. But I wanted to try it, because I was curious.
If you want to know what the program does, see Project Euler problem 92. You might not want to read the source code if you want to fix it yourself. By the way, there are more efficient implementations that run 2 magnitudes faster, but this one is more simple and I didn't have the brains for the better one anyways.
First the C version(I cherry-picked the fastest due to some run-to-run variance):
cc -O3 -o sqdigchains2 -Wall -march=native -mtune=native sqdigchains2.c time ./sqdigchains2 8581146 real 0m1.639s user 0m1.638s sys 0m0.001s
Not bad. Now bash:
time ./sqdigchains.sh 8581146 real 238m10.498s user 238m3.131s sys 0m1.190s
Hmm. That seems slower. About 8719 times slower. Still, within 4 orders of magnitude. I already optimized the bash version and it would take 50% less time compared to the old one, mainly by manually inlining stuff. Everything is running on a Raspberry Pi 3 Model A+. This solution is about 88% slower than a C implementation on a Arduino Uno/Atmega 328p
The time seems to be from interpeting the script. The bash version actually has to calculate more, probably. But it contains less variables. The C version, the compiler does all sort of clever stuff. Reordering instructions so it doesn't stall the pipeline. Plus it's machine code, and it can use branch prediction effectively, etc.
#include <stdio.h> #include <stdlib.h> #define TEST_RANGE 10000000 static inline size_t digit_sum_square(size_t n) { size_t sum = 0; size_t a = 0; size_t b = 0; while (n > 0) { a = (size_t)(n / 10); b = n % 10; b *= b; sum += b; n = a; } return sum; } char chain(size_t n) { while (n != 1) { if (n == 89) { return 89; } n = digit_sum_square(n); } return 1; } int main(int argc, char **argv) { size_t eightynine = 0; for (size_t i = 1; i < TEST_RANGE; i++) { if (chain(i) == 89) { eightynine++; } } printf("%zu\n", eightynine); return 0; }
#!/usr/bin/env bash i=1 en=0 # short for eightynine while (( i < 10000000 )) do n=i while (( n != 1 )) do if (( n == 89 )) then ((en++)) break fi t=$n n=0 while (( t > 0 )) do # you'd probably think (( b = t % 10 )) and then (( n = n + b * b )) is faster, but the bottom seems to be faster (( n = n + (t % 10) * (t % 10) )) (( t = t / 10 )) done done ((i++)) done echo $en
That's all for this pointless test. Have a nice one!
BONUS UPDATE (on the same day): strata sent me a perl implementation of the program!
time ./sqdigchains.pl 8581146 real 7m41.425s user 7m40.883s sys 0m0.051s
Much faster than bash, ~282 times slower than the C version. 2 orders of magnitude instead of 4. Perl is also a script language, but it also has a compile stage, which creates a syntax tree, and then walks on it. During this compiling it also does some optimizations, such as constant folding and peephole optimization(all according to wikipedia) and you can clearly see it back in the runtimes. Strata was surprised, he thought it would take much longer considering how slow bash was.
#!/usr/bin/env perl use strict; use warnings; use 5; my $TEST_RANGE = 10000000; sub digit_sum_square { my $n = shift; my ($sum, $a, $b) = 0; while ($n > 0) { $a = int($n / 10); $b = $n % 10; $b *= $b; $sum += $b; $n = $a; } return $sum; } sub chain { my $n = shift; while ($n != 1) { return $n if $n == 89; $n = digit_sum_square($n); } return 1; } sub main { my $eightynine = 0; for (1..$TEST_RANGE-1) { if (chain($_) == 89) { $eightynine++ } } printf "%zu\n", $eightynine; } main;
Thank you, strata!