2023-01-21

spwnn - delightful

My Raspberry Pi 3 scales beautifully:

Pi Results 2018 02 27

Raspberry Pi 3
Cores   Time (seconds)   % of single-core time
1       294              100.00%
2       155              52.72%
3       102              34.69%
4       81               27.55%



It's not perfect (four cores would ideally take 25% of the single-core time, not 27.55%), but it aligns with what I would expect from a delightfully parallel application.
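To put a number on "not perfect", here is a minimal Python sketch that turns the Pi timings above into speedup and parallel efficiency. The function name `scaling_report` is just something made up for this illustration; it isn't part of spwnn.

```python
# Timings from the Raspberry Pi 3 table above: cores -> seconds.
pi_times = {1: 294, 2: 155, 3: 102, 4: 81}


def scaling_report(times):
    """Return (cores, speedup, efficiency) rows relative to the 1-core run."""
    base = times[1]
    rows = []
    for cores in sorted(times):
        speedup = base / times[cores]   # how many times faster than 1 core
        efficiency = speedup / cores    # 1.0 would be perfect scaling
        rows.append((cores, speedup, efficiency))
    return rows


for cores, speedup, efficiency in scaling_report(pi_times):
    print(f"{cores} cores: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```

An efficiency of 1.0 would mean every added core contributes a full core's worth of speedup; on the Pi, four cores land at roughly 0.9.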

Here's the Graviton 3:


Cores  Time (seconds)
1      265.567797582
2      132.647143997   <- twice as fast
3      88.403560662
4      66.813798615    <- twice as fast again
5      53.335466866
6      44.565484130
7      38.216117062
8      33.582891359    <- twice as fast again
9      29.859718747
10     27.029442167
11     24.661027855
12     22.643859876
13     20.973853902
14     19.607836384
15     18.385444233
16     17.306577742    <- nearly twice as fast again
17     16.450237619
18     15.628561562
19     14.888891200
20     14.261602746
21     13.606370892
22     13.055653563
23     12.614830791
24     12.153624482
25     11.723116590
26     11.352718307
27     11.005397888
28     10.668465013
29     10.335751571
30     10.087745416
31     9.828481508
32     9.564378933     <- about 1.8x as fast as 16 cores
33     9.321003014
34     9.098431188
35     8.890813944
36     8.722495945
37     8.522875722
38     8.363417592
39     8.245630509
40     8.089272822
41     7.946181339
42     7.786933669
43     7.692171341
44     7.561693493
45     7.463335691
46     7.332565460
47     7.207386715
48     7.137983038
49     7.037931854
50     6.947267273
51     6.858636482
52     6.774299242
53     6.720831775
54     6.626360231
55     6.580545203
56     6.500433604
57     6.431546966
58     6.405486738
59     6.332962231
60     6.266468248
61     6.255682436
62     6.210253876
63     6.157009092
64     6.130827268     <- about 1.56x as fast as 32 cores

So doubling the core count roughly halves the time, at least up through 16 cores, which is delightful and beautiful.  It's not perfect: if you multiply the core count by the time spent, the product ought to stay exactly the same, but instead the total core-seconds grow as the core count increases, from about 266 on 1 core to about 392 on 64 cores.
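The core-count-times-time check is easy to reproduce. Below is a small Python sketch using the power-of-two rows from the Graviton 3 table; the helper name `core_seconds` is an invention for this example, not something from spwnn.

```python
# Selected rows from the Graviton 3 table above: cores -> wall-clock seconds.
graviton_times = {
    1: 265.567797582,
    2: 132.647143997,
    4: 66.813798615,
    8: 33.582891359,
    16: 17.306577742,
    32: 9.564378933,
    64: 6.130827268,
}


def core_seconds(times):
    """Map each core count to cores * wall-clock time (total work done)."""
    return {cores: cores * seconds for cores, seconds in times.items()}


work = core_seconds(graviton_times)
baseline = work[1]
for cores in sorted(work):
    overhead = work[cores] / baseline - 1.0  # extra work relative to 1 core
    print(f"{cores:>2} cores: {work[cores]:7.1f} core-seconds ({overhead:+.1%} vs 1 core)")
```

With perfect scaling every row would show the same number of core-seconds; any growth down the column is the extra work the machine does to save wall-clock time.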


Luckily, I care about the clock time, not this multiplication, and it just gets faster with each additional core.  I don't mind if the machine does a little extra work to save me time.

Compare that to a 192 "core" system (only 96 real cores [c6a.48xlarge]):


It's okay up to around 64; then since there are two processors with 48 cores each, I suspect some memory contention creeps in betwixt them.  At 96 cores you can see it gets much less efficient.  Those spikes are something else on the machine interrupting my code (or the garbage collector, hrm).

It's a bit hard to see below, but as more HT (hyperthreaded) cores are added, the entire process gets slower!

The graph starts to trend upward at the midpoint, around 96 cores.

At any rate, for my little app, the cost-effectiveness of the Graviton 3 can't be beat. It's only a bit over $2.00 / hour on-demand (and billed by the minute, methinks?), vs. these other giant machines at $7.00 or $8.00 / hour.
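For a rough feel of what that price gap means, here is a hedged Python sketch of cost per run. The prices are round-number assumptions taken from the paragraph above ($2.00 and $7.00 per hour), the function names are made up for illustration, and the math assumes billing is simply proportional to run time, ignoring any minimum charge.

```python
def cost_per_run(price_per_hour, seconds):
    """Dollar cost of one run, assuming billing proportional to run time."""
    return price_per_hour * seconds / 3600.0


def break_even_seconds(price_per_hour, target_cost):
    """Run time at which a machine at price_per_hour costs exactly target_cost."""
    return target_cost * 3600.0 / price_per_hour


GRAVITON_PRICE = 2.00            # assumed round figure; the post says "a bit over $2.00"
GRAVITON_SECONDS = 6.130827268   # 64-core time from the Graviton 3 table
BIG_MACHINE_PRICE = 7.00         # assumed: low end of the "$7.00 or $8.00" range

graviton_cost = cost_per_run(GRAVITON_PRICE, GRAVITON_SECONDS)
needed = break_even_seconds(BIG_MACHINE_PRICE, graviton_cost)
print(f"Graviton 3 run: ${graviton_cost:.5f}")
print(f"A $7.00/hour machine breaks even at {needed:.2f} s per run")
```

Under these assumptions, the pricier machine only matches the Graviton 3 on cost per run if it finishes in 2/7 of the time.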




