I was talking to John Grden about some geeky math stuff and I dropped one of my favorite little optimization tricks on him: the idea that Math.PI is a constant and 180 is a constant, and when you code 180/Math.PI or Math.PI/180, this gets evaluated at compile time, and the numerical value is inserted directly into the bytecode. So, if you do a calculation like this:
var myangle = a * 180 / Math.PI;
it will be faster than doing this:
var toDegrees:Number = 180 / Math.PI;
var myangle = a * toDegrees;
because in the second case, toDegrees has to be evaluated at runtime, whereas in the first case, 180 / Math.PI is hardcoded in the bytecode. Of course, the initial value of toDegrees is a constant, but because toDegrees is a variable, it is possible it could change, so the runtime needs to look up the value when it is encountered.
So, naturally, John gets a flashon about this and tries it out in AS3 with the following code:
[as]var toDegrees:Number = 180/Math.PI;
function runTest0() {
var time:Number = getTimer();
var a:Number = 90;
for (var i:Number = 0; i<100000; i++) {
var myangle = a*180/Math.PI;
}
trace((getTimer()-time)/1000);
}
function runTest1() {
var time:Number = getTimer();
var a:Number = 90;
for (var i:Number = 0; i<100000; i++) {
var myangle = a*toDegrees;
}
trace((getTimer()-time)/1000);
}
runTest0();
runTest1();[/as]
Now, per my understanding, test0 should run faster, as it's using the constants, and test1 should be slower because it's got to do a lookup of toDegrees at run time.
Oddly enough, though, test1 runs on average twice as fast as test0!
I also tried making toDegrees a const instead of a var. I could understand this being faster, but it actually seems to make no difference at all.
He's just blown away something I believed in. This is like finding out there is no Santa Claus... OK, not quite that traumatic, but still, I was shocked. So I try the same code in AS2, and some faith is restored.
In AS2, test0 takes about 85% of the time test1 takes. That's what I expected.
Really odd that AS3 is opposite though. My off-the-cuff guess is that this has something to do with the JIT optimizing stuff into low-level code, seeing that the value is called thousands of times without being changed, thus it knows it can use the same value. But this is something I know relatively little about, so I could just be talking out of my ... inexperience. If anyone has some more insight into this, we'd love to know what's really going on.
But, getting back to the title of the post, don't take your old optimization tricks from AS1/2 and assume they are going to work in AS3. It's a whole new ball game.
[ADDENDUM]
I just modified the code to make the initial angle a random number, like so:
[as]const toDegrees:Number = 180/Math.PI;
function runTest0() {
var time:Number = getTimer();
var a:Number = 180;
for (var i:Number = 0; i<100000; i++) {
var myangle = Math.random() * 360 *180/Math.PI;
}
trace((getTimer()-time)/1000);
}
function runTest1() {
var time:Number = getTimer();
var a:Number = 180;
for (var i:Number = 0; i<100000; i++) {
var myangle = Math.random() * 360 *toDegrees;
}
trace((getTimer()-time)/1000);
}
runTest0();
runTest1();[/as]
In this case, the results are dead even. Veeeeeery interesting.
I ran that same code, and sometimes they were nearly identical, and other times test1 was faster still.
0.041
0.025
It seems that the flash player is a woman [ changes mood/mind when it feels like it ]
DON’T hate me ladies!! [ but you know its true ]
Oddly enough, from the first code block, test0 was 6ms faster than test1 (13 vs 19), but with 1000000 iterations, test0 was 80ms slower than test1 (206 vs 123). :S
Oh come on Keith, all you have to do is read the AVM2 Overview (http://www.adobe.com/devnet/actionscript/articles/avm2overview.pdf), determine the expected operation codes, and then deduce the answer from that. That should only take a few months! 🙂
I got really inconsistent results so I made graphs. (Visualization, woot!) I was able to see that most of the time the function would take about 25ms but once in a while (9 times out of 180 calls) it would spike as high as 70ms. When these spikes were cleaned out, test1 was consistently faster by about 1.5-2ms (Sorry, Keith).
So now I’m wondering, what exactly are these “mood” spikes? It wouldn’t be garbage collection, would it?
For me, test1 was four times faster than test0. I bumped up the iterations to 10000000 and got 0.889 and 0.233 (N.B. test1 is faster on the first call to it but stabilises to this speed over multiple calls to the function).
Using a local variable for toDegrees makes it faster still – test1 then takes 0.086 – a 10x improvement over test0 (and I’m sure it’s no coincidence that this is also the time that test1 achieved on its first call).
These results were very consistent over 100 tests of each function. I got no mood spikes at all.
It looks like the AS3 compiler is pretty darn smart. My guess is that test0 is faster in the old code block because either
1) the compiler recognizes 180/Math.PI as a constant expression and inlines it into the loop body
2) the compiler recognizes 180/Math.PI as something that will remain constant every loop iteration, and pre-calculates it in a register before the loop executes.
I guess that since “toDegrees” is not an expression in test1, it can’t be optimized the same way.
*shrug*
Wait a second… you are intending to time the execution of code but you injected the trace() into your test… you’re timing how fast trace() works.
That is, change:
trace((getTimer()-time)/1000);
to
var d = (getTimer()-time)/1000;
trace(d);
And… in my tests… you’ll see that toDegrees (the second test) consistently runs faster (as you expected in the first place).
Actually, the expectation was that toDegrees would be slower, as it is in AS2. Moving the trace doesn’t change the overall result of the test.
Very interesting test. Some things to try:
1) Declaring a type for myangle:Number speeds it up a lot, eliminating overhead.
2) Using int instead of Number for i helps a bit as well.
3) Order of operations: a*180/Math.PI is compiled to (a*180)/Math.PI, at least in AS1 Flasm. Using a*(180/Math.PI) is theoretically better; I sure wish I had an AS3 decompiler.
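Here is a rough sketch of the first test with all three tips applied at once. The function name runTest0Typed is made up for illustration, and I have not measured how much each change contributes, so treat it as something to experiment with rather than a proven result.

[as]import flash.utils.getTimer;

// Hypothetical variant of runTest0 with the three tips applied.
function runTest0Typed():void {
    var a:Number = 90;
    var myangle:Number;                        // tip 1: typed result variable
    var time:int = getTimer();
    for (var i:int = 0; i < 100000; i++) {     // tip 2: int loop counter
        myangle = a * (180 / Math.PI);         // tip 3: group the constant part
    }
    // Compute the elapsed time first, then trace it.
    var elapsed:Number = (getTimer() - time) / 1000;
    trace(elapsed);
}
runTest0Typed();[/as]

Whether the parentheses in tip 3 actually change the generated bytecode in AS3 is an assumption on my part; it would need a look at the compiled output to confirm.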
The AS3 compiler makes no assumptions about the value of Math.PI. Try
var Math:Object = {PI:3};
trace( Math.PI );
In AS3 the output is “3”, in AS2 it’s “3.14159265358979”. As a result the AS3 compiler can’t optimise 180/Math.PI and has to evaluate it every time through the loop.
It’s good to see AS3 behaving more like other programming languages I’ve used.
N.B. The time difference I experienced between local and non-local variables for toDegrees disappeared after implementing the trace as Phillip suggested.
N.B.2 You get a massive speed boost if you type myangle to a Number –
var myangle:Number =...
Well, I’m totally confused now. But it’s fun stuff to mess with. 🙂
So, reading through the comments and not being able to keep track of everyone’s tips, I have a few questions:
1. Does the trace affect the speed?
2. Does declaring a type for myangle:Number speed it up?
3. Any idea what’s causing the spikes (are there spikes?), and is it the GC?
4. Combining everyone’s tips, what is the most efficient way of writing this?
I could probably spend many more hours testing this stuff.
1. theoretically, the value passed to trace() is calculated first, which includes the elapsed time, and THEN trace is called with that evaluated value, so trace shouldn’t affect the timing. *Theoretically*.
2. I believe Number should be faster than int or uint.
http://www.gskinner.com/blog/archives/2006/06/types_in_as3_in.html
http://kuwamoto.org/2006/06/15/avoid-ints-in-actionscript/
3. GC is my guess. Or some other process running on the CPU, taking up cycles. No idea. That’s why you run lots of tests and take averages.
4. I may look at that over the weekend, and come up with a definitive test. Then again, I may not. 🙂
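In the meantime, here is a minimal sketch of the "run lots of tests and take averages" idea from point 3. The helpers timeOnce and averageTime, the body0/body1 names, and the run count of 100 are all placeholders I made up; this is not the definitive test, just one way to smooth out the occasional spike.

[as]import flash.utils.getTimer;

const toDegrees:Number = 180 / Math.PI;

// Hypothetical helper: milliseconds taken by one call of testBody.
function timeOnce(testBody:Function):int {
    var start:int = getTimer();
    testBody();
    return getTimer() - start;
}

// Hypothetical helper: mean time in milliseconds over several runs.
function averageTime(testBody:Function, runs:int):Number {
    var total:Number = 0;
    for (var r:int = 0; r < runs; r++) {
        total += timeOnce(testBody);
    }
    return total / runs;
}

// Same loop bodies as the original test0 and test1, without the timing code.
function body0():void {
    var a:Number = 90;
    for (var i:int = 0; i < 100000; i++) {
        var myangle:Number = a * 180 / Math.PI;
    }
}
function body1():void {
    var a:Number = 90;
    for (var i:int = 0; i < 100000; i++) {
        var myangle:Number = a * toDegrees;
    }
}

trace("test0 average ms:", averageTime(body0, 100));
trace("test1 average ms:", averageTime(body1, 100));[/as]

A plain mean still gets pulled up by the spikes; sorting the individual timings and taking the median would probably be a fairer number, though I have not tried that here.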
I think the rule is to only optimize as the last step. Sure, you don’t need to do things in a way you know is slow… but a lot of these tests often don’t apply to real projects. I’m not saying the whole discussion is dumb, just that it’s best to not really worry about sub-atomic time differences until after you’ve pieced something together and it otherwise works as you want. I’ve made the mistake of cleaning up or optimizing code that I later ended up removing because the feature was ditched. Just some thoughts for those looking for definitive answers.
P.S. those spam protection questions are hard. Sum of 19 + 21? First I had to do some quick addition in my head. Then, I had to think about whether it was some trick because the “+” indicates sum already… if it had said “what’s the result of 19 + 21?” I wouldn’t be pondering this so long. I guess the result is the same. Still… I guess it’s a small price to pay for spam protection.
Phillip, I absolutely agree about optimizing last. It’s a pretty solid principle. This is more experimental/research/geekfest kind of stuff. Still, it’s nice to know that of two similar ways of doing something, one might be twice as fast. Then, when you get to that optimization step, you know what to do.
As for the spam questions, well, I’m glad you figured it out. 🙂
Don’t forget to subtract the base cost:
the time of the for loop, Math.random(), variable declaration, and operations.
function runTest2() {
var time:Number = getTimer();
var a:Number = 180;
for (var i:Number = 0; i
Bah, can’t post code. Anyway, I think the local variable is closer to 20 times faster than Math.PI when you remove the other factors from the test.
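For anyone who wants to try the base-cost idea, here is one way it could be sketched. This is not the code that got cut off above, just an illustration under my own assumptions; the function names and the ITERATIONS constant are made up.

[as]import flash.utils.getTimer;

const toDegrees:Number = 180 / Math.PI;
const ITERATIONS:int = 1000000;

// Baseline: the loop, Math.random() and the assignment, but no conversion.
function timeBaseline():int {
    var start:int = getTimer();
    for (var i:int = 0; i < ITERATIONS; i++) {
        var myangle:Number = Math.random() * 360;
    }
    return getTimer() - start;
}

// Conversion written inline with Math.PI.
function timeInline():int {
    var start:int = getTimer();
    for (var i:int = 0; i < ITERATIONS; i++) {
        var myangle:Number = Math.random() * 360 * 180 / Math.PI;
    }
    return getTimer() - start;
}

// Conversion using the precomputed toDegrees constant.
function timeLookup():int {
    var start:int = getTimer();
    for (var i:int = 0; i < ITERATIONS; i++) {
        var myangle:Number = Math.random() * 360 * toDegrees;
    }
    return getTimer() - start;
}

var base:int = timeBaseline();
trace("inline minus base (ms):", timeInline() - base);
trace("lookup minus base (ms):", timeLookup() - base);[/as]

Since a single run is noisy, the differences can come out tiny or even negative, so it is worth repeating the whole thing several times before reading anything into the numbers.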
It only goes to show that you should write readable code first and foremost, and optimize later. Make no assumptions about what is more efficient before you have actually verified that it is so, and, more importantly, that it makes a difference in the end.