Working on a CPU-intensive app in Flash is a challenging experience. It can be wonderful or frustrating, depending on your mindset.

Based on my experience with my Flash chess game and other intensive applications, I’ll give some tips and ideas on how to get the most out of your Flash project.

Does your code need to be optimized?

The first hard question you need to answer is whether or not you actually need to optimize the performance. Most Flash apps are not that intensive in terms of processing, so the time would be better spent elsewhere.

Does your code need to be optimized? It does when performance really requires it.

Your code’s speed needs to be optimized only when there’s a hard time constraint. Some examples would be:

  • 3D animation, where the model needs to be rendered fast enough to have a decent framerate;
  • complex physics simulations, again to have a decent framerate;
  • encoding and encryption;
  • AI (pathfinding and others).

Don’t optimize early

Don’t start coding with optimization in mind. It’s a very common mistake and rather unintuitive, but when you start a project you need to concentrate on functionality rather than speed. Write clean code, document it and make sure it works correctly rather than fast. Any non-trivial application will have bugs during development and having to deal with clever tricks won’t make the debugging process any faster.

Don’t worry about optimization from the moment you start coding.

Identify key areas

Once your application (or framework or whatever) works from A to Z and you believe it’s reasonably bug-free, you can start looking at performance. The first thing to do is to establish some sort of metric, for example frames/second, nodes/second, or the time to encode a certain file.
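As a minimal sketch of such a metric, the snippet below times a piece of work with getTimer() and reports units of work per second. The runWorkload() function is a hypothetical stand-in for your own code, and the snippet is written as a timeline script; inside a class the same statements would go into a method.

```actionscript
import flash.utils.getTimer;

// Hypothetical workload: replace the body with a call into your own code.
// It returns the number of units of work done (nodes, frames, bytes...).
function runWorkload():int {
    var count:int = 0;
    for (var i:int = 0; i < 1000000; i++) {
        count++;
    }
    return count;
}

var startTime:int = getTimer();              // milliseconds since the player started
var units:int = runWorkload();
var elapsedMs:int = getTimer() - startTime;

// Guard against a zero elapsed time on very short runs.
var unitsPerSecond:Number = (elapsedMs > 0) ? units * 1000 / elapsedMs : 0;
trace(units + " units in " + elapsedMs + " ms = " + unitsPerSecond + " units/second");
```

Run the measurement several times and on the same machine before comparing numbers, since a single run can be noisy.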

The problem now lies in identifying the really intensive areas, and it’s not as easy as it seems. Depending on your application’s complexity, you may have anywhere from ten to a hundred classes and tens of thousands of lines of code. You may have a rough idea of the most intensive areas, but it’s difficult to pinpoint them.

The best tool I know of to help you at this stage is the Flex Builder Profiler. It shows you the memory usage, but more importantly, it shows you how much time is spent in each function. At this time, I am not aware of any other tool for Flash that does the same, so it may be worth downloading the trial version of Flex Builder and using it at least for profiling. Even if your project is not based on Flex, you should still be able to make the classes work with it, perhaps by adding a simple interface.

When I first ran the profiler on my chess engine, I was quite surprised by the results. I found out that 85% of the time was spent in a little function that was called around 500,000 times. Just by improving that function’s logic (no code optimization per se), the performance increased from 1000 to 7000 nodes/second.

The Flex Builder Profiler (even the trial version) shows you the memory usage and how long each function takes to run.

Always stay strongly typed

Now that you know the most intensive areas, you can prioritize them and start optimizing.

The first, and one of the most important, optimization tips is to always use strongly typed variables. This is especially evident when working with array elements. Since ActionScript arrays can contain any kind of data, the Flash VM must do extra work when processing them.

The first step of optimization: always use strongly typed variables (especially with arrays).

To give you an example, consider this code:

var crtPiece:int = movesArray[i].piece;

If you rewrite this into

var crtMove:Move = movesArray[i];
var crtPiece:int = crtMove.piece;

you’ll notice a massive speed improvement.

The same goes for code like this:

var element:int = matrix[i][j];

which can be rewritten into

var row:Array = matrix[i];
var element:int = row[j];

Alternatively, if you’re targeting Flash Player 10, you can define

var movesArray:Vector.<Move>;

or, for the second case

var matrix:Vector.<Vector.<int>>;

and then you won’t need the extra code.

It’s worth mentioning that the speed is actually about the same when using the array strong type method versus the Vector method, so if you already use the first method, there’s no need to use Vectors for performance reasons.
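If you want to check the gain in your own setup, a rough comparison can look like the sketch below. It uses flash.geom.Point as stand-in data (an assumption; the chess engine’s Move class is not shown in this article) and times untyped element access against access through a typed local.

```actionscript
import flash.geom.Point;
import flash.utils.getTimer;

// Stand-in test data: an untyped Array filled with Point objects.
var COUNT:int = 100000;
var points:Array = [];
for (var k:int = 0; k < COUNT; k++) {
    points[k] = new Point(k, k);
}

var i:int;
var t0:int;

// Untyped access: points[i] has no static type, so .x is resolved at runtime.
var untypedSum:Number = 0;
t0 = getTimer();
for (i = 0; i < COUNT; i++) {
    untypedSum += points[i].x;
}
trace("untyped: " + (getTimer() - t0) + " ms");

// Typed access: copy the element into a typed local first.
var typedSum:Number = 0;
t0 = getTimer();
for (i = 0; i < COUNT; i++) {
    var p:Point = points[i];
    typedSum += p.x;
}
trace("typed: " + (getTimer() - t0) + " ms");
```

Both loops compute the same sum, so any difference in the traced times comes from how the element is accessed.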

As in the code above, it is better to split a compound expression into separate typed declarations.

Don’t use constants

This is a tip I received from a Flash Player engineer at Adobe.

Don’t use constants (a tip from an Adobe developer).

See, I always thought that when you define something like

public const NAME:String = "foo";

the compiler will actually replace any occurrences of the constant with its value. This is not the case.

Using constants is a very good way to keep the code clean and readable and that’s why I wrote earlier that you should optimize only where absolutely necessary.

In my chess game, I used to have nice constants for piece values (e.g. WHITE_PAWN) and even for squares (A1 or H8). Replacing the constants with their values hurt readability, since if (pieceType==3) or if (targetSquare==7) is not terribly intuitive, but the speed increase is significant.
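One possible way to soften the readability loss, sketched below, is to leave the old constant name next to the literal as a comment. The pairing of 3 with WHITE_PAWN here is purely illustrative, not the actual encoding used in the chess game.

```actionscript
// Illustrative values only: 3 standing for WHITE_PAWN is an assumption.
var pieceType:int = 3;
var pawnCount:int = 0;

// The literal keeps the speed, the comment keeps the meaning.
if (pieceType == 3 /* WHITE_PAWN */) {
    pawnCount++;
}
```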

The compiler does not inline constants. They keep the code clean, but they bring no speed benefit.

Minimize function calls

Simplify your operations.

This is another potentially painful decision. Years ago, looking at some C code, I wondered why the author was defining complex macros like

#define MIN(a,b) ((a)<(b) ? (a) : (b))

instead of writing a function.

It’s because it’s faster to do so. If you call a simple function 100,000 times, you’ll get a benefit from just writing an equivalent inline. This is especially true for functions in the Math class. You can rewrite

var x:int = Math.min(a,b)

into

var x:int = (a<b) ? a : b;

I removed a number of small functions (1-2 lines) that were dealing with conversions or simple math, for some 10-15% speed increase.
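The same pattern applies to a few other small Math calls. The sketch below shows inline equivalents for integer values; the variable names are only for illustration.

```actionscript
var a:int = -5;
var b:int = 3;

// Instead of Math.max(a, b):
var larger:int = (a > b) ? a : b;        // 3

// Instead of Math.abs(a):
var magnitude:int = (a < 0) ? -a : a;    // 5

// Instead of Math.floor(7.9): int() truncates toward zero,
// so it only matches Math.floor() for non-negative values.
var whole:int = int(7.9);                // 7
```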

In the same vein, don’t use Array methods like push(), pop(), shift() or unshift(); you can rewrite the code a little and get better performance.
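As one example of such a rewrite, a stack can be kept in a preallocated array with a manually maintained index instead of push() and pop(). This is a sketch under the assumption that you know an upper bound on the stack depth (256 here is arbitrary).

```actionscript
// Preallocate once; 256 is an assumed upper bound on the stack depth.
var stack:Array = new Array(256);
var top:int = 0;                 // index of the next free slot

// Instead of stack.push(value):
stack[top++] = 42;
stack[top++] = 7;

// Instead of value = stack.pop():
var last:int = stack[--top];     // 7
var prev:int = stack[--top];     // 42
```

The popped slots are not cleared; they are simply overwritten by the next write, which is what keeps this cheaper than resizing the array.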

Use int and bitwise operations

Whenever you deal with integer numbers, use int. Interestingly, using uint does not bring any speed benefits for array indexes for example, so don’t bother with it unless you really really need it (in fact I noticed uint to be slower in certain cases).

Use int and bitwise operations.

Bitwise operations can be anywhere from 2 to 10 times as fast as "normal" operations. This is not a tutorial on bitwise operations (it would help if you understand bits and bytes), but some examples would be:

Multiplication and division by powers of 2

var x:int = a*2;
var y:int = b*16;
var z:int = c/4;

can be rewritten as

var x:int = a << 1; //2^1 = 2;
var y:int = b << 4; //2^4 = 16;
var z:int = c >> 2; //2^2 = 4; 
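One caveat worth checking before replacing a division: for negative numbers the right shift rounds down, while converting the result of a division to int truncates toward zero. A short sketch of the difference:

```actionscript
var positive:int = 7;
var negative:int = -7;

var divPos:int = int(positive / 4);      // 1
var shiftPos:int = positive >> 2;        // 1

var divNeg:int = int(negative / 4);      // -1 (-1.75 truncated toward zero)
var shiftNeg:int = negative >> 2;        // -2 (rounded down)

trace(divPos, shiftPos);
trace(divNeg, shiftNeg);
```

If the values can be negative and you rely on truncation, keep the division or adjust the result after the shift.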

Modulo

Usually you can test if a number n is a multiple of x like this:

if (n % x == 0)

When x is a power of 2, you can rewrite this as

if ((n & (x-1)) == 0)

The extra parentheses are needed because == binds more tightly than &.
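The bitwise form only works when x is a power of 2, because only then is x-1 a mask of the low bits. A small sketch with x = 8, plus a counterexample with x = 6:

```actionscript
// x = 8 is a power of two, so x - 1 = 7 masks the three lowest bits.
var multipleMod:Boolean = (24 % 8 == 0);        // true
var multipleAnd:Boolean = ((24 & 7) == 0);      // true

// For non-negative n the mask also gives the remainder itself.
var remMod:int = 29 % 8;                        // 5
var remAnd:int = 29 & 7;                        // 5

// x = 6 is not a power of two: the two tests disagree.
var sixMod:Boolean = (12 % 6 == 0);             // true
var sixAnd:Boolean = ((12 & 5) == 0);           // false, 12 & 5 is 4

trace(multipleMod, multipleAnd, remMod, remAnd, sixMod, sixAnd);
```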

Precompute / cache results

The tips above were related to coding only; this one is about saving time by having data already calculated.

Math functions like Math.sin(), Math.log() or Math.sqrt() are slow and, moreover, you may need the same value over and over again. For example, assuming you only deal with integer angles, it may be useful to precompute the sine for all 360 degrees and place the values in an array. This way you don’t need to convert from degrees to radians and call Math.sin() at all; just read sine[25] and that’s it.
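A minimal sketch of such a table follows. The wrapping step for angles outside 0..359 is an addition of mine, in case your angles are not already normalized.

```actionscript
// Build the table once, for example at startup.
var sine:Array = new Array(360);
for (var deg:int = 0; deg < 360; deg++) {
    sine[deg] = Math.sin(deg * Math.PI / 180);
}

// Later, in the hot path: a plain array read instead of a Math.sin() call.
var s:Number = sine[25];

// Angles outside 0..359 must be wrapped before the lookup.
var angle:int = -30;
var wrapped:int = ((angle % 360) + 360) % 360;   // 330
var t:Number = sine[wrapped];
```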

In other cases, you may not have the capacity to store all possible values, but if the values are the result of intensive computations, it’s still worth caching them for possible future use. In the case of my chess game, after I evaluate a branch, I store the result in an object using a unique hash. Because the search algorithm can encounter the same board position at different times, it can look up that position to see if it has been handled before, and if so it just takes the stored result.

There are a few catches with caching. The first is that you must create a unique ID (hash) for each value that you cache. Generating unique IDs is not that easy, and it must be fast enough that it does not negate the speed benefit from caching. The other constraint is memory: major chess engines can allocate hundreds of megabytes for their caches, whereas people don’t tolerate this in a Flash app, so you must clean your cache from time to time.
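The general shape of such a cache can be sketched as below. This is not the chess engine’s actual code: evaluate(), the integer hash and the entry limit are all hypothetical, and the cleanup simply throws the whole cache away when it gets too big.

```actionscript
var maxEntries:int = 100000;     // assumed limit; tune it to your memory budget
var cache:Object = {};
var cacheSize:int = 0;

// Hypothetical stand-in for an expensive evaluation.
function evaluate(positionHash:int):int {
    return (positionHash * 31) % 1000;
}

function cachedEvaluate(positionHash:int):int {
    var hit:* = cache[positionHash];
    if (hit !== undefined) {
        return hit;              // seen before: reuse the stored result
    }
    var score:int = evaluate(positionHash);
    if (cacheSize >= maxEntries) {
        cache = {};              // crude cleanup: drop everything
        cacheSize = 0;
    }
    cache[positionHash] = score;
    cacheSize++;
    return score;
}

var first:int = cachedEvaluate(12345);    // computed and stored
var second:int = cachedEvaluate(12345);   // served from the cache
```

Dropping the whole cache is the simplest cleanup policy; evicting only old or rarely used entries keeps more useful results but costs extra bookkeeping.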

Conclusion

Using the strategies above, I was able to improve the performance of my chess game from 7000 to 14000 nodes/second.

You may use some of them or most of them, depending on the project you’re working on. Precomputing and caching is a good idea regardless of the project and bit operations are too fast not to consider. Other strategies are more time consuming or may affect the structure too much.

Good luck