Introduction to the StrongARM
Revision 3, 04-Oct-96

Performance

Performance issues

The StrongARM has significantly different performance characteristics to older ARM processors. It is clocked 5 times faster than any previous ARM, and many instructions execute in fewer cycles. In particular:
For fuller information see the StrongARM Technical Reference Manual, available from Digital Semiconductor's WWW site (currently at http://www.digital.com/info/semiconductor/dsc-strongarm.html).

The StrongARM's cache and write buffer are also significantly better than those of previous ARMs, allowing an average fivefold speed increase despite the unaltered system bus. Pumping large amounts of data will still be limited by the system bus, but the write buffer can be used to interleave a large amount of processing with memory accesses. For example, on StrongARM it is quicker to plot a 4bpp sprite to a 32bpp mode than to plot a 32bpp sprite to a 32bpp mode; the latter case is pure data transfer, while the former involves less data transfer, with the processing interleaved (i.e. effectively free).

The long cache lines of the ARM710 and StrongARM can impact performance. A random read or instruction fetch from a cached area will load 8 words into the cache; this can make traversal of a long linked list inefficient, since each node visited may pull in a whole line of which only a word or two is used. It is also often worth aligning code to an 8-word boundary. In current versions of RISC OS, modules are loaded at an address 16*n+4. Future versions of RISC OS will probably load modules at an address 32*n+4, so it is worth aligning your service call entries appropriately in preparation for this change.

Two significant disadvantages of StrongARM compared with previous processors are:
Note that future processors will no doubt have different performance characteristics again; you shouldn't optimise your code too much for one particular architecture at the expense of others. However, you should now have a better idea of how to get good performance from your StrongARM.
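
The alignment advice above can be worked through as simple arithmetic. The sketch below is illustrative only: it assumes a word is 4 bytes (so an 8-word cache line is 32 bytes) and that a module is loaded at an address of the form 32*n+4, as the text anticipates for future versions of RISC OS. The function names and the example offset are hypothetical and not part of any RISC OS interface.

```python
CACHE_LINE_BYTES = 32  # 8 words x 4 bytes; the line length is taken from the text


def padding_to_line(offset, load_remainder=4, line=CACHE_LINE_BYTES):
    """Bytes of padding to insert before the code at `offset` (bytes from
    the start of the module) so that it begins on a cache-line boundary.

    Assumes the module is loaded at an address of the form
    line*n + load_remainder (32*n+4 by default, per the text).
    """
    return -(load_remainder + offset) % line


def lines_touched(address, length, line=CACHE_LINE_BYTES):
    """Number of distinct cache lines covered by `length` bytes at `address`."""
    if length <= 0:
        return 0
    return (address + length - 1) // line - address // line + 1


# Hypothetical example: an entry point 0x40 bytes into the module.
pad = padding_to_line(0x40)
print("padding needed:", pad, "bytes")

# The same 32 bytes of code straddle two lines when misaligned, one when aligned.
print("misaligned:", lines_touched(0x8004, 32), "lines")
print("aligned:   ", lines_touched(0x8020, 32), "line")
```

Under the current 16*n+4 scheme the load address modulo 32 may be either 4 or 20, so padding worked out this way can only guarantee 4-word (16-byte) alignment; pass `line=16` to see that case.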