Re: flash and external ram timing on TI 2812



On Thu, 30 Mar 2006 06:12:25 -0600, "laurence" <l.cazaban@xxxxxxxxxx>
wrote in comp.dsp:

Hello !
Would someone be able to explain me how to calculate the execution time of a
NOP instruction when it is executed into flash and into external ram of a DSP
TI 2812?
Is it normal to have a time more important (2*) into ram than into flash ?
Is it normal to see that a change into the configuration of XINTF doesn't
change the execution time ?
Thanks a lot for your answers

I want to add to what Noway2 said. He is correct, as I said in a
reply to an earlier post you made, that you can't really measure
anything by the time of a single NOP instruction, due to CPU core
pipelining, and also due to flash pipelining if you turn that on.

If you look closely at the timing sheets for the external bus
interface, it is very difficult to meet the read timing spec in less
than 4 CPU clock cycles. Look at the set-up time for data in to the
processor. Theoretically, if you had an external RAM with about a
2 ns access time to data out, you could meet it in 3 CPU clocks.

Due to pipelining there is no single time for a single instruction.
There are ways you can get some idea of the best and worst cases for
execution speed, however.

One of the ways I use for checking timing on this part is to use one
of the three general purpose timers, free-running for the whole 32 bit
range. Then you can read the timer into a register, perform some
other operations, and read the timer into a second register. The
difference between the two register values is the actual number of
clock cycles it took to execute the operations in between.
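
The subtraction itself can be sketched in C as below. This is only an
illustration, not code from a real project: the function names are
hypothetical, and the down-counting variant assumes the timer decrements
on each clock (as I understand the 2812 CPU timers do; check the device
documentation). Because the arithmetic is unsigned and 32 bits wide, the
result stays correct even if the counter rolls over once between the two
reads.

```c
#include <stdint.h>

/* Hypothetical helper: cycles elapsed between two reads of a
 * free-running 32-bit timer that counts DOWN. Unsigned subtraction
 * wraps modulo 2^32, so one rollover between the reads is harmless. */
static uint32_t elapsed_down(uint32_t first, uint32_t second)
{
    return first - second;
}

/* Hypothetical helper: the same calculation for a timer that
 * counts UP instead. */
static uint32_t elapsed_up(uint32_t first, uint32_t second)
{
    return second - first;
}
```

On the target, first and second would be the two values read from the
timer's count register before and after the code under test.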

To study the best case speed, you need code that will be fully
pipelined, by the flash and the core. You need assembly language code
that contains about 20 NOPs, then reads the timer, performs some
instructions, and reads the timer again. Run this code from internal
RAM, external RAM, and flash.

To get the worst case speed, jump to the instruction that performs the
first timer read from other code that is far away. That flushes the
flash and CPU pipelines and represents the worst case.

It is normal for code from flash (1 + 5 wait states = 6 clocks total)
to run faster than code from external RAM (1 + 3 wait states = 4
clocks total) when the flash pipeline is turned on. I have never
tried to time it exactly, first because of the issues I mentioned
above (jumps and loops in the machine code flush the pipelines), and
second because it does not matter to my application. The
speed-up from external RAM to flash could perhaps be as much as 2x at
times, although in general I don't think it is quite that high.
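
The wait-state numbers can look backwards at first, since a flash access
costs more clocks than an external RAM access. A rough model of
straight-line code shows how the flash can still win. It rests on
assumptions that are not stated in this post: that with the flash
pipeline on, each flash access delivers several 16-bit words (4 is used
below, assuming a 64-bit wide flash read), while each external RAM
access delivers one word. The function name is hypothetical.

```c
/* Hypothetical model: average clocks per 16-bit word for straight-line
 * code, given the clocks one memory access takes and the number of
 * 16-bit words that access delivers (an assumed figure, see above). */
static double clocks_per_word(unsigned access_clocks,
                              unsigned words_per_access)
{
    return (double)access_clocks / (double)words_per_access;
}
```

Under those assumptions, clocks_per_word(6, 4) gives 1.5 for flash and
clocks_per_word(4, 1) gives 4.0 for external RAM. That is an upper bound
for code with no jumps; every jump discards the words already fetched,
which pulls the real ratio down.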

It does not matter in our applications because all the time critical
high-speed interrupt service routines are executing from internal RAM,
which is faster than external RAM or flash. We do all our timing
testing with the rest of the code running in external RAM, and verify
that it meets all the timing requirements. Since it only runs faster
from flash, we know that the flash will also meet the timing
requirements, with more to spare.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~ajo/docs/FAQ-acllc.html

