Re: Build your own Forth for Microchip PIC (Episode 837)



In article <137u61koul8n7f1@xxxxxxxxxxxxxxxxxx>,
Elizabeth D Rather <eratherXXX@xxxxxxxxx> wrote:
Byron Jeff wrote:
In article <137sb0raq9lv503@xxxxxxxxxxxxxxxxxx>,
Elizabeth D Rather <eratherXXX@xxxxxxxxx> wrote:

...
I see that I missed explaining something. Code isn't going into RAM. The
code is burned into the flash of the PIC. There's 4-8K of flash on a
typical 16F part now. And the only parts I'll even consider are ones
that can programmatically program their own flash.

There's more than enough space for code. Even if I tokenize, I wouldn't
put those tokens in RAM. RAM is strictly for stacks and variables.

The flash is the reason that I'm harping on the incremental development
model. It's slow to program, much faster to read. The flash is the
Harvard architecture part of the PIC. The Harvard architecture also
dictates that native code cannot be run from RAM anyway.

Thanks, it's a lot clearer now. But now a few more questions.

Is all your flash code space, or can you set it to be part code, part
data?

The flash is readable in a couple of different ways, so data can be
embedded in it. The slow way is to read the flash programmatically. That
gives you access to all 14 bits of each location, facilitating 14-bit
tokens, so you could theoretically have 16K tokens (2^14), though the
token lookup table would in fact fill the flash for the part. The
disadvantage is that reading in this manner is really slow, like 15-20
instruction cycles slow.

The faster way is to use the PIC RETLW instruction, which performs a
return and puts an 8-bit value into the W register. Coupled with the
PIC computed GOTO capability, it's possible to get this 8-bit value in
about 4-5 instruction cycles. But the cost is that you lose nearly half
of the usable bits from the word. BTW this cost is just the cost to
fetch the token. You still need to do the indirect table lookup and call
to the routine in either instance.
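
As a rough illustration of the RETLW approach, here is a small host-side
model in Python. It assumes the mid-range PIC16 encoding of RETLW
(11 01xx kkkk kkkk, i.e. 0x3400 | k); the names retlw and get_token and
the list standing in for flash are hypothetical, and this is not the
NPCI routine itself.

```python
# Host-side model of pulling an 8-bit token out of a 14-bit program word.
# Assumption: mid-range PIC16 RETLW encoding, 11 01xx kkkk kkkk (0x3400 | k).

RETLW_OPCODE = 0x3400   # fixed upper bits of a RETLW instruction word
OPCODE_MASK = 0x3C00    # bits 13..10 identify RETLW; bits 9..8 are don't-care
WORD_MASK = 0x3FFF      # a program word is 14 bits wide


def retlw(k):
    """Build the 14-bit program word for RETLW k (k is an 8-bit literal)."""
    if not 0 <= k <= 0xFF:
        raise ValueError("RETLW literal must fit in 8 bits")
    return RETLW_OPCODE | k


def get_token(flash, address):
    """Return the 8-bit literal carried by the RETLW word at `address`.

    On real hardware this would be a computed GOTO into the table followed
    by the RETLW executing; here the word is simply decoded.
    """
    word = flash[address] & WORD_MASK
    if word & OPCODE_MASK != RETLW_OPCODE:
        raise ValueError("not a RETLW word")
    return word & 0xFF
```

Only the low 8 bits of each 14-bit word carry data in this scheme, which
is where the "lose nearly half of the usable bits" cost comes from.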

Because if you're doing tokens, wouldn't your token tables go in
data space, with your token interpreter and primitives in code space?

Nope. The tokens go into code space as outlined above. The second method
is exactly how I built the NPCI get_token routine. It can fetch an 8-bit
token from anywhere in the flash.

In any case, even 8K is by no means generous for the code you'll need
plus token tables.

It'll be a tight squeeze no doubt. IIRC the current NPCI bytecode
interpreter clocks in at a bit under 2K. But it currently implements a
complete 8 bit and 16 bit set of stack routines along with code for a
frame pointer (C like language). So with some trimming I probably can
get it back down towards 1K.

Also, with some encouragement, I've decided that my first crack at this
is going to be STC (subroutine-threaded code). So I'm keeping tokens in my
back pocket for now. STC means not having to limit the number of words, and
execution will probably be 8-20 times faster than with either of the token
methods.


...

The model I envision:

1. Prototype/test/debug words on host using tether to manipulate I/O
2. Compile words for target substituting local I/O access for remote ones.
3. Transfer words to target flash.
4. Retest word on target.
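
One way to picture step 2 (substituting local I/O for remote I/O) is a
word written against a small port interface, with a tethered backend on
the host and a direct backend on the target. This Python sketch is only
an illustration: TetherPort, LocalPort, blink_on and the send callable
are all hypothetical names, and the PORTB address is an assumption.

```python
class TetherPort:
    """Host-side stand-in: forwards each I/O access over the tether."""

    def __init__(self, send):
        self.send = send  # hypothetical callable that talks to the target

    def write(self, address, value):
        self.send(("write", address, value))


class LocalPort:
    """Target-side version: writes the register file directly."""

    def __init__(self, registers):
        self.registers = registers

    def write(self, address, value):
        self.registers[address] = value & 0xFF


PORTB = 0x06  # assumed register address, for illustration only


def blink_on(port):
    """The word under test; its logic is identical in both environments."""
    port.write(PORTB, 0x01)
```

The word itself never changes between steps 1 and 4; only the port
object it is handed does.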

The difficulty I see with this model is that your host is *very*
different from your target. You can't, for example, test any of your
PIC code (of which there will be some, for your primitives at least,
regardless of what model you use) on the host. And unless you make a
token interpreter on the host as well, there's no way to test that logic.

Tokens are off the table for now, though a token interpreter in forth
wouldn't be that difficult to pull off. I in fact started writing a
token emulator in C precisely so that I could do testing on the host
before dealing with the target.

But what I don't think that you see is that I'm interested in executing
forth, not PIC code, on the host. I'm not concerned about primitives
since the executive is already written.

But you do raise an interesting point of how could forth on the host be
used to assist in the testing of primitives on the target. It brings
several assets to the table:

1. It can assemble and download native PIC code. At 35 instructions I
would guess that a PIC assembler would not take too terribly long to
construct in forth.

2. Via the tether you can program the PIC (a big win, by the way; I'm a
bootloader lover), execute code (Frank's third instruction), view memory,
and exercise I/O if necessary.

3. It can be used as a cross compiler for forth words. The existing
PicForth compiler generates native PIC code.

If I didn't have the executive substantively written, I'd look to writing
it in forth and getting a small kernel going so that I could tether the
target.

What to do if PICforth or Mary didn't exist at all is a discussion for
another day.


The model that I don't want:

1. Compile all words on host
2. Transfer all compiled words to flash on target
3. Debug/test on target.
4. Rinse and repeat.

I can work that model perfectly in assembly, Jal, C, NPCI, Basic, or
any number of traditional languages/development models.

I certainly sympathize with not wishing that model.

That's the only model that I've had in the past for doing development.
Even worse the traditional development model inserts an actual
programmer in the middle of it. Drives me nuts. All of my successful
projects have been done working off a tether with a bootloader.


...
Tokens is the winner in my case too. Still a bit concerned about if I'm
constrained in my token size (or if it really matters).
Not really.

From what I was reading in Brad's article it seems to be possible to
implement the token interpreter so that variable length tokens are
doable. It's a model that I had already implemented for my bytecode
interpreter, so I have no problem with it. So for now that's what I'll
plan to do reserving some of the 8 bit token space for extended word
tokens. It'll make my life on the host a bit more difficult because I'll
need to essentially do some Huffman-style encoding of words to make sure that
the most frequently used words are in the smallest token space. But
again that's an optimization issue, not an implementation one.
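
A minimal sketch of that host-side step, assuming a simple
rank-by-frequency assignment rather than a true variable-bit code. The
function name assign_tokens, the escape value 0xFE and the slot count
are illustrative assumptions, not anything taken from NPCI.

```python
from collections import Counter

ESCAPE = 0xFE  # hypothetical escape byte introducing a 2-byte token


def assign_tokens(word_stream, one_byte_slots=250):
    """Give the most frequently used words the 1-byte tokens.

    Returns a dict mapping each word to its encoded token bytes.
    """
    if one_byte_slots > ESCAPE:
        raise ValueError("1-byte tokens would collide with the escape byte")
    counts = Counter(word_stream)
    # most_common() sorts by descending count; ties keep first-seen order.
    ranked = [word for word, _ in counts.most_common()]
    table = {}
    for rank, word in enumerate(ranked):
        if rank < one_byte_slots:
            table[word] = bytes([rank])
        else:
            index = rank - one_byte_slots
            if index > 0xFF:
                raise ValueError("too many words for one secondary table")
            table[word] = bytes([ESCAPE, index])
    return table
```

Feeding it the word stream of the whole application gives the common
words the short encodings and pushes rarely used words into the
extended space.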

The usual approach is to have 1-byte tokens for your most common words,
maybe 250 of them, and then use the few remaining tokens (e.g. FE) to
signal that there will be a 2nd token used against a secondary table.
So, unless you're planning really ambitious apps (which you're not) you
never need more than 2-byte tokens.
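
The escape-byte scheme just described can be sketched as a decoder like
the following. The table contents and the function name decode are
hypothetical; FE as the escape value is the example from the text.

```python
ESCAPE = 0xFE  # example escape value from the text


def decode(stream, primary, secondary):
    """Translate a token byte stream into a list of word names.

    `primary` maps 1-byte tokens to words; `secondary` maps the byte that
    follows an escape to words in the second table.
    """
    words = []
    i = 0
    while i < len(stream):
        token = stream[i]
        i += 1
        if token == ESCAPE:
            if i >= len(stream):
                raise ValueError("escape byte at end of stream")
            words.append(secondary[stream[i]])
            i += 1
        else:
            words.append(primary[token])
    return words
```

Reserving more than one escape value would simply add more secondary
tables, each selected by its own prefix byte.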

That's what I figured. Somehow in all of this reading I got the
impression that tokens/code had to fit some fixed sized mold. Totally
incorrect of course.


...


I think that you may have convinced me to take a stab at an STC
implementation. It does eliminate the need for the address interpreter
and will speed up execution of the most critical part of the threading
mechanism. It'll save me the code space of a token table. It would be an
absolute no-brainer of a decision for an 18F part (32 level addressable
stack). I figure that if I don't implement recursion it should be fine.

Well, that certainly precludes testing your definitions on the host in
more than a cursory fashion.

Cursory is all I wanted. Being able to test logic and flow without
having to compile and download each time is a big win. It's somewhat a
part of my development style as it is. I generally use a PIC emulator to
test code logic before dealing with real hardware. Helps to get gross
misconceptions out of the way before firing up real hardware.

In fact the emulator is how I'll test all of this development before
firing up any real hardware (which my interns are working on right now).


And if not, I know that I have tokens as a backup.

Exactly how limited were those 8051 stacks?
64 bytes (32 cells) as I recall. Could have been 48 bytes. It's been a
while.

Now we are talking about the hardware subroutine stack right? Just
making sure.

Yes. On the most limited 8051 parts we don't use the Forth return stack
for return addresses, but the subroutine stack.

Got it. That's where I am.


...

The execution model doesn't affect whether you can or cannot do
incremental compilation. What determines that is whether you have a
place to put the downloaded definitions to test them. You make it sound
like the address interpreter is a big deal. It isn't.

From my reading (primarily Brad's articles) it's the core concept of
interpreted Forth. Without it you're back to the traditional compilation
model of inline expansion of native code. I already have a compiler like
that languishing on my hard disk. Not real interested in writing
another.

The address interpreter is a core concept of indirect-threaded code
Forth. Other models work differently, and all may support incremental
compilation.

I see that now. They are in fact completely orthogonal. It just so
happened that the two Pic Forth cross compilers happened to be
constructed in a non-incremental fashion.


...
I guess my question is what is the structure of an optimized compiled
code word then? What I cannot visualize is the linkages between the code
fragments.
It's code in code space. Small primitives or optimized code sequences
are expanded in place. Larger words are called. Linkage is normal
call/return.

So it's STC with inline expansion of smaller fragments. Got it.
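
A toy model of that compilation strategy, in Python. The instruction
strings are placeholders rather than real PIC mnemonics, and
INLINE_LIMIT and compile_definition are hypothetical names; a real
compiler would choose its inlining threshold with more care.

```python
INLINE_LIMIT = 2  # hypothetical threshold, measured in instructions


def compile_definition(body, dictionary):
    """Compile a colon definition to a flat list of instructions.

    `body` is the list of word names in the definition. `dictionary` maps
    each word to its native code (without the trailing return).
    """
    code = []
    for word in body:
        native = dictionary[word]
        if len(native) <= INLINE_LIMIT:
            code.extend(native)           # small word: expand in place
        else:
            code.append("CALL " + word)   # larger word: ordinary call
    code.append("RETURN")
    return code
```

Every word that gets expanded in place is one fewer level of nesting at
run time, at the cost of some code space.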

I think that structurally I can easily see how to compile a definition
into a collection of addresses or tokens. However compiling native code
is a different animal.

Simpler, really.

If it's straight STC you're right it's simpler. Still a bit worried
about the PIC hardware call/return stack. If it overflows, your
application goes into the weeds.

If small primitives are expanded in line, there's less requirement for
calls.

I'll keep that in mind.
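
Since a cross compiler on the host knows the whole call graph, it could
check the nesting depth before anything is downloaded. The sketch below
assumes the 8-level hardware stack of mid-range PIC16 parts and a
hypothetical `calls` mapping produced by the compiler; an interrupt, if
used, would need a level of its own on top of this.

```python
HW_STACK_DEPTH = 8  # mid-range PIC16 parts have an 8-level hardware stack


def max_call_depth(calls, word, _path=()):
    """Return the deepest chain of nested CALLs starting from `word`.

    `calls` maps a word to the words it CALLs; words expanded in line are
    simply left out. Recursion is rejected, since its depth is unbounded.
    """
    if word in _path:
        raise ValueError("recursion through " + word)
    deepest = 0
    for callee in calls.get(word, ()):
        depth = 1 + max_call_depth(calls, callee, _path + (word,))
        deepest = max(deepest, depth)
    return deepest
```

Comparing the result for the top-level word against HW_STACK_DEPTH
flags an overflow on the host instead of in the weeds on the target.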

I have to go. I'll tackle the rest when I get a chance....

BAJ