Re: Build your own Forth for Microchip PIC (Episode 837)
- From: byron@upstairs.(none) (Byron Jeff)
- Date: Sat, 23 Jun 2007 17:07:05 -0500
In article <137quf979qhcr75@xxxxxxxxxxxxxxxxxx>,
Elizabeth D Rather <eratherXXX@xxxxxxxxx> wrote:
Byron Jeff wrote:
[Snippage]
There is at least a draft proposal for a cross-compiler addendum to ANS
Forth, at ftp://ftp.forth.com/pub/ANSForth. The actual proposed
standard is in XCtext5.doc, and non-normative explanatory appendices in
XCapp5.doc. There are also pdf forms. The '5' indicates that this is
the 5th draft, following an extended public review period.
(NOTE: at this moment, 9:30 am Hawaii time on 6/23, that link isn't
working. I'm trying to find out why and get it fixed. Meanwhile, if
you really want the docs now, email me erather {at} forth {dot} com)
I'll take a look. I can wait.
3. For the purposes of embedded development, an environment that
appears as close to self-hosting as possible would be desirable.
... Elizabeth gives an explanation of the tethered Forth
approach to embedded systems development here:
http://tinyurl.com/yv6z99
Yes, this abbreviated description is consistent with the draft standard
above. It was developed jointly by FORTH, Inc. and MPE in the late
90's, and both FORTH, Inc. and MPE have been developing and using
cross-compilers that work this way since then. It's very mature and
well-understood technology.
Then that's a draft I definitely want to take a look see.
That said, improvements do occur. By hosting the actual compilation on
a powerful desktop, you can do things that are both difficult and
inappropriate on a limited target, such as compiling optimized target
machine code (which we do). The compile-to-optimized-machine-code
approach not only generates much faster code than the traditional
indirect-threaded approach, it's comparable in size on even small
targets such as the 8051, and significantly smaller on larger ones such
as the 68K family.
I'm not necessarily opposed to the optimized-machine-code approach. It's
just that with the PIC based forths I've seen so far, there's no
incremental way to do it. So it reverts to the traditional
compile-the-whole-app, download-the-whole-app, test cycle that is
associated with traditional HLLs. And the target in this instance cannot
be programmed at wire speed. So downloading an entire program to test
takes a while. It's not like a PC where compiling is virtually
instantaneous.
So the question is "Given an embedded micro, what is the minimum forth
environment required to execute Forth words?". If we can answer this
question then along with the memory read/write instructions it should be
possible to migrate Forth words (both code and compiled) to the target
and have them executed there. If you can incrementally migrate the
application to the target while developing it, then when you finish you
can simply disconnect the target from the host and run the entire
application on the target. All the while during development, you have
the full facilities of the hosted forth environment available for you to
interactively develop the application.
Well, obviously that depends both on the overall architecture you're
using (e.g. compiling optimized machine code vs. supporting an indirect
threaded model), not to mention the needs of the applications you're
going to be using it for. For example, if you know you're going to be
doing a lot of certain kinds of functions (e.g. string management or
double-length arithmetic) you'd go with code implementations of those
key functions from the get-go, rather than having to go back and
optimize them later.
Obviously there are design issues there. The point is that if you're
starting with a blank slate, per se, then having a blended approach (a
combo of optimized native and high level compiled code) to development
is a win.
As you point out, especially in embedded systems development, there are
segments where really, really fast is critical and others where fast
enough is good enough. The tethered approach gives you a rapid
prototyping platform for your target by using the host to implement your
words.
That's a great idea. I'm struggling with the migration of those words to
the target. I'm trying to find a way out of what I perceive as a "gotta
compile the whole shebang" trap.
So the way I see it, even with a collection of optimized code words, in
order to gain the interactivity on the target I crave, I'd still have to
implement an inner interpreter to string the collection of optimized and
non-optimized words together, right?
Efficiency of execution isn't my primary goal. If I wanted to achieve
that then working on optimizing the PicForth compiler would be a better
use of my time. The tool I'm seeking is a fluid, interactive development
environment where at the end everything ends up on the target, so I can
untether it, and it works fast enough to get the specified task done.
...In my estimation one needs two sets of items to execute forth words:
1. The forth virtual machine including the standard registers and
stacks.
2. Three critical words: ENTER, EXIT, and NEXT.
In short create enough Forth to execute the inner interpreter and you
can get going.
Except that if you compile to code NEXT goes away, and ENTER/EXIT are
merely subroutine call/return machine instructions.
But in my self-chosen constrained target environment this approach fails
on several levels given the goals I hope to achieve:
1. The PIC's hardware stack is limited. Subroutine calls are simply not
an option because the stack overflows after only 8 levels of calls. This
is controllable in an assembly environment. But with Forth specifically
designed around making calls, it's a guaranteed path to doom.
2. Taking this route commits you to compiling your entire application
because once you do away with the inner interpreter, then everything on
the target must be compiled.
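
To make the ENTER/EXIT/NEXT idea concrete, here is a minimal sketch of a
threaded inner interpreter in Python. It is an illustration only, not
PicForth or Frank's kernel: the class name VM, the word names N0..N12,
and the representation of a colon definition as a list of word names are
all assumptions made for the example. The point it shows is that return
addresses live in an ordinary list (standing in for RAM on the target),
so nesting is not bounded by an 8-level hardware call stack.

```python
# Sketch of a threaded inner interpreter. All names here are hypothetical.
# A primitive is a Python callable; a colon definition is a list of word names.

class VM:
    def __init__(self):
        self.data = []          # data stack
        self.rstack = []        # return stack held in RAM, not the hardware stack
        self.max_depth = 0      # deepest return-stack nesting observed
        self.words = {}         # name -> callable (primitive) or list (colon word)
        self.thread, self.ip = [], 0

    def enter(self, body):
        # ENTER: remember where the caller was, then start on the new body.
        self.rstack.append((self.thread, self.ip))
        self.max_depth = max(self.max_depth, len(self.rstack))
        self.thread, self.ip = body, 0

    def exit(self):
        # EXIT: resume the caller where it left off.
        self.thread, self.ip = self.rstack.pop()

    def run(self, name):
        self.thread, self.ip = [name], 0
        while True:
            if self.ip >= len(self.thread):     # end of a body acts as EXIT
                if not self.rstack:
                    return                      # top-level thread finished
                self.exit()
                continue
            word = self.words[self.thread[self.ip]]   # NEXT: fetch and advance
            self.ip += 1
            if callable(word):
                word(self)                      # primitive (a CODE word)
            else:
                self.enter(word)                # high-level definition


def build_demo(levels):
    """Build a chain of colon words N0..N(levels-1), each calling the previous."""
    vm = VM()
    vm.words["1+"] = lambda m: m.data.append(m.data.pop() + 1)
    vm.words["N0"] = ["1+"]
    for i in range(1, levels):
        vm.words[f"N{i}"] = [f"N{i - 1}"]
    return vm


if __name__ == "__main__":
    vm = build_demo(13)
    vm.data.append(0)
    vm.run("N12")
    print(vm.data, vm.max_depth)
```

Running the thirteen-deep chain nests well past eight levels without any
host-language recursion, which is the property a RAM-based return stack
would buy on the PIC.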
Let me outline how I envision using the target to give you a sense of
why I'm looking for a blended approach.
1. Starting out on a new project. Grab a part and use the traditional
programmer to dump the core executive on the part. Put traditional
programmer away until the next project because I absolutely detest
having to have a special programmer just to dump code on the chip.
Another advantage to Frank's kernel is that if it can write program
memory then it can serve as a bootloader for the chip even if I wanted
to dump something non-Forth into it. I've tasked one of my summer interns
with writing a PIC16F bootloader in Forth combined with a picoforth
kernel that can program the PIC's program memory.
2. Wire up the project with the serial (or USB) interface and hook up to
the PC. Fire up gforth on the host and load the standard port
definitions and whatever words I have from previous projects that I
often use for embedded systems projects. Don't compile or download to
the target yet simply because I don't necessarily know what words I'll
actually need for the project.
3. As of now the target only has the microexecutive on it. But it's enough
to get started. I wire up whatever I/O I need for my application and
either test prewritten words on that I/O or write up new words necessary
to exercise it. All the debugging is done on the PC in gforth initially
until I'm happy with the result. I now start migration. When I get a
word I'm sure I'm going to need on the target, I move that word to the
target. Depending on the speed requirements, this may be a compiled word
which essentially functions as CODE, or a high level definition if speed
isn't critical. I retest the word on the target to make sure it works as
expected. Once that's done the word is added to the target wordset on
gforth and any further usage of that word will be remotely called.
4. Continue the process of building the application and incrementally
moving needed words to the target. Eventually the application will be
complete and well tested and all the words moved to the target and
nothing other than a GO command being run on the host.
5. Untether the target board, put it into service. Rinse and repeat with
the next project adding any interesting new words generated and tested
for this application to the hopefully growing library of useful words
that have been developed over previous projects.
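
The five steps above can be sketched as a small host-side dispatcher. This
is a rough model under stated assumptions: Host, Target, migrate, and
untetherable are hypothetical names, the "target" is just another Python
object rather than a board behind a serial link, and "migrating" a word
moves the same Python callable instead of compiling it and writing it to
program memory.

```python
# Sketch of incremental migration of words from host to target.
# Every class and method name here is hypothetical.

class Target:
    """Stand-in for the tethered board."""

    def __init__(self):
        self.words = {}

    def install(self, name, fn):
        # On real hardware this would write compiled code to program memory.
        self.words[name] = fn

    def call(self, name, stack):
        return self.words[name](stack)


class Host:
    """Host-side Forth session that knows which words have been migrated."""

    def __init__(self, target):
        self.target = target
        self.local = {}         # words still running on the host
        self.remote = set()     # names of words that now live on the target

    def define(self, name, fn):
        self.local[name] = fn

    def migrate(self, name):
        # Move a tested word to the target and mark it as remote.
        self.target.install(name, self.local.pop(name))
        self.remote.add(name)

    def execute(self, name, stack):
        # Remote words are called on the target; everything else runs locally.
        if name in self.remote:
            return self.target.call(name, stack)
        return self.local[name](stack)

    def untetherable(self):
        # True once no application word is left on the host.
        return not self.local


if __name__ == "__main__":
    host = Host(Target())
    host.define("DOUBLE", lambda s: s[:-1] + [s[-1] * 2])
    print(host.execute("DOUBLE", [21]))   # runs on the host
    host.migrate("DOUBLE")
    print(host.execute("DOUBLE", [21]))   # now dispatched to the target
```

The caller never changes how it invokes DOUBLE; only the dispatch table
does, which is what lets the application move across one word at a time.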
Now in my view, if an inner interpreter doesn't exist on the target, then
activities 3 and 4 cannot be done. The inner interpreter is critical in
order to have both incremental compilation/movement of words to the
target and to facilitate the distributed execution of the application
between the host and target.
Did I miss something?
It seems to me that neither of those two items is so complicated that
it could not be easily put together for a target. And Frank's 3 word
forth implements an inner-inner interpreter that can be used to create
the inner interpreter required to run forth words on the target.
Indisputably the traditional indirect-threaded model is easier for
newbies to implement, but may not be satisfying for challenging or
time-critical applications.
Most embedded applications are not so time critical that it really
matters from a development standpoint. There's always a way to make it
faster. What I'm looking for is a way to make development easier.
This discussion reminds me of Brooks' 90/10 rule, which I read in The
Mythical Man-Month nearly 25 years ago. There's no need to have 100% of
the code
optimized if only 10 percent of it is time critical. Local optimizations
are easy to do, simply compile (or write in assembly) critical words.
I addressed this issue when developing NPCI. Clearly running tokenized
bytecode isn't the fastest kid on the block. Speedup when necessary
could be achieved by hooking optimized machine language code to the
interpreter, then calling that code from the high level bytecode. It was
my equivalent of a CODE word, though Forth's specification mechanism is
vastly more elegant than my own.
Microcontrollers also have mechanisms for dealing with time critical
stuff. Another reason I love using PICs is the wide variety of hardware
peripherals they come packaged with. UARTs, multiple timers, PWM, ADC,
and the like are really set/autopilot types of tools. Interrupts can be
used to
buffer really time sensitive stuff.
If all else fails after developing the application, simply run it all
through an optimizing compiler removing the inner interpreter altogether
along with other connecting tissue between words.
There's always a faster crystal you can throw in. I'll probably test my
stuff at 8 MHz because PIC parts have that oscillator built in. If it's
not fast enough I'll throw a 20 MHz crystal at it. If that's not fast
enough I'll get a Propeller and now I have eight 20 MIPS processors to
handle the work. I can use the same technique to migrate the code there,
or simply run the finished product through Cliff's PropellerForth
compiler.
My time on a project is spent developing it. You (that would be
Elizabeth) pointed out in several posts over the years that developing
in a full fledged forth environment is a good thing. I agree. I firmly
believe that environment includes interactive and incremental
development. I'll sacrifice performance to get the project working.
"Make it work, then make it fast (only if necessary)".
Only one piece of the puzzle is missing at this point. Elizabeth
discusses in her post above that the XTL transfers the stack between the
host and the target. In short it implements a form of distributed
execution where you muster the stacks for RPC. In doing so one can run
an application with a set of words distributed between the host and the
target.
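
As a rough illustration of what "mustering the stacks for RPC" could look
like, the sketch below packs the host's data stack into a byte frame,
hands it to a simulated target, and unpacks the stack that comes back.
The frame layout (one count byte followed by little-endian signed 16-bit
cells) and the function names are assumptions for the example; the post
does not describe the XTL's actual wire format.

```python
import struct

# Sketch of marshalling a data stack for a remote call.
# Frame layout and function names are assumptions, not the XTL protocol.

def pack_stack(stack):
    """One count byte, then each cell as a little-endian signed 16-bit value."""
    if len(stack) > 255:
        raise ValueError("stack too deep for a one-byte count")
    return struct.pack(f"<B{len(stack)}h", len(stack), *stack)


def unpack_stack(frame):
    """Inverse of pack_stack: read the count, then that many 16-bit cells."""
    (count,) = struct.unpack_from("<B", frame, 0)
    return list(struct.unpack_from(f"<{count}h", frame, 1))


def target_execute(word, frame):
    """Simulated target side: rebuild the stack, run the word, send it back."""
    stack = unpack_stack(frame)
    word(stack)
    return pack_stack(stack)


def remote_call(word, stack):
    """Host side: ship the stack over, then adopt whatever stack is returned."""
    return unpack_stack(target_execute(word, pack_stack(stack)))


def plus(stack):
    # Stand-in for a word that lives on the target.
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)


if __name__ == "__main__":
    print(remote_call(plus, [2, 3, 4]))
```

Because the whole stack travels in both directions, the host does not
need to know how many arguments a remote word consumes or produces.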
So I envision an environment where Forth words are quickly and
interactively developed on the host. Once satisfied they work, they are
incrementally compiled on the host and transferred to the target.
Applications can be a set of shared words between the two until
complete, at which point all needed words are transferred to the target,
completing the app.
That tends not to be satisfactory, because most embedded apps feature
custom I/O, so real target-specific words can't be tested on the host
without your having to develop simulators for the I/O functions.
I don't think so. Frank's microkernel is really just remote memory
access. So if you have real I/O hardware wired to the target, then you
can write words on the host that access that hardware. It certainly
won't be full speed, but you can do it. Once it works in practice, you
then take the word, compile it, and transfer it to the target. In the
compilation you substitute direct memory access for remote memory
access, so Frank's XC@ compiles to @ and XC! compiles to !.
Now you won't be able to gather a frame of video from a flash ADC this
way, but for testing it'll be useful.
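
The substitution described above can be sketched in a few lines. This is
a toy model under loud assumptions: the target is a bytearray rather than
a PIC on a serial tether, XC@ and XC! are modelled as byte-wide accesses,
the word LED-ON and the address used for PORTB are illustrative only, and
compile_for_target does a plain token swap rather than real compilation.

```python
# Sketch of remote memory access words and their compile-time substitution.
# Class, function, and word names are hypothetical.

class TetheredTarget:
    """Simulated target memory; a real tether would move bytes over serial."""

    def __init__(self, size=256):
        self.mem = bytearray(size)

    def xc_fetch(self, addr):
        # XC@ : read one byte of target memory from the host.
        return self.mem[addr]

    def xc_store(self, value, addr):
        # XC! : write one byte of target memory from the host.
        self.mem[addr] = value & 0xFF


# Remote-access words and the direct-access words that replace them
# once the definition is moved onto the target.
REMOTE_TO_LOCAL = {"XC@": "@", "XC!": "!"}


def compile_for_target(source):
    """Rewrite a host definition so it uses direct memory access."""
    return " ".join(REMOTE_TO_LOCAL.get(tok, tok) for tok in source.split())


if __name__ == "__main__":
    PORTB = 0x06                       # illustrative address only
    target = TetheredTarget()
    target.xc_store(1, PORTB)          # exercise the pin from the host
    print(target.xc_fetch(PORTB))
    print(compile_for_target(": LED-ON 1 PORTB XC! ;"))
```

The host version of a word and the target version differ only in which
memory words they name, so a word debugged over the tether needs no
logical change when it migrates.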
It's a lot easier to just go with a fully functional XTL and do all your
testing on the target. Among other benefits, that means you can use
Forth to debug your target hardware, which is wonderful.
What does a fully functional XTL offer? Right now it's kind of a
black box to me. Please enlighten me.
There are still details that need to be worked out such as how to
differentiate between local words and remote words in both systems and
how to facilitate transferring the stacks between the two. In both cases
solutions should be geared towards simplifying the target.
That differentiation is typically done with wordsets (formerly known as
vocabularies). The draft standard identifies "scopes" of words for the
host, cross-compiler, and target; they are usually implemented with
wordsets, although the draft standard doesn't mandate any particular
implementation strategy.
I read that in one of the later chapters of Steven's book. Still a bit
fuzzy as to whether there's a concept of a local interpreter and a
remote interpreter though.
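
One way to picture the wordset idea is a set of per-scope dictionaries
searched in a chosen order. The sketch below is only a guess at the shape
of such a scheme: the scope names HOST, COMPILER, and TARGET, the Scopes
class, and the string "meanings" are all hypothetical, and the draft
standard (as noted above) does not mandate any particular implementation.

```python
# Sketch of scoped word lookup using wordsets. All names are hypothetical.

class Scopes:
    def __init__(self):
        # One dictionary (wordset) per scope.
        self.wordsets = {"HOST": {}, "COMPILER": {}, "TARGET": {}}
        self.order = ["HOST"]           # current search order

    def define(self, scope, name, meaning):
        self.wordsets[scope][name] = meaning

    def set_order(self, *scopes):
        self.order = list(scopes)

    def find(self, name):
        # Return (scope, meaning) from the first wordset that knows the name.
        for scope in self.order:
            if name in self.wordsets[scope]:
                return scope, self.wordsets[scope][name]
        return None


if __name__ == "__main__":
    s = Scopes()
    s.define("HOST", "DUP", "run on the PC")
    s.define("TARGET", "DUP", "call the word on the board")
    print(s.find("DUP"))                # host version found first
    s.set_order("TARGET", "HOST")
    print(s.find("DUP"))                # target version now shadows it
```

The scope a name resolves in tells the system whether to execute the word
locally or hand it to the target, which is one plausible answer to the
local-versus-remote question.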
I do believe that the minimal kernel still needs to be specified. It'll
give a list of words that needs to be implemented, tested, and
incorporated in the initial kernel. I can help here because as I stated
earlier my NPCI bytecode interpreter implements a stack based virtual machine
and is written in PIC assembly for the 16F family. It has debugged code
for all 16 bit arithmetic and logical ops, number processing for stacks,
and conditional/unconditional jumps. It already implements an IP for the
bytecode.
Well, that sounds useful. Frankly, the reason so little Forth work has
been done on the PIC family (and PICForth is so minimal) is that its
instruction set is really a bit below the mark for a reasonable
implementation. The PIC18 should be do-able, although we haven't yet
seen much demand for it.
It's a chicken and egg problem. Anyone who has a real project with a real
deadline will most likely either choose an existing development
environment for the target or choose a chip that is better supported by
Forth. I have the luxury of being an academic and a hobbyist. It also
helps to have a virtually unlimited supply of interns. So I can throw
resources at a project like this because it interests me, not because of
a deadline. There's of course a catch-22 to that too, which is that
since it isn't deadline driven, development tends to be bursty.
I look forward to hearing your comments on these thoughts.
I think it would be a good investment of your time to take a hard look
at existing mature Forth cross-compilers. You can get a CD with
extensive docs and links to free evaluation versions of our SwiftX
cross-compilers for many chips (8051, 68HCS08, 68HC11, MSP430, AVR, ARM,
68HC12, 68K family, Coldfire, more) for only $15. You can get supported
boards for most of these processors very inexpensively. The evaluation
compilers are limited only in the size of the target app you can
develop, so you can exercise them and learn a lot. For more info go to
http://www.forth.com/embedded/index.html.
I'll take a look. But frankly I won't get the warm fuzzies about it
until I'm sure that it in fact offers the type of environment I'm hoping
to run. It's also a complication that SwiftX is a Windows product (and
justifiably so) and I'm a Linux guy (also justifiably so).
Thanks for the input. I'll take it under advisement and continue to
press on.
BAJ