Re: Developing an alternative to MessageTrans: announcing Project Babel



In article <BRd*cFUfs@xxxxxxxxxxxxxxxxxxxxxxxxxxx>,
Theo Markettos <theom+news@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> Graham Shaw <gdshaw@xxxxxxxxxxx> wrote:
>> However, there is a way to avoid these difficulties: by removing the
>> ambiguity from the source text. MessageTrans does this by
>> representing each message with a unique code, and essentially the same
>> method could be used at the level of individual words: ensuring that
>> distinct meanings have distinct representations. This is the approach
>> that I am following.

> Wow. That looks impressive. I haven't seen anything so ambitious even
> on other platforms. I do hope you plan to make it multi-platform, as
> it's not just RISC OS that would benefit.

I always write code that is platform-neutral unless there is a good reason
not to. (That said, there's no harm in making RISC OS the primary target
platform.)

> I hope you know what you're getting yourself in for... the engines behind
> things like Systran (Google Translate etc) seem to be pretty complex.

I'm not attempting anything nearly so ambitious: generating natural
language is a much simpler problem than understanding it, and from the
work I've done so far I've reason to believe that generating the short
snippets needed for user interfaces is achievable.

Whether this can be extended to longer tracts, for documentation and
whatnot, I don't know - but even that would not be breaking new ground.

> I'm not familiar with the prior art, but there seem to be other open
> source translation tools about... can you explain where your approach
> fits in?

I'm trading convenience for simplicity and reliability.

(In other words, creating the source text will be rather tedious, but it
will only need to be done once, and there will be little or no risk of
amusing mis-translations.)
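
As a rough illustration of the unique-code-per-meaning idea described at the top of this post, here is a minimal Python sketch. It is not Babel's actual format (there is no specification yet); the sense identifiers, the LEXICON table and the render function are all hypothetical names invented for the example. The point is only that the English word "open" is ambiguous, whereas a source text written in sense codes is not.

```python
# Hypothetical sense identifiers: each names one meaning, not one spelling.
# The entries below are illustrative and not taken from any Babel lexicon.
LEXICON = {
    "en": {"OPEN_ACTION": "open", "OPEN_STATE": "open", "FILE": "file"},
    "de": {"OPEN_ACTION": "öffnen", "OPEN_STATE": "geöffnet", "FILE": "Datei"},
    "fr": {"OPEN_ACTION": "ouvrir", "OPEN_STATE": "ouvert", "FILE": "fichier"},
}


def render(senses, language):
    """Map each sense code to a word in the given language.

    Word order and inflection are deliberately ignored here. A missing
    entry raises an error instead of producing a guess.
    """
    words = LEXICON[language]
    missing = [s for s in senses if s not in words]
    if missing:
        raise KeyError(f"no {language} entry for: {', '.join(missing)}")
    return [words[s] for s in senses]


if __name__ == "__main__":
    for lang in ("en", "de", "fr"):
        print(lang, render(["OPEN_ACTION", "FILE"], lang))
```

Writing OPEN_ACTION rather than "open" is the tedious part, but it is done once, and every target language then gets the intended sense.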

>> - experimental language definitions for small parts of the English,
>> German and French languages;

> Minor bug report - the French lexicon doesn't seem to be up on the
> website.

That's because there isn't anything in the database yet (and there is
virtually nothing in the German one). The experimental versions in the
Subversion repository are more interesting.

(Unfortunately the language definition format is still very fluid at
present, as I gradually work out what features are needed - that's why
there is no specification yet.)

> It might be worth asking someone with experience of non-Indo-European
> languages for an opinion. I know almost nothing about Chinese, for
> example, but imagine it would be quite different. Do you think your
> system would cope with languages with complex case structures (like
> Lithuanian, with 14 cases)?

It's not so much a question of whether it can cope, but rather of what it
can do in a reasonably efficient and compact manner. It is very flexible
because special cases can be added almost anywhere, and I can add (and
have added) new capabilities where doing so reduces the number of special
cases needed.
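
The trade-off between special cases and general capabilities can be sketched as follows. This is only an assumed illustration in Python using English plurals; the SPECIAL_PLURALS table, the RULES list and the plural function are hypothetical and say nothing about how the Babel language definitions are actually written.

```python
import re

# Irregular forms that no rule covers, kept as explicit special cases.
SPECIAL_PLURALS = {"child": "children", "index": "indices"}

# General rules, tried in order. Each rule added here removes a whole
# family of words that would otherwise need an entry in SPECIAL_PLURALS.
RULES = [
    (re.compile(r"([^aeiou])y$"), r"\1ies"),   # directory -> directories
    (re.compile(r"(s|x|z|ch|sh)$"), r"\1es"),  # box -> boxes
]


def plural(noun):
    """Return the plural of an English noun: special case, rule, or default."""
    if noun in SPECIAL_PLURALS:
        return SPECIAL_PLURALS[noun]
    for pattern, replacement in RULES:
        if pattern.search(noun):
            return pattern.sub(replacement, noun)
    return noun + "s"


if __name__ == "__main__":
    for word in ("file", "directory", "box", "key", "child"):
        print(word, "->", plural(word))
```

Without the two rules, "directory" and "box" would each need their own special-case entry; with them, only the truly irregular nouns remain in the table.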

Anyhow, I don't really expect to achieve a 100% solution, and certainly
not at the first attempt. Provided that the source language is
sufficiently universal, though, the language definitions and translation
method matter rather less: awkward cases could always be handled using
an entirely separate back end.

--
Graham Shaw (http://www.riscpkg.org/~gdshaw/)
The RISC OS Packaging Project (http://www.riscpkg.org/)
The RISC OS Toolkit (http://rtk.riscos.org.uk/)