Re: Questions on possreps



On May 30, 5:09 pm, Erwin <e.sm...@xxxxxxxxxxx> wrote:
On 30 May, 05:04, David BL <davi...@xxxxxxxxxxxx> wrote:





On May 29, 8:12 am, Erwin <e.sm...@xxxxxxxxxxx> wrote:

I haven't gone through the other responses first (perhaps I should
have), but anyway, here goes (see inlined answers):

On 26 May, 06:04, David BL <davi...@xxxxxxxxxxxx> wrote:

3) Is it correct that a POSSREP typically doesn't unambiguously define
a method for encoding values of type T on some physical medium?

The POSSREPs as defined in the TTM literature have nothing to do with
physical encoding.

4) Consider a POINT value encoded in computer memory as a '\0'-
terminated ASCII string such as "(1.34,-6.2)", which must be parsed
according to some well-defined grammar in order to retrieve the x,y
coordinates. Does this qualify as a CARTESIAN representation?

It could qualify as a valid POSSREP for the POINT type, but it is a
different one from the CARTESIAN possrep typically used in
Date&Darwen's relational writings.

What do you mean, it could qualify as a valid POSSREP for POINT, when
you just stated that POSSREPs have nothing to do with physical encoding?

Sorry. Missed the "suppose ... is encoded in memory as a nul-
terminated string".

I meant to say that there is nothing in the concept you describe that
would make it an "invalid" (logical) possrep. In fact, ALL types can
"conceptually" be regarded as having a "STRING" possrep, which
produces the "externalization" of any value of the type. This is more
or less equivalent to Java's toString() method, which exists for all
objects.

I'm not sure what you are saying there. I'll try to describe how I'm
thinking about it in more detail:

Types, and hence values and operators, are pure mathematical
abstractions. Nested operator invocations provide an entirely
sufficient means for (logically) representing values. I find it
superfluous to introduce POSSREPs for this purpose (as though
operators were inadequate!), along with all the other confusing and
redundant vernacular: "atomic", "encapsulated", "scalar", "structure",
"selector", "dummy type", etc. In fact I'm now thinking of POSSREPs as
merely a peculiar syntactic sugar for declaring operators.

Operators, being pure abstractions, must be distinguished from any
implementation as executable routines on computers. In fact operators
are the basis for data representation and not just the basis for
defining calculations to be executed. It seems that this data
representation perspective was missed when the redundant idea of
POSSREPs was introduced, and yet, strangely, it became apparent when
each POSSREP implicitly declared a kind of operator called a selector.

Let STRING be a type. POINT and STRING are distinct types. I see no
need to complicate things by trying to formalise some notion of a
"possible representation" of POINT involving STRING. All that's
required are /explicit/ unary operators that map POINT to STRING and
vice versa.
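
As a rough illustration of the two explicit unary operators, here is a
minimal Python sketch. The names Point, point_to_string and
string_to_point, and the exact grammar accepted, are assumptions made
for this example only; nothing here is Tutorial D or taken from the
TTM literature.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """Hypothetical stand-in for a POINT value selected by cartesian(x, y)."""
    x: float
    y: float


# Assumed grammar: "(" number "," number ")", optional spaces around numbers.
_POINT_RE = re.compile(r"\(\s*([^,()\s]+)\s*,\s*([^,()\s]+)\s*\)")


def point_to_string(p: Point) -> str:
    """Explicit unary operator POINT -> STRING."""
    return f"({p.x!r},{p.y!r})"


def string_to_point(s: str) -> Point:
    """Explicit unary operator STRING -> POINT; rejects other strings."""
    m = _POINT_RE.fullmatch(s)
    if m is None:
        raise ValueError(f"not a POINT literal: {s!r}")
    return Point(float(m.group(1)), float(m.group(2)))


p = Point(1.34, -6.2)
s = point_to_string(p)      # a STRING value, distinct from the POINT value
assert string_to_point(s) == p
```

POINT and STRING stay distinct types throughout; the only link between
them is the pair of operators.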

Even though my example involved a string representation I consider it
to be a physical representation of the value

cartesian(1.34,-6.2)

and not

tostring(cartesian(1.34,-6.2)) = "(1.34,-6.2)"

since I expressly stated that a POINT value was encoded.

The way in which a region of memory is interpreted as a value depends
on some context. Given that physical representations of one type are
typically implemented in terms of physical representations of other
types, it is not generally possible to assume there is some unique
context for interpreting a region of memory as a value.
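
To make the dependence on context concrete, the following Python
sketch reads the same eight octets three ways. The choice of octets
and of the three interpretations is arbitrary and purely illustrative.

```python
import struct

# Eight octets, here taken from the start of the string "(1.34,-6.2)".
raw = b"(1.34,-6"

# Context 1: eight ASCII characters.
as_text = raw.decode("ascii")

# Context 2: one little-endian IEEE 754 binary64 number.
(as_double,) = struct.unpack("<d", raw)

# Context 3: two little-endian 32-bit unsigned integers.
as_lo, as_hi = struct.unpack("<II", raw)

# The octets are identical in all three cases; only the context differs.
assert struct.pack("<II", as_lo, as_hi) == raw
```

Nothing in the octets themselves says which of these readings is
intended.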


However, in its role as a "physical possrep", there is more to it than
what you say. It is not sufficient to say "encoded as a nul-terminated
string" to determine the actual bit pattern for a value, because the
encoding of the string itself is also relevant and a possible source of
differences. UTF-8 gives different bit patterns from UTF-16. Byte
ordering conventions produce different bit patterns for any kind of
integer (except single-byte ones, of course), and so on. Switching the
order of the X and Y components in an encoding that is based on the
(two-component) CARTESIAN possrep will produce different bit patterns
too.
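
The three sources of variation mentioned here can be shown with a
short Python sketch. The particular layouts (UTF-16 little-endian, a
32-bit integer, two binary64 components) are assumptions chosen for
illustration, not anything prescribed by the possrep concept.

```python
import struct

text = "(1.34,-6.2)"

# 1. String encoding: same characters, different octets.
utf8 = text.encode("utf-8") + b"\x00"            # 1 octet per character here
utf16 = text.encode("utf-16-le") + b"\x00\x00"   # 2 octets per character here
assert utf8 != utf16 and len(utf16) == 2 * len(utf8)

# 2. Byte order: same integer, different octets.
little = struct.pack("<i", 1)
big = struct.pack(">i", 1)
assert little != big and little == big[::-1]

# 3. Component order: same point, different octets.
xy = struct.pack("<dd", 1.34, -6.2)
yx = struct.pack("<dd", -6.2, 1.34)
assert xy != yx and xy[:8] == yx[8:]
```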

If I had said a Unicode string then I would agree that the encoding
needs to be specified as well. However, I specified ASCII, a character
encoding that defines a correspondence between 7-bit binary patterns
and character symbols. RFC 20 suggests embedding each code in an octet
with the high-order bit always 0, so I think it can be assumed there is
no ambiguity on machines with octets as the native addressable data
type.

ASCII was incorporated into the Unicode character set as its first 128
symbols, so the ASCII characters have the same numeric codes in both
sets. This allows UTF-8 to be backward compatible with ASCII. However,
UTF-8 is a multibyte encoding of Unicode and is therefore distinct
from ASCII.
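
A small Python check of this relationship, using the example string
from question 4 and one arbitrarily chosen non-ASCII character:

```python
text = "(1.34,-6.2)"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")

# For pure ASCII text the two encodings give identical octets,
# each with the high-order bit clear.
assert ascii_bytes == utf8_bytes
assert all(b < 0x80 for b in ascii_bytes)

# Outside the first 128 code points the encodings part ways:
# UTF-8 needs more than one octet, and ASCII cannot encode it at all.
non_ascii = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
assert len(non_ascii.encode("utf-8")) == 2
try:
    non_ascii.encode("ascii")
    ascii_can_encode = True
except UnicodeEncodeError:
    ascii_can_encode = False
assert not ascii_can_encode
```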


That was my point where I said "targeted to the machine": specifying
a physical possrep requires the specification of _every single detail_
that plays a role in determining the ultimate bit pattern. For this
reason, your CARTESIAN possrep does not qualify as a valid physical
possrep: it is incomplete.

I agree that a physical representation must be well defined.


5) Is it correct to say that in Tutorial D types are sometimes defined
in terms of possreps?

Yes. In fact this holds for all of them, except for the "basic" ones
such as INTEGER, BOOLEAN, ...

Would you say union types are defined in terms of POSSREPS?

Oops. You're probably right. I wasn't thinking of those. UNION
types typically will not even have a POSSREP of their own (although
it is always possible to conceive of a toString()-like POSSREP which
includes both the "actual" type name and the externalization of the
value). But beware that in the Manifesto, declaring a UNION type is
_not_ merely a matter of saying:

TYPE PLANE_FIGURE IS ELLIPSE UNION POLYGON;
/* where ELLIPSE and POLYGON were formerly root types */

Strictly speaking, the Manifesto requires that "simultaneously", the
declaration of the existing root types is changed to

TYPE ELLIPSE IS PLANE_FIGURE ... ,
TYPE POLYGON IS PLANE_FIGURE ... ;
/* note the comma, as opposed to a semicolon */

It seems awkward to need to specify one subtype relationship in two
places.


