Re: Uses of processing instructions and notations



Tom Anderson wrote:
On Fri, 5 Dec 2008, Philippe Poulard wrote:

This is a somewhat deprecated usage inherited from the SGML days. I remember I used such machinery 10 years ago to have documents composed of text and binary contents. This made some sense since tools supported it (I mean SGML/XML editors), [...]

Ah, i see. Thanks for your answer. I will proceed to completely forget about notations and external entities!

Only if you plan on using XML for data encoding or transmission. If you use XML for normal text documents then Notations and Entities (and Processing Instructions) are important tools for document management.

>> but today, people tend to
>> stick media to their documents à la HTML, with a simple href
>> attribute.

Only for trivial documents. If you are doing large-scale long-term complex document management, simple hrefs just don't cut the mustard.

[tom]
How about notations and external entities? Is the idea that they're a mechanism of linkage to external files that's more concrete than just using a URL? So, if i was writing a bizarro world HTML, i could specify images like this in the DTD:

<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!NOTATION png SYSTEM "http://some-other-kind-of-URL">
<!ELEMENT img EMPTY>
<!ATTLIST img
src ENTITY #REQUIRED
alt CDATA #IMPLIED >

Then in my document i could write:

<!DOCTYPE img PUBLIC "-//Bizarro HTML" "http://bizarrohtml" [
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<img src="lena" alt="picture of Lena"/>

Yes, exactly, except that the NOTATION declaration can be used (some would say abused) for two things:

<!NOTATION jpeg PUBLIC "ISO/IEC 10918-1:1994//NOTATION Digital Compression and Coding of Continuous-tone Still Images (JPEG)" "/usr/bin/eog">

A formal catalog can be used to look up the FPI (the public identifier) and verify that the image conforms to the specification; and the SI (the system identifier) can be used by a processor to run the specified program on the image (e.g. to embed it in a PDF, or convert it to some other format).

Can i also use that entity in regular text, like:

<p>Here is a picture of Lena &lena;</p>

No, because the processor expects inline entities to resolve to processable XML text or markup.
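
To see this in practice, here is a minimal sketch using Python's xml.dom.minidom (which sits on top of the expat parser). The document text is adapted from the example above; the DOCTYPE on p and the helper name is_well_formed are assumptions made for illustration. A reference to an unparsed (NDATA) entity in element content is a well-formedness error, so the parse is expected to fail.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError

# An unparsed entity referenced inline, as in the question above.
BAD_DOC = """<!DOCTYPE p [
<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<p>Here is a picture of Lena &lena;</p>"""


def is_well_formed(text):
    """Return True if expat accepts the document, False if it rejects it."""
    try:
        minidom.parseString(text)
    except ExpatError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed(BAD_DOC))
```

Other parsers may word the error differently, but a conforming XML processor should reject the document either way.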

But another reason for using the technique is for repeated images such as navigation icons. You really don't want to have to add <icon uri="http://some.host.name/dir/dir/dir/someicon.gif"/> every time, especially by hand, when <icon imgref="nextchap"/> is simpler and easier, and lets you manage the icon file references once at the top of the file or in the DTD or (more likely) in a file of entity declarations which can be maintained by a non-XML expert or generated from a database.

And in both cases, what does it *mean*?

The extra level of indirection provides a form of safety-net for your documents. If you have a data warehouse with all 35,000 of your books, articles, manuals, catalogs, whatever, all referencing obsolete URIs that constantly need updating, you'll find an entity mechanism and a single file a much more efficient way to do it.
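
As a rough sketch of how such a single file of entity declarations might be generated, the Python below turns a name-to-URI mapping into DTD text. The mapping stands in for rows pulled from a database; the function name entity_declarations, the "prevchap" entry and its URI, and the "gif" notation name are all assumptions for illustration.

```python
def entity_declarations(icons, notation="gif"):
    """Build one unparsed-entity declaration per (name, URI) pair.

    `icons` maps entity names to URIs. The notation named here must be
    declared elsewhere in the DTD with a matching <!NOTATION ...>.
    """
    lines = []
    for name, uri in sorted(icons.items()):
        # A system literal is taken verbatim, so pick a quote character
        # that does not occur in the URI instead of escaping anything.
        if '"' not in uri:
            quoted = '"' + uri + '"'
        elif "'" not in uri:
            quoted = "'" + uri + "'"
        else:
            raise ValueError("URI contains both quote characters: " + uri)
        lines.append("<!ENTITY %s SYSTEM %s NDATA %s>" % (name, quoted, notation))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical icon table; in practice this might come from a database.
    icons = {
        "nextchap": "http://some.host.name/dir/dir/dir/someicon.gif",
        "prevchap": "http://some.host.name/dir/dir/dir/othericon.gif",
    }
    print(entity_declarations(icons), end="")
```

The output can be saved to a file and pulled into each document's DTD, so that a changed URI is corrected in one place rather than in every document.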

If i parsed that into DOM and called getAttribute("src") on the img element, what would i get back?

I would expect it to return the name of the entity ("lena"). A different function should be available (as in XSLT) to resolve the entity reference against the declaration and return the name of the physical file.
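
For one concrete DOM, here is a hedged sketch with Python's xml.dom.minidom. It assumes minidom's usual behaviour of exposing declared entities and notations through the doctype node; other DOM implementations may differ. The helper name resolve_entity_attr is made up for this example, and the declarations are kept in the internal subset so that nothing external has to be fetched.

```python
from xml.dom import minidom

IMG_DOC = """<!DOCTYPE img [
<!NOTATION jpeg SYSTEM "http://some-kind-of-URL">
<!ELEMENT img EMPTY>
<!ATTLIST img
  src ENTITY #REQUIRED
  alt CDATA #IMPLIED>
<!ENTITY lena SYSTEM "lena.jpg" NDATA jpeg>
]>
<img src="lena" alt="picture of Lena"/>"""


def resolve_entity_attr(element, attr_name):
    """Follow an ENTITY-typed attribute to its declaration.

    Returns (entity name, entity system id, notation name, notation system id).
    """
    name = element.getAttribute(attr_name)  # just the entity name
    doctype = element.ownerDocument.doctype
    entity = doctype.entities.getNamedItem(name)
    if entity is None:
        raise KeyError("no entity declared with name %r" % name)
    notation = doctype.notations.getNamedItem(entity.notationName)
    notation_sysid = notation.systemId if notation is not None else None
    return name, entity.systemId, entity.notationName, notation_sysid


if __name__ == "__main__":
    img = minidom.parseString(IMG_DOC).documentElement
    print(img.getAttribute("src"))
    print(resolve_entity_attr(img, "src"))
```

The attribute value itself stays the bare entity name; getting to the physical file takes a second, explicit lookup against the declarations.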

What's the point of being able to declare an attribute as being of type NOTATION?

Explained above.

And does anyone actually use any of this stuff?

Yes, extensively. But typically only in the text document management field. Users of rectangular XML (e.g. spreadsheet data) would not normally have any use for this stuff at all.

///Peter
--
XML FAQ: http://xml.silmaril.ie/