Re: Three broken compressors



cr88192 wrote:
"Jim Leonard" <MobyGamer@xxxxxxxxx> wrote in message
news:1148053977.407041.279760@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
cr88192 wrote:
why are the contents of wikipedia a single huge xml file rather than a
huge number of small files?...

It's a database, actually, a very large distributed database across
hundreds of servers. It's not one single XML file -- that's just the
EXPORT you get if you ask for the data.

ok, this makes sense...

thought:
a database can be easily enough exported to xml. a problem would be a
database itself based on generalized xml. I would suspect that this would
need at least some manner of simplistic schema system, or embedding special
attributes in the tags, or similar...

dunno, misc really...

it is a thought. I have done binary xml formats before, but have not used
them much since then. the realization eventually was that xml can't
effectively represent some things. it is easier to represent xml in other
data than other data in xml (even if xml can carry nearly all the data in
a format, what little it cannot carry maps over very poorly).

luckily, now, I have better solutions (eg, xml as chunks within a more
generalized container format).

or such...

Wikipedia actually makes very little use of XML. It is mainly used for
titles, timestamps, and user IDs on articles. The article is just one
long string in a <text> tag. All of the structure such as headings,
links, tables, lists, etc. use special characters embedded in the text.
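
To make that split concrete, here is a rough sketch in Python. The sample
document is a made-up, heavily simplified stand-in for an export (the element
names, the missing namespace, and the page content are all assumptions for
illustration, not the real dump schema), and summarize() is a hypothetical
helper. The XML parser only gets you as far as the title, timestamp, user ID,
and the raw text; the headings, links, and list items have to be dug out of
the string with pattern matching.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample in the spirit of an export file.
# Real dumps carry a namespace and many more fields.
SAMPLE = """<mediawiki>
  <page>
    <title>Data compression</title>
    <revision>
      <timestamp>2006-05-19T12:00:00Z</timestamp>
      <contributor><id>12345</id></contributor>
      <text>== History ==
See [[Huffman coding]] and [[LZ77]].
* first item
* second item</text>
    </revision>
  </page>
</mediawiki>"""


def summarize(xml_string):
    """Return per-page metadata (from XML) and structure (from the text)."""
    root = ET.fromstring(xml_string)
    pages = []
    for page in root.iter("page"):
        # The article body is one opaque string as far as XML is concerned.
        text = page.findtext("revision/text", default="")
        pages.append({
            # These three come straight from XML elements.
            "title": page.findtext("title"),
            "timestamp": page.findtext("revision/timestamp"),
            "user_id": page.findtext("revision/contributor/id"),
            # These are recovered from special characters inside the text.
            "headings": re.findall(r"^==\s*(.*?)\s*==$", text, flags=re.M),
            "links": re.findall(r"\[\[([^\]|]+)", text),
            "list_items": re.findall(r"^\*\s*(.*)$", text, flags=re.M),
        })
    return pages


if __name__ == "__main__":
    for info in summarize(SAMPLE):
        print(info)
```

The regular expressions cover only the simplest heading, link, and list
forms; they are nowhere near a full parser for the markup.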

-- Matt Mahoney
