Re: Three broken compressors
- From: "Matt Mahoney" <matmahoney@xxxxxxxxx>
- Date: 19 May 2006 15:14:54 -0700
cr88192 wrote:
"Jim Leonard" <MobyGamer@xxxxxxxxx> wrote in message
news:1148053977.407041.279760@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
cr88192 wrote:
ok, this makes sense...
why are the contents of wikipedia a single huge xml file rather than a huge
number of small files?...
It's a database, actually, a very large distributed database across
hundreds of servers. It's not one single XML file -- that's just the
EXPORT you get if you ask for the data.
thought:
a database can be easily enough exported to xml. a problem would be a
database itself based on generalized xml. I would suspect that this would
need at least some manner of simplistic schema system, or embedding special
attributes in the tags, or similar...
dunno, misc really...
it is a thought, I have done binary xml formats before, but have not used
them that heavily since then. the realization eventually became that xml
can't effectively represent some things. it is easier to represent xml in
other data than other data in xml (even if xml can be used for nearly all
the data in a format, what little it cannot represent maps over very poorly).
luckily, now, I have better solutions (eg, xml as chunks within a more
generalized container format).
or such...
The Wikipedia export actually makes very little use of XML. It is mainly
used for titles, timestamps, and user IDs on articles. The article itself is
just one long string in a <text> tag. All of the structure such as headings,
links, tables, lists, etc. uses special characters (wiki markup) embedded in
the text.
-- Matt Mahoney
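
To make the point above concrete, here is a minimal sketch in Python. It is not taken from the thread. The sample document is a hypothetical, heavily simplified stand-in for one page of an export: the element names follow the description above (title, timestamp, user, <text>), but the title, user name, timestamp, and article body are invented, and real exports carry an XML namespace and more fields that are omitted here. The helper name summarize_page and the regular expressions are assumptions for illustration and cover only a small part of wiki markup.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified sample: every value below is made up for illustration.
SAMPLE_EXPORT = """\
<mediawiki>
  <page>
    <title>Data compression</title>
    <revision>
      <timestamp>2006-01-01T00:00:00Z</timestamp>
      <contributor><username>ExampleUser</username></contributor>
      <text>'''Data compression''' reduces size.

== Lossless ==
See [[Huffman coding]] and [[LZ77|LZ77 and LZ78]].
* item one
* item two
</text>
    </revision>
  </page>
</mediawiki>
"""


def summarize_page(page):
    """Split one <page> into XML-level metadata and markup-level structure."""
    # The XML layer only carries a few metadata fields.
    title = page.findtext("title")
    timestamp = page.findtext("revision/timestamp")
    user = page.findtext("revision/contributor/username")

    # The article body is one opaque string as far as XML is concerned.
    text = page.findtext("revision/text") or ""

    # Structure lives inside that string as special characters.
    # These patterns are rough approximations, not a wiki markup parser.
    headings = re.findall(r"^==\s*(.*?)\s*==\s*$", text, flags=re.MULTILINE)
    links = re.findall(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", text)
    list_items = re.findall(r"^\*\s*(.*)$", text, flags=re.MULTILINE)

    return {
        "title": title,
        "timestamp": timestamp,
        "user": user,
        "headings": headings,
        "links": links,
        "list_items": list_items,
    }


root = ET.fromstring(SAMPLE_EXPORT)
pages = [summarize_page(p) for p in root.iter("page")]

if __name__ == "__main__":
    for info in pages:
        print(info)
```

The XML parser hands back the headings, links, and list items only as part of a single string, so any tool that wants the article structure has to deal with the markup separately.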