Re: Replacing BibTeX (was: biblatex-apa - work underway)



Sorry to jump in late, but I am interested in the discussion because it seems to be leading to a better way to manage bibtex/biblatex databases. This is one of the hard things about bibtex, the other being changing a style or writing a new style to conform to some standard. I haven't had time to investigate biblatex fully, but I am confident from what I've read that it will take care of the style problem when the LaTeX community develops a critical mass of biblatex styles.

It is already possible to manage a bibtex database in Zotero, which will export to bibtex database format from its internal format. The ability with Zotero to capture references from, say, library web sites already reduces the amount of time spent entering bibliographic data, though many references obtained this way need to be cleaned up a bit before they are really useful.

It is interesting to me that Zotero uses an SQLite database for its internal storage; this suggests that it would be possible to use standard database management tools on the bibliographic data, in addition to the rather limited management facilities present already in Zotero.
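
As a rough illustration of what "standard database management tools" could look like here, the following Python sketch runs an ordinary SQL query against a SQLite store. The table name `refs`, its columns, and the two sample rows are invented for the example; Zotero's actual schema is laid out differently and would need to be inspected first.

```python
import sqlite3

# Hypothetical, simplified schema and made-up sample rows, for illustration
# only. This is NOT Zotero's real table layout.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE refs (
        citekey TEXT PRIMARY KEY,
        author  TEXT,
        title   TEXT,
        year    INTEGER
    );
    INSERT INTO refs VALUES ('Smith99', 'Smith, Ann', 'An Example Book', 1999);
    INSERT INTO refs VALUES ('Jones01', 'Jones, Bob', 'Another Example', 2001);
""")


def refs_since(connection, year):
    """Return (citekey, title, year) rows dated `year` or later, oldest first."""
    cursor = connection.execute(
        "SELECT citekey, title, year FROM refs WHERE year >= ? ORDER BY year",
        (year,),
    )
    return cursor.fetchall()


recent = refs_since(conn, 2000)
everything = refs_since(conn, 1900)
```

The same query could just as well be typed into any SQLite client; the point is only that no Zotero-specific tooling is involved once the data sits in a SQL database.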

AFAIK, there is as yet no Zotero to biblatex converter. It would be useful to have one. Even better would be a mechanism to query the Zotero database directly so the biblatex end-user wouldn't have to export from Zotero every so often.

I think the ideal bibliographic data system would:
1) capture references from the web,
2) store them in a way that could be managed with standard database tools,
3) provide a user-friendly data entry environment, and
4) be directly accessible by biblatex.
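
Until something can read the database directly, point 4 could be approximated by a small export step. The sketch below, in Python, serializes one record to .bib syntax, which biblatex can read. The function name `to_bib_entry` and the sample record are made up for the example, and no escaping of special characters is attempted.

```python
def to_bib_entry(key, entry_type, fields):
    """Serialize one record as a .bib entry.

    `fields` maps field names to values; fields whose value is None are
    skipped. Values are written verbatim inside braces, so characters that
    are special to TeX are NOT escaped in this sketch.
    """
    lines = [f"@{entry_type}{{{key},"]
    for name, value in fields.items():
        if value is not None:
            lines.append(f"  {name} = {{{value}}},")
    lines.append("}")
    return "\n".join(lines)


# Made-up sample record, for illustration only.
entry = to_bib_entry(
    "Smith99",
    "book",
    {"author": "Smith, Ann", "title": "An Example Book", "year": 1999, "note": None},
)
```

Run over every row of the database, this would reproduce the periodic export that a direct query mechanism is meant to make unnecessary.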

Tom

On 2008-11-14 12:27:32 -1000, donatetofoss@xxxxxxxxx said:

On Nov 14, 9:27 am, Dan Luecking <LookIn...@xxxxxxxx> wrote:
Pointers to basic (preferably
self-contained) documentation would be especially welcome.

Unfortunately, this area still needs work. Basic information
(including links) can be found at:
http://en.wikipedia.org/wiki/Citation_Style_Language
http://www.zotero.org/support/dev/creating_citation_styles

Zotero's wiki page
<http://www.zotero.org/support/dev/csl_syntax_summary>
is probably the closest thing to Oren Patashnik's
"Designing BibTeX Styles."


Developers need to know how things will affect users.

Yes, of course. My main point was that Simon raised "CSL" as a
potential future successor to BibTeX (requiring further development) &
much of the frustration in the thread seemed to come from those that
were looking into using CSL with LaTeX right now.


But I can read the documentation of the .bst language (and point
anyone to that documentation, a single file).

It is actually interesting to trace Oren's 1988 manual to Patrick
Daly's 1999 documents to more recent documents, such as the LaTeX
Companion and Nicolas Markey's tutorial "Tame the BeaST." (I
certainly hope that it would take less than a decade to have CSL+LaTeX
work more nicely together & less than two decades for end-user
documentation that was easy to understand!)

I agree that .bst is better documented than the newer CSL!


I thought CSL was a system: people have been contrasting
it to bibtex. From the above, it seems to me that downloading
a CSL is only part of the system, and therefore could hardly
be said to be "working".

Yes, CSL is a part of a system (just as .bst files are; it is the part
that describes how to format citations). The word "BibTeX" has
(perhaps unfortunately) been used to describe:
(1) The whole referencing system
(2) The .bib flat-file database format
(3) Various versions of the 'bibtex' program
and even, occasionally, (4) the BibTeX Style Templates (BST).

'CSL' only describes an equivalent to (4). The nearest equivalents to
the others:
(1) probably 'XBib' <http://xbiblio.sourceforge.net/>, but most
implementations aren't hosted there.
(2) perhaps the bibliographic ontology <http://bibliontology.com/>,
but there are no implementations of this yet & CiteProc often has
other input formats (see below)
(3) most often 'citeproc' with some link to the language it was
implemented in.
(4) CSL


Allow me to draw equivalents. It is difficult, given the multiple
implementations of CiteProc+CSL. We will look at:
(a) Zotero
(b) pandoc+citeproc-hs

With bibtex I can
With (a) or (b), you can...

- obtain a working bibtex by installing TeX Live or MiKTeX.
(a) Install Firefox+Zotero
(b) install haskell, citeproc-hs, and pandoc

- I feed it .bst styles and .bib databases.
(a) Feed Zotero with CSL styles and your Zotero database (which can be
fed MODS, .bib, .ris, and other diverse formats).
(b) Feed pandoc+citeproc-hs with CSL styles and MODS XML

- It is capable of sorting, parsing names, etc., and writing
arbitrary text (usually some sort of bibliography environment
complete with \bibitem commands for each entry).
Zotero and pandoc+citeproc-hs are capable of sorting, parsing names,
and writing arbitrary text, but primary formats are currently:
(a) text, HTML, RTF, or through plugins to word processors
(b) Markdown (although pandoc has numerous other import/export
formats, including LaTeX)

- I train it by writing a .bst or finding one that does what I
want.
You train it by writing a .csl or finding one that does what you want.


- I *use* bibtex by running "bibtex filename" where
filename.aux contains the name of the database,
the .bst and the citation keys. (Usually, but not
necessarily, this file is written by LaTeX macros.)
(a) You use a graphical user interface that uses the formats
enumerated above.
(b) You run 'pandoc --csl apa.csl --mods modsCollection.xml filename'
'apa.csl' describes how citations are formatted
'modsCollection.xml' has reference information (equivalent to a .bib
file)
'filename' is most often a markdown-formatted file that uses cite
tags resembling [Smith99; Jones01@ p. 10] (note the ability to specify
a 'locator' within a text.) Pandoc reads and writes other formats
(HTML, LaTeX, etc.).
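
To make the cite-tag shape concrete, here is a rough Python sketch that splits a tag like the one above into keys and optional locators. It follows only the single example given here (keys separated by ';', locator after '@') and is not pandoc's actual grammar; the function names are invented for the example.

```python
import re

# Matches one bracketed tag such as "[Smith99; Jones01@ p. 10]".
# Assumption: a tag contains no nested brackets. This pattern would also
# match other bracketed text, so it is only a sketch.
CITE_TAG = re.compile(r"\[([^\[\]]+)\]")


def parse_cite_tag(tag_body):
    """Split the inside of one tag into (key, locator) pairs.

    The locator is None when the citation has no '@' part.
    """
    citations = []
    for part in tag_body.split(";"):
        key, _, locator = part.partition("@")
        citations.append((key.strip(), locator.strip() or None))
    return citations


def find_citations(text):
    """Return one list of (key, locator) pairs per bracketed tag in `text`."""
    return [parse_cite_tag(match.group(1)) for match in CITE_TAG.finditer(text)]


found = find_citations("As argued elsewhere [Smith99; Jones01@ p. 10].")
```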


What does CiteProc do with the document itself? Is a citeproc
implementation tied to a particular document preparation system?

If you think of "CiteProc" as a "meta project," it is agnostic to
document preparation systems. There are multiple implementations of
CiteProc & each different implementation is able to use different
document preparation systems.


Bibtex, for example, uses only information supplied by the user.
I could feed it that information without any document and it would
happily write me a .bbl file.

CiteProc is similar.
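
On the bibtex side, the point about needing no document can be sketched as follows: the .aux file only has to name the citation keys, the style, and the database. The Python helper `minimal_aux` and the names 'refs' and 'plain' in the usage line are placeholders for the example; only the three .aux commands themselves are bibtex's.

```python
def minimal_aux(keys, database, style):
    """Build the text of a minimal .aux file that bibtex can process.

    bibtex looks for \\citation{key}, \\bibstyle{style} and
    \\bibdata{database} lines; nothing else from a LaTeX run is needed.
    """
    lines = [f"\\citation{{{key}}}" for key in keys]
    lines.append(f"\\bibstyle{{{style}}}")
    lines.append(f"\\bibdata{{{database}}}")
    return "\n".join(lines) + "\n"


# Placeholder names: 'refs' stands for refs.bib, 'plain' for plain.bst.
aux_text = minimal_aux(["Smith99", "Jones01"], "refs", "plain")
```

Writing `aux_text` to filename.aux and running "bibtex filename" should then produce filename.bbl, provided the named .bib and .bst files can be found.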


With the right .bst it could
probably write it with troff markup instead of TeX.

The same can be done in CSL, by making a style that uses structured
text (such as the CSL file that generates a BibTeX .bib database).
However, since particular implementations of CiteProc understand
various document systems, there's no reason to put this lexical
formatting in a CSL file. Instead, you use semantic information that
is transformed to a specific language by CiteProc.


What is the format of the citation database? With bibtex it is
the .bib file, which is independent of the bst. I assume the
database for citeproc consists of xml markup, is it independent
of the csl? Independent of the citeproc implementation?

Again, this is implementation-specific (see above). It is not tied to
XML; Zotero uses their own database. Many implementations currently
support MODS XML, though.


Also .bib files are "open-ended". That is, a bibliographic
record can contain arbitrary fields. A BST determines which
ones are used. Is that true of the citeproc/csl/database
system?

Not only are some supported database formats extensible, but they can
be truly hierarchical.
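
As an illustration of "hierarchical," MODS lets a record for an article contain a nested record for the journal that hosts it. The Python sketch below reads both titles from a small, made-up MODS fragment; the sample titles and the function name `own_and_host_title` are invented for the example.

```python
import xml.etree.ElementTree as ET

MODS_NS = {"m": "http://www.loc.gov/mods/v3"}

# Made-up record: an article nested inside the journal that hosts it.
SAMPLE = """
<mods xmlns="http://www.loc.gov/mods/v3">
  <titleInfo><title>An Example Article</title></titleInfo>
  <relatedItem type="host">
    <titleInfo><title>An Example Journal</title></titleInfo>
  </relatedItem>
</mods>
"""


def own_and_host_title(record_xml):
    """Return (title of the item, title of its host item or None)."""
    root = ET.fromstring(record_xml.strip())
    own = root.findtext("m:titleInfo/m:title", namespaces=MODS_NS)
    host = root.findtext(
        "m:relatedItem[@type='host']/m:titleInfo/m:title", namespaces=MODS_NS
    )
    return own, host


titles = own_and_host_title(SAMPLE)
```

In a flat .bib record the journal would be just another string field; here it is a record in its own right that can carry its own fields.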


As above, not much right now: you can generate LaTeX-formatted markup

Hell, that's exactly what a .bst does. That, plus sorting
and data manipulation (changing case, parsing names).

Yes-and-no. Because no CiteProc implementation currently has a
complete/native understanding of LaTeX, the .bbl files are quite a bit
more useful (containing semantic information, leaving cite commands in
place, etc.) than current LaTeX generated by pandoc/citeproc-hs.


or can use it as a round-about way to end up with a BibTeX file. In

The phrase "Bibtex file" is ambiguous. Do you mean a .bib database?

Yes.


--
Tom Dye
T. S. Dye & Colleagues, Archaeologists, Inc.
Honolulu, Hawai`i
