0.6 release

Bruce D'Arcus
OK, I started to put the first release up on the SourceForge release
system -- version 0.6 -- but got annoyed enough with it (it's sort of
ridiculous how bad the SF interface is, given its function) that I gave
up and just posted it here:

        http://www.users.muohio.edu/darcusb/xslt/citeproc-0.6.tar.gz

This version is basically the "book" version, with all the fixes in
place that I needed to format my recently finished book.

I also fixed the citekey and number class rendering, which somewhere
along the way broke.

Finally, I included another DocBook example stylesheet: the one I used
to format my book.  It processes a series of XIncluded (external)
chapters and then spits everything out into separate files, including
the bibliography and notes.

A changelog is included in the archive, though it covers only the
changes made since I moved the repository to darcs.

The plan:

For the next release I want to:

1.  Remove the existing styles -- which are named with generic class
names -- and replace them with real-world canonical styles that
correspond to these classes.  I will definitely include MLA, APA, and
Chicago, but I could use some help on good candidates for the number
and citekey classes.  If anyone wants to help on the somewhat tedious
task of tracking down citation style details for these, please let me
know.

I want to do this first, because it may uncover some changes I need to
make to the CSL schema before going forward.  I already know, for
example, that the way I handle configuration of "et al." handling is
not sufficient.

2.  Remove the DocBook stylesheets I just added, and put them under
separate version control.  I was finding too much of my changelog
relating to the stylesheets, which are just examples, after all.

3.  The big project: figure out how to reimplement the foot/endnote
class under the new architecture.

4.  Add a simple (Bash, probably) script to make running citeproc
easier for those not familiar with XSLT.

5.  Better (more aimed at end-user) documentation.

BTW, I've run across a novel way to implement a web interface for
editing CSL files; it involves using JavaScript to do drag-and-drop
list reordering.  I know nothing about JS, alas, but it's encouraging
that it should be possible to do what I envision.

Bruce



Re: 0.6 release

Matthias Steffens
On 15 May 2005 at 09:12 -0400 Bruce wrote:

> 1.  Remove the existing styles -- which are named with generic class
> names -- and replace them with real-world canonical styles that
> correspond to these classes.  I will definitely include MLA, APA, and
> Chicago, but I could use some help on good candidates for the number
> and citekey classes.  If anyone wants to help on the somewhat tedious
> task of tracking down citation style details for these, please let me
> know.

I fear I can't help with generic MLA/APA styles since I'm not used to
them but I'd like to try your solution to see how it performs with
styles from my scientific field. I.e. I'd like to generate a few
styles based on your 'author-year.csl'.

> 4.  Add a simple (Bash, probably) script to make running citeproc
> easier for those not familiar with XSLT.
>
> 5.  Better (more aimed at end-user) documentation.

Both would be very helpful for me since I have only very basic
knowledge of XML/XSL etc. I've basically no clue how I would make use
of your solution. So answers to basic questions like:

- what tools are required
- what must be changed (and where) to customize a citation style
- how to call the processor

would help me a lot. In other words: what do I need to do if I want
to send MODS to the processor and get formatted references back (in
plain text or html)?

[Sorry if this is too basic but I just don't know better ;-)]

Thanks, Matthias


Re: 0.6 release

Bruce D'Arcus
On Mon, 16 May 2005 16:18:20 +0200, "Matthias Steffens" said:

> I fear I can't help with generic MLA/APA styles since I'm not used to
> them but I'd like to try your solution to see how it performs with
> styles from my scientific field.

That'd be great.

> > 4.  Add a simple (Bash, probably) script to make running citeproc
> > easier for those not familiar with XSLT.
> >
> > 5.  Better (more aimed at end-user) documentation.
>
> Both would be very helpful for me since I have only very basic
> knowledge of XML/XSL etc. I've basically no clue how I would make use
> of your solution. So answers to basic questions like:
>
> - what tools are required
> - what must be changed (and where) to customize a citation style
> - how to call the processor

Much of this is documented already.  See the main page in the doc
directory and let me know if you have any questions.

All you really need is Saxon 8 (latest version is 8.4; note, though,
this version has some performance issues with my stylesheets).
 
Bruce


Re: 0.6 release

Matthias Steffens
On 16.05.2005 at 7:43 -0700 Bruce D'Arcus wrote:

> > I've basically no clue how I would make use of your solution. So
> > answers to basic questions like:
> >
> > - what tools are required
> > - what must be changed (and where) to customize a citation style
> > - how to call the processor
>
> Much of this is documented already.  See the main page in the doc
> directory and let me know if you have any questions.
>
> All you really need is Saxon 8 (latest version is 8.4; note, though,
> this version has some performance issues with my stylesheets).

Ok, I think I've successfully installed Saxon 8 (I don't know how to
test that it's running successfully, but anyhow).

Using the OSX terminal I've cd'ed into the main 'citeproc-0.6'
directory and executed the following command:

  java net.sf.saxon.Transform -o test.html samples/docbook-test.xml \
  xsl/document/dbng-xhtml.xsl citation-style="author-year"

This gives me the following output:

-- output start -----

CiteProc XSL Stylesheets v0.6.0

citation style: author-year
citation class: author-year

Recoverable error on line 109 of file:/Users/msteffens/Languages/XML/XML%20Tools/Bibliographic%20XML%20Tools/xbiblio/citeproc-0.6/xsl/citeproc.xsl:
  FODC0005: java.net.ConnectException: Connection refused
Error on line 109 of file:/Users/msteffens/Languages/XML/XML%20Tools/Bibliographic%20XML%20Tools/xbiblio/citeproc-0.6/xsl/citeproc.xsl:
  FODC0005: Failed to load document
  http://localhost:8080/exist/servlet/db/mods?_query=declare%20namespace%20mods=%22http://www.loc.gov/mods/v3%22;%20for%20$citekey%20in%20('Veer1996a',%20'TimesP2001a',%20'Tilly2000a',%20'Tilly2002a',%20'Thrift1990a',%20'NW2000-0207',%20'NW2000-0424a',%20'Tremblay2001a')%20return%20//mods:mods[@ID=$citekey]&_howmany=-1
Transformation failed: Run-time errors were reported

-- output end -----

Any idea what the problem is?

Could this be a permission issue? And am I allowed to have spaces in
the path name?

Is it correct that 'samples/docbook-test.xml' should be the input XML
file and '-o test.html' specifies the name & location of the output
file?

I don't see anything for MODS in 'xsl/document'. So, how would I
invoke conversion of a MODS file?

Thanks, Matthias


Re: 0.6 release

Matthias Steffens
On 16.05.2005 at 17:48 +0200 Matthias Steffens wrote:

> Is it correct that 'samples/docbook-test.xml' should be the input
> XML file and '-o test.html' specifies the name & location of the
> output file?
>
> I don't see anything for MODS in 'xsl/document'. So, how would I
> invoke conversion of a MODS file?

Stupid me. I thought that 'docbook-test.xml' contained the
bibliographic data. I have no eXist XML DB installed on my machine.

refbase doesn't support SRU/W yet (does citeproc-0.6 already support
this method?). Instead I'd like to parse a real MODS file and have it
converted to plain text, html or whatever. How would I do that?

Thanks, Matthias


Re: 0.6 release

Bruce D'Arcus
Matthias Steffens wrote:

> refbase doesn't support SRU/W yet (does citeproc-0.6 already support
> this method?). Instead I'd like to parse a real MODS file and have it
> converted to plain text, html or whatever. How would I do that?

Currently citeproc is set up to format documents.  You specify a flatfile
as the db by adding the parameter bibdb=flatfile, the default for which
will use the mods.xml file in the data directory.

That can be configured too.

I should say that if all you want is single-item entry display in a
browser, and you should many may want that. it might be worth
considering doing a citeproc-light.  Some of the really complicated
processing citeproc does (sorting, grouping, etc.*) only really matters
in the context of document formatting.

Bruce

* Example: it's deceptively difficult to get this output (Doe 1999a,
1999c; Smith 2000), but the suffixes and contractions and such don't
matter for web display.
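
To illustrate why that output takes some work, here is a minimal Python sketch of the two steps involved: assigning year suffixes across the whole bibliography, then collapsing a citation by author. It is not citeproc's actual algorithm (citeproc is written in XSLT), and the record fields `key`, `author`, `year`, and `title` are assumptions made up for this example.

```python
from itertools import groupby
from string import ascii_lowercase


def assign_suffixes(bibliography):
    """Map each citekey to a year label such as '1999a'.

    Suffixes depend on the whole bibliography, not just on the
    items cited together, which is part of what makes this hard.
    This sketch only handles up to 26 items per author and year.
    """
    labels = {}
    ordered = sorted(bibliography,
                     key=lambda r: (r["author"], r["year"], r["title"]))
    for (_, year), group in groupby(ordered,
                                    key=lambda r: (r["author"], r["year"])):
        group = list(group)
        for i, rec in enumerate(group):
            # Only add a suffix when the author/year pair is ambiguous.
            suffix = ascii_lowercase[i] if len(group) > 1 else ""
            labels[rec["key"]] = f"{year}{suffix}"
    return labels


def format_citation(citekeys, bibliography):
    """Render one parenthetical citation, collapsing repeated authors."""
    labels = assign_suffixes(bibliography)
    by_key = {r["key"]: r for r in bibliography}
    cited = sorted((by_key[k] for k in citekeys),
                   key=lambda r: (r["author"], labels[r["key"]]))
    parts = []
    for author, group in groupby(cited, key=lambda r: r["author"]):
        years = ", ".join(labels[r["key"]] for r in group)
        parts.append(f"{author} {years}")
    return "(" + "; ".join(parts) + ")"


bibliography = [
    {"key": "Doe1999a", "author": "Doe", "year": 1999, "title": "A"},
    {"key": "Doe1999b", "author": "Doe", "year": 1999, "title": "B"},
    {"key": "Doe1999c", "author": "Doe", "year": 1999, "title": "C"},
    {"key": "Smith2000a", "author": "Smith", "year": 2000, "title": "D"},
]
print(format_citation(["Smith2000a", "Doe1999c", "Doe1999a"], bibliography))
```

For a single-record web display none of this is needed, since there is no surrounding bibliography to disambiguate against.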


Re: 0.6 release

Bruce D'Arcus
Bruce D'Arcus wrote:

> I should say that if all you want is single-item entry display in a
> browser, and you should many may want that.

um, should say:

"and you think many may want that."

Am tired!

Bruce


Re: 0.6 release

Matthias Steffens
On 16 May 2005 at 12:36 -0400 Bruce D'Arcus wrote:

> Currently citeproc is set up to format documents.  You specify a
> flatfile as the db by adding the parameter bibdb=flatfile, the
> default for which will use the mods.xml file in the data directory.

Ok.

> I should say that if all you want is single-item entry display in a
> browser, and you should many may want that. it might be worth
> considering doing a citeproc-light.

What I'm interested in is to have citeproc work in a similar fashion
for output of references as bibutils does for output of common
bibliographic formats. I.e., I'd like to integrate citeproc with
refbase so that:

1. a user can select one or more bibliographic database entries in a
   web browser, choose a citation style from a drop down and click on
   "Cite" (all this is already provided by refbase)

2. refbase will generate MODS records for all selected entries and
   pass them to citeproc. In a first incarnation this could simply
   mean saving a MODS XML file to a tmp directory and calling citeproc
   via the command line specifying the path to the tmp file as input
   file:
 
   exec("java net.sf.saxon.Transform -o ...")

  (could this step somehow be enhanced if refbase would support SRU/W?)

3. citeproc will convert all MODS records into references (formatted
   according to the given cite style) and return them

4. refbase will take these references and do with them whatever was
   requested by the user (display as plain text or html, send as email,
   etc)
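
Steps 2 and 3 above could be sketched roughly as follows. This is only an illustration in Python (refbase itself is PHP), and it assumes a stylesheet that accepts a MODS file as its source document. The stylesheet path is a placeholder supplied by the caller; only the command-line shape and the `citation-style` parameter are taken from the Saxon invocation earlier in this thread.

```python
import subprocess
import tempfile
from pathlib import Path


def build_saxon_command(source, stylesheet, output, params=None):
    """Build the argument list for: java net.sf.saxon.Transform -o OUT SRC XSL name=value ..."""
    cmd = ["java", "net.sf.saxon.Transform",
           "-o", str(output), str(source), str(stylesheet)]
    # Stylesheet parameters are passed as trailing name=value pairs.
    cmd += [f"{name}={value}" for name, value in (params or {}).items()]
    return cmd


def format_references(mods_xml, stylesheet, style="author-year"):
    """Write MODS to a temp file, run Saxon on it, and return the output text.

    Requires Java and Saxon on the classpath; 'stylesheet' is a
    hypothetical MODS-to-XHTML stylesheet, not one shipped in 0.6.
    """
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "records.xml"
        output = Path(tmp) / "references.html"
        source.write_text(mods_xml, encoding="utf-8")
        cmd = build_saxon_command(source, stylesheet, output,
                                  {"citation-style": style})
        subprocess.run(cmd, check=True)  # raises if the transform fails
        return output.read_text(encoding="utf-8")
```

Passing the arguments as a list rather than one shell string avoids quoting problems when a path contains spaces.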

>  Some of the really complicated processing citeproc does (sorting,
>  grouping, etc.*) only really matters in the context of document
>  formatting.

Yes, I can imagine that. Regarding refbase, this would be also of
interest but a goal that is a bit further away. Basically, I'd like a
future version of refbase to be able to act as the possible MODS
source (instead of eXist XML DB). If I've understood you correctly
this would require refbase to support SRU/W, right?

Thanks, Matthias


Re: 0.6 release

Bruce D'Arcus
Matthias Steffens wrote:

>>I should say that if all you want is single-item entry display in a
>>browser, and you should many may want that. it might be worth
>>considering doing a citeproc-light.
>
> What I'm interested in is to have citeproc work in a similar fashion
> for output of references as bibutils does for output of common
> bibliographic formats. I.e., I'd like to integrate citeproc with
> refbase so that:
>
> 1. a user can select one or more bibliographic database entries in a
>    web browser, choose a citation style from a drop down and click on
>    "Cite" (all this is already provided by refbase)
>
> 2. refbase will generate MODS records for all selected entries and
>    pass them to citeproc. In a first incarnation this could simply
>    mean saving a MODS XML file to a tmp directory and calling citeproc
>    via the command line specifying the path to the tmp file as input
>    file:
>  
>    exec("java net.sf.saxon.Transform -o ...")
>
>   (could this step somehow be enhanced if refbase would support SRU/W?)
>
> 3. citeproc will convert all MODS records into references (formatted
>    according to the given cite style) and return them
>
> 4. refbase will take these references and do with them whatever was
>    requested by the user (display as plain text or html, send as email,
>    etc)

OK, so I wonder if this isn't something like the reading list sort of
thing that the Oxford people are working on (and wanting to use citeproc
for as well)?  The result, then, is just a formatted list of references?

In that case, you could actually think of a simple DocBook document that
contains the citations, and on which citeproc is run in the bibliography
mode only (not the citations).  There may be other ways to address it too.

Current output modes for citeproc are xhtml, fo, and tex, with wordml
and opendocument (openoffice's file format) planned.

Plain text is simple enough to add.

BTW, did you see the demo I came up with using clientside XSLT and Atom?

        http://www.users.muohio.edu/darcusb/feeds/bib-atom.xml

> Yes, I can imagine that. Regarding refbase, this would be also of
> interest but a goal that is a bit further away. Basically, I'd like a
> future version of refbase to be able to act as the possible MODS
> source (instead of eXist XML DB). If I've understood you correctly
> this would require refbase to support SRU/W, right?

Yes. An XSLT processor can ingest documents over HTTP. This is how
integration with eXist works, which has a nice little RESTful server.

SRU is the obvious standard to support here, though in theory something
even more lightweight is possible.  All citeproc does is issue a single
call -- over a potentially very long url query -- that says "give me all
the MODS records that correspond to X list of unique citekey pointers."
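
As a rough illustration of that single call, the following Python sketch rebuilds a URL of the same shape as the one in the Saxon error output earlier in this thread. The base URL, the XQuery text, and the `_query` and `_howmany` parameters are copied from that log; the function name is made up, and the percent-encoding differs slightly from what citeproc emits.

```python
from urllib.parse import quote, urlencode

MODS_NS = "http://www.loc.gov/mods/v3"


def build_exist_query_url(citekeys,
                          base_url="http://localhost:8080/exist/servlet/db/mods"):
    """Return one GET URL asking for every MODS record whose ID is in citekeys."""
    # XQuery string literals escape an apostrophe by doubling it.
    literals = ", ".join("'" + key.replace("'", "''") + "'" for key in citekeys)
    xquery = (
        f'declare namespace mods="{MODS_NS}"; '
        f"for $citekey in ({literals}) "
        "return //mods:mods[@ID=$citekey]"
    )
    # _howmany=-1 is copied from the logged URL; presumably it lifts
    # the limit on the number of returned records.
    params = {"_query": xquery, "_howmany": "-1"}
    return base_url + "?" + urlencode(params, quote_via=quote)


print(build_exist_query_url(["Veer1996a", "Tilly2000a"]))
```

Because every cited key goes into one query string, the URL grows with the number of citations in the document.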

Another theoretical possibility is for someone to write extension
functions for XSLT processors.  That could allow calls like:

<xsl:copy-of select="xbib:return_record_with_id($cites)"/>

I don't know that this is necessary, and it is something further out to
consider in any case.

Bruce


Re: 0.6 release

Matthias Steffens
On 16 May 2005 at 13:25 -0400 Bruce D'Arcus wrote:

> Matthias Steffens wrote:
> > What I'm interested in is to have citeproc work in a similar
> > fashion for output of references as bibutils does for output of
> > common bibliographic formats.

> OK, so I wonder if this isn't something like the reading list sort
> of thing that the Oxford people are working on (and wanting to use
> citeproc for as well)?  The result, then, is just a formatted list
> of references?

Yes. The returned list of formatted references must retain all local
formatting (i.e. bold, italic, uppercase, etc), so 'xhtml' output is
what I'd normally want.

> In that case, you could actually think of a simple DocBook document
> that contains the citations, and on which citeproc is run in the
> bibliography mode only (not the citations).

Sounds good. Is this 'bibliography mode' already available in
citeproc-0.6 and (if so) how do I invoke it?

I think it wouldn't be too difficult for refbase to dynamically
generate a simple DocBook document containing all the citations. I
assume that the given citation strings must equal the MODS identifiers
given in <mods ID="...">?

However, I'm not sure why this step would be necessary. I mean, for
my setup the whole purpose of this DocBook document would be to pass
the citation IDs, right? If citeproc could start its work with a MODS
file (dynamically generated by refbase), all the IDs would be already
present. That said, would it be too difficult to modify citeproc in
such a way that it could start its work with a given MODS file (but
without a DocBook document) and return a list of formatted references
as xhtml?

> Current output modes for citeproc are xhtml, fo, and tex, with wordml
> and opendocument (openoffice's file format) planned.
>
> Plain text is simple enough to add.

Yes, plain text would be very useful. I can imagine that users might
also want RTF output but I have no clue if this would be difficult to
implement.

> BTW, did you see the demo I came up with using clientside XSLT and Atom?
>
> http://www.users.muohio.edu/darcusb/feeds/bib-atom.xml

I'm not sure it works for me. Safari (on OSX 10.4 Tiger) always
converts the URL to

  <feed://www.users.muohio.edu/darcusb/feeds/bib-atom.xml>

and throws up a strange error then. Firefox displays some text as web
page:

  Some Journal Article
  Jane Doe, Some Article

  Some annotations, complete with rich content, including "embedded quotes".

but not the volume/issue/pages information that's present in the
feed's source. What am I supposed to see? A fully formatted reference?
 

> > Basically, I'd like a future version of refbase to be able to act
> > as the possible MODS source (instead of eXist XML DB). If I've
> > understood you correctly this would require refbase to support
> > SRU/W, right?
>
> Yes. An XSLT processor can ingest documents over HTTP. This is how
> integration with eXist works, which has a nice little RESTful server.
>
> SRU is the obvious standard to support here, though in theory something
> even more lightweight is possible.  All citeproc does is issue a single
> call -- over a potentially very long url query -- that says "give me all
> the MODS records that correspond to X list of unique citekey pointers."

I'm not sure how difficult it would be to support SRU. As a first
measure it would be fine for me to simply provide support for the SRU
query that citeproc generates.

What exactly does the SRU query that citeproc sends out look like?

> Another theoretical possibility is for someone to write extension
> functions for XSLT processors.  That could allow calls like:
>
> <xsl:copy-of select="xbib:return_record_with_id($cites)"/>

Ok. I guess this means that such an extension function could send a
query in a form that refbase understands already?

Could you give me any pointers to (or examples of) such a function?
What's the language used for the function?

Thanks again, Matthias


Re: 0.6 release

Bruce D'Arcus
Matthias Steffens wrote:

> However, I'm not sure why this step would be necessary. I mean, for
> my setup the whole purpose of this DocBook document would be to pass
> the citation IDs, right? If citeproc could start its work with a MODS
>  file (dynamically generated by refbase), all the IDs would be
> already present.

True.  I just came up with an example that works, but I need to think a
bit more if there's not a better way.

The structure of citeproc now pretty much assumes it's working with a
document, so my "solution" was to tell it that mods:mods/@ID is in fact
a citation.  It works, but it brings with it some processing overhead.

> That said, would it be too difficult to modify citeproc in such a way
> that it could start its work with a given MODS file (but without a
> DocBook document) and return a list of formatted references as xhtml?
>  Yes, plain text would be very useful. I can imagine that users might
>  also want RTF output but I have no clue if this would be difficult
> to implement.

RTF is sort of a PITA, but it is just text, so certainly possible.

> but not the volume/issue/pages information that's present in the
> feed's source. What am I supposed to see? A fully formatted
> reference?

No, I didn't get around to doing the rest of the templates. It's just to
show a simple XSLT can be put to good use for these sorts of uses.

>>> Basically, I'd like a future version of refbase to be able to act
>>>  as the possible MODS source (instead of eXist XML DB). If I've
>>> understood you correctly this would require refbase to support
>>> SRU/W, right?
>>
>> Yes. An XSLT processor can ingest documents over HTTP. This is how
>>  integration with eXist works, which has a nice little RESTful
>> server.
>>
>> SRU is the obvious standard to support here, though in theory
>> something even more lightweight is possible.  All citeproc does is
>> issue a single call -- over a potentially very long url query --
>> that says "give me all the MODS records that correspond to X list
>> of unique citekey pointers."
>
>
> I'm not sure how difficult it would be to support SRU. As a first
> measure it would be fine for me to simply provide support for the SRU
>  query that citeproc generates.
>
> What exactly does the SRU query that citeproc sends out look like?

I've not actually tried it, but an expert in these technologies has
suggested something like:

http://localhost:8081/biblio?operation=searchRetrieve&version=1.1&query=cite.key+any+"Smith1992a+Smith1992b+Mitchell1995a"&recordSchema=mods&startRecord=1&maximumRecords=9999
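
A URL of that shape could be assembled from a list of citekeys along these lines. This Python sketch only mirrors the suggested, untested URL above: the base URL and the `cite.key` index name come from that suggestion and are not confirmed against any real SRU server, and the function name is made up.

```python
from urllib.parse import quote, urlencode


def build_sru_url(citekeys,
                  base_url="http://localhost:8081/biblio",
                  max_records=9999):
    """Build an SRU 1.1 searchRetrieve URL requesting MODS for the given citekeys."""
    # CQL 'any' matches records whose index contains any of the listed terms.
    cql = 'cite.key any "' + " ".join(citekeys) + '"'
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql,
        "recordSchema": "mods",
        "startRecord": "1",
        "maximumRecords": str(max_records),
    }
    return base_url + "?" + urlencode(params, quote_via=quote)


print(build_sru_url(["Smith1992a", "Smith1992b", "Mitchell1995a"]))
```

A server on the refbase side would then only need to understand this one query pattern to start with, rather than the whole of SRU.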

>> Another theoretical possibility is for someone to write extension
>> functions for XSLT processors.  That could allow calls like:
>>
>> <xsl:copy-of select="xbib:return_record_with_id($cites)"/>
>
>
> Ok. I guess this means that such an extension function could send a
> query in a form that refbase understands already?
>
> Could you give me any pointers to (or examples of) such a function?
> What's the language used for the function?

Each XSLT processor has its own way of writing extension functions.
So in Saxon it's Java (and I think maybe XQuery), in libxslt it's C, etc.

The person who suggested the above (Mike Taylor, of Index Data) has also
suggested the possibility of writing an XSLT extension for ZOOM, such
that one could query either SRU/W or Z39.50 catalogs from an XSLT.

Quite interesting actually.

Bruce


Re: 0.6 release

Bruce D'Arcus
Bruce D'Arcus wrote:

> True.  I just came up with an example that works, but I need to think a
> bit more if there's not a better way.

Here's one way:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   xmlns:xdoc="http://www.pnp-software.com/XSLTdoc"
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   xmlns:mods="http://www.loc.gov/mods/v3"
   xmlns="http://www.w3.org/1999/xhtml"
   xmlns:xhtml="http://www.w3.org/1999/xhtml"
   xmlns:db="http://docbook.org/ns/docbook"
   xmlns:cs="http://purl.org/NET/xbiblio/csl"
   xmlns:bib="http://purl.org/NET/xbiblio/citeproc"
   xmlns:exist="http://exist.sourceforge.net/NS/exist"
   exclude-result-prefixes="db xdoc xhtml mods xs cs exist bib">

   <xsl:import href="../citeproc.xsl"/>
   <xsl:output method="xhtml" encoding="utf-8" indent="yes"/>
   <xdoc:doc type="stylesheet">
     <xdoc:short>Stylesheet to transform MODS to XHTML.</xdoc:short>
     <xdoc:author>Bruce D’Arcus</xdoc:author>
     <xdoc:copyright>2005, Bruce D’Arcus</xdoc:copyright>
   </xdoc:doc>

   <xsl:param name="include-bib">yes</xsl:param>
   <xsl:variable name="title">References</xsl:variable>

<!--
override raw-biblist variable in citeproc.xsl to just use current document
-->
   <xsl:variable name="raw-biblist">
     <xsl:copy-of select="."/>
   </xsl:variable>

   <xsl:template match="/">
     <html>
       <head>
         <title>
           <xsl:value-of select="$title"/>
         </title>
         <meta name="generator"
               content="CiteProc XSL Stylesheets v{$CP-VERSION}"/>
         <link rel="stylesheet" href="forprint.css" type="text/css"
               media="print"/>
         <link rel="stylesheet" href="screen.css" type="text/css"
               media="screen"/>
       </head>
       <body>
         <div id="content">
           <div id="main-content">
             <h2>
               <xsl:value-of select="$title"/>
             </h2>
             <xsl:call-template name="bib:format-bibliography">
               <xsl:with-param name="output-format" select="'xhtml'"/>
             </xsl:call-template>
           </div>
         </div>
       </body>
     </html>
   </xsl:template>
</xsl:stylesheet>