csl changes

csl changes

Bruce D'Arcus-3
Hi,

I'm contemplating a change in the way I handle contributors in CSL.

Right now, here is an example of a book definition for APA:

         <reftype name="book">
           <creator>
             <names/>
             <role prefix=" (" suffix=")."/>
           </creator>
           <date>
             <year prefix=" (" suffix=") "/>
           </date>
           <title font-style="italic" suffix="."/>
           <origin>
             <place/>
             <publisher prefix=": "/>
           </origin>
           <genre suffix=", "/>
           <medium/>
           <availability prefix=", ">
             <physicalLocation/>
             <url prefix=", "/>
           </availability>
         </reftype>

I adopted the creator structure with role child to offer more
flexibility. However, I'm now thinking that may have been unnecessary,
and am thinking about this instead:

         <reftype name="book">
           <author primary-alternate="editor"
                   secondary-alternate="title"/>
           <date>
             <year prefix=" (" suffix=") "/>
           </date>
           <title font-style="italic" suffix="."/>
           <editor prefix=" "/>
           <origin>
             <place/>
             <publisher prefix=": "/>
           </origin>
           <genre suffix=", "/>
           <medium/>
           <availability prefix=", ">
             <physicalLocation/>
             <url prefix=", "/>
           </availability>
         </reftype>

There are a few reasons for this:

1)  in thinking about a non-XML (and non-MODS) data model, it's become
apparent to me that it's easier to deal with authors, editors and
translators (and maybe have a fourth "contributors" to catch corner
cases) than otherwise. E.g. simple is better (unless it's too simple of
course!).

2)  it's also simple in terms of processing (the "alternate" stuff is a
little tricky, but it's hard to get around)

3)  I've realized that I had been thinking about editors wrongly when
designing CSL.  They are not equivalent to authors, but stand in for
them when no author is present.  However, if you have, say, an edited
collection of writings by Walter Benjamin, the editor is secondary. The
above would capture all cases then: authored books, edited books, and
those with both.
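
A minimal sketch of the "alternate" fallback described above, assuming the processor sees each reference as a plain dictionary. The function name, the dictionary keys, and the sample records are made up for illustration and are not part of CSL or citeproc.

```python
def primary_contributor(item, alternates=("editor", "title")):
    """Return (variable, value) for the first non-empty field.

    The author is tried first, then each alternate in order, which
    mirrors primary-alternate="editor" secondary-alternate="title".
    """
    for key in ("author", *alternates):
        value = item.get(key)
        if value:
            return key, value
    return None, None


# Hypothetical sample records: authored, edited-only, and anonymous.
authored = {"author": ["Doe, Jane"], "editor": ["Roe, Richard"], "title": "A Book"}
edited = {"editor": ["Roe, Richard"], "title": "Collected Essays"}
anonymous = {"title": "An Unsigned Pamphlet"}

for record in (authored, edited, anonymous):
    print(primary_contributor(record))
```

In the authored case the editor is not consumed by the fallback, so a separate editor element later in the layout can still print it as a secondary contributor.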

I have long been contemplating flattening dates, too; am not sure.

I think it's important to keep nested structures for some things,
though, because the formatting actually assumes such.

Also, I think I will indeed split CSL from citeproc, and rename the
latter "citeproc-xsl."  I think I'm almost done with CSL, but there are
some nagging bugs in citeproc, and one big feature now supported in CSL
(reference list grouping) I have not managed to resolve.

I may also have a separate package for styles. In that case I'd have
"csl-schema", "csl-styles" and "citeproc-xsl."  Obviously if we make
any progress with ports of citeproc, this would make room for them.

Thoughts on any of the above?

Bruce




Re: csl changes

Johan Kool-2
Hello Bruce and others,

Here are my (overdue) comments. As always, take them with a grain of  
salt, or however that saying goes...

I would argue against flattening the dates. It would make it much  
easier to localize when this is kept in the nested style.

I think that using the formatting-def structure is less clear
compared to the structure you used earlier with the "normal" tags for
each thing (author, title, etc.). I can see that the formatting-def
approach might be somewhat easier to implement (perhaps), but I
think that gain is very minimal and the easier-to-read normal tags
are better.

Btw, I should have more time to look at these kinds of things from now
on, so I hope to participate more, as well as pick up the development
of the CSL Editor for Mac OS X.

Cheers,

Johan


---
More information on:
   http://www.geo.vu.nl/~jkool/






Re: csl changes

Bruce D'Arcus-3
On Feb 5, 2006, at 5:55 PM, Johan Kool wrote:

> Here are my (overdue) comments. As always, take them with a grain of
> salt, or however that saying goes...

Thanks.

> I would argue against flattening the dates. It would make it much
> easier to localize when this is kept in the nested style.

OK, will keep that in mind. Interestingly, ODF already localizes dates,
so I was partly anticipating not having to worry about that. But CSL
should still be independent of ODF.

I still haven't figured out if or how exactly I might want to add CSL
to ODF, but I'm trying to bring them closer in line. One advantage of
the more generic approach is it could also be used to format other
kinds of metadata-based content, like maybe captions and such.

> I think that using the formatting-def structure is less clear compared
> to the structure you used earlier with the "normal" tags for each
> thing, author, title etc. Although I can see that the formatting-def
> approach might be somewhat easier to implement (perhaps), but I think
> that gain is very minimal and the easier to read normal tags are
> better.

There are other issues. If I use elements, those values will be fixed
and difficult to change. Also, with existing elements like origin and
location, I am making an assumption upfront about what children will be
allowed there. That may be a mistake.

Finally, using enclosure to indicate relations might also be limiting
in some cases. Part of it is implementation (in other languages, and in
GUIs), but part of it is just that people might (??) want to mix
levels in a given group.

I have no strong opinion though. Was just looking for what other people
think. So thanks.

> Btw, I should have more time to look at these kind of things from now
> on, so I hope to participate more, as well as pick up the development
> of the CSL Editor for Mac OS X.

Cool.

BTW, if we get a Python port, I just remembered OS X has a Python-ObjC
bridge.

Bruce




Re: csl changes

Bruce D'Arcus-3
In reply to this post by Johan Kool-2
On Feb 5, 2006, at 5:55 PM, Johan Kool wrote:

> I would argue against flattening the dates. It would make it much
> easier to localize when this is kept in the nested style.

I just had a thought. It would be possible to retain what you're asking
for, Johan, while also simplifying the layout, by just changing this:

     <dates jan="January" feb="February" mar="March" apr="April" may="May"
       jun="June" jul="July" aug="August" sep="September" oct="October"
       nov="November" dec="December" full-layout="year-month-day"/>

... to this:

     <dates jan="January" feb="February" mar="March" apr="April" may="May"
       jun="June" jul="July" aug="August" sep="September" oct="October"
       nov="November" dec="December">
       <layout type="full">
         <year/>
         <month prefix=", "/>
         <day prefix=" "/>
       </layout>
     </dates>

So then in the cs:item-layout element, I'd have either:

        cs:year
        cs:date-full
        cs:month-day


...or a single cs:date element with a type attribute. I tend to prefer
the first for consistency, but OTOH I have to consider also adding
support for original (non-translated) titles and such.
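
To make the nested layout concrete, here is a rough sketch of how a processor might apply it, using Python's standard xml.etree.ElementTree. The format_date function and the MONTH_KEYS list are hypothetical, and the sep attribute is assumed to exist alongside the other month names.

```python
import xml.etree.ElementTree as ET

# The dates element proposed above; sep="September" is an assumption.
DATES_XML = """
<dates jan="January" feb="February" mar="March" apr="April" may="May"
       jun="June" jul="July" aug="August" sep="September" oct="October"
       nov="November" dec="December">
  <layout type="full">
    <year/>
    <month prefix=", "/>
    <day prefix=" "/>
  </layout>
</dates>
"""

MONTH_KEYS = ["jan", "feb", "mar", "apr", "may", "jun",
              "jul", "aug", "sep", "oct", "nov", "dec"]


def format_date(dates_el, layout_type, year, month, day):
    """Render a date by walking the parts of the requested layout in order."""
    layout = dates_el.find(f"layout[@type='{layout_type}']")
    values = {
        "year": str(year),
        "month": dates_el.get(MONTH_KEYS[month - 1]),  # localized month name
        "day": str(day),
    }
    parts = []
    for part in layout:
        parts.append(part.get("prefix", "") + values[part.tag] + part.get("suffix", ""))
    return "".join(parts)


dates = ET.fromstring(DATES_XML)
print(format_date(dates, "full", 2006, 2, 7))  # 2006, February 7
```

Localizing then only means swapping the month-name attributes and reordering the layout children; the calling code stays the same.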
       
> I think that using the formatting-def structure is less clear compared
> to the structure you used earlier with the "normal" tags for each
> thing, author, title etc. Although I can see that the formatting-def
> approach might be somewhat easier to implement (perhaps), but I think
> that gain is very minimal and the easier to read normal tags are
> better.

Am still thinking about this one, and also Peter's question about
whether we need to (again) allow markup for prefix and suffix content.
The only place where I think this might be needed is if the content
contains text (like <container prefix=", In "/>).

Bruce




Re: csl changes

Bruce D'Arcus-3
In reply to this post by Johan Kool-2
On Feb 5, 2006, at 5:55 PM, Johan Kool wrote:

> I think that using the formatting-def structure is less clear compared
> to the structure you used earlier with the "normal" tags for each
> thing, author, title etc. Although I can see that the formatting-def
> approach might be somewhat easier to implement (perhaps), but I think
> that gain is very minimal and the easier to read normal tags are
> better.

OK, what about this Johan:

       <metadata-type name="book">
         <author alternate="editor"/>
         <year prefix=" (" suffix=") "/>
         <title font-style="italic" suffix="."/>
         <editor/>
         <format-group prefix="(" suffix=")">
           <publisher-place/>
           <publisher-name prefix=":"/>
         </format-group>
       </metadata-type>
       <metadata-type name="chapter">
         <author alternate="editor"/>
         <year prefix=" (" suffix=") "/>
         <title font-style="italic" suffix="."/>
         <container-title prefix=" "/>
         <editor/>
         <format-group prefix="(" suffix=")">
           <publisher-place/>
           <publisher-name prefix=":"/>
         </format-group>
       </metadata-type>

So this uses the flat structure, but with elements to indicate the
variable. It will help me to know whether your objection is to the
structure or to the syntax.

Bruce




Re: csl changes

Johan Kool-2
This looks quite neat. It seems to be the most readable of what
I've seen so far.

Johan

On 7 Feb 2006, at 15:44, Bruce D'Arcus wrote:

>
> On Feb 5, 2006, at 5:55 PM, Johan Kool wrote:
>
>> I think that using the formatting-def structure is less clear  
>> compared to the structure you used earlier with the "normal" tags  
>> for each thing, author, title etc. Although I can see that the  
>> formatting-def approach might be somewhat easier to implement
>> (perhaps), but I think that gain is very minimal and the easier to  
>> read normal tags are better.
>
> OK, what about this Johan:
>
>       <metadata-type name="book">
>         <author alternate="editor"/>
>         <year prefix=" (" suffix=") "/>
>         <title font-style="italic" suffix="."/>
>         <editor/>
>         <format-group prefix="(" suffix=")">
>           <publisher-place/>
>           <publisher-name prefix=":"/>
>         </format-group>
>        </metadata-type>
>       <metadata-type name="chapter">
>         <author alternate="editor"/>
>         <year prefix=" (" suffix=") "/>
>         <title font-style="italic" suffix="."/>
>         <container-title prefix=" "/>
>         <editor/>
>         <format-group prefix="(" suffix=")">
>           <publisher-place/>
>           <publisher-name prefix=":"/>
>         </format-group>
>        </metadata-type>
>
> So uses the flat structure, but elements to indicate the variable.  
> It will help me to know if your objection is because of the  
> structure, or the syntax.
>
> Bruce
>
>







Re: csl changes

Bruce D'Arcus-3
On Feb 7, 2006, at 9:57 AM, Johan Kool wrote:

> This looks quite neat. It seems to be the easiest readable of what
> I've seen so far.

This is probably even better:

       <item-layout>
          <type name="book">
             <author alternate="title"/>
             <year prefix=" (" suffix=") "/>
             <title suffix="."/>
             <publisher-place/>
             <publisher prefix=":"/>
          </type>
          <type name="chapter">
             <author/>
             <year prefix=" (" suffix=") "/>
             <title/>
             <title relation="container" suffix="."/>
             <title relation="series"/>
             <volume prefix=", "/>
             <pages prefix=", "/>
          </type>
          <type name="article">
             <author alternate="container-title"/>
             <year prefix=" (" suffix=") "/>
             <title/>
             <title relation="container" suffix="."/>
             <volume prefix=", "/>
             <pages prefix=", "/>
          </type>
       </item-layout>

The relation attribute helps a lot for the XML, though am not quite
sure how that'd work in a GUI.
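
For what it's worth, here is a rough sketch of how a processor could walk such a layout. It assumes references are plain dictionaries and that relation="container" on a title maps to a "container-title" variable (the name used by the alternate attribute above). The variable_name and render functions and the sample record are hypothetical.

```python
import xml.etree.ElementTree as ET

# The article layout from the message above, unchanged.
LAYOUT_XML = """
<item-layout>
  <type name="article">
    <author alternate="container-title"/>
    <year prefix=" (" suffix=") "/>
    <title/>
    <title relation="container" suffix="."/>
    <volume prefix=", "/>
    <pages prefix=", "/>
  </type>
</item-layout>
"""


def variable_name(el):
    """Map <title relation="container"/> to "container-title" (assumed naming)."""
    relation = el.get("relation")
    return f"{relation}-{el.tag}" if relation else el.tag


def render(layout_root, item):
    """Concatenate prefix + value + suffix for every variable that has a value."""
    reftype = item["type"]
    type_el = layout_root.find(f"type[@name='{reftype}']")
    out = []
    for el in type_el:
        value = item.get(variable_name(el))
        if not value and el.get("alternate"):
            value = item.get(el.get("alternate"))  # fall back to the alternate
        if value:
            out.append(el.get("prefix", "") + value + el.get("suffix", ""))
    return "".join(out)


layout = ET.fromstring(LAYOUT_XML)
# Made-up record for illustration only.
article = {"type": "article", "author": "Doe, J.", "year": "2005",
           "title": "A title", "container-title": "Some Journal",
           "volume": "12", "pages": "1-10"}
print(render(layout, article))
```

Because the sample layout gives the two title elements no separating prefix, the title and container title run together in the output; a real style would need a prefix or suffix there.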

Bruce




Re: csl changes

Johan Kool-2
On 8 Feb 2006, at 23:58, Bruce D'Arcus wrote:

> The relation attribute helps a lot for the XML, though am not quite  
> sure how that'd work in a GUI.

I don't think you should worry too much about what a GUI would look
like. The important part is to have an easily understood system for
writing it in XML. If that is easy enough, even a text editor could
be the GUI. :-) But seriously, pretty much everything can be put in
a GUI with some creativity.

Focus on getting it easily readable for your own parser, as well as  
the human eye. If it works for that, it works for a GUI too.

Johan





Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
On Feb 8, 2006, at 6:12 PM, Johan Kool wrote:

> Focus on getting it easily readable for your own parser, as well as
> the human eye. If it works for that, it works for a GUI too.

OK, then, I think that's settled.

The final issue is about the prefix/suffix thing. Should they be plain
text (as they are now), or should I allow them to have formatting
attached to them?

If the latter, it'd look like this:

          <type name="book">
             <author alternate="title"/>
             <year>
               <prefix> (</prefix>
               <suffix>) </suffix>
             </year>
             <title>
               <suffix>.</suffix>
             </title>
             <publisher-place/>
             <publisher>
               <prefix>:</prefix>
             </publisher>
          </type>

It's more verbose and difficult to handle, but not significantly so.

For comparison, the alternative would be:

          <type name="book">
             <author alternate="title"/>
             <year prefix=" (" suffix=") "/>
             <title suffix="."/>
             <publisher-place/>
             <publisher prefix=":"/>
          </type>
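
A small sketch of what "difficult to handle" amounts to on the processing side: a helper that accepts either form. The affix function is hypothetical, and putting font-style on a prefix element is an assumption about how the formatted variant might look.

```python
import xml.etree.ElementTree as ET


def affix(el, name):
    """Return (text, formatting) for a prefix or suffix.

    A child element (which may carry formatting attributes) wins;
    otherwise fall back to the plain-text attribute of the same name.
    """
    child = el.find(name)
    if child is not None:
        return child.text or "", dict(child.attrib)
    return el.get(name, ""), {}


# Element form, with assumed formatting on the prefix.
verbose = ET.fromstring(
    '<container><prefix font-style="italic">, In </prefix></container>')
# Attribute form, plain text only.
compact = ET.fromstring('<container prefix=", In "/>')

print(affix(verbose, "prefix"))
print(affix(compact, "prefix"))
```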

Bruce




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Matthias Steffens
Hi,

On 08-Feb-2006 at 18:34 -0500 Bruce D'Arcus wrote:

> On Feb 8, 2006, at 6:12 PM, Johan Kool wrote:
>
> > Focus on getting it easily readable for your own parser, as well as
> > the human eye. If it works for that, it works for a GUI too.

I fully agree with this.

> The final issue is about the prefix/suffix thing. Should they be
> plain text (as they are now), or should I allow them to have
> formatting attached to them?

I think it's a good idea to allow for formatting in prefix/suffix
strings. What about the "In: " string that's used in cases like:

 Dieckmann GS, Hellmer HH (2003) The importance of sea ice: an
  overview. In: Thomas DN, Dieckmann GS (eds) Sea ice - an introduction
  to its physics, chemistry, biology and geology. Blackwell Science
  Ltd, Oxford

This "In: " string is often printed in italics. Or is this not a
prefix/suffix string?

> If the latter, it'd look like this:
>
>           <type name="book">
>              <author alternate="title"/>
>              <year>
>                <prefix> (</prefix>
>                <suffix>) </suffix>

> For comparison, the alternative would be:
>
>           <type name="book">
>              <author alternate="title"/>
>              <year prefix=" (" suffix=") "/>

Personally, I find the first version *way* easier to read and to
grasp. The second option seems to involve more "screen clutter" and I
have to concentrate more in order to understand it.

The hierarchy in the first example helps me to "parse" this quickly by
eye. But it may be just me...

Best regards, Matthias



Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Johan Kool-2
In reply to this post by Bruce D'Arcus-3
On 9 Feb 2006, at 0:34, Bruce D'Arcus wrote:

> The final issue is about the prefix/suffix thing. Should they be  
> plain text (as they are now), or should I allow them to have  
> formatting attached to them?

Yes, I think that allowing formatting would be the wiser thing to do.  
I can well imagine scenarios where such formatting would be much  
needed. I guess that the extra verbosity is just something we'd have  
to live with.

Johan



Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
In reply to this post by Matthias Steffens
On Feb 8, 2006, at 6:53 PM, Matthias Steffens wrote:

> This "In: " string is often printed in italics. Or is this not a
> prefix/suffix string?

No, you're right, and that's exactly the case I'm worried about.

>> If the latter, it'd look like this:
>>
>>           <type name="book">
>>              <author alternate="title"/>
>>              <year>
>>                <prefix> (</prefix>
>>                <suffix>) </suffix>
>
>> For comparison, the alternative would be:
>>
>>           <type name="book">
>>              <author alternate="title"/>
>>              <year prefix=" (" suffix=") "/>
>
> Personally, I find the first version *way* easier to read and to
> grasp. The second option seems to involve more "screen clutter" and I
> have to concentrate more in order to understand it.
>
> The hierarchy in the first example helps me to "parse" this quickly by
> eye. But it may be just me...

So just to be clear, you'd be happy in more ways than one if I adopt
the first option above; right?

So it seems that's most likely to make everyone happy? It addresses
Peter's concern, and nicely balances a lot of the other issues (file
size, programming ease, flexibility, xml consistency).

The one awkwardness it'll introduce is that I'd now have to do:

<citation>
   <prefix>[</prefix>
   <suffix>]</suffix>
   <item-layout>
     ...
   </item-layout>
</citation>

A little weird, but it'll work.

If I hear no objections, I'll make those changes.

I hope to post an example style to get the ports started sometime in
the next few days.

Bruce




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Matthias Steffens
On 08-Feb-2006 at 19:05 -0500 Bruce D'Arcus wrote:

> > Personally, I find the first version *way* easier to read and
> > to grasp. The second option seems to involve more "screen clutter"
> > and I have to concentrate more in order to understand it.
> >
> > The hierarchy in the first example helps me to "parse" this quickly
> > by eye. But it may be just me...
>
> So just to be clear, you'd be happy in more ways than one if I adopt
> the first option above; right?

Yes, absolutely. Personally, I find attributes very hard to read while
separate elements line up and indent nicely and are thus easier to read
and understand. And if it helps to improve flexibility as well as
future compatibility, that's even better.

I wouldn't think too much about XML being verbose. To me it's more
important that the XML structure is very clear. I'm sure people will be
more tempted to adopt a particular XML structure if they are able to
grok it.

Regards, Matthias



Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
In reply to this post by Bruce D'Arcus-3
On Feb 9, 2006, at 4:37 AM, Matthias Basler wrote:

> I'd definitely prefer the version below, for exactly this reason.

OK, then, I modified the schema and wrote an XSLT to mostly convert the
old examples.

For now I've put it all here:

<http://www.users.muohio.edu/darcusb/citations/csl/>

Once I stabilize everything (schema, examples, directory and file
naming conventions, etc.), I'll move it to the Sourceforge site. While
I don't expect to make any huge changes at this point, please send me
feedback on any of the above.

The one feature I'm still working on figuring out is better
international support. I've been talking to a guy who deals with
Japanese texts on this.

As I said, this is a quick-and-dirty way to do an online repository
that I think has promise. In everyday use I'd like my formatter to be
able to grab the needed styles from online, and then cache them. Alf
mentioned on his blog a while back that EndNote ships with thousands of
styles, which seems kind of silly when you consider that a given author
may only ever use a handful of them.
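
The fetch-then-cache idea could be sketched roughly as follows. StyleCache, fetch_url, the ".csl" file extension, and the example.org URL are all assumptions for illustration; the demo uses a fake fetcher so it runs without network access.

```python
import hashlib
import tempfile
import urllib.request
from pathlib import Path


def fetch_url(url):
    """Default fetcher: download the style file over HTTP."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


class StyleCache:
    """Download a style the first time it is requested, then reuse the local copy."""

    def __init__(self, cache_dir, fetch=fetch_url):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)
        self.fetch = fetch

    def get(self, url):
        # One cache file per URL, named by a hash of the URL.
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".csl"
        path = self.cache_dir / name
        if not path.exists():
            path.write_bytes(self.fetch(url))
        return path.read_bytes()


# Demo with a fake fetcher that records how often it is called.
calls = []


def fake_fetch(url):
    calls.append(url)
    return b"<style/>"


with tempfile.TemporaryDirectory() as tmp:
    cache = StyleCache(tmp, fetch=fake_fetch)
    first = cache.get("http://example.org/styles/apa.csl")
    second = cache.get("http://example.org/styles/apa.csl")

print(len(calls))  # the second request is served from the cache
```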

Next step is to figure out how to create a Ruby and/or Python
CitationStyle object out of these.

Bruce




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Johan Kool-2
   cs-citenumber = element citenumber { cs-formatting.config,
     attribute superscript {"yes"}? }

 >>

   cs-citenumber = element citenumber { cs-formatting.config,
     attribute superscript {"yes" | "no"}? }

?

Plus, I was thinking, should there be support for shortened journal
names? Like J. Clim., and then everyone knows it is the Journal of
Climatology? It will be very tough to implement though, as I don't think
there even exists a list with all valid abbreviations. Just a thought...

Johan



Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Alf Eaton
On 09 Feb 2006, at 19:11, Johan Kool wrote:
>
> Plus, I was thinking, should there be support for shortened journal  
> names? Like J. Clim. and then every one knows it is the Journal of  
> Climatology? Will be very tough to implement though, as I don't  
> think there even exists a list with all valid shortenings. Just a  
> thought...

I think
http://wos01.isiknowledge.com/help/A_abrvjt.html
http://wos01.isiknowledge.com/help/B_abrvjt.html
etc
is the most complete list available at the moment.

alf.




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
In reply to this post by Johan Kool-2
On Feb 9, 2006, at 7:11 PM, Johan Kool wrote:

>   cs-citenumber = element citenumber { cs-formatting.config, attribute
> superscript {"yes"}? }
>
> >>
>
>   cs-citenumber = element citenumber { cs-formatting.config, attribute
> superscript {"yes" | "no"}? }
>
> ?

It seems sort of redundant.

> Plus, I was thinking, should there be support for shortened journal
> names?

Well, yes, there should be an abbreviated title element in general,
which would also cover journals.  Thanks.

It would also be nice to have a periodical RDF store somewhere that
included the abbreviation, and one could just link to it.  But that may not
be our thing; it's a big job.  The OCLC does have a large csv file of
periodicals though. I converted that to XML and used it for some MODS
conversions.

Bruce




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
In reply to this post by Johan Kool-2
On Feb 10, 2006, at 8:24 AM, Matthias Basler wrote:

> Yes, abbreviations for journal names are quite common. Implementation
> should not
> be very difficult. In a simple case each user can him/herself define a
> replacement (or alias) list
> (i.e. "Journal of Climatology" -> "J. Clim.")
> that can be switched on/off as the user likes.

I think the solution for csl is simple:

        <title type="short"/>

... or maybe call the attribute "variant."

That way the processor could look for an abbreviated title, and if not
there, default to the simple title.
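
In processor terms, that fallback might look like the sketch below. The resolve_title function is hypothetical, and it assumes the data carries a "short-title" field next to "title".

```python
def resolve_title(item, variant=None):
    """Return the short title when requested and present, else the full title."""
    if variant == "short":
        return item.get("short-title") or item["title"]
    return item["title"]


with_short = {"title": "Some Full Journal", "short-title": "S. F. J."}
without_short = {"title": "Some Full Journal"}

print(resolve_title(with_short, "short"))     # S. F. J.
print(resolve_title(without_short, "short"))  # Some Full Journal
```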

In terms of how to handle that in data, to me it's clear that
periodicals and such ought to be normalized as full resources/objects.
So if you're using a RDBMS, there's a table called "collections" which
includes periodicals.  That table would then have both "title" and
"short-title" columns.

Likewise, in an RDF representation, you'd do:

<biblio:Journal rdf:about="http://ex.net/journals#x">
   <dc:title>Some Full Journal</dc:title>
   <biblio:abbreviatedTitle>S. F. J.</biblio:abbreviatedTitle>
</biblio:Journal>

The same issue applies to corporate/organizational names.

Bruce




Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Johan Kool-2
There are two ways to take abbreviated journal names into account.  
Add an abbreviated title to the data, or let the citation processor  
abbreviate the name. There are some pros and cons for each approach.

Add abbreviated title to data:
- abbreviated title might not be present
- only one abbreviation, different styles might use/prefer different  
abbreviations
+ much easier for the citation processor to implement

Let the citation processor abbreviate:
- making a full list of all abbreviations is a lot of work
+ title might always be abbreviated (although maybe not always exactly
correct) *
+ each style can define its own preferred abbreviations

*) This is the case if it's allowed to replace strings partly, e.g.
Journal -> J.
Climatology -> Clim.
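
A minimal sketch of that partial replacement, assuming a per-style word table. The abbreviate function and the rule that drops "of" are my own assumptions, added so that the example yields "J. Clim." as above.

```python
def abbreviate(title, replacements, drop=("of",)):
    """Abbreviate a title word by word.

    Words found in `replacements` are swapped for their short form,
    words listed in `drop` are removed, and everything else is kept.
    """
    words = []
    for word in title.split():
        if word in drop:
            continue
        words.append(replacements.get(word, word))
    return " ".join(words)


# Each style could ship its own table like this one.
RULES = {"Journal": "J.", "Climatology": "Clim."}

print(abbreviate("Journal of Climatology", RULES))  # J. Clim.
```

A word with no rule is left in full, which is where the "maybe not always exactly correct" caveat comes from.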

There are probably more things for this list. I'm curious which  
method you prefer. It seems to me that Bruce prefers the first.  
Matthias and I seemingly had the second option in mind.

This whole abbreviation business is a very annoying habit, and to be
honest I think that journals would be wise to stop using it...

Johan





Re: [dev-biblio] Re: [xbiblio-devel] csl changes

Bruce D'Arcus-3
On Feb 10, 2006, at 12:19 PM, Johan Kool wrote:

> There are two ways to take abbreviated journal names into account. Add
> an abbreviated title to the data, or let the citation processor
> abbreviate the name.

Correct.

> There are some pros and cons for each approach.
>
> Add abbreviated title to data:
> - abbreviated title might not be present
> - only one abbreviation, different styles might use/prefer different
> abbreviations

Hmm ... I hadn't thought about that one. Do styles specify how to
abbreviate journal titles? Do they include other periodicals (court
reporters, magazines, newspapers) too?

> + much easier for the citation processor to implement
>
> Let the citation processor abbreviate:
> - making a full list of all abbreviations is a lot of work

And error prone? And consider the code involved in internationalizing
it.

> + title might always be abbreviated (although maybe not always exactly
> correct) *
> + each style can define its own preferred abbreviations
>
> *) This is the case if it's allowed to replace strings partly, e.g.
> Journal -> J.
> Climatology -> Clim.
>
> There are probably more things for this list. I'm curious which method
> you prefer. It seems to me that Bruce prefers the first. Matthias and
> me seemingly had the second option in mind.

I have no strong opinion at the moment. I suppose my immediate question
is whether CSL needs an abbreviatedTitle element, or whether the
attribute is enough. I've already added the latter.

> This whole abbreviation habit is a very annoying habit, and to be
> honest I think that journals would be wise to stop using it...

I agree. There are a lot of citation practices that I think are a
vestige of a time before computers. I wonder if this is one of them. I
am really reluctant to support some of those old features, like the
absolutely heinous practice in note citations of doing "op. cit." Every
time I read a book that uses that convention I find myself frustrated.

BTW, my name abbreviation example is "Federal Bureau of Investigation"
--> "FBI"

Bruce