Proposed change to CSL input XML specification


Proposed change to CSL input XML specification

toreilly
I have a suggestion for adding a data type to the CSL specification.

Overview

Some of the design goals of a bibliography system are (1) simplicity, (2) comprehensiveness, (3) efficient encoding, and (4) adaptability to new uses and contexts. I think that adding one more complex field type to the CSL specification would improve it on all four counts. CSL should add a "serial" field to encode information about serial sources. It would apply to periodicals, magazines, newspapers, case reporters, newsletters, academic journals, and books published in serial formats. If adopted, CSL-JSON would support 3 complex field types in total: Date, Person, and Serial.

Examples

Without a serial-type field, one would have to encode a serial publication (like "Brown v. Board of Education of Topeka, Kansas, 347 U.S. 483") in ordinary CSL fields:


<source>
    <page>483</page>
    <publication>U.S.</publication>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <volume>347</volume>
</source>

With a serial-type field, you would instead encode the information pertaining to the serial publication location within a container that shows the logical relationship between the fields.

<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
        </serial>
    </serials>
</source>

Benefits

The second format shows the human reader how the discrete fields of "volume", "publication", and "page" are inextricably linked. The essential variables are co-located instead of being dispersed. Moreover, it has technical advantages for encoding citation items in at least 2 real-world situations.

Parallel Citations

Some legal journals require parallel citations to legal resources. A parallel citation is "A reference to the same case or statute published in two or more sources" according to the Legal Dictionary. For example, when citing a Supreme Court case, the writer may be expected to first cite the official reporter of the Supreme Court, and then cite the reporters published by WestLaw and LexisNexis. Likewise, when citing a multilateral treaty, the author should cite both a domestic reporter and an international reporter. So, if a user of CSL-JSON were required to use parallel citations, the "serial" type data structure would easily support it. See how the previous citation could be extended:


<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
        </serial>
        <serial>
            <volume>74</volume>
            <publication>S. Ct.</publication>
            <page>686</page>
        </serial>
        <serial>
            <volume>98</volume>
            <publication>L. Ed.</publication>
            <page>873</page>
        </serial>
    </serials>
</source>

The first listed publication would be assumed to be the primary publication in a "serials" type field. Without the serial-type field, this information would have to be encoded in fields like "1st volume", "1st publication", "1st FirstPage", "2nd volume", "2nd publication", "2nd FirstPage", and so on. Encoding all of that in ordinary CSL-JSON fields would be cumbersome and still not comprehensive; it would also be unfriendly to human users and inflexible in the face of new situations and contexts.
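
To illustrate how a processor might consume such a field, here is a minimal Python sketch. It assumes the "serials" field has already been loaded as a list of dicts with the keys "volume", "publication", and "page"; the function name format_parallel is hypothetical and not part of CSL or any existing processor.

```python
def format_parallel(title, serials):
    """Join a title with each serial location, primary publication first."""
    locations = []
    for serial in serials:
        # Keep only the parts that are present, in volume / publication / page order.
        parts = [serial.get(key) for key in ("volume", "publication", "page")]
        locations.append(" ".join(str(part) for part in parts if part))
    return ", ".join([title] + locations)


# The three reporters from the example above, in their listed order.
brown = [
    {"volume": "347", "publication": "U.S.", "page": "483"},
    {"volume": "74", "publication": "S. Ct.", "page": "686"},
    {"volume": "98", "publication": "L. Ed.", "page": "873"},
]
print(format_parallel("Brown v. Board of Education of Topeka, Kansas", brown))
```

Because the list order carries the "primary first" convention, a style that wants only the official reporter could simply pass serials[:1].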


Multipart Articles

Some articles are published across multiple issues of a publication. The "serial type" field would be able to efficiently and legibly encode a citation to a multipart article. If a citation appeared as follows:

Harlan F. Stone, The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 & 2), 18 Colum. L. Rev. 291 (1918), 19 Colum. L. Rev. 177 (1919).

It would be encoded as follows:

<source>
    <author>
        <family>Stone</family>
        <given>Harlan F.</given>
    </author>
    <title>The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 &amp; 2)</title>
    <serials>
        <serial>
            <volume>18</volume>
            <publication>Colum. L. Rev.</publication>
            <page>291</page>
        </serial>
        <serial>
            <volume>19</volume>
            <publication>Colum. L. Rev.</publication>
            <page>177</page>
        </serial>
    </serials>
    <issued>
        <date-parts>
            <year>1918</year>
        </date-parts>
        <date-parts>
            <year>1919</year>
        </date-parts>
    </issued>
</source>

The "serial" elements should be listed in the same order as their respective publication dates in the "issued" element.
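
The positional pairing of "serial" entries with "issued" dates can be sketched as follows. This is only an illustration: it assumes the serials are a list of dicts and the issued years a plain list of the same length, and format_multipart is a made-up name rather than anything a CSL processor provides.

```python
def format_multipart(serials, issued_years):
    """Pair each serial location with the year at the same position in "issued"."""
    if len(serials) != len(issued_years):
        raise ValueError("expected one issued date per serial entry")
    pieces = []
    for serial, year in zip(serials, issued_years):
        pieces.append(
            f'{serial["volume"]} {serial["publication"]} {serial["page"]} ({year})'
        )
    return ", ".join(pieces)


# The two parts of the Stone article from the example above.
stone = [
    {"volume": "18", "publication": "Colum. L. Rev.", "page": "291"},
    {"volume": "19", "publication": "Colum. L. Rev.", "page": "177"},
]
print(format_multipart(stone, [1918, 1919]))
```

Raising an error on a length mismatch is one possible choice; a real processor might instead fall back to the first date.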

Proposed specification for "Serial type" Field

A complex field of "serial" type would consist of an array of one or more publications. Each publication would have 4 possible fields: "volume", "publication", "issue", and "page". The "publication" and "page" (first page) fields would always be required, while "issue" and "volume" would depend on the context and source type. From my research, I believe that there are 3 main classes of serial publications. I will deal with each class in turn.

1. Non-Consecutively paginated serials with Volume numbers.

The first case covers serials whose volumes are not consecutively paginated (such as an academic journal). In this case, the citation to the source should include the Volume, Publication, Issue, and First Page.

2. Consecutively paginated serials with Volume numbers.

When the issues within a volume continue the page numbering of the previous issue, identifying the issue number is often not required (or even possible). The required fields would be Volume, Publication, and First Page.

3. Serials that are identified only by issue, and do not track Volume numbers

Some periodicals, like newspapers, do not have volume numbers, and the issue is identified by the date of publication. In this case, the volume number is not required. The required fields would be Publication, Issue, and First Page.

This situation creates an interesting predicament in which the date of publication may be duplicated within an item record. The date of publication would be recorded in the normal date-typed "issued" field, as well as within the serial-typed "issue" subfield. The specification could include a recommendation to leave the "issue" subfield blank if it merely copies the information from the "issued" field.
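
The three classes above could be checked mechanically. The sketch below is one way to do it in Python, under the assumption that each serial is a dict and that the caller already knows which class applies; the class labels and the function missing_fields are invented for this example.

```python
# Required subfields per class of serial, as described above.
# The class labels are hypothetical names chosen for this sketch.
REQUIRED_BY_CLASS = {
    "volume-nonconsecutive": {"volume", "publication", "issue", "page"},
    "volume-consecutive": {"volume", "publication", "page"},
    "issue-only": {"publication", "issue", "page"},
}


def missing_fields(serial, serial_class):
    """Return the sorted names of required subfields that are absent or empty."""
    required = REQUIRED_BY_CLASS[serial_class]
    present = {key for key, value in serial.items() if value not in (None, "")}
    return sorted(required - present)


print(missing_fields({"volume": "347", "publication": "U.S.", "page": "483"},
                     "volume-consecutive"))
print(missing_fields({"publication": "Some Newspaper", "page": "A1"}, "issue-only"))
```

Note that this sketch does not model the recommendation to leave "issue" blank when it merely repeats the "issued" date; a validator that followed it would have to consult the "issued" field before reporting "issue" as missing.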

Mapping Serial-Type Fields to standard CSL fields

No CSL styles or processors currently support the "serials" type. In order to make a transition possible, processors should use the following mapping to ensure compatibility between the different styles.

The elements that pertain to information that would also be located in a "serials" field are the following:
  • "container-title" - title of the container holding the item (e.g. the book title for a book chapter, the journal title for a journal article)

  • "page" - range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • "page-first" - first page of the range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • "issue" - (container) issue holding the item (e.g. "5" when citing a journal article from journal volume 2, issue 5)

  • "volume" - (container) volume holding the item (e.g. "2" when citing a chapter from book volume 2)



Some elements are near-misses for inclusion. I list them here so that they can be properly discussed.

  • "number-of-volumes" - total number of volumes, usable for citing multi-volume books and such. Multi-volume books are not serial publications. Serial publications are, by definition, of indefinite length. There is no instance in which this variable would be useful for serial sources.

  • "collection-title" - title of the collection holding the item (e.g. the series title for a book). When a work appears within a collection of works, the title of the containing work should be encoded in this variable. However, not all citation styles support this variable. In my survey of 40 styles, only 28 supported this variable. This makes me wonder if style creators are using the "container-title" variable for both serial publications and collections.

  • "edition" - (container) edition holding the item (e.g. "3" when citing a chapter in the third edition of a book). I am not aware of any serial publications that use editions.

  • "number" - number identifying the item (e.g. a report number). This is a tricky one, especially for legal drafters. Report numbering can look like the numbering for a serial publication. I am open to doing more research on this.

From standard CSL fields to "serial-type" CSL fields:
  • "container-title" ==> "publication"
  • "page" ==> "page"
  • "page-first" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"


From serial-type CSL fields to standard CSL fields:
  • "publication" ==> "container-title"
  • "page" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"

When converting information from a "serials" field to standard CSL fields, only the first "serial" child element should be translated.
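
The two mappings can be sketched in Python as a pair of small functions. This is an illustration under stated assumptions, not an existing processor API: items are plain dicts, the function names are hypothetical, and the rule for choosing between "page-first" and "page" (prefer "page-first" when both are present) is my own assumption, since the mapping sends both to "page".

```python
def standard_to_serials(item):
    """Build a one-element "serials" list from standard CSL fields."""
    serial = {}
    if item.get("container-title"):
        serial["publication"] = item["container-title"]
    # Assumption: prefer "page-first" over "page" when both are present.
    page = item.get("page-first") or item.get("page")
    if page:
        serial["page"] = page
    for key in ("issue", "volume"):
        if item.get(key):
            serial[key] = item[key]
    return [serial] if serial else []


def serials_to_standard(serials):
    """Translate only the first serial entry back to standard CSL fields."""
    if not serials:
        return {}
    first = serials[0]
    mapping = {
        "publication": "container-title",
        "page": "page",
        "issue": "issue",
        "volume": "volume",
    }
    return {std: first[key] for key, std in mapping.items() if first.get(key)}


parallel = [
    {"volume": "347", "publication": "U.S.", "page": "483"},
    {"volume": "74", "publication": "S. Ct.", "page": "686"},
]
print(serials_to_standard(parallel))
print(standard_to_serials({"container-title": "U.S.", "page-first": "483", "volume": "347"}))
```

The round trip is lossy by design: the second and later serial entries are dropped when converting back to standard fields.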


Summary

The Serial-Type field would stand alongside the Person-Type field and Date-type field as complex fields supported by CSL. I will summarize the proposed Serial-Type format in the following example:

<source>
    <serials>
        <serial>
            <volume>vol. #</volume>
            <publication>Title of primary publication</publication>
            <issue>Issue #</issue>
            <page>First Page</page>
        </serial>
        ...
    </serials>
</source>
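
For anyone who wants to experiment with this layout, the following Python sketch reads the proposed XML into a list of dicts using the standard library's xml.etree.ElementTree. The element names come from the examples in this post; the function name read_serials and the sample data layout are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Sample input in the proposed layout, reusing the Brown citation from above.
SAMPLE = """
<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
        </serial>
        <serial>
            <volume>74</volume>
            <publication>S. Ct.</publication>
            <page>686</page>
        </serial>
    </serials>
</source>
"""


def read_serials(xml_text):
    """Return one dict per <serial> element, in document order."""
    root = ET.fromstring(xml_text)
    serials = []
    for node in root.findall("./serials/serial"):
        # Each child element (volume, publication, issue, page) becomes a key.
        serials.append({child.tag: (child.text or "").strip() for child in node})
    return serials


for entry in read_serials(SAMPLE):
    print(entry)
```

Document order is preserved, so the first dict in the result is the primary publication.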



- Thomas O'Reilly

------------------------------------------------------------------------------

_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: Proposed change to CSL input XML specification

Sebastian Karcher
Thanks. Since technically CSL is the citation style language and there is no official input format (though obviously citeproc/CSL JSON as used by citeproc-js is relevant), I think it'd be helpful to formulate how this would look in actual CSL syntax. I'm not at all clear on that.

(/rant
Beyond that, I wish law folks would just get DOIs for their resources and be done with this. This is ridiculous. I know -- not going to happen and so we'll eventually have to solve this, but I'm having a really hard time motivating myself to put effort into accommodating 19th century citation practices. /rant)

On Thu, Sep 22, 2016 at 2:17 PM, Thomas O'Reilly <[hidden email]> wrote:

--
Sebastian Karcher, PhD
www.sebastiankarcher.com


Re: Proposed change to CSL input XML specification

Bruce D'Arcus-3
In reply to this post by toreilly
So in the context of being generally super busy, I'm a bit overwhelmed by the post. Thanks much for taking the time, but perhaps you can start with first principles, and explain as briefly as possible:

What's wrong with the current CSL formatting specification that leads you to this solution? Perhaps an example output of what cannot now be done?

I started to read your first example, as an example, and I was not seeing the problem you were trying to solve (unless it's an orthogonal problem around data representation, which is not our primary focus).

On Thu, Sep 22, 2016 at 2:17 PM, Thomas O'Reilly <[hidden email]> wrote:
I have a suggestion for adding a data type to the CSL specification.

Overview

Some of the design goals of a bibliography system are (1) simplicity, (2) comprehensiveness, (3) efficient encoding, and (4) adaptability to new uses and contexts. I think that adding one more complex field type to the CSL specification will actually improve the specification with regard to the 4 design goals mentioned above. CSL should add a "serial" field to encode information about serial sources. It would apply to periodicals, magazines, newspapers, case reporters, newsletters, academic journals, and books published in serial formats. If adopted, the CSL-JSON would support 3 complex field types in total: Date, Person, and Serial.

Examples

Without a serial-type field, one would have to encode a serial publication (like "Brown v. Board of Education of Topeka, Kansas, 483 U.S. 347") in ordinary CSL fields:


<source>
    <page>483</page>
    <publication>U.S.</publication>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <volume>347</volume>
</source>

With a serial-type field, you would instead encode the information pertaining to the serial publication location within a container that shows the logical relationship between the fields.

<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <volume>347</volume>
        <publication>U.S.</publication>
        <page>483</page>
    </serials>
</source>

Benefits

The second format shows to the human reader how the discrete fields of "volume", "publication", and "page" are inextricably linked. The essential variables are co-located instead of being dispersed. Moreover, it has technical advantages for encoding citations items in at least 2 real-world situations.

Parallel Citations

Some legal journals require parallel citations to a legal resources. A parallel citation is "A reference to the same case or statute published in two or more sources" according to the Legal Dictionary. For example, when citing a Supreme Court case, the writer may be expected to first cite the official reporter of the Supreme Court, and then cite the reporters published by WestLaw and LexisNexis. Likewise, when citing a multilateral treaty, the author should cite both an domestic reporter and an international reporter. So, if the user of CSL-JSON was required to use parallel citations, the "serial" type data structure would easily support it. See how the previous citation could be extended:


<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
         </serial>
         <serial>
               <volume>74</volume>
               <publication>S. Ct.</publication>
               <page>686</page>
          </serial>
          <serial>
               <volume>98</volume>
               <publication>L. Ed.</publication>
               <page>873</page>
           </serial>
    </serials>
</source>

The first listed publication would be assumed to be the primary publication in a "serials" type field. Without the Serial-type data field, this information would have to be encoded in fields like "1st volume", "1st publication", "1st FirstPage", and "2nd volume", "2nd publication", "2nd FirstPage", etc. Trying to encode all that information in ordinary CSL-JSON fields would be cumbersome while still not being comprehensive; and it would be unfriendly to human users while not providing flexibility for new situations and contexts.


Multipart Articles

Some articles are published across multiple issues of a publication. The "serial type" field would be able to efficiently and legibly encode a citation to a mutlipart article. If a citation appeared as follows:

Harlan F. Stone, The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 & 2), 18 Colum"page":". L. Rev. 291 (1918), 19 Colum. L. Rev. 177 (1919).

It would be encoded as follows:

<source>
     <author>
            <family>Stone</family>
            <given>Harlan F.</given>
     </author>
     <title>The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 & 2)</title>
     <serials>
           <serial>
                  <volume>18</volume>
                  <publication>Colum. L. Rev.</publication>
                  <page>291</page>
            </serial>
           <serial>
                  <volume>19</volume>
                  <publication>Colum. L. Rev.</publication>
                  <page>177</page>
            </serial>
     </serials>
     <issued>
          <date-parts>
                  <year>1918</year>
           <date-parts>
           <date-parts>
                  <year>1919</year>
           <date-parts>
     </issued>
</source>

The order of listing for "serial" elements should correspond to their respective dates of publication in the ordering of dates in "issued" element.

Proposed specification for "Serial type" Field

A complex field of "serial" type would consist of an array of one or more publications. Each publication would have 4 possible fields: "volume", "publication", "issue", and "page". The "publication" and "first page" fields would always be required, while "issue" and "volume" would depend on the context and source type. From my research, I believe that there are 3 main classes of serial publications. I will deal with each type in turn.

1. Non-Consecutively paginated serials with Volume numbers.

1st Case is for the serials by are published by non-consecutively paginated volumes (such as an academic journal). In this case, the citation to the source should include the Volume, Publication, Issue, and First Page.

2. Consecutively paginated serials with Volume numbers.

When issues within a volume continue from the pagination number of the previous issue, then identifying the issue number often not required (or even available). The required fields would be Volume, Publication, and First Page.

3. Serials that are identified only by issue, and do not track Volume numbers

Some periodicals, like newspapers, do not have volume numbers, and the issue is identified by the date of publication. In this case, the the volume number is not required. The required fields would be Publication, Issue, and First Page.

This situation creates an interesting predicament in which the date of publication may be duplicated within an item record. The date of publication would be recorded in the normal date-typed "issued" field, as well as within the serial-typed "issue" subfield. The specification could include a recommendation to leave the "issue" subfield blank if it merely copies the information from the "issued" field.

Mapping Serial-Type Fields to standard CSL fields

No CSL styles or processors currently support the "serials" type. In order to make a transition possible, processors should use the following mapping to ensure compatibility between the different styles.

The elements that pertain to information that would also be located in a "serials" field are the following:
  • "container-title" - title of the container holding the item (e.g. the book title for a book chapter, the journal title for a journal article)

  • “page” - range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • “page-first” - first page of the range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • “issue” - (container) issue holding the item (e.g. “5” when citing a journal article from journal volume 2, issue 5)

  • “volume” - (container) volume holding the item (e.g. “2” when citing a chapter from book volume 2)



Some elements are near-misses for inclusion. I include these to ensure that proper discussion is had.

  • “number-of-volumes” - total number of volumes, usable for citing multi-volume books and such. Multi-volume books are not serial publications. Serial publications are defined to have indefinite length. There is no instance in which this variable would be useful for serial sources.

  • "collection-title" - title of the collection holding the item (e.g. the series title for a book). When a work appears within a collection of works, the title of the containing work should be encoded in this variable. However, not all citation styles support this variable. In my survey of 40 styles, only 28 supported this variable. This makes me wonder if style-creators are using the "container-title" variable for both serial publications and collections.
  • “edition” - (container) edition holding the item (e.g. “3” when citing a chapter in the third edition of a book). I am not aware of any serial publications that use editions.

  • “number” - number identifying the item (e.g. a report number). This is a tricky one, especially for legal drafters. Report numbering can appear to look like the numbering for a serial publication. I am open to doing more research on this.

From standard CSL fields to "serial-type" CSL fields:
  • "container-title" ==> "publication"
  • "page" ==> "page"
  • "page-first" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"


From serial-type CSL fields to standard CSL fields:
  • "publication" ==> "container-title"
  • "page" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"

When converting information from a "serials" field to standard CSL fields, only the first "serial" child element should be translated.


Summary

The Serial-Type field would stand alongside the Person-Type field and Date-type field as complex fields supported by CSL. I will summarize the proposed Serial-Type format in the following example:

<source>
       <serials>
              <serial>
                     <volume>vol. #</volume>
                     <publication>Title of primary publication</publication>
                     <issue>Issue #</issue>
                     <page>First Page</page>
               </serial>
               ...
       </serials>
</source>



- Thomas O'Reilly

------------------------------------------------------------------------------

_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel




Re: Proposed change to CSL input XML specification

Sebastian Karcher
Thomas does say what this is for:
1. Articles published across serials such as:
Harlan F. Stone, The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 & 2), 18 COLUM. L. REV. 291 (1918), 19 COLUM. L. REV. 177 (1919).
(notice the two separate journal issues and dates for a single title)

2. Parallel legal citations
Czapinski v. St. Francis Hosp., Inc., 2000 WI 80, 236 Wis. 2d 316, 613 N.W.2d 120.

I only have an approximate understanding of this, but basically 2000 WI 80, 236 Wis. 2d 316, and 613 N.W.2d 120 are three different places ("reporters") where the case has been published, and legal citation practice (cf. my rant above) requires listing all three -- hence three serials in a single citation.

Personally I think 1. is rare enough to be irrelevant. You could just cite the above separately (as is often done) or list a date range (which CSL already supports, even though .
2., on the other hand, is a super-common component of legal citations, so to the extent we want to support legal citations, we have to support parallel citations.
Frank in juris-m/csl-m does solve this differently, i.e. by automatically "collapsing" the same case when cited in a single citation, the same way CSL does for the same author. That's also more in line with the data storage model used by most upstream clients of CSL (which is one of my major worries with Thomas's proposal: we can put this in CSL all we want, but if Zotero and Mendeley don't implement a data model that can produce this -- thus making it useless to 80%+ of CSL users -- what good does it do us?)
But I'm open to being convinced that there is a compelling _and_ feasible case here. For me, though, the starting point would be mock-csl syntax rather than input data.


On Thu, Sep 22, 2016 at 3:06 PM, Bruce D'Arcus <[hidden email]> wrote:
So in the context of being generally super busy, I'm a bit overwhelmed by the post. Thanks much for taking the time, but perhaps you can start with first principles, and explain as briefly as possible:

What's wrong with the current CSL formatting specification that leads you to this solution? Perhaps an example output of what cannot now be done?

I started to read your first example, as an example, and I was not seeing the problem you were trying to solve (unless it's an orthogonal problem around data representation, which is not our primary focus).

On Thu, Sep 22, 2016 at 2:17 PM, Thomas O'Reilly <[hidden email]> wrote:
I have a suggestion for adding a data type to the CSL specification.

Overview

Some of the design goals of a bibliography system are (1) simplicity, (2) comprehensiveness, (3) efficient encoding, and (4) adaptability to new uses and contexts. I think that adding one more complex field type to the CSL specification will actually improve the specification with regard to the 4 design goals mentioned above. CSL should add a "serial" field to encode information about serial sources. It would apply to periodicals, magazines, newspapers, case reporters, newsletters, academic journals, and books published in serial formats. If adopted, the CSL-JSON would support 3 complex field types in total: Date, Person, and Serial.

Examples

Without a serial-type field, one would have to encode a serial publication (like "Brown v. Board of Education of Topeka, Kansas, 347 U.S. 483") in ordinary CSL fields:


<source>
    <page>483</page>
    <publication>U.S.</publication>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <volume>347</volume>
</source>

With a serial-type field, you would instead encode the information pertaining to the serial publication location within a container that shows the logical relationship between the fields.

<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
        </serial>
    </serials>
</source>

Benefits

The second format shows to the human reader how the discrete fields of "volume", "publication", and "page" are inextricably linked. The essential variables are co-located instead of being dispersed. Moreover, it has technical advantages for encoding citations items in at least 2 real-world situations.

Parallel Citations

Some legal journals require parallel citations to legal resources. A parallel citation is "A reference to the same case or statute published in two or more sources" according to the Legal Dictionary. For example, when citing a Supreme Court case, the writer may be expected to first cite the official reporter of the Supreme Court, and then cite the reporters published by WestLaw and LexisNexis. Likewise, when citing a multilateral treaty, the author should cite both a domestic reporter and an international reporter. So, if a user of CSL-JSON were required to use parallel citations, the "serial" type data structure would easily support it. See how the previous citation could be extended:


<source>
    <title>Brown v. Board of Education of Topeka, Kansas</title>
    <serials>
        <serial>
            <volume>347</volume>
            <publication>U.S.</publication>
            <page>483</page>
        </serial>
        <serial>
            <volume>74</volume>
            <publication>S. Ct.</publication>
            <page>686</page>
        </serial>
        <serial>
            <volume>98</volume>
            <publication>L. Ed.</publication>
            <page>873</page>
        </serial>
    </serials>
</source>

The first listed publication would be assumed to be the primary publication in a "serials" type field. Without the Serial-type data field, this information would have to be encoded in fields like "1st volume", "1st publication", "1st FirstPage", and "2nd volume", "2nd publication", "2nd FirstPage", etc. Trying to encode all that information in ordinary CSL-JSON fields would be cumbersome while still not being comprehensive; and it would be unfriendly to human users while not providing flexibility for new situations and contexts.
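To make the parallel-citation case concrete, here is a minimal Python sketch of how a processor might hold the "serials" array in memory and render it. This is only an illustration under my own assumptions: the function name render_parallel and the dict layout are hypothetical and are not part of CSL or of any existing processor.

```python
from typing import Dict, List

# Hypothetical in-memory form of the proposed "serials" field:
# a list of dicts with the subfields named in the proposal.
Serial = Dict[str, str]


def render_parallel(title: str, serials: List[Serial]) -> str:
    """Join the title and each serial's "volume publication page" locator."""
    locators = [
        " ".join(s[key] for key in ("volume", "publication", "page") if key in s)
        for s in serials
    ]
    return ", ".join([title] + locators)


brown = {
    "title": "Brown v. Board of Education of Topeka, Kansas",
    "serials": [
        {"volume": "347", "publication": "U.S.", "page": "483"},
        {"volume": "74", "publication": "S. Ct.", "page": "686"},
        {"volume": "98", "publication": "L. Ed.", "page": "873"},
    ],
}

print(render_parallel(brown["title"], brown["serials"]))
```

The first entry in the list is treated as the primary publication simply by position, so no extra "1st volume" / "2nd volume" style fields are needed.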


Multipart Articles

Some articles are published across multiple issues of a publication. The "serial type" field would be able to efficiently and legibly encode a citation to a multipart article. If a citation appeared as follows:

Harlan F. Stone, The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 & 2), 18 Colum. L. Rev. 291 (1918), 19 Colum. L. Rev. 177 (1919).

It would be encoded as follows:

<source>
    <author>
        <family>Stone</family>
        <given>Harlan F.</given>
    </author>
    <title>The Equitable Rights and Liabilities of Strangers to a Contract (pts. 1 &amp; 2)</title>
    <serials>
        <serial>
            <volume>18</volume>
            <publication>Colum. L. Rev.</publication>
            <page>291</page>
        </serial>
        <serial>
            <volume>19</volume>
            <publication>Colum. L. Rev.</publication>
            <page>177</page>
        </serial>
    </serials>
    <issued>
        <date-parts>
            <year>1918</year>
        </date-parts>
        <date-parts>
            <year>1919</year>
        </date-parts>
    </issued>
</source>

The order of the "serial" elements should correspond to the order of their respective publication dates in the "issued" element.
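The positional pairing between "serial" entries and "issued" dates can be sketched in Python as follows. This is a rough illustration, assuming one year per serial entry; the function name render_multipart and the plain list of year strings are my own hypothetical simplifications, not an existing API.

```python
from typing import Dict, List


def render_multipart(serials: List[Dict[str, str]], years: List[str]) -> str:
    """Pair the n-th serial with the n-th issued year and render each part."""
    if len(serials) != len(years):
        # Assumption: the proposal implies exactly one date per serial entry.
        raise ValueError("expected one issued year per serial entry")
    parts = []
    for serial, year in zip(serials, years):
        locator = " ".join(
            serial[key] for key in ("volume", "publication", "page") if key in serial
        )
        parts.append(f"{locator} ({year})")
    return ", ".join(parts)


stone_serials = [
    {"volume": "18", "publication": "Colum. L. Rev.", "page": "291"},
    {"volume": "19", "publication": "Colum. L. Rev.", "page": "177"},
]
stone_years = ["1918", "1919"]

print(render_multipart(stone_serials, stone_years))
```

Raising an error on a length mismatch is a design choice of this sketch; a real processor might instead fall back to the first date.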

Proposed specification for "Serial type" Field

A complex field of "serial" type would consist of an array of one or more publications. Each publication would have 4 possible fields: "volume", "publication", "issue", and "page". The "publication" and "page" (first page) fields would always be required, while "issue" and "volume" would depend on the context and source type. From my research, I believe that there are 3 main classes of serial publications. I will deal with each type in turn.

1. Non-Consecutively paginated serials with Volume numbers.

The first case covers serials that are published in non-consecutively paginated volumes (such as an academic journal). In this case, the citation to the source should include the Volume, Publication, Issue, and First Page.

2. Consecutively paginated serials with Volume numbers.

When issues within a volume continue from the page number of the previous issue, identifying the issue number is often not required (or even possible). The required fields would be Volume, Publication, and First Page.

3. Serials that are identified only by issue, and do not track Volume numbers

Some periodicals, like newspapers, do not have volume numbers, and the issue is identified by the date of publication. In this case, the volume number is not required. The required fields would be Publication, Issue, and First Page.

This situation creates an interesting predicament in which the date of publication may be duplicated within an item record. The date of publication would be recorded in the normal date-typed "issued" field, as well as within the serial-typed "issue" subfield. The specification could include a recommendation to leave the "issue" subfield blank if it merely copies the information from the "issued" field.
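The required-field rules for the three classes above could be checked with something like the following Python sketch. The class labels ("non_consecutive", "consecutive", "issue_only") and the function name missing_fields are hypothetical names I made up for illustration, and the sample records are illustrative data only.

```python
from typing import Dict, List

# Required subfields per class of serial, as listed in the three cases above.
# The class labels are hypothetical; the proposal does not name them.
REQUIRED = {
    "non_consecutive": ("volume", "publication", "issue", "page"),
    "consecutive": ("volume", "publication", "page"),
    "issue_only": ("publication", "issue", "page"),
}


def missing_fields(serial: Dict[str, str], serial_class: str) -> List[str]:
    """Return the required subfields that are absent or empty in this serial."""
    return [name for name in REQUIRED[serial_class] if not serial.get(name)]


reporter = {"volume": "347", "publication": "U.S.", "page": "483"}
newspaper = {"publication": "Example Daily", "page": "A1"}  # illustrative data

print(missing_fields(reporter, "consecutive"))
print(missing_fields(newspaper, "issue_only"))
```

Note that if the recommendation to leave "issue" blank when it only repeats "issued" were adopted, a check like this would need to treat "issue" as optional for the third class.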

Mapping Serial-Type Fields to standard CSL fields

No CSL styles or processors currently support the "serials" type. In order to make a transition possible, processors should use the following mapping to ensure compatibility between the different styles.

The elements that pertain to information that would also be located in a "serials" field are the following:
  • "container-title" - title of the container holding the item (e.g. the book title for a book chapter, the journal title for a journal article)

  • “page” - range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • “page-first” - first page of the range of pages the item (e.g. a journal article) covers in a container (e.g. a journal issue)

  • “issue” - (container) issue holding the item (e.g. “5” when citing a journal article from journal volume 2, issue 5)

  • “volume” - (container) volume holding the item (e.g. “2” when citing a chapter from book volume 2)



Some elements are near-misses for inclusion. I include these to ensure that proper discussion is had.

  • “number-of-volumes” - total number of volumes, usable for citing multi-volume books and such. Multi-volume books are not serial publications. Serial publications are defined to have indefinite length. There is no instance in which this variable would be useful for serial sources.

  • "collection-title" - title of the collection holding the item (e.g. the series title for a book). When a work appears within a collection of works, the title of the containing work should be encoded in this variable. However, not all citation styles support this variable. In my survey of 40 styles, only 28 supported this variable. This makes me wonder if style-creators are using the "container-title" variable for both serial publications and collections.
  • “edition” - (container) edition holding the item (e.g. “3” when citing a chapter in the third edition of a book). I am not aware of any serial publications that use editions.

  • “number” - number identifying the item (e.g. a report number). This is a tricky one, especially for legal drafters. Report numbering can appear to look like the numbering for a serial publication. I am open to doing more research on this.

From standard CSL fields to "serial-type" CSL fields:
  • "container-title" ==> "publication"
  • "page" ==> "page"
  • "page-first" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"


From serial-type CSL fields to standard CSL fields:
  • "publication" ==> "container-title"
  • "page" ==> "page"
  • "issue" ==> "issue"
  • "volume" ==> "volume"

When converting information from a "serials" field to standard CSL fields, only the first "serial" child element should be translated.
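The two mappings above can be sketched as a pair of Python functions. This is a minimal sketch, not a reference implementation: the names to_serial and from_serials are hypothetical, and where both "page" and "page-first" are present I assume "page-first" wins, since the proposal describes the serial "page" subfield as the first page.

```python
from typing import Dict, List

# Standard CSL variable -> serial subfield, applied in order.
# "page-first" comes last so that it overrides "page" (an assumption).
TO_SERIAL = [
    ("container-title", "publication"),
    ("issue", "issue"),
    ("volume", "volume"),
    ("page", "page"),
    ("page-first", "page"),
]

# Serial subfield -> standard CSL variable.
FROM_SERIAL = {
    "publication": "container-title",
    "page": "page",
    "issue": "issue",
    "volume": "volume",
}


def to_serial(item: Dict[str, str]) -> Dict[str, str]:
    """Build one serial entry from standard CSL variables."""
    serial = {}
    for source, target in TO_SERIAL:
        if source in item:
            serial[target] = item[source]
    return serial


def from_serials(serials: List[Dict[str, str]]) -> Dict[str, str]:
    """Translate only the first serial entry back to standard CSL variables."""
    if not serials:
        return {}
    first = serials[0]
    return {FROM_SERIAL[key]: value for key, value in first.items() if key in FROM_SERIAL}


item = {"container-title": "U.S.", "volume": "347", "page": "483-496", "page-first": "483"}
print(to_serial(item))
print(from_serials([to_serial(item), {"volume": "74", "publication": "S. Ct.", "page": "686"}]))
```

Because only the first entry is translated back, any parallel citations are dropped in the round trip, which is the compatibility cost of the fallback.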


Summary

The Serial-Type field would stand alongside the Person-Type field and Date-type field as complex fields supported by CSL. I will summarize the proposed Serial-Type format in the following example:

<source>
       <serials>
              <serial>
                     <volume>vol. #</volume>
                     <publication>Title of primary publication</publication>
                     <issue>Issue #</issue>
                     <page>First Page</page>
               </serial>
               ...
       </serials>
</source>



- Thomas O'Reilly





--
Sebastian Karcher, PhD
www.sebastiankarcher.com
