Citeproc json data input specs

panyasan
Administrator
Hello,

I am currently working on integrating CSL/CiteProc into my web application, starting with citeproc-php (and looking forward to the new release of citeproc-hs!). Now I have run into a very basic question: what is the structure of the JSON data that is fed into citeproc-js / citeproc-php? The test cases only cover tiny bits of it, but I would need a complete specification of the structure. Is it MODS converted to JSON?

Thanks for a clarification,

Christian

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Thu, Jun 3, 2010 at 10:27 AM, panyasan <[hidden email]> wrote:

> I am currently working on integrating CSL/CiteProc into my web application,
> starting with citeproc-php (and looking forward to the new release of
> citeproc-hs!). Now I have run into a very basic question: what is the structure
> of the JSON data that is fed into citeproc-js / citeproc-php? The test
> cases only cover tiny bits of it, but I would need a complete specification
> of the structure. Is it MODS converted to JSON?

Yeah, this keeps coming up. We probably need to do something about
it. Rintze, Frank, what do you think is the easiest, quickest way
to put this together?

One option I'd thought about earlier was to just define it in RNC, and
perhaps create an XSLT to convert the XML version of it to some kind
of JSON schema and documentation? Does that seem feasible?

Bruce

------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit.  See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: Citeproc json data input specs

Andrea Rossato
In reply to this post by panyasan
On Thu, Jun 03, 2010 at 07:27:34AM -0700, panyasan wrote:
>
> Hello,
>
> I am currently working on integrating CSL/CiteProc into my web application,
> starting with citeproc-php (and looking forward to the new release of
> citeproc-hs!). Now I have run into a very basic question: what is the structure
> of the JSON data that is fed into citeproc-js / citeproc-php? The test
> cases only cover tiny bits of it, but I would need a complete specification
> of the structure. Is it MODS converted to JSON?

No, MODS is only read by citeproc-hs.

Although the JSON data structure is not documented in one place, you
can find all the needed information in the available documentation.

The input object - the list of references - is modelled on the CSL
list of variables:
http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-variables.rnc

The cs-names object (author, etc) is documented in the citeproc-js
documentation:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#id25

Dates are documented here:
http://gsl-nagoya-u.net/http/pub/citeproc-doc.html#input-dates

The type object (a string) can take any of the values listed here:
http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-types.rnc

The citation-items object and the citations object, which store the
list of cites (or groups of citations, if you prefer) and provide some
commands for the processor, are described in the citeproc-js
documentation and the tests' documentation.
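
To make the pieces above concrete, here is a minimal sketch of a single reference item, written as a PHP array and serialized with json_encode(). It is an illustration pieced together from the documentation linked above, not an official specification: the function name example_csl_item() and all bibliographic values are made up, and the "family"/"given" name keys and the "date-parts" date form are assumed from the citeproc-js documentation.

```php
<?php
// Hypothetical helper: returns one reference item in the shape sketched above.
// All values are invented sample data.
function example_csl_item()
{
    return array(
        'id'              => 'doe2010',          // item key
        'type'            => 'article-journal',  // one of the CSL types
        'title'           => 'An Example Article',
        'container-title' => 'Journal of Examples',
        'volume'          => '12',
        'page'            => '1-10',
        // names: a list of objects (assumed "family"/"given" keys)
        'author'          => array(
            array('family' => 'Doe', 'given' => 'Jane'),
        ),
        // dates: assumed "date-parts" form, [[year, month, day]]
        'issued'          => array('date-parts' => array(array(2010, 6, 3))),
    );
}

// The processor input is a collection of such items.
echo json_encode(array(example_csl_item())), "\n";
```

Whether a given processor expects a plain list of items or a collection keyed by id should be checked against that processor's own documentation.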

Hope this helps.

Andrea

PS: the Haskell implementation of the JSON data structure is based on
that documentation. Still, Haskell is a strongly typed language, and I
would like to have a specification of the JSON data types in order to
avoid the lack of uniformity you may find in the test suite (sometimes
an "id" is an integer, sometimes a string... and some values may range
from numbers to strings or even booleans). But PHP is well known for
not suffering from such a problem... ;-)
https://lwn.net/Articles/369505/


Re: Citeproc json data input specs

panyasan
Administrator
Andrea, thank you for this very helpful information!

My own data model is very primitive - it is basically an extended BibTeX schema, which I now need to map to this more sophisticated data model.

Thanks again,

Christian

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Thu, Jun 3, 2010 at 4:10 PM, panyasan <[hidden email]> wrote:

> My own data model is very primitive - it is basically an extended BibTeX
> schema, which I now need to map to this more sophisticated data model.

Could you see if you can document the bibtex --> csl mapping while you
do it? We're going to need to document this, and others.

Bruce


Re: Citeproc json data input specs

panyasan
Administrator
Sure, I'll be happy to...

C.

Re: Citeproc json data input specs

panyasan
Administrator
Here's the first installment:

    $this->type_data = array (
        "article" => array (
          "label"         => _("Article"),
          "bibtex"        => true,
          "citeProcType"  => "article-journal"
        ),
        "book" => array (
          "label"         => _("Book (Monograph)"),
          "bibtex"        => true,
          "citeProcType"  => "book"
        ),
        "booklet" => array (
          "label"         => _("Booklet"),
          "bibtex"        => true,
          "citeProcType"  => "pamphlet"
        ),
        // non-standard
        "collection" => array (
          "label"         => _("Book (Edited)"),
          "bibtex"        => false,
          "citeProcType"  => "book"
        ),
        // non-standard use: BibTeX normally treats this the same as "inproceedings"
        "conference" => array (
          "label"         => _("Conference Paper"),
          "bibtex"        => true,
          "citeProcType"  => "paper-conference"
        ),
        "inbook" => array (
          "label"         => _("Book Chapter"),
          "bibtex"        => true,
          "citeProcType"  => "chapter"
        ),
        // non-standard
        "incollection"  => array (
          "label"         => _("Chapter in Edited Book"),
          "bibtex"        => true,
          "citeProcType"  => "chapter"
        ),
        "inproceedings" => array (
          "label"         => _("Paper in Conference Proceedings"),
          "bibtex"        => true,
          "citeProcType"  => "chapter"
        ),
        // non-standard
        "journal" => array (
          "label"         => _("Journal Issue"),
          "bibtex"        => false,
          "citeProcType"  => "???" // => type: periodical?
        ),
        // non-standard use
        "manual" => array (
          "label"         => _("Handbook"),
          "bibtex"        => true,
          "citeProcType"  => "book"
        ),
        "mastersthesis" => array (
          "label"         => _("Master's Thesis"),
          "bibtex"        => true,
          "citeProcType"  => "thesis"
        ),
        "misc" => array (
          "label"         => _("Miscellaneous"),
          "bibtex"        => true,
          "citeProcType"  => "manuscript" // ????
        ),
        "phdthesis" => array (
          "label"         => _("Ph.D. Thesis"),
          "bibtex"        => true,
          "citeProcType"  => "thesis"
        ),
        "proceedings" => array (
          "label"         => _("Conference Proceedings"),
          "bibtex"        => true,
          "citeProcType"  => "book"
        ),
        "techreport" => array (
          "label"         => _("Report/Working Paper"),
          "bibtex"        => true,
          "citeProcType"  => "report"
        ),
        "unpublished" => array (
          "label"         => _("Unpublished Manuscript"),
          "bibtex"        => true,
          "citeProcType"  => "manuscript"
        )
      );

As you can see, the BibTeX types are much fewer than the CSL types, but I still have a couple of problems mapping my own version of the BibTeX types.

- There isn't a notion of "periodical" or "serial" (like journals) in BibTeX, but I also couldn't find one in the CSL specs.
- Is a "booklet" a "pamphlet" or a "book"?
- What would be the equivalent of "misc", i.e. stuff that doesn't really fit anywhere else? But then, I have a hard time thinking of anything that doesn't fit ;-)
- I guess the difference between edited books and monographs does not need to be expressed at the level of the reference type, but is figured out from whether there is an author or an editor?

More later,

C.

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Thu, Jun 3, 2010 at 5:10 PM, panyasan <[hidden email]> wrote:

> As you can see, the BibTeX types are much fewer than the CSL types, but I
> still have a couple of problems mapping my own version of the BibTeX types.
>
> - There isn't a notion of "periodical" or "serial" (like journals) in
> BibTeX, but I also couldn't find one in the CSL specs.

container-title

> - Is a "booklet" a "pamphlet" or a "book"?

I'd probably say the former. Not really sure what a "booklet" is :-)

> - What would be the equivalent of "misc", i.e. stuff that doesn't really fit
> anywhere else? But then, I have a hard time thinking of anything that
> doesn't fit ;-)

Yeah, I always hated the "misc" type, so there is none. I believe we
have a "document" type (?) that might work.

> - I guess the difference between edited books and monographs does not need
> to be expressed at the level of the reference type, but is figured out from
> whether there is an author or an editor?

Yes.

Bruce


Re: Citeproc json data input specs

panyasan
Administrator
Thanks. Here is the mapping data for "fields". A "csl" value of false doesn't always mean that there is no equivalent, just that a simple 1:1 translation is not possible. The more complex fields are parsed separately.

      /**
       * all fields and their metadata
       */
      $this->field_data = array (
        'reftype' => array(
          'label'     => _("Bibliographic Type"),
          'type'      => "string",
          "csl"       => "type"
        ),
        'citekey' => array(
          'label'     => _("Citation Key"),
          'type'      => "string",
          "csl"       => "id" // the item key is lowercase in the processor input
        ),
        'abstract' => array(
          'label'     => _("Abstract"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "abstract"
        ),
        // this is used for publisher-place or for author address
        'address' => array(
          'label'     => _("Place"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "publisher-place"
        ),
         // author affiliation
        'affiliation' => array(
          'label'     => _("Affiliation"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false // ???
        ),
        'annote' => array(
          'label'     => _("Annotation"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "annote"
        ),
        'author' => array(
          'label'     => _("Authors"),
          'type'      => "string",
          'bibtex'    => true,
          'separator' => ";",
          "csl"       => "author"
        ),
        'booktitle' => array(
          'label'     => _("Book Title"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "container-title"
        ),
        'contents' => array(
          'label'     => _("Contents"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'copyright' => array(
          'label'     => _("Copyright"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => null
        ),
        'crossref' => array(
          'label'     => _("Cross Reference"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "references"
        ),
        'date'   => array(
          'label'     => _("Date"),
          'type'      => "date",
          'bibtex'    => true,
          "csl"       => false
        ),
        'edition' => array(
          'label'     => _("Edition"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "edition"
        ),
        'editor' => array(
          'label'     => _("Editors"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "editor"
        ),
        'howpublished' => array(
          'label'     => _("Published As"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'institution' => array(
          'label'     => _("Institution"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'isbn'   => array(
          'label'     => _("ISBN"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "ISBN"
        ),
        'issn'   => array(
          'label'     => _("ISSN"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "ISSN"
        ),
        'journal' => array(
          'label'     => _("Journal"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "container-title"
        ),
        // don't know what this is for, anyways
        'key' => array(
          'label'     => _("Key"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'keywords' => array(
          'label'     => _("Keywords"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "keyword"
        ),
        'language' => array(
          'label'     => _("Language"),
          'autocomplete'  => array('separator' => null ),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false // ???
        ),
        'lccn' => array(
          'label'     => _("Call Number"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "call-number"
        ),
        // field to store where the book is kept
        'location' => array(
          'label'     => _("Location"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'month' => array(
          'label'     => _("Month"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'note' => array(
          'label'     => _("Note"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "note"
        ),
        'number' => array(
          'label'     => _("Number"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "number"
        ),
        'organization' => array(
          'label'     => _("Organization"),
          'type'      => "string",
          "csl"       => false
        ),
        'pages' => array(
          'label'     => _("Pages"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "page"
        ),
        'price' => array(
          'label'     => _("Price"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'publisher' => array(
          'label'         => _("Publisher"),
          'type'          => "string",
          'bibtex'        => true,
          "csl"       => "publisher"
        ),
        'school' => array(
          'label'         => _("University"),
          'type'          => "string",
          'bibtex'        => true,
          "csl"       => false
        ),
        'series' => array(
          'label'         => _("Series"),
          'type'          => "string",
          'bibtex'        => true,
          "csl"       => "collection-title"
        ),
        'size'   => array(
          'label'     => _("Size"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'subtitle' => array(
          'label'     => _("Subtitle"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => false
        ),
        'title' => array(
          'label'     => _("Title"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "title"
        ),
        'url' => array(
          'label'     => _("Internet Link"),
          'type'      => "link",
          'bibtex'    => true,
          "csl"       => "URL"
        ),
        'volume' => array(
          'label'     => _("Volume"),
          'type'      => "string",
          'bibtex'    => true,
          "csl"       => "volume"
        ),
        'year'     => array(
          'label'     => _("Year"),
          'type'      => "int",
          'bibtex'    => true,
          "csl"       => "issued"
        )
      );
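
As a rough sketch of how such a table might be applied, the following converts one application record into a CSL-style item. Only the simple 1:1 fields go through the table; names and the year are handled separately, in line with the note above that complex fields are parsed separately. The function names, the "Family, Given" name format, and the small table excerpt are assumptions made for illustration, not part of the posted code.

```php
<?php
// Assumed excerpt of the field table above: app-side field => CSL variable.
$field_map = array(
    'title'   => 'title',
    'journal' => 'container-title',
    'volume'  => 'volume',
    'pages'   => 'page',
);

// Hypothetical: split "Family, Given; Family, Given" into name objects.
function parse_names($value)
{
    $names = array();
    foreach (explode(';', $value) as $raw) {
        $raw = trim($raw);
        if ($raw === '') {
            continue; // skip empty segments such as a trailing ";"
        }
        $parts = array_map('trim', explode(',', $raw, 2));
        $name  = array('family' => $parts[0]);
        if (isset($parts[1]) && $parts[1] !== '') {
            $name['given'] = $parts[1];
        }
        $names[] = $name;
    }
    return $names;
}

// Hypothetical: build a CSL-style item from one application record.
function record_to_csl(array $record, array $field_map)
{
    $item = array();
    // 1:1 fields are copied under their CSL variable name.
    foreach ($field_map as $field => $csl) {
        if (isset($record[$field]) && $record[$field] !== '') {
            $item[$csl] = $record[$field];
        }
    }
    // Name fields become lists of name objects.
    foreach (array('author', 'editor') as $role) {
        if (!empty($record[$role])) {
            $item[$role] = parse_names($record[$role]);
        }
    }
    // A bare year becomes a date in the assumed "date-parts" form.
    if (!empty($record['year'])) {
        $item['issued'] = array('date-parts' => array(array((int) $record['year'])));
    }
    return $item;
}
```

Keeping the complex conversions in separate functions leaves the table itself purely declarative, which also makes it easier to publish as documentation.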

Re: Citeproc json data input specs

fbennett
In reply to this post by Bruce D'Arcus-3
On Fri, Jun 4, 2010 at 6:14 AM, Bruce D'Arcus <[hidden email]> wrote:

> On Thu, Jun 3, 2010 at 5:10 PM, panyasan <[hidden email]> wrote:
>
>> As you can see, the BibTeX types are much fewer than the CSL types, but I
>> still have a couple of problems mapping my own version of the BibTeX types.
>>
>> - There isn't a notion of "periodical" or "serial" (like journals) in
>> BibTeX, but I also couldn't find one in the CSL specs.
>
> container-title
>
>> - Is a "booklet" a "pamphlet" or a "book"?
>
> I'd probably say the former. Not really sure what a "booklet" is :-)
>
>> - What would be the equivalent of "misc", i.e. stuff that doesn't really fit
>> anywhere else? But then, I have a hard time thinking of anything that
>> doesn't fit ;-)
>
> Yeah, I always hated the "misc" type, so there is none. I believe we
> have a "document" type (?) that might work.

There is no "document" type in CSL.  There is a "Document" item type
in the Zotero UI, but it maps to vanilla CSL "article" (in both
Zotero 2.0 and 2.1a1).  It is the only Zotero item type that maps to
"article" (as opposed to "article-journal", et cetera), so for Zotero
purposes, for the present at least, "document" == "article".  But you
wouldn't want to specify that generally, as it's pretty clearly an ad
hoc solution.

One possibility would be to explicitly permit nil or unrecognized
values for "type" in the CSL specification.  Styles that discriminate
between item types would then pick up and process such a creature on
the cs:else branch of a conditional statement.  That might require a
modification of the CSL schema, though; I'm not sure.
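
On the application side, this proposal could look roughly like the sketch below: the lookup returns null for an unmapped type instead of forcing it into an ill-fitting one, and the item is then built without a "type", leaving the style's cs:else branch to deal with it. The function names are hypothetical, the table is a small excerpt of the type mapping posted earlier in the thread, and whether a processor accepts a missing "type" is exactly the open question raised above.

```php
<?php
// Excerpt of the BibTeX => CSL type table posted earlier in the thread.
// "misc" is deliberately left out, since no catch-all CSL type was identified.
$type_map = array(
    'article'       => 'article-journal',
    'book'          => 'book',
    'booklet'       => 'pamphlet',
    'mastersthesis' => 'thesis',
    'phdthesis'     => 'thesis',
    'techreport'    => 'report',
);

// Hypothetical helper: returns the CSL type, or null when there is no mapping.
function bibtex_to_csl_type(array $type_map, $bibtex_type)
{
    $key = strtolower(trim($bibtex_type));
    return isset($type_map[$key]) ? $type_map[$key] : null;
}

// Hypothetical helper: starts an input item; "type" is omitted when unknown.
function start_item($id, array $type_map, $bibtex_type)
{
    $item     = array('id' => $id);
    $csl_type = bibtex_to_csl_type($type_map, $bibtex_type);
    if ($csl_type !== null) {
        $item['type'] = $csl_type;
    }
    return $item;
}
```

Returning null rather than a default such as "article" keeps the ad hoc choice out of the data, so it can be made (or not) in one place later.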


Re: Citeproc json data input specs

fbennett
In reply to this post by panyasan
On Fri, Jun 4, 2010 at 6:40 AM, panyasan <[hidden email]> wrote:

>
> Thanks. Here is the mapping data for "fields". A "csl" value of false
> doesn't always mean that there is no equivalent, just that a simple 1:1
> translation is not possible. The more complex fields are parsed
> separately.
>

It's great to see this.

Taking things one step further, I think it's important to adopt two
rules for app->CSL mappings.

(1) No app-side field should map to more than one CSL variable; and

(2) No two fields available in an app-side item type should map to the
same CSL variable.

This would prevent a situation where a CSL style developed on one
system is intractably broken on another.  We're already headed for
that difficulty, as the current Zotero mappings violate both rules.
For example:

The "series" and "seriesTitle" fields available on Zotero
"journalArticle" both map to collection-title ...

http://gsl-nagoya-u.net/http/pub/csl-fields/journalArticle.html

... and the Zotero "place" field maps to both "event-place" and
"publisher-place".

We can avoid potential chaos by requiring two things of applications
that integrate with a CSL processor:

(a) A description of the app-side fields available for each app-side
item type; and

(b) A description of the mappings of app-side fields to CSL variables.

If the descriptions are expressed in JSON (say), a small script can
check whether the rules are broken anywhere.  It's a simple thing, but
not necessarily obvious on the surface; Zotero has been running
happily for years with issues concerning (1) and (2) above; but when
other applications enter the mix and we start swapping CSL styles
between them, it's going to be a problem.
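
A small checker along these lines might look like the sketch below. It assumes, purely for illustration, that descriptions (a) and (b) are combined into one JSON object of the form item type => field => list of CSL variables; that layout, the function name check_mapping_rules(), and the item type used for "place" in the sample data are all assumptions, not an agreed format.

```php
<?php
// Checks an app => CSL mapping description against rules (1) and (2).
// $mappings: item type => (app-side field => list of CSL variables).
// Returns a list of human-readable problem descriptions (empty if none).
function check_mapping_rules(array $mappings)
{
    $problems = array();
    $targets  = array(); // field => set of CSL variables, across all item types

    foreach ($mappings as $item_type => $fields) {
        $used_by = array(); // CSL variable => set of fields, within this item type
        foreach ($fields as $field => $variables) {
            foreach ($variables as $variable) {
                $targets[$field][$variable] = true;
                $used_by[$variable][$field] = true;
            }
        }
        // Rule (2): two fields of one item type share a CSL variable.
        foreach ($used_by as $variable => $field_set) {
            if (count($field_set) > 1) {
                $problems[] = "rule 2: in $item_type, "
                    . implode(' and ', array_keys($field_set))
                    . " all map to $variable";
            }
        }
    }
    // Rule (1): one field maps to more than one CSL variable.
    foreach ($targets as $field => $variable_set) {
        if (count($variable_set) > 1) {
            $problems[] = "rule 1: $field maps to "
                . implode(' and ', array_keys($variable_set));
        }
    }
    return $problems;
}

// Illustrative description, modelled on the two Zotero cases mentioned above.
// The item type given for "place" is an assumption.
$json = '{
  "journalArticle":  {"series": ["collection-title"], "seriesTitle": ["collection-title"]},
  "conferencePaper": {"place": ["event-place", "publisher-place"]}
}';

foreach (check_mapping_rules(json_decode($json, true)) as $problem) {
    echo $problem, "\n";
}
```

Rule (1) is checked across all item types and rule (2) within each item type, which is one reading of the wording above.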

Frank



Re: Citeproc json data input specs

Bruce D'Arcus-3
In reply to this post by fbennett
On Thu, Jun 3, 2010 at 5:41 PM, Frank Bennett <[hidden email]> wrote:

> On Fri, Jun 4, 2010 at 6:14 AM, Bruce D'Arcus <[hidden email]> wrote:
>> On Thu, Jun 3, 2010 at 5:10 PM, panyasan <[hidden email]> wrote:
>>
>>> As you can see, the BibTeX types are much fewer than the CSL types, but I
>>> still have a couple of problems mapping my own version of the BibTeX types.
>>>
>>> - There isn't a notion of "periodical" or "serial" (like journals) in
>>> BibTeX, but I also couldn't find one in the CSL specs.
>>
>> container-title
>>
>>> - Is a "booklet" a "pamphlet" or a "book"?
>>
>> I'd probably say the former. Not really sure what a "booklet" is :-)
>>
>>> - What would be the equivalent of "misc", stuff that doesn't really fit
>>> anywhere else? But then, I have a hard time thinking of anything that
>>> doesn't fit ;-)
>>
>> Yeah, I always hated the "misc" type, so there is none. I believe we
>> have a "document" type (?) that might work.
>
> There is no "document" type in CSL.  There is a "Document" item type
> in the Zotero UI, but it maps to vanilla CSL "article" (in both
> Zotero 2.0 and 2.1a1).  It is the only Zotero item type that maps to
> "article" (as opposed to "article-journal", et cetera), so for Zotero
> purposes, for the present, at least, "document" == "article".  But you
> wouldn't want to specify that generally, as it's pretty clearly an ad
> hoc solution.
>
> One possibility would be to explicitly permit nil or unrecognized
> values for "type" in the CSL specification.  Styles that discriminate
> between item types would then pick up and process such a creature on
> the cs:else branch of a condition statement.  That might require a
> modification of the CSL schema though, I'm not sure.

Perhaps we should add "document"?

Bruce


Re: Citeproc json data input specs

panyasan
Administrator
Returning to the original topic of this thread, and moving a discussion that I started on a different thread to here: I think what would really help with integrating the different processors into applications is some sort of "schema validation" for the JSON input data, like the one that exists for XML. This would allow us to enforce a uniform schema across implementations, and help application developers correctly map application-internal data to the data expected by the CSL formatting engines.

After some research, it seems to me that there are multiple projects which try to do some sort of JSON schema validation, but no standard has emerged yet (correct me if I am wrong). I found this project, which has implementations in JavaScript and PHP:

http://sourceforge.net/projects/jsonschemaphpv/

and which uses a very lightweight syntax to check the schema of a JSON data structure:

{
  "type":"object",
  "properties":{
    "a":{"type":"number", "properties":{ ... }},
    "b":{"type":"string"}
  },
  "additionalProperties":false
}

Basically, only three terms (type, properties, and additionalProperties) are used to describe the JSON schema. Seems pretty easy to implement in other languages, and we get the JavaScript and PHP implementations for free.
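
To give an idea of how small such a checker can be, here is a rough Python sketch that understands only those three terms. It is not the API of the project linked above or of any other library. The names TYPES, SCHEMA and validate, the "$" path notation and the error wording are all made up for this illustration, and the sample schema leaves out the elided nested properties.

```python
# Schema type names mapped to Python types (a subset chosen for this sketch).
TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "boolean": bool,
}

# Sample schema modelled on the one above, without the elided nested part.
SCHEMA = {
    "type": "object",
    "properties": {
        "a": {"type": "number"},
        "b": {"type": "string"},
    },
    "additionalProperties": False,
}


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value passed."""
    errors = []
    expected = schema.get("type")
    if expected is not None:
        if expected not in TYPES:
            return ["%s: unsupported type %r in schema" % (path, expected)]
        matches = isinstance(value, TYPES[expected])
        # bool is a subclass of int in Python, so reject it for "number".
        if expected == "number" and isinstance(value, bool):
            matches = False
        if not matches:
            return ["%s: expected %s" % (path, expected)]
    if isinstance(value, dict):
        properties = schema.get("properties", {})
        for key in sorted(value):
            if key in properties:
                # Known property: check it against its own sub-schema.
                errors.extend(
                    validate(value[key], properties[key], "%s.%s" % (path, key)))
            elif schema.get("additionalProperties", True) is False:
                # Unknown property, and the schema forbids extras.
                errors.append("%s.%s: unexpected property" % (path, key))
    return errors


if __name__ == "__main__":
    print(validate({"a": 1, "b": "x", "c": None}, SCHEMA))
```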

What do you think?

C.

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Sat, Jun 5, 2010 at 6:40 PM, panyasan <[hidden email]> wrote:

>
> Returning to the original topic of this thread, and moving a discussion that
> I started on a different thread to here: I think what would be really good
> for the integration of the different processors into applications would be
> to have some sort of "schema validation" like it exists for xml for the json
> input data. This would allow us to enforce a uniform schema across
> implementations, and to help application developers to correctly map the
> application-internal data to the data that is expected by the CSL formatting
> engines.
>
> After some research, it seems to me that there are multiple projects which
> try to do some sort of json schema validation, but no standard has emerged
> yet (correct me if I am wrong). I found this which has an implementation in
> javascript and PHP:
>
> http://sourceforge.net/projects/jsonschemaphpv/
>
> and which uses a very lightweight syntax to check the schema of a json data
> structure:
>
> {
>  "type":"object",
>  "properties":{
>    "a":{"type":"number", "properties":{ ... }},
>    "b":{"type":"string"}
>  },
>  "additionalProperties":false
> }
>
> Basically, only three terms (type, properties, and additionalProperties) are
> used to describe the json schema. Seems pretty easy to implement in other
> languages, and we get the javascript and PHP for free.
>
> What do you think?

Rintze and I chatted about this off-list a while back. Here's some of
that discussion ...

On Wed, Apr 28, 2010 at 9:38 AM, Bruce D'Arcus <[hidden email]> wrote:

> On Wed, Apr 28, 2010 at 9:29 AM, Rintze Zelle <[hidden email]> wrote:
>
>> If JSON is going to be the main input format (at least for the short term),
>> wouldn't it be easier to start with a JSON schema (or a JSON Orderly schema,
>> http://orderly-json.org/)? With a JSON validator, we can then easily test
>> against the JSON data stored in the citeproc-js/CSL test suite.
>
> I have no experience with these JSON schema technologies, but assuming
> they are sufficient for our purposes, there's no reason we couldn't do
> both. As I mentioned earlier, it's just really easy/convenient to be
> able to use the same technology as the CSL schema (can reference
> patterns from it rather than have to duplicate them, for example).
>
> In theory, it should be easy to convert RNG schemas into their
> JSON equivalents. I'm guessing someone has probably written the code
> to do this (though I don't know).
>
> Bruce


Re: Citeproc json data input specs

panyasan
Administrator
Thanks Bruce. So there IS an emerging standard, http://json-schema.org/, good to know. I would vote for creating a reference validation schema in JSONSchema, then. From there, the maintainers of the different implementations can go ahead and include/write validators. I'd volunteer to contribute the PHP one for citeproc-php, if Ron doesn't want to do it himself.

Thanks,
Christian

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Sun, Jun 6, 2010 at 3:15 AM, panyasan <[hidden email]> wrote:
>
> Thanks Bruce. So there IS an emerging standard, http://json-schema.org/, good
> to know. I would vote for creating a reference validation schema in
> JSONSchema, then.

A question, though: can JSONSchema represent choices? E.g. can it
represent this in RNC?

  name-attributes =
    attribute name { text }
    | (attribute given { text }?,
       attribute family { text }?,
       attribute prefix { text }?,
       attribute suffix { text }?,
       attribute particle { text }?)

I think the answer is "no" but am not sure. There is an "enum"
datatype, but this is for string values.

If it really can't represent choices, then it's arguably too
lightweight to be all that useful.
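
For comparison, the choice itself is easy to check by hand. The Python
sketch below is not JSON Schema; it only illustrates the constraint the
RNC pattern expresses, under the assumption that each RNC attribute
becomes a string-valued key of a JSON object. NAME_PARTS and valid_name
are names invented for this example.

```python
# Optional name parts from the second branch of the RNC pattern.
NAME_PARTS = {"given", "family", "prefix", "suffix", "particle"}


def valid_name(obj):
    """Return True if obj matches the name-attributes choice.

    Either the object holds a single "name" key, or it holds any subset
    of the optional parts (all of them are optional, so {} is accepted).
    """
    if not isinstance(obj, dict):
        return False
    if not all(isinstance(value, str) for value in obj.values()):
        return False
    keys = set(obj)
    if "name" in keys:
        # First branch: "name" must stand alone.
        return keys == {"name"}
    # Second branch: only the listed parts are allowed.
    return keys <= NAME_PARTS


if __name__ == "__main__":
    print(valid_name({"family": "Bennett", "given": "Frank"}))
    print(valid_name({"name": "Some Institution", "family": "Bennett"}))
```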

> From there, the maintainers of the different
> implementations can go ahead and include/write validators. I'd volunteer to
> contribute the PHP one for citeproc-php, if Ron doesn't want to do it
> himself.
>
> Thanks,
> Christian

Bruce


Re: Citeproc json data input specs

Bruce D'Arcus-3
On Sun, Jun 6, 2010 at 11:20 AM, Bruce D'Arcus <[hidden email]> wrote:

> A question, though: can JSONSchema represent choices?

Not sure, but this thread might be related:

<http://groups.google.com/group/json-schema/browse_thread/thread/1603919da6c4cb2e?pli=1>

Bruce


Re: Citeproc json data input specs

Bruce D'Arcus-3
Also, just for reference, here's the schema draft in RNC:

<http://bitbucket.org/bdarcus/csl-schema/src/tip/csl-data.rnc>

Another thing probably hard/impossible to represent in JSON Schema is
the stuff at the bottom for the HTML subset "rich text".

Bruce


Re: Citeproc json data input specs

drozzy
Are there any converters from BibTeX (.bib) to CSL JSON?

One in haskell or python would be just kickass.

Re: Citeproc json data input specs

Bruce D'Arcus-3
On Wed, Feb 8, 2012 at 9:50 AM, drozzy <[hidden email]> wrote:
> Are there any converters from BibTeX (.bib) to CSL JSON?
>
> One in haskell or python would be just kickass.

Pandoc uses citeproc-hs, which in turn uses bibutils to convert
different data formats, including BibTeX. I'm pretty sure it (likely
citeproc-hs) supports CSL JSON as well.

I know there are python libraries out there that parse bibtex. Should
be trivial to dump the results to CSL JSON. But there's nothing I'm
aware of ATM. Feel free to write one up and post it on github ;-)
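
As a rough idea of what the dumping step could look like, here is a
Python sketch. It assumes the .bib file has already been parsed by some
library into a citation key, an entry type and a dict of string fields;
that input shape is hypothetical and not the API of any particular
parser. TYPE_MAP, FIELD_MAP, FALLBACK_TYPE, split_names and to_csl are
names made up for this example. The type and field mappings are
assumptions too (only "booklet" -> "pamphlet" was suggested earlier in
this thread), and the name splitting ignores particles, suffixes and
braces.

```python
import json

# Assumed BibTeX entry type -> CSL type mapping.
TYPE_MAP = {
    "article": "article-journal",
    "book": "book",
    "booklet": "pamphlet",
    "inproceedings": "paper-conference",
    "incollection": "chapter",
    "phdthesis": "thesis",
    "techreport": "report",
}
# Placeholder for unmapped types such as "misc"; an ad hoc choice.
FALLBACK_TYPE = "article"

# Assumed BibTeX field -> CSL variable mapping. "journal" and "booktitle"
# do not occur together on the standard BibTeX entry types.
FIELD_MAP = {
    "title": "title",
    "journal": "container-title",
    "booktitle": "container-title",
    "publisher": "publisher",
    "address": "publisher-place",
    "volume": "volume",
    "number": "issue",
    "pages": "page",
}


def split_names(value):
    """Split 'Family, Given and Given Family' into CSL name objects."""
    names = []
    for raw in value.split(" and "):
        raw = raw.strip()
        if not raw:
            continue
        if "," in raw:
            family, given = [part.strip() for part in raw.split(",", 1)]
        elif " " in raw:
            given, family = raw.rsplit(" ", 1)
        else:
            given, family = "", raw
        name = {"family": family}
        if given:
            name["given"] = given
        names.append(name)
    return names


def to_csl(key, entry_type, fields):
    """Convert one already-parsed BibTeX entry to a CSL JSON item."""
    item = {"id": key, "type": TYPE_MAP.get(entry_type.lower(), FALLBACK_TYPE)}
    for bib_field, csl_variable in FIELD_MAP.items():
        if bib_field in fields:
            item[csl_variable] = fields[bib_field]
    if "page" in item:
        # BibTeX page ranges use "--"; a single hyphen is used here.
        item["page"] = item["page"].replace("--", "-")
    for role in ("author", "editor"):
        if role in fields:
            item[role] = split_names(fields[role])
    year = fields.get("year", "")
    if year.isdigit():
        item["issued"] = {"date-parts": [[int(year)]]}
    return item


if __name__ == "__main__":
    entry = to_csl("doe2010", "article", {
        "author": "Doe, Jane and John Smith",
        "title": "An Example Title",
        "journal": "An Example Journal",
        "year": "2010",
        "pages": "1--10",
    })
    print(json.dumps([entry], indent=2))
```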

Bruce
