Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
Hey guys,

I'm a software engineer at Mendeley, developing our Word and OpenOffice plugins along with Carles Pina.

I'm altering the format we use to store references in documents. Currently we store only our own document UUIDs, but we want to store all the document metadata necessary for CSL formatting. That would make sharing documents easier between our users, and potentially between users of other reference managers that can read this format.

The plan is to embed JSON in a format readable by Frank Bennett's citeproc-js, along with some optional Mendeley-specific fields. The JSON would be stored in Word field codes or in OpenOffice bookmarks. An example field code would look like this:

{Mendeley Citation{5756f170-e97d-4c32-8279-b2039884c21b};{a8315e66-67a2-4693-b51a-e0741f556d6a} CslCitation:<JSON-DATA>}

where <JSON-DATA> will look like this (but without the whitespace):

{
    "ITEM-1": {
        <CITEPROC-CSL-FIELDS>,
        "mendeley": {
            "account": "steve.ridout@mendeley.com",
            "server": "www.mendeley.com",
            "uuid": "5756f170-e97d-4c32-8279-b2039884c21b"
        }
    },
    "ITEM-2": {
        <CITEPROC-CSL-FIELDS>,
        "mendeley": {
            "account": "steve.ridout@mendeley.com",
            "server": "www.mendeley.com",
            "uuid": "a8315e66-67a2-4693-b51a-e0741f556d6a"
        }
    },
    "ITEM-3": {
        <CITEPROC-CSL-FIELDS>,
        "mendeley": {
            "group": "14217",
            "server": "www.mendeley.com",
            "uuid": "ae405489-9d99-4c05-bc56-788ba48fd16b"
        }
    },
    "mendeley": {
        "previousFormattedCitation": "(Ahn & Schmidt, 1995; Al-shehbaz & O'kane, 2002; Alcaraz & Donaire, 2004)"
    },
    "version": "1"
}

Notes:
 * We need the original "Mendeley Citation{}" at the start for compatibility with old plugin versions, but it's optional.
 * The JSON "mendeley" elements are optional, and if anyone else (e.g. Zotero) wants to they can add their own.
 * The "version" element represents the version of this JSON schema, in case we add to it or change it in future.
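
For anyone who wants to experiment with the proposed layout, here is a minimal Python sketch of how a reader might split such a field code into the legacy UUID list and the JSON payload. The function name `parse_field_code` and the regular expression are assumptions made for illustration; nothing here is taken from the Mendeley plugins, and the sample payload is a cut-down stand-in for the full structure shown above.

```python
import json
import re

# Legacy prefix: "Mendeley Citation{uuid};{uuid}..." (optional, per the notes).
UUID_RE = re.compile(r"\{([0-9a-fA-F-]{36})\}")
MARKER = "CslCitation:"


def parse_field_code(code: str):
    """Return (uuids, payload) for a field code in the proposed layout.

    payload is None when the code has no CslCitation section.
    """
    body = code.strip()
    # Drop the outer braces that delimit the field code itself.
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    head, marker, tail = body.partition(MARKER)
    uuids = UUID_RE.findall(head)
    payload = json.loads(tail) if marker else None
    return uuids, payload


example = (
    "{Mendeley Citation{5756f170-e97d-4c32-8279-b2039884c21b};"
    "{a8315e66-67a2-4693-b51a-e0741f556d6a} "
    'CslCitation:{"mendeley":{"previousFormattedCitation":'
    '"(Ahn & Schmidt, 1995)"},"version":"1"}}'
)
uuids, payload = parse_field_code(example)
```

An old-style code with no `CslCitation:` section still yields its UUIDs, with `None` as the payload, which is one way a reader could stay compatible with both variants.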

Does this sound sensible to you?

Steve

Re: Embedded CSL fields in Mendeley Word & OO documents

rmzelle
Administrator
On Mon, Mar 7, 2011 at 6:36 AM, Steve Ridout <[hidden email]> wrote:
> {Mendeley Citation{5756f170-e97d-4c32-8279-b2039884c21b};{a8315e66-67a2-4693-b51a-e0741f556d6a} CslCitation:<JSON-DATA>}
>
> where <JSON-DATA> will look like this (but without the whitespace):
>
> {
>   "version": "1"
> }
>
>  * The "version" element represents the version of this JSON schema, in case
> we add to it or change it in future.
>
> Does this sound sensible to you?

I think providing a URL/URI for the JSON schema would be more useful.

Rintze

------------------------------------------------------------------------------
What You Don't Know About Data Connectivity CAN Hurt You
This paper provides an overview of data connectivity, details
its effect on application quality, and explores various alternative
solutions. http://p.sf.net/sfu/progress-d2d
_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: Embedded CSL fields in Mendeley Word & OO documents

Bruce D'Arcus-3
In reply to this post by Steve Ridout
Hi Steve,

This might be an opportunity, given that Zotero has also talked about
embedding JSON.

So let me respond with two questions:

1) How would your proposal address the following critical problem?

<http://community.muohio.edu/blogs/darcusb/archives/2009/03/01/the-babel-of-citations>

2) Where are you intending to store the source data?

Bruce


Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
In reply to this post by Steve Ridout
On Mar 7, 2011, at 6:36 AM, Steve Ridout wrote:

> Hey guys,
>
> I'm a software engineer at Mendeley, developing our Word and OpenOffice
> plugins along with Carles Pina.
>
> I'm altering the format we use to store the references documents. Currently
> we just store our own document UUIDs, but we want to store all the document
> metadata necessary for CSL formatting, which would make sharing documents
> between our users easier, and potentially between users of other reference
> managers which can read this format.
>
> The plan is to embed JSON in a format readable by Frank Bennett's
> citeproc-js, along with some optional Mendeley specific fields. They would
> be stored in the Word field codes, or in OpenOffice bookmarks. An example
> field code would look like this:
>
> {Mendeley Citation{5756f170-e97d-4c32-8279-b2039884c21b};{a8315e66-67a2-4693-b51a-e0741f556d6a} CslCitation:<JSON-DATA>}
>
> where <JSON-DATA> will look like this (but without the whitespace):
>
> {
>    "ITEM-1": {
>        <CITEPROC-CSL-FIELDS>,
>        "mendeley": {
>            "account": "[hidden email]",
>            "server": "www.mendeley.com",
>            "uuid": "5756f170-e97d-4c32-8279-b2039884c21b"
>        }
>    },
>    "ITEM-2": {
>        <CITEPROC-CSL-FIELDS>,
>        "mendeley": {
>            "account": "[hidden email]",
>            "server": "www.mendeley.com",
>            "uuid": "a8315e66-67a2-4693-b51a-e0741f556d6a"
>        }
>    },
>    "ITEM-3": {
>        <CITEPROC-CSL-FIELDS>,
>        "mendeley": {
>            "group": "14217",
>            "server": "www.mendeley.com",
>            "uuid": "ae405489-9d99-4c05-bc56-788ba48fd16b"
>        }
>    },
>    "mendeley": {
>        "previousFormattedCitation": "(Ahn & Schmidt, 1995; Al-shehbaz & O'kane, 2002; Alcaraz & Donaire, 2004)"
>    }
>   "version": "1"
> }
>
> Notes:
> * We need the original "Mendeley Citation{}" at the start for compatibility
> with old plugin versions, but it's optional.
> * The JSON "mendeley" elements are optional, and if anyone else (e.g.
> Zotero) wants to they can add their own.
> * The "version" element represents the version of this JSON schema, in case
> we add to it or change it in future.

So, Zotero currently formats like this:

{ ADDIN ZOTERO_ITEM {"citationID":"12rsus7rlj","citationItems":[{"uri":["http://zotero.org/users/331/items/CT7UITEM"]}]} }

I can tweak things to handle Mendeley codes as well fairly easily. We can serialize citation metadata in the "citationItems" element in the near future. This leads me to the following conclusions:

1) Where is the citationID coming from in your current implementation? Are you just re-generating them each time you load the document? Is there a reason not to be saving them?
2) Having an array of URIs may be preferable to having a "mendeley" object. Adding an array of URIs and letting the implementation pick up the first that matches an account it knows about allows the same citation to be linked to multiple accounts simultaneously without having to re-match data each time. It's also more general.
3) There are better formats for carrying metadata than citeproc-js JSON, e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to standardize on one of these instead of citeproc-js JSON?
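
Point 2 can be sketched in a few lines of Python: walk the URI array and take the first entry that belongs to a library the current client can open. This is only an illustration; `pick_known_uri` is a made-up helper, and the second URI in the sample uses a placeholder `example.org` prefix because no Mendeley URI scheme has been defined in this thread.

```python
from typing import Iterable, Optional, Sequence


def pick_known_uri(uris: Sequence[str], known_prefixes: Iterable[str]) -> Optional[str]:
    """Return the first URI that belongs to a library this client can open."""
    prefixes = tuple(known_prefixes)
    for uri in uris:
        # str.startswith accepts a tuple and matches any of its members.
        if uri.startswith(prefixes):
            return uri
    return None


citation_item = {
    "uri": [
        "http://zotero.org/users/331/items/CT7UITEM",
        # Hypothetical entry: placeholder prefix, not a real Mendeley URI.
        "http://example.org/mendeley/5756f170-e97d-4c32-8279-b2039884c21b",
    ]
}

match = pick_known_uri(citation_item["uri"], ["http://example.org/mendeley/"])
```

Because every client only looks for prefixes it recognises, the same citation can carry links into several libraries without any of them needing to understand the others.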

Simon


Re: Embedded CSL fields in Mendeley Word & OO documents

Bruce D'Arcus-3
On Mon, Mar 7, 2011 at 9:44 AM, Simon Kornblith <[hidden email]> wrote:

...

> 3) There are better formats for carrying metadata than citeproc-js JSON,
> e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to
> standardize on one of these instead of citeproc-js JSON?

On this one, if we want to go further down this road, we might want to
itemize the standard by which we define "better" in this case.

Bruce


Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
In reply to this post by Bruce D'Arcus-3
> I think providing a URL/URI for the JSON schema would be more useful.

That's a good idea; it could also point to a web page with more information.

> 1) How would your proposal address the following, critical, problem?
>
> <http://community.muohio.edu/blogs/darcusb/archives/2009/03/01/the-babel-of-citations>

If we accept that the general problem is
"How can we provide compatibility of citations between different users of different word processors using different reference managers?"

This proposal would embed all the metadata necessary for formatting a citation. So it is very feasible for authors of other plugins to use this data, particularly if they are using CSL processors which accept JSON in the format citeproc-js expects.

The method of embedding this data is not ideal: Word field codes aren't recognised by OpenOffice, so we provide an "Export" macro that saves the data in bookmarks instead, for users who want to move between Word and OpenOffice.

> 2) where are you intending to store the source data?

All the necessary data (title, authors, publication, etc.) will be included where I've written <CITEPROC-CSL-FIELDS>.

In addition, the data will be stored in the user's local Mendeley database and on our servers if the user is syncing his database. The data on our servers will only be available to the user who uploaded it unless it's in a "shared group".
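
To make the placeholder concrete, here is a rough Python sketch of what one item might look like once <CITEPROC-CSL-FIELDS> is filled in, serialized without whitespace as in the original proposal. The field names (`type`, `title`, `container-title`, `author`, `issued`) follow the CSL JSON input that citeproc-js accepts. The values are invented placeholders, not the real metadata of any cited work, and the exact set of fields the plugin would write is an assumption.

```python
import json

item = {
    "id": "ITEM-1",
    "type": "article-journal",
    "title": "Placeholder title",
    "container-title": "Placeholder Journal",
    "author": [
        {"family": "Ahn", "given": "A."},      # given names are placeholders
        {"family": "Schmidt", "given": "B."},
    ],
    "issued": {"date-parts": [[1995]]},
    "mendeley": {
        "server": "www.mendeley.com",
        "uuid": "5756f170-e97d-4c32-8279-b2039884c21b",
    },
}

payload = {"ITEM-1": item, "version": "1"}

# separators=(",", ":") drops the spaces json.dumps would otherwise insert
# after commas and colons, giving the "without the whitespace" form.
compact = json.dumps(payload, separators=(",", ":"))
```

Parsing `compact` with `json.loads` gives back the same structure, so the compact form loses nothing relative to the indented example.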

Steve

Re: Embedded CSL fields in Mendeley Word & OO documents

Bruce D'Arcus-3
On Mon, Mar 7, 2011 at 10:19 AM, Steve Ridout <[hidden email]> wrote:

> If we accept the general problem is
> "How can we provide compatibility of citations between different users of
> different word processors using different reference managers?"

Yup; that's it.

> This proposal would embed all the metadata necessary for formatting a
> citation. So it is very feasible for authors of other plugins to use this
> data, particularly if they are using CSL processors which accept JSON in the
> format citeproc-js expects.

OK. So in other words, if Mendeley and Zotero store these data in
a compatible way, we can slowly solve this problem?

> The method of embedding this data is not ideal, the Word field codes aren't
> recognised by OpenOffice and so we provide an "Export" macro to save the
> data in bookmarks instead if he want's to move between Word and OpenOffice.
>
>> 2) where are you intending to store the source data?
>>
>
> All the necessary data (title, authors, publication, etc...) will be
> included where I've written &lt;CITEPROC-CSL-FIELDS&gt;

So if one has fifty references to the same source (not unreasonable in
some fields, in a book), then the data is repeated fifty times?

> In addition, the data will be stored in the user's local Mendeley database
> and on our servers if the user is syncing his database. The data on our
> servers will only be available to the user who uploaded it unless it's in a
> "shared group".

So the application-specific component is an additional help, but not required.

I still wonder, and so am just throwing the idea out there, if it's
not better to decouple the following:

- item metadata
- user
- service

For the sake of argument, what if you identified a source as
"issn:doi:23298392892" but also stored the user info such that you
can, if needed, search the user library first, but fall back to other
options?

E.g. recognize a URI is just an identifier, and that getting metadata
for that thing is a separate action.
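
The decoupling described here can be sketched as a chain of lookups: the identifier is fixed, and each metadata source is tried in turn until one answers. The Python below is a minimal illustration only. The `resolve` helper, the `urn:example:` identifiers and the two in-memory dictionaries are all hypothetical stand-ins for a user library and a public service.

```python
from typing import Callable, Optional, Sequence

Lookup = Callable[[str], Optional[dict]]


def resolve(identifier: str, sources: Sequence[Lookup]) -> Optional[dict]:
    """Try each metadata source in order and return the first hit."""
    for lookup in sources:
        record = lookup(identifier)
        if record is not None:
            return record
    return None


# Stand-ins for a user's library and a public index (both hypothetical).
user_library = {"urn:example:1": {"title": "Copy with the user's edits"}}
public_index = {
    "urn:example:1": {"title": "Public record"},
    "urn:example:2": {"title": "Only in the public index"},
}

# dict.get returns None for unknown keys, so it fits the Lookup signature.
sources = [user_library.get, public_index.get]
first = resolve("urn:example:1", sources)
second = resolve("urn:example:2", sources)
missing = resolve("urn:example:3", sources)
```

Ordering the sources puts the user's own copy ahead of any public record, and an identifier nobody knows simply resolves to nothing.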

Bruce


Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
In reply to this post by Simon Kornblith
Simon Kornblith wrote:

> I can tweak things to handle Mendeley codes as well fairly easily. We can serialize citation metadata in the "citationItems" element in the near future. This leads me to the following conclusions:

Sounds good. It would be great for Zotero and Mendeley users to be able to share documents.

> 1) Where is the citationID coming from in your current implementation? Are you just re-generating them each time you load the document? Is there a reason not to be saving them?

Yes, they are regenerated every time we run citeproc to generate the formatted citations. I'm not sure of the benefit of embedding the ID inside the document, since the user could copy and paste a citation and then edit it, resulting in two citation clusters with the same citationID but different references. Is there a good reason we should store them?

> 2) Having an array of URIs may be preferable to having a "mendeley" object. Adding an array of URIs and letting the implementation pick up the first that matches an account it knows about allows the same citation to be linked to multiple accounts simultaneously without having to re-match data each time. It's also more general.

This could also allow linking to multiple Mendeley or Zotero users' accounts, which would be nice. It may be wise to restrict the number of accounts added, though, in case widely circulated documents end up full of account URI clutter.

> 3) There are better formats for carrying metadata than citeproc-js JSON, e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to standardize on one of these instead of citeproc-js JSON?

JSON is very easy for us to deal with as it can be passed straight to citeproc, and adding support for citeproc features which Mendeley doesn't currently support is easy. Is there a compelling reason to switch to RDF?

Steve

Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
On Mar 7, 2011, at 10:31 AM, Steve Ridout wrote:

>
> Simon Kornblith wrote:
>>
>>
>> I can tweak things to handle Mendeley codes as well fairly easily. We can
>> serialize citation metadata in the "citationItems" element in the near
>> future. This leads me to the following conclusions:
>>
>>
>
> Sounds good. It would be great for Zotero and Mendeley users to be able to
> share documents.

Agreed.

>> 1) Where is the citationID coming from in your current implementation? Are
>> you just re-generating them each time you load the document? Is there a
>> reason not to be saving them?
>>
>
> Yes, they are regenerated every time we run citeproc to generate the
> formatted citations. I'm not sure of the benefit of embedding the ID inside
> the document since it's possible that the user could copy and paste a
> citation and then edit it resulting in two citation clusters with the same
> citationID but different references. Is there a good reason we should store
> them?

I think it depends on how you have implemented citeproc-js; we use the citationID for tracking within the same session. It shouldn't be a problem for Zotero to handle a citation with a missing citationID, since we will automatically regenerate it anyway.

Whether or not you include the citationID, it would be nice to standardize on a single JSON format. Is there a reason not to extend the structure of the existing Zotero format with additional keys as necessary?

>> 2) Having an array of URIs may be preferable to having a "mendeley"
>> object. Adding an array of URIs and letting the implementation pick up the
>> first that matches an account it knows about allows the same citation to
>> be linked to multiple accounts simultaneously without having to re-match
>> data each time. It's also more general.
>>
>
> This could also allow linking to multiple Mendeley or Zotero user's accounts
> which would be nice. It may be wise to restrict the number of accounts added
> though in case widely circulated documents end up full of account URI
> clutter.

If we need a limit here, I suggest a large one, since a URI doesn't take up too much space. Maybe 25-50?

>> 3) There are better formats for carrying metadata than citeproc-js JSON,
>> e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to
>> standardize on one of these instead of citeproc-js JSON?
>>
>
> JSON is very easy for us to deal with as it can be passed straight to
> citeproc, and adding support for citeproc features which Mendeley doesn't
> currently support is easy. Is there a compelling reason to switch to RDF?

CSL JSON is definitely simpler to code, but Bibliontology RDF is more versatile in terms of field support and extensibility. My opinion on this is not very strong.

Simon

Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
In reply to this post by Bruce D'Arcus-3
On Mar 7, 2011, at 10:29 AM, Bruce D'Arcus wrote:

> On Mon, Mar 7, 2011 at 10:19 AM, Steve Ridout <[hidden email]> wrote:
>
>> If we accept the general problem is
>> "How can we provide compatibility of citations between different users of
>> different word processors using different reference managers?"
>
> Yup; that's it.
>
>> This proposal would embed all the metadata necessary for formatting a
>> citation. So it is very feasible for authors of other plugins to use this
>> data, particularly if they are using CSL processors which accept JSON in the
>> format citeproc-js expects.
>
> OK. So in other words, if Mendeley and Zotero store these data in
> compatible way, we can slowly solve this problem?
>
>> The method of embedding this data is not ideal, the Word field codes aren't
>> recognised by OpenOffice and so we provide an "Export" macro to save the
>> data in bookmarks instead if he want's to move between Word and OpenOffice.
>>
>>> 2) where are you intending to store the source data?
>>
>> All the necessary data (title, authors, publication, etc...) will be
>> included where I've written <CITEPROC-CSL-FIELDS>
>
> So if one has fifty references to the same source (not unreasonable in
> some fields, in a book), then the data is repeated fifty times?

We could potentially embed the data in the first citation only, but we could also embed it in each citation and rely on odt/docx compression to take care of it.
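
As a rough check on the compression argument, the Python sketch below compresses one copy of an item and fifty back-to-back copies with zlib, which implements the same DEFLATE algorithm that odt and docx zip containers normally use. The item contents are placeholders, and the setup is deliberately idealised: in a real document the copies are interleaved with text, and DEFLATE can only refer back about 32 KiB, so widely separated repeats would compress less well than this.

```python
import json
import zlib

item = {
    "id": "ITEM-1",
    "type": "article-journal",
    "title": "Placeholder title for a frequently cited source",
    "container-title": "Placeholder Journal",
    "author": [{"family": "Ahn", "given": "A."}, {"family": "Schmidt", "given": "B."}],
    "issued": {"date-parts": [[1995]]},
    "mendeley": {"server": "www.mendeley.com", "uuid": "5756f170-e97d-4c32-8279-b2039884c21b"},
}

one = json.dumps({"ITEM-1": item}, separators=(",", ":")).encode("utf-8")
fifty = one * 50  # the same metadata embedded in fifty citations

compressed_one = zlib.compress(one)
compressed_fifty = zlib.compress(fifty)

# Raw versus compressed sizes for a single copy and for fifty copies.
print(len(one), len(compressed_one))
print(len(fifty), len(compressed_fifty))
```

The second line of output is the interesting one: the fifty copies shrink to a small fraction of their raw size, since each repeat is encoded as a back-reference to an earlier copy.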

>> In addition, the data will be stored in the user's local Mendeley database
>> and on our servers if the user is syncing his database. The data on our
>> servers will only be available to the user who uploaded it unless it's in a
>> "shared group".
>
> So the application-specific component is an additional help, but not required.
>
> I still wonder, and so am just throwing the idea out there, if it's
> not better to decouple the following:
>
> - item metadata
> - user
> - service
>
> For sake of argument, what if you identified a source as
> "issn:doi:23298392892" but also stored the user info such that you
> can, if needed, search first the user library, but fallback to other
> options?
>
> E.g. recognize a URI is just an identifier, and that getting metadata
> for that thing is a separate action.

There are a bunch of problems with this from a usability perspective:

1) Inability to store metadata for items with no ISBN, DOI, or PMID.
2) Incorrect/incomplete metadata in public repository.
3) User has modified item to add additional data. (There are legitimate reasons to do this, e.g., to add a short title.)
4) Speed of metadata retrieval, if retrieving citations for hundreds of items.

If we use URI arrays, we can easily include this metadata, but I'm not sure it's reliable enough to use as anything but a last resort.

Simon


Re: Embedded CSL fields in Mendeley Word & OO documents

Bruce D'Arcus-3
In reply to this post by Simon Kornblith
On Mon, Mar 7, 2011 at 1:46 PM, Simon Kornblith <[hidden email]> wrote:

...

>> JSON is very easy for us to deal with as it can be passed straight to
>> citeproc, and adding support for citeproc features which Mendeley doesn't
>> currently support is easy. Is there a compelling reason to switch to RDF?
>
> CSL JSON is definitely simpler to code, but Bibliontology RDF is more versatile in terms of field support and extensibility. My opinion on this is not very strong.

And what did you have in mind in terms of a JSON representation of
BIBO? Using a generic RDF-as-JSON, which will be pretty verbose, or
something more idiomatic to JSON?
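
To illustrate the verbosity question, here is a small Python comparison of the same two facts (a type and a title) written as CSL-style JSON and in one generic RDF-as-JSON layout, where each subject maps predicates to lists of typed values. This is only a sketch: the thread does not say which RDF serialization was meant, the subject identifier is hypothetical, and the choice of `bibo:AcademicArticle` and `dcterms:title` as the mapping is an assumption.

```python
import json

# Idiomatic, CSL-style JSON: short keys that mirror the citation model.
csl_style = {
    "id": "ITEM-1",
    "type": "article-journal",
    "title": "Placeholder title",
}

# Generic RDF-as-JSON layout: subject -> predicate -> list of typed values.
subject = "urn:example:ITEM-1"  # hypothetical identifier
rdf_style = {
    subject: {
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type": [
            {"type": "uri", "value": "http://purl.org/ontology/bibo/AcademicArticle"}
        ],
        "http://purl.org/dc/terms/title": [
            {"type": "literal", "value": "Placeholder title"}
        ],
    }
}

csl_size = len(json.dumps(csl_style, separators=(",", ":")))
rdf_size = len(json.dumps(rdf_style, separators=(",", ":")))
```

The generic form spends most of its bytes on full predicate URIs and value wrappers, which is the cost of being unambiguous about what each field means.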

Pulling back, I've been gravitating towards thinking of two kinds of
representation that ought to be able to be more-or-less round-tripped:

1) a CSL JSON which is very close to the CSL model, and so easy to
process from that standpoint
2) a richer, more extensible, more rigorous and remixable, BIBO RDF

The second has an additional benefit, which is that it can be
serialized in different ways, including as RDFa embedded in HTML
output, which is a medium term goal I'd like to push on: the idea that
the output CSL implementations produce is not just dumb text, but can
also be extracted as structured data.

It can also be embedded as RDF/XML in ODF documents in standard ways
consistent with that spec, and so is accessible to the OOo/LO metadata
API (though MS Office has no such thing, so that leaves the question
of how to deal with that).

But there's no doubt that all of this has some additional costs.

There's also no denying that dumping json in fields is a bit of an
abuse of the formats.

In any case, I don't have a strong opinion either; my main goal is
something that "just works."

Bruce


Re: Embedded CSL fields in Mendeley Word & OO documents

fbennett
In reply to this post by Steve Ridout
On Tue, Mar 8, 2011 at 12:31 AM, Steve Ridout <[hidden email]> wrote:

>
> Simon Kornblith wrote:
>>
>>
>> I can tweak things to handle Mendeley codes as well fairly easily. We can
>> serialize citation metadata in the "citationItems" element in the near
>> future. This leads me to the following conclusions:
>>
>>
>
> Sounds good. It would be great for Zotero and Mendeley users to be able to
> share documents.
>
>
>
>> 1) Where is the citationID coming from in your current implementation? Are
>> you just re-generating them each time you load the document? Is there a
>> reason not to be saving them?
>>
>
> Yes, they are regenerated every time we run citeproc to generate the
> formatted citations. I'm not sure of the benefit of embedding the ID inside
> the document since it's possible that the user could copy and paste a
> citation and then edit it resulting in two citation clusters with the same
> citationID but different references. Is there a good reason we should store
> them?

They allow transactions between the processor and the calling
application to be optimized.

The processCitationCluster() method is called with the data of the
target citation, identified by its own citationID (if known), plus
lists of predecessor and successor citationIDs. By comparing the
citationID sequence against the contents of its internal registry, the
processor can determine which specific citation clusters in the
document require an update, and return the necessary data to the
calling application.
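
On the calling side, the bookkeeping described above amounts to splitting the document's ordered list of citationIDs around the citation being edited. The Python below is a simplified sketch of that step only. `split_around` is a hypothetical helper, and the exact argument shapes that processCitationCluster() expects are not spelled out in this thread, so check the citeproc-js documentation before relying on plain ID lists.

```python
from typing import List, Optional, Tuple

IdList = List[Optional[str]]


def split_around(ids: IdList, index: int) -> Tuple[IdList, IdList]:
    """Return the citationIDs before and after the citation at position index."""
    if not 0 <= index < len(ids):
        raise IndexError("index is outside the document's citation list")
    return ids[:index], ids[index + 1:]


# citationIDs in document order; the citation at position 2 is being edited.
document_ids = ["c1", "c2", "c3", "c4"]
pre, post = split_around(document_ids, 2)
```

Here `pre` holds the predecessors and `post` the successors, in document order, which is the information the processor compares against its registry.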

The processor doesn't currently account for the case of duplicate
citationIDs in a single call to processCitationCluster(), but it
should do, and it would be easy to fix up. Citation clusters returned
by the processor are identified to the document interface by sequence
number, not by ID, so it's safe to change them on the fly.
processCitationCluster() can just scan the ID list arguments before
doing its thing, and force any duplicates to null. The processor would
then generate a fresh ID, return it to the calling application
identified by sequence number, and you're ready for the next editing
cycle.
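
The duplicate scan could look roughly like the Python sketch below. It is not citeproc-js code: `null_duplicate_ids` is a hypothetical helper, and it makes one particular choice, keeping the first occurrence of an ID and resetting only the later copies. Whether the first occurrence should also be reset is left open here.

```python
from typing import List, Optional


def null_duplicate_ids(ids: List[Optional[str]]) -> List[Optional[str]]:
    """Keep the first occurrence of each citationID; replace repeats with None."""
    seen = set()
    cleaned: List[Optional[str]] = []
    for cid in ids:
        if cid is None or cid in seen:
            # Unknown or duplicated ID: leave it for the processor to regenerate.
            cleaned.append(None)
        else:
            seen.add(cid)
            cleaned.append(cid)
    return cleaned


ids = ["12rsus7rlj", "a1", "12rsus7rlj", None, "a1"]
cleaned = null_duplicate_ids(ids)
# Positions holding None are the ones that would need a freshly generated ID.
needs_new_id = [pos for pos, cid in enumerate(cleaned) if cid is None]
```

Because the result keeps the original positions, the fresh IDs can be reported back by sequence number, as described above.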


>
>
>
>> 2) Having an array of URIs may be preferable to having a "mendeley"
>> object. Adding an array of URIs and letting the implementation pick up the
>> first that matches an account it knows about allows the same citation to
>> be linked to multiple accounts simultaneously without having to re-match
>> data each time. It's also more general.
>>
>
> This could also allow linking to multiple Mendeley or Zotero user's accounts
> which would be nice. It may be wise to restrict the number of accounts added
> though in case widely circulated documents end up full of account URI
> clutter.
>
>
>
>> 3) There are better formats for carrying metadata than citeproc-js JSON,
>> e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to
>> standardize on one of these instead of citeproc-js JSON?
>>
>
> JSON is very easy for us to deal with as it can be passed straight to
> citeproc, and adding support for citeproc features which Mendeley doesn't
> currently support is easy. Is there a compelling reason to switch to RDF?
>
> Steve
>

Re: Embedded CSL fields in Mendeley Word & OO documents

fbennett
On Tue, Mar 8, 2011 at 6:33 AM, Frank Bennett <[hidden email]> wrote:

> On Tue, Mar 8, 2011 at 12:31 AM, Steve Ridout <[hidden email]> wrote:
>>
>> Simon Kornblith wrote:
>>>
>>>
>>> I can tweak things to handle Mendeley codes as well fairly easily. We can
>>> serialize citation metadata in the "citationItems" element in the near
>>> future. This leads me to the following conclusions:
>>>
>>>
>>
>> Sounds good. It would be great for Zotero and Mendeley users to be able to
>> share documents.
>>
>>
>>
>>> 1) Where is the citationID coming from in your current implementation? Are
>>> you just re-generating them each time you load the document? Is there a
>>> reason not to be saving them?
>>>
>>
>> Yes, they are regenerated every time we run citeproc to generate the
>> formatted citations. I'm not sure of the benefit of embedding the ID inside
>> the document since it's possible that the user could copy and paste a
>> citation and then edit it resulting in two citation clusters with the same
>> citationID but different references. Is there a good reason we should store
>> them?
>
> They allow transactions between the processor and the calling
> application to be optimized.
>
> The processCitationCluster() method is called with the data of the
> target citation, identified by its own citationID (if known), plus
> lists of predecessor and successor citationIDs. By comparing the
> citationID sequence against the contents of its internal registry, the
> processor can determine which specific citation clusters in the
> document require an update, and return the necessary data to the
> calling application.
>
> The processor doesn't currently account for the case of duplicate
> citationIDs in a single call to processCitationCluster(), but it
> should do, and it would be easy to fix up. Citation clusters returned
> by the processor are identified to the document interface by sequence
> number, not by ID, so it's safe to change them on the fly.
> processCitationCluster() can just scan the ID list arguments before
> doing its thing, and force any duplicates to null. The processor would
> then generate a fresh ID, return it to the calling application
> identified by sequence number, and you're ready for the next editing
> cycle.

(Slight amendment: for duplicates forced to false, we would need to
make a separate call to the document for the data at that position, so
that the registry can be updated; but it should be doable. My idea
with embedding this logic in the processor is to lower the barrier to
the creation of new word processor plugins as far as possible. I have
Abiword in the back of my mind there; their shared document model
combined with robust citation support would be very attractive for
collaborative projects.)
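
A minimal sketch of the duplicate scan described above, written in Python for illustration rather than taken from citeproc-js. The function name `null_duplicate_ids` and the flat list of IDs in document order are assumptions; the real processCitationCluster() receives separate predecessor and successor lists, and `None` here stands in for the marker that tells the processor to generate a fresh ID.

```python
from typing import List, Optional, Tuple


def null_duplicate_ids(
    citation_ids: List[Optional[str]],
) -> Tuple[List[Optional[str]], List[int]]:
    """Keep the first occurrence of each ID; force later repeats to None.

    Returns the cleaned list plus the sequence positions that were reset,
    so the caller knows which clusters need a fresh ID and a registry update.
    """
    seen = set()
    cleaned: List[Optional[str]] = []
    reset_positions: List[int] = []
    for position, citation_id in enumerate(citation_ids):
        if citation_id is not None and citation_id in seen:
            # A repeat, e.g. from a copy-and-pasted citation cluster.
            cleaned.append(None)
            reset_positions.append(position)
        else:
            if citation_id is not None:
                seen.add(citation_id)
            cleaned.append(citation_id)
    return cleaned, reset_positions


# Hypothetical document: the cluster "a1" was copied, so its ID appears twice.
ids_in_document_order = ["a1", "b2", "a1", None, "c3"]
cleaned, reset_positions = null_duplicate_ids(ids_in_document_order)
print(cleaned)
print(reset_positions)
```

The reset positions are returned by sequence number because that is how clusters are identified to the document interface, which is what makes it safe to change IDs on the fly.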

>
>
>>
>>
>>
>>> 2) Having an array of URIs may be preferable to having a "mendeley"
>>> object. Adding an array of URIs and letting the implementation pick up the
>>> first that matches an account it knows about allows the same citation to
>>> be linked to multiple accounts simultaneously without having to re-match
>>> data each time. It's also more general.
>>>
>>
>> This could also allow linking to multiple Mendeley or Zotero users' accounts,
>> which would be nice. It may be wise to restrict the number of accounts added,
>> though, in case widely circulated documents end up full of account URI
>> clutter.
>>
>>
>>
>>> 3) There are better formats for carrying metadata than citeproc-js JSON,
>>> e.g., Bibliontology RDF, which could be serialized to JSON. Do we want to
>>> standardize on one of these instead of citeproc-js JSON?
>>>
>>
>> JSON is very easy for us to deal with as it can be passed straight to
>> citeproc, and adding support for citeproc features which Mendeley doesn't
>> currently support is easy. Is there a compelling reason to switch to RDF?
>>
>> Steve

xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
In reply to this post by Simon Kornblith

> Whether or not you include the citationID, it would be nice to standardize on
> a single JSON format. Is there a reason not to extend the structure of the
> existing Zotero format with additional keys as necessary?

Not sure I understand. Do you mean adding the CSL metadata within ZOTERO ADDIN{}?

I'd prefer not to name it Mendeley or Zotero, so my suggestion was to put it in a separate CslCitation block, with other blocks being optional. For example, we would support field codes like:

{CslCitation:{}}

or

{ZOTERO ADDIN{} AnythingYouLike{}[][] CslCitation:{} OtherStuff...}

or

{Mendeley Citation{} CslCitation:{}}

as long as it has CslCitation:{} somewhere

(For now we will add "Mendeley Citation{}" to the start for compatibility with old versions, but we can drop this at some point in the future.)

Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
On Mar 9, 2011, at 7:53 AM, Steve Ridout wrote:

>>
>> Whether or not you include the citationID, it would be nice to standardize
>> on
>> a single JSON format. Is there a reason not to extend the structure of the
>> existing Zotero format with additional keys as necessary?
>
> Not sure I understand, do you mean adding the CSL metadata within ZOTERO
> ADDIN{} ?
>
> I'd prefer not to name it Mendeley or Zotero, so my suggestion was to put it
> in a separate CslCitation block, with other blocks being optional. e.g. we
> would support field codes like:
>
> {CslCitation:{}}
>
> or
>
> {ZOTERO ADDIN{} AnythingYouLike{}[][] CslCitation:{} OtherStuff...}
>
> or
>
> {Mendeley Citation{} CslCitation:{}}
>
> as long as it has CslCitation:{} somewhere
>
> (Currently we will be adding "Mendeley Citation{}" to the start for
> compatibility with old versions but we can drop this at some point in
> future.)

I was suggesting that we use the same basic structure for the JSON object itself (potentially with Zotero- and Mendeley-specific extensions). Our current JSON doesn't say anything about Zotero anywhere, and very closely resembles a citeproc-js citation object.

I agree that the field code preceding the JSON should be implementation-agnostic. I'm fine with CslCitation:{}, but I suggest that we specify that it must come at the end of the field. Otherwise, it's hard to parse where the JSON ends with a regexp.
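
To illustrate why the end-of-field rule helps, here is a rough Python sketch. The function name `extract_csl_citation` and the assumption that the stored field code is wrapped in a single outer pair of braces (as the examples in this thread are written) are mine, not part of either plugin.

```python
import json
from typing import Any, Optional

MARKER = "CslCitation:"


def extract_csl_citation(field_code: str) -> Optional[Any]:
    """Return the JSON payload that follows the CslCitation: marker, or None."""
    body = field_code.strip()
    # Assumption: the field code is wrapped in one outer pair of braces,
    # as in the examples given in this thread.
    if body.startswith("{") and body.endswith("}"):
        body = body[1:-1]
    marker_at = body.find(MARKER)
    if marker_at == -1:
        return None
    # Because the block is required to come last, everything after the marker
    # is JSON; no brace matching is needed to find where it ends.
    return json.loads(body[marker_at + len(MARKER):])


example = (
    "{Mendeley Citation{5756f170-e97d-4c32-8279-b2039884c21b} "
    'CslCitation:{"citationID":"12rsus7rlj","citationItems":[]}}'
)
citation = extract_csl_citation(example)
print(citation["citationID"])
```

If other blocks were allowed after the JSON, the parser would instead have to track nested braces and string escapes to find the end of the payload.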

Simon

Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout

> I was suggesting that we use the same basic structure for the JSON object itself
> (potentially with Zotero- and Mendeley-specific extensions). Our current JSON
> doesn't say anything about Zotero anywhere, and very closely resembles a citeproc-js citation object.

How does the Zotero JSON differ from that required by citeproc-js?

> I agree that the field code preceding the JSON should be implementation-agnostic.
> I'm fine with CslCitation:{}, but I suggest that we specify that it must come at the
> end of the field. Otherwise, it's hard to parse where the JSON ends with a regexp.

Agreed.

Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
On Mar 9, 2011, at 12:21 PM, Steve Ridout wrote:

>>
>> I was suggesting that we use the same basic structure for the JSON object
>> itself
>> (potentially with Zotero- and Mendeley-specific extensions). Our current
>> JSON
>> doesn't say anything about Zotero anywhere, and very closely resembles a
>> citeproc-js citation object.
>
> How does the Zotero JSON differ from that required by citeproc-js?

We use the aforementioned uri array instead of an id on each citationItem, but otherwise it's identical. We would need to extend it to put the content of the CSL fields into the citationItem.
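
A small Python sketch of the lookup suggested earlier in the thread, where an implementation picks the first URI that matches an account it knows about. The function name, the prefix-matching rule, and the example.org URIs are all hypothetical; neither Zotero's nor Mendeley's actual URI scheme is shown here.

```python
from typing import List, Optional


def first_known_uri(uris: List[str], known_prefixes: List[str]) -> Optional[str]:
    """Return the first URI that belongs to an account this client knows about."""
    for uri in uris:
        if any(uri.startswith(prefix) for prefix in known_prefixes):
            return uri
    # No match: the caller would have to fall back to some other lookup.
    return None


# Hypothetical citationItem linked to two different libraries.
citation_item = {
    "uris": [
        "http://example.org/library-a/items/1",
        "http://example.org/library-b/items/7",
    ]
}
known = ["http://example.org/library-b/"]
match = first_known_uri(citation_item["uris"], known)
print(match)
```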

Simon

Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
On 13 March 2011 21:05, Simon Kornblith [via xbiblio-devel] <[hidden email]> wrote:
> We use the aforementioned uri array instead of an id on each citationItem, but otherwise it's identical. We would need to extend it to put the content of the CSL fields into the citationItem.


How about a format like the following example:

CslCitation:
{
    "citationID": "12rsus7rlj",
    "citationItems": [
        {
            "id": "ITEM-1",
            "itemData": {
                "author": [],
                "editor": [],
                "id": "ITEM-1",
                "issued": { "date-parts": [ [ "2007" ] ] },
                "title": "My paper"
            },
            "locator": "21",
            "label": "page",
            "uris": []
        }
    ],
    "properties": {
        "noteIndex": 1
    }
}

It's the same structure as a citeproc "minimal citation data object", except there's an added "itemData" element containing the full item data as returned by sys.retrieveItem(), and an extra "uris" array which can contain any number of Mendeley / Zotero / other unique identifiers.
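
As a rough illustration of how such an object could be assembled, the Python sketch below starts from a minimal citation data object and adds the two extra keys. The names `embed_item_data`, `retrieve_item`, and `uris_for` are hypothetical (`retrieve_item` stands in for sys.retrieveItem()), the example.org URI is a placeholder, and compact output is an assumption carried over from the earlier remark that the JSON is stored without whitespace.

```python
import json
from typing import Any, Callable, Dict, List


def embed_item_data(
    citation: Dict[str, Any],
    retrieve_item: Callable[[str], Dict[str, Any]],
    uris_for: Callable[[str], List[str]],
) -> str:
    """Add itemData and uris to each citationItem and return compact JSON."""
    enriched = dict(citation)
    enriched["citationItems"] = [
        {**item, "itemData": retrieve_item(item["id"]), "uris": uris_for(item["id"])}
        for item in citation["citationItems"]
    ]
    # Compact separators, assuming the field code stores JSON without whitespace.
    return json.dumps(enriched, separators=(",", ":"))


# Hypothetical one-item library, mirroring the example above.
library = {
    "ITEM-1": {
        "author": [],
        "editor": [],
        "id": "ITEM-1",
        "issued": {"date-parts": [["2007"]]},
        "title": "My paper",
    }
}
minimal = {
    "citationID": "12rsus7rlj",
    "citationItems": [{"id": "ITEM-1", "locator": "21", "label": "page"}],
    "properties": {"noteIndex": 1},
}
embedded = embed_item_data(
    minimal,
    retrieve_item=lambda item_id: library[item_id],
    uris_for=lambda item_id: ["http://example.org/items/" + item_id],
)
print("CslCitation:" + embedded)
```

Because the full item data travels with the citation, a reader without access to any of the listed URIs could still format it.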

Do you think this would be OK?

 

Re: Embedded CSL fields in Mendeley Word & OO documents

Simon Kornblith
Sounds good to me.

Simon


Re: Embedded CSL fields in Mendeley Word & OO documents

Steve Ridout
On 17 March 2011 04:20, Simon Kornblith [via xbiblio-devel] <[hidden email]> wrote:
> Sounds good to me.


That's great. We're almost done implementing this, and it will be in our next development preview release (not yet stable).

One thing I forgot to put in my last example was a schema URI. @Bruce and @Rintze: maybe you could suggest a good URI to use for the schema version, perhaps starting with http://citationstyles.org/
