CSL roadmap

CSL roadmap

rmzelle
Administrator
Hi all,

Now that CSL 1.0.1 is out the door, I would like to inquire what
people think should be on the CSL roadmap for the coming year or so,
and what should have the highest priority. In particular, I'm curious
about the things people would like me to work on. Please speak up!

My own thoughts:

- I'm relatively happy with the workflow for accepting style patches.
Having Travis CI check each pull request, and having the ability to
run the same tests locally
(https://github.com/citation-style-language/styles/wiki/Test-Environment)
has made it much easier to keep errors out. We now also rely much more
on the style authors themselves to fix style errors. Big kudos to
Sylvester for setting up the tests, and thanks to Sebastian and
Charles for helping out with adding styles! I just hope CSL doesn't
become too popular :) (the current workload is still okay).
- I don't think we need to rush with trying to incorporate Frank's MLZ
extensions to CSL into official CSL. MLZ will be a nice testing
ground, and I'd rather wait a little to see how the MLZ styles perform
in practice.
- Several sites have started offering citeproc-JSON: see e.g.
http://blog.bibsonomy.org/2012/07/feature-of-week-citation-style-language.html
, http://www.doi.org/doi_handbook/5_Applications.html#5.4.1 and
http://crosscite.org/cn/#sec-4-1 . I think we could improve the
consistency of CSL style output between implementations by writing
better documentation on input expectations. This covers:
  * date formats. Bruce seems to be a big proponent of adopting the
Extended Date/Time Format as much as possible (EDTF, see
http://www.loc.gov/standards/datetime/ )
  * name parsing (e.g. extracting non-dropping and dropping particles
from two-field names)
  * field assignments (which item type should, or shouldn't, have
which fields). Aurimas prepared a map for Zotero:
http://aurimasv.github.com/z2csl/typeMap.xml . It would be great if we
could standardize the fields exposed to the CSL processors among the
different reference managers (on a per item type basis). My hope is
that we can clean up Zotero's metadata model in the coming year
(tickets as compiled by the Zotero user community can be found at
https://github.com/ajlyon/zotero-bits/issues ), and offer the result
as a guideline for other reference managers.
  * the JSON schemas. These could use some documentation.
- One of the things I'd really like to see is an improved CSL test
suite. Sylvester posted a Cucumber mockup format a while back (
https://github.com/inukshuk/citeproc-ruby/blob/1c420de0f7a86b7c35782dee86ce62cbebb47ab9/features/condition/is_numeric.feature
). I never had much success adding styles to the current test suite
setup, and I really would like to be able to categorize tests (e.g. on
CSL version, and on the CSL feature that is being tested). If the
infrastructure is there, I wouldn't mind annotating the existing tests
by hand.
- There are still quite a few open CSL tickets (
https://github.com/citation-style-language/schema/issues ). Some have
solutions that require a backwards incompatible release (i.e. CSL 1.1)
and style upgrades, while other tickets have stalled due to the
absence of good ideas or due to disagreement. I don't expect too much
progress here, unless people step in and reboot the discussions.
- I agree with Bruce that we might want to revisit our release
strategy. Going 2.5 years between releases is a bit long. I would
favor slightly more frequent releases (e.g. 1 per year), with
agreement on what can and what cannot end up in a 1.0.x release. We
could also allow certain new backward compatible features as soon as
they are approved and continuously update the spec between formal
releases.
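
To make the date-format point above a bit more concrete, here is a minimal Python sketch of how a reference manager might turn the simplest EDTF strings (plain dates and "start/end" intervals) into the "date-parts" arrays used in citeproc-JSON. The function name is made up for illustration, and falling back to a "literal" date for anything it cannot parse is an assumption, not something any processor or spec prescribes.

```python
import re

# The simplest EDTF dates are plain calendar dates: YYYY, YYYY-MM, or YYYY-MM-DD.
_DATE = re.compile(r"(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?")


def edtf_to_date_parts(value):
    """Convert a simple EDTF date or interval into a citeproc-JSON date object.

    Only plain dates and "start/end" intervals are handled. Anything else
    (uncertain dates, seasons, open intervals) is passed through as a
    "literal" date, which is an assumption made for this sketch.
    """
    parts = []
    for chunk in value.split("/"):
        match = _DATE.fullmatch(chunk.strip())
        if match is None:
            return {"literal": value}
        # Drop the month and day groups that were not present in the input.
        parts.append([int(group) for group in match.groups() if group is not None])
    if len(parts) > 2:
        return {"literal": value}
    return {"date-parts": parts}


print(edtf_to_date_parts("2012-09-10"))       # {'date-parts': [[2012, 9, 10]]}
print(edtf_to_date_parts("2012-09/2012-11"))  # {'date-parts': [[2012, 9], [2012, 11]]}
```

Documenting which of these input shapes a processor must accept is the kind of "input expectation" that could be written down.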

Best,

Rintze

------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and
threat landscape has changed and how IT managers can respond. Discussions
will include endpoint security, mobile security and the latest in malware
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
xbiblio-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/xbiblio-devel

Re: CSL roadmap

Sebastian Karcher
First, congratulations on the - as it looks so far - quite smooth release.

> - I'm relatively happy with the workflow for accepting style patches.
> Having Travis CI check each pull request, and having the ability to
> run the same tests locally
> (https://github.com/citation-style-language/styles/wiki/Test-Environment)
> has made it much easier to keep errors out. We now also rely much more
> on the style authors themselves to fix style errors. Big kudos to
> Sylvester for setting up the tests, and thanks to Sebastian and
> Charles for helping out with adding styles! I just hope CSL doesn't
> become too popular :) (the current workload is still okay).

Agreed. I'm a little concerned about what will happen when the visual
editor goes live for real. While it writes pretty clean code, it
doesn't validate and it still does some odd things.


> - I don't think we need to rush with trying to incorporate Frank's MLZ
> extensions to CSL into official CSL. MLZ will be a nice testing
> ground, and I'd rather wait a little to see how the MLZ styles perform
> in practice.
Agreed again. A lot of MLZ functionality already works with CSL once
the respective fields are added to Zotero et al.

> (...)

>   * field assignments (which item type should, or shouldn't, have
> which fields). Aurimas prepared a map for Zotero:
> http://aurimasv.github.com/z2csl/typeMap.xml . It would be great if we
> could standardize the fields exposed to the CSL processors among the
> different reference managers (on a per item type basis). My hope is
> that we can clean up Zotero's metadata model in the coming year
> (tickets as compiled by the Zotero user community can be found at
> https://github.com/ajlyon/zotero-bits/issues ), and offer the result
> as a guideline for other reference managers.
This is, imho, really the top priority - with different fields/item
we'll get inconsistent outputs - cf. e.g. publishers for journals.
Happy to help work on this.
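
As an illustration of the "publishers for journals" case, a shared per-type field list could be applied before items reach the processor. The sketch below is Python, and the map in it is entirely hypothetical: the field sets are made up for the example and are not taken from typeMap.xml or any agreed standard.

```python
# Hypothetical per-type field map. A real one would be derived from something
# like the Zotero type map and agreed on between reference managers.
EXPOSED_FIELDS = {
    "article-journal": {
        "author", "title", "container-title", "volume", "issue", "page", "issued", "DOI",
    },
    "book": {
        "author", "editor", "title", "publisher", "publisher-place", "issued", "ISBN",
    },
}

# Bookkeeping keys that are kept regardless of item type.
ALWAYS_KEPT = {"id", "type"}


def expose(item):
    """Return a copy of item holding only the fields agreed on for its type."""
    allowed = EXPOSED_FIELDS.get(item.get("type"))
    if allowed is None:
        return dict(item)  # unknown type: pass everything through unchanged
    return {
        key: value
        for key, value in item.items()
        if key in allowed or key in ALWAYS_KEPT
    }


article = {"id": "a1", "type": "article-journal", "title": "T", "publisher": "P"}
print(expose(article))  # {'id': 'a1', 'type': 'article-journal', 'title': 'T'}
```

With the same map in every reference manager, a style that prints "publisher" would give the same output for a journal article everywhere, whichever way the decision for that field goes.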

(...)
> - There are still quite a few open CSL tickets (
> https://github.com/citation-style-language/schema/issues ). Some have
> solutions that require a backwards incompatible release (i.e. CSL 1.1)
> and style upgrades, while other tickets have stalled due to the
> absence of good ideas or due to disagreement. I don't expect too much
> progress here, unless people step in and reboot the discussions.

I think some of them are pretty important. My top two are:
https://github.com/citation-style-language/schema/issues/36
https://github.com/citation-style-language/schema/issues/94

The first one is pretty major, but I'd really like to see 94 in one of
the smaller releases.


--------
Sebastian Karcher
Ph.D. Candidate
Department of Political Science
Northwestern University


Re: CSL roadmap

rmzelle
Administrator
On Mon, Sep 10, 2012 at 7:08 PM, Sebastian Karcher
<[hidden email]> wrote:

>>   * field assignments (which item type should, or shouldn't, have
>> which fields). Aurimas prepared a map for Zotero:
>> http://aurimasv.github.com/z2csl/typeMap.xml . It would be great if we
>> could standardize the fields exposed to the CSL processors among the
>> different reference managers (on a per item type basis). My hope is
>> that we can clean up Zotero's metadata model in the coming year
>> (tickets as compiled by the Zotero user community can be found at
>> https://github.com/ajlyon/zotero-bits/issues ), and offer the result
>> as a guideline for other reference managers.
> This is, imho, really the top priority - with different fields/item
> we'll get inconsistent outputs - cf. e.g. publishers for journals.
> Happy to help work on this.

The ball for this is really in the court of the Zotero team, though. I
don't think you, me, Avram, Grégoire (Gracile) and others can do much
more prep work for Simon and Dan, and I'm pretty sure we're all
available to give input once they start work on it. But it seems to be
a few months out at least (post-November).

>> - There are still quite a few open CSL tickets (
>> https://github.com/citation-style-language/schema/issues ). Some have
>> solutions that require a backwards incompatible release (i.e. CSL 1.1)
>> and style upgrades, while other tickets have stalled due to the
>> absence of good ideas or due to disagreement. I don't expect too much
>> progress here, unless people step in and reboot the discussions.
>
> I think some of them are pretty important. My top two are:
> https://github.com/citation-style-language/schema/issues/36
> https://github.com/citation-style-language/schema/issues/94
>
> The first one is pretty major, but I'd really like to see 94 in one of
> the smaller releases.

I'm actually hoping that we might be able to come up with a consistent
solution that addresses custom fields, custom locator labels
(https://github.com/citation-style-language/schema/issues/94), and
custom identifiers
(https://github.com/citation-style-language/schema/issues/33).

A recent thought of mine was to create some sort of distinct
name-space for custom variables, e.g.:

<text custom-variable="my-own-field"/>

That way you would have complete freedom to define custom variable
names, and there wouldn't be conflicts with the 'core' set of official
CSL variables.
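
A rough Python sketch of what such a separate name-space could mean on the processor side follows. Everything in it is an assumption for illustration: the "custom" key in the item data, the function name, and the lookup rules are not part of CSL or of any existing processor.

```python
import xml.etree.ElementTree as ET


def resolve_text(element_xml, item):
    """Look up the value a <text> element refers to, or None if it is absent.

    Assumption: custom values live under a separate "custom" key in the item
    data, so a custom name can never shadow an official CSL variable.
    """
    element = ET.fromstring(element_xml)
    if "custom-variable" in element.attrib:
        return item.get("custom", {}).get(element.attrib["custom-variable"])
    if "variable" in element.attrib:
        return item.get(element.attrib["variable"])
    return None


item = {"title": "Official title", "custom": {"my-own-field": "some value"}}
print(resolve_text('<text custom-variable="my-own-field"/>', item))  # some value
print(resolve_text('<text variable="title"/>', item))                # Official title
```

The style syntax is the easy part here; where the "custom" data is stored and how it is synced is left completely open by this sketch.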

Regardless of the particular solution chosen, it might help if we make
a fresh start with the entire discussion. I think nobody will dispute
the usefulness of having custom fields, but I'm very keen to learn, in
the most precise terms possible, what implications Bruce, Dan, Simon, and
others foresee when it comes to custom variables (e.g. with regards to
syncing). That would help me understand what the considerations should
be when thinking of solutions. (a pointer would suffice as well, if
this already has been discussed)

Rintze


Re: CSL roadmap

rmzelle
Administrator
On Mon, Sep 10, 2012 at 8:16 PM, Rintze Zelle <[hidden email]> wrote:
> I'm actually hoping that we might be able to come up with a consistent
> solution that addresses custom fields, custom locator labels
> (https://github.com/citation-style-language/schema/issues/94), and
> custom identifiers
> (https://github.com/citation-style-language/schema/issues/33).

To stress my point on the need to discuss this in detail, I think that
the representation of custom fields in CSL styles isn't nearly as
problematic an issue as coming to agreement regarding surrounding
issues like syncing (where I'm not an expert) and metadata storage.

Rintze


Re: CSL roadmap

Bruce D'Arcus-3
In reply to this post by rmzelle
On Mon, Sep 10, 2012 at 8:16 PM, Rintze Zelle <[hidden email]> wrote:

>
> On Mon, Sep 10, 2012 at 7:08 PM, Sebastian Karcher
> <[hidden email]> wrote:
> >>   * field assignments (which item type should, or shouldn't, have
> >> which fields). Aurimas prepared a map for Zotero:
> >> http://aurimasv.github.com/z2csl/typeMap.xml . It would be great if we
> >> could standardize the fields exposed to the CSL processors among the
> >> different reference managers (on a per item type basis). My hope is
> >> that we can clean up Zotero's metadata model in the coming year
> >> (tickets as compiled by the Zotero user community can be found at
> >> https://github.com/ajlyon/zotero-bits/issues ), and offer the result
> >> as a guideline for other reference managers.
> > This is, imho, really the top priority - with different fields/item
> > we'll get inconsistent outputs - cf. e.g. publishers for journals.
> > Happy to help work on this.
>
> The ball for this is really in the court of the Zotero team, though. I
> don't think you, me, Avram, Grégoire (Gracile) and others can do much
> more prep work for Simon and Dan, and I'm pretty sure we're all
> available to give input once they start work on it. But it seems to be
> a few months out at least (post-November).
>
> >> - There are still quite a few open CSL tickets (
> >> https://github.com/citation-style-language/schema/issues ). Some have
> >> solutions that require a backwards incompatible release (i.e. CSL 1.1)
> >> and style upgrades, while other tickets have stalled due to the
> >> absence of good ideas or due to disagreement. I don't expect too much
> >> progress here, unless people step in and reboot the discussions.
> >
> > I think some of them are pretty important. My top two are:
> > https://github.com/citation-style-language/schema/issues/36
> > https://github.com/citation-style-language/schema/issues/94
> >
> > The first one is pretty major, but I'd really like to see 94 in one of
> > the smaller releases.
>
> I'm actually hoping that we might be able to come up with a consistent
> solution that addresses custom fields, custom locator labels
> (https://github.com/citation-style-language/schema/issues/94), and
> custom identifiers
> (https://github.com/citation-style-language/schema/issues/33).
>
> A recent thought of mine was to create some sort of distinct
> name-space for custom variables, e.g.:
>
> <text custom-variable="my-own-field"/>
>
> That way you would have complete freedom to define custom variable
> names, and there wouldn't be conflicts with the 'core' set of official
> CSL variables.
>
> Regardless of the particular solution chosen, it might help if we make
> a fresh start with the entire discussion. I think nobody will dispute
> the usefulness of having custom fields,

I would :-)

In short, and in general, we need to put a high premium on
interoperability of data and styles.

I don't have time to get into this in depth ATM, but we probably need
discussion of this and the previous concerns I've raised about
managing change in the CSL schema and spec over time.

Bruce

> but I'm very keen to learn, in
> the most precise terms possible, what implications Bruce, Dan, Simon, and
> others foresee when it comes to custom variables (e.g. with regards to
> syncing). That would help me understand what the considerations should
> be when thinking of solutions. (a pointer would suffice as well, if
> this already has been discussed)
>
> Rintze

Re: CSL roadmap

Sylvester Keil
In reply to this post by rmzelle

On Sep 10, 2012, at 7:46 PM, Rintze Zelle wrote:

> - Several sites have started offering citeproc-JSON: see e.g.
> http://blog.bibsonomy.org/2012/07/feature-of-week-citation-style-language.html
> , http://www.doi.org/doi_handbook/5_Applications.html#5.4.1 and
> http://crosscite.org/cn/#sec-4-1 . I think we could improve the
> consistency of CSL style output between implementations by writing
> better documentation on input expectations. This covers:
>  * date formats. Bruce seems to be a big proponent of adopting the
> Extended Date/Time Format as much as possible (EDTF, see
> http://www.loc.gov/standards/datetime/ )
>  * name parsing (e.g. extracting non-dropping and dropping particles
> from two-field names)
Name parsing should strictly be optional. I've written a name parser for citeproc-ruby to deal with single-field names, but because of language and cultural differences this can quickly become infeasible. In some languages, for example, even word segmentation alone is a hard problem. I've gone to great lengths to support names passed in a single field; IIRC Frank even handles Japanese names specifically. But I would be careful about making it mandatory for CSL processors to implement such features, as it sets the bar pretty high.
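
For the simpler two-field case mentioned in the quoted bullet, a minimal Python sketch of particle extraction could look like the following. It relies on a lowercase-word heuristic and is not how citeproc-ruby or any other processor does it; the function name is made up, and only the output keys follow the citeproc-JSON name fields.

```python
def split_particles(family, given):
    """Split particles off a two-field name using a lowercase-word heuristic.

    Leading lowercase words of the family field become the non-dropping
    particle; trailing lowercase words of the given field become the dropping
    particle. This fits many Western European names and little else.
    """
    family_words = family.split()
    given_words = given.split()

    lead = 0
    # Keep at least one word as the family name itself.
    while lead < len(family_words) - 1 and family_words[lead].islower():
        lead += 1

    tail = len(given_words)
    # Keep at least one word as the given name itself.
    while tail > 1 and given_words[tail - 1].islower():
        tail -= 1

    name = {
        "family": " ".join(family_words[lead:]),
        "given": " ".join(given_words[:tail]),
    }
    if lead:
        name["non-dropping-particle"] = " ".join(family_words[:lead])
    if tail < len(given_words):
        name["dropping-particle"] = " ".join(given_words[tail:])
    return name


print(split_particles("van Gogh", "Vincent"))
# {'family': 'Gogh', 'given': 'Vincent', 'non-dropping-particle': 'van'}
```

Even this small heuristic depends on casing conventions, so it does nothing useful for scripts without letter case, which is one reason to keep such parsing optional.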

>  * field assignments (which item type should, or shouldn't, have
> which fields). Aurimas prepared a map for Zotero:
> http://aurimasv.github.com/z2csl/typeMap.xml . It would be great if we
> could standardize the fields exposed to the CSL processors among the
> different reference managers (on a per item type basis). My hope is
> that we can clean up Zotero's metadata model in the coming year
> (tickets as compiled by the Zotero user community can be found at
> https://github.com/ajlyon/zotero-bits/issues ), and offer the result
> as a guideline for other reference managers.
>  * the JSON schemas. These could use some documentation.
I would agree that the data/input format should receive more attention in the future.

A few months ago, I recorded some observations about the format here:
https://github.com/citation-style-language/schema/wiki/Processor-input-%28JSON%29

> - One of the things I'd really like to see is an improved CSL test
> suite. Sylvester posted a Cucumber mockup format a while back (
> https://github.com/inukshuk/citeproc-ruby/blob/1c420de0f7a86b7c35782dee86ce62cbebb47ab9/features/condition/is_numeric.feature
> ). I never had much success adding styles to the current test suite
> setup, and I really would like to be able to categorize tests (e.g. on
> CSL version, and on the CSL feature that is being tested). If the
> infrastructure is there, I wouldn't mind annotating the existing tests
> by hand.

I'm still working on the rewrite of citeproc-ruby; I've made good progress over the summer and I'm currently able to test individual rendering elements in isolation. Once I move on to the point where I can work with integration and acceptance tests, I intend to continue the work in the csl-test-suite package. The plan, right now, is to use Cucumber, because I think it offers a number of features that we need, but to make it a priority for the tests to be convertible to the current JSON format so that other implementations can use the tests without having to make any changes.

Sylvester



Re: CSL roadmap

rmzelle
Administrator
On Tue, Sep 11, 2012 at 2:42 AM, Sylvester Keil <[hidden email]> wrote:
> Name parsing should strictly be optional. I've written a name parser for citeproc-ruby to deal with single-field names, but because of language and cultural differences this can quickly become infeasible. In some languages, for example, even word segmentation alone is a hard problem. I've gone to great lengths to support names passed in a single field; IIRC Frank even handles Japanese names specifically. But I would be careful about making it mandatory for CSL processors to implement such features, as it sets the bar pretty high.

Agreed that parsing of unstructured data (dates, names) should be
optional. (The closest the spec comes to discussing unstructured data
is in its description of the is-numeric conditional.)
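
For reference, the is-numeric rule can be approximated in a few lines of Python. This is a sketch of my reading of the spec's description (numbers with optional letter prefixes and suffixes, optionally joined by a comma, hyphen, or ampersand); the regular expression is an approximation, not the normative definition, and real processors may differ in edge cases.

```python
import re

# One "number": digits with an optional letter prefix and suffix, e.g. "2", "D2", "2b", "2nd".
_NUMBER = r"[A-Za-z]*\d+[A-Za-z]*"
# Several numbers may be joined by a comma, hyphen, or ampersand, with or without spaces.
_NUMERIC = re.compile(rf"{_NUMBER}(?:\s*[,&-]\s*{_NUMBER})*")


def is_numeric(value):
    """Approximate the is-numeric test for a single variable value."""
    return _NUMERIC.fullmatch(str(value).strip()) is not None


print(is_numeric("2nd"))          # True
print(is_numeric("2 & 4"))        # True
print(is_numeric("2nd edition"))  # False
```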

>>  * the JSON schemas. These could use some documentation.
>
> I would agree that the data/input format should receive more attention in the future.
>
> A few months ago, I recorded some observations about the format here:
> https://github.com/citation-style-language/schema/wiki/Processor-input-%28JSON%29

Jakob Voss posted some comments a while back, too:
https://github.com/citation-style-language/schema/issues/31
https://github.com/citation-style-language/schema/issues/70 refers to
the input format as well.
(and, somewhat related,
https://github.com/citation-style-language/schema/issues/6 )

>> - One of the things I'd really like to see is an improved CSL test
>> suite. Sylvester posted a Cucumber mockup format a while back (
>> https://github.com/inukshuk/citeproc-ruby/blob/1c420de0f7a86b7c35782dee86ce62cbebb47ab9/features/condition/is_numeric.feature
>> ). I never had much success adding styles to the current test suite
>> setup, and I really would like to be able to categorize tests (e.g. on
>> CSL version, and on the CSL feature that is being tested). If the
>> infrastructure is there, I wouldn't mind annotating the existing tests
>> by hand.
>
> I'm still working on the rewrite of citeproc-ruby; I've made good progress over the summer and I'm currently able to test individual rendering elements in isolation. Once I move on to the point where I can work with integration and acceptance tests, I intend to continue the work in the csl-test-suite package. The plan, right now, is to use Cucumber, because I think it offers a number of features that we need, but to make it a priority for the tests to be convertible to the current JSON format so that other implementations can use the tests without having to make any changes.

Sounds great.

Rintze
